Service Outage Oct 24, 2013 – Maintenance Today

At approximately 2:14pm PT on Oct 24, 2013, Tddium’s DB master server experienced a CPU usage spike that cascaded into a server stoppage.  No data was lost.

After examining monitoring data (thanks New Relic!) and logs, we concluded that although average CPU usage hovers around 20-30%, our DB master sees bursts of CPU usage close to 100%.  Once Postgres crosses into “queue backup” territory, it never recovers.

Tonight, we will upgrade our DB cluster to use faster servers.  This upgrade should take only a few minutes, but it will require the app to be down.

We appreciate your patience as we address these infrastructure issues.

Thanks,

– The Solano Labs Team
