Wednesday, 16 October 2013

Downtime

Yesterday, for about five hours between 3am and 8am PST, the Insightly web app was only intermittently available to some customers. During that time our Android and iPhone apps on phones and tablets remained available, as did our web-based mobile app, Gmail gadget, Outlook app, and API.

Now that we've fully restored functionality of the web app, I'd like to share some of the technical details with you. The issue surfaced only under our peak load of thousands of concurrent users, and was not evident during testing or in our quality assurance environment. A database connection was not being closed correctly in one critical place in our code, and at scale the connections left open accumulated until connection resources within the system reached their maximum.
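For readers curious how this kind of bug behaves, here is a minimal sketch in Python. It is an illustration only and not Insightly's code: the post does not say which language, database, or driver was involved, and `TinyPool`, `leaky_handler`, `safe_handler`, and `requests_served` are hypothetical names invented for this example. It shows a fixed-size pool that runs dry when a handler never returns its connection, and the same handler written so the connection is always released.

```python
import sqlite3
from contextlib import contextmanager
from queue import Queue, Empty


class PoolExhausted(Exception):
    """Raised when no connection becomes free within the timeout."""


class TinyPool:
    """A deliberately small, hypothetical connection pool (illustration only)."""

    def __init__(self, size):
        self._free = Queue(maxsize=size)
        for _ in range(size):
            self._free.put(sqlite3.connect(":memory:"))

    def acquire(self, timeout=0.05):
        try:
            return self._free.get(timeout=timeout)
        except Empty:
            raise PoolExhausted("all connections are in use")

    def release(self, conn):
        self._free.put(conn)

    @contextmanager
    def connection(self):
        # Hand out a connection and guarantee it goes back to the pool,
        # even if the caller raises an exception.
        conn = self.acquire()
        try:
            yield conn
        finally:
            self.release(conn)


def leaky_handler(pool):
    # Bug: the connection is taken from the pool but never returned.
    conn = pool.acquire()
    return conn.execute("SELECT 1").fetchone()[0]


def safe_handler(pool):
    # Fix: the context manager returns the connection on every code path.
    with pool.connection() as conn:
        return conn.execute("SELECT 1").fetchone()[0]


def requests_served(handler, pool_size, requests):
    """Count how many requests succeed before the pool is exhausted."""
    pool = TinyPool(pool_size)
    served = 0
    for _ in range(requests):
        try:
            handler(pool)
            served += 1
        except PoolExhausted:
            break
    return served
```

With a pool of three connections, the leaky handler can serve only three requests before every later one fails, while the safe handler serves all of them. A small test environment with few requests may never hit that limit, which is consistent with a problem that appears only under peak load.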

We will audit our change process and add more automation to it to prevent this mistake from happening again. We're also putting in place further system checks and simulated load tests to uncover problems of this type, and we're re-evaluating our deployment process to move toward a staged rollout so we can detect and fix any such issues much more rapidly.

Last, but certainly not least, I want to apologize sincerely. I know just how critical our service is to our customers and we will do everything we can to learn from this event and use it to drive improvements across the business. As with any customer-impacting operational issue, we will spend a lot of time over the coming days and weeks improving our understanding of the details of this event and determining how to make changes to improve our services and processes.

Thanks,
Anthony Smith
CEO.