Operational: Solved - Outage on EU Platform - AskCody Services & Visitor module
Incident Report for The AskCody Platform
Postmortem

Post-Mortem Report: AskCody Database Incident (Nov 17, 2023 - Nov 20, 2023)

At AskCody, we believe in maintaining a transparent and responsible relationship with our customers. This report is intended to provide a comprehensive overview of the recent database incident and our response to it.

Incident Overview

  • Start: November 17, 2023, 16:00 CET
  • Resolution Applied: November 20, 2023, Afternoon
  • Status Update: November 21, 2023, Morning

Incident Description

The incident involved a 'lock wait timeout exceeded' error in our database system, caused by complex interactions within our database infrastructure that led to transaction delays and timeouts.
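
For readers unfamiliar with this class of error, the sketch below illustrates one common client-side mitigation: retrying a transaction with exponential backoff when a lock wait times out. It is illustrative only and does not describe the fix applied in this incident. `LockWaitTimeout` and `run_with_retry` are hypothetical names, and a real application would catch its own database driver's exception instead.

```python
import random
import time


class LockWaitTimeout(Exception):
    """Hypothetical stand-in for a driver's 'lock wait timeout exceeded' error."""


def run_with_retry(transaction, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Run transaction(); retry with exponential backoff on LockWaitTimeout."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except LockWaitTimeout:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Back off exponentially, with jitter so retries do not collide again.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

Retries of this kind only help with short-lived contention. A timeout that persists across retries, as in this incident, still needs to be investigated at the database level.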

Timeline of Events

  • Initial Detection: The issue was first identified on November 17, 2023.
  • Investigation Period: Between November 17 and 20, 2023, we worked closely with Microsoft support to analyze and address the issue.
  • Resolution Implementation: A fix was successfully applied on the afternoon of November 20, 2023.
  • Status Update: The status page was updated the following morning, on November 21, 2023.

Root Cause Analysis

The precise root cause remains undetermined, but it appears to be related to previously unencountered behaviors within the MS Azure database infrastructure.

Remedial Actions and Prevention

  • Collaboration with Microsoft: We have intensified our collaboration with Microsoft to enhance our database management practices.
  • System Adjustments: Various system improvements have been implemented to mitigate the risk of similar issues.

Current Status

Following the application of the fix, we have been diligently monitoring our systems. As of now, we have seen no further issues.

Our Commitment

We apologize for any inconvenience this incident may have caused. Ensuring the reliability and efficiency of our services is our top priority, and we are committed to continuous improvement.

Thank you for your understanding and continued support.

Sincerely,
The AskCody Team

Posted Nov 30, 2023 - 14:58 CET

Resolved
After monitoring the implemented solution, we consider the major outage resolved. The post-mortem will follow shortly.
Posted Nov 30, 2023 - 14:57 CET
Update
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America

The platform has been running since the solution was implemented, without any measured outage or degraded performance. We will keep this incident in a monitoring state until we can be sure it is fully resolved.

We will provide an update whenever there is relevant news or the status changes.
Posted Nov 23, 2023 - 09:06 CET
Update
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America

The platform has been running since the solution was implemented, and we will continue to monitor its progress.
For you as a user, this means you should be able to use the Portal, though you may experience a slight delay.

We will provide an update whenever there is relevant news or the status changes.

Degraded Performance Definition: The affected component is working but is slow or otherwise impacted in a minor way. It does not count as downtime.
Posted Nov 21, 2023 - 13:42 CET
Update
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America

The platform has been running since the solution was implemented and continues to be available. We will keep this incident in monitoring until we are certain it is resolved, and only then set it back to operational.
For you as a user, this means you should be able to use the Portal, though you may experience a slight delay.

The next update will be provided tomorrow at 8 am CET at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 21, 2023 - 10:15 CET
Monitoring
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America

We are monitoring the implemented solution to see whether it can handle the full load of a morning's operations. We are following the situation closely and will update this page as it develops.

Users may experience timeouts, and login may be unavailable.

The next update will be provided tomorrow at 8 am CET at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 21, 2023 - 08:03 CET
Update
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America


The cause of the major outage in the Services and Visitor modules has been identified, and we are still in dialogue with Microsoft to resolve the matter.
Users may experience timeouts, and login may be unavailable.

The next update will be provided tomorrow at 8 am CET at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 20, 2023 - 21:59 CET
Update
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America


The cause of the major outage in the Services and Visitor modules has been identified, and we are still in dialogue with Microsoft to resolve the matter.
Users may experience timeouts, and login may be unavailable.

The next update will be provided by 10 pm CET at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 20, 2023 - 19:24 CET
Update
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America


The cause of the major outage in the Services and Visitor modules has been identified, and we are in dialogue with Microsoft to resolve the matter.
Users may experience timeouts, and login may be unavailable.

The next update will be provided by 8 pm CET at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 20, 2023 - 15:02 CET
Identified
Affected Users: All Services users and Service Portal users, and all Visitor users and Visitor Portal users
Region: Outside of North America


The cause of the major outage in the Services and Visitor modules has been identified, and we are now working to find a solution.
Users may experience timeouts, and login may be unavailable.

The next update will be provided by 3 pm CET at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 20, 2023 - 12:44 CET
Update
The major outage is ongoing, and the implemented solution proved ineffective. We are working towards a new solution as quickly as we can.

The next update will be provided once we know more, or by 1 pm CET at the latest.
Posted Nov 20, 2023 - 11:16 CET
Monitoring
All modules are fully operational again.
We are monitoring the situation and will update with any relevant information.

The next update will be provided once we know more, or on November 20 at the latest.
Posted Nov 17, 2023 - 18:30 CET
Investigating
Affected Users: All Services users and Service Providers
Region: Outside of North America

We are currently experiencing a major outage in the AskCody Services module and we are now investigating the impact.
Users may be unable to place or edit Service requests in the Outlook add-in, or to create ad-hoc requests directly in the AskCody Management Portal.

The next update will be provided once we know more, or on November 20 at the latest.

Major Outage Definition: A component is unavailable for all users. Affects the up-time calculation 100%.
Posted Nov 17, 2023 - 16:00 CET
This incident affected: Visitor Management (Europe) (Outlook Add-in, Visitor Management Portal, Check-in kiosk) and Meeting Services (Europe) (Outlook Add-in, Meeting Services Portal).