Europe: Monitoring - Degraded Performance

Incident Report for The AskCody Platform

Postmortem

AskCody DNS Zone take-over - Post Mortem

This postmortem and technical timeline follow the public announcement by CEO Allan Mørch on Monday, May 27, describing the fraudulent takeover of an AskCody subdomain and its impact on AskCody customers. You can see the full response to the cyber-attack from AskCody CEO Allan Mørch here.

Summary

From May 24 to May 27, 2024, AskCody experienced a DNS incident impacting our European domains. This incident involved unauthorised redirection of traffic to unwanted content. Our internal teams conducted thorough investigations and quickly identified and resolved the issue.
This report outlines the timeline, technical investigation, and measures taken to prevent future occurrences. All data and application layers, including customer data and access, are intact and untouched, and no customer or personal data was compromised. The attack only led to unusual, fraudulent content appearing on app.onaskcody.com as a result of hijacked redirects in the network.

Incident overview

On May 24, 2024, at 13:29 CEST, our team noticed unusual content appearing on app.onaskcody.com. This content was not consistent with our services and was also found on other European subdomains such as eu.onaskcody.com and portal.onaskcody.com. Our US domains remained unaffected. For a full overview of the incident in detail, we refer to our status page, where all updates were provided during the incident, or to the previously mentioned announcement from our CEO.

Root cause analysis

The root cause of the incident was a bad actor exploiting a naming convention mistake in a public IP resource. This mistake allowed DNS requests intended for the West Europe cluster to be directed to a service hosted by the bad actor.
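The report does not describe the exact misconfiguration, so the following is only a minimal sketch of how a naming mismatch of this kind could be caught. It assumes DNS records that point at cloud hostnames and an inventory of hostnames the organisation actually owns. All names here (`DnsRecord`, `find_unowned_targets`, the `example.test` and `contoso-*` hostnames) are hypothetical and not taken from AskCody's setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DnsRecord:
    name: str    # public name that customers use
    target: str  # hostname the record points at (CNAME-style)


def normalize(host):
    """Lower-case a hostname and drop surrounding whitespace and a trailing dot."""
    return host.strip().rstrip(".").lower()


def find_unowned_targets(records, owned_hostnames):
    """Return records whose target is not in the inventory of owned hostnames."""
    owned = {normalize(h) for h in owned_hostnames}
    return [r for r in records if normalize(r.target) not in owned]


# Hypothetical data: the inventory contains a misspelled label ("contso-weu"),
# so the record for app.example.test points at a hostname nobody in the
# organisation actually holds.
records = [
    DnsRecord("app.example.test", "contoso-weu.westeurope.cloudapp.azure.com."),
    DnsRecord("portal.example.test", "contoso-neu.northeurope.cloudapp.azure.com"),
]
owned = [
    "contso-weu.westeurope.cloudapp.azure.com",
    "contoso-neu.northeurope.cloudapp.azure.com",
]

suspicious = find_unowned_targets(records, owned)
for record in suspicious:
    print(f"{record.name} -> {record.target} is not an owned hostname")
```

A check like this only compares names. It does not prove that a third party has claimed the unowned hostname, only that the record deserves a closer look.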

Investigation and actions taken

Initial response

May 24, 13:29 CEST: Detected unusual content on European domains

May 24, 14:00 CEST: Our development team began investigating the nature of the attack, considering possibilities such as cross-site scripting or DNS manipulation

Deep dive

May 24, 14:30 CEST - May 25, 04:30 CEST: Investigated network traffic using tools like Wireshark. We monitored DNS queries and captured packets but did not find immediate discrepancies.

May 24, 15:30 CEST: Verified third-party integrations and redeployed services to ensure no malicious content was being served from our systems.

Critical discovery

May 25, 11:00 CEST: Created a new subdomain to redirect traffic and monitor the situation while mitigating customer impact.

May 25, 12:30 CEST: Noticed redirection to a sports streaming site, flagging it as unwanted content.

May 25, 18:10 CEST: Identified a misconfigured public IP resource due to a naming error during deployment, leading to unauthorised traffic redirection.

Resolution

May 25, 19:55 CEST: Corrected the misnamed public IP resource, ensuring all traffic was properly routed to AskCody servers. Continuous monitoring was maintained to ensure stability.

May 27, 09:32 CEST: Confirmed full resolution of the incident and notified stakeholders.

Preventative and supplementary measures

To prevent similar incidents in the future, we have:

Corrected the naming error
Enhanced alerting systems for DNS configurations
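The report does not say how the enhanced alerting works. As a rough illustration only, a DNS alert could compare the addresses each hostname currently resolves to against an allowlist of expected addresses. The function names, hostnames, and IP addresses below are hypothetical, and the demo uses a fake resolver so it runs without network access.

```python
import socket


def resolve_ips(hostname):
    """Resolve a hostname to the set of IP addresses the system resolver returns."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def check_dns(expected, resolver=resolve_ips):
    """Return alert messages for hostnames resolving outside their allowlist.

    `expected` maps each hostname to the set of addresses it is allowed to use.
    """
    alerts = []
    for hostname, allowed in expected.items():
        try:
            observed = resolver(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            alerts.append(f"{hostname}: resolution failed ({exc})")
            continue
        unexpected = observed - allowed
        if unexpected:
            alerts.append(f"{hostname}: unexpected addresses {sorted(unexpected)}")
    return alerts


# Demo with a fake resolver and documentation-range addresses (all hypothetical).
fake_answers = {
    "app.example.test": {"203.0.113.10"},
    "eu.example.test": {"198.51.100.7"},
}
expected = {
    "app.example.test": {"203.0.113.10"},
    "eu.example.test": {"203.0.113.11"},
}
alerts = check_dns(expected, resolver=lambda host: fake_answers[host])
for alert in alerts:
    print(alert)
```

In practice the allowlist would need to follow legitimate infrastructure changes, otherwise a check like this produces false alarms after every planned redeployment.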

Conclusion

This DNS incident was a significant event that we managed with urgency and thorough investigations. We apologize for any inconvenience caused and assure you that our systems are now stable and secure. We remain committed to maintaining the highest security standards, following Microsoft best practices, and will continue to take proactive measures to safeguard our services.

If you have any questions or need further information, please do not hesitate to contact our support team.

Thank you for your understanding and continued trust in AskCody.

Sincerely,
AskCody

Posted Jun 06, 2024 - 15:36 CEST

Resolved

This incident has been resolved.

Posted May 27, 2024 - 09:32 CEST

Monitoring

Following the first response to the incident we experienced and reported on Friday at 1.51 pm CEST, where we posted that a network incident had disrupted access, we spent the next 20 hours working through the trail of this disruption, which led us to discover a subdomain takeover in one of our Azure zones in EU West. With external assistance from Microsoft throughout, as well as from multiple other external experts, we have ensured what we continuously confirmed: that we were secure and intact in all data and application layers, while removing uncertainty everywhere else.

As of today, Saturday, at 2.30 pm CEST, all systems were fully operational again, and will continue to be. Since we found the hijacked domain and subdomain takeover, we have added multiple supplementary measures to prevent the system from being impacted further, while we continue to work on mitigating and hardening our infrastructure.

In more detail, we've narrowed the attack down to a few steps in our internal routing in Azure Front Door (still depending on routing and ingress in other sub-services though). With this knowledge, we have redirected some traffic and removed the redirection rule in Front Door to make eu, portal and app.onaskcody.com functional.

What happened was a subdomain takeover. More specifically, it occurred on our EU West subdomain, so only users routed to EU West were affected. A set of our subdomains was taken over by a malicious actor and is currently serving content from another Azure resource, owned by another tenant: the attacker. We are in close contact with Microsoft for assistance in requesting these subdomains/endpoint host names back. This poses no risk to any users, as traffic is no longer being routed to EU West and the routes and resources are being shut down.

This continues to confirm our assumptions from yesterday on DNS spoofing, infiltration of routing, and hijacking domains. It also explains why you could reload your way out of the issue on Friday, as traffic is routed between EU West and EU North. If you hit EU North, you were fine, as this zone wasn't affected by the subdomain takeover. If you hit EU West, the content of the response was hijacked.
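To illustrate why refreshing helped, here is a small worked calculation. It assumes each request is routed independently and that a fixed share of requests lands in the affected zone. Neither assumption is stated in this report, and the 50% share used below is purely illustrative.

```python
def chance_of_clean_response(affected_share, attempts):
    """Probability that at least one of `attempts` requests avoids the affected zone.

    Assumes every request is routed independently, with probability
    `affected_share` of landing in the affected zone.
    """
    if not 0.0 <= affected_share <= 1.0:
        raise ValueError("affected_share must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return 1.0 - affected_share ** attempts


# Illustrative only: with half of the traffic going to the affected zone,
# a few refreshes make a clean response very likely.
for attempts in (1, 2, 3, 5):
    print(attempts, chance_of_clean_response(0.5, attempts))
```

If routing used session affinity rather than independent per-request decisions, refreshing would help less than this calculation suggests.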

We are setting up alerts to notify us if this situation changes, i.e. if our site is defaced again. We will continue to work with Microsoft (via our premium support case) to establish the nature of the attack and the foothold, and to extend our mitigation plan to harden the system further. By now, we've already implemented multiple supplementary measures to prevent anything from happening again, and we have completed a full test of our system.
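The update does not describe how these alerts are built. One simple approach, sketched here under assumptions, is to fetch a page periodically and check that it still contains text that is expected and none that is known to be unwanted. The marker strings, sample pages, and function names are hypothetical.

```python
import urllib.request


def fetch_body(url, timeout=10.0):
    """Download a page and return its body as text (not called in the demo below)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def defacement_reasons(body, required_markers, forbidden_markers=()):
    """Return reasons why a page body looks wrong; an empty list means it looks fine."""
    text = body.lower()
    reasons = []
    for marker in required_markers:
        if marker.lower() not in text:
            reasons.append(f"missing expected marker: {marker}")
    for marker in forbidden_markers:
        if marker.lower() in text:
            reasons.append(f"found unexpected marker: {marker}")
    return reasons


# Demo on in-memory sample pages so the sketch runs without network access.
healthy_page = "<html><title>Example Portal</title><body>Sign in</body></html>"
hijacked_page = "<html><title>Live Sports Stream</title></html>"
required = ["Example Portal"]
forbidden = ["sports stream"]

healthy_reasons = defacement_reasons(healthy_page, required, forbidden)
hijacked_reasons = defacement_reasons(hijacked_page, required, forbidden)
print(healthy_reasons)
print(hijacked_reasons)
```

Because only some requests were affected during this incident, a check like this would need to run repeatedly to have a reasonable chance of catching an intermittent hijack.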

We will continue to keep this in a monitoring state over the weekend, while we continue to work, and if all proceeds as listed and expected we will resolve this issue by Monday the 27th of May.

Once the incident is resolved, a full post mortem will be made available.

Posted May 25, 2024 - 20:24 CEST

Update

We have been working all day and night and will continue to do so throughout this incident.

The status so far is that we can continuously confirm that nothing inside AskCody is infected or contaminated, there is no insider threat to our application layer and all data is intact and untouched. No data has been compromised.

We have redirected and rerouted traffic so that nothing lands on the spoofed domain, but we still have to ensure it is removed entirely.

We still need to understand how the attack works/worked, the root cause, and where the domain is hijacked/swapped, and to make a mitigation plan for any vulnerability so that this cannot happen again. This documentation is key to getting us back to fully operational, as it is the documentation that customers need for their risk assessment.

Technically the platform is fully operational, apart from Central and Maps, which was the hijacked domain, and we have shut that down. If you go to eu.onaskcody.com, you simply get redirected to the dashboard in AskCody.

Right now, we are working on finding the root cause and preventing a similar risk from happening again, all while redeploying to a different domain server.

To achieve this, we are considering creating a new domain, redeploying the necessary components, and ensuring the traffic's security to and from it.
Microsoft plays a significant role in our solution, and we currently have two open cases with them. They are actively collaborating with us to resolve these issues.
We continue to do what we can, running through our lists and validating everything, and we continue to monitor our application for any threats or potential threats in relation to this.

We will update you as soon as we have more information to share.

Posted May 25, 2024 - 10:33 CEST

Update

Affected Users: Workplace Central

We are still all hands on deck and we will continue to be until this is resolved.

All data and application layers are intact and untouched, and we are simultaneously monitoring the rest of the platform. Our application has been redeployed and is fully functional, as the issue is only related to the network layer and the traffic to our platform, which has been compromised.

This will be the last update, unless a resolution is reached between now and tomorrow morning. The next update will be at 9 am CEST tomorrow.

Posted May 24, 2024 - 23:54 CEST

Update

Affected Users: Workplace Central

We are still working on resolving this and are in close collaboration with Microsoft, as we have been throughout the day and evening. We also have our external consultants advising and assisting us in troubleshooting this to find the root cause and make sure we close any potential gaps.

While that is happening, we are constantly doing what we can to validate that all data and application layers are intact and untouched, and we are simultaneously monitoring the rest of the platform. Our application has been redeployed and is fully functional, as the issue is only related to the network layer and the traffic to our platform, which has been compromised.

We will provide the next update as soon as we can.

Posted May 24, 2024 - 22:49 CEST

Update

Affected Users: Workplace Central

We have identified that traffic out of AskCody, meaning responses to requests sent to our Portal, gets intercepted and overwritten before it is received, leading to bad requests or to hijacked websites and content being displayed when accessing AskCody Workplace Central. The routing and network issues mean that when a request is sent to AskCody, the response is hijacked after it is sent from AskCody but before it is received. In other words, the traffic is hijacked and potentially compromised outside of the AskCody application.

All data, files, and application layers are intact and uncompromised, and we are simultaneously monitoring the rest of the platform. Our application has been redeployed and is fully functional, as the issue is only related to the network layer and the traffic to our platform, which has been potentially compromised.

The issue has now been limited to Workplace Central, meaning AskCody Bookings (add-in, mobile, displays, dashboards), Services, Visitors and Insights are again fully functioning.

We are right now working together with Microsoft Support to solve the issue as fast as possible.

Posted May 24, 2024 - 17:19 CEST

Update

Affected Users: Workplace Central

We are still experiencing a routing and network incident that has disrupted access to the AskCody Portal and add-ins, and we are working to fix the issue. We have identified the cause as an outage in Microsoft Azure Front Door, confirmed by Microsoft, which has been experiencing an unplanned degradation since 11:55 AM UTC on Friday, May 24.

Access to the portal and add-ins is restored, but Workplace Central is still impacted.

All data and application layers are intact and untouched, and we are simultaneously monitoring the rest of the platform. Our application has been redeployed and is fully functional, as the issue is only related to the network layer and the traffic to our platform, which has been compromised.

Users may experience this issue as other pages blocking access to the portal and add-ins. As a temporary workaround, users can refresh the add-ins and portal a few times to gain access again.
The next update will be provided as soon as we know more.
Degraded Performance Definition: The affected component is working but is slow or otherwise impacted in a minor way. Does not affect downtime.

Posted May 24, 2024 - 16:00 CEST

Update

Affected Users: Management Portal Users and Add-in users
Region: Europe

We are still experiencing a routing and network incident that is disrupting access to the AskCody Portal and add-ins, and we are working to fix the issue. We have identified the cause as an outage in Microsoft Azure Front Door, confirmed by Microsoft, which has been experiencing an unplanned degradation since 11:55 AM UTC on Friday, May 24.

By refreshing a couple of times you can still gain access, but we are working on resolving the incident.

All data and application layers are intact and untouched, and we are simultaneously monitoring the rest of the platform. Our application has been redeployed and is fully functional, as the issue is only related to the network layer and the traffic to our platform, which has been compromised.

Users may experience this issue as other pages blocking access to the portal and add-ins. As a temporary workaround, users can refresh the add-ins and portal a few times to gain access again.
The next update will be provided as soon as we know more.
Degraded Performance Definition: The affected component is working but is slow or otherwise impacted in a minor way. Does not affect downtime.

Posted May 24, 2024 - 14:57 CEST

Update

We are continuing to work on a fix for this issue.

Posted May 24, 2024 - 14:57 CEST

Identified

Affected Users: Management Portal Users and Add-in users
Region: Europe

We are experiencing a network incident that is disrupting access to the AskCody Portal and add-ins. By refreshing a couple of times you can still gain access, but we are working on resolving the incident. All data and application layers are intact and untouched, and we are simultaneously monitoring the rest of the platform.
Users may experience this issue as other pages blocking access to the portal and add-ins.
As a temporary workaround, users can refresh the add-ins and portal a few times to gain access again.

The next update will be provided as soon as we know more.

Degraded Performance Definition: The affected component is working but is slow or otherwise impacted in a minor way. Does not affect downtime.

Posted May 24, 2024 - 13:51 CEST

This incident affected: Visitor Management (Europe) (Outlook Add-in, Visitor Management Portal), Meeting Services (Europe) (Outlook Add-in, Meeting Services Portal), Room Booking (Europe) (Outlook Add-in, Room Management Portal, Workplace Central), and Workplace Insights (Europe) (Power BI Dashboard).