Resolved -
All failed playbooks have been replayed.
Thank you for your patience throughout this incident.
May 16, 11:49 UTC
Update -
We are currently re-running the playbooks that failed during the incident. Once these executions are complete, the incident will be marked as fully resolved.
May 16, 10:00 UTC
Monitoring -
The backlog of safely queued events has been fully processed, and all systems (ingestion, search, and alerting) have returned to normal performance. We appreciate your patience and apologize for any inconvenience this disruption may have caused.
May 16, 09:39 UTC
Update -
We are continuing to work on a fix for this issue.
May 16, 08:43 UTC
Identified -
We have identified the root cause of the service disruption and applied a fix. Event ingestion, search, and alerting are now resuming. However, you may experience degraded performance and slower response times while our systems process the backlog of safely stored events.
May 16, 08:38 UTC
Investigating -
We are experiencing an issue affecting event search, alerting, and processing. Users may encounter errors when trying to search or access events. Our data reception systems remain fully operational and no events are being lost. All incoming data is safely stored and will be processed as soon as the issue is resolved. Our engineering team is actively working on a fix.
May 16, 07:52 UTC