The issue occurred after a database patch cycle. During the patching process, an unforeseen complication arose when the services connected to the database underwent an unregistration and subsequent re-registration with both the API gateway and a registry service. Unfortunately, this simultaneous re-registration triggered an issue within the underlying GraphQL schema, leading to the incident and subsequent disruption.
To address the incident and prevent its recurrence, our team has taken the following actions:
Improved Connection Monitoring: We have enhanced the monitoring of connections to the service registry to proactively identify any anomalies or irregularities during the unregistration and re-registration process. This improvement will allow us to detect and mitigate issues promptly.
Schema Registration Process Review: Our team is reviewing the schema registration process to identify and rectify problems that contributed to the incident. This proactive approach will help ensure a more robust and resilient system moving forward.
We apologize for any inconvenience caused by this incident and understand the impact it may have had on your experience with our cloud service. Please be assured that we are committed to continuously improving our service reliability and taking the necessary steps to prevent similar disruptions.