LIS hub is unreachable after ingress migration
| Field | Value |
|---|---|
| Impact Time | Mar 12 at 15:51 to Mar 12 at 16:12 |
| Duration | 21m 3s |
Overview
The LIS ingress was changed as part of a migration carried out across several clusters. A prerequisite step, migrating the DNS records, was missed or not applied successfully, so the hub could not be reached at its domain name.
What Happened
The cluster was migrated from ingress-nginx to nginx-ingress, so that traffic is routed through the nginx-ingress service. This was expected to incur little downtime, because the DNS records were supposed to have been migrated beforehand from the external IP of the ingress-nginx service to an IP that survives the decommissioning of the ingress-nginx deployment. That step was missed or failed. As a result, DNS still resolved the hub's domain name to the old IP address, and once the ingress was migrated, traffic sent to that address could no longer reach the hub.
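A pre-flight check of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used during the incident: the function names `resolve_ipv4` and `find_stale_hosts` are hypothetical, and the hostnames and IP address would need to be replaced with real values. It reports every hostname that either fails to resolve or still resolves to the IP of the ingress service that is about to be decommissioned.

```python
import socket
from typing import Callable, Dict, Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is (address, port).
    return {info[4][0] for info in infos}


def find_stale_hosts(
    hostnames: Iterable[str],
    old_ip: str,
    resolve: Callable[[str], Set[str]] = resolve_ipv4,
) -> Dict[str, str]:
    """Map each problematic hostname to a short reason.

    A hostname is reported if it cannot be resolved, or if it still
    resolves to old_ip (assumed here to be the external IP of the
    ingress service that will be removed).
    """
    problems: Dict[str, str] = {}
    for hostname in hostnames:
        try:
            addresses = resolve(hostname)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[hostname] = f"resolution failed: {exc}"
            continue
        if old_ip in addresses:
            problems[hostname] = f"still resolves to old ingress IP {old_ip}"
    return problems
```

Calling `find_stale_hosts` with the full list of hub hostnames and the old external IP before each ingress migration, and proceeding only when the result is empty, would make the check uniform across clusters. The `resolve` parameter is injectable so that the check can be exercised without network access. Note that the local resolver may return cached answers, so a result obtained shortly after a DNS change may not reflect what other clients see.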
Resolution
The ingress migration was reverted. To prepare for repeating it, the DNS migration will be reattempted first.
Where We Got Lucky
We had both good alerts and good engagement with the technical contact.
The engineer had repeatedly tested both the forward and the backward migration, and so felt confident making the change immediately.
What Went Well
The act of reverting the change was quick and error-free.
What Didn’t Go So Well
When checking that each migration had succeeded, the engineer missed only the LIS hub.