JupyterHub at temple.2i2c.cloud unreachable

| Field | Value |
| --- | --- |
| Impact Time | Aug 20 at 10:02 to Aug 20 at 10:15 |
| Duration | 13m 46s |

Overview

A new DNS wildcard entry (*.temple.2i2c.cloud) negated an existing DNS wildcard entry (*.2i2c.cloud), causing temple.2i2c.cloud to stop resolving. This happened just before the community was about to give a demo.
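The negation described above is consistent with standard wildcard semantics (RFC 4592): once any record exists beneath temple.2i2c.cloud, that name "exists" as an empty non-terminal, so the broader wildcard no longer synthesizes an answer for it. The following is a minimal toy model of that rule, not a real resolver. The IP addresses are placeholders from documentation ranges, and the function names are hypothetical.

```python
# Toy model of DNS wildcard matching (RFC 4592). Not a real resolver.
# A zone is a dict mapping owner names to lists of A record addresses.

def name_exists(zone, qname):
    """A name exists if it owns records or if any record lives beneath it
    (the latter case is an "empty non-terminal")."""
    suffix = "." + qname
    return any(owner == qname or owner.endswith(suffix) for owner in zone)


def resolve(zone, qname, apex):
    """Return (rcode, answers) for an A query against the toy zone."""
    if qname in zone:
        return "NOERROR", zone[qname]
    if name_exists(zone, qname):
        # The name exists but owns no records: no wildcard applies.
        return "NOERROR", []
    # Walk up to the closest existing ancestor and try its wildcard child.
    labels = qname.split(".")
    for i in range(1, len(labels)):
        encloser = ".".join(labels[i:])
        if encloser == apex or name_exists(zone, encloser):
            return ("NOERROR", zone["*." + encloser]) if "*." + encloser in zone \
                else ("NXDOMAIN", [])
    return "NXDOMAIN", []


APEX = "2i2c.cloud"
# Placeholder addresses (documentation ranges), not the real Ingress IPs.
before = {"*.2i2c.cloud": ["192.0.2.10"]}
after = {**before, "*.temple.2i2c.cloud": ["198.51.100.20"]}
fixed = {**after, "temple.2i2c.cloud": ["198.51.100.20"]}
```

In this model, `resolve(before, "temple.2i2c.cloud", APEX)` is answered by the broad wildcard, `resolve(after, ...)` returns NOERROR with no answers (matching the dig output in the timeline), and `resolve(fixed, ...)` returns the explicit A record.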

Resolution

Explicitly adding an A record for temple.2i2c.cloud fixed the issue.

Where We Got Lucky

  1. The outage happened just before the community's demo rather than right during it.

  2. Someone with access to Namecheap, and therefore able to fix DNS, was around during the outage.

What Went Well

  1. Our automated alerts caught this issue just before the community reported it to us

What Didn’t Go So Well

  1. The original DNS entry itself was a game of telephone because the initial engineer setting up the new temple cluster did not have access to Namecheap, where our DNS is set up.

  2. The default TTL on DNS records in Namecheap is 30 minutes, which means any fix we make may take that long to propagate.
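To make the TTL point concrete: a resolver keeps a cached answer for the TTL it was served with, so lowering the TTL later only affects answers fetched after the change. A minimal sketch of that arithmetic follows. The fetch time is hypothetical. Only the 30-minute and 5-minute TTL values come from this report.

```python
from datetime import datetime, timedelta


def cache_expiry(fetched_at, ttl_seconds):
    """Time at which a resolver that fetched an answer at fetched_at drops it."""
    return fetched_at + timedelta(seconds=ttl_seconds)


# Hypothetical moment at which some resolver cached an answer.
fetched = datetime(2025, 8, 20, 10, 0)

# With the 30-minute default, the cached answer can live until 10:30.
expiry_default_ttl = cache_expiry(fetched, 30 * 60)

# Had the record been served with a 5-minute TTL, it would expire at 10:05.
expiry_short_ttl = cache_expiry(fetched, 5 * 60)
```

Empty (NODATA) answers are also cached, but for a duration derived from the zone's SOA record (RFC 2308) rather than from the A record's TTL, so this sketch covers only part of the propagation picture.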

Action Items

  1. Fix access to Namecheap so everyone on the team can access it. If that is not possible, move DNS providers: https://github.com/2i2c-org/infrastructure/issues/6487

  2. Set our TTL to be 5min instead of 30min for all new DNS entries 2i2c-org/infrastructure#6646

  3. Document that if we add a *.X.2i2c.cloud DNS entry, we must also add a X.2i2c.cloud DNS entry: https://github.com/2i2c-org/infrastructure/issues/6646
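The rule in the third action item could also be checked mechanically. The sketch below assumes the record owner names are available as a plain list, for example pasted from the DNS provider's dashboard. The function name and input format are hypothetical, and no provider API is used.

```python
def wildcards_missing_parent(owners, apex="2i2c.cloud"):
    """Return each X for which a *.X entry exists but X has no entry of its own.

    The zone apex is skipped, on the assumption that it always exists.
    """
    owned = set(owners)
    missing = []
    for owner in sorted(owned):
        if not owner.startswith("*."):
            continue
        parent = owner[2:]  # strip the leading "*."
        if parent != apex and parent not in owned:
            missing.append(parent)
    return missing


# State of the zone during the incident, as described in this report.
during_incident = ["*.2i2c.cloud", "*.temple.2i2c.cloud"]
after_fix = during_incident + ["temple.2i2c.cloud"]
```

Here `wildcards_missing_parent(during_incident)` flags temple.2i2c.cloud, and the same call on `after_fix` returns an empty list.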

Timeline

Aug 20, 2025

| Time | Event |
| --- | --- |
| 9:21 AM | During setup of the new cluster for Temple, a DNS entry is made for *.temple.2i2c.cloud to point to the new cluster's Ingress IP. Existing DNS entries are not touched, so no change is expected. |
| 10:02 AM | Description: An uptime check on two-eye-two-see Uptime Check URL labels {project_id=two-eye-two-see, host=temple.2i2c.cloud} is failing. |
| 10:02 AM | A dig temple.2i2c.cloud returned NOERROR but no entries. An explicit temple.2i2c.cloud A record is added via Namecheap DNS. The hypothesis is that adding *.temple.2i2c.cloud negated the *.2i2c.cloud wildcard (which temple.2i2c.cloud was resolving to), thus resulting in the outage. The effect was delayed due to DNS caching. |
| 10:09 AM | https://2i2c.freshdesk.com/a/tickets/3770 comes in. |
| 10:15 AM | An uptime check on two-eye-two-see Uptime Check URL labels {project_id=two-eye-two-see, host=temple.2i2c.cloud} is failing. |
| 10:32 AM | DNS TTL was changed to 5 minutes, as the default was 30 minutes. It's unclear if this actually had any real effect on propagation, but we tried it because the community had a demo coming up quickly. |
| 10:47 AM | DNS fully resolves properly for the engineer working on the issue. An /etc/hosts entry workaround is offered to the community member trying to do the demo. |
| 10:55 AM | Community reports it works well for them. |
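For checks like the ones in this timeline, it can help to see what the local machine itself resolves, since that is what a browser will use. The sketch below is one possible way to do that with Python's standard library, not the procedure used during the incident. The function name is hypothetical. Unlike dig, `socket.getaddrinfo` goes through the system resolver, so on typical systems it also reflects an /etc/hosts workaround.

```python
import socket


def resolved_addresses(hostname, port=443):
    """Return the sorted IP addresses the system resolver gives for hostname.

    Returns an empty list if the lookup fails or yields no addresses.
    """
    try:
        infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolved_addresses("temple.2i2c.cloud")` before and after a fix shows whether the local view has caught up. An empty list corresponds to the failure seen during the outage.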