UToronto: Users who have never logged in before can't start servers
| Field | Value |
|---|---|
| Impact Time | Oct 3 at 09:11 to Oct 3 at 09:34 |
| Duration | 23m 24s |
Overview
University of Toronto uses Azure File for home directory storage and needs the chowning initcontainer. We had removed it earlier, which caused new server startups to fail for users who had never logged in before. Restoring it just for utoronto fixed the problem.
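For readers unfamiliar with the pattern, the following is a minimal sketch of what a chowning init container can look like when expressed as a KubeSpawner-style container spec. It is not the configuration used in this deployment: the function name `chown_init_container`, the container name, the `busybox` image, the UID/GID of 1000, the `/home/jovyan` mount path, the `home` volume name, and the `{username}` sub-path are all assumptions chosen for illustration.

```python
def chown_init_container(uid=1000, gid=1000, mount_path="/home/jovyan",
                         volume_name="home", image="busybox:1.36"):
    """Build an init container spec (a plain dict) that chowns the user's
    home directory before the user server container starts.

    All defaults are hypothetical and only illustrate the pattern.
    """
    return {
        "name": "volume-mount-ownership-fix",  # hypothetical name
        "image": image,
        # Run as root so the chown on the freshly created directory succeeds.
        "securityContext": {"runAsUser": 0},
        "command": ["sh", "-c", f"chown {uid}:{gid} {mount_path}"],
        "volumeMounts": [
            {
                "name": volume_name,
                "mountPath": mount_path,
                # Assumed per-user sub-path template on the shared volume.
                "subPath": "{username}",
            }
        ],
    }


# In a jupyterhub_config.py this could be wired up roughly as follows
# (assumes KubeSpawner is the configured spawner):
# c.KubeSpawner.init_containers = [chown_init_container()]
```

Without a step like this, a home directory created on first login can end up owned by root, so the user server cannot write to it and fails to start. That would match the failure only affecting users who had never logged in before.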
Resolution
23m 24s
Where We Got Lucky
. accidentally (otherwise this would’ve persisted for at least 3 full days)
What Went Well
We were able to restore service pretty quickly once the report was acknowledged
What Didn’t Go So Well
Our alerting didn’t catch this, so we had to wait for the community to catch it and report it to us. This also slowed down our investigative work, because we didn’t know exactly where the 500 error was coming from.
Our logs had no mention of this particular username, and it is unclear why
Action Items
Understand why this didn’t trigger our server startup failure alert: 2i2c-org/infrastructure#6888
Timeline
Oct 2, 2025
| Time | Event |
|---|---|
| 8:00 AM | 2i2c |
Oct 3, 2025
| Time | Event |
|---|---|
| 7:00 AM | https:// |
| 9:11 AM | Acknowledged as an outage and created PagerDuty P1 incident. Description: UToronto: Users who have never logged in before can’t start servers |
| 9:15 AM | Checking hub logs (both the existing logs and jupyterhub.log on the persistent dir) for the username of the affected user turns up nothing. An issue with the login service is considered: the error is an ‘internal server error’, but it is unclear which service it is coming from. |
| 9:20 AM | An engineer is able to recreate the issue by deleting their own home directory and trying to start a server (details in 2i2c |
| 9:30 AM | 2i2c |