Improved Error Handling in nbgitpuller

Background¶

nbgitpuller is a Jupyter Server extension that exposes a mechanism for synchronising remote content with the server’s local file-system. In the wild, its primary application lies in connecting JupyterHub users with hub-adjacent content through a simple distributable, user-friendly interface (URLs). By virtue of pulling remote content within an individual user’s server, it is often used to facilitate the separation of content from compute-environment definitions in contexts like JupyterHub and BinderHub.

There are two distinct personas that use nbgitpuller:

Link-author: People creating content that can be shared via an nbgitpuller link.
Link-consumer: People that use an nbgitpuller link to access shared content.

Between fetching remote content and merging conflicts with local edits, there are many ways in which nbgitpuller users can encounter errors during normal operation. Fixing these errors is neither the responsibility of link-author nor link-consumers. Instead, there is a third persona:

nbgitpuller expert: People with the technical expertise to debug problems encountered during nbgitpuller usage. It is an established nbgitpuller devlopment goal that this role dissapears in the future.

The existing UX for error handling confuses the persona of ngitpuller-experts with that of the link-consumer and link-author personas. As such, it leaves room for improvement, such as through the addition of error recovery mechanisms, or designing error responses that consider the needs of the link-consumer and link-author personas in addition to nbgitpuller-expert.

Technical details¶

nbgitpuller operates as a Jupyter Server extension that exposes a number of request handlers:

GET /git-pull/api — an API service endpoint
GET /git-pull/ — a user-facing UI for triggering and following a git pull operation.

The UI served at /git-pull/ communicates with the API backend from the front-end using server-sent-events.

When used alongside a JupyterHub, there is a strong separation of concerns between provisioning of the compute environment (JupyterHub and e.g. KubeSpawner) and provisioning of the file-system (nbgitpuller). Using the /hub/user-redirect/ endpoint, content authors can craft user-agnostic URLs that invoke the nbgitpuller service.

The nbgitpuller URL handler (e.g. GET /git-pull?repo=...) implements several operations to fulfil a request:

Remote content is fetched from a Git repository scoped to a specific branch (fetch).
Fetched content is merged with the local file-system, resolving any conflicts in an opinionated manner to minimise user-input (merge).
Redirect user to given URL path once (1) and (2) have been completed (open).

Deliverables¶

Add access to a Jupyter frontend following an `nbgitpuller` error¶

Overview¶

For some users, nbgitpuller links are the only way that they are familiar with to access a deployed JupyterHub. At present, when such a user encounters an error after following an nbgitpuller link, e.g. because the link is malformed, they find themselves without any navigation links or buttons that will take them to the “preferred”^[1] frontend e.g. JupyterLab. These users need a way to access the preferred frontend application without modifying the URL bar of their browser or otherwise navigating to the JupyterHub by themselves.

We will extend the existing error response of the nbgitpuller web UI to provide a means of accessing the preferred frontend, such as through the addition of a clickable link or button. An automatic redirect should not be used, as it hinders the ability of the link user to capture debugging information for the link author when the link fails.

Definition of done¶

Users can navigate through to the preferred frontend from any nbgitpuller error response.

Estimates¶

Task	Lower Estimate	Upper Estimate
Build routine to identify “preferred” UI application	2h	3h
Design and implement UI	1h	3h
Open pull-request and shepherd through to merge	2h	4h
Additional learning and refinement	1h	3h
Total	6h	13h

Redesign the error handling response for link-consumers¶

Overview¶

The existing error-handling response for nbgitpuller is a thin abstraction which exposes many of the error details to the user. In practice, many users may not be familiar with Git, and/or may have limited ability to interpret the error messages. When designing for the link-consumer persona, we should prioritise simple, readable error messages that provide sufficient scope for the link-author (e.g. Teaching Assistants, Lecturers). Crucially, it should be possible for the majority of nbgitpuller errors to be understood without the use of the existing console window to read error log output. A convenient way to share the log outputs should be added, such as a Copy to clipboard button.

Fundamental changes to the technology stack, such as introducing a new UI framework, are NOT in scope.

Definition of done¶

Users can encounter errors during nbgitpuller that provide a clear indication that an error occurred without the use of a console window.
Link authors can recover useful debugging information from screenshots and/or verbal descriptions of the error page.

Estimates¶

Task	Lower Estimate	Upper Estimate
Design and implement UI	5h	8h
Open pull-request and shepherd through to merge	2h	4h
Additional learning and refinement	1h	3h
Total	8h	15h

Identify common `nbgitpuller` errors¶

Overview¶

Within the space of possible errors that can occur during typical usage of nbgitpuller, there are several common classes, such as invalid links, renamed / deleted files, etc. Through inspection of logs from existing (large) nbgitpuller deployments, we will determine which nbgitpuller invocations failed, and the mechanism by which they failed (normalised by nbgitpuller URL). By analysing the resulting set of events, we will identify the most frequent failure modes normalised by link.

Definition of done¶

An array of structured nbgitpuller events has been generated from existing large JupyterHub deployments logs.
A set of common error types has been established from analysis of nbgitpuller event information.

Estimates¶

Task	Lower Estimate	Upper Estimate
Liaise with appropriate personas associated with existing JupyterHub deployments	4h	11h
Generate structured events from raw logs	3h	7h
Analyse nbgitpuller events to identify common error types	2h	4h
Open pull-request and shepherd through to merge	2h	4h
Additional learning and refinement	1h	3h
Total	12h	29h

Design and integrate dedicated error handlers¶

Overview¶

For the set of common error classes identified in the previous deliverable, we will design up to three bespoke responses that clearly articulate what went wrong to link users that encounter each error. Although it will be the responsibility of the link author to resolve these problems, improving the error message will help guide the user to useful documentation and/or provide more context for the link author when the error is reported.

The primary objective of this deliverable is to reduce the requirement for link authors to draw conclusions from inline console tracebacks. The approach taken in this work should naturally extend to alternative content providers, should they be added in future.

Once each error class has a dedicated response, nbgitpuller will be extended to return these responses when it identifies a particular error class has been encountered.

Definition of done¶

Users encountering one of three commonly-encountered errors are presented with a specialised error handler that provides useful context.

Estimates¶

Task	Lower Estimate	Upper Estimate
Build error-handling routines to process and identify common failure modes	3h	7h
Design and implement UI	7h	12h
Update nbgitpuller documentation	1h	2h
Open pull-request and shepherd through to merge	2h	4h
Additional learning and refinement	1h	3h
Total	14h	28h

Additional overheads¶

In addition to per-deliverable work, there is up-front work that may be paid by each developer:

Task	Lower Estimate	Upper Estimate
Become familiar with nbgitpuller architecture	2h	4h
Set up development environment	1h	2h
Total	3h	6h

We will assume that two separate developers incur this cost.

Relevant GitHub Issues¶

Listed below are pertinent GitHub Issues open in the jupyerhub/nbgitpuller repository:

People working on this¶

This project would require capacity from:

App Engineer (1 implementation, 1 review)

Timeline¶

Footnotes¶

Where “preferred” refers to either the pre-determined singleuser endpoint, or the application indicated in the urlPath query.
↩

Improved Error Handling in nbgitpuller

Background¶

Technical details¶

Deliverables¶

Add access to a Jupyter frontend following an nbgitpuller error¶

Overview¶

Definition of done¶

Estimates¶

Redesign the error handling response for link-consumers¶

Overview¶

Definition of done¶

Estimates¶

Identify common nbgitpuller errors¶

Overview¶

Definition of done¶

Estimates¶

Design and integrate dedicated error handlers¶

Overview¶

Definition of done¶

Estimates¶

Additional overheads¶

Relevant GitHub Issues¶

People working on this¶

Timeline¶

Add access to a Jupyter frontend following an `nbgitpuller` error¶

Identify common `nbgitpuller` errors¶