
Candidate Initiatives

These are initiatives that we are considering for future work. As we complete initiatives on our roadmap, we pull from this list for what to do next.

Allow users to publish interactive content served by JupyterHub

As a data scientist, I often want a way to share my analyses and reports in a form that others can quickly interact with. I must share these with stakeholders who are non-technical, or who don’t have the time to fire up their own kernels. I want these reports to be very fast to access, and powered by the same environments that power my interactive kernels. I’d like them to be persistently available to view, and ideally persistently executable as well.

Currently, there is no way to easily share a viewable version of my analysis on the same hub, much less an interactive one. I’d have to publish to GitHub and send a link to that, or use a third-party service like nbviewer, notebooksharing.space, jupyterbook.pub, etc.

In short, I want a fast way to:

  • Create a report that includes interactive outputs

  • Make that report available via a URL served from the hub, either publicly or requiring authenticated access

  • Have that report powered by computation on the hub

GitHub Initiative »

JupyterLite for 2i2c Communities

As an instructor, I want to share interactive content with learners without needing to keep or manage storage for these users. I am prepared to accept potentially incompatible libraries and restrictions on loading data from external websites in exchange for not requiring any cloud-managed compute. My learners have modest memory and compute needs for executing the content.

GitHub Initiative »

Performance tune and test JupyterHub for ~10k active concurrent users

As a JupyterHub admin, I want high levels of confidence that my infrastructure can handle large numbers of users (~10k active concurrent users, with 100k total users). I would like to be able to test my configuration by simulating that many users, and tweaking my configuration until I’m satisfied with the overall performance of my infrastructure. This is particularly helpful when I’m trying to run a MOOC, as I expect many users to be accessing my platform simultaneously.

GitHub Initiative »

Securely autograde student notebooks built with otter-grader

As an instructor making educational content, I want to use a community-supported way to do automatic grading of students’ notebooks. I want this to provide them with instant feedback as they work through the content, as well as a secure way to automatically grade their end results and post the grades back to the LMS (Learning Management System) I use (like Canvas).

As a student, I want a simple button I can click in my interface to submit my notebook for grading, see the grading progress, and have the score submitted to my LMS where I can see it.

GitHub Initiative »

Support customizing user resources and environments when launching from an LMS (like Canvas)

As an instructor, I want to specify what environment (packages, etc) and resources (RAM, GPU, CPU, etc) my students launch into based on what course they are launching from, as well as their role (TA, student, instructor, etc). This allows students to have customized experiences that are specific to the exercise I want them to do, without them having to understand accidental complexities like images, resource selection, etc. It also allows me as an instructor or my TAs to have higher resource limits, so I can experiment with course content authoring more easily.

As a JupyterHub administrator, I want to restrict the resource profiles (RAM, CPU, GPU, etc) my instructors can have access to, as a way to control overall cloud spend.

GitHub Initiative »

Build a friendly interface to allow instructors to create nbgitpuller links from within an LMS (like Canvas)

As an instructor, I want to create assignments within my university’s LMS (like Canvas) that allow my users to land in interactive content (like notebooks I authored) as part of their learning experience. The most common way I distribute content on JupyterHub is to put my content in a git repository (like GitHub) and use nbgitpuller to distribute it to my users. In each repository, I may have different labs, homeworks and course sections that put my students into different notebooks. To create links that point to the correct content, I have to use the nbgitpuller link generator, understand how the ‘Launch from Canvas’ option works, and copy a long URL over. This is more complex and more error-prone than other tools I can use from within my LMS. I would like a simpler workflow.
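The link-construction step described above can be sketched as a small helper. The `/hub/user-redirect/git-pull` endpoint and the `repo`/`urlpath`/`branch` query parameters follow the nbgitpuller URL scheme; the concrete hub and repository below are hypothetical examples.

```python
from urllib.parse import urlencode

def nbgitpuller_link(hub_url, repo, branch, path):
    """Build an nbgitpuller launch link for a JupyterHub.

    A sketch, assuming the standard git-pull query parameters;
    the hub and repository used below are hypothetical.
    """
    # urlpath tells the hub which file to open after the content is pulled
    repo_name = repo.rstrip("/").split("/")[-1]
    urlpath = f"tree/{repo_name}/{path}"
    query = urlencode({"repo": repo, "urlpath": urlpath, "branch": branch})
    return f"{hub_url}/hub/user-redirect/git-pull?{query}"

link = nbgitpuller_link(
    "https://hub.example.edu",
    "https://github.com/example-course/materials",
    "main",
    "lab1/intro.ipynb",
)
print(link)
```

A friendlier LMS-side interface would essentially wrap this construction in a form, so instructors never see the encoded URL.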

GitHub Initiative »

Support recording discrete analytics events about user actions

As a JupyterHub admin for an educational institution, I want a record of various actions (such as starting a server, opening a notebook, etc) performed by my students, to help me better understand their educational performance. I would like these events to be ingestible into our existing analytics systems, in a well-structured and documented format that we can tie into. Because I care about their privacy and the usefulness of my data, I don’t want to indiscriminately collect ‘everything’ - only an explicit set of things that I’m interested in and can disclose that I am collecting.
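The "explicit set of things" requirement could look like the following minimal sketch: an allowlist of event types, with everything else silently dropped. The event names, schema identifier, and field layout here are hypothetical, not an existing JupyterHub format.

```python
import json
from datetime import datetime, timezone

# Hypothetical allowlist: only these event types are ever recorded,
# matching the "explicit set of things" requirement above.
ALLOWED_EVENTS = {"server_start", "notebook_open"}

def record_event(event_type, username, **fields):
    """Return a structured JSON event record, or None if not allowlisted."""
    if event_type not in ALLOWED_EVENTS:
        return None  # indiscriminate collection is deliberately impossible
    return json.dumps({
        "schema": f"hub.example.edu/{event_type}/v1",  # hypothetical schema id
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event_type,
        "user": username,
        **fields,
    })

print(record_event("notebook_open", "student42", path="lab1/intro.ipynb"))
print(record_event("keystroke", "student42"))  # not allowlisted -> None
```

Because each record names its schema, a downstream analytics system can validate and document exactly what is being collected.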

GitHub Initiative »

Build governance for a `jupyterhub-contrib` organization

As a maintainer of JupyterHub, I want to provide users of JupyterHub with a lot of ancillary projects (like authenticators, mixins, spawners, etc) that make their use of JupyterHub better. However, I don’t want to then sign up the (limited) JupyterHub maintainer team to maintain an indefinite array of new things. I want to find a balanced way to indicate to end users ‘hey, this project seems to follow good standards and has a decent chance of being maintained’ without taking on the full responsibility of actually maintaining these projects forever. I would like this approach to also provide some social capital for projects: a marker to incentivize good governance & technical standards and attract multi-stakeholder maintainership.

GitHub Initiative »

Refactor how repositories are fetched in repo2docker and binderhub

As a user of various binderhub installations (both on mybinder.org and for dynamic image building in repos), I find it difficult to specify what repositories to fetch content from. I would like to just paste a URL of my repository, and have the software ‘figure it out’. Instead, I have to explicitly understand what kind of repository I’m trying to fetch, and enter that appropriately in the UI.

As a maintainer of repo2docker and binderhub, we have a lot of repeated code in both projects to support different repository providers (like git, zenodo, etc). Adding support for a new repository provider requires PRs to both projects, which implement things similarly but slightly differently. This leads to wasted effort, difficulty landing new features (such as the automatic repo detection in the ‘Awesome bar / landing page redesign’ initiative), and increased maintainer load when working with contributors. For example, both mercurial and swhid support were added to repo2docker by contributors, yet they never made it to mybinder.org because we did not have capacity to review the equivalent PRs to binderhub. I also constantly notice that new projects would benefit from this repo-fetching functionality (like binderlite - a binderhub for jupyterlite, or jupyterbook.pub - a binderhub for jupyterbook rendering) - but would have to reinvent it. I’d like to refactor repo2docker and binderhub to solve this problem.

GitHub Initiative »

Generalise cost monitoring system configuration

As a hub admin, I want a cost monitoring system that is flexible and easily configurable for my specific deployment scenario. For example:

  • I am running a cluster on GCP and want to set configs for #7 and #9

  • I have cloud costs managed by CloudBank and I want to customise resource tags that conform to CloudBank’s resource tagging schema

GitHub Initiative »

Support embedding interactive notebooks in LMSes (like Canvas)

As an instructor, I construct my courses in the LMS (like Canvas) my institution provides. As part of a course, I would like my students to have access to an interactive notebook either embedded within the LMS, or available at the click of a button (with no other intermediate steps). This provides a seamless experience for students whenever they need to work with interactive content, without having to use an entirely different authentication flow.

GitHub Initiative »

Reduce the number of conflict resolution failures when using nbgitpuller to distribute content

As a student, I am expected to click links provided to me (through my LMS, course website, Slack or other medium) that launch a Jupyter notebook pre-populated with content related to my assignment or class. This mostly works fine, preserves any changes I make to my content, and I am happy! But in some rare cases, it does not work, and throws me a scary black error box with messages about git that I don’t really understand. Usually reaching out to my TAs can fix this, but it causes me stress and lost time.

As a TA, I often have to use the JupyterHub admin interface to run git commands to fix errors faced by some students when using nbgitpuller to distribute materials. I would very much rather spend my time on helping them learn, so more automatic ways to handle errors here would save me a lot of time.

GitHub Initiative »

Support pulling content from private non-git sources with nbgitpuller

As an instructor, I want to easily distribute content to my students who are working on a JupyterHub. Other instructors use nbgitpuller with git to do so, and generally have a favorable experience, particularly with respect to merging content. However, I don’t use git or github for anything, and I do not have time to learn and use it correctly for just this one purpose. It doesn’t fit with how I develop content. I would like to be able to use the same supported mechanisms, without having to learn to use git or github. I also don’t want my content to be public - I want it only to be accessible to the students who are part of the class. My students already have access to an authenticated place where they can get data from, and I want to use the same workflow to distribute my content.

GitHub Initiative »

Support pulling content from public non-git sources with nbgitpuller

As an instructor, I want to easily distribute content to my students who are working on a JupyterHub. Other instructors use nbgitpuller with git to do so, and generally have a favorable experience, particularly with respect to merging content. However, I don’t use git or github for anything, and I do not have time to learn and use it correctly for just this one purpose. It doesn’t fit with how I develop content. I would like to be able to use the same supported mechanisms, without having to learn to use git or github. I don’t have an issue with making my content publicly available - just not with git.

GitHub Initiative »

Allow sharing my work selectively with non-hub users

As an end user, I’ve prepared content on the hub that I want to showcase interactively to people (specific decision makers who aren’t day to day users of the hub, the broad public, just a collaborator from a different org) who don’t necessarily have access to the hub. I want them to be able to simply click a link I share and have the experience I want them to have. I want this to be ephemeral, so they can come back multiple times and have the same experience from start to end, rather than polluted by previous times they have clicked this link.

As a JupyterHub admin responsible for cloud spend, I don’t want to spend an uncontrolled amount of money for people who are not my core users to access compute. I’m ok with having a specific amount of resources set aside for my users to share work, as long as it’s controlled and not open to the world. I would also like to have reports on what is being shared this way so I can justify the cloud spend.

GitHub Initiative »

Allow users to create shared folders with access control via a UI

As a user, I want to collaborate with other users on my hub on specific projects, via a shared directory that the users I collaborate with can access. This gives me a quicker and more convenient way to share work than pushing to an external git repository and having collaborators pull it.

As a student, I am working on a group project with a few other students. I want to work together in a shared project directory, so we can minimize git overhead (which we are not yet comfortable with) on the same hub.

GitHub Initiative »

Allow admins to configure 'Start Server' page (profile list) via web UI

As a JupyterHub admin, I want to have control over what software environments and resource allocation options are available to my users when they try to start a server. I currently can interact with 2i2c support or make PRs myself, but this is cumbersome and, due to timezone differences, can sometimes take days. This makes experimenting really difficult for me, as I often need several rounds of back-and-forth before we can make changes. Plus it is hard for me to know exactly how the changes will look without them being applied by an engineer. So I have to be very conservative about what I ask for, and I don’t know what all the options are.

GitHub Initiative »

Provide per-user and per-group cloud cost reporting on GCP

As a jupyterhub admin, I am responsible for paying the cloud cost incurred by my hub. The policies I set and the information I provide my users can drastically alter how much money I have to spend. To understand how to best serve my users while staying within my budget, I want to know how much cloud cost each user is roughly responsible for. This allows me to reach out to them if necessary, as well as make reports to whoever is giving me money on who is using that money for what.

Since my hub may serve many distinct groups of users, I also want to have reports of cloud spend by the groups a user belongs to, so I can talk to the people responsible for those groups directly if needed, as well as justify my budget as the cloud cost is spent in service of the goals and accomplishments of these users and groups.

By having this information, I am better able to both:

  1. Nudge my users into better practices, through training and guidance

  2. Draw a direct line from the achievements of my users using the hub to the cloud cost I spend on them

This feature is already available for hub admins on AWS, but since my hubs are on GCP, I would like this feature too.
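The rollups described above amount to grouping billing records by user and by group. As a minimal sketch (the usage records and dollar amounts below are invented illustrations, not real billing data):

```python
from collections import defaultdict

# Hypothetical usage records: (user, group, component, cost_usd), e.g. as
# exported from the cloud provider's billing data with per-user labels.
records = [
    ("alice", "geo-lab", "compute", 12.40),
    ("alice", "geo-lab", "storage", 1.10),
    ("bob",   "bio-lab", "compute", 7.25),
    ("carol", "geo-lab", "compute", 3.00),
]

def cost_by(records, key_index):
    """Sum costs grouped by one field of each record (user=0, group=1)."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[key_index]] += rec[3]
    return dict(totals)

per_user = cost_by(records, 0)   # e.g. how much alice is responsible for
per_group = cost_by(records, 1)  # e.g. how much geo-lab spends overall
print(per_user)
print(per_group)
```

The hard part in practice is not this aggregation but attributing raw cloud line items to users and groups in the first place, which is why provider-specific (AWS vs GCP) work is needed.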

GitHub Initiative »

Provide per-hub and per-component cloud cost reporting on GCP

As a hub admin, I want to understand how much each hub my community runs costs me in cloud spend, so I can better advocate for their ongoing funding. I also want to understand how much each component (compute, storage, etc) costs so I can intelligently discuss usage with my funders and users, as well as make informed choices about quotas and resource allocations.

This is currently already possible on AWS, but not so on GCP. Since my hubs run on GCP, I would like to be able to use this feature as well.

GitHub Initiative »

Allow archiving user home directories based on usage policies

As a hub admin, I have many users who are no longer using the hub (because they graduated, finished their projects, moved on to other infrastructure, etc) but still cost me money because I am continuously storing their home directories and paying for it. I want to not pay for those inactive users anymore.
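A usage policy like this reduces to a selection rule over last-activity timestamps. The following sketch assumes a hypothetical policy (archive after a year of inactivity) and invented usernames; a real implementation would then move the selected directories to cheaper archival storage.

```python
from datetime import datetime, timedelta, timezone

def select_for_archival(last_active, now=None, max_idle_days=365):
    """Given {username: last_active_datetime}, return users whose home
    directories should be archived under an idle-time policy (a sketch;
    the 365-day threshold is an assumed example policy)."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(u for u, ts in last_active.items() if ts < cutoff)

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
users = {
    "grad2019":    datetime(2022, 5, 1, tzinfo=timezone.utc),   # long gone
    "active-user": datetime(2024, 5, 20, tzinfo=timezone.utc),  # still here
}
print(select_for_archival(users, now=now))  # ['grad2019']
```

Keeping the policy as pure selection logic makes it easy to dry-run against real activity data before deleting or moving anything.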

GitHub Initiative »

Allow users to read / write from object storage like a filesystem

As an end user on the cloud, I have to use object storage (such as S3) for storing and accessing intermediate and final data products. While I use cloud native methods to do most of the work, in some cases it is very helpful to be able to access cloud object storage as if it was a traditional filesystem:

  1. When dealing with smaller intermediate and final data products produced by other systems (like an external job queue)

  2. As a way to use existing data exploration tools (including the JupyterLab file browser) that work best with traditional filesystems
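The core idea - treating flat object-store keys as paths - can be shown with a toy in-memory store standing in for a real bucket; actual tools do this against S3 and friends via FUSE mounts or fsspec-style libraries. Everything below is an illustrative sketch, not a real storage client.

```python
# Toy sketch: a flat key -> bytes object store (as in S3), wrapped in a
# filesystem-like interface where "/" in keys acts like directory structure.
class ObjectStoreFS:
    def __init__(self):
        self._objects = {}  # key -> bytes; stand-in for a real bucket

    def write(self, path, data: bytes):
        self._objects[path.lstrip("/")] = data

    def read(self, path) -> bytes:
        return self._objects[path.lstrip("/")]

    def listdir(self, prefix):
        """List immediate children under a 'directory' prefix."""
        prefix = prefix.strip("/") + "/" if prefix.strip("/") else ""
        children = {
            key[len(prefix):].split("/")[0]
            for key in self._objects
            if key.startswith(prefix)
        }
        return sorted(children)

fs = ObjectStoreFS()
fs.write("results/run1/metrics.json", b"{}")
fs.write("results/run2/metrics.json", b"{}")
print(fs.listdir("results"))  # ['run1', 'run2']
```

The `listdir` prefix scan is exactly why directory listings over large buckets can be slow, and why filesystem-style access suits the smaller intermediate products mentioned above.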

GitHub Initiative »