
2i2c Platform Roadmap

The table below shows the platform initiatives in our near-term roadmap. They are pulled from the 2i2c-org/initiatives repository. We invite you to comment on anything there! See the About the roadmap page for context on our workflow and funding model.

🚀 In-Flight Initiatives

Refactor and improve the jupyterbook.org documentation to make it more accessible and easier to maintain

As a Jupyter Book maintainer, I want to be able to update our community-facing documentation independently from our user-facing documentation. I want those updates to be immediately available to our community members without waiting for release cycles. I also want to know where to go to update different kinds of documentation.

As a Jupyter Book community member and user, I want an easy-to-remember place to refer to for important community documentation. For anything with a single source of truth (e.g., an events calendar), I don’t want to have multiple versions of the same page available because it’s tied to multiple “versions” of software in ReadTheDocs.

Allow verifying continual executability of notebook content on binder with CI/CD

As an educator in the scientific python space, I want to write and publish notebooks that demonstrate how to perform various tasks for people in my community. However, the ecosystem moves fast, and it is critical that these notebooks continue to function as things change, rather than only at time of writing. I would like to use a CI/CD system to make sure they keep working over the long term. Since we use GitHub for content authoring, I could use GitHub Actions to verify some of this. However, since we run things on binder, we would like to automatically verify that things work on Binder as well.
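
One conceivable starting point is BinderHub’s public build API, which streams build events for a repository; a CI job could fail if the stream never reports the ready phase. A minimal sketch, assuming the public mybinder.org endpoint and a placeholder repository (actually executing the notebooks after launch would be a further step):

```python
"""Hypothetical CI check: ask mybinder.org to build and launch a repo, and
fail if the stream never reports the 'ready' phase. The endpoint and event
format follow BinderHub's public build API; REPO and REF are placeholders."""
import json
import sys

import requests  # third-party: pip install requests

REPO = "example-org/example-notebooks"  # placeholder repository
REF = "main"

url = f"https://mybinder.org/build/gh/{REPO}/{REF}"
with requests.get(url, stream=True, timeout=600) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        # The endpoint is a server-sent event stream; payload lines look
        # like "data: {...json...}".
        if not line or not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):])
        phase = event.get("phase")
        print(phase, event.get("message", "").strip())
        if phase == "ready":
            sys.exit(0)  # image built and launched; CI step passes
        if phase == "failed":
            sys.exit(1)  # surface the failure to CI
```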

5 sub-issues
Run an experimental mybinder.org federation member on OVH

As the leader of an organization that benefits from the broader scientific python ecosystem, I would like to find ways to contribute back on behalf of my institution. I know we benefit from capacity and smooth operations on mybinder.org, and I would like to put some money towards sustaining that. However, handling procurement and getting my institution to pay out a few hundred dollars consistently every month comes with very high overhead and is not worth it. Since we have a contractual relationship with 2i2c, I would like to support mybinder.org through cloud spend via 2i2c.

As a volunteer member of the team running mybinder.org, I would like to run our infrastructure on a diverse set of cheap hosting providers. This gives us extra capacity but also resiliency, as a single provider kicking us out would not shut us down completely. Since we provide arbitrary code execution, we are prime targets for cryptominers and other forms of network abuse that hosting providers don’t like. Handling abuse reports from hosting providers is stressful, and if I knew that being kicked out (even temporarily) would not cause a full outage, I’d be a lot less stressed.

6 sub-issues
Implement a way to quota users based on their compute usage over time

As a JupyterHub admin who supports a broad swath of different kinds of users, I want ways to control costs by controlling how many resources (RAM, CPU) each user can use. I want to be able to make this determination based on the groups the user is in, and to have user-friendly ways for users to know how much they can use and how much they have used.
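
A quota like this could plausibly be computed from the per-user metrics a hub’s Prometheus instance already collects. Below is a minimal illustrative sketch, not 2i2c’s actual implementation; the Prometheus URL, the pod-name pattern, and the quota value are all assumptions:

```python
"""Illustrative sketch: sum a user's CPU-seconds over the past 30 days from
Prometheus and compare against a quota. The Prometheus URL, the pod-name
pattern for user servers, and the quota value are all assumptions."""
import requests  # third-party: pip install requests

PROMETHEUS_URL = "http://prometheus.example.org"  # placeholder
CPU_HOURS_QUOTA = 100  # hypothetical monthly allowance

def cpu_hours_used(username: str) -> float:
    # increase() over 30d of the cAdvisor CPU counter for this user's pods.
    query = (
        'sum(increase(container_cpu_usage_seconds_total'
        f'{{pod=~"jupyter-{username}.*"}}[30d]))'
    )
    resp = requests.get(
        f"{PROMETHEUS_URL}/api/v1/query", params={"query": query}, timeout=30
    )
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    seconds = float(results[0]["value"][1]) if results else 0.0
    return seconds / 3600

used = cpu_hours_used("somebody")
print(f"Used {used:.1f} of {CPU_HOURS_QUOTA} CPU-hours this month")
if used > CPU_HOURS_QUOTA:
    print("Over quota - server starts could be blocked here")
```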

6 sub-issues
Improve nbgitpuller error handling

As a student, I am expected to click links that are provided to me (through my LMS, course website, Slack, or other medium) that will launch a Jupyter notebook pre-populated with content related to my assignment or class. This mostly works fine, preserves any changes I make to my content, and I am happy! But in some rare cases, it does not work, and throws me a scary black error box with messages about git that I don’t really understand. Usually reaching out to my TAs can fix this, but it causes me stress and lost time.

As a TA, I often have to use the JupyterHub admin interface to run git commands to fix errors faced by some students when using nbgitpuller to distribute materials. I would much rather spend my time helping them learn, so more automatic ways to handle errors here would save me a lot of time.
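
For context, the links students click encode the repository, branch, and target path as query parameters on the hub’s git-pull endpoint. A minimal sketch of constructing one, with placeholder hub and repository URLs (the parameter names are nbgitpuller’s documented ones):

```python
"""Build an nbgitpuller link of the kind students click. The hub URL and
repository are placeholders; the query parameter names (repo, branch,
urlpath) are nbgitpuller's documented ones."""
from urllib.parse import urlencode

HUB_URL = "https://hub.example.org"  # placeholder hub
params = {
    "repo": "https://github.com/example-org/course-materials",  # placeholder
    "branch": "main",
    # Open a specific notebook in JupyterLab after the pull completes.
    "urlpath": "lab/tree/course-materials/week1/intro.ipynb",
}
link = f"{HUB_URL}/hub/user-redirect/git-pull?{urlencode(params)}"
print(link)
```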

9 sub-issues
Allow end users to view their storage usage

As an end user, I have a maximum amount of home directory storage I can use on a hub. However, there is no easy way for me to know how much I have used, so I run into my storage quota as a surprise.
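
What such a report would compute is simple; here is a minimal sketch with a hypothetical quota value (a production version would need to be much faster, e.g. by tracking usage incrementally):

```python
"""Minimal sketch of what a storage-usage report could compute: total bytes
under the user's home directory, compared against an assumed quota."""
import os

QUOTA_BYTES = 10 * 1024**3  # hypothetical 10 GiB home directory quota

def directory_size(path: str) -> int:
    # Walk the tree and sum file sizes, skipping symlinks.
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total

used = directory_size(os.path.expanduser("~"))
print(f"Using {used / 1024**3:.2f} GiB of {QUOTA_BYTES / 1024**3:.0f} GiB")
```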

📋 Upcoming Initiatives

Allow limiting access to dask-gateway based on group membership

As a JupyterHub admin, I want to control how much cloud spend my hub uses. One of the important components is how many resources people can use at a time. While JupyterHub lets me control that by restricting resource allocations to different groups, dask-gateway does not. So a user can, intentionally or accidentally, spin up a huge dask-gateway cluster and use all the cloud spend I have budgeted for a month in a few hours. As a first protective measure, I want to control who can access dask-gateway in my hub, so I can grant access only to people who have been vetted.
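
One possible shape for this, assuming dask-gateway is registered as a JupyterHub service and authenticates its users against the hub, is a JupyterHub RBAC role that grants the service-access scope only to a specific group (the group and service names below are placeholders):

```python
# jupyterhub_config.py - illustrative sketch only. Assumes dask-gateway is
# registered as a JupyterHub service named "dask-gateway" and uses the hub
# to authenticate its users; the group name is a placeholder.
c.JupyterHub.load_roles = [
    {
        "name": "dask-gateway-users",
        # Only members of this JupyterHub group receive the scope below.
        "groups": ["dask-users"],
        # JupyterHub RBAC scope granting access to the named service.
        "scopes": ["access:services!service=dask-gateway"],
    }
]
```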

Support canvas authentication

As an IT department at a large university, I would like to use Canvas as our single source of truth for all sorts of user management information (such as class enrollment, active status, etc.). Since our JupyterHub is provided exclusively for student use, I would like to use Canvas as the source of authentication and authorization information for our JupyterHub. I would like to pull in group information based on class enrollment and similar data, so I can drive the various features JupyterHub provides based on group membership (access control, cost attribution, usage monitoring, etc.).
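
Since Canvas speaks standard OAuth2, one plausible starting point is oauthenticator’s GenericOAuthenticator pointed at a Canvas instance’s OAuth2 endpoints. A sketch, where the Canvas host and credentials are placeholders and pulling course-enrollment groups would require additional custom logic:

```python
# jupyterhub_config.py - sketch of one possible approach, not a supported
# recipe. Uses oauthenticator's GenericOAuthenticator against a Canvas
# instance's OAuth2 endpoints; the host and credentials are placeholders.
from oauthenticator.generic import GenericOAuthenticator

CANVAS_URL = "https://canvas.example.edu"  # placeholder Canvas instance

c.JupyterHub.authenticator_class = GenericOAuthenticator
c.GenericOAuthenticator.authorize_url = f"{CANVAS_URL}/login/oauth2/auth"
c.GenericOAuthenticator.token_url = f"{CANVAS_URL}/login/oauth2/token"
c.GenericOAuthenticator.userdata_url = f"{CANVAS_URL}/api/v1/users/self"
c.GenericOAuthenticator.username_claim = "login_id"  # Canvas user field; adjust as needed
c.GenericOAuthenticator.client_id = "..."      # from a Canvas developer key
c.GenericOAuthenticator.client_secret = "..."  # from a Canvas developer key
```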

Provide opinionated guidance on how communities can structure groups from their authenticator

As a JupyterHub admin, I want to put users in different kinds of ‘groups’ for different purposes:

  1. For access to the hub

  2. For cost monitoring and reporting

  3. For access to different size of resources (more RAM, GPUs, etc)

  4. For admin access (to other users’ servers and home directories and shared directories)

  5. For potential collaboration

All of these are groups along different ‘axes’, and I don’t know how to structure groups from my authenticator to support these use cases. There are many different ways to do this, and I don’t have the guidance or expertise to figure out what the options are and how to choose among them.
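
One illustrative convention, not an official recommendation, is to encode the axis as a prefix in the group name, so a single flat list of groups stays unambiguous:

```python
# Illustrative naming convention only - all group names are hypothetical.
groups = [
    "access:hub",            # axis 1: who may log in at all
    "cost:geo-lab",          # axis 2: cost attribution for a team
    "resources:gpu-large",   # axis 3: who may request large GPU servers
    "admin:hub",             # axis 4: admin access to servers and storage
    "collab:flood-project",  # axis 5: shared project collaboration
]
```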

Support running hubs on my NRP allocation

As a JupyterHub admin, I want to leverage all possible kinds of cloud resources available to me to serve my users, rather than just commercial cloud providers (like AWS, GCP, etc.). I have a limited budget that must cover everything I do, and being able to reduce the cloud spend $$$ lets me put that money towards other things that add value. Due to my affiliation, I have access to an allocation of resources on the National Research Platform. I want something a little more customized than NRP’s hosted JupyterHub, and I don’t have the skillset (or time) to deploy my own JupyterHub there. I would like to be able to use NRP’s resources (particularly GPUs) to serve my users.

Move to Helm v4 for our infrastructure

We use helm to deploy the JupyterHub chart, z2jh. The helm project released v4 in September of 2025. We need to keep up with these releases in a structured and timely way, so we don’t have to scramble when the older version reaches end of life. We currently don’t know whether Zero to JupyterHub works with helm v4, so we cannot ‘simply’ upgrade.

4 sub-issues
Allow creating links to specific hub server options

As a user on the hub, I want to have someone else get into the exact same kind of server I am on (environment, resource allocation, content pulled, etc.) so we can work together with less accidental complexity caused by underlying server differences. Currently, to do so, I have to explicitly give them verbal instructions on what options to choose or type in on their ‘Start Server’ page (‘Select the JupyterLab instance, pick the 14.1 GB resource allocation, then click start’), which is error prone. It is particularly error prone when those instructions use features such as ‘Unlisted choice’ (‘Select Other..., and type in this exact image, and avoid spaces’) or ‘Build your own image’. I want an easier and more succinct way to share this that doesn’t involve verbal instructions.

Allow users to read / write from object storage like a filesystem

As an end user on the cloud, I have to use object storage (such as S3) for storing and accessing intermediate and final data products. While I use cloud-native methods to do most of the work, in some cases it is very helpful to be able to access cloud object storage as if it were a traditional filesystem (a sketch of one approach follows the list below):

  1. When dealing with smaller intermediate and final data products produced by other systems (like an external job queue)

  2. As a way to use existing data exploration tools (including the Jupyterlab file browser) that work best with traditional filesystems
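
A sketch of one user-space approach, using the third-party s3fs library; bucket and object names are placeholders, and making objects visible to the JupyterLab file browser (e.g. via a FUSE mount) would go further than this:

```python
"""Illustrative only: s3fs (built on fsspec) exposes S3 object storage
through a filesystem-like Python API. Bucket and key names are placeholders."""
import s3fs  # third-party: pip install s3fs

fs = s3fs.S3FileSystem()  # credentials are picked up from the environment

# List objects under a prefix as if it were a directory.
print(fs.ls("example-bucket/results/"))

# Read an object with ordinary file semantics.
with fs.open("example-bucket/results/summary.csv", "r") as f:
    print(f.read()[:200])

# Write one back the same way.
with fs.open("example-bucket/results/notes.txt", "w") as f:
    f.write("intermediate data product\n")
```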

6 sub-issues
Support running GPUs on Jetstream2

As a hub admin, I want to provide GPU access for my instructors & students so they can experiment with modern code that requires a GPU. However, GPUs in commercial cloud are expensive, and I want to balance how much $$$ I spend on the cloud. Due to my institutional affiliation, I have access to credits on Jetstream2, and I would like to run a hub with GPU access there.

Support running fixed cost hubs on Hetzner / OVH

As an instructor in a smaller or under-resourced college or university, I want to use JupyterHub to provide interactive instructional help for my students. However, we have a limited and fixed budget, and I must know beforehand how much it is going to cost each month. I am told that commercial cloud providers with autoscaling will have cloud spend ‘based on usage’, and the estimates I am given are beyond my current reach. My class sizes are reasonably small, and I would love to have a fixed cost pathway towards running a hub. This should also help me prove to my institution that this is a valuable service that we should pay for as an organization.

Support providing cost effective % of GPUs for students on a hub

As an instructor, I want to teach content that requires use of a not-very-powerful GPU some of the time. It’s integrated into my teaching syllabus, where there is a lot of CPU-bound work and some GPU-bound work. While I could bounce to a different tool that offers just GPUs (like Google Colab), that has its own set of compliance, pedagogy, and UX problems that detract from my teaching, and I don’t want my students to go through that.

As a student, I want to do all my course content in one familiar hub, where I have access to the same tools and a home directory that persists over time for any particular course, without bouncing between different tools. I also don’t want to have to understand the differences between various GPUs before I do my work, as I’m just doing light GPU work to understand fundamentals.

As a hub admin, I want to support my instructors using GPUs but without blowing through my budget entirely.

Support pulling content from public non-git sources with nbgitpuller

As an instructor, I want to easily distribute content to my students who are working on a JupyterHub. Other instructors use nbgitpuller with git to do so, and generally have a favorable experience, particularly with respect to merging content. However, I don’t use git or GitHub for anything, and I do not have time to learn and use it correctly for just this one purpose. It doesn’t fit with how I develop content. I would like to be able to use the same supported mechanisms, without having to learn to use git or GitHub. I don’t have an issue with making my content publicly available - just not with git.

Support pulling content from private non-git sources with nbgitpuller

As an instructor, I want to easily distribute content to my students who are working on a JupyterHub. Other instructors use nbgitpuller with git to do so, and generally have a favorable experience, particularly with respect to merging content. However, I don’t use git or GitHub for anything, and I do not have time to learn and use it correctly for just this one purpose. It doesn’t fit with how I develop content. I would like to be able to use the same supported mechanisms, without having to learn to use git or GitHub. I also don’t want my content to be public - I want it to be accessible only to the students who are part of the class. My students already have access to an authenticated place where they can get data from, and I want to use the same workflow to distribute my content.

Improve monitoring of GPU usage by users

As a JupyterHub admin, I want to provide GPU access to my users, but I want to know if they are using it efficiently. GPUs are expensive, and we pay for them when we provision them, regardless of how much they are used. So I want to be able to tell if users are heavily underutilizing them, so I can talk to them to provide more educational resources, or investigate whether their needs are already met by CPU-only tooling. I also want to tell if someone is maxing out their GPU, so I can offer them bigger resources if needed. This reporting also helps me justify my cloud spend budget, as I can point to end users utilizing the resources I provide.
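
As one illustration of where such data could come from: clusters that run NVIDIA’s DCGM exporter expose per-GPU utilization to Prometheus, which can be aggregated per pod. A sketch, with a placeholder Prometheus URL (not 2i2c’s actual monitoring setup):

```python
"""Sketch of pulling per-pod GPU utilization from Prometheus, assuming the
cluster runs NVIDIA's DCGM exporter (which exposes DCGM_FI_DEV_GPU_UTIL).
The Prometheus URL and the underutilization threshold are illustrative."""
import requests  # third-party: pip install requests

PROMETHEUS_URL = "http://prometheus.example.org"  # placeholder

# Average GPU utilization per user pod over the past 7 days.
query = 'avg by (pod) (avg_over_time(DCGM_FI_DEV_GPU_UTIL[7d]))'
resp = requests.get(
    f"{PROMETHEUS_URL}/api/v1/query", params={"query": query}, timeout=30
)
resp.raise_for_status()
for result in resp.json()["data"]["result"]:
    pod = result["metric"].get("pod", "unknown")
    utilization = float(result["value"][1])
    flag = "  <- underutilized?" if utilization < 5 else ""
    print(f"{pod}: {utilization:.1f}% average GPU utilization{flag}")
```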

Reduce the number of conflict resolution failures when using nbgitpuller to distribute content

As a student, I am expected to click links that are provided to me (through my LMS, course website, Slack, or other medium) that will launch a Jupyter notebook pre-populated with content related to my assignment or class. This mostly works fine, preserves any changes I make to my content, and I am happy! But in some rare cases, it does not work, and throws me a scary black error box with messages about git that I don’t really understand. Usually reaching out to my TAs can fix this, but it causes me stress and lost time.

As a TA, I often have to use the JupyterHub admin interface to run git commands to fix errors faced by some students when using nbgitpuller to distribute materials. I would much rather spend my time helping them learn, so more automatic ways to handle errors here would save me a lot of time.

Allow admins to browse all users' home directories via a UI

As a JupyterHub admin, I sometimes need to manually perform operations on a user’s home directory. Some examples are:

  1. They are no longer here, and we want to clean up their directory after sending them a copy to reduce our cloud spend

  2. They have run up against their storage limit, and their server won’t start due to the image we use. I want to go manually clean out some files so they can start their server

I want to be able to perform these rare operations without risk of accidentally destroying user data.

Allow archiving user home directories based on usage policies

As a hub admin, I have many users who are no longer using the hub (because they graduated, finished their projects, moved on to other infrastructure, etc.) but who still cost me money, because I am continuously storing their home directories and paying for it. I want to stop paying for those inactive users.
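
The policy side of this could be as simple as flagging directories whose newest file is older than some cutoff. A minimal sketch, where the homes path and the cutoff are illustrative and the actual archival step (e.g. copying to object storage) is out of scope:

```python
"""Sketch of one possible inactivity policy: flag home directories whose
most recent file modification is older than a cutoff. The homes path and
cutoff are illustrative; actual archiving is out of scope."""
import os
import time

HOMES = "/home"        # placeholder: root of user home directories
CUTOFF_DAYS = 180      # hypothetical inactivity threshold

cutoff = time.time() - CUTOFF_DAYS * 86400
for user in sorted(os.listdir(HOMES)):
    home = os.path.join(HOMES, user)
    newest = 0.0
    for root, _dirs, files in os.walk(home):
        for name in files:
            try:
                newest = max(newest, os.path.getmtime(os.path.join(root, name)))
            except OSError:
                pass  # file vanished mid-walk; ignore
    if newest and newest < cutoff:
        print(f"{user}: inactive since {time.ctime(newest)} - archival candidate")
```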

Provide per-hub and per-component cloud cost reporting on GCP

As a hub admin, I want to understand how much each hub my community runs costs me in cloud spend, so I can better advocate for their ongoing funding. I also want to understand how much each component (compute, storage, etc.) costs, so I can intelligently discuss usage with my funders and users, as well as make informed choices about quotas and resource allocations.

This is already possible on AWS, but not on GCP. Since my hubs run on GCP, I would like to be able to use this feature as well.

Provide per-user and per-group cloud cost reporting on GCP

As a JupyterHub admin, I am responsible for paying the cloud costs incurred by my hub. The policies I set and the information I provide my users can drastically alter how much money I have to spend. To understand how to best serve my users while staying within my budget, I want to know roughly how much cloud cost each user is responsible for. This allows me to reach out to them if necessary, as well as report to whoever is funding me on who is using that money and for what.

Since my hub may serve many distinct groups of users, I also want to have reports of cloud spend by the groups a user belongs to, so I can talk to the people responsible for those groups directly if needed, as well as justify my budget as the cloud cost is spent in service of the goals and accomplishments of these users and groups.

By having this information, I am better able to both:

  1. Nudge my users into better practices, through training and guidance

  2. Draw a direct line from the achievements of my users using the hub to the cloud cost I spend on them

This feature is already available for hub admins on AWS, but since my hubs are on GCP, I would like this feature too.

Support real time collaboration between users on JupyterLab

As a researcher, I want to collaborate with another user (or users) on my hub by temporarily granting them real-time access to the notebook I’m working on in a secure way.

As a student, I want to temporarily grant real time access to my notebook to a TA so they can help me with some conceptual or coding problems I am having.

Educate users on using the dynamic image building feature to make image management easier

As a JupyterHub admin, I want to support my end users needing their own software environments. However, many of them don’t know how to handle docker and image management, and I don’t want to spend all my time managing environments for them. The “Build your own image” feature on 2i2c JupyterHubs solves this problem very well, but there isn’t enough end-user documentation for me to point to. I would love a single location I can point them to that guides them through using that feature and explains why they would want to.

As a researcher, I want to have a consistent image with all the packages I need that will work over the course of my work, without being stuck with just the images my admin has made available for me. However, I don’t know enough about docker to set that up, and I don’t want to. My JupyterHub admin told me I could use the ‘build your own image’ feature, which ‘works just like mybinder.org’. But I don’t know what any of that means. I would like a series of tutorials and how-tos on how to set up and use images this way.

Allow users to create shared folders with access control via a UI

As a user, I want to collaborate with other users on my hub on specific projects, via a shared directory that my collaborators have access to. This gives me a quicker and more convenient way to share work than pushing to an external git repository and having them pull it.

As a student, I am working on a group project with a few other students. I want to work together in a shared project directory, so we can minimize git overhead (which we are not yet comfortable with) on the same hub.

Allow sharing my work selectively with non-hub users

As an end user, I’ve prepared content on the hub that I want to showcase interactively to people who don’t necessarily have access to the hub (specific decision makers who aren’t day-to-day users of the hub, the broad public, or just a collaborator from a different org). I want them to be able to simply click a link I share and have the experience I want them to have. I want this to be ephemeral, so they can come back multiple times and have the same experience from start to end, rather than one polluted by previous times they have clicked the link.

As a JupyterHub admin responsible for cloud spend, I don’t want to spend an uncontrolled amount of money for people who are not my core users to access compute. I’m ok with having a specific amount of resources set aside for my users to share work, as long as it’s controlled and not open to the world. I would also like to have reports on what is being shared this way so I can justify the cloud spend.

Allow admins to configure 'Start Server' page (profile list) via web UI

As a JupyterHub admin, I want to have control over what software environments and resource allocation options are available to my users when they try to start a server. Currently, I can interact with 2i2c support or make PRs myself, but this is cumbersome and, due to timezone differences, can sometimes take days. This makes experimenting really difficult for me, as I often need several back-and-forths before we can make changes. Plus, it is hard for me to know exactly how the changes will look before they are applied by an engineer. So I have to be very conservative about what I ask for, and I don’t know what all the options are.