Collaborating with Development Seed to deliver cyberinfrastructure for NASA VEDA
Thank you to Sajjad Anwar and Sanjay Bhangar for contributing to this post.
The 2i2c team are proud to continue our strong working collaboration with Development Seed, following our previous work on launching the US GHG Center (also see the Development Seed blog post). Together with scientists at NASA, whom we meet in regular syncs, we have recently delivered a tranche of improvements to the Visualization, Exploration and Data Analysis (VEDA) project.
This platform threads open-source components together to consolidate GIS delivery mechanisms, processing, analysis and visualization tools, and presents them in a collaborative, interactive computing environment. All code repositories and associated resources stemming from this work are available on the VEDA GitHub page.
In the spirit of fully open development, you can see the objectives the combined 2i2c and Development Seed team had for the last quarter. In this blog post, we will describe some of the significant ones!
Better image management and testing #
The repo2docker-action is a GitHub action simplifying image building and testing for use with JupyterHub, using either a Dockerfile or various configuration files (like requirements.txt, environment.yml, etc.) supported by repo2docker. We migrated our image building pipeline from a somewhat homegrown solution to this upstream action, making image updates and testing much easier. In particular, we can automatically run test notebooks on every change we make to the image! This way, we can easily catch any breaking changes in library versions or other package installs without disrupting users. We also debugged and contributed upstream fixes to the testing infrastructure so everyone could benefit from this, rather than just us.
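To make the idea of running test notebooks on every image change a bit more concrete, here is a minimal Python sketch. It is not how repo2docker-action implements its tests. It assumes the `jupyter nbconvert` command is available inside the built image, and the function names and the directory layout are hypothetical, chosen only for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def find_test_notebooks(test_dir):
    """Return every notebook under test_dir, sorted, skipping checkpoint copies."""
    return sorted(
        p for p in Path(test_dir).rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )


def execute_command(notebook, output_dir):
    """Build the nbconvert command that executes one notebook top to bottom."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--output-dir", str(output_dir),
        str(notebook),
    ]


def run_notebook_tests(test_dir, output_dir=None):
    """Execute each notebook and return the list of notebooks that failed."""
    if output_dir is None:
        # Executed copies are throwaway, so write them to a temporary directory.
        output_dir = tempfile.mkdtemp()
    failures = []
    for notebook in find_test_notebooks(test_dir):
        result = subprocess.run(execute_command(notebook, output_dir))
        # nbconvert exits non-zero when a cell raises an error.
        if result.returncode != 0:
            failures.append(notebook)
    return failures
```

A CI job could call `run_notebook_tests` on a directory of test notebooks (the location is up to you) and fail the build whenever the returned list is non-empty, which is the kind of early warning described above.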
Automatically pulling example notebooks on startup #
When a user logs into a JupyterHub, it is very helpful to have a set of example notebooks and other content pre-populated for them so they can get started right away. nbgitpuller is heavily used for this particular use case. However, it requires that nbgitpuller is installed inside the image the user is using, and not all images have it installed. In particular, we wanted to continue using the (wonderful) Rocker images maintained upstream for R users, but they do not have nbgitpuller installed. To solve this problem we built jupyterhub-gitpuller-init, which can be used as an init container to pre-populate user content on persistent home directories regardless of the image used. We also made sure to build this in a way that anyone can use it, and it is not tied into either 2i2c or VEDA infrastructure!
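The core idea of pre-populating a home directory before the user's server starts can be sketched as "clone on first login, update on later logins". The Python below is an illustration only, not the actual implementation of jupyterhub-gitpuller-init. The function names are hypothetical, and unlike nbgitpuller this sketch makes no attempt to merge upstream changes with a user's local edits.

```python
import subprocess
from pathlib import Path


def sync_command(repo_url, branch, target_dir):
    """Return the git command that brings target_dir up to date with repo_url."""
    target = Path(target_dir)
    if (target / ".git").is_dir():
        # Content already exists on the persistent home directory: update it.
        # --ff-only makes the pull fail instead of merging if history diverged.
        return ["git", "-C", str(target), "pull", "--ff-only", "origin", branch]
    # First login: nothing is there yet, so clone the requested branch.
    return ["git", "clone", "--branch", branch, repo_url, str(target)]


def populate_home(repo_url, branch, target_dir):
    """Run the sync command; raises CalledProcessError if git fails."""
    subprocess.run(sync_command(repo_url, branch, target_dir), check=True)
```

Because logic like this runs in a separate init container, git only needs to exist in that container's image, not in the image the user selected.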
Opening specific visualizations in QGIS via URL #
QGIS is the world’s most used open source GIS software, and 2i2c had previously worked with Openscapes and QGreenland to bring this desktop software to JupyterHub. That work produced a container image that lets users access large datasets stored in the cloud directly through QGIS on the JupyterHub, so they can work with much larger datasets than they could on their desktops by bringing cloud compute adjacent to the data. As a continuation of this work, we developed jupyter-remote-qgis-proxy, which builds QGIS-specific features on top of jupyter-remote-desktop-proxy. In particular, it allows creation of shareable links that, when clicked, open specific datasets and layers in QGIS in a JupyterHub! You can see this in action:
This opens up exciting future possibilities. Imagine this exploration of the Camp Fire having an ‘Open in QGIS’ button that enables further exploration of the data without the user needing to download or install anything! Work will continue in the coming quarter towards achieving this vision.
We are also excited to see recent work in this space from QuantStack and Simula Labs, and will follow up to ensure an orderly transition to more web native workflows for existing users of QGIS in due time.
Better Profile Selection #
This is a continuation of our GESIS collaboration. On the path to deploying dynamic image building to end users, we wanted to stabilize jupyterhub-fancy-profiles enough to deploy to users of VEDA (and eventually everyone else). This is the primary interface users see after they log in to JupyterHub, and was ripe for UX improvements. The default interface looks like this:
The revamped one is much more streamlined and looks like this:
This is currently deployed to a staging hub and has helped us shake out a lot of bugs! We expect the improved interface will be rolled out to all users in the near future. We are also planning further development to make the user experience even better and smoother for everyone.
Supporting workshops #
End users benefiting from our work is what ultimately gives it meaning. To that end, we were very happy to support running workshops during this collaboration. See our related blog post US Greenhouse Gas Center supports summer school at CIRA for more information.
Ongoing Collaboration #
Delivering on these objectives in a timely way depended heavily on the success of the team collaboration. Sanjay Bhangar of Development Seed commented:
Working closely with the 2i2c team on growing features to support users on the VEDA and GHG Center hubs has been absolutely amazing. With 2i2c’s deep experience in the Jupyter ecosystem, we have been able to implement some fairly complex features quite easily, and their strong open-source roots have ensured that whatever we work on is broadly useful to the wider Jupyter and scientific computing communities.
Take a look at the companion Development Seed blog post on this work.
This collaboration continues, and we have now published our objectives for the coming quarter. Watch this space!
Acknowledgements #
- Development Seed
- NASA IMPACT
- Tarashish Mishra, Julia Signell, Oliver Roick, Slesa Adhikari and Sanjay Bhangar for various code contributions towards these objectives