Testing and Upgrading New Packages
It is helpful to test package additions and upgrades yourself before they are installed for all users. You can make sure the change behaves as you expect and does not break anything else. Once tested, request that the change be installed for all users by creating a new issue on GitHub, contacting curriculum support staff, or creating a new pull request. Thoroughly testing changes locally and submitting a pull request gets the software rolled out to everyone much faster.
Install a Python package in your notebook
When testing a notebook with a new version of a package, add the following line to a cell at the beginning of your notebook.
!pip install --upgrade packagename==version
You can then execute this cell every time you run the notebook. This will ensure you have the version you think you have when running your code.
To avoid complicated errors, make sure you always specify a version. You can find the latest version by searching on pypi.org.
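If you would rather run the pinned install from Python code than from a `!pip` line, the following is a minimal sketch of one way to do it. The function names `pip_install_command` and `install_pinned` are hypothetical helpers introduced here for illustration; they only wrap the same `pip install --upgrade packagename==version` command shown above.

```python
import subprocess
import sys


def pip_install_command(package: str, version: str) -> list:
    """Build a pip command that installs exactly one pinned version."""
    if not version:
        raise ValueError("always pin an explicit version")
    # sys.executable is the interpreter running the notebook kernel, so the
    # package is installed into the environment the notebook imports from.
    return [sys.executable, "-m", "pip", "install", "--upgrade",
            f"{package}=={version}"]


def install_pinned(package: str, version: str) -> None:
    """Run the pinned install; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package, version))
```

Calling `install_pinned("packagename", "version")` with a real package name and version behaves like the cell above, and refusing an empty version enforces the advice to always pin.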
Find current version of a Python package
To find the current version of a particular installed package, you can run the following in a notebook.
!pip list | grep <name-of-package>
This should show you the particular package you are interested in and its current version.
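The same check can be done from Python, which is handy when a notebook should confirm it is running against the version you tested. This is a small sketch using the standard library's `importlib.metadata`; the helper names `installed_version` and `matches_pin` are assumptions made for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package: str, expected: str) -> bool:
    """True only when the installed version string equals the expected pin."""
    # Plain string comparison: "1.0" and "1.0.0" are treated as different.
    return installed_version(package) == expected
```

A cell near the top of the notebook could then call `matches_pin("packagename", "version")` and stop early when it returns False.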
Install/Update an R package in RStudio
When the required version of a package is missing in RStudio, try the following command to check whether the default installation repo contains the package (and the version) required.
install.packages("packagename")
This should install the package you are interested in at its latest version. You can find the latest version of an R package by searching on CRAN.
Find current version of an R package
To find the current version of a particular installed package, you can run the following in RStudio.
packageVersion("<name-of-package>")
This should show you the particular package you are interested in and its current version.
Submitting a pull request
Familiarize yourself with pull requests and repo2docker, and create a fork of the image repo.
Set up your git/dev environment by following the instructions here.
Create a new branch for this PR.
Find the correct environment.yml file for your class. This should be in the root of the image repo.
In environment.yml, packages listed under dependencies are installed using conda, while packages under pip are installed using pip. Any packages that need to be installed via apt must be added to either apt.txt or Dockerfile.
Add any packages necessary. We typically prefer using conda packages, and pip only if necessary. Please pin to a specific version (no wildcards, etc).
- Note that package versions for conda are specified using =, while in pip they are specified using ==
Test the changes locally using repo2docker, then submit a PR to main.
- To use repo2docker, be sure that you are inside the image repo directory on your device, and then run:
repo2docker .
Commit and push your changes to your fork of the image repo, and create a new pull request at https://github.com/berkeley-dsep-infra/<image-name>.
After the build passes, merge your PR into main and the image will be built again and pushed to the Artifact Registry. If that succeeds, a commit will be created that updates the PLACEHOLDER field in hubploy.yaml with the image’s SHA, and it will be pushed to the datahub repo. You can check on the progress of this workflow in your root image repo’s Actions tab.
After that workflow completes successfully, go to the Datahub repo and click on the New pull request button. Next, click on the compare: staging drop down, and you should see a branch named something like update-<hubname>-image-tag-<SHA>. Select that, and create a new pull request.
Once the checks have passed, merge to staging and your new image will be deployed! You can watch the progress here.
Tips for Upgrading Packages
- Conda can take an extremely long time to resolve version dependency conflicts, if they are resolvable at all. When upgrading Python versions or a core package that is used by many other packages, such as requests, clean out or upgrade old packages to minimize the number of dependency conflicts.