Testing and Upgrading New Packages
It is helpful to test package additions and upgrades yourself before they are installed for all users. You can make sure the change behaves as you expect and does not break anything else. Once tested, request that the change be installed for all users by creating a new issue on GitHub, contacting curriculum support staff, or creating a new pull request. Ultimately, thoroughly testing changes locally and submitting a pull request will get the software rolled out to everyone much faster.
Install a Python package in your notebook
When testing a notebook with a new version of a package, add the following line to a cell at the beginning of your notebook.
!pip install --upgrade packagename==version
You can then execute this cell every time you run the notebook. This will ensure you have the version you think you have when running your code.
To avoid complicated errors, make sure you always specify a version. You can find the latest version by searching on pypi.org.
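As a complement to the install cell, a second cell can confirm that the running kernel actually sees the pinned version. The following is a minimal sketch using the standard-library importlib.metadata module; the helper names installed_version and check_pin are hypothetical and not part of datahub.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(package, expected):
    """Raise RuntimeError unless the installed version equals the expected pin."""
    found = installed_version(package)
    if found != expected:
        raise RuntimeError(f"{package}: expected {expected}, found {found}")
    return found


# Example call (placeholders, replace with the values used in the pip cell):
# check_pin("packagename", "version")
```

This reads the installed package metadata, not the module already loaded in memory. If the package was imported before it was upgraded, restart the kernel so the new code is loaded.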
Find current version of a Python package
To find the current version of a particular installed package, you can run the following in a notebook.
!pip list | grep <name-of-package>
This should show you the particular package you are interested in and its current version.
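The same lookup can be sketched in Python, which may be useful when the pip on the shell path belongs to a different environment than the notebook kernel. This example assumes only the standard-library importlib.metadata module; find_installed is a hypothetical helper name.

```python
from importlib import metadata


def find_installed(fragment):
    """List (name, version) pairs for installed packages whose name contains fragment."""
    fragment = fragment.lower()
    matches = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        # Skip distributions with missing metadata instead of failing.
        if name and fragment in name.lower():
            matches.append((name, dist.version))
    return sorted(set(matches))


# Example call (placeholder name):
# find_installed("name-of-package")
```

Like grep, this matches on a substring, so a short fragment can return several packages.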
Install/Update an R package in RStudio
If the required version of a package is missing in RStudio, try the following command to check whether the default installation repository contains the package (and version) you need.
install.packages("packagename")
This should install the latest available version of the package you are interested in. You can find the latest version of an R package by searching on CRAN.
Find current version of an R package
To find the current version of a particular installed package, you can run the following in RStudio.
packageVersion("<name-of-package>")
This should show you the particular package you are interested in and its current version.
Submitting a pull request
Familiarize yourself with pull requests and repo2docker, and create a fork of the datahub staging branch.
Set up your git/dev environment by following the instructions here.
Create a new branch for this PR.
Find the correct environment.yml file for your class. This should be under datahub/deployments/<class or hub name>/image.
In environment.yml, packages listed under dependencies are installed using conda, while packages under pip are installed using pip. Any packages that need to be installed via apt must be added to either datahub/deployments/<class or hub name>/image/apt.txt or datahub/deployments/<class or hub name>/image/Dockerfile.
Add any packages necessary. We typically prefer using conda packages, and pip only if necessary. Please pin to a specific version (no wildcards, etc).
- Note that package versions for conda are specified using =, while in pip they are specified using ==
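The pinning convention above can be checked mechanically before opening a PR. The sketch below assumes the contents of environment.yml have already been loaded into a Python dictionary (for example with a YAML parser); the helper name unpinned_specs and the package names and versions in example_env are illustrative, not taken from any real datahub image. It enforces the convention described here, which may be stricter than what conda itself accepts.

```python
import re

# conda convention from this guide: name=version (single "=", no wildcards)
CONDA_PIN = re.compile(r"^[A-Za-z0-9_.\-]+=[A-Za-z0-9_.!+\-]+$")
# pip convention from this guide: name==version (double "=", no wildcards)
PIP_PIN = re.compile(r"^[A-Za-z0-9_.\-]+==[A-Za-z0-9_.!+\-]+$")


def unpinned_specs(env):
    """Return the specs in an environment.yml-style dict that are not pinned."""
    problems = []
    for entry in env.get("dependencies", []):
        if isinstance(entry, dict):
            # The nested "pip" mapping holds pip-installed packages.
            for spec in entry.get("pip", []):
                if not PIP_PIN.match(spec.strip()):
                    problems.append(f"pip: {spec}")
        elif not CONDA_PIN.match(entry.strip()):
            problems.append(f"conda: {entry}")
    return problems


# Illustrative data only; names and versions are assumptions.
example_env = {
    "dependencies": [
        "python=3.11",
        "numpy=1.26.4",
        "scipy",
        "pandas==2.2.0",
        {"pip": ["requests==2.31.0", "otter-grader", "nbgitpuller=1.2.0"]},
    ]
}

print(unpinned_specs(example_env))
```

For the illustrative data, the check flags scipy (no version), pandas==2.2.0 (pip syntax in the conda list), otter-grader (no version), and nbgitpuller=1.2.0 (conda syntax in the pip list).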
Test the changes locally using repo2docker, then submit a PR to staging.
- To use repo2docker, you have to point it at the right image directory for your class. For example, to test the data100 datahub, you would run repo2docker deployments/data100/image from the base datahub directory.
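If you test several hubs, the command can be assembled by a small script. This is a sketch under the assumption that repo2docker is installed and on your PATH and that the script runs from the base datahub directory; repo2docker_command and build_image are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def repo2docker_command(hub, repo_root="."):
    """Build the repo2docker command for a hub's image directory."""
    image_dir = Path(repo_root) / "deployments" / hub / "image"
    return ["repo2docker", image_dir.as_posix()]


def build_image(hub, repo_root="."):
    """Run repo2docker for the hub; raises CalledProcessError if the build fails."""
    subprocess.run(repo2docker_command(hub, repo_root), check=True)


# Example: the data100 case from the text above.
print(" ".join(repo2docker_command("data100")))
```

Calling build_image("data100") runs the same command as the manual example above.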
Commit and push your changes to your fork of the datahub repo, and create a new pull request at https://github.com/berkeley-dsep-infra/datahub/.
Once the PR is merged to staging, you can test it out on class-staging.datahub.berkeley.edu.
Changes are only deployed to datahub once the relevant CI job has completed. See https://circleci.com/gh/berkeley-dsep-infra/datahub to view CI job statuses.
Tips for Upgrading Packages
- Conda can take an extremely long time to resolve version dependency conflicts, if they are resolvable at all. When upgrading Python versions or a core package that is used by many other packages, such as requests, clean out or upgrade old packages to minimize the number of dependency conflicts.