Create a New Single User Image
You might need to create a new user image when deploying a new hub, or changing from a shared single user server image. We use repo2docker to generate our images.
There are two approaches to creating a repo2docker image:
- Use a repo2docker-style image template (environment.yaml, etc)
- Use a Dockerfile (useful for larger/more complex images)
Generally, we prefer to use the former approach, unless we need to install specific packages or utilities outside of python/apt as root
. If that is the case, only a Dockerfile
format will work.
As always, create a feature branch for your changes, and submit a PR when done.
Use an existing image as a template
Browse through our image repos to find a hub that is similar to the one you are trying to create. This will give you a good starting point.
Create the image repos
Create a new image repo from the hub-user-image-template. Click “Use this template” > “Create a new repository”.
Be sure to follow convention and name the repo <hubname>-user-image
, and the owner needs to be berkeley-dsep-infra
. When that is done, create your own fork of the new repo.
Configuring the root image repo
There are now a few steps to set up the CI/CD for the new image repo. In the berkeley-dsep-infra
image repo, click on Settings
, and under General
, scroll down to Pull Requests
and check the box labeled Automatically delete head branches
.
Scroll back up to the top of the settings, and in the left menu bar, click on Secrets and variables
, and then Actions
.
From there, click on the Variables
tab and then New repository variable
. We will be adding two new variables:
HUB
: the name of the hub (eg: datahub)IMAGE
: the Google Artifact Registry path and image name. The path will always beucb-datahub-2018/user-images/<image-name>
and the image name will always be the same as the repo:<hubname>-user-image
.
Configure your fork
Now you will want to disable Github Actions for your fork of the image repo. If you don’t, whenever you push PRs to the root repo the workflows in your fork will attempt to run, but don’t have the proper permissions to successfully complete. This will then send you a nag email about a workflow failure.
To disable this for your fork, click on Settings
, Actions
and General
. Check the Disable actions
box and click save.
Add the root image repo to the list of allowed repos in the berkeley-dsep-infra
secrets.
Now, go to the berkeley-dsep-infra
Secrets and Variables. You will need to give your repo permissions to push to the Artifact Registry, as well as to push a branch to the datahub repo.
Edit both DATAHUB_CREATE_PR
and GAR_SECRET_KEY
, and click on the gear icon, search for your repo name, check the box and save.
Update your deployment’s hubploy.yaml
and add the image to the primary list of repos.
You need to let hubploy
know the specifics of the image. Change the name
of the image in deployments/<hubname>/hubploy.yaml
to point to your new image name, and after the name add :PLACEHOLDER
in place of the image sha. This will be automatically updated after your new image is built and pushed to the Artifact Registry.
Example:
images:
images:
- name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/fancynewhub-user-image:PLACEHOLDER
cluster:
provider: gcloud
gcloud:
project: ucb-datahub-2018
service_key: gke-key.json
cluster: spring-2024
zone: us-central1
Next, add the ssh clone path of the root image repo to repos.txt.
Create a PR and merge to staging. You can cancel the Deploy staging and prod hubs
job in Actions, or just let it fail.
Add a github bot notification in Slack
Go to the #ucb-datahubs-bots channel, and run the following command:
/github subscribe berkeley-dsep-infra/<your repo name>
Modify the image configuration as necessary
This step is straightforward: create a feature branch, edit/modify/delete/add any files in the image repo to configure the image as needed.
We also strongly recommend copying README-template.md
over the default README.md
, and modifying it to replace all occurrences of <HUBNAME>
with the name of your image.
Submitting a pull request
Familiarize yourself with pull requests and repo2docker , and create a fork of the datahub staging branch.
Set up your git/dev environment by following the instructions here. : - This guide is also located in your image repo!
- Test the changes locally using
repo2docker
, then submit a PR tostaging
. -
- To use
repo2docker
, be sure that you are inside the image repo directory on your device, and then runrepo2docker .
.
- To use
- Test the changes locally using
Commit and push your changes to your fork of the image repo, and create a new pull request at https://github.com/berkeley-dsep-infra/
. After the build passes, merge your PR in to
main
and the image will be built again and pushed to the Artifact Registry. If that succeeds, then a commit will be crafted that will update thePLACEHOLDER
field inhubploy.yaml
with the image’s SHA and pushed to the datahub repo. You can check on the progress of this workflow in your root image repo’sActions
tab.After 4 is completed successfully, go to the Datahub repo and click on the New pull request button. Next, click on the
compare: staging
drop down, and you should see a branch named something likeupdate-<hubname>-image-tag-<SHA>
. Select that, and create a new pull request.Once the checks has passed, merge to
staging
and your new image will be deployed! You can watch the progress here.