Create a New Single User Image
You might need to create a new user image when deploying a new hub, or changing from a shared single user server image. We use repo2docker to generate our images.
There are two approaches to creating a repo2docker image:
- Use a repo2docker-style image template (environment.yaml, etc)
- Use a Dockerfile (useful for larger/more complex images)
Generally, we prefer to use the former approach, unless we need to install specific packages or utilities outside of python/apt as root
. If that is the case, only a Dockerfile
format will work.
As always, create a feature branch for your changes, and submit a PR when done.
There are two approaches to pre-populate the image’s assets:
Use an existing image as a template. Browse through our image repos to find a hub that is similar to the one you are trying to create. This will give you a good starting point.
Fork hub-user-image-template. Click “Use this template” > “Create a new repository”. Be sure to follow convention and name the repo
<hubname>-user-image
, and the owner needs to beberkeley-dsep-infra
. When that is done, create your own fork of the new repo.
Image Repository Settings
There are now a few steps to set up the CI/CD for the new image repo. In the berkeley-dsep-infra
image repo, click on Settings
, and under General
, scroll down to Pull Requests
and check the box labeled Automatically delete head branches
.
Scroll back up to the top of the settings, and in the left menu bar, click on Secrets and variables
, and then Actions
.
From there, click on the Variables
tab and then New repository variable
. We will be adding two new variables:
HUB
: the name of the hub (eg: datahub)IMAGE
: the Google Artifact Registry path and image name. The path will always beucb-datahub-2018/user-images/<image-name>
and the image name will always be the same as the repo:<hubname>-user-image
.
Your Fork’s Repository Settings
Now you will want to disable Github Actions for your fork of the image repo. If you don’t, whenever you push PRs to the root repo the workflows in your fork will attempt to run, but don’t have the proper permissions to successfully complete. This will then send you a nag email about a workflow failure.
To disable this for your fork, click on Settings
, Actions
and General
. Check the Disable actions
box and click save.
Enable Artifact Registry Pushing
The image repository needs to be added to the list of allowed repositories in the berkeley-dsep-infra
secrets. Go to the berkeley-dsep-infra
Secrets and Variables. Give your repository permissions to push to the Artifact Registry, as well as to push a branch to the datahub repo.
Edit both DATAHUB_CREATE_PR
and GAR_SECRET_KEY
, and click on the gear icon, search for your repo name, check the box and save.
Configure hubploy
You need to let hubploy
know the specifics of the image by updating your deployment’s hubploy.yaml
. Change the name
of the image in deployments/<hubname>/hubploy.yaml
to point to your new image name, and after the name add :PLACEHOLDER
in place of the image sha. This will be automatically updated after your new image is built and pushed to the Artifact Registry.
Example:
images:
images:
- name: us-central1-docker.pkg.dev/ucb-datahub-2018/user-images/fancynewhub-user-image:PLACEHOLDER
cluster:
provider: gcloud
gcloud:
project: ucb-datahub-2018
service_key: gke-key.json
cluster: spring-2024
zone: us-central1
Next, add the ssh clone path of the root image repo to repos.txt.
Create a PR and merge to staging. You can cancel the Deploy staging and prod hubs
job in Actions, or just let it fail.
Subscribe to GitHub Repo in Slack
Go to the #ucb-datahubs-bots channel, and run the following command:
/github subscribe berkeley-dsep-infra/<your repo name>
Modify the Image
This step is straightforward: create a feature branch, and edit, delete, or add any files to configure the image as needed.
We also strongly recommend copying README-template.md
over the default README.md
, and modifying it to replace all occurrences of <HUBNAME>
with the name of your image.
Submit Pull Requests
Familiarize yourself with pull requests and repo2docker, and create a fork of the datahub staging branch.
Set up your git/dev environment by following the image templat’s contributing guide.
Test the image locally using
repo2docker
.Submit a PR to
staging
.Commit and push your changes to your fork of the image repo, and create a new pull request at https://github.com/berkeley-dsep-infra/
. After the build passes, merge your PR in to
main
and the image will be built again and pushed to the Artifact Registry. If that succeeds, then a commit will be crafted that will update thePLACEHOLDER
field inhubploy.yaml
with the image’s SHA and pushed to the datahub repo. You can check on the progress of this workflow in your root image repo’sActions
tab.After the previous step is completed successfully, go to the Datahub repo and click on the New pull request button. Next, click on the
compare: staging
drop down, and you should see a branch named something likeupdate-<hubname>-image-tag-<SHA>
. Select that, and create a new pull request.Once the checks has passed, merge to
staging
and your new image will be deployed! You can watch the progress in the deploy-hubs workflow.