Transition Single User Image to GitHub Actions

Single user images were originally maintained within the main datahub repo, but we have since moved them into their own repositories. This makes testing notebooks easier, and lets us delegate write access to course staff if necessary.

This was the process for transitioning images to their own repositories.

Prerequisites

You will need to install git-filter-repo.

# Create ~/bin first; wget -O fails if the target directory does not exist.
mkdir -p ~/bin
wget -O ~/bin/git-filter-repo https://raw.githubusercontent.com/newren/git-filter-repo/main/git-filter-repo
chmod +x ~/bin/git-filter-repo
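
The later git filter-repo call only works if git can find the git-filter-repo script on your PATH, which the install step above takes for granted. A minimal check, as a sketch (on_path is a hypothetical helper name, not part of git-filter-repo):

```shell
# Hypothetical helper: succeeds if the named command is reachable on PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

if on_path git-filter-repo; then
    echo "git-filter-repo found at $(command -v git-filter-repo)"
else
    echo "git-filter-repo is not on PATH; add ~/bin to PATH and retry" >&2
fi
```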

Create the repository

  1. Go to https://github.com/berkeley-dsep-infra/hub-user-image-template. Click “Use this template” > “Create a new repository”.
  2. Set the owner to berkeley-dsep-infra. Name the image {hub}-user-image, or some approximation of that if there are multiple images per hub.
  3. Click create repository.
  4. In the new repository, visit Settings > Secrets and variables > Actions > Variables tab. Create new variables:
    1. Set HUB to the hub deployment, e.g. shiny.
    2. Set IMAGE to ucb-datahub-2018/user-images/{hub}-user-image, e.g. ucb-datahub-2018/user-images/shiny-user-image.
  5. Fork the new image repo into your own GitHub account.
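
The two repository variables from step 4 could also be set from the command line. The sketch below only prints the commands rather than running them. It assumes the GitHub CLI (gh) is installed and authenticated with admin access to the new repo, and that the repo follows the {hub}-user-image naming; image_repo, image_path and print_variable_commands are hypothetical helper names.

```shell
# Hypothetical helpers that derive the names used above from a hub name.
image_repo() {
    printf 'berkeley-dsep-infra/%s-user-image\n' "$1"
}

image_path() {
    printf 'ucb-datahub-2018/user-images/%s-user-image\n' "$1"
}

# Print (not run) the gh commands that would set HUB and IMAGE.
print_variable_commands() {
    printf 'gh variable set HUB --repo %s --body %s\n' "$(image_repo "$1")" "$1"
    printf 'gh variable set IMAGE --repo %s --body %s\n' "$(image_repo "$1")" "$(image_path "$1")"
}

print_variable_commands shiny
```

Review the printed commands before running them by hand. For a hub with several images the repo name will differ from this pattern, so adjust the values accordingly.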

Preparing working directories

As part of this process, we will pull the previous image’s git history into the new image repo.

  1. Clone the datahub repo into a new directory named after the image repo.

    git clone git@github.com:berkeley-dsep-infra/datahub.git {hub}-user-image --origin source
  2. Change into the directory.

  3. Run git-filter-repo:

    git filter-repo --subdirectory-filter deployments/{hub}/image --force
  4. Add new git remotes:

    git remote add origin git@github.com:{your_git_account}/{hub}-user-image.git
    git remote add upstream git@github.com:berkeley-dsep-infra/{hub}-user-image.git
  5. Pull in the contents of the new user image that was created from the template.

    git fetch upstream
    git checkout main # pulls in .github
  6. Merge the contents of the previous datahub image with the new user image.

    git rm environment.yml
    git commit -m "Remove default environment.yml file."
    git merge staging --allow-unrelated-histories -m 'Bringing in image directory from deployment repo'
    git push upstream main
    git push origin main
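
To see what steps 5 and 6 do before touching the real repositories, the merge can be rehearsed in a throwaway sandbox. This is a sketch under stated assumptions: the two local repositories are stand-ins for the template-derived repo and the filtered datahub history, their file contents are made up, nothing is pushed, and git-filter-repo is not involved.

```shell
set -e
sandbox="$(mktemp -d)"

# Commit identity via environment, so no global git config is required.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the new image repo created from the template (branch main).
git init -q "$sandbox/template"
cd "$sandbox/template"
git symbolic-ref HEAD refs/heads/main
echo "name: default" > environment.yml
mkdir -p .github
echo "placeholder" > .github/README
git add .
git commit -q -m "Template contents"

# Stand-in for the filtered datahub history (branch staging).
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/staging
echo "name: shiny" > environment.yml
git add .
git commit -q -m "Old image history"

# Same sequence as steps 5 and 6, minus the pushes.
git remote add upstream "$sandbox/template"
git fetch -q upstream
git checkout -q -b main upstream/main   # explicit form of "git checkout main"
git rm -q environment.yml
git commit -q -m "Remove default environment.yml file."
git merge -q staging --allow-unrelated-histories -m 'Bringing in image directory from deployment repo'

# The result holds the template's .github plus the old image's files.
git ls-files
git log --oneline --graph
```

Skipping the git rm step in this sandbox makes the merge stop on an add/add conflict in environment.yml, which is presumably why the template's default file is removed first. The sandbox lives under a temporary directory and can be deleted afterwards.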

Preparing continuous integration

  1. In the berkeley-dsep-infra org settings, visit Secrets and variables > Actions. Edit the secrets for DATAHUB_CREATE_PR and GAR_SECRET_KEY, and enable the new repo to access each.

  2. In the datahub repo, in one PR:

    1. Remove the hub deployment steps for the hub:

      • Deploy {hub}
      • hubploy/build-image {hub} image build (x2)
    2. In deployments/{hub}/hubploy.yaml, remove the registry entry, and set the image_name to have PLACEHOLDER for the tag.

    3. Under the deployment image directory, update the README to point to the new repo. Delete everything else in the image directory.

  3. Merge these changes to datahub staging.

  4. Make a commit to trigger a build of the image in its repo.

  5. In a PR in the datahub repo, edit .github/workflows/deploy-hubs.yaml and add the hub with the new image under determine-hub-deployments.py --only-deploy.

  6. Make another commit to the image repo to trigger a build. When these jobs finish, a commit will be pushed to the datahub repo. Make a PR, and merge to staging after canceling the CircleCI builds. These builds are an artifact of the CircleCI-to-GitHub migration, so canceling them will not be needed long term.

  7. Subscribe the #ucb-datahubs-bots channel in UC Tech Slack to the repo.

    /github subscribe berkeley-dsep-infra/<repo>