Git
Help! We’ve run into a DockerHub rate limit!
About
Yes, it is still happening. In 2025! Here you will find:
- Podman Dockerhub Mirror Configuration
- K8s Quickfix: Rewriting Existing K8s Resources
- Permanent Mirror Configuration for containerd
- K8s Admission Webhook to do the same
Podman Dockerhub Mirror Configuration
~/.config/containers/registries.conf.d/dockerhub-mirror.conf:
[[registry]]
prefix = "docker.io"
insecure = false
blocked = false
location = "public.ecr.aws/docker"
[[registry.mirror]]
location = "mirror.gcr.io"
[[registry.mirror]]
location = "gitlab.com/acme-org/dependency_proxy/containers"
[[registry.mirror]]
location = "registry-1.docker.io"
[[registry.mirror]]
location = "123456789012.dkr.ecr.us-east-1.amazonaws.com/docker-io"
I hope you are using ecr-login for your ECR registries ;)
export REGISTRY_AUTH_FILE=$HOME/.config/containers/auth.json
{
  "auths": {
    "docker.io": {
      "auth": "eGw4ZGVwXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXem40VQ=="
    },
    "gitlab.com": {
      "auth": "cmVXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXSYQ=="
    },
    "registry.gitlab.com": {
      "auth": "cmVXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXSYQ=="
    }
  },
  "credHelpers": {
    "*": "",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com": "ecr-login",
    "345678901234.dkr.ecr.us-east-1.amazonaws.com": "ecr-login"
  }
}
K8s Quickfix: Rewriting Existing K8s Resources
$ cd $(mktemp -d)
$ (
kubectl get pods --field-selector=status.phase=Pending -A -ojson | jq -c '.items[]';
kubectl get deployments -ojson -A | jq -c '.items[]';
kubectl get replicasets -ojson -A | jq -c '.items[]';
kubectl get daemonsets -ojson -A | jq -c '.items[]';
) > /tmp/cluster.jsonl
$ cat /tmp/cluster.jsonl \
| jq -r '
def parse_into_parts:
. as $i
|capture(
"^((?<host>[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+)"
+ "(:(?<port>[0-9]+))?/)?"
+ "((?<path>[a-zA-Z0-9-._/]+)/)?"
+ "(?<image>[a-zA-Z0-9-._]+)"
+ "((:(?<tag>[a-z0-9_.-]+))|(@(?<digest>sha256:[a-z0-9]+)))?$"
) // error("couldn't parse \($i)");
def qualify_oci_image:
if (.host==null) then .host="docker.io" else . end
|if (.path==null and .host=="docker.io") then .path="library" else . end
# |if (.tag==null and .digest==null) then .tag="latest" else . end
;
def glue_parts:
[
if (.host) then .host else "" end,
if (.port) then ":\(.port)" else "" end,
if (.host) then "/" else "" end,
if (.path) then "\(.path)/" else "" end,
.image,
if (.digest) then "@\(.digest)" elif (.tag) then ":\(.tag)" else "" end
]|join("")
;
def fix_oci_image:
. as $i
|parse_into_parts
|qualify_oci_image
|if (.path=="bitnami") then .path="bitnamilegacy" else . end
|if (.host=="docker.io") then (.host="123456789012.dkr.ecr.us-east-1.amazonaws.com"|.path="docker-io/\(.path)") else . end
|glue_parts;
[
..|objects|(.initContainers[]?,.containers[]?)
|(.image|fix_oci_image) as $newImage
|select(.image!=$newImage)
|"\(.name)=\($newImage)"
] as $p
|select($p|length > 0)
|"kubectl set image \(.kind) -n \(.metadata.namespace) \(.metadata.name) \($p|join(" "))"
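The qualification rules the jq functions implement (bare images get the `docker.io` registry and the `library` namespace) can be sanity-checked with a small bash helper. This is a hypothetical sketch mirroring the jq logic, not part of the original pipeline:

```shell
# Hypothetical helper mirroring the jq qualification rules above:
# bare images get the "docker.io" registry and the "library" namespace,
# references that already name a registry host are left alone.
qualify_image() {
  local ref="$1"
  case "$ref" in
    *.*/*|*:*/*|localhost/*)            # first component looks like a registry host
      printf '%s\n' "$ref" ;;
    */*)                                # user/image on Docker Hub
      printf 'docker.io/%s\n' "$ref" ;;
    *)                                  # official image
      printf 'docker.io/library/%s\n' "$ref" ;;
  esac
}

qualify_image "nginx:1.27"               # -> docker.io/library/nginx:1.27
qualify_image "grafana/grafana"          # -> docker.io/grafana/grafana
qualify_image "quay.io/argoproj/argocd"  # -> quay.io/argoproj/argocd (unchanged)
```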
Permanent Mirror Configuration for containerd
(
# patch /etc/containerd/config.toml for automatically picking dockerhub mirror
containerd_config_version="$(grep -oP '^\s*version\s*=\s*\K\d+' /etc/containerd/config.toml)"
p=""
case "$containerd_config_version" in
2) p="io.containerd.grpc.v1.cri";;
3) p="io.containerd.cri.v1.images";;
*) echo "unsupported containerd config version: ${containerd_config_version}" >&2; exit 1;;
esac
cat <<-EOM >> /etc/containerd/config.d/dockerhub-mirrors.toml
[plugins]
[plugins."$p".registry]
[plugins."$p".registry.mirrors]
[plugins."$p".registry.mirrors."docker.io"]
endpoint = [
"https://public.ecr.aws/docker",
"https://mirror.gcr.io",
"https://gitlab.com/acme-org/dependency_proxy/containers",
"https://123456789012.dkr.ecr.us-east-1.amazonaws.com/docker-io",
"https://registry-1.docker.io",
]
[plugins."$p".registry.configs]
[plugins."$p".registry.configs."gitlab.com".auth]
# https://gitlab.com/groups/acme-org/-/settings/access_tokens?page=1
username = "dependency-proxy"
password = "glpat-XXXXXXXXXXXXXXXXXXXX"
[plugins."$p".registry.configs."docker.io".auth]
username = "acme-org"
password = "dckr_pat_3Xi_XXXXXXXXXXXXXXXXXXXXXXX"
auth = "dckr_pat_3Xi_XXXXXXXXXXXXXXXXXXXXXXX"
EOM
)
if ! containerd config dump 1>/dev/null; then
echo "exiting since containerd config is bad" >&2
exit 1
fi
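Note that a stock containerd config does not read a `config.d` directory on its own; it only merges files listed in the top-level `imports` key. A sketch of the remaining wiring (the `imports` line is an assumption; adjust the glob to your layout):

```shell
# Make containerd import the drop-in (default configs do not read config.d).
grep -q '^imports' /etc/containerd/config.toml \
  || echo 'imports = ["/etc/containerd/config.d/*.toml"]' >> /etc/containerd/config.toml
systemctl restart containerd
# Confirm the mirror made it into the merged config:
containerd config dump | grep -A6 'registry.mirrors'
```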
Multi, Mono, Meta, Manifest – Composite Repository?
There was this discussion about whether to use mono- or multi-repositories. I won’t pick it up again.
Some cool people suggested: why not use the best of both worlds and use meta-repos?!
I Was Interested in “Meta-Repo” Tooling
I asked myself how these tools would solve problems and gave some of them a try:
- There are multi-repository management tools that work by tagging or grouping repos, like gr or mu-repo.
- Tools that just do subdirectory iteration over your repos, like gitbatch.
- Many git-extras repositories with subdirectory iteration scripts that do the same, like git multi (the one I am using ;)).
- Tools which combine multi-repo manifests with different VCS systems, like myrepos (old and mature, often found within your Linux system package management).
- Tools that try to standardize the directory layout for managing your repositories, like ghq (you should definitely use such a layout!).
- Tools which basically reassemble what git submodule does, like mu-repo, git-metarepo, or meta (see below).
- Tools that just check out all your organization’s repositories from GitHub, GitLab, or BitBucket, like ghorg.
- And Git itself, which added git for-each-repo, which also does simple iteration (experimental status!).
- And then there was git-slave, where its master decided it was an anachronistic project and gave it up.
In the end, most tools mentioned here do not focus on workflow problems; rather, they introduce new manifests or config files. :(
Techstack n - 1 is dead!
TL;DR TechStack n-1 is dead. It ended with the rise of the clouds and software release cycles going down to weeks due to containerized CIs.
Against ‘it’s stable and mature so let it run’
Being open-source-based, Ubuntu already had the concept of point releases every 6 months when Docker and K8s hit the world and gave automated CIs a big boost in building system containers. Some years later, Docker itself switched to a 3-month release cycle. So did the Linux kernel, with 2-3 months. Firefox moved to 4 weeks.
Download an LFS backed file from GitLab.com without git and git-lfs installed
It is possible to download a Git LFS-backed file from GitLab.com without having git or git-lfs installed by using the GitLab API directly. This article provides two shell scripts that demonstrate how to do this.
Well, the API is there and you can do it already!
Just dig into what git is doing by a test-clone with any LFS repo:
export GIT_CURL_VERBOSE=1
export GIT_TRACE_CURL=1
git clone <my-repo> 2>&1 | tee git-clone.log
From there you are able to figure out what is happening over HTTPS.
Connect to GitLab via SSH
Start an SSH Agent
If you haven’t already done so, add the following command to your shell’s RC file (such as .bashrc or .zshrc) to start the ssh-agent:
$ eval $(ssh-agent)
Add Your Generated Key
Use the ssh-add command to add your private SSH key (assuming it is the default id_rsa file) to the agent:
$ ssh-add ~/.ssh/id_rsa
List Keys
You can list the keys currently loaded by the ssh-agent using the following command:
$ ssh-add -l
git: reducing repository size (gc and destructive)
This article discusses two approaches to reducing the size of a Git repository: non-destructive garbage collection and destructive history rewriting. It provides a script for running git gc on multiple repositories and links to resources for more advanced techniques.
Garbage Collection (non-destructive)
This works especially well when removing a file added in the most recent unpushed commit. Git Garbage Collection automates some of these cleanup jobs.
I ran the following over my source folders:
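The original script was not preserved here; what follows is a reconstructed sketch. It assumes repositories live below `~/src` (adjust the path and `-maxdepth` to your layout):

```shell
# Reconstructed sketch - runs "git gc" in every repository found below a directory.
gc_all() {
  find "${1:-$HOME/src}" -maxdepth 3 -type d -name .git 2>/dev/null \
  | while read -r gitdir; do
      repo="$(dirname "$gitdir")"
      echo "gc: $repo"
      git -C "$repo" gc --quiet --prune=now
    done
}

gc_all ~/src
```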
AWS sync is not reliable!
While migrating from s3cmd to the AWS S3 CLI, I noticed that files did not reliably sync when using the AWS CLI.
I tested this behavior with different versions, and they all exhibited the same issue:
- python2.7-awscli1.9.7
- python2.7-awscli1.15.47
- python3.6-awscli1.15.47
Test Setup
-
Set up the AWS CLI utility and configure your credentials.
-
Create a testing S3 bucket.
-
Set up some random files:
# Create 10 random files of 10MB each
mkdir multi
for i in {1..10}; do dd if=/dev/urandom of=multi/part-$i.out bs=1MB count=10; done
# Then copy the first 5 files over
mkdir multi-changed
cp multi/part-{1,2,3,4,5}.out multi-changed
# And replace the content in the remaining 5 files (6-10)
for i in {6..10}; do dd if=/dev/urandom of=multi-changed/part-$i.out bs=1MB count=10; done
Testing S3 sync with AWS CLI
Cleanup
$ aws s3 rm s3://l3testing/multi --recursive
Initial sync
$ aws s3 sync multi s3://l3testing/multi
upload: multi/part-1.out to s3://l3testing/multi/part-1.out
upload: multi/part-3.out to s3://l3testing/multi/part-3.out
upload: multi/part-2.out to s3://l3testing/multi/part-2.out
upload: multi/part-4.out to s3://l3testing/multi/part-4.out
upload: multi/part-10.out to s3://l3testing/multi/part-10.out
upload: multi/part-5.out to s3://l3testing/multi/part-5.out
upload: multi/part-6.out to s3://l3testing/multi/part-6.out
upload: multi/part-8.out to s3://l3testing/multi/part-8.out
upload: multi/part-7.out to s3://l3testing/multi/part-7.out
upload: multi/part-9.out to s3://l3testing/multi/part-9.out
Update files
Only 5 files should now be uploaded. Timestamps for all 10 files should be changed.
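For completeness, the update step presumably looks like the following (same bucket as above). Since `aws s3 sync` decides what to upload from size and timestamp comparisons, the `--size-only` flag can help when diagnosing skipped files:

```shell
# Sync the changed set into the same prefix; only part-6..10 differ in content.
aws s3 sync multi-changed s3://l3testing/multi
# Diagnostic variant: compare by size only, ignoring timestamps entirely.
aws s3 sync multi-changed s3://l3testing/multi --size-only
```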
GitLab: checkout all available repositories
This guide provides a set of shell commands to automate the process of checking out all available repositories from one or more GitLab instances. It leverages the GitLab API, jq, and parallel to efficiently clone projects.
Generate a private token
https://<GITLAB-SERVER1>/profile/personal_access_tokens
https://<GITLAB-SERVER2>/profile/personal_access_tokens
Checkout a list of all available repositories
QUERY='.[] | .path_with_namespace + "\t" + .ssh_url_to_repo' # JQ Query
curl --request GET --header "PRIVATE-TOKEN: <PRIVATE-TOKEN>" "<GITLAB-SERVER1>/api/v4/projects?simple=true&per_page=65536" | jq -r "$QUERY" > repo.list
curl --request GET --header "PRIVATE-TOKEN: <PRIVATE-TOKEN>" "<GITLAB-SERVER2>/api/v3/projects?simple=true&per_page=65536" | jq -r "$QUERY" >> repo.list
Create directories for repositories
cat repo.list | cut -f1 | xargs mkdir -p
Checkout projects (with GNU parallel)
parallel --colsep '\t' --jobs 4 -a repo.list git clone {2} {1}
Build list of git repositories
find . -type d -name ".git" | xargs realpath | xargs dirname > path.list
Report repository branch or checkout branch
cat path.list | xargs -I{} sh -c "cd {}; echo {}; git branch"
cat path.list | xargs -I{} sh -c "cd {}; echo {}; git checkout master"
cat path.list | xargs -I{} sh -c "cd {}; echo {}; git checkout develop"
Note: when you are migrating repositories you should use git clone --mirror.
Git: Encrypt Credentials Within a Repository
This article explores the concept of encrypting credentials within a Git repository. It demonstrates a method using git smudge/clean filters but ultimately advises against it, advocating for the use of config servers instead.
Especially in the microservices era, you should use a config server and never store your credentials in a repository!
You should not use git smudge/clean filters for encryption. Why? Here’s an example!
Let’s create an example repository
% TMP=$(mktemp -d)
% cd $TMP
% git init
% echo 'Hello world!' > credentials
Add .gitattributes
/credentials filter=crypto
Add .git/config
[filter "crypto"]
smudge = openssl enc -d -aes-256-cbc -salt
clean = openssl enc -aes-256-cbc -salt
required
Note: required indicates that these commands need to exit with code 0; otherwise it could happen that these files are added without any encryption. You can test this by using smudge = gpg -d -q --batch --no-tty -r <SIGNATURE> and clean = gpg -ea -q --batch --no-tty -r <SIGNATURE> filters.
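One concrete reason this openssl setup misbehaves: `-salt` makes encryption non-deterministic, so the clean filter produces different bytes on every run and git permanently sees the file as modified. A quick demonstration (the passphrase is a placeholder):

```shell
# Encrypting the same plaintext twice yields different ciphertexts because of
# the random salt - a clean filter built on this can never produce stable output.
a="$(echo 'Hello world!' | openssl enc -aes-256-cbc -salt -pass pass:demo -a 2>/dev/null)"
b="$(echo 'Hello world!' | openssl enc -aes-256-cbc -salt -pass pass:demo -a 2>/dev/null)"
[ "$a" != "$b" ] && echo "ciphertexts differ"
```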
IP in VPN vs. LAN: Alias IP Address by iptables
Scenario: Using a Consistent IP Address
When you’re at work, you are on the LAN and use an IP address like 192.168.x.x. When you work from home, you connect via VPN to the same database (DB), and your IP address changes to 10.x.x.x. You want to avoid changing configuration files for your application every time you switch environments.
This problem can be easily worked around using iptables to create an IP address alias.
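A minimal sketch, assuming the DB answers on 192.168.10.20 in the LAN and on 10.8.0.20 over VPN, with PostgreSQL's port as the example (all addresses and the port are made up): while on VPN, rewrite traffic aimed at the LAN address to the VPN address.

```shell
# On the VPN: transparently redirect traffic aimed at the LAN address
# to the VPN address of the same DB host (addresses are examples).
sudo iptables -t nat -A OUTPUT -d 192.168.10.20 -p tcp --dport 5432 \
  -j DNAT --to-destination 10.8.0.20
# Undo when back in the LAN:
sudo iptables -t nat -D OUTPUT -d 192.168.10.20 -p tcp --dport 5432 \
  -j DNAT --to-destination 10.8.0.20
```

This way the application config can keep the LAN address in both environments.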
Laptop Performance: irqbalance vs. intel_pstate
Today I uninstalled irqbalance and noticed a performance gain on my GNOME desktop.
The CPUfreq control panel showed me IRQBALANCE DETECTED, and they state the following:
Why I should not use a single core for power saving
- Modern OS/kernels work better on multi-core architectures.
- You need at least 1 core for a foreground application and 1 for background system services.
- Linux Kernel switches between CPU cores to avoid overheating, CPU thermal throttling, and to balance system load.
- Many CPUs have Hyper-Threading (HT) technology enabled by default. So there is no reason to run half of a physical CPU core.
These points are stated very simply. I feel there are some contradictions here.
AutoFS: Automatically mount S3 using goofyfs or s3fs
About
In Linux, basically everything can be turned into a mount helper.
(TBD) UPDATE: This article badly needs a rewrite. I will keep it online until then.
Scripts
Save this as mount.autofs:
#!/bin/bash
##
# AutoFS user-folder indirect Automounter for S3 using either FUSE goofyfs or s3fs
#
# Requirements
# - AWS CLI installed
# - JQ installed
# - Either FUSE goofyfs or s3fs installed
#
# Usage
# - place config to $S3FS_CONFIG directory using s3fs config format (ACCESS_KEY:ACCESS_SECRET)
# - place this file to /etc/auto.s3 and make it executable
# - add to /etc/auto.master: /home/<user>/Remote/S3 /etc/auto.s3 --timeout=3000
# - choose backend by config section in this file (NOTE: goofyfs needs )
# - cd <mountpoint>/<aws-profile>/<bucket>
#
# Debugging
# - Stop system service by:
# systemctl stop autofs
# - Execute as process (use --debug to see mount commands)
# automount -f -v
#
# Clean up mountpoints (when autofs hangs or mountpoints are still used)
# mount | grep autofs | cut -d' ' -f 3 | xargs umount -l
#
# Logging
# - Logs go to syslog unless you are running automount within a TTY
#
# Notes
# - goofyfs sometimes makes trouble - use s3fs!
# - Daemon needs to run as root since only root has access to all mount options
# - Additional entries can be defined with the -Dvariable=Value map-option to automount(8).
# - Alternative fuse style mount can be done by -fstype=fuse,allow_other :sshfs#user@example.com:/path/to/mount
# - We do not read out .aws/config since not all credentials necessarily have S3 access
# - https://github.com/kahing/goofys/pull/91/commits/07dffdbda4ff7fc3c538cb07e58ad12cc464b628
# - goofyfs catfs cache is not activated by default
# - chown/chmod is not that nice but works ;)
# - other backends not planned at the moment
#
# AWS Commands
# - aws s3api list-buckets
# - aws s3api list-objects --bucket <bucket>
#
# FAQ
# - https://github.com/s3fs-fuse/s3fs-fuse/wiki/FAQ
#
# Autofs provides additional variables that are set based on the user requesting the mount:
#
# USER The user login name
# UID The user login ID
# GROUP The user group name
# GID The user group ID
# HOME The user home directory
# HOST Hostname (uname -n)
#
# From exports
#
# AUTOFS_GID="1000"
# AUTOFS_GROUP="ctang"
# AUTOFS_HOME="/home/ctang"
# AUTOFS_SHOST="refpad-16"
# AUTOFS_UID="1000"
# AUTOFS_USER="ctang"
#
# Strict mode
set -euo pipefail -o errtrace
# Config
S3FS_CONFIG="${AUTOFS_HOME:-$HOME}/.autofs/s3fs" # user directory
BACKEND="goofyfs" # s3fs|goofyfs - note: goofyfs requires goofyfs-fuse!
DEBUG=0 # 0|1 where 1 is on - output will go to syslog or journald
UMASK="750" # Umask for mountpoint placeholder directories
OPTS="defaults,noatime" # mount options
if [[ -z "${GID:-}" ]]; then
GID="$(id -g)"
fi
# We ensure every command output can be parsed in neutral form
export LC_ALL=C
export AWS_SDK_LOAD_CONFIG=0
# Const
PWD="$(pwd)"
SCRIPT_NAME="$(basename "$0")"
LOGGER_CMD="logger -i -t ${SCRIPT_NAME}"
if test -t 1; then
# if tty
LOGGER_CMD="${LOGGER_CMD} --no-act --stderr"
fi
PROFILES=()
if ! which jq 1>/dev/null 2>&1; then
$LOGGER_CMD "Cannot find jq binary"
exit 1
fi
if ! which aws 1>/dev/null 2>&1; then
$LOGGER_CMD "Cannot find aws binary"
exit 1
fi
# If the user is already in a mount point this script will be called by root,
# so we need to remap some stuff
if [[ ! "${HOME:-}" == "${PWD}/"* ]] && [[ "${PWD}" =~ ^(/home/[^/]+) ]]; then
S3FS_CONFIG=${S3FS_CONFIG/${AUTOFS_HOME:-$HOME}/${BASH_REMATCH[1]}}
HOME="${BASH_REMATCH[1]}"
USER="${HOME##*/}"
AUTOFS_UID="$(id -u ${USER})"
AUTOFS_GID="$(id -g ${USER})"
$LOGGER_CMD "Initializing. Remapping home to ${HOME}, user=${USER}, config=${S3FS_CONFIG}"
fi
# Prevent errors
if [[ ! -d ${S3FS_CONFIG} ]]; then
$LOGGER_CMD "Config directory ${S3FS_CONFIG} not found."
exit 1
fi
# Mountpoint needs to be owned by user
chown -R ${AUTOFS_UID:-$UID}:${AUTOFS_GID:-$GID} "${S3FS_CONFIG}"
chmod -R 700 "${S3FS_CONFIG}"
# Create indirect mount points for s3 profiles
PROFILES=($(ls -1 ${S3FS_CONFIG}))
if [[ -z "${PROFILES[*]}" ]]; then
$LOGGER_CMD "No profiles found within ${S3FS_CONFIG}"
else
for profile in "${PROFILES[@]}"; do
chmod 600 ${S3FS_CONFIG}/${profile}
if [[ ! -d "${PWD}/${profile}" ]]; then
$LOGGER_CMD "Creating ${PWD}/${profile}"
mkdir -p "${PWD}/${profile}" 1>/dev/null || true
chmod ${UMASK} "${PWD}/${profile}"
chown ${AUTOFS_UID:-$UID}:${AUTOFS_GID:-$GID} "${PWD}/${profile}"
fi
done
fi
# Requested profile
PROFILE="${1:-}"
if [[ ! -e "${S3FS_CONFIG}/${PROFILE}" ]]; then
$LOGGER_CMD "No valid profile=${PROFILE} given! "
exit 1
fi
$LOGGER_CMD "Profile: $@"
if [[ -z "${PROFILE}" ]]; then
$LOGGER_CMD "No profile given"
exit 1
fi
if [[ "${BACKEND}" == "s3fs" ]]; then
if ! which s3fs 1>/dev/null 2>&1; then
$LOGGER_CMD "Cannot find s3fs installation"
exit 1
fi
OPTS="-fstype=fuse.s3fs,uid=${AUTOFS_UID:-${UID}},gid=${AUTOFS_GID:-${GID}},umask=000,${OPTS},_netdev,allow_other,default_permissions,passwd_file=${S3FS_CONFIG}/${PROFILE},use_cache=$(mktemp -d)"
if [[ "$DEBUG" -eq 1 ]]; then
OPTS="${OPTS},dbglevel=info,curldbg"
fi
elif [[ "${BACKEND}" == "goofyfs" ]]; then
if ! which goofyfs 1>/dev/null 2>&1; then
$LOGGER_CMD "Cannot find goofyfs installation"
exit 1
fi
OPTS="-fstype=fuse.goofyfs-fuse,${OPTS},_netdev,nonempty,allow_other,passwd_file=${S3FS_CONFIG}/${PROFILE},--file-mode=0666,nls=utf8"
if [[ "${DEBUG}" -eq 1 ]]; then
OPTS="${OPTS},--debug_s3,--debug_fuse"
fi
else
$LOGGER_CMD "Unsupported backend ${BACKEND}"
exit 1
fi
read -r -d '' CREDENTIALS < "${S3FS_CONFIG}/${PROFILE}" || true # read returns non-zero at EOF; don't trip set -e
export AWS_ACCESS_KEY_ID="${CREDENTIALS%%:*}"
export AWS_SECRET_ACCESS_KEY="${CREDENTIALS##*:}"
BUCKETS=($(aws s3api list-buckets --output json | jq -r '.Buckets[].Name'))
printf "%s\n" "${BUCKETS[@]}" | awk -v "opts=${OPTS}" -- \
'
BEGIN { ORS=""; first=1 }
{
	if (first) { print opts; first=0 }
	bucket = $1
	# Enclose mount dir and location in quotes and escape "$" and "&",
	# which are special in autofs maps / awk replacements
	gsub(/\$/, "\\$", bucket)
	gsub(/&/, "\\\\&", bucket)
	print " \\\n\t\"/" bucket "\" \":" bucket "\""
}
END { if (!first) print "\n"; else exit 1 }'
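For reference, with two buckets `reports` and `backup`, the map entry emitted by the awk step would look roughly like this (illustrative; the option string is shortened):

```
-fstype=fuse.s3fs,defaults,noatime,... \
	"/reports" ":reports" \
	"/backup" ":backup"
```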
Save this as mount.goofyfs: