Sunday, 31 August 2025

Fixing Our Package Server: Upload Flow, OCI Compliance, and Dynamic Mirroring

Running your own package server and mirror can be a powerful way to control your software supply chain, but it comes with sharp edges. Recently, we hit several breaking issues in our registry implementation that highlighted how easy it is to get things almost working — but not correctly enough for the ecosystem tools.

What Was Broken & Why

  • Image push 404/commit errors: Our upload flow mixed up upload UUIDs vs. blob digests, and paths weren’t repo-scoped. When finalizing with PUT …/uploads/<uuid>?digest=sha256:…, the server couldn’t find the temp file → 404.
  • crictl pull size validation: HEAD/GET for blobs didn’t return Content-Length/ETag/Accept-Ranges, so unpack failed.
  • Manifests vs indexes: Server always served one content type, but clients need the correct application/vnd.oci.image.index.v1+json vs application/vnd.oci.image.manifest.v1+json.
  • Nested repos & tags: Our path parsing assumed user/repo, but real-world needs include deeper hierarchies (e.g., kubecve/api/kubecve-api:…).

Server Changes (Registry API v2)

We implemented repo-aware routing with a new parseV2Path, allowing any depth of repository naming. Storage layout now separates blobs and manifests cleanly, while maintaining legacy fallbacks:

<repo>/
  blobs/<sha256-hex>
  manifests/
    by-digest/<sha256-hex>
    by-tag/<tag>              # JSON: {"digest":"sha256:<hex>"}
  manifests/<tag>.json        # legacy fallback
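
To make the routing concrete, here is a minimal sketch of repo-aware path parsing in the spirit of parseV2Path (the actual function in the repository may differ); it scans from the right for the blobs/manifests keyword, so the repository name may contain any number of slashes:

package registry

import (
    "errors"
    "strings"
)

// v2Ref is an illustrative struct, not the server's actual type.
type v2Ref struct {
    Repo string // e.g. "kubecve/api/kubecve-api"
    Kind string // "blobs", "manifests" or "uploads"
    Ref  string // digest, tag or upload UUID (empty for a new upload)
}

// parseV2Path splits /v2/<name>/blobs/<digest>, /v2/<name>/manifests/<ref> and
// /v2/<name>/blobs/uploads/[<uuid>], scanning from the right so that <name>
// may contain any number of path components.
func parseV2Path(p string) (v2Ref, error) {
    parts := strings.Split(strings.Trim(strings.TrimPrefix(p, "/v2/"), "/"), "/")
    for i := len(parts) - 1; i > 0; i-- {
        if parts[i] != "blobs" && parts[i] != "manifests" {
            continue
        }
        repo := strings.Join(parts[:i], "/")
        rest := parts[i+1:]
        if len(rest) > 0 && rest[0] == "uploads" { // blobs/uploads/[<uuid>]
            ref := ""
            if len(rest) > 1 {
                ref = rest[1]
            }
            return v2Ref{Repo: repo, Kind: "uploads", Ref: ref}, nil
        }
        if len(rest) != 1 {
            return v2Ref{}, errors.New("unexpected path shape")
        }
        return v2Ref{Repo: repo, Kind: parts[i], Ref: rest[0]}, nil
    }
    return v2Ref{}, errors.New("not a recognized /v2/ path")
}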

Upload Flow

  1. POST /blobs/uploads/ → 202 + Docker-Upload-UUID
  2. PATCH /blobs/uploads/<uuid> → append chunk
  3. PUT /blobs/uploads/<uuid>?digest=sha256:<hex> → verify, rename, return 201
  4. Monolithic PUT supported for the entire blob in a single request
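
For illustration, the finalize step (3 above) might look roughly like this; names, storage paths, and error handling are simplified, and this is not the server's actual code:

package registry

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "net/http"
    "os"
    "path/filepath"
    "strings"
)

// finishUpload handles PUT /v2/<repo>/blobs/uploads/<uuid>?digest=sha256:<hex>.
// The temp file is found by upload UUID (not by digest), verified against the
// digest, then renamed into the repo-scoped blob store. A monolithic PUT would
// first stream r.Body into the temp file before this point.
func finishUpload(w http.ResponseWriter, r *http.Request, root, repo, uuid string) {
    digest := r.URL.Query().Get("digest")
    want := strings.TrimPrefix(digest, "sha256:")

    tmp := filepath.Join(root, repo, "uploads", uuid)
    f, err := os.Open(tmp)
    if err != nil {
        http.Error(w, "upload not found", http.StatusNotFound)
        return
    }
    h := sha256.New()
    _, err = io.Copy(h, f)
    f.Close()
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    if hex.EncodeToString(h.Sum(nil)) != want {
        http.Error(w, "digest mismatch", http.StatusBadRequest)
        return
    }

    dst := filepath.Join(root, repo, "blobs", want)
    if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    if err := os.Rename(tmp, dst); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    w.Header().Set("Docker-Content-Digest", digest)
    w.Header().Set("Location", fmt.Sprintf("/v2/%s/blobs/%s", repo, digest))
    w.WriteHeader(http.StatusCreated)
}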

Blob GET/HEAD

  • Now return Docker-Content-Digest, Content-Length, ETag, Accept-Ranges: bytes.
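
The fix itself is small in code terms; a sketch of the header handling (illustrative, not the exact handler):

package registry

import (
    "net/http"
    "os"
    "strconv"
)

// writeBlobHeaders sets the headers that containerd/crictl rely on when
// validating blob size during pull. hexDigest is the blob's sha256 in hex.
func writeBlobHeaders(w http.ResponseWriter, fi os.FileInfo, hexDigest string) {
    w.Header().Set("Docker-Content-Digest", "sha256:"+hexDigest)
    w.Header().Set("Content-Length", strconv.FormatInt(fi.Size(), 10))
    w.Header().Set("ETag", `"sha256:`+hexDigest+`"`)
    w.Header().Set("Accept-Ranges", "bytes")
    w.Header().Set("Content-Type", "application/octet-stream")
}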

Manifests

  • PUT stores under by-digest and links via by-tag.
  • GET/HEAD resolves by digest or tag with correct headers and media type.
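
To sketch how resolution and media-type selection can work against the layout above (illustrative code; resolveManifest is a hypothetical helper, not the server's own):

package registry

import (
    "encoding/json"
    "os"
    "path/filepath"
    "strings"
)

// resolveManifest maps a tag or digest reference to the stored manifest bytes
// and picks the media type embedded in the manifest itself, so an image index
// is served as ...image.index.v1+json and a single-arch manifest as
// ...image.manifest.v1+json.
func resolveManifest(root, repo, ref string) (body []byte, digest, mediaType string, err error) {
    digest = ref
    if !strings.HasPrefix(ref, "sha256:") {
        // by-tag/<tag> holds {"digest":"sha256:<hex>"}
        var link struct {
            Digest string `json:"digest"`
        }
        b, e := os.ReadFile(filepath.Join(root, repo, "manifests", "by-tag", ref))
        if e != nil {
            return nil, "", "", e
        }
        if e := json.Unmarshal(b, &link); e != nil {
            return nil, "", "", e
        }
        digest = link.Digest
    }
    body, err = os.ReadFile(filepath.Join(root, repo, "manifests", "by-digest",
        strings.TrimPrefix(digest, "sha256:")))
    if err != nil {
        return nil, "", "", err
    }
    var m struct {
        MediaType string `json:"mediaType"`
    }
    _ = json.Unmarshal(body, &m)
    mediaType = m.MediaType
    if mediaType == "" {
        // Conservative default when the manifest omits mediaType.
        mediaType = "application/vnd.oci.image.manifest.v1+json"
    }
    return body, digest, mediaType, nil
}

The handler then sets Content-Type to the returned media type, along with Docker-Content-Digest and Content-Length, for both GET and HEAD.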

Pull-Through Mirror (Dynamic)

We added dynamic mirror capability: if a requested manifest/blob is missing, the server fetches it upstream, caches it, and serves it immediately — without extra client config.
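
Very roughly, the manifest miss path looks like this, assuming the upstream host has already been chosen (see the next section); token auth, HEAD handling, and the by-digest/by-tag split are omitted for brevity, so treat it as a sketch rather than the server's code:

package registry

import (
    "fmt"
    "io"
    "net/http"
    "os"
    "path/filepath"
)

// mirrorManifest serves <repo>:<ref> from the local mirror cache, fetching it
// from upstreamHost on a miss. Simplified: no token auth (which Docker Hub
// requires), no HEAD support, and the cache stores raw bytes under the tag
// rather than the by-digest/by-tag layout used by the real store.
func mirrorManifest(w http.ResponseWriter, root, upstreamHost, repo, ref string) error {
    cached := filepath.Join(root, "__mirror__", upstreamHost, repo, "manifests", "by-tag", ref)
    if b, err := os.ReadFile(cached); err == nil {
        _, err = w.Write(b)
        return err
    }

    url := fmt.Sprintf("https://%s/v2/%s/manifests/%s", upstreamHost, repo, ref)
    req, err := http.NewRequest(http.MethodGet, url, nil)
    if err != nil {
        return err
    }
    req.Header.Set("Accept",
        "application/vnd.oci.image.index.v1+json, "+
            "application/vnd.oci.image.manifest.v1+json, "+
            "application/vnd.docker.distribution.manifest.list.v2+json, "+
            "application/vnd.docker.distribution.manifest.v2+json")
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("upstream %s: %s", upstreamHost, resp.Status)
    }
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        return err
    }

    // Cache first, then serve with the upstream media type.
    if err := os.MkdirAll(filepath.Dir(cached), 0o755); err == nil {
        _ = os.WriteFile(cached, body, 0o644)
    }
    w.Header().Set("Content-Type", resp.Header.Get("Content-Type"))
    w.Header().Set("Docker-Content-Digest", resp.Header.Get("Docker-Content-Digest"))
    _, err = w.Write(body)
    return err
}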

Mirror Logic

  • Uses containerd’s ?ns=<original-host> when available.
  • Else checks X-Registry-Host header.
  • Else tries fallback list: ["docker.io", "registry.k8s.io", "ghcr.io"].
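
The resolution order above, as a small helper (the function shape is illustrative; the query parameter, header name, and fallback list are as described):

package registry

import "net/http"

var fallbackHosts = []string{"docker.io", "registry.k8s.io", "ghcr.io"}

// upstreamHosts returns the upstream registries to try for a request, in
// priority order: containerd's ?ns= query parameter, then an explicit
// X-Registry-Host header, then the static fallback list.
func upstreamHosts(r *http.Request) []string {
    if ns := r.URL.Query().Get("ns"); ns != "" {
        return []string{ns}
    }
    if h := r.Header.Get("X-Registry-Host"); h != "" {
        return []string{h}
    }
    return fallbackHosts
}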

Special-casing for Docker Hub normalizes single-component repos (e.g., alpine → library/alpine). Storage is host-namespaced:

__mirror__/<host>/<repo>/
  blobs/<hex>
  manifests/
    by-digest/<hex>
    by-tag/<tag>
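
The Docker Hub normalization plus the cache path, sketched as a hypothetical helper:

package registry

import (
    "path/filepath"
    "strings"
)

// mirrorRepoPath maps an upstream host and repository to its cache directory.
// Docker Hub single-component names are normalized to library/<name>.
func mirrorRepoPath(root, host, repo string) string {
    if host == "docker.io" && !strings.Contains(repo, "/") {
        repo = "library/" + repo // e.g., alpine → library/alpine
    }
    return filepath.Join(root, "__mirror__", host, repo)
}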

Client Configs

  • Podman / CRI-O: Per-registry mirror configs in /etc/containers/registries.conf.d/.
  • containerd: Configured in /etc/containerd/config.toml with /etc/containerd/certs.d/_default/hosts.toml to route all pulls through the mirror.
  • Docker Engine: Limited to mirroring Docker Hub, so we don’t rely on it for dynamic routing.
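
For reference, the two drop-ins might look roughly like this, assuming the mirror answers at mirror.example.com:5000 (the hostname, port, and TLS/insecure settings are placeholders, not our real values):

# /etc/containers/registries.conf.d/docker-io-mirror.conf  (Podman / CRI-O)
[[registry]]
prefix = "docker.io"
location = "docker.io"

[[registry.mirror]]
location = "mirror.example.com:5000"

# /etc/containerd/certs.d/_default/hosts.toml  (containerd; requires
# config_path = "/etc/containerd/certs.d" under the CRI registry section
# of /etc/containerd/config.toml)
[host."https://mirror.example.com:5000"]
  capabilities = ["pull", "resolve"]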

CI & Versioning

In GitLab CI, we refined .release handling:

  • Increment/write .release only in the job workspace.
  • Publish it as an artifact for downstream jobs.
  • Never push it back to git.

This keeps tags consistent ($PACKAGE_SERVER_DOCKER, $NAME, $VERSION + .release) without polluting the repo.
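
For illustration only, the shape of the pipeline is roughly as follows; job names and build.sh are placeholders, and where the previous release number is seeded from (cache, a committed starting value) is outside this sketch:

# Sketch of the .release handling only; not the exact pipeline.
build:
  stage: build
  script:
    # Increment .release in the job workspace; it is never committed or pushed.
    - echo "$(( $(cat .release 2>/dev/null || echo 0) + 1 ))" > .release
    - ./build.sh "$NAME" "$VERSION-$(cat .release)"
  artifacts:
    paths:
      - .release          # downstream jobs receive it as an artifact, not via git

publish:
  stage: deploy
  needs: ["build"]        # pulls in the .release artifact
  script:
    - echo "Publishing $NAME:$VERSION-$(cat .release)"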

Validated Fixes

  • Manifest misses now trigger upstream fetch instead of 404.
  • Blob HEAD/GET now return correct sizes → crictl pull succeeds.
  • Mirror logs show host resolution, upstream attempts, and cache writes.

Takeaway

Building a standards-compliant registry isn’t just about storing blobs — every detail of the API matters. By fixing path parsing, header responses, manifest media types, and adding dynamic mirroring, we now have a robust package server that integrates cleanly with modern container tooling. It’s a good reminder that correctness is the real feature.

Repository: https://gitlab.com/jlcox70/repository-server

Container: https://hub.docker.com/r/jlcox1970/package-server

Tags: containers, registry, oci, mirror, kubernetes, devops, supply-chain, security

Sunday, 20 July 2025

From 290 CVEs to Zero: Rebuilding the Repository Server the Hard Way

The container image backing my repository server had quietly accumulated over 290 CVEs. Each of those is not just a statistic; each one is a potential entry point on the attack surface.

Let’s be clear: just because this service ran inside Kubernetes doesn't mean those vulnerabilities were somehow magically mitigated. Kubernetes may abstract deployment and orchestration, but it does nothing to shrink the surface exposed by the containers themselves. A vulnerable container in Kubernetes is still a vulnerable system.

This image was built on Rocky Linux 9. While updates were technically available, actually applying them was more difficult than it should have been. Patching wasn't just a matter of running dnf update—dependency entanglements and version mismatches made the process fragile.

I attempted a move to Rocky Linux 10, hoping for a cleaner slate. Unfortunately, that path was blocked: the DEB repo tooling I rely on couldn’t be installed at all. The package dependencies for the deb-dev utilities were broken or missing entirely. At that point, the problem wasn’t patching—it was the platform itself.

That left one real option: rebuild the entire server as a pure Go application. No more relying on shell scripts or external tools for managing Debian or RPM repository metadata. Instead, everything needed—GPG signing, metadata generation, directory layout—was implemented natively in Go.
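
To give a flavor of what "implemented natively in Go" means, here is a heavily trimmed illustration of emitting one entry of a Debian Packages index; the real code does considerably more, including GPG signing and the RPM side:

package repodata

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "os"
)

// packagesEntry builds the stanza for one .deb in a Debian Packages index.
// In a full implementation the control fields are read out of the .deb itself;
// here they are passed in to keep the sketch short.
func packagesEntry(debPath, name, version, arch, poolPath string) (string, error) {
    data, err := os.ReadFile(debPath)
    if err != nil {
        return "", err
    }
    sum := sha256.Sum256(data)
    return fmt.Sprintf(
        "Package: %s\nVersion: %s\nArchitecture: %s\nFilename: %s\nSize: %d\nSHA256: %s\n\n",
        name, version, arch, poolPath, len(data), hex.EncodeToString(sum[:])), nil
}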

The Result

  • Container size dropped from 260MB to just 7MB
  • Current CVE count: zero
  • Dependencies are explicit and pinned
  • Future updates are under my control, not gated by an OS vendor

In practical terms, the entire attack surface is now reduced to a single statically-linked Go binary. No base image, no package manager, no lingering system libraries to monitor or patch.

This is one of those changes that doesn’t just feel cleaner—it is objectively safer and more maintainable.

Lesson reinforced: containers don’t remove the need for security hygiene. They just make it easier to ignore it—until it’s too late.

Source on GitLab

Wednesday, 14 May 2025

Pitfalls of the Latest Tag in Deployments and How SBOM Tools Can Help

The Problem with Using the latest Tag

Using the latest tag in your deployments might seem convenient, but it brings a host of problems that can undermine stability and traceability. Here’s why:

  • Lack of Version Control: The latest tag automatically pulls the most recent version of an image. This means you might unknowingly deploy a new version without properly testing it, leading to unexpected failures.
  • Reproducibility Issues: Since the latest tag can change over time, reproducing a bug or incident becomes challenging. You might end up debugging a version that is no longer the same as the one originally deployed.
  • Deployment Drift: Multiple environments (development, staging, production) can end up running different versions even if they all reference latest. This drift breaks the consistency needed for reliable deployments.
  • Lack of Visibility: When things go wrong, it’s hard to know which version is actually running, as latest does not directly indicate a specific build or commit.

How SBOM Tools Like Grype Can Help

Software Bill of Materials (SBOM) tools, such as Grype (typically paired with Syft, which generates the SBOMs that Grype scans), are invaluable for overcoming the challenges posed by the latest tag and for managing software throughout its lifecycle. These tools enhance visibility, security, and consistency from build to production.

1. Build Phase: Secure and Compliant Images

  • Automated Vulnerability Scanning: Grype can be integrated into CI/CD pipelines to scan freshly built images, or SBOMs generated by Syft, and flag vulnerabilities before deployment (example commands after this list).
  • Dependency Management: Track dependencies and versions directly from the build process, allowing you to catch outdated or vulnerable libraries early.
  • Compliance Checks: SBOM tools ensure your builds meet internal and external security policies.
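
As a concrete example, a pipeline step might pair Syft and Grype like this (the image name is a placeholder):

# Generate an SBOM for the freshly built image, then gate the job on findings.
syft registry.example.com/myapp:1.4.2 -o cyclonedx-json > sbom.cdx.json
grype sbom:./sbom.cdx.json --fail-on high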

2. Deployment Phase: Verifying What You Ship

  • Image Verification: Grype helps verify the deployed image by checking hashes and versions.
  • Artifact Integrity: SBOMs can be signed and stored, providing verifiable evidence of what was deployed.
  • Version Locking: Using specific tags linked to SBOMs ensures consistency across environments.

3. Production Phase: Ongoing Monitoring and Maintenance

  • Continuous Vulnerability Scans: Regularly scan running containers to detect new vulnerabilities in your deployed software.
  • Lifecycle Management: SBOMs enable you to track when components reach end-of-life or become deprecated.
  • Audit and Compliance: Maintain an accurate record of all software versions and components running in production, helping with regulatory compliance.

Best Practices to Avoid the latest Pitfall

  • Use Specific Tags: Tag images with a version number or a commit hash to maintain consistency and traceability.
  • Automated SBOM Generation: Integrate tools like Grype in your CI/CD pipeline to automatically generate and store SBOMs for every build.
  • Regular Scanning: Continuously monitor your deployed containers with SBOM tools to catch vulnerabilities as they arise.
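
To make the first practice above concrete (names and digest are placeholders):

# Mutable and untraceable:
image: registry.example.com/myapp:latest

# Pinned to a version and a content digest recorded alongside the SBOM:
image: registry.example.com/myapp:1.4.2@sha256:<digest-from-the-build>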

Conclusion: Gaining Control and Visibility

By avoiding the use of the latest tag and incorporating SBOM tools like Grype, you significantly improve the stability and security of your deployments. These tools not only mitigate the risks associated with version ambiguity but also enhance the entire software lifecycle—from build to production. With SBOMs, you gain control, maintain visibility, and ensure consistent, secure deployments.