possible blog post: Caching TUF metadata #2605

@jku

I've spent far too long in the past week looking at CDN logs... I collected some notes from this, and wrote a first draft of a blog post or something. copy-pasting here so I don't lose it

TUF implementation details: Caching and content delivery networks

TUF metadata can be cached at various places during its lifetime. This post aims to describe the useful
methods of caching. The write-up assumes that the "consistent snapshot" feature of TUF is used:
this should be true for all reasonable implementations.

Client metadata cache

A TUF client stores downloaded metadata in an application cache as part of the
TUF Client Workflow. Note that caching metadata is subtly different from caching
artifacts: an artifact cache is a "pure" cache and can be purged at any time without
side-effects (other than possibly having to re-download). Purging the metadata cache
is also possible without service loss but does have minor security implications:
the client no longer remembers which metadata versions it has already seen, so
some rollback attack protection is lost.
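
To illustrate why the metadata cache matters for rollback protection, here is a minimal sketch of the version comparison a client can only make while it still has cached metadata. The function names, the `RollbackError` type and the `<role>.json` cache layout are assumptions made for this example, not the API of any particular TUF client, and signature and expiry checks are deliberately left out.

```python
import json
from pathlib import Path
from typing import Optional


class RollbackError(Exception):
    """Raised when new metadata is older than what the client already trusts."""


def cached_version(cache_dir: Path, role: str) -> Optional[int]:
    """Return the version of the cached metadata for `role`, or None if absent."""
    # Assumed cache layout: one "<role>.json" file per role.
    path = cache_dir / f"{role}.json"
    if not path.exists():
        return None
    return json.loads(path.read_text())["signed"]["version"]


def check_rollback(cache_dir: Path, role: str, new_version: int) -> None:
    """Reject metadata whose version is lower than the cached version."""
    old = cached_version(cache_dir, role)
    # With an empty (purged) cache there is nothing to compare against,
    # so an older but validly signed file would pass this check.
    if old is not None and new_version < old:
        raise RollbackError(f"{role}: version {new_version} < cached {old}")
```

After a purge, `cached_version` returns `None` and the comparison is skipped, which is exactly the protection that is lost.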

Client HTTP cache

In addition to the actual metadata, a client could cache the ETag information
included in a timestamp.json response and use the If-None-Match header in
subsequent requests. This is not useful for other metadata or artifacts: with
consistent snapshots their file names include a version or hash, so the content
behind a given URL should never change.

There is a minor information leak if this is done (the server could now respond
maliciously to only some clients, based on the content of the If-None-Match
header). Current client implementations are not known to cache ETags.
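
As a rough sketch of what such ETag handling might look like on the client side, the two helpers below build the conditional request headers and decide which timestamp bytes to keep after a response. Both function names and their parameters are hypothetical, and the HTTP transfer itself is left to whatever library the client already uses.

```python
from typing import Dict, Optional, Tuple


def conditional_headers(cached_etag: Optional[str]) -> Dict[str, str]:
    """Build headers for a timestamp.json request, with If-None-Match if possible."""
    if cached_etag is None:
        return {}
    return {"If-None-Match": cached_etag}


def handle_timestamp_response(
    status: int,
    body: bytes,
    response_etag: Optional[str],
    cached_body: Optional[bytes],
    cached_etag: Optional[str],
) -> Tuple[bytes, Optional[str]]:
    """Return the (timestamp bytes, ETag) the client should keep after a response."""
    if status == 304:
        # Not Modified: the server says the cached copy is still current.
        if cached_body is None:
            raise ValueError("received 304 without a cached timestamp")
        return cached_body, cached_etag
    if status == 200:
        # New content: it still has to pass normal TUF verification before use.
        return body, response_etag
    raise ValueError(f"unexpected HTTP status {status}")
```

A 304 only saves the download: the cached timestamp must still be checked for expiry as in the normal client workflow.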

Content Delivery Network caching

One could imagine that caching something as simple as TUF metadata in a Content
Delivery Network (CDN) is a trivial feat but it turns out there are several pitfalls.

These are some of the lessons that have been learned while maintaining TUF repositories:

  • Uploading a new repository version to backend storage should be atomic (the metadata
    versions on the storage backend should always be consistent). If this is not technically possible,
    snapshot and all targets metadata should be uploaded before root and timestamp: this
    minimizes the window of potentially inconsistent metadata.
  • "Old" metadata (or artifact) versions should not be removed from backend storage
    immediately: this can break clients that are in the middle of an update process.
  • CDN frontends should avoid serving any stale responses: TUF requires even 404 responses to
    not be stale, otherwise the repository state may be inconsistent.
  • CDN frontends may cache versioned metadata responses (root, snapshot, targets) with
    long lifetimes.
  • There are two valid alternatives for caching other responses:
    1. The CDN frontend may use a "negative cache" (caching failure codes) and may cache
      timestamp metadata responses, if it is able to invalidate the cache immediately on
      upload of new repository versions to the storage backend.
    2. The CDN frontend should not cache timestamp metadata responses or use "negative caching"
      if it is unable to invalidate the cache on upload of new data.
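
The caching rules in the list above can be condensed into a small decision function. This is only a sketch: the `CachePolicy` names, the `cache_policy` function and the file name pattern are assumptions for illustration, it covers metadata paths only, and mapping each policy to actual CDN settings depends on the CDN in use.

```python
import re
from enum import Enum


class CachePolicy(Enum):
    LONG_LIVED = "long-lived"                # safe to cache with a long lifetime
    UNTIL_INVALIDATED = "until-invalidated"  # cache, but purge on every upload
    DO_NOT_CACHE = "do-not-cache"            # always ask the storage backend


# Consistent-snapshot metadata names carry a version prefix, e.g. "3.snapshot.json".
VERSIONED_METADATA = re.compile(r"^\d+\..+\.json$")


def cache_policy(path: str, status: int, can_invalidate_on_upload: bool) -> CachePolicy:
    """Pick a caching policy for one response served by the CDN frontend."""
    name = path.rsplit("/", 1)[-1]
    # Successful responses for versioned metadata never change once published.
    if status == 200 and VERSIONED_METADATA.match(name):
        return CachePolicy.LONG_LIVED
    # Everything else (timestamp.json, 404s and other failure codes) may only
    # be cached if the cache is purged as soon as a new version is uploaded.
    if can_invalidate_on_upload:
        return CachePolicy.UNTIL_INVALIDATED
    return CachePolicy.DO_NOT_CACHE
```

Note that a 404 for a versioned file such as `4.root.json` is deliberately not treated as long-lived: that file may appear with the next upload.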

At first glance it may seem like the above advice is overly cautious, and that failures
would be rare. In practice, testing and alerting systems in particular have consistently
managed to find failing combinations of mistakenly cached content.
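
For storage backends without atomic uploads, the ordering advice from the first item in the list above could be sketched as a simple sort. The `upload_order` function is hypothetical, and placing root before timestamp is a choice made for this example: the advice above only says that both come after snapshot and targets metadata.

```python
from typing import List


def upload_order(filenames: List[str]) -> List[str]:
    """Sort metadata files so that root and timestamp are uploaded last."""

    def rank(name: str) -> int:
        base = name.rsplit("/", 1)[-1]
        # Strip a consistent-snapshot version prefix such as "2." if present.
        prefix, _, rest = base.partition(".")
        role_file = rest if prefix.isdigit() else base
        if role_file == "timestamp.json":
            return 2  # uploaded last
        if role_file == "root.json":
            return 1  # uploaded after snapshot and targets metadata
        return 0      # snapshot and all targets metadata go first

    # sorted() is stable, so files with the same rank keep their input order.
    return sorted(filenames, key=rank)
```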
