ngclient: support StorageBackendInterface? #2676

@woodruffw

Description of issue or feature request:

Right now, tuf.ngclient is heavily tied to local system I/O: it assumes a metadata directory on disk that can be read/written. For example:

    def _persist_metadata(self, rolename: str, data: bytes) -> None:
        """Write metadata to disk atomically to avoid data loss."""
        temp_file_name: Optional[str] = None
        try:
            # encode the rolename to avoid issues with e.g. path separators
            encoded_name = parse.quote(rolename, "")
            filename = os.path.join(self._dir, f"{encoded_name}.json")
            with tempfile.NamedTemporaryFile(
                dir=self._dir, delete=False
            ) as temp_file:
                temp_file_name = temp_file.name
                temp_file.write(data)
            os.replace(temp_file.name, filename)
        except OSError as e:
            # remove tempfile if we managed to create one,
            # then let the exception happen
            if temp_file_name is not None:
                with contextlib.suppress(FileNotFoundError):
                    os.remove(temp_file_name)
            raise e

This is problematic in distributed worker setups like Warehouse (PyPI), where each worker runs in its own container or VM and thus can't easily share an on-disk TUF repo. In particular, this raises both reliability and security concerns:

  • Reliability: an unfortunate corruption in a single worker's TUF repo results in a hard-to-diagnose flaky worker, since each worker has its own copy of the repo.
  • Security: each worker's TUF repo is independently stored on a (machine-local) disk, which makes the individual copies harder to audit.

This problem was noted a few years back, before tuf.ngclient was created: #1009. The solution then was to add a filesystem abstraction to the tuf.metadata APIs, which was done via secure-systems-lab/securesystemslib#232 and #1009. However, this abstraction wasn't added to the ngclient APIs, only to the low-level metadata ones.

Current behavior:

tuf.ngclient currently assumes that it can perform persistent local I/O for its repository.

Expected behavior:

tuf.ngclient should support an I/O abstraction (such as the pre-existing StorageBackendInterface, if suitable) for persistent repo operations, enabling use in distributed deployments.
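
To make the request concrete, here is a minimal sketch of what such an abstraction could look like. Everything in it is hypothetical: `MetadataStore`, `LocalMetadataStore`, and `InMemoryMetadataStore` are illustrative names, not part of tuf.ngclient or securesystemslib, and the real StorageBackendInterface may have a different shape. The local variant keeps the same atomic write-then-rename approach as `_persist_metadata` above; the in-memory variant stands in for a shared backend (object store, database, etc.).

```python
import contextlib
import os
import tempfile
from typing import Dict, Optional, Protocol
from urllib import parse


class MetadataStore(Protocol):
    """Hypothetical interface: the only I/O the client would need."""

    def load(self, rolename: str) -> bytes: ...

    def store(self, rolename: str, data: bytes) -> None: ...


class LocalMetadataStore:
    """Hypothetical disk-backed store, mirroring the current behavior."""

    def __init__(self, directory: str) -> None:
        self._dir = directory

    def _path(self, rolename: str) -> str:
        # encode the rolename to avoid issues with e.g. path separators
        encoded_name = parse.quote(rolename, "")
        return os.path.join(self._dir, f"{encoded_name}.json")

    def load(self, rolename: str) -> bytes:
        with open(self._path(rolename), "rb") as f:
            return f.read()

    def store(self, rolename: str, data: bytes) -> None:
        temp_file_name: Optional[str] = None
        try:
            # write to a temp file in the same directory, then rename atomically
            with tempfile.NamedTemporaryFile(
                dir=self._dir, delete=False
            ) as temp_file:
                temp_file_name = temp_file.name
                temp_file.write(data)
            os.replace(temp_file_name, self._path(rolename))
        except OSError:
            # remove the temp file if one was created, then re-raise
            if temp_file_name is not None:
                with contextlib.suppress(FileNotFoundError):
                    os.remove(temp_file_name)
            raise


class InMemoryMetadataStore:
    """Hypothetical stand-in for a shared, non-filesystem backend."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def load(self, rolename: str) -> bytes:
        try:
            return self._blobs[rolename]
        except KeyError:
            # match the error a missing file would raise on disk
            raise FileNotFoundError(rolename) from None

    def store(self, rolename: str, data: bytes) -> None:
        self._blobs[rolename] = data
```

With something along these lines, the updater could accept any object satisfying `MetadataStore` instead of a metadata directory path, and deployments like Warehouse could supply a backend shared across workers.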
