Description of issue or feature request:
Right now, tuf.ngclient is heavily tied to local system I/O: it assumes a metadata directory on disk that can be read/written. For example:
python-tuf/tuf/ngclient/updater.py
Lines 293 to 312 in 4d2ff8d
    def _persist_metadata(self, rolename: str, data: bytes) -> None:
        """Write metadata to disk atomically to avoid data loss."""
        temp_file_name: Optional[str] = None
        try:
            # encode the rolename to avoid issues with e.g. path separators
            encoded_name = parse.quote(rolename, "")
            filename = os.path.join(self._dir, f"{encoded_name}.json")
            with tempfile.NamedTemporaryFile(
                dir=self._dir, delete=False
            ) as temp_file:
                temp_file_name = temp_file.name
                temp_file.write(data)
            os.replace(temp_file.name, filename)
        except OSError as e:
            # remove tempfile if we managed to create one,
            # then let the exception happen
            if temp_file_name is not None:
                with contextlib.suppress(FileNotFoundError):
                    os.remove(temp_file_name)
            raise e
This is problematic in distributed worker setups like Warehouse (PyPI), where each worker runs in its own container or VM and thus can't easily share an on-disk TUF repo. In particular, this raises both reliability and security concerns:
- Reliability: an unfortunate corruption in a single worker's TUF repo results in a hard-to-diagnose flaky worker, since each worker has its own copy of the repo.
- Security: each worker's TUF repo is independently stored on a (machine-local) disk, making them harder to audit.
This problem was noted a few years back, before tuf.ngclient was created: #1009. The solution then was to add a filesystem abstraction to the tuf.metadata APIs, which was done via secure-systems-lab/securesystemslib#232 and #1009. However, this abstraction wasn't added to the ngclient APIs, only to the low-level metadata ones.
Current behavior:
tuf.ngclient currently assumes that it can perform persistent local I/O for its repository.
Expected behavior:
tuf.ngclient should support an I/O abstraction (such as the pre-existing StorageBackendInterface, if suitable) for persistent repo operations, enabling use in distributed deployments.
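
To make the request more concrete, here is a rough sketch of what such an abstraction could look like. Everything in it is hypothetical: MetadataStore, InMemoryStore and LocalDirStore are names made up for illustration, they are not part of python-tuf or securesystemslib, and the sketch does not try to match the actual StorageBackendInterface signatures. The idea is only that the updater would call load/persist on an injected object instead of touching the filesystem directly.

```python
import contextlib
import os
import tempfile
from typing import Dict, Optional, Protocol
from urllib import parse


class MetadataStore(Protocol):
    """Hypothetical storage interface (NOT an existing python-tuf API)."""

    def load(self, rolename: str) -> bytes: ...

    def persist(self, rolename: str, data: bytes) -> None: ...


class InMemoryStore:
    """Illustrative backend with no disk I/O; a shared/remote store
    could expose the same two methods."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def load(self, rolename: str) -> bytes:
        return self._blobs[rolename]

    def persist(self, rolename: str, data: bytes) -> None:
        self._blobs[rolename] = data


class LocalDirStore:
    """Illustrative backend that keeps the current on-disk behavior."""

    def __init__(self, directory: str) -> None:
        self._dir = directory

    def _path(self, rolename: str) -> str:
        # encode the rolename to avoid issues with e.g. path separators
        encoded_name = parse.quote(rolename, "")
        return os.path.join(self._dir, f"{encoded_name}.json")

    def load(self, rolename: str) -> bytes:
        with open(self._path(rolename), "rb") as f:
            return f.read()

    def persist(self, rolename: str, data: bytes) -> None:
        # same atomic write-then-rename approach as _persist_metadata
        temp_file_name: Optional[str] = None
        try:
            with tempfile.NamedTemporaryFile(
                dir=self._dir, delete=False
            ) as temp_file:
                temp_file_name = temp_file.name
                temp_file.write(data)
            os.replace(temp_file_name, self._path(rolename))
        except OSError:
            # remove tempfile if we managed to create one, then re-raise
            if temp_file_name is not None:
                with contextlib.suppress(FileNotFoundError):
                    os.remove(temp_file_name)
            raise
```

With something along these lines, a deployment like Warehouse could pass in a backend for whatever shared storage it already uses, while the default would stay equivalent to today's local-directory behavior.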