-
Notifications
You must be signed in to change notification settings - Fork 280
Description
Description of issue or feature request:
The TUF repository tool provides a functionality to distribute a large number of target files over multiple delegated roles (see delegate_hashed_bins
).
To add a target to the corresponding bin, identified by the hash of the target path, or remove one from a bin the methods remove add_target_to_bin
and remove_target_from_bin
are provided. These methods are wrappers around add_target
and remove_target
, which don't call those methods directly but via a dispatcher helper _locate_and_update_target_in_bin
, which also implements common functionality for the wrappers.
These methods should be refactored for the reasons and as outlined below:
Current behavior:
-
The current architecture doesn't allow to pass through different arguments to the wrapped methods, e.g. the
custom
argument only supported byadd_target
:
https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2672 -
_locate_and_update_target_in_bin
doesn't seem to be designed to dispatch to any other methods thanadd_target
andremove_target
to justify the dispatcher architecture -
_locate_and_update_target_in_bin
replicates functionality from the wrapped methods, e.g. that the targets path exists as file in the current directory
https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2737-L2739
(see Targets'add_target(s)
methods do not correctly check passed paths #957 for reasons why that might not even be necessary). -
_locate_and_update_target_in_bin
makes risky assumption, e.g. it infers thehash_prefix_length
(i.e.len(hex(number_of_bins - 1)[2:])
), by taking the length of the first element in thepath_hash_prefixes
field of the first delegation of the targets object on which it is called.
https://github.com/theupdateframework/tuf/blob/4fe29d138df02035eed9716657cfe0a76e17e178/tuf/repository_tool.py#L2719-L2733 -
_locate_and_update_target_in_bin
makes aO(1)
searchO(n)
.
Expected behavior:
- simplify architecture to allow adding a
custom
argument toadd_target_to_bin
that gets passed through - remove redundant code
- pass
hash_prefix_length
ornumber_of_bins
instead of inferring it from the metadata (the calling code should know this). - make bin search
O(1)
, e.g.:
def find_bin_for_hash(hash_prefix, bin_size):
prefix_len = len(hash_prefix)
hash_prefix = int(hash_prefix, 16)
low = hash_prefix - (hash_prefix % bin_size)
high = low + bin_size - 1
return f"{low:0{prefix_len}x}-{high:0{prefix_len}x}"