Commit 6703b51

Improve performance of delegate_hashed_bins
Due to the performance overhead of deepcopy(), which roledb uses extensively, the delegate() function is rather slow. This is especially noticeable when delegate_hashed_bins is called with a large number_of_bins.

To reduce the number of deepcopy() operations, we remove the direct calls to delegate() and instead use the newly added helper functions to replicate its behaviour with only a single update call to the roledb.

This improves the performance of a 16k bins delegation from a 1hr 24min operation on my laptop to 33s.

Ideally, once Issue #1005 has been properly fixed, this commit can be reverted and we can once again just call delegate() here.

Signed-off-by: Joshua Lock <[email protected]>
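The cost pattern described in the commit message can be illustrated with a minimal sketch. It is not the tuf roledb: `ToyRoleDB`, its methods, and the `copied_entries` counter are hypothetical names invented for this illustration, and the only assumption carried over from the message is that every read and write of a roleinfo deep-copies it.

```python
import copy


class ToyRoleDB:
    """Hypothetical stand-in for a role database that deep-copies on every access."""

    def __init__(self):
        self._roles = {'targets': {'delegations': {'roles': []}}}
        # Number of delegation entries that have passed through deepcopy().
        self.copied_entries = 0

    def get_roleinfo(self, rolename):
        info = self._roles[rolename]
        self.copied_entries += len(info['delegations']['roles'])
        return copy.deepcopy(info)

    def update_roleinfo(self, rolename, roleinfo):
        self.copied_entries += len(roleinfo['delegations']['roles'])
        self._roles[rolename] = copy.deepcopy(roleinfo)


def delegate_one_at_a_time(db, bins):
    # One read and one write per bin: the growing list is copied again each time.
    for bin_info in bins:
        info = db.get_roleinfo('targets')
        info['delegations']['roles'].append(bin_info)
        db.update_roleinfo('targets', info)


def delegate_batched(db, bins):
    # Queue everything first, then touch the database once.
    info = db.get_roleinfo('targets')
    info['delegations']['roles'].extend(bins)
    db.update_roleinfo('targets', info)


def make_bins(count):
    return [{'name': format(i, '02x'), 'threshold': 1} for i in range(count)]


slow_db, fast_db = ToyRoleDB(), ToyRoleDB()
delegate_one_at_a_time(slow_db, make_bins(256))
delegate_batched(fast_db, make_bins(256))

print(slow_db.copied_entries)  # 65536, i.e. 256 ** 2
print(fast_db.copied_entries)  # 256
```

In this toy model the per-bin approach copies a number of entries that grows quadratically with the number of bins, while the batched approach grows linearly. Both end with the same stored delegations.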
1 parent 9be4634 commit 6703b51

1 file changed: +50 -6 lines changed

tuf/repository_tool.py

Lines changed: 50 additions & 6 deletions
@@ -2546,12 +2546,56 @@ def delegate_hashed_bins(self, list_of_targets, keys_of_hashed_bins,
       hash_prefix = _get_hash(target_path.replace('\\', '/').lstrip('/'))[:prefix_length]
       ordered_roles[int(hash_prefix, 16) // bin_size]["target_paths"].append(target_path)
 
-    for bin_rolename in ordered_roles:
-      # Delegate from the "unclaimed" targets role to each 'bin_rolename'
-      self.delegate(bin_rolename['name'], keys_of_hashed_bins, [],
-          list_of_targets=bin_rolename['target_paths'],
-          path_hash_prefixes=bin_rolename['target_hash_prefixes'])
-      logger.debug('Delegated from ' + repr(self.rolename) + ' to ' + repr(bin_rolename))
+    keyids, keydict = _keys_to_keydict(keys_of_hashed_bins)
+
+    # A queue of roleinfo's that need to be updated in the roledb
+    delegated_roleinfos = []
+
+    for bin_role in ordered_roles:
+      # TODO: originally we just called self.delegate() for each item in this
+      # iteration. However, this is *extremely* slow when creating a large
+      # number of hashed bins, i.e. 16k as is recommended for PyPI usage in
+      # PEP 458: https://www.python.org/dev/peps/pep-0458/
+      # The source of the slowness is the interactions with the roledb, which
+      # causes several deep copies of roleinfo dictionaries:
+      # https://github.com/theupdateframework/tuf/issues/1005
+      # Once the underlying issues in #1005 are resolved, i.e. some combination
+      # of the intermediate and long-term fixes, we may simplify here by
+      # switching back to just calling self.delegate(), but until that time we
+      # queue roledb interactions and perform all updates to the roledb in one
+      # operation at the end of the iteration.
+
+      relative_paths = {}
+      targets_directory_length = len(self._targets_directory)
+      for path in bin_role['target_paths']:
+        relative_paths.update({path[targets_directory_length:]: {}})
+
+      # Delegate from the "unclaimed" targets role to each 'bin_role'
+      target = self._create_delegated_target(bin_role['name'], keyids, 1,
+          relative_paths)
+
+      roleinfo = {'name': bin_role['name'],
+                  'keyids': keyids,
+                  'threshold': 1,
+                  'terminating': False,
+                  'path_hash_prefixes': bin_role['target_hash_prefixes']}
+      delegated_roleinfos.append(roleinfo)
+
+      for key in keys_of_hashed_bins:
+        target.add_verification_key(key)
+
+      # Add the new delegation to the top-level 'targets' role object (i.e.,
+      # 'repository.targets()').
+      if self.rolename != 'targets':
+        self._parent_targets_object.add_delegated_role(bin_role['name'],
+            target)
+
+      # Add 'new_targets_object' to the 'targets' role object (this object).
+      self.add_delegated_role(bin_role['name'], target)
+      logger.debug('Delegated from ' + repr(self.rolename) + ' to ' + repr(bin_role))
+
+
+    self._update_roledb_delegations(keydict, delegated_roleinfos)
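The two context lines at the top of the hunk assign each target path to a bin by hashing the normalized path and integer-dividing the leading hex digits by `bin_size`. The sketch below reproduces that arithmetic in isolation. Its assumptions are labeled in the comments: `_get_hash` is taken to be a SHA-256 hex digest of the path, and `prefix_length` and `bin_size` are derived from `number_of_bins` in a way that is not shown in this diff. `bin_index_for` is a hypothetical helper name, not part of tuf.

```python
import hashlib


def bin_index_for(target_path, number_of_bins):
    """Return the index of the hashed bin a target path would fall into.

    Hypothetical helper for illustration only.
    """
    # Assumption: the prefix is just long enough to write the largest bin
    # index in hexadecimal (16384 bins -> '3fff' -> 4 hex digits).
    prefix_length = len(format(number_of_bins - 1, 'x'))
    total_prefixes = 16 ** prefix_length

    # Assumption: bins must divide the prefix space evenly.
    if total_prefixes % number_of_bins != 0:
        raise ValueError('number_of_bins must evenly divide 16 ** prefix_length')
    bin_size = total_prefixes // number_of_bins

    # Same normalization as the diff: forward slashes, no leading slash.
    normalized = target_path.replace('\\', '/').lstrip('/')

    # Assumption: _get_hash() is a SHA-256 hex digest of the UTF-8 path.
    digest = hashlib.sha256(normalized.encode('utf-8')).hexdigest()
    hash_prefix = digest[:prefix_length]

    return int(hash_prefix, 16) // bin_size


index = bin_index_for('/packages/example-1.0.tar.gz', 16384)
print(0 <= index < 16384)  # True
```

With 16384 bins, the prefix is four hex digits and each bin covers four consecutive prefixes, so every path lands in exactly one bin regardless of the path separator style.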