Sort roles dict by keys when serializing #2161
Comments
It is not. Python 3.7 and newer guarantee that dict ordering is preserved, just like OrderedDict. python-tuf supports python >= 3.7 |
This looks like an unnecessary change to me but reproducible output is definitely a priority to this project: so if there are any issues with that, let's handle them. |
I know, but the order can still differ for other reasons, such as a different insertion order across separate runs of a tool (I'm currently writing my own tool for building a repo, have split the pipeline into separate CLI commands, and noticed that the keys are permuted on each call of my tool). We want line-based diffs between versions of the metadata to be minimal (and empty when there are no meaningful changes); that makes them easier for a human to review using tools like |
If your issue is that you are inserting items in a non-reproducible order, then using OrderedDict won't help at all: it just preserves the insertion order, just like dict does. If this PR helps your case then it's actually the I'm making this point because most of the commit seems to be OrderedDict changes, but they seem irrelevant to the proposed fix: this may not be obvious to a casual observer |
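To illustrate the point above, here is a minimal sketch (the role names and values are made up for illustration and are not taken from the PR). Both dict and OrderedDict keep whatever insertion order they are given, so only an explicit sort yields an order that does not depend on the run:

```python
from collections import OrderedDict

# Hypothetical role entries, inserted in two different orders,
# as might happen across separate runs of a tool.
run_a = {"snapshot": 1, "root": 2, "targets": 3}
run_b = {"targets": 3, "root": 2, "snapshot": 1}

# Plain dicts (Python >= 3.7) preserve insertion order, so the key order differs.
assert list(run_a) != list(run_b)

# OrderedDict also just preserves insertion order, so it does not help.
assert list(OrderedDict(run_a)) != list(OrderedDict(run_b))

# Sorting the items explicitly is what makes the order deterministic.
sorted_a = dict(sorted(run_a.items()))
sorted_b = dict(sorted(run_b.items()))
assert list(sorted_a) == list(sorted_b) == ["root", "snapshot", "targets"]
```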
It came up in some of @MVrachev's work:
There was also an issue with not preserving the order in a list, for which we use a dict internally: And: Can't remember the details right now. |
I know. The order is enforced by sorting, and I think it's the TUF lib's responsibility to enforce ordering: almost everyone will need it, sorting will cost almost nothing (for small enough |
TL;DR: I think the sorting is worth considering. |
I am not sure I understand the need of using |
I think that the metadata files produced from a set of source files (i.e. keys and user files) should be as similar as possible to other metadata (i.e. metadata from different invocations in randomized conditions, or produced by different tools), because it (in order of decreasing importance)
While 1. and 2. are only important for developers, 3. would impact everyone. |
Cheers, the code changes look a lot easier to understand now without OrderedDict (it's such a painful API). *) current PR doesn't actually implement 100% alphabetical but what looks like a stable, almost alphabetical order |
I don't have a strong opinion on this. |
Let me try to summarize the discussion. IIUC the goal here is reproducible metadata, which can be achieved by:
(1) is already true for the internal representation as discussed above, and (2) is already true to some extent for the default json serializer (sorted keys, uniform indentation and whitespace). IMO:
TL;DR: I'm for closing this issue. |
So after lukas' comment I actually looked at the JSON serialization code and we do sort dicts by key! So now I'm confused. If I understand correctly the serialization result (the json) is already 100% sorted wherever it should be... @KOLANICH is this correct: JSON is already like you want but you are asking for the intermediate format (the python dicts) to also be sorted? Could you show actual code that is broken by the current functionality -- is this about Metadata eq comparisons or something? |
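As a rough illustration of serializer-side sorting, here is a sketch using only the standard-library `json` module with `sort_keys=True`. The `serialize` helper and the sample data are assumptions for illustration; this is not necessarily the exact call python-tuf makes. Two dicts built in different insertion orders serialize to identical bytes:

```python
import json

# Hypothetical metadata fragments with the same content but different insertion order.
a = {"roles": {"snapshot": {"threshold": 1}, "root": {"threshold": 1}}}
b = {"roles": {"root": {"threshold": 1}, "snapshot": {"threshold": 1}}}


def serialize(data: dict) -> bytes:
    """Illustrative helper: sort keys at every nesting level while dumping."""
    return json.dumps(data, indent=1, sort_keys=True).encode("utf-8")


# Without sort_keys the output follows insertion order and differs.
assert json.dumps(a) != json.dumps(b)

# With sort_keys the output is byte-for-byte identical.
assert serialize(a) == serialize(b)
```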
I just used my own serializer and parser, because I have my own library of serializers and parsers for various serialization formats. |
So, yes. It can probably be dealt with in a serializer, but storing them sorted should have the benefit that re-sorting is faster. |
Ok thanks for the details. I think in that case I agree with Lukas: When this was designed sorting was intentionally left to the serializer. Change of behaviour now is not worth the small risk when the benefit is also so small: serializers are supposed to choose their serialization format, this includes potentially sorting dicts. I'll close this based on votes from me and lukas |
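For anyone writing their own serializer, one way to handle this on the serializer side is to normalize the dict just before emitting it. The following is only a sketch: `sort_keys_recursively` is a hypothetical helper name, not part of python-tuf, and it assumes the metadata has already been converted to plain dicts and lists with mutually comparable (e.g. string) keys:

```python
from typing import Any


def sort_keys_recursively(value: Any) -> Any:
    """Return a copy of `value` with every dict's keys in sorted order.

    Lists keep their element order, because list order may be meaningful.
    """
    if isinstance(value, dict):
        return {key: sort_keys_recursively(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [sort_keys_recursively(item) for item in value]
    return value


# Hypothetical input with keys in arbitrary insertion order.
unsorted = {"targets": {"b": 2, "a": 1}, "root": [{"z": 0, "y": 1}]}
normalized = sort_keys_recursively(unsorted)

assert list(normalized) == ["root", "targets"]
assert list(normalized["targets"]) == ["a", "b"]
```

The result can then be passed to whatever output format the serializer targets, so the in-memory representation does not need to be kept sorted.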
Description of issue or feature request:
To reduce variation and achieve better diffs we probably should sort keys in roles dict before serializing.
Current behavior:
The order is random.
Expected behavior:
The order should always be the same.