Commit 4b2f28c

ADR0007: How we are going to validate the new API codebase
Add a decision record about the different options which we can use to validate the new code under tuf.api. TODO: Make decision. Signed-off-by: Martin Vrachev <[email protected]>
1 parent 65005cf commit 4b2f28c

2 files changed

+147
-0
lines changed

Lines changed: 146 additions & 0 deletions
@@ -0,0 +1,146 @@
# How we are going to validate the new API codebase

* Date: 2021-03-10

Technical Story:
- [securesystemslib schema checker issues](https://github.com/secure-systems-lab/securesystemslib/issues/183)
- [new TUF validation guidelines](https://github.com/theupdateframework/tuf/issues/1130)

## Context and Problem Statement

The schema checking currently used in the codebase has several problems:

1. Some schemas sound more specific than they are.
2. Some schemas are an odd replacement for constants.
3. Schema validation is generally **overused**. In addition to user input,
   we also validate input generated programmatically by our private functions.
4. There are instances where some attributes are validated multiple times
   when executing one API call.
5. Schema checking sometimes makes execution branches unreachable.
6. The error messages from checking schemas are often not helpful.

## Decision Drivers and Requirements
Some of the requirements we want to meet are:
1. The ability to decide which functions to validate and which not.
2. Allow for custom, deeper validation beyond type checking.
3. As little performance overhead as possible.
4. Add as few new dependencies as possible.
5. Support for all Python versions we use.

## Considered Options
1. Usage of a `ValidationMixin`.
2. Usage of a third-party library called `pydantic`.

## Pros, Cons, and Considerations of the Options

### Option 1: Usage of a ValidationMixin

**Note:** All pros, cons, and considerations are documented with the assumption
that we would implement the `ValidationMixin` the same way it is implemented in
[in-toto](https://github.com/in-toto) up to version 1.0.1 (the latest
version at the time of writing).

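To make the discussion below easier to follow, here is a minimal sketch of how such a mixin can work. It is an illustration modelled on the idea behind the in-toto implementation (see Links), not a copy of it: `validate()` looks up every method whose name starts with `_validate_` and calls it. The `Point` class and its checks are hypothetical.

```python
import inspect


class ValidationMixin:
    """Hypothetical sketch of a validation mixin (not the in-toto code)."""

    def validate(self) -> None:
        # Collect every bound method named "_validate_*" and call it.
        # Each such method is expected to raise on invalid state.
        for name, method in inspect.getmembers(self, predicate=inspect.ismethod):
            if name.startswith("_validate_"):
                method()


class Point(ValidationMixin):
    """Hypothetical class used only to demonstrate the mixin."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        # One call validates all fields that have a _validate_* method.
        self.validate()

    def _validate_x(self):
        if not isinstance(self.x, int):
            raise TypeError("x should be of type int")

    def _validate_y(self):
        if not isinstance(self.y, int):
            raise TypeError("y should be of type int")
```

With this sketch, `Point(1, 2)` succeeds and `Point("1", 2)` raises `TypeError`. Validation only happens where `validate()` is explicitly called.
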
* Good, because it is concise: a single `validate()` call validates
  multiple fields.

* Good, because it allows reuse of the validation code through
  `securesystemslib.schemas` or another schema of our choice.

* Bad, because there could be different code paths and return statements, and as
  a consequence there could be a code path which doesn't call `validate()`.

  Example:
```python
# Assumed imports: ValidationMixin as implemented in in-toto (see Links)
# and FormatError from securesystemslib.
from in_toto.models.common import ValidationMixin
from securesystemslib.exceptions import FormatError


class User(ValidationMixin):

    def __init__(self, id: int, nickname: str) -> None:
        self.id = id
        self.nickname = nickname
        self.pro_user = False

        self.validate()

    def _validate_id(self):
        if not isinstance(self.id, int):
            raise FormatError('id should be of type int')

        if self.id < 0:
            raise ValueError('id is expected to be a non-negative number')

    def update_profile(self, new_id: int, new_nickname: str):
        self.id = new_id

        if not self.pro_user:
            print('Standard users can only change their id! '
                  'If you want to change your nickname become a pro user.')

            # Early return: validate() below is never reached on this path.
            return

        self.nickname = new_nickname
        # Be careful if you rely on _validate_id() to verify self.id!
        # This won't be called if the user is not a pro user.
        self.validate()
```

* *Personal opinion*: bad, because it's not a clean solution from an OOP
  perspective for classes to inherit from `ValidationMixin` without having an
  "is a" relationship with it.

* Consideration: if we use this option, we are limited in what can be validated.
  With the `in-toto` implementation of the `ValidationMixin`, we can only validate
  class attributes inside class methods.
  If we want to validate functions outside classes or function arguments, we would
  have to enhance this solution.

* Consideration: if we use this option, we would be responsible for the code,
  and we would have to either resolve all identified issues related to
  `securesystemslib.schemas` ourselves or replace the schema implementation
  with something else.

* Consideration: if we want to enforce assignment validation, this solution
  should be combined with custom "setter" properties.

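The last consideration, enforcing validation on assignment, could look roughly like the sketch below. It uses only a standard Python `property` and leaves the mixin out for brevity. The `User` class and its rules are hypothetical and mirror the earlier example.

```python
class User:
    """Hypothetical class: every assignment to `id` is validated."""

    def __init__(self, id: int) -> None:
        # This assignment already goes through the setter below.
        self.id = id

    @property
    def id(self) -> int:
        return self._id

    @id.setter
    def id(self, value: int) -> None:
        # Validate before storing, so an invalid value never replaces
        # a previously valid one.
        if not isinstance(value, int):
            raise TypeError("id should be of type int")
        if value < 0:
            raise ValueError("id is expected to be a non-negative number")
        self._id = value
```

Because the check lives in the setter, no code path can assign `id` without validation, at the cost of writing one property per validated attribute.
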
### Option 2: Usage of a third-party library called "pydantic"

* Good, because it's flexible:
  1. There is a `@validate_arguments` decorator which lets us decide which
     functions to validate, including functions outside classes.
  2. There is a `@validator` decorator which allows validation that goes
     beyond type checking for our class attributes.
  3. We can use an embedded `Config` class inside our classes, which allows for
     even more customization (for example, enforcing assignment validation).

* Good, because (according to their documentation) `pydantic` is the fastest
  validation library compared to others (including our other third-party library
  option `marshmallow`).
  See: https://pydantic-docs.helpmanual.io/benchmarks/

* Good, because it uses the built-in types from Python 3.6 onwards.

* Bad, because this library **has not yet implemented** a `strict` mode and
  the default behaviour when validating a certain argument or field is to **try
  a cast to the expected type from the received type**.
  To get strict behaviour, we would have to add it manually through
  `validators` that are called before the cast.
  See: https://github.com/samuelcolvin/pydantic/issues/1098

* Bad, because there is a learning curve when using `pydantic`.
  1. For example, when I had to handle the `_type` attribute in `Signed`, it took me
     a lot of reading to understand that standard attributes whose names begin with
     "_" are ignored. The `_type` attribute can only be a `PrivateAttr`
     (defined in `pydantic`) even though we don't handle it as a typical private
     attribute.
  2. Also, I had difficulties using `pydantic` when there is inheritance.
     The initialization and validation of new objects was tricky.

* Bad, because it adds 2 new dependencies: `pydantic` and `typing-extensions`.
  This was concluded by performing the following steps:
  1. Creating a fresh virtual environment with Python 3.8.
  2. Installing all dependencies in `requirements-dev.txt` from `tuf`.
  3. Installing `pydantic` with `pip install pydantic`.

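The difference between cast-first and strict validation mentioned in the list above can be illustrated without `pydantic`. The following is a stdlib-only stand-in, not `pydantic` code or its API: `strict_validate_arguments` is a hypothetical decorator that checks arguments against plain-class annotations with `isinstance()` and never casts, while `lax_threshold` shows the cast-first behaviour for comparison.

```python
import functools
import inspect


def strict_validate_arguments(func):
    """Hypothetical decorator: strict isinstance() checks, no casting."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            # Skip parameters without an annotation.
            if expected is inspect.Parameter.empty:
                continue
            # This sketch only handles plain classes such as int or str.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be of type {expected.__name__}, "
                    f"got {type(value).__name__}")
        return func(*args, **kwargs)

    return wrapper


def lax_threshold(value):
    # Cast-first behaviour: the string "2" is silently converted to 2.
    return int(value)


@strict_validate_arguments
def strict_threshold(value: int) -> int:
    # Strict behaviour: the string "2" is rejected before the body runs.
    return value
```

Here `lax_threshold("2")` returns the integer `2`, whereas `strict_threshold("2")` raises `TypeError`. Only decorated functions are validated, which also matches the requirement of choosing which functions to validate.
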
## Links
* [in-toto ValidationMixin](https://github.com/in-toto/in-toto/blob/74da7a/in_toto/models/common.py#L27-L40)
* [ValidationMixin usage](https://github.com/in-toto/in-toto/blob/74da7a/in_toto/models/layout.py#L420-L438)
* [Pydantic documentation](https://pydantic-docs.helpmanual.io/)

## Decision Outcome

*TODO: Make and describe the decision*

docs/adr/index.md

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ This log lists the architectural decisions for tuf.
- [ADR-0004](0004-extent-of-OOP-in-metadata-model.md) - Add classes for complex metadata attributes
- [ADR-0005](0005-use-google-python-style-guide.md) - Use Google Python style guide with minimal refinements

+TODO: ADD TO INDEX
<!-- adrlogstop -->

For new ADRs, please use [template.md](template.md) as basis.
