This repository was archived by the owner on Aug 25, 2024. It is now read-only.

Commit e7cf793

util: testing: manifest: shim: Initial commit
Signed-off-by: John Andersen <[email protected]>
1 parent e91235f commit e7cf793

1 file changed

+301
-0
lines changed

1 file changed

+301
-0
lines changed

dffml/util/testing/manifest/shim.py

Lines changed: 301 additions & 0 deletions
@@ -0,0 +1,301 @@
#!/usr/bin/env python
"""
Manifest/TPS Report Shim
========================

Validate and parse a TPS Report (manifest). Execute something for the next stage
of parsing.

This file is used as a shim to bridge the gap between the parsing for the
TPS manifest format and the next action to be taken after parsing. This file
allows for registration of phase 2 parsers via environment variables.

The purpose of this script is to perform the initial validation and parsing of
the TPS manifest. Its responsibility is to then call the appropriate next phase
manifest parser. It will pass the manifest's data in a format the next phase
understands, and execute the next phase using capabilities defined within this
file.
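
As a rough illustration of that registration, the sketch below mirrors the
grouping logic of ``environ_discover_dataclass`` later in this file: each
variable is named ``<prefix><FIELD>_<PARSER NAME>``. The parser name
``example``, the helper ``group_by_parser``, and all values are made up for
the illustration.

```python
# Sketch only: the parser name, helper name, and values are assumptions.
PREFIX = "TPS_MANIFEST_"


def group_by_parser(environ, prefix=PREFIX):
    # Collect <prefix><FIELD>_<PARSER NAME> variables into
    # {parser name: {field: value}}, ignoring unrelated variables
    grouped = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue
        field, parser_name = key[len(prefix):].lower().split("_", maxsplit=1)
        grouped.setdefault(parser_name, {})[field] = value
    return grouped


example_environ = {
    "TPS_MANIFEST_VERSION_EXAMPLE": "0.0.1",
    "TPS_MANIFEST_ACTION_EXAMPLE": "stdin",
    "TPS_MANIFEST_TARGET_EXAMPLE": "cat",
    "UNRELATED": "ignored",
}
registered = group_by_parser(example_environ)
```

Because the field is split off at the first underscore, this naming scheme
only works for field names which do not themselves contain an underscore.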

Updates
-------

This file has been vendored into multiple locations. Please be sure to track
progress as the format evolves upstream. Upstream URL:
https://github.com/intel/dffml/blob/manifest/dffml/util/testing/manifest/shim.py

Pull Request for discussion:

Contributing
------------

This section is documentation for contributing to the TPS Report (manifest)
shim.

We want this shim to be usable on a default format which we'll work to define as
a community upstream.

Design Goals
````````````

This shim MUST

- Work with arbitrary manifest formats

- Discover verification mechanisms

- Verify the manifest (think secure boot)

- Parse the manifest

- Discover phase 2 parsers

- Output the manifest in a format the phase 2 parser can understand

- Execute the phase 2 parser

Format
``````

We need to come up with a format that allows us to evolve it as we move
forward.

To make sure we have forwards / backwards compatibility we should
include information which allows us to identify what format the document
is in, and what version of that format it is. This will likely also feed
into our input dataflow requirements as we'll need to have the ability
to check an arbitrary input to see if we might have an applicable
converter.

Let's learn from JSON Schema and include a URL where we might be able
to find the schema for the document. We can double up on our previous
needs by asking that the filename of the URL help us identify our
document format (we'll provide a fallback for when we don't have control
over the filename via the ``$document_format`` and ``$document_version``
keys). We'll parse the URL for the filename component. When we parse it
we'll split on ``.``. If the first part is eff (Extensible Format
Format) we'll treat the rest up until the semantic version as the format
name. Then the semantic version is the version of the format. Then the
rest should be the extension which is associated with the format which
we can use to validate the contents of the document, such as JSON
schema.

``$schema: "https://example.com/eff.my.document.format.0.0.0.schema.json"``
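
The filename rule above might be sketched as follows. This is only one
possible reading: the helper name ``parse_schema_url`` is hypothetical, and we
assume the semantic version is a plain ``MAJOR.MINOR.PATCH`` of digits with no
pre-release or build suffix.

```python
import posixpath
from urllib.parse import urlparse


def parse_schema_url(url):
    # Hypothetical helper: returns (format_name, version, extension), or
    # None when the filename does not follow the eff naming convention
    filename = posixpath.basename(urlparse(url).path)
    parts = filename.split(".")
    if parts[0] != "eff":
        return None
    parts = parts[1:]
    # The first run of three numeric parts is taken as the semantic version
    for i in range(len(parts) - 2):
        if all(part.isdigit() for part in parts[i : i + 3]):
            return (
                ".".join(parts[:i]),
                ".".join(parts[i : i + 3]),
                ".".join(parts[i + 3 :]),
            )
    return None


result = parse_schema_url(
    "https://example.com/eff.my.document.format.0.0.0.schema.json"
)
```

For the example URL this yields the format name ``my.document.format``, the
version ``0.0.0``, and the extension ``schema.json``.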

TODO
----

- Verification of the manifest. Idea: Developer generates manifest.
  Signs manifest with private asymmetric key. Prepends base64 encoded
  signature as a valid key, ``$signature``. This means you have to
  parse the YAML before you have verified the signature, which is not
  ideal. However, it's one method available to us and a simple parse
  without the use of a full YAML parser could be done. Or we could
  distribute out of band and verify the document before the conversion
  stage, in the loading stage.

- Verification of references within manifest. Do we support the public
  portion of a CA key embedded in various places in the document? We
  could then use it for things like verification of git repos where
  the CA must sign all developer keys which are in the repo history.
  This will apply to anything that is an external reference in the
  document. There should be a way for the document to include an HMAC or
  something like that or something more dynamic like a CA.
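
The out of band option in the first item could look roughly like the sketch
below. Note that it swaps the asymmetric signature for an HMAC over the raw,
unparsed manifest bytes, which assumes a secret shared between the developer
and the shim. The key, the helper names, and the choice of SHA-256 are all
assumptions for illustration.

```python
import hashlib
import hmac


def manifest_hmac(key: bytes, manifest: bytes) -> str:
    # Hex encoded HMAC-SHA256 of the raw manifest bytes (no parsing needed)
    return hmac.new(key, manifest, hashlib.sha256).hexdigest()


def verify_manifest(key: bytes, manifest: bytes, expected: str) -> bool:
    # Constant time comparison against the digest delivered out of band
    return hmac.compare_digest(manifest_hmac(key, manifest), expected)


key = b"example-shared-secret"  # assumption: provisioned out of band
manifest = b"$document_format: tps.manifest"
digest = manifest_hmac(key, manifest)
```

Since the check runs on bytes, it can happen in the loading stage, before any
JSON or YAML parser touches the document.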

Notes
-----

- https://github.com/mjg59/ssh_pki

  - Should we use this? No. Are we going to? Yes.
"""
import os
import sys
import json
import pathlib
import argparse
import importlib
import contextlib
import dataclasses
from typing import Dict

# PyYAML is optional; without it only JSON manifests can be parsed
with contextlib.suppress((ImportError, ModuleNotFoundError)):
    import yaml


def parse(contents: str):
    r'''
    Given the contents of the manifest file as a string, parse the contents into
    a dictionary object.

    :param str contents: string containing the manifest file's contents
    :return: a dictionary representing the manifest
    :rtype: dict

    >>> import textwrap
    >>> from dffml.util.testing.manifest.shim import parse
    >>>
    >>> parse(
    ...     textwrap.dedent(
    ...         """\
    ...         $document_format: tps.manifest
    ...         $document_version: 0.0.1
    ...         testplan:
    ...         - git:
    ...             repo: https://example.com/my-repo.git
    ...             branch: main
    ...             file: my_test.py
    ...         """
    ...     )
    ... )
    {'$document_format': 'tps.manifest', '$document_version': '0.0.1', 'testplan': [{'git': {'repo': 'https://example.com/my-repo.git', 'branch': 'main', 'file': 'my_test.py'}}]}
    '''
    try:
        return json.loads(contents)
    except Exception as json_parse_error:
        # JSON parsing failed; fall back to YAML if PyYAML was importable
        if "yaml" not in sys.modules[__name__].__dict__:
            raise
        # The fallback stays inside this except block because
        # json_parse_error is unbound once the block exits
        try:
            return yaml.safe_load(contents)
        except Exception as yaml_parse_error:
            raise yaml_parse_error from json_parse_error


from pprint import pprint


def parse_my_document_format_0_0_0_dataflow(doc):
    pass


# Known parser mapping, keyed by (format, version, output mode)
KNOWN_PARSERS = {
    (
        "tps.manifest",
        "0.0.0",
        "dataflow",
    ): parse_my_document_format_0_0_0_dataflow
}


def call_parser(doc, output_mode: str = "dataflow"):
    # The manifest does not define a key for the output mode yet, so it is
    # passed in by the caller rather than read from the document
    # Grab mapped parser
    document_format_version_output_mode = (
        doc.get("$document_format", None),
        doc.get("$document_version", None),
        output_mode,
    )
    parser = KNOWN_PARSERS.get(document_format_version_output_mode, None)

    if parser is None:
        raise Exception(
            "Unknown document format/version/output mode: "
            f"{document_format_version_output_mode}"
        )

    print()
    pprint(doc)
    print()
    parser(doc)


@dataclasses.dataclass
class ManifestFormatParser:
    """
    Read in configuration to determine what the next phase of parsing is.

    args holds arguments passed to target.
    """

    format_name: str
    version: str
    output: str
    action: str
    target: str
    args: str = ""


ENV_PREFIX = "TPS_MANIFEST_"


def environ_discover_dataclass(
    dataclass,
    environ: Dict[str, str] = None,
    *,
    prefix: str = ENV_PREFIX,
    dataclass_key: str = None,
):
    r"""
    Build instances of ``dataclass`` from environment variables named
    ``<prefix><FIELD>_<NAME>``. Variables are grouped by the lowercased
    ``<NAME>``. If ``dataclass_key`` is given, that field is set to the name.

    >>> import dataclasses
    >>> from dffml.util.testing.manifest.shim import environ_discover_dataclass
    >>>
    >>> @dataclasses.dataclass
    ... class MyDataclass:
    ...     name: str
    ...     version: str
    >>>
    >>> environ_discover_dataclass(
    ...     MyDataclass,
    ...     {
    ...         "MYPREFIX_NAME_EXAMPLE_FORMAT": "Example Format",
    ...         "MYPREFIX_VERSION_EXAMPLE_FORMAT": "0.0.1",
    ...     },
    ...     prefix="MYPREFIX_",
    ... )
    {'example_format': MyDataclass(name='Example Format', version='0.0.1')}
    >>>
    >>> environ_discover_dataclass(
    ...     MyDataclass,
    ...     {
    ...         "MYPREFIX_VERSION_EXAMPLE_FORMAT": "0.0.1",
    ...     },
    ...     prefix="MYPREFIX_",
    ...     dataclass_key="name",
    ... )
    {'example_format': MyDataclass(name='example_format', version='0.0.1')}
    """
    if environ is None:
        environ = os.environ
    discovered_parsers = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue
        # Split "<field>_<name>" at the first underscore only
        metadata_key, parser_name = (
            key[len(prefix) :].lower().split("_", maxsplit=1)
        )
        discovered_parsers.setdefault(parser_name, {})
        discovered_parsers[parser_name][metadata_key] = value
    # Ensure they are loaded into the correct class
    for key, value in discovered_parsers.items():
        if dataclass_key is not None:
            value[dataclass_key] = key
        discovered_parsers[key] = dataclass(**value)
    return discovered_parsers


def shim(manifest: str, lockdown: bool, strict: bool):
    parsers = environ_discover_dataclass(
        ManifestFormatParser, dataclass_key="format_name", environ=os.environ
    )
    print(parsers)


def make_parser():
    parser = argparse.ArgumentParser(
        prog="shim.py",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        description=__doc__,
    )

    # Boolean flags use store_true, which does not accept a type argument
    parser.add_argument(
        "-l", "--lockdown", action="store_true", default=False,
    )
    parser.add_argument(
        "-s", "--strict", action="store_true", default=False,
    )
    parser.add_argument(
        "-i", "--input", type=argparse.FileType("r"), default=sys.stdin
    )
    parser.add_argument(
        "-o", "--output", type=argparse.FileType("w"), default=sys.stdout
    )
    return parser


def main():
    parser = make_parser()
    args = parser.parse_args()
    # Hand the manifest contents and flags to the shim
    shim(args.input.read(), lockdown=args.lockdown, strict=args.strict)
