diff --git a/hypothetical_signature_attack.ipynb b/hypothetical_signature_attack.ipynb new file mode 100644 index 0000000..72cf67c --- /dev/null +++ b/hypothetical_signature_attack.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Copy of ITE-5: Hypothetical In-Toto Signature Attack.ipynb","provenance":[{"file_id":"https://github.com/MarkLodato/ITE/blob/ite-5/ITE/5/hypothetical_signature_attack.ipynb","timestamp":1601319956961}],"collapsed_sections":["yOiiQrZZSdlg"],"authorship_tag":"ABX9TyN4m90Onm73qJCQQe416IXO"},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"ll0X3N1LtM_p"},"source":["##### Copyright 2020 Google LLC\n","\n","Licensed under the Apache License, Version 2.0 (the \"License\");\n","you may not use this file except in compliance with the License.\n","You may obtain a copy of the License at\n","\n","https://www.apache.org/licenses/LICENSE-2.0\n","\n","Unless required by applicable law or agreed to in writing, software\n","distributed under the License is distributed on an \"AS IS\" BASIS,\n","WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","See the License for the specific language governing permissions and\n","limitations under the License."]},{"cell_type":"markdown","metadata":{"id":"XUErxuD6w0_W"},"source":["## Abstract\n","\n","This proof-of-concept attack shows the need for any signature scheme to have an authenticated \"context\" field indicating how to interpret the payload.\n","\n","_Author: Mark Lodato, Google, _ \n","_Date: September 2020_\n","\n","(To edit, [open this doc in Colab](https://colab.research.google.com/github/MarkLodato/ITE/blob/ite-5/ITE/5/hypothetical_signature_attack.ipynb).)"]},{"cell_type":"markdown","metadata":{"id":"_17nT3k6p19J"},"source":["## Overview\n","\n","In any cryptographic signature wrapper, the payload must be unambiguously interpreted, such that the signer and verifier are guaranteed to 
interpret the payload identically.\n","\n","Currently, in-toto and TUF achieve this by requiring that the payload be JSON and that the JSON have a `_type` key that indicates how it is used. Thus, there is only one way for the verifier to interpret the bitstream that the signer signed.\n","\n","However, there are ongoing discussions about (1) generalizing the signature wrapper so that it is no longer in-toto/TUF-specific, and (2) supporting in-toto payloads other than JSON. If either of these happens, then it will no longer be feasible to require the payload to be JSON. Instead, the signature wrapper **must** include some authenticated \"context\" indicator that describes how to interpret the payload.\n","\n","If the signature scheme does *not* include an authenticated context indicator, then an attacker can take a legitimate signed message of type X and get the victim to verify and interpret it as type Y.\n","\n","What follows is a worked example showing how this can happen in a realistic scenario.\n","\n","\n","\n","\n"]},{"cell_type":"markdown","metadata":{"id":"jKjOIpaez5J-"},"source":["## Scenario"]},{"cell_type":"markdown","metadata":{"id":"X4CrxasPz5_R"},"source":["This proof-of-concept assumes the following.\n","\n","(1) In-toto has been extended to support three different encodings of the link format: JSON, [CBOR](https://en.wikipedia.org/wiki/CBOR), and [Protobuf](https://github.com/grafeas/grafeas/blob/63aff549c1813170558b49e40f41147fd31ad1e3/proto/v1beta1/intoto.proto). In this scenario, the cryptographic wrapper has three fields:\n","\n","* `payload`: The serialized JSON, CBOR, or Protobuf byte stream.\n","* `payloadType`: How to interpret `payload`. One of \"JSON\", \"CBOR\", or \"Protobuf\".\n","* `signatures`: Cryptographic signatures over `payload` but **not** `payloadType`. 
**This is the problem.**\n","\n","Note: In this demo, the wrapper is always JSON, both `payload` and `signatures.sig` are encoded in base64, and the signature is over the raw bytes prior to base64 encoding. However, this is immaterial to the attack.\n","\n","(2) There exists a trusted CI/CD service that allows callers to perform arbitrary build requests and returns a signed in-toto link file. This mirrors how systems such as GitHub Actions or [Debian rebuilders](https://wiki.debian.org/ReproducibleBuilds) work. In our scenario, the build interface takes two user-defined parameters:\n","\n","* `command`: The shell command to run.\n","* `encoding`: The `payloadType` to return.\n","\n","**Problem:** An attacker can trick the CI/CD system into signing arbitrary messages. \n","\n","Suppose the following is a **legitimate** link file:\n","\n","```json\n","{\n"," \"command\": \"echo 'hello world'\",\n"," \"products\": { \"stdout\": { \"sha256\": \"a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447\" } },\n"," \"materials\": {},\n"," \"_type\": \"link\"\n","}\n","```\n","\n","An attacker can instead get the CI/CD system to **falsely** sign:\n","\n","```json\n","{\n"," \"command\": \"echo 'hello world'\",\n"," \"products\": { \"stdout\": { \"sha256\": \"badbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadb\" } },\n"," \"materials\": {},\n"," \"_type\": \"link\"\n","}\n","```\n","\n","The attacker can then use this to get a malicious file with sha256 hash \"badbad...\" accepted by an in-toto verifier."]},{"cell_type":"markdown","metadata":{"id":"CIF11AmzXJuQ"},"source":["## Outline of attack\n","\n","1. Construct a target payload T in protobuf format that we want the victim to consume.\n","2. Send a carefully crafted build request that results in the CI/CD system returning a signed CBOR-type link file, such that the payload is interpreted as some payload P when the type is CBOR but as T when the type is Protobuf.\n","3. 
Modify the `payloadType` field to say `Protobuf` instead of `CBOR`. This does not invalidate the signature because the `payloadType` is unauthenticated.\n","4. Send the modified link file to the victim. They will interpret the payload as T, even though the CI/CD system intended it to be interpreted as P."]},{"cell_type":"markdown","metadata":{"id":"qwTiWE8ARQVy"},"source":["## Mock implementations\n","\n","This demo uses the following mock implementations."]},{"cell_type":"markdown","metadata":{"id":"mFLCMDzrBNU5"},"source":["### Dependencies"]},{"cell_type":"code","metadata":{"id":"u-L_AAadzUWn"},"source":["!curl -o intoto.proto -sS https://raw.githubusercontent.com/grafeas/grafeas/63aff549c1813170558b49e40f41147fd31ad1e3/proto/v1beta1/intoto.proto\n","!protoc intoto.proto --python_out=."],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"eG7IQqXSri-U"},"source":["!pip install cbor pycryptodome"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"r1QZavg7KHIa"},"source":["### Crypto implementation\n"]},{"cell_type":"code","metadata":{"id":"9us8oiwyDIrZ"},"source":["from Crypto.Hash import SHA256\n","from Crypto.PublicKey import ECC\n","from Crypto.Signature import DSS\n","\n","secret_key = ECC.generate(curve='P-256')\n","public_key = secret_key.public_key()"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"jQQNYC5OJmXa"},"source":["def _PubkeySign(message: bytes) -> bytes:\n"," \"\"\"Returns the signature of `message`.\"\"\"\n"," h = SHA256.new(message)\n"," return DSS.new(secret_key, 'fips-186-3').sign(h)\n","\n","def _PubkeyVerify(message: bytes, signature: bytes) -> bool:\n"," \"\"\"Returns true if `message` was signed by `signature`.\"\"\"\n"," h = SHA256.new(message)\n"," try:\n"," DSS.new(public_key, 'fips-186-3').verify(h, signature)\n"," return True\n"," except ValueError:\n"," return 
False"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"ONTsGjB222si"},"source":["Tests to make sure it works correctly:"]},{"cell_type":"code","metadata":{"id":"A8Ip5uBNKLVe"},"source":["signature = _PubkeySign(b'good')\n","assert _PubkeyVerify(b'good', signature)\n","assert not _PubkeyVerify(b'bad', signature)\n"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"XPkPlVXaNIrN"},"source":["### CI/CD implementation"]},{"cell_type":"code","metadata":{"id":"pBAIZhaNosy0"},"source":["import base64, cbor, hashlib, json, subprocess, tempfile\n","\n","def Build(command, encoding):\n"," \"\"\"Runs `command` and returns a link file of the given `encoding`.\n"," \n"," WARNING: This isn't actually safe to do in a real CI/CD system. We're doing it\n"," here because it's just a demo where we trust the command.\n"," \"\"\"\n"," with tempfile.TemporaryDirectory() as directory:\n"," result = subprocess.run(command, shell=True, cwd=directory, check=True,\n"," stdout=subprocess.PIPE)\n"," link = {\n"," \"command\": command,\n"," \"materials\": {},\n"," \"products\": {\n"," 'stdout' : {\n"," 'sha256' : hashlib.sha256(result.stdout).hexdigest()\n"," }\n"," },\n"," \"byproducts\": {},\n"," \"_type\": \"link\",\n"," }\n"," if encoding == 'CBOR':\n"," payload = cbor.dumps(link)\n"," else:\n"," raise NotImplementedError('Encoding \"%s\" not implemented in this demo' % encoding)\n"," signature = _PubkeySign(payload)\n"," wrapper = {\n"," \"payload\": base64.b64encode(payload).decode('utf-8'),\n"," \"payloadType\": encoding,\n"," \"signatures\": [{\"sig\": base64.b64encode(signature).decode('utf-8')}],\n"," }\n"," return json.dumps(wrapper)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"LoShcMq3Ptkb"},"source":["Examples showing the wrapper and 
payload:"]},{"cell_type":"code","metadata":{"id":"rL_tZcvWVIkM","executionInfo":{"status":"ok","timestamp":1601045411246,"user_tz":240,"elapsed":11073,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"bb2bc132-2eca-47a5-bfae-924f4d6ba9bb","colab":{"base_uri":"https://localhost:8080/"}},"source":["link = Build('echo \"hello world\"', 'CBOR')\n","json.loads(link)"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["{'payload': 'pWdjb21tYW5kcmVjaG8gImhlbGxvIHdvcmxkImltYXRlcmlhbHOgaHByb2R1Y3RzoWZzdGRvdXShZnNoYTI1NnhAYTk0ODkwNGYyZjBmNDc5YjhmODE5NzY5NGIzMDE4NGIwZDJlZDFjMWNkMmExZWMwZmI4NWQyOTlhMTkyYTQ0N2pieXByb2R1Y3RzoGVfdHlwZWRsaW5r',\n"," 'payloadType': 'CBOR',\n"," 'signatures': [{'sig': 'x5Ni6nWD6gaBHZSnN9tZHOGm3smSJY2ZAberyHHGa9WQepXOOb3UdqtJSuxyr7XgtZVZe/pCqk3xqxnhnIE8UQ=='}]}"]},"metadata":{"tags":[]},"execution_count":7}]},{"cell_type":"code","metadata":{"id":"mLR5sBpNrQsg","executionInfo":{"status":"ok","timestamp":1601045411248,"user_tz":240,"elapsed":11069,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"9b49510a-bb5d-4a61-cc5a-02f04f689cac","colab":{"base_uri":"https://localhost:8080/"}},"source":["cbor.loads(base64.b64decode(json.loads(link)['payload']))"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["{'_type': 'link',\n"," 'byproducts': {},\n"," 'command': 'echo \"hello world\"',\n"," 'materials': {},\n"," 'products': {'stdout': {'sha256': 'a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447'}}}"]},"metadata":{"tags":[]},"execution_count":8}]},{"cell_type":"code","metadata":{"id":"_FhQfZsqPZLd","executionInfo":{"status":"ok","timestamp":1601045411250,"user_tz":240,"elapsed":11064,"user":{"displayName":"Mark 
Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"d5048cbc-f707-440c-d094-4a55921b95f2","colab":{"base_uri":"https://localhost:8080/"}},"source":["link = Build('echo \"something else\"', 'CBOR')\n","cbor.loads(base64.b64decode(json.loads(link)['payload']))"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["{'_type': 'link',\n"," 'byproducts': {},\n"," 'command': 'echo \"something else\"',\n"," 'materials': {},\n"," 'products': {'stdout': {'sha256': 'a1621be95040239ee14362c16e20510ddc20f527d772d823b2a1679b33f5cd74'}}}"]},"metadata":{"tags":[]},"execution_count":9}]},{"cell_type":"markdown","metadata":{"id":"IDcWni4RsHua"},"source":["### Verifier implementation"]},{"cell_type":"markdown","metadata":{"id":"CxjvWqjcsNkZ"},"source":["Instead of writing an actual layout, we simply have the verifier print out the payload. It is sufficient to demonstrate the attack if one signed payload can be interpreted in two different ways."]},{"cell_type":"code","metadata":{"id":"X4TdPm33sdOx"},"source":["import base64, cbor, json, intoto_pb2, pprint\n","\n","def VerifyAndPrint(link_serialized):\n"," \"\"\"Verifies the signature and then prints the payload.\n","\n"," NOTE: The schema differs slightly between JSON/CBOR and Proto formats.\n"," This function does not convert between them.\n"," \"\"\"\n"," wrapper = json.loads(link_serialized)\n"," payload_bytes = base64.b64decode(wrapper['payload'])\n"," signature = base64.b64decode(wrapper['signatures'][0]['sig'])\n"," if not _PubkeyVerify(payload_bytes, signature):\n"," print(\"Bad signature\")\n"," else:\n"," print(\"Good signature\")\n"," link = DECODERS[wrapper['payloadType']](payload_bytes)\n"," pprint.pprint(link)\n","\n","DECODERS = {\n"," 'JSON': json.loads,\n"," 'CBOR': cbor.loads,\n"," 'Protobuf': 
intoto_pb2.Link.FromString,\n","}"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"QjuytLUWN4Sk","executionInfo":{"status":"ok","timestamp":1601045411255,"user_tz":240,"elapsed":11060,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"a1b2b2c3-181e-48f1-c31e-0d9afb3d8726","colab":{"base_uri":"https://localhost:8080/"}},"source":["VerifyAndPrint(Build('echo \"hello world\"', 'CBOR'))"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Good signature\n","{'_type': 'link',\n"," 'byproducts': {},\n"," 'command': 'echo \"hello world\"',\n"," 'materials': {},\n"," 'products': {'stdout': {'sha256': 'a948904f2f0f479b8f8197694b30184b0d2ed1c1cd2a1ec0fb85d299a192a447'}}}\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"xhL224vb7SCJ","executionInfo":{"status":"ok","timestamp":1601045411259,"user_tz":240,"elapsed":11057,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"03858e0d-d11c-45d6-fdf0-23dbbfc3573b","colab":{"base_uri":"https://localhost:8080/"}},"source":["VerifyAndPrint(Build('echo \"goodbye world\"', 'CBOR'))"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Good signature\n","{'_type': 'link',\n"," 'byproducts': {},\n"," 'command': 'echo \"goodbye world\"',\n"," 'materials': {},\n"," 'products': {'stdout': {'sha256': '8ef67e7cf7addbb1946c13778f51f8bfa3ee261b1016f6828796dd9fca632fc4'}}}\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"9Uq0h9-y7aV4","executionInfo":{"status":"ok","timestamp":1601045411263,"user_tz":240,"elapsed":11054,"user":{"displayName":"Mark 
Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"acd2a893-406d-417e-e018-da8943714914","colab":{"base_uri":"https://localhost:8080/"}},"source":["orig = Build('echo \"hello world\"', 'CBOR')\n","link = json.loads(orig)\n","link['payload'] = 'x' + link['payload'][1:]\n","VerifyAndPrint(json.dumps(link))"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Bad signature\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"FvxotJilw9RA"},"source":["## Step 1: Construct target payload\n","\n","First, we construct our target payload in protobuf format. This is what we want the victim to accept."]},{"cell_type":"code","metadata":{"id":"H68z0n7TPD35","executionInfo":{"status":"ok","timestamp":1601045411265,"user_tz":240,"elapsed":11049,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"55d1992d-f8fd-42b9-b1a5-7248604c505a","colab":{"base_uri":"https://localhost:8080/"}},"source":["%%writefile payload.textproto\n","effective_command: 'echo \"hello world\"'\n","products {\n"," resource_uri: \"stdout\"\n"," hashes {\n"," sha256: \"badbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadb\"\n"," }\n","}"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Writing payload.textproto\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"sHlqSDxYzszi","executionInfo":{"status":"ok","timestamp":1601045411457,"user_tz":240,"elapsed":11234,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"506c9941-6721-4e36-86e9-fff715a06e61","colab":{"base_uri":"https://localhost:8080/"}},"source":["import intoto_pb2\n","from google.protobuf import text_format\n","with 
open('payload.textproto') as f:\n"," target_payload = text_format.Parse(f.read(), intoto_pb2.Link()).SerializeToString()\n","target_payload"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["b'\\n\\x12echo \"hello world\"\\x1aL\\n\\x06stdout\\x12B\\n@badbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadb'"]},"metadata":{"tags":[]},"execution_count":15}]},{"cell_type":"markdown","metadata":{"id":"yqfJa9-0RUak"},"source":["## Step 2: Construct build request\n","\n","Next, we need to craft a build command that results in the overall CBOR file being interpreted by the victim as our payload protobuf."]},{"cell_type":"markdown","metadata":{"id":"yOiiQrZZSdlg"},"source":["### Proto Parser tool\n","\n","The following tool will be useful for visualizing protobufs."]},{"cell_type":"code","metadata":{"id":"k7-BAdOPSUCt","executionInfo":{"status":"ok","timestamp":1601045411459,"user_tz":240,"elapsed":11229,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"12ff5d9b-f63e-4d32-dae8-317b96c2a594","colab":{"base_uri":"https://localhost:8080/"}},"source":["%%writefile proto_parser.py\n","#!/usr/bin/python3\n","\"\"\"Parses a raw proto wire-format file and shows how each byte is interpreted.\n","\n","USAGE: ./proto_parser.py [OPTIONS] FILE\n","\n","Limitations (a.k.a. 
TODOs):\n","- Does not parse nested message types.\n","\"\"\"\n","\n","import io\n","import math\n","import shutil\n","import struct\n","\n","from typing import IO, Optional\n","\n","\n","class ProtobufValue:\n","\n"," def __init__(self, buf: bytes):\n"," self.buffer = buf\n","\n"," def format_buffer(self) -> str:\n"," a = []\n"," for value in bytearray(self.buffer):\n"," if value < 0x20 or value == 0x7f:\n"," # Unicode control code pictures\n"," # http://www.unicode.org/charts/nameslist/n_2400.html\n"," char = chr(0x2400 + value)\n"," elif value == 0x20:\n"," char = '\\u2423' # open box for space\n"," elif value >= 0x80:\n"," char = '\\u2426' # reverse question mark\n"," else:\n"," char = chr(value)\n"," a.append(char)\n"," return ''.join(a)\n","\n"," def type_name(self):\n"," return self.TYPE_NAME\n","\n","\n","class Varint(ProtobufValue):\n"," TYPE_NAME = 'varint'\n","\n"," def __init__(self, buf: bytes, value: int):\n"," super().__init__(buf)\n"," self.value = value\n","\n"," def format_value(self) -> str:\n"," # TODO: also print signed int, sint (zigzag), hex\n"," return '%d' % self.value\n","\n"," @classmethod\n"," def read(cls, f: IO[bytes], allow_missing=False) -> 'Optional[Varint]':\n"," buf = bytearray()\n"," value = 0\n"," while True:\n"," if len(buf) > 10:\n"," raise ValueError('varint exceeded maximum size')\n"," byte = f.read(1)\n"," if not byte:\n"," if allow_missing and not buf:\n"," return None\n"," raise ValueError('end of input while reading varint')\n"," buf.extend(byte)\n"," b = buf[-1]\n"," value |= (b & 0x7f) << (7 * (len(buf) - 1))\n"," if not (b & 0x80):\n"," break\n"," return cls(bytes(buf), value)\n","\n","\n","class FixedBase(ProtobufValue):\n"," # Subclasses must define: TYPE_NAME, byte_size, struct_int_code\n","\n"," def format_value(self) -> str:\n"," # TODO: also print signed int, hex, double\n"," return str(struct.unpack(self.struct_int_code, self.buffer)[0])\n","\n"," @classmethod\n"," def read(cls, f: IO[bytes]) -> 
'FixedBase':\n"," b = f.read(cls.size)\n"," if len(b) != cls.size:\n"," raise ValueError('end of input while reading %s' % cls.TYPE_NAME)\n"," return cls(b)\n","\n","\n","class Fixed64(FixedBase):\n"," TYPE_NAME = 'fixed64'\n"," size = 8\n"," struct_int_code = '<Q' # little-endian unsigned 64-bit, per the protobuf wire format\n","\n","\n","class Fixed32(FixedBase):\n"," TYPE_NAME = 'fixed32'\n"," size = 4\n"," struct_int_code = '<I' # little-endian unsigned 32-bit, per the protobuf wire format\n","\n","\n","class LengthDelimited(ProtobufValue):\n"," TYPE_NAME = 'length-delim'\n","\n"," def __init__(self, buf: bytes, value: bytes):\n"," super().__init__(buf)\n"," self.value = value\n","\n"," def format_value(self) -> str:\n"," # TODO: truncate\n"," if len(self.value) < 23:\n"," s = self.value.decode('unicode-escape')\n"," else:\n"," s = '%s...%s' % (self.value[:20].decode('unicode-escape'),\n"," self.value[-20:].decode('unicode-escape'))\n"," return 'length=%d value=%s' % (len(self.value), s)\n","\n"," def type_name(self):\n"," return 'length={}'.format(len(self.value))\n","\n"," @classmethod\n"," def read(cls, f: IO[bytes]) -> 'LengthDelimited':\n"," length = Varint.read(f)\n"," value = f.read(length.value)\n"," if len(value) != length.value:\n"," raise ValueError('expected %d bytes for length-delimited field; got %d' %\n"," (length.value, len(value)))\n"," return cls(length.buffer + value, value)\n","\n","\n","class StartGroup(ProtobufValue):\n"," TYPE_NAME = 'start-group'\n","\n"," def format_value(self) -> str:\n"," return ''\n","\n"," @classmethod\n"," def read(cls, f: IO[bytes]) -> 'StartGroup':\n"," return cls(b'')\n","\n","\n","class EndGroup(StartGroup):\n"," TYPE_NAME = 'end-group'\n","\n","\n","class Field:\n","\n"," def __init__(self, tag: Varint, start_pos: int, field_number: int,\n"," field_value: ProtobufValue):\n"," self.tag = tag\n"," self.start_pos = start_pos\n"," self.field_number = field_number\n"," self.field_value = field_value\n","\n"," def type_name(self) -> str:\n"," return self.field_value.type_name()\n","\n"," def format_buffer(self) -> str:\n"," return '{} 
{}'.format(self.tag.format_buffer(),\n"," self.field_value.format_buffer())\n","\n","\n","class BadField:\n","\n"," def __init__(self, tag: Varint, start_pos: int, field_number: int,\n"," error: str):\n"," self.tag = tag\n"," self.start_pos = start_pos\n"," self.field_number = field_number\n"," self.error = error\n","\n"," def type_name(self) -> str:\n"," return 'error'\n","\n"," def format_buffer(self) -> str:\n"," return '{} <{}>'.format(self.tag.format_buffer(), self.error)\n","\n","\n","TYPE_MAP = {\n"," 0: Varint,\n"," 1: Fixed64,\n"," 2: LengthDelimited,\n"," 3: StartGroup,\n"," 4: EndGroup,\n"," 5: Fixed32,\n","}\n","\n","\n","def decode(f: IO[bytes]):\n"," while True:\n"," start_pos = f.tell()\n"," try:\n"," tag = Varint.read(f, allow_missing=True)\n"," except ValueError as e:\n"," # TODO: would be nice to keep buffer of error\n"," tag = Varint(b'', 0) # dummy value\n"," yield BadField(tag, start_pos, -1, str(e))\n"," return\n"," if tag is None:\n"," return\n"," field_number, field_type = tag.value >> 3, tag.value & 7\n"," try:\n"," type_class = TYPE_MAP[field_type]\n"," except KeyError:\n"," yield BadField(tag, start_pos, field_number,\n"," 'invalid field type: %s' % field_type)\n"," return\n"," try:\n"," field_value = type_class.read(f)\n"," except ValueError as e:\n"," yield BadField(tag, start_pos, field_number, str(e))\n"," return\n"," yield Field(tag, start_pos, field_number, field_value)\n","\n","\n","def decode_and_print(data: bytes,\n"," *,\n"," width=None,\n"," limit=None,\n"," header=True) -> None:\n"," if width is None:\n"," width = 80\n"," if header:\n"," print('{:4} {:4} {:12} {}'.format('Pos.', 'Fld#', 'Type', 'Value'))\n"," for i, t in enumerate(decode(io.BytesIO(data))):\n"," line = '{pos:04X} {field_num:4d} {field_type:12} {buffer}'.format(\n"," pos=t.start_pos,\n"," field_num=t.field_number,\n"," field_type=t.type_name(),\n"," buffer=t.format_buffer(),\n"," )\n"," if width > 0 and len(line) > width:\n"," line = line[:width - 1] + 
'\\u2026' # ellipsis\n"," print(line)\n"," if limit and i + 1 >= limit:\n"," break\n","\n","\n","def main():\n"," import argparse\n"," description = globals()['__doc__'].split('\\n\\n', 1)[0]\n"," p = argparse.ArgumentParser(description=description)\n"," p.add_argument('file', help='file containing raw wire-format proto')\n"," p.add_argument(\n"," '--width',\n"," '-w',\n"," type=int,\n"," default=shutil.get_terminal_size((80, 20)).columns,\n"," help='width of output in columns; <= 0 means unlimited')\n"," p.add_argument(\n"," '--limit',\n"," '-l',\n"," type=int,\n"," help='limit the output to at most this many lines')\n"," args = p.parse_args()\n","\n"," with open(args.file, 'rb') as f:\n"," data = f.read()\n","\n"," decode_and_print(data, width=args.width, limit=args.limit)\n","\n","\n","if __name__ == '__main__':\n"," main()"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Writing proto_parser.py\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"eRrBSjv1Si2Y"},"source":["### Constructing the command"]},{"cell_type":"markdown","metadata":{"id":"07Yj-cZRWL8c"},"source":["First let's inspect the [CBOR](https://en.wikipedia.org/wiki/CBOR) payload with a dummy request. 
We'll pad it out to roughly the same length as our target payload because we know the length will affect the CBOR encoding."]},{"cell_type":"code","metadata":{"id":"XK2x1ouSSHdi","executionInfo":{"status":"ok","timestamp":1601045411460,"user_tz":240,"elapsed":11225,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"21743fed-46f0-464e-8682-aaaabd230a8f","colab":{"base_uri":"https://localhost:8080/"}},"source":["import json, base64, binascii\n","build_command = 'echo ' + 'x' * len(target_payload)\n","link = Build(build_command, 'CBOR')\n","payload = base64.b64decode(json.loads(link)['payload'])\n","print(binascii.hexlify(payload).decode('utf-8'))"],"execution_count":null,"outputs":[{"output_type":"stream","text":["a567636f6d6d616e6478676563686f207878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878787878696d6174657269616c73a06870726f6475637473a1667374646f7574a1667368613235367840376638343032636439343539303630626665643632373939636334613735396664353863643237333137313734343036643263393635336134616163643832656a627970726f6475637473a0655f74797065646c696e6b\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"sQcHcTR4hn61"},"source":["We can visualize this using :\n","\n","```\n","A5 # map(5)\n"," 67 # text(7)\n"," 636F6D6D616E64 # \"command\"\n"," 78 67 # text(103)\n"," 6563686F20787878... # \"echo xxx...\"\n","...\n","```\n","\n","The field that we have control over, `command`, starts at byte offset 11. 
Let's see how these first several bytes get interpreted as [protobuf](https://developers.google.com/protocol-buffers/docs/encoding) using our tool above:"]},{"cell_type":"code","metadata":{"id":"ALKtfsSFhSXi","executionInfo":{"status":"ok","timestamp":1601045411461,"user_tz":240,"elapsed":11218,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"9712010d-d995-4507-81a4-d1627012348d","colab":{"base_uri":"https://localhost:8080/"}},"source":["import proto_parser\n","proto_parser.decode_and_print(payload[:19])"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Pos. Fld# Type Value\n","0000 1652 fixed32 ␦g comm\n","0006 12 fixed64 a ndxgecho\n","000F 4 varint ␣ x\n","0011 15 varint x x\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"yVaPtJ07TDLO"},"source":["That means that the victim will interpret our message as having at least two fields, number 1652 and number 12. The real [intoto.proto](https://github.com/grafeas/grafeas/blob/63aff549c1813170558b49e40f41147fd31ad1e3/proto/v1beta1/intoto.proto) has no such fields, which causes the proto library to simply ignore those fields. Lucky for us!\n","\n","Furthermore, we get control over the parsed stream starting at the fifth byte of our command. Notice that the first four bytes (`echo`) are part of field 12 (fixed64), and the fifth byte (space) is then interpreted as the tag of a varint-type field number 4.\n","\n","That means we want to construct a valid shell command that does nothing but contains our protobuf wire-format payload starting at the fifth byte. 
We also need to shell-escape our payload so that the command does not fail.\n","\n","Here is such a command:\n","\n","```\n",": ''\n","```\n","\n","Let's try it:"]},{"cell_type":"code","metadata":{"id":"ut3lqnEePpFb","executionInfo":{"status":"ok","timestamp":1601045411463,"user_tz":240,"elapsed":11214,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"5f66ac27-faa1-4959-9d82-4750354051ce","colab":{"base_uri":"https://localhost:8080/"}},"source":["assert b\"'\" not in target_payload\n","build_command = b\": '\" + target_payload + b\"'\"\n","link = Build(build_command, 'CBOR')\n","payload = base64.b64decode(json.loads(link)['payload'])\n","proto_parser.decode_and_print(payload)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Pos. Fld# Type Value\n","0000 1652 fixed32 ␦g comm\n","0006 12 fixed64 a ndXg:␣␣'\n","000F 1 length=18 ␊ ␒echo␣\"hello␣world\"\n","0023 3 length=76 ␚ L␊␆stdout␒B␊@badbadbadbadbadbadbadbadbadbadbadbadbadba…\n","0071 4 error ' \n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"C6hc8NqhoduB"},"source":["Almost there! It correctly interprets fields 1 (`effective_command`) and 3 (`products`), but then it chokes on the `'` character ending our shell command.\n","\n","To fix this, we need to append a tag to our payload to tell the protobuf parser to consume the rest of the input as some dummy field, such as field number 15. 
The characters `z~` will do precisely that: `z` is field number 15 of type length-delimited, and `~` is length 126, which is the number of remaining bytes.\n","\n","Let's try it out:"]},{"cell_type":"code","metadata":{"id":"OfNQ0btAovqN","executionInfo":{"status":"ok","timestamp":1601045411464,"user_tz":240,"elapsed":11208,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"c8c10817-dd37-4600-b24f-408426468e35","colab":{"base_uri":"https://localhost:8080/"}},"source":["build_command = b\": '\" + target_payload + b\"z~'\"\n","link = Build(build_command, 'CBOR')\n","payload = base64.b64decode(json.loads(link)['payload'])\n","proto_parser.decode_and_print(payload)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Pos. Fld# Type Value\n","0000 1652 fixed32 ␦g comm\n","0006 12 fixed64 a ndXi:␣␣'\n","000F 1 length=18 ␊ ␒echo␣\"hello␣world\"\n","0023 3 length=76 ␚ L␊␆stdout␒B␊@badbadbadbadbadbadbadbadbadbadbadbadbadba…\n","0071 15 length=126 z ~'imaterials␦hproducts␦fstdout␦fsha256x@e3b0c44298fc1c…\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"BuxJHYvdpOIT"},"source":["Yay! We should be good to go. 
Let's verify by using the real proto parser."]},{"cell_type":"code","metadata":{"id":"eJ2viiK0ppTw","executionInfo":{"status":"ok","timestamp":1601045411465,"user_tz":240,"elapsed":11202,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"4de8f1d8-4663-4576-a439-f9915980b16e","colab":{"base_uri":"https://localhost:8080/"}},"source":["intoto_pb2.Link.FromString(payload)"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["effective_command: \"echo \\\"hello world\\\"\"\n","products {\n"," resource_uri: \"stdout\"\n"," hashes {\n"," sha256: \"badbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadb\"\n"," }\n","}"]},"metadata":{"tags":[]},"execution_count":21}]},{"cell_type":"markdown","metadata":{"id":"52OVcmYOp2SV"},"source":["We're good to go!"]},{"cell_type":"markdown","metadata":{"id":"sw25nmi8pWZr"},"source":["## Steps 3 and 4: Pull off the attack\n","\n","Now that we have constructed our malicious build command, we need to send it to the server and get the victim to consume it."]},{"cell_type":"markdown","metadata":{"id":"YRtmg0NQpZDs"},"source":["First, send the malicious build request to the server and get back a signed CBOR message."]},{"cell_type":"code","metadata":{"id":"MCWY_Bwepfke","executionInfo":{"status":"ok","timestamp":1601045411466,"user_tz":240,"elapsed":11196,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"a54daa72-935e-44e6-a2e7-31c3641611e9","colab":{"base_uri":"https://localhost:8080/"}},"source":["build_command = b\": '\" + target_payload + b\"z~'\"\n","link_original = Build(build_command, 'CBOR')\n","json.loads(link_original) # Print it out for display 
purposes"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["{'payload': 'pWdjb21tYW5kWGk6ICAnChJlY2hvICJoZWxsbyB3b3JsZCIaTAoGc3Rkb3V0EkIKQGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJ6fidpbWF0ZXJpYWxzoGhwcm9kdWN0c6Fmc3Rkb3V0oWZzaGEyNTZ4QGUzYjBjNDQyOThmYzFjMTQ5YWZiZjRjODk5NmZiOTI0MjdhZTQxZTQ2NDliOTM0Y2E0OTU5OTFiNzg1MmI4NTVqYnlwcm9kdWN0c6BlX3R5cGVkbGluaw==',\n"," 'payloadType': 'CBOR',\n"," 'signatures': [{'sig': 'LasJ/aXjyKdiVSNrA5uXTiH20D6Am7xa67nBEI9K6ZQYLBitn1NVhMpMEGY6QW7Qnyi6N/LKgZhLsAA5Mvur3Q=='}]}"]},"metadata":{"tags":[]},"execution_count":22}]},{"cell_type":"code","metadata":{"id":"NG7KVgMRvRVa","executionInfo":{"status":"ok","timestamp":1601045411467,"user_tz":240,"elapsed":11190,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"f80e3a37-3d1f-4e6a-ac85-d74e8a1378ce","colab":{"base_uri":"https://localhost:8080/"}},"source":["VerifyAndPrint(link_original)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Good signature\n","{'_type': 'link',\n"," 'byproducts': {},\n"," 'command': b': \\'\\n\\x12echo \"hello world\"\\x1aL\\n\\x06stdout\\x12B\\n@badbadbadb'\n"," b\"adbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbz~'\",\n"," 'materials': {},\n"," 'products': {'stdout': {'sha256': 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'}}}\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"id":"ct44_2tup9w2"},"source":["Next, change the `payloadType` to `Protobuf`."]},{"cell_type":"code","metadata":{"id":"4VE1vyDuqBJT","executionInfo":{"status":"ok","timestamp":1601045411468,"user_tz":240,"elapsed":11184,"user":{"displayName":"Mark 
Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"bd3cb5ea-3a6d-4c4d-90f8-1a3dea65388d","colab":{"base_uri":"https://localhost:8080/"}},"source":["link = json.loads(link_original)\n","link['payloadType'] = 'Protobuf'\n","link_modified = json.dumps(link)\n","json.loads(link_modified) # Print it out for display purposes"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["{'payload': 'pWdjb21tYW5kWGk6ICAnChJlY2hvICJoZWxsbyB3b3JsZCIaTAoGc3Rkb3V0EkIKQGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJhZGJ6fidpbWF0ZXJpYWxzoGhwcm9kdWN0c6Fmc3Rkb3V0oWZzaGEyNTZ4QGUzYjBjNDQyOThmYzFjMTQ5YWZiZjRjODk5NmZiOTI0MjdhZTQxZTQ2NDliOTM0Y2E0OTU5OTFiNzg1MmI4NTVqYnlwcm9kdWN0c6BlX3R5cGVkbGluaw==',\n"," 'payloadType': 'Protobuf',\n"," 'signatures': [{'sig': 'LasJ/aXjyKdiVSNrA5uXTiH20D6Am7xa67nBEI9K6ZQYLBitn1NVhMpMEGY6QW7Qnyi6N/LKgZhLsAA5Mvur3Q=='}]}"]},"metadata":{"tags":[]},"execution_count":24}]},{"cell_type":"markdown","metadata":{"id":"6ClrxXmyqv7l"},"source":["Finally, send it to the victim and profit!"]},{"cell_type":"code","metadata":{"id":"mvxELrI0qzHI","executionInfo":{"status":"ok","timestamp":1601045411469,"user_tz":240,"elapsed":11178,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"6622bca7-ee77-463b-d6a7-ce7173837a1c","colab":{"base_uri":"https://localhost:8080/"}},"source":["VerifyAndPrint(link_modified)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Good signature\n","effective_command: \"echo \\\"hello world\\\"\"\n","products {\n"," resource_uri: \"stdout\"\n"," hashes {\n"," sha256: \"badbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadbadb\"\n"," }\n","}\n","\n"],"name":"stdout"}]}]} \ No newline at end of file diff --git 
a/reference_implementation.ipynb b/reference_implementation.ipynb new file mode 100644 index 0000000..d2be362 --- /dev/null +++ b/reference_implementation.ipynb @@ -0,0 +1 @@ +{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"name":"Copy of ITE-5: reference implementation.ipynb","provenance":[{"file_id":"https://github.com/MarkLodato/ITE/blob/ite-5/ITE/5/reference_implementation.ipynb","timestamp":1601319975941},{"file_id":"1gfSF3mbwhJcP2Lv_ZdOrodmX3qAOI7u8","timestamp":1601064929341}],"collapsed_sections":["yOiiQrZZSdlg"]},"kernelspec":{"name":"python3","display_name":"Python 3"}},"cells":[{"cell_type":"markdown","metadata":{"id":"ll0X3N1LtM_p"},"source":["##### Copyright 2020 Google LLC\n","\n","Licensed under the Apache License, Version 2.0 (the \"License\");\n","you may not use this file except in compliance with the License.\n","You may obtain a copy of the License at\n","\n","https://www.apache.org/licenses/LICENSE-2.0\n","\n","Unless required by applicable law or agreed to in writing, software\n","distributed under the License is distributed on an \"AS IS\" BASIS,\n","WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n","See the License for the specific language governing permissions and\n","limitations under the License."]},{"cell_type":"markdown","metadata":{"id":"XUErxuD6w0_W"},"source":["## Abstract\n","\n","Reference implementation for [ITE-5: Signature scheme that avoids canonicalization](https://github.com/MarkLodato/ITE/tree/ite-5/ITE/5).\n","\n","_Author: Mark Lodato, Google, _ \n","_Date: September 2020_\n","\n","(To edit, [open this doc in Colab](https://colab.research.google.com/github/MarkLodato/ITE/blob/ite-5/ITE/5/reference_implementation.ipynb).)"]},{"cell_type":"markdown","metadata":{"id":"1TaW-T326i9J"},"source":["## Implementation"]},{"cell_type":"code","metadata":{"id":"wYmIALHq6VZU"},"source":["import base64, binascii, json, struct\n","\n","def b64enc(m: bytes) -> str:\n"," return 
base64.standard_b64encode(m).decode('utf-8')\n","\n","def b64dec(m: str) -> bytes:\n","  m = m.encode('utf-8')\n","  try:\n","    return base64.b64decode(m, validate=True)\n","  except binascii.Error:\n","    return base64.b64decode(m, altchars='-_', validate=True)\n","\n","def PAE(payloadType: str, payload: bytes) -> bytes:\n","  payloadTypeBytes = payloadType.encode('utf-8')\n","  return b''.join([struct.pack('<Q', 2),\n","                   struct.pack('<Q', len(payloadTypeBytes)), payloadTypeBytes,\n","                   struct.pack('<Q', len(payload)), payload])\n","\n","def Sign(payloadType: str, payload: bytes, signer) -> str:\n","  return json.dumps({\n","    'payload': b64enc(payload),\n","    'payloadType': payloadType,\n","    'signatures': [{\"sig\": b64enc(signer.sign(PAE(payloadType, payload)))}],\n","  })\n","\n","def Verify(json_signature: str, verifier) -> (str, bytes):\n","  wrapper = json.loads(json_signature)\n","  payloadType = wrapper['payloadType']\n","  payload = b64dec(wrapper['payload'])\n","  pae = PAE(payloadType, payload)\n","  for signature in wrapper['signatures']:\n","    if verifier.verify(pae, b64dec(signature['sig'])):\n","      break\n","  else:\n","    raise ValueError('No valid signature found')\n","  return payloadType, payload"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"gQLf_dxIIOuV"},"source":["## Dummy crypto implementation"]},{"cell_type":"code","metadata":{"id":"9AIpNrzY7zdN"},"source":["!pip install pycryptodome"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"0QRXrBLF70zN"},"source":["from Crypto.Hash import SHA256\n","from Crypto.PublicKey import ECC\n","from Crypto.Signature import DSS\n","\n","class Signer:\n","  def __init__(self, secret_key):\n","    self.secret_key = secret_key\n","    self.public_key = self.secret_key.public_key()\n","\n","  @classmethod\n","  def generate(cls):\n","    return cls(ECC.generate(curve='P-256'))\n","\n","  def sign(self, message: bytes) -> bytes:\n","    \"\"\"Returns the signature of `message`.\"\"\"\n","    h = SHA256.new(message)\n","    return DSS.new(self.secret_key, 'deterministic-rfc6979').sign(h)\n","\n","\n","class Verifier:\n","  def __init__(self, public_key):\n","    self.public_key = public_key\n","\n","  def verify(self, message: bytes, 
signature: bytes) -> bool:\n","    \"\"\"Returns True if `signature` is a valid signature over `message`.\"\"\"\n","    h = SHA256.new(message)\n","    try:\n","      DSS.new(self.public_key, 'fips-186-3').verify(h, signature)\n","      return True\n","    except ValueError:\n","      return False"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"OPQu5M9x-rPQ"},"source":["## Example"]},{"cell_type":"code","metadata":{"id":"0VIJU06pBxBT","executionInfo":{"status":"ok","timestamp":1601068987772,"user_tz":240,"elapsed":440,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"273f5554-d6fe-4312-a6b1-954f219b52ef","colab":{"base_uri":"https://localhost:8080/"}},"source":["signer = Signer.generate()\n","verifier = Verifier(signer.public_key)\n","print('Algorithm: ECDSA with deterministic-rfc6979 and SHA256')\n","print('Curve:', signer.secret_key.curve)\n","print('Public X:', signer.secret_key.pointQ.x)\n","print('Public Y:', signer.secret_key.pointQ.y)\n","print('Private d:', signer.secret_key.d)"],"execution_count":null,"outputs":[{"output_type":"stream","text":["Algorithm: ECDSA with deterministic-rfc6979 and SHA256\n","Curve: NIST P-256\n","Public X: 46950820868899156662930047687818585632848591499744589407958293238635476079160\n","Public Y: 5640078356564379163099075877009565129882514886557779369047442380624545832820\n","Private d: 97358161215184420915383655311931858321456579547487070936769975997791359926199\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"id":"BmxVhaE-C2zs","executionInfo":{"status":"ok","timestamp":1601068966016,"user_tz":240,"elapsed":387,"user":{"displayName":"Mark 
Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"3abd6986-ddfe-40f1-b928-22ad311b2c93","colab":{"base_uri":"https://localhost:8080/"}},"source":["signature_json = Sign('http://example.com/HelloWorld', b'hello world', signer)\n","json.loads(signature_json)"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["{'payload': 'aGVsbG8gd29ybGQ=',\n"," 'payloadType': 'http://example.com/HelloWorld',\n"," 'signatures': [{'sig': 'y7BK8Mm8Mr4gxk4+G9X3BD1iBc/vVVuJuV4ubmsEK4m/8MhQOOS26ejx+weIjyAx8VjYoZRPpoXSNjHEzdE7nQ=='}]}"]},"metadata":{"tags":[]},"execution_count":50}]},{"cell_type":"code","metadata":{"id":"VXqcUMh3IHoM","executionInfo":{"status":"ok","timestamp":1601068967258,"user_tz":240,"elapsed":409,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"6bb8e289-3b00-4cc8-c6dd-0a308cfc8098","colab":{"base_uri":"https://localhost:8080/"}},"source":["Verify(signature_json, verifier)"],"execution_count":null,"outputs":[{"output_type":"execute_result","data":{"text/plain":["('http://example.com/HelloWorld', b'hello world')"]},"metadata":{"tags":[]},"execution_count":51}]},{"cell_type":"code","metadata":{"id":"BjwUSztTKjNE","executionInfo":{"status":"ok","timestamp":1601069676465,"user_tz":240,"elapsed":387,"user":{"displayName":"Mark Lodato","photoUrl":"https://lh3.googleusercontent.com/a-/AOh14Gje3VCiWTKvjKn9WNmOK7z5cDhpwtiSncwf3Flh=s64","userId":"14555828759934874531"}},"outputId":"a3cddfb6-b8a9-4f19-ec66-09d79ce762bd","colab":{"base_uri":"https://localhost:8080/"}},"source":["import binascii, textwrap\n","def print_hex(b: bytes):\n"," octets = ' '.join(textwrap.wrap(binascii.hexlify(b).decode('utf-8'), 2))\n"," print(*textwrap.wrap(octets, 48), 
sep='\\n')\n","\n","print_hex(PAE('http://example.com/HelloWorld', b'hello world'))"],"execution_count":null,"outputs":[{"output_type":"stream","text":["02 00 00 00 00 00 00 00 1d 00 00 00 00 00 00 00\n","68 74 74 70 3a 2f 2f 65 78 61 6d 70 6c 65 2e 63\n","6f 6d 2f 48 65 6c 6c 6f 57 6f 72 6c 64 0b 00 00\n","00 00 00 00 00 68 65 6c 6c 6f 20 77 6f 72 6c 64\n"],"name":"stdout"}]}]} \ No newline at end of file diff --git a/specification.md b/specification.md new file mode 100644 index 0000000..9c4e358 --- /dev/null +++ b/specification.md @@ -0,0 +1,339 @@ +# $signing_spec + +A signature scheme for software supply chain metadata that avoids canonicalization + +November 25, 2020 + +Version 0.1.0 + +## Abstract + +This document proposes a new signature scheme for use by, among others, the +in-toto and TUF projects. This signature scheme (a) avoids relying on +canonicalization for security and (b) reduces the possibility of +misinterpretation of the payload. The serialized payload is encoded as a string +and verified by the recipient _before_ deserializing. A backwards compatible +variant is available. + +## Specification + +$signing_spec does not rely on Canonical JSON, nor any other canonicalization +scheme. Instead, the producer records the signed bytes exactly as signed and the +consumer verifies those exact bytes before parsing. In addition, the signature +now includes an authenticated `payloadType` field indicating how to interpret +the payload. + +```json +{ + "payload": "", + "payloadType": "", + "signatures": [{ + …, + "sig": "" + }, …] +} +``` + +where PAE is the +[PASETO Pre-Authentication Encoding](https://github.com/paragonie/paseto/blob/master/docs/01-Protocol-Versions/Common.md#authentication-padding): + +```none +PAE([type, body]) := le64(2) || le64(len(type)) || type || le64(len(body)) || body +le64(n) := 64-bit little-endian encoding of `n`, where 0 <= n < 2^63 +``` + +The PAYLOAD_TYPE is a URI indicating how to interpret SERIALIZED_BODY. 
It +encompasses the content type (JSON, Canonical-JSON, CBOR, etc.), the purpose, +and the schema version of the payload. This obviates the need for the `_type` +field within in-toto/TUF payloads. This URI does not need to be resolved to a +remote resource, nor does such a resource need to be fetched. Examples: + +- https://in-toto.io/Link/v0.9 +- https://in-toto.io/Layout/v0.9 +- https://theupdateframework.com/Root/v1.0.5 +- etc... + +The switch from Hex to Base64 for `sig` is to save space and to be consistent +with `payload`. + +### Steps + +To sign: + +- Serialize BODY according to PAYLOAD_TYPE. Call the result SERIALIZED_BODY. +- Sign PAE([PAYLOAD_TYPE, SERIALIZED_BODY]), base64-encode the result, and + store it in `sig`. +- Base64-encode SERIALIZED_BODY and store it in `payload`. +- Store PAYLOAD_TYPE in `payloadType`. + +To verify: + +- Base64-decode `payload`; call this SERIALIZED_BODY. Reject if the decoding + fails. +- Base64-decode `sig` and verify PAE([PAYLOAD_TYPE, SERIALIZED_BODY]). Reject + if either the decoding or the signature verification fails. +- Parse SERIALIZED_BODY according to PAYLOAD_TYPE. Reject if the parsing + fails. + +Either standard or URL-safe base64 encodings are allowed. Signers may use +either, and verifiers must accept either. + +### Backwards compatible signatures + +To convert existing signatures from the current format to the new format, +`"backwards-compatible-json"` must be added to the payload type URI to indicate +that the signature is over the raw payload. This allows the signatures to remain +valid while sparing the verifier from having to use Canonical JSON. + +```json +{ + "payload": "", + "payloadType": "/backwards-compatible-json", + "signatures" : [{ + …, + "sig" : "" + }, …] +} +``` + +Support for this backwards compatibility mode is optional. + +To sign: + +- BODY **must** be an object type (`{...}`). +- Serialize BODY as Canonical JSON; call this SERIALIZED_BODY. 
+- Sign SERIALIZED_BODY, base64-encode the result, and store it in `sig`. +- Base64-encode SERIALIZED_BODY and store it in `payload`. +- Store `"/backwards-compatible-json"` in `payloadType`. + +To verify: + +- If `payloadType` != `"/backwards-compatible-json"`, use the normal + verification process instead of this one. +- Base64-decode `payload`; call this SERIALIZED_BODY. Reject if the decoding + fails. +- Base64-decode `sig` and verify SERIALIZED_BODY. Reject if either the + decoding or the signature verification fails. +- Parse SERIALIZED_BODY as a JSON object. Reject if the parsing fails or if + the result is not a JSON object. In particular, the first byte of + SERIALIZED_BODY must be `{`. Verifiers **must not** require SERIALIZED_BODY + to be Canonical JSON. + +Backwards compatible signatures are not recommended because they lack the +authenticated payloadType indicator. + +This scheme is safe from rollback attacks because the first byte of the +signed message must be 0x7b (`{`) in backwards compatibility mode, where +SERIALIZED_BODY is signed directly, and 0x02 in regular mode, where the PAE +output is signed. + +### Optional changes to wrapper + +The standard wrapper is JSON with an explicit `payloadType`. Optionally, +applications may encode the wrapper in other ways without invalidating the +signature: + +- An encoding other than JSON, such as CBOR or Protobuf. +- Use a default `payloadType` if omitted and/or encode `payloadType` as a + shorter string or enum. + +At this point we do not standardize any other encoding. If a need arises, we may +do so in the future. + +### Differentiating between old and new formats + +Verifiers can differentiate between the old and new wrapper format by detecting +the presence of the `payload` field vs `signed` field. + +## Motivation + +There are two concerns with the current in-toto/TUF signature wrapper. + +First, the signature scheme depends on [Canonical JSON], which has one practical +problem and two theoretical ones: + +1. 
Practical problem: It requires the payload to be JSON or convertible to + JSON. While this happens to be true of in-toto and TUF today, a generic + signature layer should be able to handle arbitrary payloads. +1. Theoretical problem 1: Two semantically different payloads could have the + same canonical encoding. Although there are currently no known attacks on + Canonical JSON, there have been attacks in the past on other + canonicalization schemes + ([example](https://latacora.micro.blog/2019/07/24/how-not-to.html#canonicalization)). + It is safer to avoid canonicalization altogether. +1. Theoretical problem 2: It requires the verifier to parse the payload before + verifying, which is both error-prone (it is too easy to forget to verify) and + an unnecessary increase in attack surface. + +The preferred solution is to transmit the encoded byte stream exactly as it was +signed, which the verifier verifies before parsing. This is what is done in +[JWS] and [PASETO], for example. + +Second, the scheme does not include an authenticated "context" indicator to +ensure that the signer and verifier interpret the payload in exactly the same way. +For example, if in-toto were extended to support CBOR and Protobuf encoding, an +attacker could get a CI/CD system to produce a CBOR message saying X and then get a +verifier to interpret it as a protobuf message saying Y. While we don't know of +an exploitable attack on in-toto or TUF today, potential changes could introduce +such a vulnerability. The signature scheme should be resilient against these +classes of attacks. See [example attack](hypothetical_signature_attack.ipynb) +for more details. + +## Reasoning + +Our goal was to create a signature wrapper that is as simple and foolproof as +possible. Alternatives such as [JWS] are extremely complex and error-prone, +while others such as [PASETO] are overly specific. (Both are also +JSON-specific.) We believe our proposal strikes the right balance of simplicity, +usefulness, and security. 
+ +Rationales for specific decisions: + +- Why use base64 for payload and sig? + + - Because JSON strings do not allow binary data, so we need to either + encode the data or escape it. Base64 is a standard, reasonably + space-efficient way of doing so. Protocols that have a first-class + concept of "bytes", such as protobuf or CBOR, do not need to use base64. + +- Why sign raw bytes rather than base64 encoded bytes (as per JWS)? + + - Because it's simpler. Base64 is only needed for putting binary data in a + text field, such as JSON. In other formats, such as protobuf or CBOR, + base64 isn't needed at all. + +- Why does payloadType need to be signed? + + - See [Motivation](#motivation). + +- Why use PAE? + + - Because we need an unambiguous way of serializing two fields, + payloadType and payload. PAE is already documented and good enough. No + need to reinvent the wheel. + +- Why use a URI for payloadType rather than + [Media Type](https://www.iana.org/assignments/media-types/media-types.xhtml) + (a.k.a. MIME type)? + + - Because Media Type only indicates how to parse but does not indicate + purpose, schema, or versioning. If it were just "application/json", for + example, then every application would need to impose some "type" field + within the payload, lest we have similar vulnerabilities as if + payloadType were not signed. + - Also, URIs don't need to be registered while Media Types do. + +- Why use payloadType "backwards-compatible-json" instead of assuming + backwards compatible mode if payloadType is absent? + + - We wanted to leave open the possibility of having an + application-specific "default" value if payloadType is unspecified, + rather than forcing the default to be backwards compatibility mode. + - Note that specific applications can still choose backwards compatibility + to be the default. + +- Why not stay backwards compatible by requiring the payload to always be JSON + with a "_type" field? 
Then if you want a non-JSON payload, you could simply + have a field that contains the real payload, e.g. `{"_type":"my-thing", + "value":"base64…"}`. + + 1. It encourages users to add a "_type" field to their payload, which in + turn: + - (a) Ties the payload type to the authentication type. Ideally the + two would be independent. + - (b) May conflict with other uses of that same field. + - (c) May require the user to specify type multiple times with + different field names, e.g. with "@context" for + [JSON-LD](https://json-ld.org/). + 2. It would incur double base64 encoding overhead for non-JSON payloads. + 3. It is more complex than PAE. + +## Backwards Compatibility + +### Current format + +The +[current signature format](https://github.com/in-toto/docs/blob/master/in-toto-spec.md#42-file-formats-general-principles) +used by TUF and in-toto has a BODY that is a regular JSON object and a signature over the +[Canonical JSON] serialization of BODY. + +```json +{ + "signed": , + "signatures": [{ + …, + "sig": "" + }, …] +} +``` + +To verify, the consumer parses the whole JSON file, re-serializes BODY using +Canonical JSON, then verifies the signature. + +### Detect if a document is using old format + +To detect whether a signature is in the old or new format: + +- If it contains a `payload` field, assume it is in the new format. +- If it contains a `signed` field, assume it is in the old format. + +To convert an existing signature to the new format: + +- `new.payload = base64encode(CanonicalJson(orig.signed))` +- `new.payloadType = "/backwards-compatible-json"` +- `new.signatures[*].sig = base64encode(hexdecode(orig.signatures[*].sig))` + +To convert a backwards compatible signature to the old format: + +- `old.signed = jsonparse(base64decode(new.payload))` +- `old.signatures[*].sig = hexencode(base64decode(new.signatures[*].sig))` + +## Testing + +See [reference implementation](reference_implementation.ipynb). Here is an +example. 
+ +BODY: + +```none +hello world +``` + +PAYLOAD_TYPE: + +```none +http://example.com/HelloWorld +``` + +PAE: + +```none +02 00 00 00 00 00 00 00 1d 00 00 00 00 00 00 00 +68 74 74 70 3a 2f 2f 65 78 61 6d 70 6c 65 2e 63 +6f 6d 2f 48 65 6c 6c 6f 57 6f 72 6c 64 0b 00 00 +00 00 00 00 00 68 65 6c 6c 6f 20 77 6f 72 6c 64 +``` + +Cryptographic keys: + +```none +Algorithm: ECDSA over NIST P-256 and SHA-256, with deterministic-rfc6979 +Signature: raw concatenation of r and s (Cryptodome binary encoding) +X: 46950820868899156662930047687818585632848591499744589407958293238635476079160 +Y: 5640078356564379163099075877009565129882514886557779369047442380624545832820 +d: 97358161215184420915383655311931858321456579547487070936769975997791359926199 +``` + +Signed wrapper: + +```json +{"payload": "aGVsbG8gd29ybGQ=", + "payloadType": "http://example.com/HelloWorld", + "signatures": [{"sig": "y7BK8Mm8Mr4gxk4+G9X3BD1iBc/vVVuJuV4ubmsEK4m/8MhQOOS26ejx+weIjyAx8VjYoZRPpoXSNjHEzdE7nQ=="}]} +``` + +## References + +- [Canonical JSON](http://wiki.laptop.org/go/Canonical_JSON) +- [JWS](https://tools.ietf.org/html/rfc7515) +- [PASETO](https://github.com/paragonie/paseto/blob/master/docs/01-Protocol-Versions/Version2.md#sig) +
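
As a rough, standard-library-only sketch of how the pieces above fit together, the following recomputes the PAE bytes listed in the Testing section and shows that binding `payloadType` into the signed bytes makes a `payloadType` swap fail verification. HMAC-SHA256 is used purely as a stand-in for a real signature algorithm; the names `DEMO_KEY`, `pae`, `sign`, and `verify` are hypothetical and not part of the specification, and only standard base64 is handled here.

```python
import base64
import hashlib
import hmac
import json
import struct

# Hypothetical demo key; HMAC-SHA256 stands in for a real signature scheme.
DEMO_KEY = b"illustration-only key"


def pae(payload_type: str, payload: bytes) -> bytes:
    """PAE([type, body]) with 64-bit little-endian length prefixes."""
    type_bytes = payload_type.encode("utf-8")
    return b"".join([
        struct.pack("<Q", 2),
        struct.pack("<Q", len(type_bytes)), type_bytes,
        struct.pack("<Q", len(payload)), payload,
    ])


def sign(payload_type: str, payload: bytes) -> str:
    """Builds a wrapper whose tag covers PAE(payloadType, payload)."""
    tag = hmac.new(DEMO_KEY, pae(payload_type, payload), hashlib.sha256).digest()
    return json.dumps({
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": payload_type,
        "signatures": [{"sig": base64.b64encode(tag).decode("ascii")}],
    })


def verify(wrapper_json: str):
    """Returns (payloadType, payload), or raises ValueError if no tag matches."""
    wrapper = json.loads(wrapper_json)
    payload_type = wrapper["payloadType"]
    payload = base64.b64decode(wrapper["payload"], validate=True)
    expected = hmac.new(DEMO_KEY, pae(payload_type, payload), hashlib.sha256).digest()
    for signature in wrapper["signatures"]:
        candidate = base64.b64decode(signature["sig"], validate=True)
        if hmac.compare_digest(expected, candidate):
            return payload_type, payload
    raise ValueError("No valid signature found")


# PAE bytes from the Testing section, with the spaces removed.
TEST_VECTOR_HEX = (
    "02000000000000001d00000000000000"
    "687474703a2f2f6578616d706c652e63"
    "6f6d2f48656c6c6f576f726c640b0000"
    "000000000068656c6c6f20776f726c64"
)
assert pae("http://example.com/HelloWorld", b"hello world").hex() == TEST_VECTOR_HEX

# An untouched wrapper verifies.
original = sign("CBOR", b"example payload")
print(verify(original))

# Swapping payloadType, as in the hypothetical attack, changes the PAE bytes.
tampered = json.loads(original)
tampered["payloadType"] = "Protobuf"
try:
    verify(json.dumps(tampered))
except ValueError as error:
    print("rejected:", error)
```

Because the tag is computed over `pae(payload_type, payload)` rather than over `payload` alone, the tampered wrapper no longer matches, which is the property the hypothetical attack relies on being absent.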