Commit 5e48c67

Merge pull request #3727 from handrews/encodings
Clarify how to model binary data in 3.1
2 parents ddbd53f + 8de5a93 commit 5e48c67

1 file changed: +60 -16 lines changed

versions/3.1.1.md
@@ -170,6 +170,40 @@ The formats defined by the OAS are:
 `number` | `double` | |
 `string` | `password` | A hint to obscure the value.

+#### <a name="binaryData"></a>Working With Binary Data
+
+The OAS can describe either _raw_ or _encoded_ binary data.
+
+* **raw binary** is used where unencoded binary data is allowed, such as when sending a binary payload as the entire HTTP message body, or as part of a `multipart/*` payload that allows binary parts
+* **encoded binary** is used where binary data is embedded in a text-only format such as `application/json` or `application/x-www-form-urlencoded` (either as a message body or in the URL query string).
+
+In the following table showing how to use Schema Object keywords for binary data, we use `image/png` as an example binary media type. Any binary media type, including `application/octet-stream`, is sufficient to indicate binary content.
+
+Keyword | Raw | Encoded | Comments
+------- | --- | ------- | --------
+`type` | _omit_ | `string` | raw binary is [outside of `type`](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-00#section-4.2.3)
+`contentMediaType` | `image/png` | `image/png` | can sometimes be omitted if redundant (see below)
+`contentEncoding` | _omit_ | `base64`&nbsp;or&nbsp;`base64url` | other encodings are [allowed](https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-validation-00#section-8.3)
+
+Note that the encoding indicated by `contentEncoding`, which inflates the size of data in order to represent it as 7-bit ASCII text, is unrelated to HTTP's `Content-Encoding` header, which indicates whether and how a message body has been compressed and is applied after all content serialization described in this section has occurred. Since HTTP allows unencoded binary message bodies, there is no standardized HTTP header for indicating base64 or similar encoding of an entire message body.
+
+Using a `contentEncoding` of `base64url` ensures that URL encoding (as required in the query string and in message bodies of type `application/x-www-form-urlencoded`) does not need to further encode any part of the already-encoded binary data.
+
+The `contentMediaType` keyword is redundant if the media type is already set:
+
+* as the key for a [`MediaType Object`](#mediaTypeObject)
+* in the `contentType` field of an [`Encoding Object`](#encodingObject)
+
+If the Schema Object will be processed by a non-OAS-aware JSON Schema implementation, it may be useful to include `contentMediaType` even if it is redundant. However, if `contentMediaType` contradicts a relevant Media Type Object or Encoding Object, then `contentMediaType` SHALL be ignored.
+
+The following table shows how to migrate from OAS 3.0 binary data descriptions, continuing to use `image/png` as the example binary media type:
+
+OAS < 3.1 | OAS 3.1 | Comments
+--------- | ------- | --------
+`type: string`<br />`format: binary` | `contentMediaType: image/png` | if redundant, can be omitted, often resulting in an empty Schema Object
+`type: string`<br />`format: byte` | `type: string`<br />`contentMediaType: image/png`<br />`contentEncoding: base64` | note that `base64url` can be used to avoid re-encoding the base64 string to be URL-safe
+
+
 ### <a name="richText"></a>Rich Text Formatting
 Throughout the specification `description` fields are noted as supporting CommonMark markdown formatting.
 Where OpenAPI tooling renders rich text it MUST support, at a minimum, markdown syntax as described by [CommonMark 0.27](https://spec.commonmark.org/0.27/). Tooling MAY choose to ignore some CommonMark features to address security concerns.
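
Review note: the keyword table and the redundancy rule added in this hunk can be read as two small functions. The Python below is only an illustrative sketch; `binary_schema` and `effective_media_type` are hypothetical helper names, not part of the OAS or of any library, and the `declared` argument stands for the Media Type Object key or the Encoding Object's `contentType`.

```python
from typing import Optional


def binary_schema(media_type: str, encoded: bool, encoding: str = "base64") -> dict:
    """Build Schema Object keywords for binary data, following the keyword table."""
    if not encoded:
        # Raw binary: `type` and `contentEncoding` are omitted.
        return {"contentMediaType": media_type}
    # Encoded binary: a string carrying base64 (or base64url) text.
    return {
        "type": "string",
        "contentMediaType": media_type,
        "contentEncoding": encoding,
    }


def effective_media_type(schema: dict, declared: Optional[str]) -> Optional[str]:
    """Pick the media type that applies to a schema.

    A media type declared by the surrounding Media Type Object or
    Encoding Object wins, so a contradictory `contentMediaType` is ignored.
    """
    if declared is not None:
        return declared
    return schema.get("contentMediaType")
```

For example, `binary_schema("image/png", encoded=False)` yields only `contentMediaType`, matching the "Raw" column, and `effective_media_type({"contentMediaType": "image/gif"}, "image/png")` returns `image/png`.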
@@ -1458,9 +1492,7 @@ application/json:

 In contrast with the 2.0 specification, `file` input/output content in OpenAPI is described with the same semantics as any other schema type.

-In contrast with the 3.0 specification, the `format` keyword has no effect on the content-encoding of the schema. JSON Schema offers a `contentEncoding` keyword, which may be used to specify the `Content-Encoding` for the schema. The `contentEncoding` keyword supports all encodings defined in [RFC4648](https://tools.ietf.org/html/rfc4648), including "base64" and "base64url", as well as "quoted-printable" from [RFC2045](https://tools.ietf.org/html/rfc2045#section-6.7). The encoding specified by the `contentEncoding` keyword is independent of an encoding specified by the `Content-Type` header in the request or response or metadata of a multipart body -- when both are present, the encoding specified in the `contentEncoding` is applied first and then the encoding specified in the `Content-Type` header.
-
-JSON Schema also offers a `contentMediaType` keyword. However, when the media type is already specified by the Media Type Object's key, or by the `contentType` field of an [Encoding Object](#encodingObject), the `contentMediaType` keyword SHALL be ignored if present.
+In contrast with the 3.0 specification, the `format` keyword has no effect on the content-encoding of the schema. Instead, JSON Schema's `contentEncoding` and `contentMediaType` keywords are used. See [Working With Binary Data](#binaryData) for how to model various scenarios with these keywords, and how to migrate from the previous `format` usage.

 Examples:

@@ -1478,19 +1510,6 @@ content:
   application/octet-stream: {}
 ```

-Binary content transferred with base64 encoding:
-
-```yaml
-content:
-  image/png:
-    schema:
-      type: string
-      contentMediaType: image/png
-      contentEncoding: base64
-```
-
-Note that the `Content-Type` remains `image/png`, describing the semantics of the payload. The JSON Schema `type` and `contentEncoding` fields explain that the payload is transferred as text. The JSON Schema `contentMediaType` is technically redundant, but can be used by JSON Schema tools that may not be aware of the OpenAPI context.
-
 These examples apply to either input payloads of file uploads or response payloads.

 A `requestBody` for submitting a file in a `POST` operation may look like the following example:
@@ -1567,6 +1586,8 @@ When passing in `multipart` types, boundaries MAY be used to separate sections o

 Per the JSON Schema specification, `contentMediaType` without `contentEncoding` present is treated as if `contentEncoding: identity` were present. While useful for embedding text documents such as `text/html` into JSON strings, it is not useful for a `multipart/form-data` part, as it just causes the document to be treated as `text/plain` instead of its actual media type. Use the Encoding Object without `contentMediaType` if no `contentEncoding` is required.

+Note that only `multipart/*` media types with named parts can be described as shown here. Note also that while `multipart/form-data` originally defined a per-part `Content-Transfer-Encoding` header that could indicate base64 encoding (`contentEncoding: base64`), it has been deprecated for use with HTTP as of [RFC7578](https://www.rfc-editor.org/rfc/rfc7578#section-4.7).
+
 Examples:

 ```yaml
@@ -1620,6 +1641,8 @@ This object MAY be extended with [Specification Extensions](#specificationExtens

 ##### Encoding Object Example

+`multipart/form-data` allows for binary parts:
+
 ```yaml
 requestBody:
   content:
@@ -1655,6 +1678,27 @@ requestBody:
                 type: integer
 ```

+`application/x-www-form-urlencoded` is a text format, which requires base64-encoding any binary data:
+
+```YAML
+requestBody:
+  content:
+    application/x-www-form-urlencoded:
+      schema:
+        type: object
+        properties:
+          name:
+            type: string
+          icon:
+            # default for type string is text/plain, need to declare
+            # the appropriate contentType in the Encoding Object
+            type: string
+            contentEncoding: base64url
+      encoding:
+        icon:
+          contentType: image/png, image/jpeg
+```
+
 #### <a name="responsesObject"></a>Responses Object

 A container for the expected responses of an operation.
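
Review note: the reason the added `application/x-www-form-urlencoded` example uses `base64url` rather than `base64` can be seen with Python's standard library. This is a minimal sketch; the three `icon` bytes are a made-up stand-in chosen so that the standard alphabet produces `+` and `/`, and the field names simply mirror the example above.

```python
import base64
from urllib.parse import parse_qs, urlencode

# Stand-in payload (not a real image), chosen to hit '+' and '/' in base64.
icon = b"\xfb\xff\xfe"

std = base64.b64encode(icon).decode("ascii")          # standard alphabet
url = base64.urlsafe_b64encode(icon).decode("ascii")  # URL-safe alphabet

# Form bodies as a client would send them.
std_body = urlencode({"name": "logo", "icon": std})
url_body = urlencode({"name": "logo", "icon": url})

print(std_body)  # the base64 value has to be percent-encoded
print(url_body)  # the base64url value passes through unchanged

# A server can reverse the steps: parse the form, then decode base64url.
decoded = base64.urlsafe_b64decode(parse_qs(url_body)["icon"][0])
```

One caveat: `base64.urlsafe_b64encode` keeps `=` padding, which URL encoding would still escape as `%3D`. The payload here is a multiple of three bytes, so no padding appears; for other lengths, an implementation would need to strip the padding and restore it before decoding.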
