Commit b58edbb

Move HTML bits into a separate section, with callouts into the documentLoader algorithm to contain HTML-related processing. Also removes the processor levels, instead describing such a processor as one supporting "HTML script extraction".
For w3c/json-ld-syntax#213.
1 parent a710717 commit b58edbb

1 file changed

+103
-139
lines changed

index.html

Lines changed: 103 additions & 139 deletions
@@ -940,52 +940,6 @@ <h2>RDF Serialization/Deserialization</h2>
940940
</tr>
941941
</tbody>
942942
</table>
943-
944-
<section>
945-
<h2>Processor Levels</h2>
946-
947-
<!--
948-
**************** IMPORTANT WARNING ****************
949-
This section is duplicated (with a few adaptations)
950-
in other JSON-LD specifications. Since these
951-
definitions are normative, it is important to
952-
reflect any change in the other documents.
953-
***************************************************
954-
-->
955-
956-
<p>JSON-LD mostly uses the JSON syntax [[RFC8259]] along with
957-
various micro-syntaxes based on XML Schema datatypes [[XMLSCHEMA11-2]].
958-
However, it has become increasingly common to include JSON within
959-
a <a data-cite="HTML/scripting.html#the-script-element">script element</a>
960-
within an HTML document [[HTML]],
961-
as described in <a href="#html-content-algorithms" class="sectionRef"></a>.
962-
As not all processors operate in an environment which can include HTML,
963-
this specification describes various categories of JSON-LD processors.</p>
964-
965-
<p>A <dfn data-cite="JSON-LD11#pure-json-processor">pure JSON Processor</dfn> only requires the use of a
966-
JSON processor and is restricted to processing documents retrieved
967-
with a JSON content type (e.g., <code>application/ld+json</code> or other JSON type).</p>
968-
969-
<p>A <dfn data-cite="JSON-LD11#full-processor">full Processor</dfn> is capable of processing JSON-LD embedded in HTML,
970-
in addition to the capabilities of a <a>pure JSON Processor</a>.</p>
971-
972-
<section class="informative">
973-
<h3>Additional Processor Levels</h3>
974-
975-
<p>In addition to the normatively defined processor levels, an additional processor
976-
level is defined for reference.</p>
977-
978-
<p>A <dfn data-cite="JSON-LD11#event-based-json-processor">event-based JSON Processor</dfn> processes a stream of characters
979-
expecting an event after each syntactic element is encountered.
980-
Such processors are sensitive to the order of the members of <a>JSON objects</a>,
981-
which can have a performance impact if the members of <a>JSON objects</a> are encountered in an unexpected order.
982-
An <a>event-based JSON Processor</a> may process JSON-LD embedded in HTML.</p>
983-
984-
<p class="note">An <a>event-based JSON Processor</a>
985-
may be sensitive to processing certain keywords in order, including
986-
<code>@context</code>, <code>@id</code>, and <code>@type</code>.</p>
987-
</section>
988-
</section>
989943
</section> <!-- end of Conformance section -->
990944

991945
<section>
@@ -5212,29 +5166,6 @@ <h2>Data Round Tripping</h2>
52125166
</section> <!-- end of Data Round Tripping -->
52135167
</section>
52145168

5215-
<section class="changed">
5216-
<h2>HTML Content Algorithms</h2>
5217-
<p class="note">This section describes features required of a <a>full Processor</a>.</p>
5218-
<section id="extract-script-content" class="algorithm">
5219-
<h3>Extract Script Content Algorithm</h3>
5220-
5221-
<p>The algorithm extracts the text content a
5222-
<a>JSON-LD script element</a> into a <a>map</a> or <a>array</a> of <a>maps</a>.
5223-
A <dfn>JSON-LD script element</dfn> is a <a data-cite="HTML/scripting.html#the-script-element">script element</a>
5224-
within an HTML [[HTML]] document with the <a data-cite="HTML/semantics.html#attr-link-type">type attribute</a> set to
5225-
<code>application/ld+json</code>.</p>
5226-
5227-
<p>The algorithm takes a single required input variable: <var>source</var>,
5228-
the <a data-cite="DOM#dom-node-textcontent">textContent</a> of an HTML <a data-cite="HTML/scripting.html#the-script-element">script element</a>.</p>
5229-
5230-
<ol>
5231-
<li>If <var>source</var> is not a valid JSON document,
5232-
an <a data-link-for="JsonLdErrorCode">invalid script element</a> has been detected, and processing is aborted.</li>
5233-
<li>Return the result of transforming <var>source</var> into the <a>internal representation</a>.</li>
5234-
</ol>
5235-
</section>
5236-
</section>
5237-
52385169
<section>
52395170
<h2>The Application Programming Interface</h2>
52405171

@@ -5817,9 +5748,6 @@ <h3>LoadDocumentCallback</h3>
58175748
it MUST be added as a profile on <code>application/ld+json</code>.</p>
58185749

58195750
<p>Processors MAY include other media types using a <code>+json</code> suffix as defined in [[RFC6839]].</p>
5820-
5821-
<p>A <a>full Processor</a> MUST include <code>text/html</code> at any preference level,
5822-
unless <a data-link-for="LoadDocumentOptions">requestProfile</a> is `http://www.w3.org/ns/json-ld#context`.</p>
58235751
</li>
58245752
<li>Set <var>documentUrl</var> to the location of the retrieved resource
58255753
considering redirections (exclusive of HTTP status <code>303</code> "See Other" redirects
@@ -5831,7 +5759,7 @@ <h3>LoadDocumentCallback</h3>
58315759
set <var>url</var> to the associated <code>href</code> relative to the previous <var>url</var>
58325760
and restart the algorithm from <a href="#LoadDocumentCallback-step-2">step 2</a>,
58335761
ensuring that <var>documentUrl</var> is set to the original <var>url</var>.</li>
5834-
<li>If the retrieved resource's <a>Content-Type</a> is <code>application/json</code>
5762+
<li id="LoadDocumentCallback-step-5">If the retrieved resource's <a>Content-Type</a> is <code>application/json</code>
58355763
or any media type with a <code>+json</code> suffix as defined in [[RFC6839]]
58365764
except <code>application/ld+json</code>,
58375765
and the response has an HTTP Link Header [[RFC8288]] using the <code>http://www.w3.org/ns/json-ld#context</code> link relation,
@@ -5843,74 +5771,9 @@ <h3>LoadDocumentCallback</h3>
58435771
<p class="note">The HTTP Link Header is ignored for documents served as <code>application/ld+json</code>
58445772
or <code>text/html</code>.</p>
58455773
</li>
5846-
<li>Otherwise, if the retrieved resource's <a>Content-Type</a> is <code>text/html</code>:
5847-
<ol>
5848-
<li>If the processor is a <a>pure JSON Processor</a>
5849-
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5850-
and processing is terminated.</li>
5851-
<li>Set <var>documentUrl</var> to the the <a data-cite="HTML/urls-and-fetching.html#document-base-url">Document Base URL</a>
5852-
of <a data-link-for="LoadDocumentCallback">url</a>, as defined in [[HTML]],
5853-
using the existing <var>documentUrl</var> as the document's URL.
5854-
</li>
5855-
<li>If the <a data-link-for="LoadDocumentCallback">url</a> parameter
5856-
contains a <a data-cite="RFC3986#section-3.5">fragment identifier</a>,
5857-
set <var>source</var> to the <a data-cite="DOM#dom-node-textcontent">textContent</a>
5858-
of the <a data-cite="HTML/scripting.html#the-script-element">script element</a> in <var>document</var>
5859-
having an <a data-cite="HTML/dom.html#the-id-attribute">id attribute</a>
5860-
that matches the fragment identifier, after decoding <a data-cite="RFC3986#section-2.1">percent encoded sequences</a>.
5861-
<p>If no such element is found,
5862-
or the located element is not a <a>JSON-LD script element</a>,
5863-
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5864-
and processing is terminated.</p>
5865-
</li>
5866-
<li>Otherwise, if the <a data-link-for="LoadDocumentOptions">profile</a>
5867-
option is specified,
5868-
set <var>source</var> to the result of transforming the
5869-
<a data-cite="DOM#dom-node-textcontent">textContent</a>
5870-
of the first <a data-cite="HTML/scripting.html#the-script-element">script element</a> in <var>document</var>
5871-
having an <a data-cite="HTML/semantics.html#attr-link-type">type attribute</a>
5872-
of <code>application/ld+json</code> along with the value of the
5873-
<a data-link-for="LoadDocumentOptions">profile</a> option, if found.</li>
5874-
<li>If <var>source</var> is still undefined and the <a data-link-for="LoadDocumentOptions">extractAllScripts</a> option is not present, or <code>false</code>,
5875-
set <var>source</var> to the <a data-cite="DOM#dom-node-textcontent">textContent</a>
5876-
of the first <a>JSON-LD script element</a> in <var>document</var>.
5877-
<p>If no such element is found,
5878-
or the located element is not a <a>JSON-LD script element</a>,
5879-
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5880-
and processing is terminated.</p></li>
5881-
<li>If <var>source</var> is defined,
5882-
set <var>document</var> to the result of the
5883-
<a href="#extract-script-content">Extract Script Content algorithm</a>,
5884-
using <var>source</var>, rejecting <var>promise</var>
5885-
with a <a>JsonLdError</a> whose code set from the result, if an error is detected
5886-
and processing is terminated.
5887-
</li>
5888-
<li>Otherwise, <var>source</var> is undefined.
5889-
<ol>
5890-
<li>If the <a data-link-for="LoadDocumentOptions">extractAllScripts</a> option is not present, or <code>false</code>,
5891-
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5892-
and processing is terminated.</li>
5893-
<li>Otherwise, the <a data-link-for="LoadDocumentOptions">extractAllScripts</a> option is <code>true</code>.
5894-
Set <var>document</var> to a new empty <a>array</a>.
5895-
For each <a>JSON-LD script element</a> in <var>input</var>:
5896-
<ol>
5897-
<li>Set <var>source</var> to its <a data-cite="DOM#dom-node-textcontent">textContent</a>.</li>
5898-
<li>Set <var>script content</var> to the result of the <a href="#extract-script-content">Extract Script Content algorithm</a>,
5899-
using <var>source</var>, rejecting <var>promise</var>
5900-
with a <a>JsonLdError</a> whose code set from the result, if an error is detected
5901-
and processing is terminated.</li>
5902-
<li>If <var>script content</var> is an <a>array</a>, merge it to the end of <var>document</var>.</li>
5903-
<li>Otherwise, append <var>script content</var> to <var>document</var>.</li>
5904-
</ol>
5905-
</li>
5906-
</ol>
5907-
</li>
5908-
</ol>
5909-
</li>
59105774
<li>Otherwise, the retrieved document's <a>Content-Type</a> is neither
59115775
<code>application/json</code>,
59125776
<code>application/ld+json</code>,
5913-
<code>text/html</code>,
59145777
nor any other media type using a
59155778
<code>+json</code> suffix as defined in [[RFC6839]].
59165779
Reject the <var>promise</var> passing a <a data-link-for="JsonLdErrorCode">loading document failed</a> error.</li>
@@ -5998,6 +5861,108 @@ <h3>RemoteDocument</h3>
59985861
</section>
59995862
</section> <!-- end of Remote Document and Context Retrieval -->
60005863

5864+
<section class="changed">
5865+
<h2>HTML Content Algorithms</h2>
5866+
<p class="note">This section describes features available
5867+
with a <a data-link-for="JsonLdOptions">documentLoader</a> supporting HTML script extraction.</p>
5868+
<p>Implementations of a <a data-link-for="JsonLdOptions">documentLoader</a> MAY support extracting JSON-LD from
5869+
<a data-cite="HTML/scripting.html#the-script-element">script elements</a> contained within an HTML [[HTML]] document.
5870+
This section describes the normative behavior of such processors.
5871+
Such a processor supports <dfn>HTML script extraction</dfn>.</p>
5872+
5873+
<section id="process-html"><h3>Process HTML</h3>
5874+
<p>This section describes an extension to the algorithm specified
5875+
in <a>LoadDocumentCallback</a> to support extracting JSON-LD from HTML.</p>
5876+
5877+
<p><a href="#LoadDocumentCallback-step-2">Step 2</a> is updated to add the following: A processor supporting <a>HTML script extraction</a> MUST include <code>text/html</code> at any preference level,
5878+
unless <a data-link-for="LoadDocumentOptions">requestProfile</a> is `http://www.w3.org/ns/json-ld#context`.</p>
5879+
5880+
<p>After <a href="#LoadDocumentCallback-step-5">step 5</a>, add the following processing step:
5881+
Otherwise, if the retrieved resource's <a>Content-Type</a> is <code>text/html</code>:</p>
5882+
<ol>
5883+
<li>If the processor does not support <a>HTML script extraction</a>
5884+
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5885+
and processing is terminated.</li>
5886+
<li>Set <var>documentUrl</var> to the <a data-cite="HTML/urls-and-fetching.html#document-base-url">Document Base URL</a>
5887+
of <a data-link-for="LoadDocumentCallback">url</a>, as defined in [[HTML]],
5888+
using the existing <var>documentUrl</var> as the document's URL.
5889+
</li>
5890+
<li>If the <a data-link-for="LoadDocumentCallback">url</a> parameter
5891+
contains a <a data-cite="RFC3986#section-3.5">fragment identifier</a>,
5892+
set <var>source</var> to the <a data-cite="DOM#dom-node-textcontent">textContent</a>
5893+
of the <a data-cite="HTML/scripting.html#the-script-element">script element</a> in <var>document</var>
5894+
having an <a data-cite="HTML/dom.html#the-id-attribute">id attribute</a>
5895+
that matches the fragment identifier, after decoding <a data-cite="RFC3986#section-2.1">percent encoded sequences</a>.
5896+
<p>If no such element is found,
5897+
or the located element is not a <a>JSON-LD script element</a>,
5898+
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5899+
and processing is terminated.</p>
5900+
</li>
5901+
<li>Otherwise, if the <a data-link-for="LoadDocumentOptions">profile</a>
5902+
option is specified,
5903+
set <var>source</var> to the result of transforming the
5904+
<a data-cite="DOM#dom-node-textcontent">textContent</a>
5905+
of the first <a data-cite="HTML/scripting.html#the-script-element">script element</a> in <var>document</var>
5906+
having an <a data-cite="HTML/semantics.html#attr-link-type">type attribute</a>
5907+
of <code>application/ld+json</code> along with the value of the
5908+
<a data-link-for="LoadDocumentOptions">profile</a> option, if found.</li>
5909+
<li>If <var>source</var> is still undefined and the <a data-link-for="LoadDocumentOptions">extractAllScripts</a> option is not present, or <code>false</code>,
5910+
set <var>source</var> to the <a data-cite="DOM#dom-node-textcontent">textContent</a>
5911+
of the first <a>JSON-LD script element</a> in <var>document</var>.
5912+
<p>If no such element is found,
5913+
or the located element is not a <a>JSON-LD script element</a>,
5914+
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5915+
and processing is terminated.</p></li>
5916+
<li>If <var>source</var> is defined,
5917+
set <var>document</var> to the result of the
5918+
<a href="#extract-script-content">Extract Script Content algorithm</a>,
5919+
using <var>source</var>, rejecting <var>promise</var>
5920+
with a <a>JsonLdError</a> whose code is set from the result, if an error is detected
5921+
and processing is terminated.
5922+
</li>
5923+
<li>Otherwise, <var>source</var> is undefined.
5924+
<ol>
5925+
<li>If the <a data-link-for="LoadDocumentOptions">extractAllScripts</a> option is not present, or <code>false</code>,
5926+
the <var>promise</var> is rejected with a <a>JsonLdError</a> whose code is set to <a data-link-for="JsonLdErrorCode">loading document failed</a>
5927+
and processing is terminated.</li>
5928+
<li>Otherwise, the <a data-link-for="LoadDocumentOptions">extractAllScripts</a> option is <code>true</code>.
5929+
Set <var>document</var> to a new empty <a>array</a>.
5930+
For each <a>JSON-LD script element</a> in <var>input</var>:
5931+
<ol>
5932+
<li>Set <var>source</var> to its <a data-cite="DOM#dom-node-textcontent">textContent</a>.</li>
5933+
<li>Set <var>script content</var> to the result of the <a href="#extract-script-content">Extract Script Content algorithm</a>,
5934+
using <var>source</var>, rejecting <var>promise</var>
5935+
with a <a>JsonLdError</a> whose code is set from the result, if an error is detected
5936+
and processing is terminated.</li>
5937+
<li>If <var>script content</var> is an <a>array</a>, merge it to the end of <var>document</var>.</li>
5938+
<li>Otherwise, append <var>script content</var> to <var>document</var>.</li>
5939+
</ol>
5940+
</li>
5941+
</ol>
5942+
</li>
5943+
</ol>
5944+
</section>
5945+
5946+
<section id="extract-script-content" class="algorithm">
5947+
<h3>Extract Script Content Algorithm</h3>
5948+
5949+
<p>The algorithm extracts the text content of a
5950+
<a>JSON-LD script element</a> into a <a>map</a> or <a>array</a> of <a>maps</a>.
5951+
A <dfn>JSON-LD script element</dfn> is a <a data-cite="HTML/scripting.html#the-script-element">script element</a>
5952+
within an HTML [[HTML]] document with the <a data-cite="HTML/semantics.html#attr-link-type">type attribute</a> set to
5953+
<code>application/ld+json</code>.</p>
5954+
5955+
<p>The algorithm takes a single required input variable: <var>source</var>,
5956+
the <a data-cite="DOM#dom-node-textcontent">textContent</a> of an HTML <a data-cite="HTML/scripting.html#the-script-element">script element</a>.</p>
5957+
5958+
<ol>
5959+
<li>If <var>source</var> is not a valid JSON document,
5960+
an <a data-link-for="JsonLdErrorCode">invalid script element</a> has been detected, and processing is aborted.</li>
5961+
<li>Return the result of transforming <var>source</var> into the <a>internal representation</a>.</li>
5962+
</ol>
5963+
</section>
5964+
</section>
5965+
60015966
<section>
60025967
<h3>Error Handling</h3>
60035968

@@ -6345,7 +6310,6 @@ <h2>Changes since JSON-LD Community Group Final Report</h2>
63456310
<li>Added support for <a>JSON literals</a>.</li>
63466311
<li><a>Term definitions</a> with keys which are of the form of a <a>compact IRI</a> or <a>absolute IRI</a> MUST NOT
63476312
expand to an <a>IRI</a> other than the expansion of the key itself.</li>
6348-
<li>Define different processor modes: <a>pure JSON Processor</a>, <a>event-based JSON processor</a>, and <a>full Processor</a>.</li>
63496313
<li>Consolidate <a>RemoteDocument</a> processing into the <a>LoadDocumentCallback</a>
63506314
including variations on HTML processing.</li>
63516315
<li>The <a href="#iri-compaction">IRI compaction algorithm</a> may generate an error if the result is an
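
For readers following the relocated text, the "Process HTML" steps and the Extract Script Content algorithm added in this commit can be sketched roughly as below. This is a minimal Python sketch, not part of the commit or of the specification. The names `process_html`, `extract_script_content`, `ScriptCollector`, and the two exception classes are hypothetical stand-ins for the spec's promise rejection with a `JsonLdError`. Matching `profile` as a substring of the `type` attribute is an assumption, and the Document Base URL step is omitted.

```python
import json
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit

JSON_LD = "application/ld+json"


class LoadingDocumentFailed(Exception):
    """Stands in for a JsonLdError with code "loading document failed"."""


class InvalidScriptElement(Exception):
    """Stands in for a JsonLdError with code "invalid script element"."""


class ScriptCollector(HTMLParser):
    """Collects (attributes, text content) for every script element."""

    def __init__(self):
        super().__init__()
        self.scripts = []
        self._open = None  # (attrs, text parts) of the script being read

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._open = (dict(attrs), [])

    def handle_data(self, data):
        if self._open is not None:
            self._open[1].append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._open is not None:
            attrs, parts = self._open
            self.scripts.append((attrs, "".join(parts)))
            self._open = None


def is_json_ld_script(attrs):
    # Assumption: compare only the media type, ignoring any parameters.
    media_type = (attrs.get("type") or "").split(";")[0]
    return media_type.strip().lower() == JSON_LD


def extract_script_content(source):
    """Extract Script Content: parse the text content as JSON."""
    try:
        return json.loads(source)
    except ValueError as err:
        raise InvalidScriptElement(str(err)) from err


def process_html(html, url, profile=None, extract_all_scripts=False):
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    json_ld = [s for s in collector.scripts if is_json_ld_script(s[0])]
    source = None

    fragment = unquote(urlsplit(url).fragment)
    if fragment:
        # A fragment identifier selects one script element by its id.
        match = next(
            (s for s in collector.scripts if s[0].get("id") == fragment), None
        )
        if match is None or not is_json_ld_script(match[0]):
            raise LoadingDocumentFailed("no JSON-LD script element #" + fragment)
        source = match[1]
    elif profile is not None:
        # Assumption: the profile is matched as a substring of the type attribute.
        match = next(
            (s for s in json_ld if profile in (s[0].get("type") or "")), None
        )
        if match is not None:
            source = match[1]

    if source is None and not extract_all_scripts:
        if not json_ld:
            raise LoadingDocumentFailed("no JSON-LD script element found")
        source = json_ld[0][1]

    if source is not None:
        return extract_script_content(source)

    # extractAllScripts is true: gather every JSON-LD script into one array.
    document = []
    for _attrs, text in json_ld:
        content = extract_script_content(text)
        if isinstance(content, list):
            document.extend(content)  # merge arrays onto the end
        else:
            document.append(content)
    return document
```

Calling `process_html(html, "http://example.com/doc#b")` would select only the script element whose `id` is `b`, while `extract_all_scripts=True` with no fragment returns a single array built from every JSON-LD script element in document order.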
