Commit cece5ac

Cached contexts 2 (#20)
* Move schema-version type to utils package
* reference: Make context encompass schema version and base schema
* utils: New typedef for schemas
* json-schema: Use new interface for validate
* json-schema.asd: Untabify
* t/reference: Fix call to internal function
* benchmark.lisp: Add a simple benchmark of caching
* json-schema, reference: Export more context slots and rework validate
  Make sure calling validate uses the right values whether a schema or context is passed.
* README: Add a section about contexts
1 parent 8df3c4f commit cece5ac

8 files changed, +308 -164 lines changed
README.rst

Lines changed: 61 additions & 10 deletions
@@ -20,7 +20,7 @@ json-schema is a validator for drafts 4, 6, 7, and 2019-09 of the `JSON Schema <
 
 **Drafts 4, 6, 7:**
 
-- ``$ref`` overrides any sibling keywords
+- ``$ref`` does not override any sibling keywords
 
 -------
 Example
@@ -86,6 +86,66 @@ If your data contains a top-level ``$schema`` key, you don't need to pass a sche
 Usage Notes
 -----------
 
+~~~~~~~~
+Contexts
+~~~~~~~~
+
+A context is a reusable set of state that contains all of the fetched network resources (if your schema references external resources) and resolved ids. Because a context stores all of that, you can reuse it across validations without fetching or resolving everything again.
+
+::
+    (ql:quickload '(trivial-benchmark json-schema))
+
+    (defvar *schema* (json-schema.parse:parse #P"~/Downloads/schema"))
+
+    ;; schema is the json-schema meta schema document from:
+    ;; https://json-schema.org/specification-links.html#draft-2019-09-formerly-known-as-draft-8
+
+    (defvar *context*
+      (json-schema:make-context
+       *schema*
+       :draft2019-09))
+
+    ;;; Cached
+
+    (let ((data (json-schema.parse:parse "{\"type\": \"string\"}")))
+      (trivial-benchmark:with-timing (1000)
+        (json-schema:validate data
+                              :context *context*)))
+
+    ;; -                SAMPLES  TOTAL      MINIMUM  MAXIMUM   MEDIAN    AVERAGE    DEVIATION
+    ;; REAL-TIME        1000     0.826      0        0.022     0.001     0.000826   0.000797
+    ;; RUN-TIME         1000     0.826      0        0.022     0.001     0.000826   0.0008
+    ;; USER-RUN-TIME    1000     0.781011   0        0.020644  0.000745  0.000781   0.000665
+    ;; SYSTEM-RUN-TIME  1000     0.049933   0        0.000986  0         0.00005    0.000184
+    ;; PAGE-FAULTS      1000     0          0        0         0         0          0.0
+    ;; GC-RUN-TIME      1000     0.02       0        0.02      0         0.00002    0.000632
+    ;; BYTES-CONSED     1000     213753664  195344   228976    228032    213753.66  16221.591
+    ;; EVAL-CALLS       1000     0          0        0         0         0          0.0
+
+
+    ;;; Uncached
+
+    (let ((data (json-schema.parse:parse "{\"type\": \"string\"}")))
+      (trivial-benchmark:with-timing (1000)
+        (json-schema:validate data
+                              :schema *schema*
+                              :schema-version :draft2019-09)))
+
+    ;; -                SAMPLES  TOTAL      MINIMUM   MAXIMUM   MEDIAN    AVERAGE    DEVIATION
+    ;; REAL-TIME        1000     203.185    0.148     1.471     0.185     0.203185   0.112807
+    ;; RUN-TIME         1000     9.25       0.006     0.04      0.009     0.00925    0.002294
+    ;; USER-RUN-TIME    1000     8.145081   0.003368  0.039067  0.008105  0.008145   0.002317
+    ;; SYSTEM-RUN-TIME  1000     1.107377   0         0.004927  0.000994  0.001107   0.000967
+    ;; PAGE-FAULTS      1000     0          0         0         0         0          0.0
+    ;; GC-RUN-TIME      1000     0.08       0         0.03      0         0.00008    0.001464
+    ;; BYTES-CONSED     1000     719780512  707728    751424    718160    719780.5   11026.181
+    ;; EVAL-CALLS       1000     0          0         0         0         0          0.0
+
+
+So, for this trivial example, the cached version is around 246x faster in total real time (203.185 uncached versus 0.826 cached over 1000 runs). Note, though, that json-schema resolves references lazily, so not every reference is necessarily resolved when the context is created. Contexts are mutable, and they build up state as they are used.
+
+Thank you to `Raymond Wiker <https://github.com/rwiker>`_ for contributing the initial implementation.
+
 ~~~~~~~~~~~~~
 Decoding JSON
 ~~~~~~~~~~~~~
@@ -98,12 +158,3 @@ Network access
 ~~~~~~~~~~~~~~
 
 JSON Schema allows schemas to reference other documents over the network. By default, this library fetches them automatically. To disallow this, set :variable:`json-schema.reference:*resolve-remote-references*` to ``nil``. With fetching disallowed, a schema that references a remote document raises a :class:`json-schema.reference:fetching-not-allowed-error` instead of fetching it.
-
-
-~~~~~~~~~~~~~~~
-Reusing Schemas
-~~~~~~~~~~~~~~~
-
-Because of the nature of JSON Schema's references (location-independent references, particularly), schema documents need to be walked when loaded to discover named anchors and ids. They also may load other schemas.
-
-If you're reusing a large schema document repeatedly, you might want to cache the resolution context. Unfortunately, I'm still working on this feature!
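
The Network access paragraph in the README diff above can be illustrated with a short sketch. It assumes that ``*resolve-remote-references*`` is a special variable that can be rebound with ``let``, and that ``fetching-not-allowed-error`` is a condition that propagates out of ``json-schema:validate``; neither detail is confirmed by this commit. The schema and its ``$ref`` URL are placeholders made up for illustration.

```lisp
(ql:quickload '(json-schema))

;; Hypothetical schema: the remote $ref URL is a placeholder for illustration.
(defvar *remote-schema*
  (json-schema.parse:parse
   "{\"$ref\": \"https://example.com/schemas/person.json\"}"))

;; Rebind the switch for the dynamic extent of this call only, so other
;; code that relies on remote fetching is unaffected.
(let ((json-schema.reference:*resolve-remote-references* nil))
  (handler-case
      (json-schema:validate "some data"
                            :schema *remote-schema*
                            :schema-version :draft2019-09)
    ;; Assumption: the error is signalled out of validate rather than being
    ;; collected into its list of validation errors.
    (json-schema.reference:fetching-not-allowed-error (e)
      (format t "Refused to fetch a remote schema: ~a~%" e)
      nil)))
```

Rebinding with ``let`` rather than ``setf`` keeps the restriction local to one call, which may be preferable when only some schemas come from untrusted sources.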

benchmark.lisp

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+(ql:quickload '(trivial-benchmark json-schema))
+
+(defvar *schema* (json-schema.parse:parse #P"~/Downloads/schema"))
+
+;; schema is the json-schema meta schema document from:
+;; https://json-schema.org/specification-links.html#draft-2019-09-formerly-known-as-draft-8
+
+(defvar *context*
+  (json-schema:make-context
+   *schema*
+   :draft2019-09))
+
+;;; Cached
+
+(let ((data (json-schema.parse:parse "{\"type\": \"string\"}")))
+  (trivial-benchmark:with-timing (1000)
+    (json-schema:validate data
+                          :context *context*)))
+
+;; -                SAMPLES  TOTAL      MINIMUM  MAXIMUM   MEDIAN    AVERAGE    DEVIATION
+;; REAL-TIME        1000     0.826      0        0.022     0.001     0.000826   0.000797
+;; RUN-TIME         1000     0.826      0        0.022     0.001     0.000826   0.0008
+;; USER-RUN-TIME    1000     0.781011   0        0.020644  0.000745  0.000781   0.000665
+;; SYSTEM-RUN-TIME  1000     0.049933   0        0.000986  0         0.00005    0.000184
+;; PAGE-FAULTS      1000     0          0        0         0         0          0.0
+;; GC-RUN-TIME      1000     0.02       0        0.02      0         0.00002    0.000632
+;; BYTES-CONSED     1000     213753664  195344   228976    228032    213753.66  16221.591
+;; EVAL-CALLS       1000     0          0        0         0         0          0.0
+
+
+;;; Uncached
+
+(let ((data (json-schema.parse:parse "{\"type\": \"string\"}")))
+  (trivial-benchmark:with-timing (1000)
+    (json-schema:validate data
+                          :schema *schema*
+                          :schema-version :draft2019-09)))
+
+;; -                SAMPLES  TOTAL      MINIMUM   MAXIMUM   MEDIAN    AVERAGE    DEVIATION
+;; REAL-TIME        1000     203.185    0.148     1.471     0.185     0.203185   0.112807
+;; RUN-TIME         1000     9.25       0.006     0.04      0.009     0.00925    0.002294
+;; USER-RUN-TIME    1000     8.145081   0.003368  0.039067  0.008105  0.008145   0.002317
+;; SYSTEM-RUN-TIME  1000     1.107377   0         0.004927  0.000994  0.001107   0.000967
+;; PAGE-FAULTS      1000     0          0         0         0         0          0.0
+;; GC-RUN-TIME      1000     0.08       0         0.03      0         0.00008    0.001464
+;; BYTES-CONSED     1000     719780512  707728    751424    718160    719780.5   11026.181
+;; EVAL-CALLS       1000     0          0         0         0         0          0.0
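
The benchmark above validates one document repeatedly. A minimal sketch of the more typical case, building one context and reusing it for several documents, might look like the following. It uses only calls that appear in this commit (``json-schema.parse:parse``, ``json-schema:make-context``, and ``json-schema:validate`` with ``:context``); the inline schema and the sample values are made up for illustration, and the three return values are taken from the new ``validate`` docstring.

```lisp
(ql:quickload '(json-schema))

;; Illustrative schema; any parsed schema document would do.
(defvar *string-schema*
  (json-schema.parse:parse "{\"type\": \"string\"}"))

;; Build the context once so resolved ids and fetched resources are kept.
(defvar *string-context*
  (json-schema:make-context *string-schema* :draft2019-09))

;; Reuse the same context for every document.
(dolist (data (list "hello" 42))
  (multiple-value-bind (valid-p errors context)
      (json-schema:validate data :context *string-context*)
    ;; The third value is the context that was used for this validation.
    (declare (ignorable context))
    (format t "~s: valid=~a errors=~s~%" data valid-p errors)))
```

Because contexts are mutable and resolve references lazily, the first validation against a context may do more work than later ones.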

json-schema.asd

Lines changed: 6 additions & 6 deletions
@@ -3,8 +3,8 @@
 (defsystem json-schema
   :description "JSON schema validation"
   :author "Matt Novenstern <[email protected]>"
-  :license "LGPL"
-  :version "1.0.2"
+  :license "LLGPL"
+  :version "2.0.0"
   :pathname "src"
   :components ((:file "utils")
                (:file "parse")
@@ -32,7 +32,7 @@
 
 (defsystem json-schema/json-schema-test-suite
   :depends-on ("json-schema"
-               "rove")
+               "rove")
   :pathname "t"
   :components ((:file "json-schema-test-case-helper")
                (:file "draft2019-09")
@@ -41,17 +41,17 @@
                (:file "draft4"))
   :perform (test-op (op c)
                     (declare (ignore op))
-                    (uiop:symbol-call :rove :run c)))
+                    (uiop:symbol-call :rove :run c)))
 
 (defsystem json-schema/unit-tests
   :depends-on ("json-schema"
-               "rove")
+               "rove")
   :pathname "t"
   :components ((:file "utils")
                (:file "reference"))
   :perform (test-op (op c)
                     (declare (ignore op))
-                    (uiop:symbol-call :rove :run c)))
+                    (uiop:symbol-call :rove :run c)))
 
 (defsystem json-schema/test
   :in-order-to ((test-op (test-op json-schema/json-schema-test-suite)

src/json-schema.lisp

Lines changed: 22 additions & 9 deletions
@@ -2,9 +2,12 @@
   (:use :cl :alexandria)
   (:local-nicknames (:reference :json-schema.reference)
                     (:validators :json-schema.validators))
-  (:shadowing-import-from :json-schema.validators
+  (:shadowing-import-from :json-schema.utils
                           :schema-version)
+  (:import-from #:json-schema.reference
+                #:make-context)
   (:export #:validate
+           #:make-context
            #:*schema-version*
            #:schema-version))
 
@@ -14,12 +17,22 @@
 (defparameter *schema-version* :draft7)
 
 
-(defun validate (data &key (schema-version *schema-version*) (pretty-errors-p t) schema)
-  "The primary validation function for json-schema. Takes data: which can be a simple value or an object as a hash table, and then optionally accepts a schema (if the data doesn't contain a top-level ``$schema`` key), schema version and pretty-errors-p deterimines whether the second return value is exception objects or strings of the rendered errors (strings by default)."
+(defun validate (data &key (schema-version *schema-version*) (pretty-errors-p t) schema context)
+  "The primary validation function for json-schema. Takes data: which can be a simple value or an object as a hash table, and then optionally accepts a schema (if the data doesn't contain a top-level ``$schema`` key), schema version and pretty-errors-p determines whether the second return value is exception objects or strings of the rendered errors (strings by default).
 
-  (let ((schema (or schema (json-schema.parse:parse (dex:get (json-schema.utils:object-get "$schema" data) :force-string t)))))
-    (reference:with-context ((reference:get-id-fun-for-draft schema-version))
-      (reference:with-pushed-context (schema)
-        (if-let ((errors (validators:validate schema data schema-version)))
-          (values nil (mapcar (if pretty-errors-p #'princ-to-string #'identity) errors))
-          (values t nil))))))
+The third return value is a :class:`json-schema.reference::context`, which contains all of the state stored in processing a schema including caching network resources and all of the resolved ids."
+  (assert (not (and schema context)) nil "You should only pass one of schema or context")
+
+  (let* ((schema (or schema
+                     (and context (reference:context-root-schema context))
+                     (reference:fetch-schema (json-schema.utils:object-get "$schema" data))))
+         (context (or context (reference:make-context
+                               (or schema (reference:fetch-schema (json-schema.utils:object-get "$schema" data)))
+                               schema schema-version))))
+    (reference:with-context (context)
+      (if-let ((errors (validators:validate
+                        (reference:context-root-schema context)
+                        data
+                        (reference:context-schema-version context))))
+        (values nil (mapcar (if pretty-errors-p #'princ-to-string #'identity) errors) context)
+        (values t nil context)))))
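
The argument handling in the new ``validate`` (accept either a schema or a context, never both) can be isolated in a small, self-contained sketch. The ``sketch-context`` struct and ``resolve-context`` function below are hypothetical stand-ins written for illustration, not the library's definitions, and the sketch leaves out the fallback that fetches a schema named by the data's ``$schema`` key.

```lisp
;; Hypothetical stand-in for the library's context object.
(defstruct (sketch-context
            (:constructor make-sketch-context (root-schema schema-version)))
  root-schema
  schema-version)

(defun resolve-context (&key schema context (schema-version :draft7))
  "Return CONTEXT if given, otherwise a fresh context built from SCHEMA."
  ;; Passing both would make it ambiguous which schema to validate against.
  (assert (not (and schema context)) ()
          "Pass only one of :schema or :context.")
  (or context
      (make-sketch-context schema schema-version)))

;; A supplied context is returned untouched; a bare schema gets a new
;; context with the default schema version.
(let ((ctx (make-sketch-context "cached schema" :draft2019-09)))
  (list (eq ctx (resolve-context :context ctx))
        (sketch-context-schema-version
         (resolve-context :schema "fresh schema"))))
;; => (T :DRAFT7)
```

Reading the root schema and schema version back from the context, as the diff does, means a cached context keeps validating against the draft it was built for regardless of the ``:schema-version`` default.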
