Commit 36cab57

committed
Improved docs for read_json table orient
1 parent 7a9e834 commit 36cab57

3 files changed

+60
-29
lines changed

doc/source/io.rst

+38-2
Original file line number | Diff line number | Diff line change
@@ -1648,7 +1648,7 @@ with optional parameters:
16481648

16491649
DataFrame
16501650
- default is ``columns``
1651-
- allowed values are {``split``, ``records``, ``index``, ``columns``, ``values``}
1651+
- allowed values are {``split``, ``records``, ``index``, ``columns``, ``values``, ``table``}
16521652

16531653
The format of the JSON string
16541654

@@ -1732,6 +1732,9 @@ values, index and columns. Name is also included for ``Series``:
17321732
dfjo.to_json(orient="split")
17331733
sjo.to_json(orient="split")
17341734
1735+
**Table oriented** serializes to the JSON `Table Schema`_, allowing for the
1736+
preservation of metadata including but not limited to dtypes and index names.
1737+
17351738
.. note::
17361739

17371740
Any orient option that encodes to a JSON object will not preserve the ordering of
@@ -1848,6 +1851,7 @@ is ``None``. To explicitly force ``Series`` parsing, pass ``typ=series``
18481851
``values``; just the values array
18491852
``table``; adhering to the JSON `Table Schema`_
18501853

1854+
18511855
- ``dtype`` : if True, infer dtypes, if a dict of column to dtype, then use those, if False, then don't infer dtypes at all, default is True, apply only to the data
18521856
- ``convert_axes`` : boolean, try to convert the axes to the proper dtypes, default is True
18531857
- ``convert_dates`` : a list of columns to parse for dates; If True, then try to parse date-like columns, default is True
@@ -2203,7 +2207,39 @@ A few notes on the generated table schema:
22032207
then ``level_<i>`` is used.
22042208

22052209

2206-
_Table Schema: http://specs.frictionlessdata.io/json-table-schema/
2210+
.. versionadded:: 0.23.0
2211+
2212+
``read_json`` also accepts ``orient='table'`` as an argument. This allows for
2213+
the preservation of metadata such as dtypes and index names in a
2214+
round-trippable manner.
2215+
2216+
.. ipython:: python
2217+
2218+
df = pd.DataFrame({'foo': [1, 2, 3, 4],
2219+
'bar': ['a', 'b', 'c', 'd'],
2220+
'baz': pd.date_range('2018-01-01', freq='d', periods=4),
2221+
'qux': pd.Categorical(['a', 'b', 'c', 'c'])
2222+
}, index=pd.Index(range(4), name='idx'))
2223+
df
2224+
df.dtypes
2225+
2226+
df.to_json('test.json', orient='table')
2227+
new_df = pd.read_json('test.json', orient='table')
2228+
new_df
2229+
new_df.dtypes
2230+
2231+
Please note that the string ``index`` is not supported as an index name with the
2232+
round trip format, as ``DataFrame.to_json`` uses it by default to indicate a
2233+
missing index name.
2234+
2235+
.. ipython:: python
2236+
2237+
df.index.name = 'index'
2238+
df.to_json('test.json', orient='table')
2239+
new_df = pd.read_json('test.json', orient='table')
2240+
print(new_df.index.name)
2241+
2242+
.. _Table Schema: http://specs.frictionlessdata.io/json-table-schema/
22072243

22082244
HTML
22092245
----

doc/source/whatsnew/v0.23.0.txt

+13-27
Original file line number | Diff line number | Diff line change
@@ -148,7 +148,7 @@ Current Behavior
148148
.. _whatsnew_0230.enhancements.round-trippable_json:
149149

150150
JSON read/write round-trippable with ``orient='table'``
151-
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
151+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
152152

153153
A ``DataFrame`` can now be written to and subsequently read back via JSON while preserving metadata through usage of the ``orient='table'`` argument (see :issue:`18912` and :issue:`9146`). Previously, none of the available ``orient`` values guaranteed the preservation of dtypes and index names, amongst other metadata.
154154

@@ -160,35 +160,21 @@ A ``DataFrame`` can now be written to and subsequently read back via JSON while
160160
'qux': pd.Categorical(['a', 'b', 'c', 'c'])
161161
}, index=pd.Index(range(4), name='idx'))
162162
df
163+
df.dtypes
164+
df.to_json('test.json', orient='table')
165+
new_df = pd.read_json('test.json', orient='table')
166+
new_df
167+
new_df.dtypes
163168

164-
Previous Behavior:
165-
166-
.. code-block:: ipython
167-
168-
In [17]: df.to_json("test.json", orient='columns')
169-
In [17]: pd.read_json("test.json", orient='columns')
170-
Out[18]:
171-
bar baz foo qux
172-
0 a 1514764800000 1 a
173-
1 b 1514851200000 2 b
174-
2 c 1514937600000 3 c
175-
3 d 1515024000000 4 c
176-
177-
Current Behavior:
178-
179-
.. code-block:: ipython
169+
Please note that the string ``index`` is not supported as an index name with the round trip format, as ``DataFrame.to_json`` uses it by default to indicate a missing index name.
180170

181-
In [29]: df.to_json("test.json", orient='table')
182-
In [30]: pd.read_json("test.json", orient='table')
183-
Out[30]:
184-
bar baz foo qux
185-
idx
186-
0 a 2018-01-01 1 a
187-
1 b 2018-01-02 2 b
188-
2 c 2018-01-03 3 c
189-
3 d 2018-01-04 4 c
171+
.. ipython:: python
190172

191-
Please note that the string `index` is not supported with the round trip format, as it is used by default in ``write_json`` to indicate a missing index name.
173+
df.index.name = 'index'
174+
df.to_json('test.json', orient='table')
175+
new_df = pd.read_json('test.json', orient='table')
176+
new_df
177+
print(new_df.index.name)
192178

193179
.. _whatsnew_0230.enhancements.other:
194180

pandas/io/json/json.py

+9
Original file line number | Diff line number | Diff line change
@@ -339,6 +339,15 @@ def read_json(path_or_buf=None, orient=None, typ='frame', dtype=True,
339339
-------
340340
result : Series or DataFrame, depending on the value of `typ`.
341341
342+
Notes
343+
-----
344+
Specific to ``orient='table'``, if a ``DataFrame`` with a literal ``Index``
345+
name of `index` gets written with ``DataFrame.to_json``, the subsequent read
346+
operation will incorrectly set the ``Index`` name to ``None``. This is
347+
because `index` is also used by ``DataFrame.to_json`` to denote a missing
348+
``Index`` name, and the subsequent ``read_json`` operation cannot
349+
distinguish between the two.
350+
342351
See Also
343352
--------
344353
DataFrame.to_json
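
The round trip documented in this commit, and the ``index`` caveat from the new Notes section, can be sketched as follows. This is a minimal illustration and not part of the commit: the frame contents and variable names are made up, an in-memory ``StringIO`` buffer stands in for ``test.json``, and it assumes a pandas version whose ``read_json`` supports ``orient='table'`` (0.23.0 or later).

```python
from io import StringIO

import pandas as pd

# Illustrative frame with a named index (names and values are hypothetical).
df = pd.DataFrame(
    {"foo": [1, 2, 3], "bar": ["a", "b", "c"]},
    index=pd.Index(range(3), name="idx"),
)

# Serialize with the Table Schema layout, so the schema travels with the data,
# then read it back with the same orient.
payload = df.to_json(orient="table")
new_df = pd.read_json(StringIO(payload), orient="table")
print(new_df.index.name)

# Caveat: an index literally named "index" collides with the placeholder
# that to_json writes for an unnamed index, so the name is lost on read.
renamed = df.rename_axis("index")
payload = renamed.to_json(orient="table")
round_tripped = pd.read_json(StringIO(payload), orient="table")
print(round_tripped.index.name)
```

Per the Notes section above, the first ``print`` should show ``idx`` and the second ``None``. Renaming the index to something other than ``index`` before writing is one way to avoid the collision.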
