read_json ValueError: Value is too big #14530

Closed
incognick opened this issue Oct 28, 2016 · 2 comments
Labels
IO JSON read_json, to_json, json_normalize Usage Question

Comments

@incognick

incognick commented Oct 28, 2016

Loading a JSON file with large integers results in "ValueError: Value is too big". The id in the example below is larger than 2^63 - 1, so it only fits in an unsigned 64-bit integer. I have tried changing the orient to "records" and also passing in dtype={'id': numpy.dtype('uint64')}. The error is the same.

import pandas
data = pandas.read_json('''{"id": 10254939386542155531}''')
print(data.describe())

Expected Output

                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1

Actual Output (even with dtype passed in)

 File "./parse_dispatch_table.py", line 34, in <module>
    print(pandas.read_json('''{"id": 10254939386542155531}''', dtype=dtype_conversions).describe())
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 234, in read_json
    date_unit).parse()
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 302, in parse
    self._parse_no_numpy()
  File "/users/XXX/.local/lib/python3.4/site-packages/pandas/io/json.py", line 519, in _parse_no_numpy
    loads(json, precise_float=self.precise_float), dtype=None)
ValueError: Value is too big
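
For context, a quick range check on the id from the traceback above. This is only a sketch: it assumes the parser is limited to the signed 64-bit range, which the thread suggests but does not state outright.

```python
# Bounds of the 64-bit integer types (assumption: the parser is limited to int64).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1
UINT64_MAX = 2**64 - 1

value = 10254939386542155531  # the id from the report

fits_int64 = INT64_MIN <= value <= INT64_MAX
fits_uint64 = 0 <= value <= UINT64_MAX

print(fits_int64, fits_uint64)  # False True
```

So the value is representable as uint64 but not as int64, which would be consistent with the error appearing before any requested dtype is applied.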

No problem using read_csv:

import pandas
import io
print(pandas.read_csv(io.StringIO('''id\n10254939386542155531''')).describe())

Output using read_csv

                          id
count                      1
unique                     1
top     10254939386542155531
freq                       1

Output of pd.show_versions()

## INSTALLED VERSIONS

commit: None
python: 3.4.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 28.6.0
Cython: None
numpy: 1.11.2
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None

@jreback
Contributor

jreback commented Oct 28, 2016

That's not valid JSON. Your numbers should be quoted if they are in fact not numbers. That value is out of range and should raise.

In [34]: pd.read_json('''{"id": "10254939386542155531"}''', dtype=object, orient='record', typ='series')
Out[34]: 
id    10254939386542155531
dtype: object
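
If the JSON is produced by code you control, one way to get quoted ids like the one above is to stringify oversized integers before serializing. A minimal sketch using only the standard library; `quote_big_ints` is a hypothetical helper name, and the int64 threshold is an assumption about what the reader can handle.

```python
import json

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def quote_big_ints(obj):
    """Return a copy of obj with ints outside the int64 range turned into strings."""
    if isinstance(obj, bool):  # bool is a subclass of int; leave it alone
        return obj
    if isinstance(obj, int):
        return obj if INT64_MIN <= obj <= INT64_MAX else str(obj)
    if isinstance(obj, dict):
        return {key: quote_big_ints(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [quote_big_ints(item) for item in obj]
    return obj

payload = {"id": 10254939386542155531, "count": 5}
text = json.dumps(quote_big_ints(payload))
print(text)  # {"id": "10254939386542155531", "count": 5}
```

Small integers stay numeric, so only the values that would otherwise be rejected change type.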

jreback closed this as completed Oct 28, 2016
jreback added the IO JSON read_json, to_json, json_normalize label Oct 28, 2016
jreback added this to the No action milestone Oct 28, 2016
@jxramos

jxramos commented Jan 7, 2019

I'm running into this same issue, where a 64-bit integer is being used as an id. Is there any workaround for overriding the parsed type? It would have been nice if the dtype specification drove an override, but type coercion presumably occurs only after the values have been loaded with their default inferred types.

This comes up when collecting a system log archive on macOS High Sierra from a bash shell and then rendering it to text with JSON styling:

log collect
log show --style json  > ~/syslogarchive.json


python
>import pandas
>dfSysLog = pandas.read_json( '~/syslogarchive.json' )
...
ValueError: Value is too big
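
One possible workaround, offered as a sketch and not as anything the pandas maintainers recommend in this thread: parse the file with Python's standard-library `json` module, whose integers are arbitrary precision, and build the frame from the parsed records. `load_json_exact` and `to_frame` are hypothetical helper names, and the code assumes the file holds either a list of objects or a single object.

```python
import json
import os

def load_json_exact(path):
    """Load a JSON file with the standard-library parser, keeping big ints exact."""
    with open(os.path.expanduser(path), encoding="utf-8") as handle:
        data = json.load(handle)
    # Normalize a single top-level object to a one-element list of records.
    return data if isinstance(data, list) else [data]

def to_frame(records):
    """Build a DataFrame from parsed records (pandas is imported lazily)."""
    import pandas as pd
    return pd.DataFrame(records)
```

Usage would look like `to_frame(load_json_exact('~/syslogarchive.json'))`. The dtype pandas picks for the id column (uint64 or object) may depend on the pandas version, so check it before doing arithmetic on the column.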
