The example code below is a variant of this comment. The problem is that pandas can't infer the type when the columns are all null. I would like to be able to use Postgres' NaN value if possible, but it seems that pandas passes None to SQLAlchemy instead of float('NaN'). I would rather not cast the column's dtype to numeric after the fact. From the previous link I know it's possible to create Postgres NaN values from SQLAlchemy. Is it possible pandas could support this?
tmaloney@vm-data-1:/var/tmp$ sudo -u postgres psql -c "drop database test;"
DROP DATABASE
tmaloney@vm-data-1:/var/tmp$ cat test.sql
create database test owner tmaloney;
tmaloney@vm-data-1:/var/tmp$ sudo -u postgres psql < /var/tmp/test.sql
CREATE DATABASE
tmaloney@vm-data-1:/var/tmp$ cat test2.py
#!/usr/bin/env python
'''
Creates a test python db to play with
'''
import sqlalchemy as sa
import pandas as pd
from pandas.util.testing import assert_frame_equal
CON_URL = 'postgres://tmaloney:secret@localhost/test'
db = sa.create_engine(CON_URL)
meta = sa.MetaData()
test_table = sa.Table(
    'test_float', meta,
    # A table contains design id and blocks
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('float_val', sa.Float())
)
meta.drop_all(db)
meta.create_all(db)
data = [
    {'float_val': float('nan')},
    {'float_val': float('NaN')},
]
df = pd.DataFrame(data)
engine = sa.create_engine(CON_URL)
df.to_sql('test_float', engine, if_exists='append', index=False)
returned_df = pd.read_sql_table('test_float', engine)
assert_frame_equal(df, returned_df[['float_val']])
returned_df = pd.read_sql('select * from test_float', engine)
assert_frame_equal(df, returned_df[['float_val']])
tmaloney@vm-data-1:/var/tmp$ python test2.py
Traceback (most recent call last):
File "test2.py", line 37, in <module>
assert_frame_equal(df, returned_df[['float_val']])
File "/usr/local/lib/python2.7/dist-packages/pandas/util/testing.py", line 746, in assert_frame_equal
check_names=check_names)
File "/usr/local/lib/python2.7/dist-packages/pandas/util/testing.py", line 675, in assert_series_equal
assert_attr_equal('dtype', left, right)
File "/usr/local/lib/python2.7/dist-packages/pandas/util/testing.py", line 552, in assert_attr_equal
assert_equal(left_attr,right_attr,"attr is not equal [{0}]" .format(attr))
File "/usr/local/lib/python2.7/dist-packages/pandas/util/testing.py", line 533, in assert_equal
assert a == b, "%s: %r != %r" % (msg.format(a,b), a, b)
AssertionError: attr is not equal [dtype]: dtype('float64') != dtype('O')
The first assertion works, the second one does not.
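The dtype mismatch in the traceback can be illustrated without a database. This is only a sketch: the `received` frame is a hand-built stand-in for what a driver returns when every value is SQL NULL, not actual output from Postgres or SQLAlchemy.

```python
import pandas as pd

# What the script writes: a float64 column holding only NaN.
sent = pd.DataFrame({'float_val': [float('nan'), float('nan')]})

# Stand-in (an assumption) for what comes back when every value is SQL NULL:
# plain Python None objects, so the column has dtype object.
received = pd.DataFrame({'float_val': [None, None]}, dtype=object)

print(sent['float_val'].dtype)
print(received['float_val'].dtype)

# Both columns are entirely "missing"; only the dtype differs,
# which is what assert_frame_equal complains about.
same_missing = sent['float_val'].isna().equals(received['float_val'].isna())
```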
The problem is that such float values are not supported for all column types, nor for all database flavors. In this regard, the approach with None is generic and works in all cases.
For now, we have followed the flavor-agnostic path in pandas and delegated flavor-specific things to sqlalchemy. And I am not sure to what extent we want to deviate from this path.
Another issue is that NaN in pandas is used not only as Not-a-Number (its actual meaning), but in practice also to represent a 'missing value', for which NULL (and thus None on the python side) is better suited, I think.
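To make the 'missing value' reading concrete, here is a rough sketch of mapping NaN to None before rows are handed to a database driver. `nan_to_none` is a hypothetical helper written for illustration; it is not how pandas implements this internally.

```python
import math

def nan_to_none(rows):
    """Hypothetical helper: replace float NaN with None in a list of row dicts."""
    return [
        {key: (None if isinstance(val, float) and math.isnan(val) else val)
         for key, val in row.items()}
        for row in rows
    ]

rows = [{'float_val': float('nan')}, {'float_val': 1.5}]
cleaned = nan_to_none(rows)
print(cleaned)
```

With None in place of NaN, the driver stores NULL, which every column type and flavor accepts.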
Anyway, if we wanted to add this functionality, we would have to think about the API, as I wouldn't change the default.
There should possibly also be an option to convert None into NaN when loading data using read_sql, even when all rows contain NaNs, so that columns are not converted to dtype object in that case; see #6798.
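Until such an option exists, the conversion can be done by hand after loading. The sketch below assumes an in-memory SQLite database as a stand-in for Postgres, and `read_as_float` is a hypothetical wrapper, not a pandas function; it still casts after the fact, which the original report wanted to avoid.

```python
import sqlite3
import pandas as pd

def read_as_float(query, con, float_columns):
    """Hypothetical helper: run read_sql, then force the named columns to float64."""
    df = pd.read_sql(query, con)
    for col in float_columns:
        # None becomes NaN when an object column is cast to float64.
        df[col] = df[col].astype('float64')
    return df

# SQLite stands in for Postgres here so the sketch is self-contained.
con = sqlite3.connect(':memory:')
con.execute('CREATE TABLE test_float (id INTEGER PRIMARY KEY, float_val REAL)')
con.executemany('INSERT INTO test_float (float_val) VALUES (?)', [(None,), (None,)])
con.commit()

returned_df = read_as_float('SELECT * FROM test_float', con, ['float_val'])
print(returned_df['float_val'].dtype)
con.close()
```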