SyntaxError on HDF queries where right-hand contains string delimiter #6901

Closed
@ariddell

If an index value is a string that contains a ' or a ", there seems to be no consistent way to query for it in an HDFStore on disk. The following is real-ish data from the Wikipedia pagecounts dataset. This worked in 0.12 but raises a SyntaxError in 0.13.

Easy to reproduce. This is test.csv:

title,hits
Al Lawson',30
'Blind' Willie McTell,20

import pandas as pd

df = pd.read_csv('test.csv', index_col=0)
store = pd.HDFStore('test.h5')
store.append('df', df)

# bug: SyntaxError
title = """Al Lawson'"""
result = store.select('df', pd.Term('index', '=', title))
# the query string here becomes: index == Al Lawson'
result = store.select('df', 'index == {}'.format(title))

# fancy escaping helps, but it isn't a cure-all
# the query string here becomes: index == """Al Lawson'"""
result = store.select('df', 'index == """{}"""'.format(title))
# SyntaxError: the query string here becomes: index == '''Al Lawson''''
result = store.select('df', "index == '''{}'''".format(title))
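To show why no single hand-written quoting style covers every title, here is a minimal sketch that only checks whether each interpolated query string is valid Python syntax. It assumes the query string is parsed as a Python expression, which the SyntaxError suggests, and it does not touch HDFStore at all. The helper names `parses_as_expression` and `build_queries` are made up for illustration, and the third title is an invented example with double quotes. The `{!r}` variant quotes the value with `repr`; whether `store.select` in 0.13 accepts that form is not confirmed here.

```python
import ast


def parses_as_expression(expr):
    """Return True if `expr` is syntactically valid as a Python expression."""
    try:
        ast.parse(expr, mode="eval")
        return True
    except SyntaxError:
        return False


def build_queries(title):
    """Build candidate query strings for one index value (hypothetical helper)."""
    return {
        "bare": "index == {}".format(title),
        "triple_double": 'index == """{}"""'.format(title),
        "triple_single": "index == '''{}'''".format(title),
        "repr": "index == {!r}".format(title),
    }


# The first two titles come from test.csv; the third is an invented example.
titles = ["Al Lawson'", "'Blind' Willie McTell", 'Say "Hi"']

# Map each title to {quoting style: does the query string parse?}
results = {
    title: {
        name: parses_as_expression(query)
        for name, query in build_queries(title).items()
    }
    for title in titles
}

for title, outcome in results.items():
    print(repr(title), outcome)
```

Under these assumptions, each triple-quoted form fails for a title that ends in the same quote character, while the `repr` form yields a parseable string literal for all three titles, since `repr` escapes whichever quote it chooses as the delimiter.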
