BUG: Accept unicode quotechars again in pd.read_csv #14492

Closed

gfyoung
Member

@gfyoung gfyoung commented Oct 25, 2016

Title is self-explanatory. Affects Python 2.x only. Closes #14477.

@jorisvandenbossche jorisvandenbossche added Bug IO CSV read_csv, to_csv labels Oct 25, 2016
@jorisvandenbossche jorisvandenbossche added this to the 0.19.1 milestone Oct 25, 2016
@jorisvandenbossche
Member

@gfyoung I think there is already compat.text_type that does this (capturing unicode/str for py2/py3)
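
For readers unfamiliar with the pattern mentioned above, it can be sketched in plain Python. This is only an illustration and not the code in this pull request: `text_type` is a local stand-in for an alias like `compat.text_type`, and `validate_quotechar` is a hypothetical helper name.

```python
import sys

# Local stand-in for a py2/py3 text alias such as compat.text_type (assumption):
# `unicode` on Python 2, `str` on Python 3.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (only evaluated on Python 2)


def validate_quotechar(quotechar):
    """Hypothetical check: accept None or any text/bytes string, reject the rest."""
    if quotechar is None:
        return None
    if not isinstance(quotechar, (str, text_type, bytes)):
        raise TypeError('"quotechar" must be a string, not {0}'.format(
            type(quotechar).__name__))
    return quotechar
```

On Python 2, checking against `str` alone would reject `u'"'`; including the text alias in the `isinstance` tuple lets both kinds of string through.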

@gfyoung gfyoung force-pushed the quotechar-unicode-2.x branch from 9a31321 to 814746b Compare October 25, 2016 20:50
@gfyoung
Member Author

gfyoung commented Oct 25, 2016

@jorisvandenbossche : Ah, good catch. Fixed.

@gfyoung gfyoung force-pushed the quotechar-unicode-2.x branch from 814746b to 1d3a3d7 Compare October 25, 2016 20:52
@gfyoung
Member Author

gfyoung commented Oct 25, 2016

@jorisvandenbossche : Could you cancel Travis builds #170552837 and #170586342? Thanks!

@codecov-io

codecov-io commented Oct 25, 2016

Current coverage is 85.26% (diff: 100%)

Merging #14492 into master will increase coverage by <.01%

@@             master     #14492   diff @@
==========================================
  Files           140        140          
  Lines         50670      50672     +2   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43205      43207     +2   
  Misses         7465       7465          
  Partials          0          0          

Powered by Codecov. Last update e3d943d...ec9f59a

@gfyoung gfyoung force-pushed the quotechar-unicode-2.x branch from 1d3a3d7 to 6a47510 Compare October 25, 2016 21:18
@gfyoung
Member Author

gfyoung commented Oct 25, 2016

@jorisvandenbossche : Also #170586654 if possible, as I figured out what was causing the 2.7 machine to fail. The 3.4 failure is unrelated.

@jreback
Contributor

jreback commented Oct 25, 2016

Looks fine to me; try with an actual unicode quotechar though, e.g.:

In [1]: "\u0394"
Out[1]: 'Δ'

@gfyoung gfyoung force-pushed the quotechar-unicode-2.x branch from 6a47510 to 523412b Compare October 26, 2016 00:56
@jorisvandenbossche
Member

jorisvandenbossche commented Oct 26, 2016

Actual unicode quotechars do fail (see the tests). But I don't think we need to support them; supporting ASCII chars that are passed as unicode is enough IMO.
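
To make the distinction concrete: a quotechar such as `u'"'` is a unicode object on Python 2 but still a single ASCII character, whereas `u'\u0394'` is not. A minimal sketch of that check follows; `is_ascii_char` is a hypothetical helper, not part of pandas.

```python
def is_ascii_char(value):
    """Return True if `value` is exactly one character in the ASCII range."""
    return len(value) == 1 and ord(value) < 128


print(is_ascii_char(u'"'))       # True
print(is_ascii_char(u"\u0394"))  # False
```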

@gfyoung
Member Author

gfyoung commented Oct 26, 2016

@jorisvandenbossche : Agreed. The current failure is because of Python 2.x. I modified the test so that the last check is only done on Python 3.x.

@gfyoung gfyoung force-pushed the quotechar-unicode-2.x branch from 523412b to ec9f59a Compare October 26, 2016 13:34
@jreback
Contributor

jreback commented Oct 26, 2016

lgtm. thanks.

@jreback jreback closed this in 6130e77 Oct 26, 2016
@gfyoung gfyoung deleted the quotechar-unicode-2.x branch October 27, 2016 00:30
@jorisvandenbossche
Member

@gfyoung thanks!

jorisvandenbossche pushed a commit to jorisvandenbossche/pandas that referenced this pull request Nov 2, 2016
…d.read_csv

Title is self-explanatory.  Affects Python 2.x only.  Closes pandas-dev#14477.

Author: gfyoung

Closes pandas-dev#14492 from gfyoung/quotechar-unicode-2.x and squashes the following commits:

ec9f59a [gfyoung] BUG: Accept unicode quotechars again in pd.read_csv

(cherry picked from commit 6130e77)
yarikoptic added a commit to neurodebian/pandas that referenced this pull request Nov 18, 2016
Version 0.19.1

* tag 'v0.19.1': (43 commits)
  RLS: v0.19.1
  DOC: update whatsnew/release notes for 0.19.1 (pandas-dev#14573)
  [Backport pandas-dev#14545] BUG/API: Index.append with mixed object/Categorical indices (pandas-dev#14545)
  DOC: rst fixes
  [Backport pandas-dev#14567] DEPR: add deprecation warning for com.array_equivalent (pandas-dev#14567)
  [Backport pandas-dev#14551] PERF: casting loc to labels dtype before searchsorted (pandas-dev#14551)
  [Backport pandas-dev#14536] BUG: DataFrame.quantile with NaNs (GH14357) (pandas-dev#14536)
  [Backport pandas-dev#14520] BUG: don't close user-provided file handles in C parser (GH14418) (pandas-dev#14520)
  [Backport pandas-dev#14392] BUG: Dataframe constructor when given dict with None value (pandas-dev#14392)
  [Backport pandas-dev#14514] BUG: Don't parse inline quotes in skipped lines (pandas-dev#14514)
  [Backport pandas-dev#14543] BUG: tseries ceil doc fix (pandas-dev#14543)
  [Backport pandas-dev#14541] DOC: Simplify the gbq integration testing procedure for contributors (pandas-dev#14541)
  [Backport pandas-dev#14527] BUG/ERR: raise correct error when sql driver is not installed (pandas-dev#14527)
  [Backport pandas-dev#14501] BUG: fix DatetimeIndex._maybe_cast_slice_bound for empty index (GH14354) (pandas-dev#14501)
  [Backport pandas-dev#14442] DOC: Expand on reference docs for read_json() (pandas-dev#14442)
  BLD: fix 3.4 build for cython to 0.24.1
  [Backport pandas-dev#14492] BUG: Accept unicode quotechars again in pd.read_csv
  [Backport pandas-dev#14496] BLD: Support Cython 0.25
  [Backport pandas-dev#14498] COMPAT/TST: fix test for range testing of negative integers to neg powers
  [Backport pandas-dev#14476] PERF: performance regression in Series.asof (pandas-dev#14476)
  ...

4 participants