add support for specifying secondary indexes with to_sql #12904

trbs · 2016-04-15T16:26:04Z

Adds support for creating secondary indexes in to_sql via a new indexes parameter.

The code also tries to avoid creating duplicate indexes when keys is specified in SQLTable.
(This will become useful once the PR for passing keys is merged.)

This PR introduces a new method in SQLTable called _is_column_indexed so that subclasses can easily override or change the logic.

Interesting future work will be to also support creating compound indexes.
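
The duplicate-avoidance idea in the description can be sketched roughly as follows. This is not the code from the PR: it uses only the standard-library sqlite3 module, and the helper names is_column_indexed and create_secondary_indexes, the ix_<table>_<column> naming scheme, and the trades table are all assumptions made for illustration.

```python
import sqlite3


def is_column_indexed(con, table, column):
    """Return True if `column` is the sole column of an existing index on `table`.

    Hypothetical helper, loosely modelled on the _is_column_indexed idea.
    """
    # PRAGMA index_list yields one row per index; field 1 is the index name.
    for index_row in con.execute('PRAGMA index_list("{}")'.format(table)).fetchall():
        index_name = index_row[1]
        # PRAGMA index_info yields one row per indexed column; field 2 is its name.
        cols = [r[2] for r in con.execute('PRAGMA index_info("{}")'.format(index_name))]
        if cols == [column]:
            return True
    return False


def create_secondary_indexes(con, table, indexes):
    """Create a single-column, non-unique index for each name in `indexes`,
    skipping columns that already have one. Returns the columns newly indexed."""
    created = []
    for column in indexes:
        if is_column_indexed(con, table, column):
            continue  # avoid a duplicate index
        con.execute('CREATE INDEX "ix_{0}_{1}" ON "{0}" ("{1}")'.format(table, column))
        created.append(column)
    return created


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT, price REAL)")
# Pretend "symbol" was already indexed, e.g. because it was passed as a key.
con.execute('CREATE INDEX "ix_trades_symbol" ON "trades" ("symbol")')

created = create_secondary_indexes(con, "trades", ["symbol", "price"])
print(created)
```

Only price gets a new index here, because symbol already has one. Keeping the check in its own function mirrors the stated motivation for _is_column_indexed: a subclass for another database could swap in its own lookup.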

jreback · 2016-04-16T01:37:11Z

pandas/io/sql.py

@@ -516,7 +516,8 @@ def read_sql(sql, con, index_col=None, coerce_float=True, params=None,


 def to_sql(frame, name, con, flavor='sqlite', schema=None, if_exists='fail',
-           index=True, index_label=None, chunksize=None, dtype=None):
+           index=True, index_label=None, chunksize=None, dtype=None,
+           indexes=None):


would need to be added to the doc-string with a versionadded directive

Can this be included in v0.18.1, or will it be for v0.19.0?

I will also update the corresponding whatsnew/.txt file.
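
For reference, a numpydoc-style parameter entry with a versionadded directive could look roughly like this sketch. The function to_sql_stub is a stand-in and not pandas code, and X.Y.Z is a placeholder because the target release was still undecided at this point in the thread.

```python
def to_sql_stub(frame, name, con, indexes=None):
    """
    Stand-in for the signature under review (not the pandas implementation).

    Parameters
    ----------
    indexes : list of str, optional
        Column names for which a database index is created.

        .. versionadded:: X.Y.Z
    """
    # Documentation sketch only; there is deliberately no behavior here.
    raise NotImplementedError("documentation sketch only")


doc = to_sql_stub.__doc__
print(doc)
```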

jreback · 2016-04-16T01:38:18Z

is there an open issue for this? (IIRC there is)

trbs · 2016-04-18T15:35:53Z

I didn't find this exact issue.

#7984 and #9084 are similar in that they also deal with to_sql / SQL indexes, but neither is specifically about choosing which columns should be indexed.

I am interested in maybe picking up #7984 as well after this PR, together with a PR for specifying multi-column / compound indexes.

jreback · 2016-04-18T15:37:38Z

hmm @trbs this PR looks like it should close those 2 issues, or is there something that is indicated in those that is not in your PR?

trbs · 2016-04-18T15:43:37Z

#7984 is about unique indexes... this PR would only add a non-unique index to the column. I could add that to this PR.

#9084 is basically about adding more parameters to get_schema(...) so that it includes the index=... and index_label=... parameters that to_sql has. (So it's about the DataFrame index, not database indexes.)

I could do the patch for #9084 as well since it looks kinda trivial. But it's probably better to do that in a separate PR?
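
To make the distinction with #7984 concrete, here is a small sketch of how a plain secondary index and a unique index behave differently. It uses only the standard-library sqlite3 module; the users table and the index names are made up for illustration and come from neither the PR nor the linked issues.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (email TEXT, city TEXT)")

# Plain secondary index: speeds up lookups, allows repeated values.
con.execute("CREATE INDEX ix_users_city ON users (city)")
# Unique index: additionally rejects repeated values.
con.execute("CREATE UNIQUE INDEX ux_users_email ON users (email)")

con.execute("INSERT INTO users VALUES ('a@example.com', 'Oslo')")
con.execute("INSERT INTO users VALUES ('b@example.com', 'Oslo')")  # repeated city is fine

try:
    con.execute("INSERT INTO users VALUES ('a@example.com', 'Bergen')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    # The unique index on email refuses the repeated address.
    duplicate_rejected = True

row_count = con.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(duplicate_rejected, row_count)
```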

jreback · 2016-04-18T15:49:40Z

@jorisvandenbossche

jreback · 2016-05-20T13:32:06Z

status? @jorisvandenbossche

jreback · 2016-09-09T22:44:35Z

can you rebase / update?

* github.com:pydata/pandas: (554 commits)
  BUG: compat with Stata ver 111
  Fix: F999 dictionary key '2000q4' repeated with different values (pandas-dev#14198)
  BLD: Test for Python 3.5 with C locale
  BUG: DatetimeTZBlock can't assign values near dst boundary
  BUG: union_categorical with Series and cat idx
  BUG: fix str.contains for series containing only nan values
  BUG: Categorical constructor not idempotent with ext dtype
  TST: Make encoded sep check more locale sensitive (pandas-dev#14161)
  DOC: minor typo in 0.19.0 whatsnew file (pandas-dev#14185)
  BUG: fix tz-aware datetime convert to DatetimeIndex (GH 14088)
  BUG : bug in setting a slice of a Series with a np.timedelta64
  RLS: v0.19.0rc1
  DOC: clean-up 0.19.0 whatsnew file (pandas-dev#14176)
  DOC: cleanup build warnings (pandas-dev#14172)
  Add steps to run gbq integration testing to the contributing docs (pandas-dev#14144)
  ENH: concat and append now can handle unordered categories (pandas-dev#13767)
  DEPR: Deprecate pandas.core.datetools (pandas-dev#14105)
  API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) (pandas-dev#14164)
  Fix trivial typo in comment (pandas-dev#14174)
  API/DEPR: Remove +/- as setops for Index (GH8227) (pandas-dev#14127)
  ...

codecov-io · 2016-09-12T05:11:37Z

Current coverage is 85.24% (diff: 100%)

Merging #12904 into master will increase coverage by <.01%

@@             master     #12904   diff @@
==========================================
  Files           140        140          
  Lines         50563      50575    +12   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43103      43114    +11   
- Misses         7460       7461     +1   
  Partials          0          0

Powered by Codecov. Last update 54ab5be...a092198

jreback · 2016-12-06T23:32:27Z

pandas/core/generic.py

@@ -1158,7 +1158,8 @@ def to_msgpack(self, path_or_buf=None, encoding='utf-8', **kwargs):
                                  **kwargs)

    def to_sql(self, name, con, flavor=None, schema=None, if_exists='fail',
-               index=True, index_label=None, chunksize=None, dtype=None):
+               index=True, index_label=None, chunksize=None, dtype=None,
+               indexes=None):
        """
        Write records stored in a DataFrame to a SQL database.


are the params documented here?

jreback · 2016-12-06T23:32:40Z

pandas/io/sql.py

+    indexes : list of column name(s). Columns names in this list will have
+        an indexes created for them in the database.
+
+        .. versionadded:: 0.18.2


jreback · 2016-12-06T23:33:05Z

@jorisvandenbossche status?

trbs · 2016-12-07T23:23:50Z

btw happy to rebase again if Pandas wants to merge this PR.

jreback · 2017-02-01T20:52:39Z

closing as stale

add support for specifying secondary indexes with to_sql

d164182

jreback reviewed Apr 16, 2016

jreback added the Enhancement and IO SQL (to_sql, read_sql, read_sql_query) labels Apr 16, 2016

trbs added 2 commits September 11, 2016 23:47

add comments and versionadded

82a0118

jreback reviewed Dec 6, 2016


jreback closed this Feb 1, 2017
