add support for specifying secondary indexes with to_sql #12904

Closed
wants to merge 3 commits into from

Conversation

trbs

@trbs trbs commented Apr 15, 2016

Support for creating secondary indexes in to_sql via the parameter indexes.

The code also tries to avoid creating duplicate indexes when keys is specified in SQLTable.
(This would become useful once the PR for passing the keys is merged.)

This PR introduces a new method in SQLTable called _is_column_indexed so that subclasses can easily override or change the logic.

Interesting future work will be to also support creating compound indexes.
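
The duplicate-avoidance idea described above can be illustrated outside pandas. What follows is only a sketch using the standard-library sqlite3 module, not the code from this PR: the helper names `is_column_indexed` and `create_secondary_indexes`, the `ix_<table>_<column>` naming scheme, and the example table are all assumptions made for illustration.

```python
import sqlite3


def _quote(identifier):
    """Quote an SQLite identifier, doubling any embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def is_column_indexed(conn, table, column):
    """Return True if `column` leads the primary key or any index on `table`."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    table_rows = conn.execute(f"PRAGMA table_info({_quote(table)})").fetchall()
    for _cid, name, _type, _notnull, _default, pk in table_rows:
        if name == column and pk == 1:
            return True
    # PRAGMA index_list rows start with (seq, name, ...)
    index_rows = conn.execute(f"PRAGMA index_list({_quote(table)})").fetchall()
    for index_row in index_rows:
        index_name = index_row[1]
        # PRAGMA index_info rows: (seqno, cid, name); seqno 0 is the leading column
        info_rows = conn.execute(
            f"PRAGMA index_info({_quote(index_name)})"
        ).fetchall()
        for seqno, _cid, col_name in info_rows:
            if seqno == 0 and col_name == column:
                return True
    return False


def create_secondary_indexes(conn, table, columns):
    """Create a single-column index per column, skipping already-indexed ones."""
    created = []
    for column in columns:
        if is_column_indexed(conn, table, column):
            continue  # avoid a duplicate index
        index_name = f"ix_{table}_{column}"  # hypothetical naming scheme
        conn.execute(
            f"CREATE INDEX {_quote(index_name)} "
            f"ON {_quote(table)} ({_quote(column)})"
        )
        created.append(index_name)
    return created


# Example table (hypothetical): "id" is already covered by the primary key.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT, price REAL)"
)
created = create_secondary_indexes(conn, "trades", ["id", "symbol"])
print(created)
```

Keeping the check in its own function mirrors the motivation given for `_is_column_indexed`: a backend with different catalog queries could swap in its own logic without touching the index-creation loop.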

@@ -516,7 +516,8 @@ def read_sql(sql, con, index_col=None, coerce_float=True, params=None,


 def to_sql(frame, name, con, flavor='sqlite', schema=None, if_exists='fail',
-           index=True, index_label=None, chunksize=None, dtype=None):
+           index=True, index_label=None, chunksize=None, dtype=None,
+           indexes=None):
Contributor

would need to be added to the doc-string with a versionadded directive

Author

Can this be included in v0.18.1, or will it be for v0.19.0?

I will also update the corresponding whatsnew/.txt file.

@jreback
Contributor

jreback commented Apr 16, 2016

is there an open issue for this? (IIRC there is)

@jreback jreback added Enhancement IO SQL to_sql, read_sql, read_sql_query labels Apr 16, 2016
@trbs
Author

trbs commented Apr 18, 2016

I didn't find this exact issue.

#7984 and #9084 are similar in that they also deal with to_sql / SQL indexes, but neither is specifically about specifying which columns should be indexed.

I am interested in maybe picking up #7984 as well after this PR, together with a PR for specifying multi-column / compound indexes.

@jreback
Contributor

jreback commented Apr 18, 2016

hmm @trbs this PR looks like it should close those 2 issues, or is there something that is indicated in those that is not in your PR?

@trbs
Author

trbs commented Apr 18, 2016

#7984 is about unique indexes; this PR would only add a non-unique index to the column. I could add that to this PR.

#9084 is basically about adding more parameters to get_schema(...) so that it includes the index=... and index_label=... parameters that to_sql has. (So it's about the DataFrame index, not database indexes.)

I could do the patch for #9084 as well since it looks kinda trivial. But it's probably better to do that in a separate PR?
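
To make the distinction between the two kinds of database index concrete, here is a small sketch using the standard-library sqlite3 module. It is independent of pandas and of this PR; the table, the index names, and the helper `insert_is_rejected` are hypothetical. A plain index only speeds up lookups, while a unique index additionally rejects duplicate values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT, city TEXT)")

# Plain secondary index: duplicates in "city" remain allowed.
conn.execute("CREATE INDEX ix_users_city ON users (city)")
# Unique index: duplicates in "email" raise an IntegrityError.
conn.execute("CREATE UNIQUE INDEX ux_users_email ON users (email)")


def insert_is_rejected(conn, email, city):
    """Try to insert a row; return True if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?)", (email, city))
    except sqlite3.IntegrityError:
        return True
    return False


first_rejected = insert_is_rejected(conn, "a@example.com", "Utrecht")
same_city_rejected = insert_is_rejected(conn, "b@example.com", "Utrecht")
same_email_rejected = insert_is_rejected(conn, "a@example.com", "Delft")

row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(first_rejected, same_city_rejected, same_email_rejected, row_count)
```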

@jreback
Contributor

jreback commented Apr 18, 2016

@jorisvandenbossche

@jreback
Contributor

jreback commented May 20, 2016

status? @jorisvandenbossche

@jreback
Contributor

jreback commented Sep 9, 2016

can you rebase / update?

* github.com:pydata/pandas: (554 commits)
  BUG: compat with Stata ver 111
  Fix: F999 dictionary key '2000q4' repeated with different values (pandas-dev#14198)
  BLD: Test for Python 3.5 with C locale
  BUG: DatetimeTZBlock can't assign values near dst boundary
  BUG: union_categorical with Series and cat idx
  BUG: fix str.contains for series containing only nan values
  BUG: Categorical constructor not idempotent with ext dtype
  TST: Make encoded sep check more locale sensitive (pandas-dev#14161)
  DOC: minor typo in 0.19.0 whatsnew file (pandas-dev#14185)
  BUG: fix tz-aware datetime convert to DatetimeIndex (GH 14088)
  BUG : bug in setting a slice of a Series with a np.timedelta64
  RLS: v0.19.0rc1
  DOC: clean-up 0.19.0 whatsnew file (pandas-dev#14176)
  DOC: cleanup build warnings (pandas-dev#14172)
  Add steps to run gbq integration testing to the contributing docs (pandas-dev#14144)
  ENH: concat and append now can handle unordered categories (pandas-dev#13767)
  DEPR: Deprecate pandas.core.datetools (pandas-dev#14105)
  API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) (pandas-dev#14164)
  Fix trivial typo in comment (pandas-dev#14174)
  API/DEPR: Remove +/- as setops for Index (GH8227) (pandas-dev#14127)
  ...
@codecov-io

codecov-io commented Sep 12, 2016

Current coverage is 85.24% (diff: 100%)

Merging #12904 into master will increase coverage by <.01%

@@             master     #12904   diff @@
==========================================
  Files           140        140          
  Lines         50563      50575    +12   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43103      43114    +11   
- Misses         7460       7461     +1   
  Partials          0          0          

Powered by Codecov. Last update 54ab5be...a092198

@@ -1158,7 +1158,8 @@ def to_msgpack(self, path_or_buf=None, encoding='utf-8', **kwargs):
**kwargs)

     def to_sql(self, name, con, flavor=None, schema=None, if_exists='fail',
-               index=True, index_label=None, chunksize=None, dtype=None):
+               index=True, index_label=None, chunksize=None, dtype=None,
+               indexes=None):
"""
Write records stored in a DataFrame to a SQL database.
Contributor

are the params documented here?

indexes : list of column name(s). Columns names in this list will have
an indexes created for them in the database.

.. versionadded:: 0.18.2
Contributor

0.20.0
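
The review comments ask for the new parameter to be documented with a versionadded directive, with 0.20.0 as the suggested version. Below is a sketch of how such an entry could look in a numpydoc-style docstring. The function `to_sql_sketch` is a hypothetical stand-in, not the pandas implementation, and the wording of the parameter description is an assumption.

```python
def to_sql_sketch(frame, name, con, indexes=None):
    """
    Write records stored in a DataFrame to a SQL database (illustrative stub).

    Parameters
    ----------
    indexes : list of str, optional
        Column names for which an index should be created in the database.

        .. versionadded:: 0.20.0
    """
    # Hypothetical stub: only the docstring layout matters here.
    raise NotImplementedError("documentation sketch only")


print(to_sql_sketch.__doc__)
```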

@jreback
Contributor

jreback commented Dec 6, 2016

@jorisvandenbossche status?

@trbs
Author

trbs commented Dec 7, 2016

btw happy to rebase again if Pandas wants to merge this PR.

@jreback
Contributor

jreback commented Feb 1, 2017

closing as stale

@jreback jreback closed this Feb 1, 2017
Labels
Enhancement IO SQL to_sql, read_sql, read_sql_query
3 participants