add support for specifying secondary indexes with to_sql #12904
@@ -516,7 +516,8 @@ def read_sql(sql, con, index_col=None, coerce_float=True, params=None,
 
 
 def to_sql(frame, name, con, flavor='sqlite', schema=None, if_exists='fail',
-           index=True, index_label=None, chunksize=None, dtype=None):
+           index=True, index_label=None, chunksize=None, dtype=None,
+           indexes=None):
would need to be added to the doc-string with a versionadded directive
Can this be included in v0.18.1, or will it be for v0.19.0?
I will also update the corresponding whatsnew/.txt file.
does this have an open issue? (IIRC there is)
I didn't find this exact issue. #7984 and #9084 are similar in that they also deal with to_sql / SQL indexes, but not specifically with specifying which columns should be indexed. I am interested in maybe picking up #7984 as well after this PR, together with a PR for specifying multi-column / compound indexes.
hmm @trbs this PR looks like it should close those 2 issues, or is there something that is indicated in those that is not in your PR?
#7984 is about unique indexes; this PR would only add a non-unique index to the column. I could add that to this PR. #9084 is basically about adding more parameters to get_schema(...) so that it includes the index=... and index_label=... parameters. I could do the patch for #9084 as well since it looks kinda trivial, but it's probably better to do that in a separate PR?
status? @jorisvandenbossche
can you rebase / update?
* github.com:pydata/pandas: (554 commits)
  BUG: compat with Stata ver 111
  Fix: F999 dictionary key '2000q4' repeated with different values (pandas-dev#14198)
  BLD: Test for Python 3.5 with C locale
  BUG: DatetimeTZBlock can't assign values near dst boundary
  BUG: union_categorical with Series and cat idx
  BUG: fix str.contains for series containing only nan values
  BUG: Categorical constructor not idempotent with ext dtype
  TST: Make encoded sep check more locale sensitive (pandas-dev#14161)
  DOC: minor typo in 0.19.0 whatsnew file (pandas-dev#14185)
  BUG: fix tz-aware datetime convert to DatetimeIndex (GH 14088)
  BUG : bug in setting a slice of a Series with a np.timedelta64
  RLS: v0.19.0rc1
  DOC: clean-up 0.19.0 whatsnew file (pandas-dev#14176)
  DOC: cleanup build warnings (pandas-dev#14172)
  Add steps to run gbq integration testing to the contributing docs (pandas-dev#14144)
  ENH: concat and append now can handle unordered categories (pandas-dev#13767)
  DEPR: Deprecate pandas.core.datetools (pandas-dev#14105)
  API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) (pandas-dev#14164)
  Fix trivial typo in comment (pandas-dev#14174)
  API/DEPR: Remove +/- as setops for Index (GH8227) (pandas-dev#14127)
  ...
Current coverage is 85.24% (diff: 100%)
@@             master   #12904   diff @@
==========================================
  Files         140      140
  Lines       50563    50575    +12
  Methods         0        0
  Messages        0        0
  Branches        0        0
==========================================
+ Hits        43103    43114    +11
- Misses       7460     7461     +1
  Partials        0        0
@@ -1158,7 +1158,8 @@ def to_msgpack(self, path_or_buf=None, encoding='utf-8', **kwargs):
                           **kwargs)
 
     def to_sql(self, name, con, flavor=None, schema=None, if_exists='fail',
-               index=True, index_label=None, chunksize=None, dtype=None):
+               index=True, index_label=None, chunksize=None, dtype=None,
+               indexes=None):
         """
         Write records stored in a DataFrame to a SQL database.
are the params documented here?
+        indexes : list of column name(s). Columns names in this list will have
+            an indexes created for them in the database.
+
+            .. versionadded:: 0.18.2
0.20.0
@jorisvandenbossche status?
btw happy to rebase again if Pandas wants to merge this PR.
closing as stale
Support for creating secondary indexes in to_sql via the parameter indexes.
The code also tries to avoid creating duplicate indexes when keys is specified in SQLTable. (Once the PR for passing the keys is merged, this would be useful.)
This PR introduces a new method in SQLTable called _is_column_indexed so that subclasses can easily override or change the logic.
Interesting future work will be to also support creating compound indexes.