DOC:Remove DataFrame.append from the 10min intro #27520

Merged
merged 7 commits into from
Aug 2, 2019

Conversation

sameshl
Contributor

@sameshl sameshl commented Jul 22, 2019

Remove the `append` section from the 10 min intro doc, since the complexity of `DataFrame.append` is very different from that of `list.append`.

closes #27518
@TomAugspurger
Contributor

As a note, the DataFrame.append API docs have a good discussion of this https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html#pandas.DataFrame.append.

We need to determine whether the "append" use-case (incrementally building a DataFrame) is important enough to warrant inclusion in the 10min intro.

@TomAugspurger TomAugspurger added this to the 1.0 milestone Jul 22, 2019
@sameshl
Contributor Author

sameshl commented Jul 22, 2019

Since we already explain `concat` at https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#concat, I think that should suffice for a beginner, and it is in any case better than iteratively using `append`.
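
To illustrate the point about `concat`, here is a minimal sketch, assuming small hypothetical frames that share the same columns (the names `pieces` and `combined` and the data are invented for illustration). The pieces are collected in a plain list and combined with a single `pd.concat` call, so the data is copied once rather than once per piece:

```python
import pandas as pd

# Hypothetical example data: three one-row frames with identical columns.
pieces = [
    pd.DataFrame({"key": [i], "value": [i * 10]})
    for i in range(3)
]

# A single concat call builds the result in one step.
# ignore_index=True gives the result a fresh 0..n-1 index.
combined = pd.concat(pieces, ignore_index=True)
```

This sketch deliberately avoids `DataFrame.append`, which was later deprecated and removed in pandas 2.0, so it also runs on current pandas versions.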

@WillAyd
Member

WillAyd commented Jul 22, 2019

I'd prefer a warning and link to append docs about performance rather than outright removal

@sameshl sameshl mentioned this pull request Jul 23, 2019
@TomAugspurger
Contributor

CI should be fixed if you fetch and merge master then push.

> I'd prefer a warning and link to append docs about performance rather than outright removal

Note that we do still discuss DataFrame.append in the user guide: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html#concatenating-using-append. I just don't think it's valuable enough to include in your first 10 minutes of learning pandas.

That said, we may want to head off the style of iteratively building a DataFrame, like you would with a list, by leaving a comment at the bottom of https://pandas.pydata.org/pandas-docs/stable/getting_started/10min.html#object-creation saying something like

> Adding a column to a DataFrame is relatively fast. Adding a row, however, requires a full copy of the data and a construction of a new Index, which may be expensive. Rather than iteratively building a DataFrame by appending records, we recommend passing the pre-built list of records to the DataFrame constructor. See ... for more.

How does that sound @WillAyd?
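
The recommendation in the proposed note can be sketched as follows. This is a minimal illustration under assumed data: the `records` list and the `id` and `square` fields are hypothetical and do not come from the PR. Rows are accumulated in an ordinary Python list, and the `DataFrame` is constructed once at the end:

```python
import pandas as pd

# Accumulate rows as plain dicts; list.append does not copy existing rows.
records = []
for i in range(5):
    records.append({"id": i, "square": i * i})

# Build the DataFrame once from the pre-built list of records.
df = pd.DataFrame(records)
```

Each dict becomes one row, and the dict keys become the column names, so no intermediate frames are created along the way.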

@WillAyd
Member

WillAyd commented Jul 23, 2019

Sounds good!

@@ -83,6 +83,13 @@ As you can see, the columns ``A``, ``B``, ``C``, and ``D`` are automatically
tab completed. ``E`` is there as well; the rest of the attributes have been
truncated for brevity.

.. note::
Contributor

This doesn't seem like a good place for this. Can you put it near the usages of concat instead?

Contributor Author

Sure, I also felt that would be a better place for this.

@sameshl
Contributor Author

sameshl commented Jul 26, 2019

@jreback I have made the required changes.

@@ -468,6 +468,13 @@ Concatenating pandas objects together with :func:`concat`:

pd.concat(pieces)

.. note::
Adding a column to a DataFrame is relatively fast. Adding a row, however,
requires a full copy of the data and a construction of a new Index, which
Contributor

Use double-backticks around DataFrame. Also, let's make this shorter and simpler.

For the first part, do something like:

Adding a column to a DataFrame is relatively fast. However, adding a row requires a copy, and may be expensive.
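
The contrast in the suggested wording can be seen in a small sketch. The frame and the names `new_row` and `extended` are hypothetical, chosen only for illustration. Assigning a column modifies the existing frame, while adding a row with `pd.concat` produces a new frame and leaves the original untouched:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Adding a column: assigned directly on the existing frame.
df["c"] = df["a"] + df["b"]

# Adding a row: concat returns a new frame holding a copy of the data;
# the original `df` keeps its two rows.
new_row = pd.DataFrame({"a": [5], "b": [6], "c": [11]})
extended = pd.concat([df, new_row], ignore_index=True)
```

Repeating the row step in a loop would copy the accumulated data on every iteration, which is the cost the note warns about.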

Contributor Author

@jreback Done. Let me know if any more changes are required.

Contributor

@TomAugspurger TomAugspurger left a comment

Small style corrections. LGTM otherwise.

@sameshl
Contributor Author

sameshl commented Jul 31, 2019

@TomAugspurger Done!

@TomAugspurger TomAugspurger merged commit 0f0dc80 into pandas-dev:master Aug 2, 2019
@TomAugspurger
Contributor

Thanks @sameshl!

@sameshl sameshl deleted the doc_10_min branch August 2, 2019 16:10
quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019
* DOC:Remove DataFrame.append from the 10min intro

Remove the `append` section from 10 min intro doc as complexity of that is very different than `list.append`

closes pandas-dev#27518

Successfully merging this pull request may close these issues.

Remove DataFrame.append from the 10min intro
4 participants