DOC: Add pandas video series to tutorials.rst #24117

Merged 3 commits on Dec 17, 2018

justmarkham
Contributor

This is a 6-hour video tutorial series that is freely available on YouTube. If you would like to quickly preview the contents of the series, you can see all of the code in this Jupyter Notebook.

Collectively, these videos have nearly 20,000 likes across 1 million views. If you would like to read some testimonials about the quality of the instruction, please see here.

Please let me know if you have any questions or would like me to make any changes. Thanks for your consideration!

@datapythonista added the "Docs" and "Needs Discussion" (requires discussion from core team before further action) labels on Dec 5, 2018

@datapythonista
Member

-1 on this

Personally, I don't think we should have that whole page in the pandas repo. It's surely useful, but should be maintained somewhere else.

@TomAugspurger
Contributor

Given that the page is there, doesn't it make sense to add this? I haven't watched it, but Kevin's work is typically high quality.

@datapythonista
Member

This PR is not a big deal. But maintaining a page like this seems to me like a lot of work: keeping the content updated, verifying the content that people propose to add...

I'd say that at the moment, as it's not easy to navigate the pandas documentation, this page is not that popular. But with some luck, we'll have better navigation and visibility of the pages soon, and besides the time needed to keep it updated, I think there will be increasing demand from content creators to have their resources there.

See also this PR: https://github.com/pandas-dev/pandas-website/pull/67

My point is that it'd be great to have a page like this with resources. But I think we already have enough to do without having to keep deciding which content is worth being "endorsed" on the official pandas website and which is not.

But as I said, feel free to merge it; my concerns are more about the mid-term.

@TomAugspurger
Contributor

TomAugspurger commented Dec 6, 2018 via email

@WillAyd
Member

WillAyd commented Dec 6, 2018

At the very least can we just have one link to the playlist instead of one to each video?

@justmarkham
Contributor Author

Hello all, thanks so much for your thoughtful comments. In case it's helpful, I wanted to clarify my motivation for this PR:

I love that the pandas documentation provides a great level of detail about much of the pandas functionality. However, many sections of the documentation lack what might be most helpful to beginners (in my opinion), which is a high-level overview of "how do I accomplish some common task" or "when should I use this particular feature", along with corresponding examples using real-world datasets.

For example, a beginner might have many of these questions:

  • What exactly is the "axis"?
  • When should I use "inplace"?
  • What is the purpose of the index?
  • When should I use a "groupby"? What do the results tell me?
  • How do I use the MultiIndex? What's the purpose of it?
  • How do I fix the SettingWithCopyWarning? What happens if I just ignore it?
  • and so on...

To be fair, the answers to many of these questions may be available in the documentation, but they are not often easy to find (in my experience), especially if you don't already know the terminology to look for. And when you do find the right section, the deep level of detail can obscure the high-level information.
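
To make two of the questions above concrete, here is a minimal sketch of what "axis" and "groupby" do. It is an illustration only and is not taken from the video series; the DataFrame, its column names, and its values are all made up for the example.

```python
import pandas as pd

# Hypothetical data, invented purely for this illustration.
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Lima"],
        "units": [1, 2, 3],
        "price": [10, 20, 30],
    }
)

numeric = df[["units", "price"]]

# axis=0 collapses the rows, leaving one result per column.
col_sums = numeric.sum(axis=0)

# axis=1 collapses the columns, leaving one result per row.
row_sums = numeric.sum(axis=1)

# groupby splits the rows by a key, then aggregates each group separately.
units_per_city = df.groupby("city")["units"].sum()

print(col_sums)
print(row_sums)
print(units_per_city)
```

One way to read this: "axis" names the direction that gets collapsed by the operation, and a "groupby" answers questions of the form "what is this aggregate for each value of that key?".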

My goal, when creating this video series, was specifically to answer many of those beginner and intermediate questions. I use real-world datasets to make the examples more relatable. And each video is titled such that if you just want to know about some particular topic (e.g., how to use a string method), you immediately know which video to watch by its title ("How do I use string methods in pandas?").

My guess is that lots of beginners poke around in the documentation, looking for answers to these questions, and many of them come across the Tutorials page. For those users, I wanted to provide an easy way to get an answer to their particular question, which is why I listed the videos out individually.

However, I totally understand if my vision for the Tutorials page is not in line with the core team's vision for this page, and I'm happy to proceed however you like!

@datapythonista
Member

I agree with your points, and we're making a big effort to improve the documentation and make sure all these questions are answered. PRs to help with that are more than welcome.

For me, the discussion is whether we want to maintain a page of external resources for pandas. I think the people who can merge these PRs already have enough work without having to review the linked content and deal with the spam that such a page will generate (once we make navigating the documentation easier).

I won't be reviewing the content, and my feeling is that nobody else will in most cases. So I'd leave that task to some external website that is interested in maintaining a list of pandas resources.

I'm OK if somebody wants to review and merge this. But personally I'd prefer to remove this page and directly close PRs like this one, or the one for the video links on the website.

@jreback
Contributor

jreback commented Dec 7, 2018

yeah this should only be a single link

@codecov

codecov bot commented Dec 7, 2018

Codecov Report

Merging #24117 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #24117      +/-   ##
==========================================
- Coverage    92.2%    92.2%   -0.01%     
==========================================
  Files         162      162              
  Lines       51729    51729              
==========================================
- Hits        47697    47696       -1     
- Misses       4032     4033       +1

Flag         Coverage Δ
#multiple    90.6% <ø> (ø) ⬆️
#single      43.02% <ø> (-0.01%) ⬇️

Impacted Files            Coverage Δ
pandas/io/json/json.py    92.61% <0%> (-0.48%) ⬇️
pandas/util/testing.py    87.51% <0%> (+0.09%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8ea7744...74ac4cb. Read the comment docs.

@codecov

codecov bot commented Dec 7, 2018

Codecov Report

Merging #24117 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #24117      +/-   ##
==========================================
- Coverage   92.28%   92.28%   -0.01%     
==========================================
  Files         162      162              
  Lines       51831    51831              
==========================================
- Hits        47833    47832       -1     
- Misses       3998     3999       +1

Flag         Coverage Δ
#multiple    90.69% <ø> (ø) ⬆️
#single      43.01% <ø> (ø) ⬆️

Impacted Files            Coverage Δ
pandas/util/testing.py    87.48% <0%> (-0.1%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6077b88...3a8a14c. Read the comment docs.

@justmarkham
Contributor Author

I've removed the original list of videos, and instead added a link to the YouTube playlist under the "Video Tutorials" section. I included links to both the GitHub repo and the Jupyter Notebook, since the Notebook effectively summarizes the content of the video series.

I additionally included a second (separate) pandas video series from PyCon 2018.

I'd be glad to make additional changes as requested. Thanks!

@jorisvandenbossche
Member

I think Joris has stats on the actual page views, so we can see if anyone goes here.

It's actually (surprisingly, at least to me) quite popular: n° 22 in the ranking of most visited pages (although only n° 34 for unique visits). It had around 50,000 page visits over the last 30 days, compared to, e.g., around 95,000 for api.html.

I am fine with the reduced list as it is now in this PR (given that we have the page at the moment).

But in general, it is true that if we have such a page, it should also be maintained, which it isn't at the moment. And it's not only about reviewing PRs with new additions, but also about cleaning up the current content, as older links go out of date.
On the other hand, a well-curated page like this is also very valuable to users.

For example, a beginner might have many of these questions:

All good questions, and ideally, those should be much better answered in the main docs as well...

@jorisvandenbossche
Member

Triggered by the discussion here, I was looking at the current page, and I find that the long content listings for some of the tutorials (others have much shorter ones) distract a bit from the overall list. So I propose shortening the long ones: #24152

@WillAyd
Member

WillAyd left a comment

I'm OK with this

@datapythonista
Member

@justmarkham can you please merge master, update the PR, and make sure the CI is green, so we can merge this?

@justmarkham
Contributor Author

@datapythonista I merged master, and it looks like the Travis CI build failed. I looked at the error logs, and the failure I'm seeing doesn't seem to relate to my change. Please pardon my ignorance, but I'm not sure how to proceed from here? Thanks for any help!

@datapythonista
Member

Thanks @justmarkham, I think I've seen this test fail randomly a few times. We should take a look and fix it so that it's more deterministic (not sure if that's easy, given that it's a hypothesis test). I restarted it; hopefully it finishes successfully this time.

@justmarkham
Contributor Author

@datapythonista Thanks for restarting the tests, this time it passed! 👍✅

@datapythonista merged commit 216986d into pandas-dev:master on Dec 17, 2018

@datapythonista
Member

Thanks @justmarkham

TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Dec 20, 2018
@justmarkham deleted the patch-1 branch on December 21, 2018 14:00
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

6 participants