support for .pipe, how to make this render in the notebook w/o using show(p) #3046
If y'all have gone with the Something like this could probably be made to work:
but that kind of sucks too. ping @bokeh/dev in case anyone has better ideas than me. |
what does the e.g.
|
I thought that had been gotten rid of in the new charts work, but maybe not. |
so is there a method to call to show things in the notebook? (obviously It's quite convenient to do this type of thing ad-hoc by using chained/pipe computations that end in a plot. |
It depends on what kind of output has been initialized (e.g., file, notebook, server) and then it eventually will call the Jupyter HTML publishing API at the lowest level to render some HTML templates that display the plots (or layouts or widgets). I guess the issue is this: the "pipe-able" functions that accept a data frame as the first argument are only a very tiny fraction of all bokeh functions. I don't want to add something that is too easy to use incorrectly or leads to more confusion than utility. This is why |
@bryevdv that seems kind of odd for bokeh to have to 'depend on the type of output', when bokeh is already 'figuring it out' in the I think providing syntactic compatibility with what other libraries (mpl/seaborn/pandas) in the ad-hoc notebook analysis would be first class. This makes bokeh feel like the old matplotlib way of doing things (which to be honest was not fun). Maybe I don't understand the design / impl constraints. Generally these libraries support a |
If As to what is different, lots of things. Those other libraries don't have to load a separate javascript library to actually do the rendering, for instance. |
+1 to some kind of pipeable interactive analysis mode with economy of typing. |
Some naive questions, I don't know the history or all the code here, but have been hacking on output_notebook and show() a little.
Is the seaborn/matplotlib behavior based on whether they are running in a notebook, something like "if in_notebook: send plot to notebook" at the end of every plotting function, or is it more complicated? Bokeh does let you assemble plots into layouts of plots, so if making a plot sent it to the notebook, one question I have is how you would be able to make four plots, put them in a layout, and only then send the whole layout to the notebook. |
not really sure how mpl does this, my point of all of this, is the upstream code already supports pipe chaining.
bokeh should simply be able to act as a drop-in replacement, original impl is here |
I don't understand what you mean by "drop in replacement" - for pipe? for mpl you mean? How would the 4-plots-in-a-layout case work? does mpl have an equivalent and what does it do there? the rest of you have more background knowledge than I do. |
@havocp 4-plots-in-a-layout case is not part of this. It is simply taking a Series/DataFrame and plotting it. (a single plot). |
but my question is how does the function you pass to pipe know whether it should push the plot over to the notebook, vs the plot is an intermediate result that is going into a layout and should not be pushed. Are those functions passed to pipe special functions only used for pipe ? |
i.e. how do we implement Histogram so it works with pipe and also not with pipe. or is the idea I have a special pipe-only Histogram |
Those other tools are very "immediate mode", and I think pandas is using On Sunday, November 1, 2015, Havoc Pennington [email protected]
Peter Wang |
@havocp that's my point. you shouldn't have to do anything special. The key is that an object that is returned in a notebook cell will render (this is how mpl / seaborn / pandas work). I don't know how this happens. totally not averse to doing something like:
e.g. I simply want a drop-in replacement for this:
|
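On the question of how a returned object ends up rendered: IPython/Jupyter looks for rich-repr methods such as `_repr_html_` on the value of the last expression in a cell. The sketch below imitates that lookup in plain Python. `FakePlot` and `render_last_value` are hypothetical names used only for illustration; the real machinery goes through IPython's display formatters.

```python
class FakePlot:
    """Hypothetical plot object exposing the rich-repr hook Jupyter looks for."""

    def __init__(self, title):
        self.title = title

    def _repr_html_(self):
        # Jupyter calls this (if present) to get HTML for a cell's result.
        return "<div class='plot'>%s</div>" % self.title


def render_last_value(value):
    # Simplified imitation of what a notebook frontend does with the value
    # of the last expression in a cell: prefer HTML, fall back to repr().
    html_hook = getattr(value, "_repr_html_", None)
    if callable(html_hook):
        return html_hook()
    return repr(value)


html = render_last_value(FakePlot("mpg by cyl"))
text = render_last_value(42)
```

Under this model, "just return the object" works only if the object itself knows how to produce its HTML, which is the part Bokeh handles through an explicit call instead.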
IMHO this would instantly create a fair amount of usage for bokeh, just like what has happened with In exploring data, one can build up these little pipelines in a notebook, for testing, viewing, just playing around. Sure they may want to control various aspects and/or make more sophisticated plots, but making it dead-simple to swap in bokeh would mean people would gravitate to these new functions easily. |
@jreback the reason I ask is that right now the notebook output mode sends some html over zeromq to the notebook. So if Histogram always sends the histogram over zeromq, in the layout case we would send 4 histograms, then send the layout with 4 histograms in it, since sending to the notebook is a side effect. Also, bokeh currently lets you create the histogram or layout and then further modify it, and that would also break if creating the histogram auto-sends. It looks to the notebook user a bit like the notebook is rendering the return value of the cell, but for Bokeh as far as I know that isn't true. The rendered plot is a message that bokeh sends when you show(), so we need to have the "final" output somehow indicated (I think anyway). So I'm trying to figure out if we somehow know we are the final step in a pipe, and it sounds like no. That leaves us with either special versions of Histogram that are only used as the last step, or an explicit show method of some kind, I guess? The .show() method at the end I expect could work. |
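The side-effect concern can be made concrete with a small stand-alone sketch. Nothing here is Bokeh code: `AutoHistogram`, `AutoLayout`, and the `sent` list are hypothetical stand-ins for charts that publish themselves on creation and for the messages that would go to the notebook.

```python
sent = []  # stand-in for messages published to the notebook


class AutoHistogram:
    """Hypothetical chart that publishes itself as soon as it is created."""

    def __init__(self, name):
        self.name = name
        sent.append(name)  # side effect: output is sent immediately


class AutoLayout:
    """Hypothetical layout that also publishes itself on creation."""

    def __init__(self, children):
        self.children = children
        sent.append("layout(" + ", ".join(c.name for c in children) + ")")


# Four plots meant only as intermediate results for one layout.
plots = [AutoHistogram("h%d" % i) for i in range(4)]
AutoLayout(plots)
```

With auto-sending, `sent` ends up holding five entries (each histogram on its own, then the layout that contains them), even though only the layout was wanted.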
Check these out: Agree with @jreback - I think this ggvis style of easy piping/syntax but powerful and flexible interactivity(accessed as part of initial plotting), layering and api at the right level of abstraction is what we should shoot for. Bokeh could be amazing for quickly iterating EDA. |
as I said, I am not really sure how the rendering actually occurs in the notebook. Though I suspect something like
I don't really have a problem with explicitly calling e.g.
see [12] here; this is part of my PyDataNYC tutorial. I use a |
Just to show something else we are doing here and the notebook here we do rendering in a chain where it's easy to build; the return value ultimately is HTML rendered (this of course is a table)... in theory we could use bokeh as a renderer as well (where you could then take the 'hints' that we have constructed and handle it directly rather than our in-built templating soln) |
Looks really cool...maybe it can interface with bokeh and/or phosphor datatable. I could see that having some great interactive filtering/conditional formatting functionality. |
to be clear, there's no need to convince people some solution is useful. for me I'm purely in the mode of figuring out how it could best work without breaking some other thing. I guess next step for me would be to go see how mpl and seaborn etc do it (pointers welcome). If they unconditionally send html to the notebook then bokeh would have to somehow be different afaik since it has multiple kinds of output. |
Maybe a naive question, but can't we use |
We can, but we're not going to. We've already been down this path. More implicit, "auto showing", we already had these discussions, and made these decisions, and went through more than a little pain to rip them out. The potential for out of order execution in the notebook, and the cross-language execution mean implicit state and actions make it hard to reason about what will happen in the notebook, and easy to have things happen that were not intended or wanted. TLDR; it was a mess. I don't understand what is still at issue? Unless I misread, @jreback said this would be ok:
I think that's ok, so what is left to decide? To add more color, I think the above is in fact, much better. If
Or this:
All of those things are basically precluded if we do some auto show thing. By not doing it, we get to play to the best strengths of both libraries at once. I'm also completely unconvinced that
or even less:
puts any kind of actual burden on anyone, or prevents anything (in fact it allows more things, more easily). Finally, this only concerns like six functions out of the entire Bokeh library. My main point of contention with doing anything more than the minimal |
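A minimal sketch of the explicit-show pattern argued for here, using stand-ins only: the `Chart` class, its `.show()` method, and the `histogram` function are hypothetical and are not Bokeh's API. The point it illustrates is that when nothing is displayed until `show` is called, the same chart function can end a chain, be tweaked first, or feed a layout.

```python
displayed = []  # stand-in for what ends up rendered in the notebook


class Chart:
    """Hypothetical chart object with an explicit show step."""

    def __init__(self, values, title="untitled"):
        self.values = list(values)
        self.title = title

    def show(self):
        # The only place where output happens; returns self so the
        # object can still be inspected afterwards.
        displayed.append(self.title)
        return self


def histogram(values, title="untitled"):
    # Builds a chart; deliberately has no display side effect.
    return Chart(values, title)


# 1) End a chain with an explicit show.
histogram([1, 2, 3], title="a").show()

# 2) Modify the chart before showing it.
chart = histogram([4, 5], title="draft")
chart.title = "final"
chart.show()

# 3) Build several charts for a layout without displaying any of them.
grid = [histogram([i], title="g%d" % i) for i in range(4)]
```

Only the two charts that were explicitly shown appear in `displayed`; the four charts destined for a layout produce no output of their own.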
@bryevdv all of your examples above are great. However wouldn't this on the
make this render if it's the last cell in a notebook, yes? can I have my cake and eat it too? |
Well, that won't work because it's @mattpap can you expand on this comment: https://github.com/bokeh/bokeh/blob/master/bokeh/models/component.py#L17 Maybe things are better now, and we could entertain enabling the html repr. Still, I am really really trying to understand, what is wrong with
In general I am very skeptical of burdening APIs with syntactic sugar for things this trivial. Beyond that, adding that only to |
No. I didn't enable this by default, because it breaks the workflow. The solution is to change the workflow, but that wasn't critical at the time this code was added. If you enable html repr, then, with current notebook examples, you will get double output quite often. Either remove |
@bryevdv nothing wrong with except that when someone is using mpl/seaborn/pandas they are accustomed to:
OTOH
may look only slightly different, but now bokeh is just not a drop-in replacement, and requires more mental effort. This was not meant to be a 'I need it now!' feature. On your timeframe (sooner obviously preferred), and clearly you have other usecases in mind so may want to take a different path. I am urging that a nice soln (from my perspective) is that bokeh is drop-in replaceable (maybe a |
The high level goal here is to make Bokeh easy and convenient to use for Pandas users, and not give people any reason not to use it compared to e.g. MPL or Seaborn. Having to add a call to I could be misunderstanding the issue here, but this seems pretty straightforward to me. When
I think Jeff's volunteering to help put Bokeh in front of all the Pandas users. :-) Speaking of documentation, when are we getting our tech documentation team back from the marketing/web-site overhaul? They should be able to help with this. |
Just wanted to chime in here since this is something I know a little bit about from working on HoloViews. What we do is define so-called display formatters with IPython, which are basically equivalent to the
Edit: Here's a simple self-contained example (note it only works in the notebook):
```python
from bokeh.charts import Bar, Chart
from bokeh.io import notebook_div, load_notebook
from bokeh.sampledata.autompg import autompg as df

load_notebook()

def display(chart):
    # Return the HTML for a chart so the notebook can render it.
    return notebook_div(chart)

# get_ipython() is only defined inside an IPython/Jupyter session.
ip = get_ipython()
html_formatter = ip.display_formatter.formatters['text/html']
html_formatter.for_type(Chart, display)

# With the formatter registered, a Chart as the last expression renders itself.
Bar(df, 'cyl', values='mpg', title="Total MPG by CYL")
```
|
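The registration in the example above behaves roughly like a type-keyed lookup table. The following stand-alone sketch imitates that idea so it can be run outside a notebook. `HtmlFormatterSketch` is a hypothetical class, not IPython's implementation, and the `Chart` and `Bar` classes here are empty stand-ins rather than the Bokeh ones.

```python
class HtmlFormatterSketch:
    """Toy imitation of a text/html display formatter registry."""

    def __init__(self):
        self._by_type = {}

    def for_type(self, typ, func):
        # Register func as the HTML renderer for instances of typ.
        self._by_type[typ] = func

    def __call__(self, obj):
        # Walk the MRO so a renderer registered for a base class also
        # applies to subclasses; return None when nothing matches.
        for klass in type(obj).__mro__:
            if klass in self._by_type:
                return self._by_type[klass](obj)
        return None


class Chart:
    def __init__(self, title):
        self.title = title


class Bar(Chart):
    pass


formatter = HtmlFormatterSketch()
formatter.for_type(Chart, lambda chart: "<div>%s</div>" % chart.title)

bar_html = formatter(Bar("Total MPG by CYL"))  # matched via the Chart base class
int_html = formatter(3)                        # no renderer registered for int
```

Registering once for a base class is what lets every chart type render without each plotting function having to know whether it is the last step in a cell.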
OK, @bryevdv we should probably take another stab on this as time permits... I think the display formatter idea exposed above is worth exploring (btw, I agree with you that, if we do it, we should do it consistently across all the API levels, not only charts). |
I just wanted to mention that I discussed with @fpliger last week that I left much of the Chart class as is, and am not married in any way to it. After all the legacy charts are retired, I do want to go back and see what we actually want to keep, since I think the I'm definitely on board with providing the notebook user a less verbose experience if at all possible, which I do think is pretty important to broad adoption. Every time someone has to reach back to documentation due to an unexpected outcome is a chance they will give up and go back to what they are used to. |
Given that charts are being moved out to their own repo and HV is going to be promoted more heavily as a high level interface, I am closing this (one of those two places will be a better place to discuss this idea) |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Pandas has had support for a .pipe operator for a while, which allows convenient piping to external functions, see docs here.
See my example here:
http://nbviewer.ipython.org/gist/jreback/cd0d8874495c33a91c79
e.g. want [29] to just work
With seaborn/matplotlib this works quite nicely. In fact it works in bokeh as well already, but doesn't render it (e.g. you have to call show(...), which defeats the purpose of the piping).
Any way to have this automatically render? (Or at least a .show() method that would render at the end of the cell.)
e.g. quite convenient to do things like:
and just have it work
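For readers unfamiliar with the operator, here is a minimal sketch of what .pipe does, using a hypothetical stand-in class rather than pandas itself (`TinyFrame`, `keep_positive`, and `fake_plot` are illustrative names, not real APIs). In pandas, `df.pipe(func, *args, **kwargs)` is equivalent to `func(df, *args, **kwargs)`, so a plotting function can sit at the end of a chain, but what comes back is just whatever that function returns.

```python
class TinyFrame:
    """Hypothetical stand-in for a DataFrame; only models .pipe."""

    def __init__(self, rows):
        self.rows = list(rows)

    def pipe(self, func, *args, **kwargs):
        # Same idea as pandas: call func with this object as the first argument.
        return func(self, *args, **kwargs)


def keep_positive(frame):
    # An ordinary function that takes a frame first, so it is "pipe-able".
    return TinyFrame(r for r in frame.rows if r > 0)


def fake_plot(frame, title):
    # Stand-in for a chart constructor: it builds and returns a plot
    # description, but does not display anything by itself.
    return {"title": title, "values": frame.rows}


plot = TinyFrame([-2, 1, 3]).pipe(keep_positive).pipe(fake_plot, title="demo")
```

The chain produces a plot object as its final value; whether that value gets displayed is a separate question, which is what this issue is asking about.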