doc/source/basics.rst (+74)
@@ -624,6 +624,77 @@ We can also pass infinite values to define the bins:
 Function application
 --------------------
 
+To apply your own or another library's functions to pandas objects,
+you should be aware of the three methods below. The appropriate
+method to use depends on whether your function expects to operate
+on an entire ``DataFrame`` or ``Series``, row- or column-wise, or elementwise.
+
+1. `Tablewise Function Application`_: :meth:`~DataFrame.pipe`
+2. `Row or Column-wise Function Application`_: :meth:`~DataFrame.apply`
+3. Elementwise_ function application: :meth:`~DataFrame.applymap`
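The three methods listed above differ in what the function receives. The sketch below is only an illustration under assumptions: the frame and the lambdas are made up, and because recent pandas releases expose elementwise application as ``DataFrame.map`` and deprecate ``applymap``, the code prefers ``map`` when it exists.

```python
import pandas as pd

# Illustrative data; not part of the original documentation.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Tablewise: the function receives the whole DataFrame.
centered = df.pipe(lambda frame: frame - frame.mean())

# Column-wise (axis=0, the default) and row-wise (axis=1):
# the function receives one Series at a time.
col_range = df.apply(lambda col: col.max() - col.min())
row_sum = df.apply(lambda row: row.sum(), axis=1)

# Elementwise: the function receives one scalar at a time.
# Use DataFrame.map where available, otherwise fall back to applymap.
elementwise = getattr(df, "map", None) or df.applymap
squared = elementwise(lambda x: x ** 2)
```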
+
+.. _basics.pipe:
+
+Tablewise Function Application
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. versionadded:: 0.16.2
+
+``DataFrames`` and ``Series`` can of course just be passed into functions.
+However, if the function needs to be called in a chain, consider using the :meth:`~DataFrame.pipe` method.
+Compare the following:
+
+.. code-block:: python
+
+   # f, g, and h are functions taking and returning ``DataFrames``
+   >>> f(g(h(df), arg1=1), arg2=2, arg3=3)
+
+with the equivalent:
+
+.. code-block:: python
+
+   >>> (df.pipe(h)
+   ...    .pipe(g, arg1=1)
+   ...    .pipe(f, arg2=2, arg3=3)
+   ... )
+
+Pandas encourages the second style, which is known as method chaining.
+``pipe`` makes it easy to use your own or another library's functions
+in method chains, alongside pandas' methods.
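As a runnable illustration of the comparison above, here is a minimal sketch in which ``drop_missing``, ``scale``, and ``add_total`` are hypothetical stand-ins for ``h``, ``g``, and ``f``. The names, arguments, and data are assumptions made for the example, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-ins for h, g, and f; each takes and returns a DataFrame.
def drop_missing(df):
    """h: drop rows that contain missing values."""
    return df.dropna()

def scale(df, factor=1):
    """g: multiply every value by ``factor``."""
    return df * factor

def add_total(df, name="total"):
    """f: append a column holding the row-wise sum."""
    return df.assign(**{name: df.sum(axis=1)})

df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [4.0, 5.0, 6.0]})

# Nested calls read inside-out.
nested = add_total(scale(drop_missing(df), factor=2), name="total")

# The same steps as a method chain read top to bottom.
chained = (df.pipe(drop_missing)
             .pipe(scale, factor=2)
             .pipe(add_total, name="total"))
```

Both spellings run the same three calls in the same order, so ``nested.equals(chained)`` holds. The chained form lists the steps in the order they execute.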
+
+In the example above, the functions ``f``, ``g``, and ``h`` each expected the ``DataFrame`` as the first positional argument.
+What if the function you wish to apply takes its data as, say, the second argument?
+In this case, provide ``pipe`` with a tuple of ``(callable, data_keyword)``.
+``.pipe`` will route the ``DataFrame`` to the argument specified in the tuple.
+
+For example, we can fit a regression using statsmodels. Its API expects a formula first and a ``DataFrame`` as the second argument, ``data``, so we pass the (function, keyword) pair ``(sm.poisson, 'data')`` to ``pipe``.
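The tuple form can be tried without statsmodels. In the sketch below, ``column_mean`` is a hypothetical function, invented for this example, that takes its data as the second argument, named ``data``, which mirrors the calling convention described above.

```python
import pandas as pd

# Hypothetical function: like a formula-style API, it expects the
# DataFrame as its second argument, under the keyword ``data``.
def column_mean(column, data):
    """Return the mean of ``data[column]``."""
    return data[column].mean()

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# Calling the function directly.
direct = column_mean("y", data=df)

# Passing (callable, data_keyword) tells pipe to supply the DataFrame
# as the ``data`` keyword; remaining arguments are forwarded unchanged.
piped = df.pipe((column_mean, "data"), "y")
```

Here ``piped`` and ``direct`` are the same value, because ``pipe`` ends up calling ``column_mean("y", data=df)``.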
 See also :ref:`Categorical Memory Usage <categorical.memory>`.
 
-.. _ref-monkey-patching:
-
-Adding Features to your pandas Installation
--------------------------------------------
-
-pandas is a powerful tool and already has a plethora of data manipulation
-operations implemented; most of them are very fast as well.
-It's very possible, however, that certain functionality that would make your
-life easier is missing. In that case you have several options:
-
-1) Open an issue on `GitHub <https://github.com/pydata/pandas/issues/>`__, explain your need and the sort of functionality you would like to see implemented.
-2) Fork the repo, implement the functionality yourself and open a PR
-   on GitHub.
-3) Write a method that performs the operation you are interested in and
-   monkey-patch the pandas class as part of your IPython profile startup
-   or PYTHONSTARTUP file.
-
-For example, here is how to add a ``just_foo_cols()``
-method to the DataFrame class:
-
-::
-
-    import pandas as pd
-    def just_foo_cols(self):
-        """Get a list of column names containing the string 'foo'
-
-        """
-        return [x for x in self.columns if 'foo' in x]
-
-    pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class