80 changes: 70 additions & 10 deletions doc/source/text.rst
@@ -247,27 +247,87 @@ Missing values on either side will result in missing values in the result as wel
s.str.cat(t)
s.str.cat(t, na_rep='-')

Series are *not* aligned on their index before concatenation:
Concatenating a Series and something array-like into a Series
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 0.23.0

The parameter ``others`` can also be two-dimensional. In this case, the number of rows must match the length of the calling ``Series`` (or ``Index``).

.. ipython:: python

u = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
# without alignment
d = pd.concat([t, s], axis=1)
s
d
s.str.cat(d, na_rep='-')
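
As a minimal sketch of the two-dimensional case (the names ``letters``, ``frame`` and ``block`` are made up for illustration and do not appear in the original example), each column of ``others`` is appended in order, row by row. Both inputs below share the default index of the calling ``Series``, so no alignment is involved:

```python
import numpy as np
import pandas as pd

letters = pd.Series(['a', 'b', 'c', 'd'])

# A DataFrame with as many rows as the calling Series; one value is missing.
frame = pd.DataFrame({'x': ['w', 'x', 'y', 'z'],
                      'y': ['1', '2', None, '4']})
from_frame = letters.str.cat(frame, na_rep='-')
print(from_frame.tolist())   # ['aw1', 'bx2', 'cy-', 'dz4']

# A two-dimensional NumPy array is treated the same way, column by column.
block = np.array([['w', '1'], ['x', '2'], ['y', '3'], ['z', '4']], dtype=object)
from_array = letters.str.cat(block, sep='_')
print(from_array.tolist())   # ['a_w_1', 'b_x_2', 'c_y_3', 'd_z_4']
```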

Concatenating a Series and an indexed object into a Series, with alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 0.23.0

For concatenation with a ``Series`` or ``DataFrame``, it is possible to align the indexes before concatenation by setting
the ``join``-keyword.

.. ipython:: python

u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
s
u
s.str.cat(u)
# with separate alignment
v, w = s.align(u)
v.str.cat(w, na_rep='-')
s.str.cat(u, join='left')

.. warning::

If the ``join`` keyword is not passed, the method :meth:`~Series.str.cat` will currently fall back to the behavior before version 0.23.0 (i.e. no alignment),
but a ``FutureWarning`` will be raised if any of the involved indexes differ, since this default will change to ``join='left'`` in a future version.
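
One way to avoid depending on the default is to state the intent explicitly. The sketch below assumes the same ``s`` and ``u`` as in the example above and redefines them so that it is self-contained; ``by_label`` and ``by_position`` are illustrative names. Passing ``join`` requests label-based alignment, while passing a plain array (which has no index) gives position-based concatenation:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])

# Label-based: u is reindexed to s.index before concatenation.
by_label = s.str.cat(u, join='left')
print(by_label.tolist())      # ['aa', 'bb', 'cc', 'dd']

# Position-based: an array carries no index, so values are paired in order.
by_position = s.str.cat(u.to_numpy())
print(by_position.tolist())   # ['ab', 'bd', 'ca', 'dc']
```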

The usual options are available for ``join`` (one of ``'left', 'outer', 'inner', 'right'``).
In particular, alignment also means that the lengths of the objects no longer need to match.

.. ipython:: python

v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])
s
v
s.str.cat(v, join='left', na_rep='-')
s.str.cat(v, join='outer', na_rep='-')
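
To compare all four options side by side, the following sketch (with ``s`` and ``v`` redefined as above and an illustrative dictionary ``results``) collects the result for each value of ``join``. The index of each result shows which labels survive the alignment:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

# One result per join option; missing values on either side become '-'.
results = {how: s.str.cat(v, join=how, na_rep='-')
           for how in ('left', 'outer', 'inner', 'right')}

for how, res in results.items():
    print(how, res.to_dict())
```

With ``'left'`` the labels of ``s`` are kept, with ``'right'`` those of ``v``, with ``'inner'`` only the shared labels, and with ``'outer'`` the union of both.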

The same alignment can be used when ``others`` is a ``DataFrame``:

.. ipython:: python

f = d.loc[[3, 2, 1, 0], :]
Contributor: show f here

s
f
s.str.cat(f, join='left', na_rep='-')

Concatenating a Series and many objects into a Series
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

List-likes (excluding iterators, ``dict``-views, etc.) can be arbitrarily combined in a list.
All elements of the list must match in length to the calling ``Series`` (resp. ``Index``):
All one-dimensional list-likes can be arbitrarily combined in a list-like container (including iterators, ``dict``-views, etc.):
Contributor: question: what happens with a dict? Do we use the keys, or does it transform to a series and get aligned?

Contributor Author: I was talking about d.keys() or d.values(). For passing a dict directly, currently the keys would get read (as that's what x in d would return).

Contributor Author (@h-vetinari, May 1, 2018): I could of course add another safety that maps d to d.values().

However, I tend to think that this could be left to the python-skills of the user as well -- a dictionary is not "list-like" from the POV of normal python usage (whereas its keys and values are, see d = dict(zip(keys, values))).


.. ipython:: python

s
u
s.str.cat([u, pd.Index(u.values), ['A', 'B', 'C', 'D'], map(int, u.index)], na_rep='-')

All elements must match in length to the calling ``Series`` (or ``Index``), except those having an index if ``join`` is not None:

.. ipython:: python

v
s.str.cat([u, v, ['A', 'B', 'C', 'D']], join='outer', na_rep='-')
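
A small sketch of how indexed and non-indexed elements are treated when mixed in one list (``s`` and ``u`` are redefined as above; ``mixed`` is an illustrative name). The ``Series`` is aligned on its labels, while the NumPy array has no index and is paired with ``s`` by position, which is why it must match in length. Only ``Series`` and array elements are used here, on the assumption that later pandas versions may be stricter about which element types a list may contain:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])

# u is aligned to s by label; u.to_numpy() is appended in its stored order.
mixed = s.str.cat([u, u.to_numpy()], join='left')
print(mixed.tolist())   # ['aab', 'bbd', 'cca', 'ddc']
```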

If using ``join='right'`` on a list of ``others`` that contains different indexes,
the union of these indexes will be used as the basis for the final concatenation:

.. ipython:: python

x = pd.Series([1, 2, 3, 4], index=['A', 'B', 'C', 'D'])
s.str.cat([['A', 'B', 'C', 'D'], s, s.values, x.index])
u.loc[[3]]
v.loc[[-1, 0]]
s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')
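
The ``join='right'`` case can be checked label by label. The sketch below redefines ``s``, ``u`` and ``v`` as above; ``right`` is an illustrative name. The result is indexed by the union of the labels of the two ``others`` (``3``, ``-1`` and ``0``), and every position where one of the three inputs has no value is filled with ``na_rep``:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
u = pd.Series(['b', 'd', 'a', 'c'], index=[1, 3, 0, 2])
v = pd.Series(['z', 'a', 'b', 'd', 'e'], index=[-1, 0, 1, 3, 4])

right = s.str.cat([u.loc[[3]], v.loc[[-1, 0]]], join='right', na_rep='-')

# Each entry is s + u + v at that label, with '-' wherever a value is missing.
print(right.to_dict())
```

For example, label ``-1`` exists only in ``v``, so the first two parts are both replaced by ``'-'``.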

Indexing with ``.str``
----------------------
18 changes: 18 additions & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -308,6 +308,24 @@ The :func:`DataFrame.assign` now accepts dependent keyword arguments for python

df.assign(A=df.A+1, C= lambda df: df.A* -1)

.. _whatsnew_0230.enhancements.str_cat_align:

``Series.str.cat`` has gained the ``join`` kwarg
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, :meth:`Series.str.cat` did not -- in contrast to most of ``pandas`` -- align :class:`Series` on their index before concatenation (see :issue:`18657`).
Contributor: put a reference to the text.rst doc section (top of the .cat section is ok)

Contributor Author: Reference is already there after the examples - or do you wanna open with it?

Contributor: it should be near the top

The method has now gained a keyword ``join`` to control the manner of alignment; see the examples below and :ref:`here <text.concatenate>`.

In v0.23, ``join`` will default to ``None`` (meaning no alignment), but this default will change to ``'left'`` in a future version of pandas.

.. ipython:: python

s = pd.Series(['a', 'b', 'c', 'd'])
t = pd.Series(['b', 'd', 'e', 'c'], index=[1, 3, 4, 2])
s.str.cat(t)
Contributor: too long examples here

s.str.cat(t, join='left', na_rep='-')

Furthermore, :meth:`Series.str.cat` now works for ``CategoricalIndex`` as well (previously raised a ``ValueError``; see :issue:`20842`).

.. _whatsnew_0230.enhancements.astype_category:
