pandas@705b677

@@ -492,12 +492,11 @@ Similarly, you can get the most frequently occurring value(s) (the mode) of the v
 
 .. ipython:: python
 
-   data = [1, 1, 3, 3, 3, 5, 5, 7, 7, 7]
-   s = Series(data)
-   s.mode()
-   df = pd.DataFrame({"A": np.random.randint(0, 7, size=50),
-                      "B": np.random.randint(-10, 15, size=50)})
-   df.mode()
+   s5 = Series([1, 1, 3, 3, 3, 5, 5, 7, 7, 7])
+   s5.mode()
+   df5 = DataFrame({"A": np.random.randint(0, 7, size=50),
+                    "B": np.random.randint(-10, 15, size=50)})
+   df5.mode()
 
 
 Discretization and quantiling
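
As a rough illustration of what the ``mode`` example in this hunk demonstrates (every value tied for most frequent is returned, not just one), here is a minimal standard-library sketch. It uses ``statistics.multimode`` as a stand-in and is not how pandas computes the result; the name ``data`` is only illustrative.

```python
from statistics import multimode

# Same values as the Series in the documentation example.
data = [1, 1, 3, 3, 3, 5, 5, 7, 7, 7]

# multimode returns every value that shares the highest count,
# in the order each value was first encountered.
modes = multimode(data)
print(modes)
```

Both 3 and 7 occur three times, so both are reported as modes.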

@@ -613,11 +612,17 @@ another array or value), the methods applymap on DataFrame and analogously
 ``map`` on Series accept any Python function taking a single value and
 returning a single value. For example:
 
+.. ipython:: python
+   :suppress:
+
+   df4 = df_orig.copy()
+
 .. ipython:: python
 
+   df4
    f = lambda x: len(str(x))
-   df['one'].map(f)
-   df.applymap(f)
+   df4['one'].map(f)
+   df4.applymap(f)
 
 Series.map has an additional feature which is that it can be used to easily
 "link" or "map" values defined by a secondary series. This is closely related
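
The elementwise idea behind ``map`` and ``applymap`` in this hunk (a function that takes a single value and returns a single value, applied to each element) can be sketched in plain Python. This is only an analogy under assumed sample data; ``values`` and ``table`` are hypothetical, and the pandas methods additionally preserve index and column labels.

```python
def f(x):
    """Length of the string form of a single value, as in the documentation example."""
    return len(str(x))

# One-dimensional case, analogous to applying f to each element of a Series.
values = [1.5, -0.25, 10]
lengths = [f(v) for v in values]

# Two-dimensional case, analogous to applying f to each cell of a DataFrame.
table = [[1, 22], [333, 4.5]]
cell_lengths = [[f(cell) for cell in row] for row in table]

print(lengths)
print(cell_lengths)
```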

@@ -712,13 +717,13 @@ make this simpler:
    :suppress:
 
    df2 = df.reindex(['a', 'b', 'c'], columns=['one', 'two'])
-   df2 = df2 - df2.mean()
+   df3 = df2 - df2.mean()
 
 
 .. ipython:: python
 
-   df
    df2
+   df3
    df.reindex_like(df2)
 
 Reindexing with ``reindex_axis``

@@ -1010,7 +1015,7 @@ Extracting Substrings
 ~~~~~~~~~~~~~~~~~~~~~
 
 The method extract (introduced in version 0.13) accepts regular expressions
-with match groups. Extracting a regular expression with one group returns 
+with match groups. Extracting a regular expression with one group returns
 a Series of strings.
 
 .. ipython:: python

@@ -1043,7 +1048,7 @@ and optional groups like
 
 can also be used.
 
-Testing for Strings that Match or Contain a Pattern 
+Testing for Strings that Match or Contain a Pattern
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 In previous versions, *extracting* match groups was accomplished by ``match``,

@@ -1055,8 +1060,8 @@ The distinction between
 match and contains is strictness: match relies on
 strict ``re.match`` while ``contains`` relies on ``re.search``.
 
-In version 0.13, ``match`` performs its old, deprecated behavior by default,
-but the new behavior is available through the keyword argument
+In version 0.13, ``match`` performs its old, deprecated behavior by default,
+but the new behavior is available through the keyword argument
 as_indexer=True.
 
 Methods like ``match``, ``contains``, ``startswith``, and ``endswith`` take
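
The strictness distinction described in this hunk comes down to the difference between ``re.match`` (anchored at the start of the string) and ``re.search`` (finds the pattern anywhere). A minimal sketch with the standard ``re`` module follows; the pattern and sample strings are assumptions chosen for illustration and do not come from the documentation.

```python
import re

# Hypothetical pattern: a digit immediately followed by a lowercase letter.
pattern = r"[0-9][a-z]"
values = ["1a", "x1a", "A2b"]

# re.match succeeds only if the pattern matches at the beginning of the string.
matched = [bool(re.match(pattern, v)) for v in values]

# re.search succeeds if the pattern occurs anywhere in the string.
found = [bool(re.search(pattern, v)) for v in values]

print(matched)
print(found)
```

Only ``"1a"`` begins with the pattern, so it alone passes the strict test, while all three strings contain it somewhere.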

@@ -1118,13 +1123,13 @@ determine the sort order:
 
 .. ipython:: python
 
-   df.sort_index(by='two')
+   df1 = DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]})
+   df1.sort_index(by='two')
 
 The by argument can take a list of column names, e.g.:
 
 .. ipython:: python
 
-   df1 = DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]})
    df1[['one', 'two', 'three']].sort_index(by=['one','two'])
 
 Series has the method ``order`` (analogous to `R's order function
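
To make the ordering in the ``by`` examples concrete, the sketch below sorts the same ``df1`` data in plain Python, first by one column and then by a list of columns. It is an illustration of the sort semantics only, not the pandas call; the row-dictionary layout and the names ``columns`` and ``rows`` are assumptions made for this example.

```python
from operator import itemgetter

# Same data as df1 in the documentation example, held column by column.
columns = {'one': [2, 1, 1, 1], 'two': [1, 3, 2, 4], 'three': [5, 4, 3, 2]}

# Turn the columns into a list of row dictionaries.
rows = [dict(zip(columns, vals)) for vals in zip(*columns.values())]

# Sort by a single column.
by_two = sorted(rows, key=itemgetter('two'))

# Sort by several columns: ties in 'one' are broken by 'two'.
by_one_two = sorted(rows, key=itemgetter('one', 'two'))

print([r['three'] for r in by_two])
print([r['three'] for r in by_one_two])
```

Printing the ``three`` column after each sort makes it easy to see how the rows were rearranged.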