DOC: doc corrections basics.rst / io.rst · pandas-dev/pandas@705b677

@@ -492,12 +492,11 @@ Similarly, you can get the most frequently occuring value(s) (the mode) of the v
 
 .. ipython:: python
 
-   data = [1, 1, 3, 3, 3, 5, 5, 7, 7, 7]
-   s = Series(data)
-   s.mode()
-   df = pd.DataFrame({"A": np.random.randint(0, 7, size=50),
-                      "B": np.random.randint(-10, 15, size=50)})
-   df.mode()
+   s5 = Series([1, 1, 3, 3, 3, 5, 5, 7, 7, 7])
+   s5.mode()
+   df5 = DataFrame({"A": np.random.randint(0, 7, size=50),
+                    "B": np.random.randint(-10, 15, size=50)})
+   df5.mode()
 
 
 Discretization and quantiling
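Reviewer note: the revised mode example can be tried outside the docs build with the minimal sketch below. It assumes a current pandas and NumPy install, spells out the imports the docs set up implicitly, and uses a seeded ``RandomState`` (an assumption, not part of the commit) so the frame is reproducible.

```python
import numpy as np
import pandas as pd

# Same values as the s5 example in the hunk above.
s5 = pd.Series([1, 1, 3, 3, 3, 5, 5, 7, 7, 7])

# 3 and 7 each occur three times, so both values are returned.
s5_modes = s5.mode()
print(s5_modes.tolist())  # [3, 7]

# Seeded generator (an assumption; the docs call np.random.randint unseeded).
rng = np.random.RandomState(0)
df5 = pd.DataFrame({"A": rng.randint(0, 7, size=50),
                    "B": rng.randint(-10, 15, size=50)})

# mode() is computed per column; the result has one column per input column.
df5_modes = df5.mode()
print(df5_modes)
```

Because a column can have several modes, the result of ``df5.mode()`` may have more than one row.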

@@ -613,11 +612,17 @@ another array or value), the methods ``applymap`` on DataFrame and analogously
 ``map`` on Series accept any Python function taking a single value and
 returning a single value. For example:
 
+.. ipython:: python
+   :suppress:
+
+   df4 = df_orig.copy()
+
 .. ipython:: python
 
+   df4
    f = lambda x: len(str(x))
-   df['one'].map(f)
-   df.applymap(f)
+   df4['one'].map(f)
+   df4.applymap(f)
 
 ``Series.map`` has an additional feature which is that it can be used to easily
 "link" or "map" values defined by a secondary series. This is closely related
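Reviewer note: a standalone sketch of the element-wise example in this hunk. ``df_orig`` is defined elsewhere in basics.rst, so the frame below is a hypothetical stand-in. The fallback between ``DataFrame.map`` and ``applymap`` is also an assumption, added so the sketch runs on both newer and older pandas releases.

```python
import pandas as pd

# Hypothetical stand-in for df_orig, which is defined elsewhere in basics.rst.
df4 = pd.DataFrame({"one": [1, 22, 333], "two": [4444, 5, 66]},
                   index=["a", "b", "c"])

# Any function taking a single value and returning a single value works.
f = lambda x: len(str(x))

# Element-wise over one column.
one_lengths = df4["one"].map(f)

# Element-wise over the whole frame. Newer pandas names this DataFrame.map;
# older releases only have applymap, so fall back to it when map is missing.
elementwise = getattr(df4, "map", None) or df4.applymap
all_lengths = elementwise(f)

print(one_lengths.tolist())         # [1, 2, 3]
print(all_lengths.values.tolist())  # [[1, 4], [2, 1], [3, 2]]
```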

@@ -712,13 +717,13 @@ make this simpler:
    :suppress:
 
    df2 = df.reindex(['a', 'b', 'c'], columns=['one', 'two'])
-   df2 = df2 - df2.mean()
+   df3 = df2 - df2.mean()
 
 
 .. ipython:: python
 
-   df
    df2
+   df3
    df.reindex_like(df2)
 
 Reindexing with ``reindex_axis``
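Reviewer note: the hunk above keeps ``df2`` as the reindexed frame and stores the demeaned copy separately as ``df3``. A minimal sketch of why that matters for ``reindex_like``, assuming a hypothetical ``df`` (the docs build theirs earlier in basics.rst):

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the docs build their own df earlier in basics.rst.
df = pd.DataFrame(np.arange(12.0).reshape(4, 3),
                  index=["a", "b", "c", "d"],
                  columns=["one", "two", "three"])

df2 = df.reindex(["a", "b", "c"], columns=["one", "two"])
df3 = df2 - df2.mean()  # demeaned copy; df2 itself is left untouched

# reindex_like only borrows the other frame's labels; values come from df.
aligned = df.reindex_like(df3)
print(aligned)
```

Since ``df2`` and ``df3`` share the same index and columns, ``df.reindex_like(df3)`` gives the same result as ``df.reindex_like(df2)``.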

@@ -1010,7 +1015,7 @@ Extracting Substrings
 ~~~~~~~~~~~~~~~~~~~~~
 
 The method ``extract`` (introduced in version 0.13) accepts regular expressions
-with match groups. Extracting a regular expression with one group returns 
+with match groups. Extracting a regular expression with one group returns
 a Series of strings.
 
 .. ipython:: python
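Reviewer note: a minimal sketch of the one-group case described in this hunk. It assumes a current pandas, where ``str.extract`` returns a DataFrame by default, so ``expand=False`` is passed to get the Series the text describes. The sample strings and pattern are illustrative only.

```python
import pandas as pd

s = pd.Series(["a1", "b2", "c3"])

# One capture group; "c3" does not match [ab], so its result is missing.
# expand=False asks for a Series rather than a one-column DataFrame.
digits = s.str.extract(r"[ab](\d)", expand=False)
print(digits)
```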

@@ -1043,7 +1048,7 @@ and optional groups like
 
 can also be used.
 
-Testing for Strings that Match or Contain a Pattern 
+Testing for Strings that Match or Contain a Pattern
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 In previous versions, *extracting* match groups was accomplished by ``match``,

@@ -1055,8 +1060,8 @@ The distinction between
 ``match`` and ``contains`` is strictness: ``match`` relies on
 strict ``re.match`` while ``contains`` relies on ``re.search``.
 
-In version 0.13, ``match`` performs its old, deprecated behavior by default, 
-but the new behavior is availabe through the keyword argument
+In version 0.13, ``match`` performs its old, deprecated behavior by default,
+but the new behavior is availabe through the keyword argument
 ``as_indexer=True``.
 
 Methods like ``match``, ``contains``, ``startswith``, and ``endswith`` take
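Reviewer note: the strictness difference between ``match`` and ``contains`` can be seen in a minimal sketch. It assumes a current pandas, where ``match`` returns booleans without needing the ``as_indexer`` keyword from the 0.13 text. The sample strings are illustrative only.

```python
import pandas as pd

s = pd.Series(["apple", "pineapple"])

# match anchors at the start of each string, like re.match.
starts_with_apple = s.str.match("apple")

# contains looks anywhere in each string, like re.search.
has_apple = s.str.contains("apple")

print(starts_with_apple.tolist())  # [True, False]
print(has_apple.tolist())          # [True, True]
```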

@@ -1118,13 +1123,13 @@ determine the sort order:
 
 .. ipython:: python
 
-   df.sort_index(by='two')
+   df1 = DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]})
+   df1.sort_index(by='two')
 
 The ``by`` argument can take a list of column names, e.g.:
 
 .. ipython:: python
 
-   df1 = DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]})
    df1[['one', 'two', 'three']].sort_index(by=['one','two'])
 
 Series has the method ``order`` (analogous to `R's order function
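Reviewer note: ``sort_index(by=...)`` as used in this hunk is the API of the pandas version being documented. Later releases dropped the ``by`` keyword from ``sort_index``. A minimal sketch of the same two sorts, assuming a current pandas and using ``sort_values`` as the equivalent:

```python
import pandas as pd

# Same frame as df1 in the hunk above.
df1 = pd.DataFrame({"one": [2, 1, 1, 1],
                    "two": [1, 3, 2, 4],
                    "three": [5, 4, 3, 2]})

# Sort rows by a single column.
by_two = df1.sort_values(by="two")

# Sort by several columns: ties in "one" are broken by "two".
by_one_two = df1[["one", "two", "three"]].sort_values(by=["one", "two"])

print(by_two.index.tolist())      # [0, 2, 1, 3]
print(by_one_two.index.tolist())  # [2, 1, 3, 0]
```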