DOC: doc corrections basics.rst / io.rst · pandas-dev/pandas@705b677
@@ -492,12 +492,11 @@ Similarly, you can get the most frequently occuring value(s) (the mode) of the v
 
 .. ipython:: python
 
-   data = [1, 1, 3, 3, 3, 5, 5, 7, 7, 7]
-   s = Series(data)
-   s.mode()
-   df = pd.DataFrame({"A": np.random.randint(0, 7, size=50),
-                      "B": np.random.randint(-10, 15, size=50)})
-   df.mode()
+   s5 = Series([1, 1, 3, 3, 3, 5, 5, 7, 7, 7])
+   s5.mode()
+   df5 = DataFrame({"A": np.random.randint(0, 7, size=50),
+                    "B": np.random.randint(-10, 15, size=50)})
+   df5.mode()
 
 
 Discretization and quantiling
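As a rough illustration of what "the mode" means for the sample values used in this hunk, the sketch below uses the standard library's `statistics.multimode` rather than pandas. It is only an analogue under that assumption, not the `Series.mode` implementation, and the variable names are chosen here for illustration.

```python
from statistics import multimode

# Same sample values as the Series in the hunk above.
data = [1, 1, 3, 3, 3, 5, 5, 7, 7, 7]

# multimode returns every value tied for the highest count,
# in the order each value is first encountered.
modes = multimode(data)
print(modes)  # [3, 7]
```

Both 3 and 7 occur three times, so both are returned, which is why the documentation speaks of the most frequent "value(s)".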
@@ -613,11 +612,17 @@ another array or value), the methods ``applymap`` on DataFrame and analogously
 ``map`` on Series accept any Python function taking a single value and
 returning a single value. For example:
 
+.. ipython:: python
+   :suppress:
+
+   df4 = df_orig.copy()
+
 .. ipython:: python
 
+   df4
    f = lambda x: len(str(x))
-   df['one'].map(f)
-   df.applymap(f)
+   df4['one'].map(f)
+   df4.applymap(f)
 
 ``Series.map`` has an additional feature which is that it can be used to easily
 "link" or "map" values defined by a secondary series. This is closely related
@@ -712,13 +717,13 @@ make this simpler:
    :suppress:
 
    df2 = df.reindex(['a', 'b', 'c'], columns=['one', 'two'])
-   df2 = df2 - df2.mean()
+   df3 = df2 - df2.mean()
 
 
 .. ipython:: python
 
-   df
    df2
+   df3
    df.reindex_like(df2)
 
 Reindexing with ``reindex_axis``
@@ -1010,7 +1015,7 @@ Extracting Substrings
 ~~~~~~~~~~~~~~~~~~~~~
 
 The method ``extract`` (introduced in version 0.13) accepts regular expressions
-with match groups. Extracting a regular expression with one group returns
+with match groups. Extracting a regular expression with one group returns
 a Series of strings.
 
 .. ipython:: python
@@ -1043,7 +1048,7 @@ and optional groups like
 
 can also be used.
 
-Testing for Strings that Match or Contain a Pattern
+Testing for Strings that Match or Contain a Pattern
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 In previous versions, *extracting* match groups was accomplished by ``match``,
@@ -1055,8 +1060,8 @@ The distinction between
 ``match`` and ``contains`` is strictness: ``match`` relies on
 strict ``re.match`` while ``contains`` relies on ``re.search``.
 
-In version 0.13, ``match`` performs its old, deprecated behavior by default,
-but the new behavior is availabe through the keyword argument
+In version 0.13, ``match`` performs its old, deprecated behavior by default,
+but the new behavior is availabe through the keyword argument
 ``as_indexer=True``.
 
 Methods like ``match``, ``contains``, ``startswith``, and ``endswith`` take
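The strictness difference described in this hunk comes from the two standard-library functions it names. A minimal sketch using only Python's `re` module (the pattern and sample strings are made up for illustration, and the pandas string methods themselves are not called here):

```python
import re

pattern = r"\d+"
values = ["123abc", "abc123", "abc"]

# re.match succeeds only when the pattern matches at the start of the string.
matched = [re.match(pattern, v) is not None for v in values]

# re.search succeeds when the pattern matches anywhere in the string.
found = [re.search(pattern, v) is not None for v in values]

print(matched)  # [True, False, False]
print(found)    # [True, True, False]
```

The string `"abc123"` is the case that separates the two: it contains digits, but not at the beginning, so only the search-based test reports a hit.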
@@ -1118,13 +1123,13 @@ determine the sort order:
 
 .. ipython:: python
 
-   df.sort_index(by='two')
+   df1 = DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]})
+   df1.sort_index(by='two')
 
 The ``by`` argument can take a list of column names, e.g.:
 
 .. ipython:: python
 
-   df1 = DataFrame({'one':[2,1,1,1],'two':[1,3,2,4],'three':[5,4,3,2]})
    df1[['one', 'two', 'three']].sort_index(by=['one','two'])
 
 Series has the method ``order`` (analogous to `R's order function