Commit c33eb36
DOC: minor whatsnew edits
1 parent f4de157 commit c33eb36

2 files changed: +55 -51 lines changed

doc/source/reshaping.rst

Lines changed: 37 additions & 34 deletions
@@ -28,7 +28,7 @@ Reshaping by pivoting DataFrame objects
    ...: 'variable' : np.asarray(frame.columns).repeat(N),
    ...: 'date' : np.tile(np.asarray(frame.index), K)}
    ...: columns = ['date', 'variable', 'value']
-   ...: return DataFrame(data, columns=columns)
+   ...: return pd.DataFrame(data, columns=columns)
    ...:
 
 In [3]: df = unpivot(tm.makeTimeDataFrame())
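For context, the full ``unpivot`` helper this hunk edits can be sketched as below; the function body is reproduced from the surrounding diff context, while the sample frame is a stand-in for ``tm.makeTimeDataFrame()``, which is internal pandas test machinery:

```python
import numpy as np
import pandas as pd

def unpivot(frame):
    # Reconstruction of the docs' helper; lines above the edited one come
    # from the hunk's context, the final line uses the corrected pd. prefix.
    N, K = frame.shape
    data = {'value': frame.values.ravel('F'),
            'variable': np.asarray(frame.columns).repeat(N),
            'date': np.tile(np.asarray(frame.index), K)}
    columns = ['date', 'variable', 'value']
    return pd.DataFrame(data, columns=columns)

# Hypothetical stand-in for tm.makeTimeDataFrame(): a small dated frame.
frame = pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=['A', 'B'],
                     index=pd.date_range('2000-01-01', periods=3))
long_form = unpivot(frame)  # 3 rows x 2 columns -> 6 long-form rows
```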
@@ -318,8 +318,8 @@ some very expressive and fast data manipulations.
 df.mean().unstack(0)
 
 
-Pivot tables and cross-tabulations
-----------------------------------
+Pivot tables
+------------
 
 .. _reshaping.pivot:
 
@@ -371,7 +371,7 @@ Also, you can use ``Grouper`` for ``index`` and ``columns`` keywords. For detail
 
 .. ipython:: python
 
-   pd.pivot_table(df, values='D', index=Grouper(freq='M', key='F'), columns='C')
+   pd.pivot_table(df, values='D', index=pd.Grouper(freq='M', key='F'), columns='C')
 
 You can render a nice output of the table omitting the missing values by
 calling ``to_string`` if you wish:
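The fix above qualifies ``Grouper`` with the ``pd.`` namespace. A runnable sketch of the corrected call, using a hypothetical frame with the same column names as the docs' example ('F' holds dates, 'C' a category, 'D' the values):

```python
import pandas as pd

# Made-up data mirroring the docs' column layout.
df = pd.DataFrame({'F': pd.date_range('2016-01-01', periods=6, freq='15D'),
                   'C': ['x', 'y'] * 3,
                   'D': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]})

# pd.Grouper buckets the 'F' dates by month to form the pivot index.
table = pd.pivot_table(df, values='D', index=pd.Grouper(freq='M', key='F'),
                       columns='C', aggfunc='mean')
```

The six dates span three months, so the pivot has three monthly rows and one column per category.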
@@ -383,11 +383,23 @@ calling ``to_string`` if you wish:
 
 Note that ``pivot_table`` is also available as an instance method on DataFrame.
 
+.. _reshaping.pivot.margins:
+
+Adding margins
+~~~~~~~~~~~~~~
+
+If you pass ``margins=True`` to ``pivot_table``, special ``All`` columns and
+rows will be added with partial group aggregates across the categories on the
+rows and columns:
+
+.. ipython:: python
+
+   df.pivot_table(index=['A', 'B'], columns='C', margins=True, aggfunc=np.std)
+
 .. _reshaping.crosstabulations:
 
 Cross tabulations
-~~~~~~~~~~~~~~~~~
-
+-----------------
 
 Use the ``crosstab`` function to compute a cross-tabulation of two (or more)
 factors. By default ``crosstab`` computes a frequency table of the factors
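The relocated "Adding margins" section can be exercised with a small stand-in frame (hypothetical data with the docs' column names; ``aggfunc='mean'`` replaces ``np.std`` here purely for a deterministic result):

```python
import pandas as pd

# Hypothetical stand-in for the docs' df, same column names.
df = pd.DataFrame({'A': ['one', 'one', 'two', 'two'],
                   'B': ['x', 'y', 'x', 'y'],
                   'C': ['c1', 'c2', 'c1', 'c2'],
                   'D': [1.0, 2.0, 3.0, 4.0]})

# margins=True appends special 'All' rows/columns of partial aggregates.
table = df.pivot_table(values='D', index=['A', 'B'], columns='C',
                       margins=True, aggfunc='mean')
```

The last row and the ``All`` column hold the partial aggregates; the bottom-right cell is the grand aggregate over every value of ``D``.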
@@ -401,11 +413,11 @@ It takes a number of arguments
 the factors
 - ``aggfunc``: function, optional, If no values array is passed, computes a
 frequency table
-- ``rownames``: sequence, default None, must match number of row arrays passed
-- ``colnames``: sequence, default None, if passed, must match number of column
+- ``rownames``: sequence, default ``None``, must match number of row arrays passed
+- ``colnames``: sequence, default ``None``, if passed, must match number of column
 arrays passed
-- ``margins``: boolean, default False, Add row/column margins (subtotals)
-- ``normalize``: boolean, {'all', 'index', 'columns'}, or {0,1}, default False.
+- ``margins``: boolean, default ``False``, Add row/column margins (subtotals)
+- ``normalize``: boolean, {'all', 'index', 'columns'}, or {0,1}, default ``False``.
 Normalize by dividing all values by the sum of values.
 
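A quick illustration of the ``rownames``/``colnames`` and ``margins`` arguments described in the list above (the two factor Series are made up):

```python
import pandas as pd

# Two factors to cross-tabulate.
animal = pd.Series(['cat', 'dog', 'cat', 'dog'])
size = pd.Series(['big', 'big', 'small', 'small'])

# rownames/colnames label the result's axes (one name per factor array);
# margins=True adds 'All' subtotal rows/columns.
ct = pd.crosstab(animal, size, rownames=['animal'], colnames=['size'],
                 margins=True)
```

Each of the four (animal, size) pairs occurs once, so every cell is 1 and the grand total in the ``All``/``All`` corner is 4.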

@@ -427,11 +439,14 @@ If ``crosstab`` receives only two Series, it will provide a frequency table.
 
 .. ipython:: python
 
-   df = pd.DataFrame({'a': [1, 2, 2, 2, 2], 'b': [3, 3, 4, 4, 4],
-                      'c': [1, 1, np.nan, 1, 1]})
+   df = pd.DataFrame({'A': [1, 2, 2, 2, 2], 'B': [3, 3, 4, 4, 4],
+                      'C': [1, 1, np.nan, 1, 1]})
    df
 
-   pd.crosstab(df.a, df.b)
+   pd.crosstab(df.A, df.B)
+
+Normalization
+~~~~~~~~~~~~~
 
 .. versionadded:: 0.18.1
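The renamed example columns behave exactly as before; with only two Series, ``crosstab`` yields a plain frequency table:

```python
import numpy as np
import pandas as pd

# The docs' frame as renamed in this commit (uppercase A/B/C columns).
df = pd.DataFrame({'A': [1, 2, 2, 2, 2], 'B': [3, 3, 4, 4, 4],
                   'C': [1, 1, np.nan, 1, 1]})

# With only two Series, crosstab counts co-occurrences of (A, B) pairs.
freq = pd.crosstab(df.A, df.B)
```

The pair (A=2, B=4) occurs three times and (A=1, B=3) once, which is what the resulting table records.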

@@ -440,49 +455,38 @@ using the ``normalize`` argument:
 
 .. ipython:: python
 
-   pd.crosstab(df.a, df.b, normalize=True)
+   pd.crosstab(df.A, df.B, normalize=True)
 
 ``normalize`` can also normalize values within each row or within each column:
 
 .. ipython:: python
 
-   pd.crosstab(df.a, df.b, normalize='columns')
+   pd.crosstab(df.A, df.B, normalize='columns')
 
 ``crosstab`` can also be passed a third Series and an aggregation function
 (``aggfunc``) that will be applied to the values of the third Series within each
 group defined by the first two Series:
 
 .. ipython:: python
 
-   pd.crosstab(df.a, df.b, values=df.c, aggfunc=np.sum)
+   pd.crosstab(df.A, df.B, values=df.C, aggfunc=np.sum)
 
-And finally, one can also add margins or normalize this output.
+Adding Margins
+~~~~~~~~~~~~~~
 
-.. ipython:: python
-
-   pd.crosstab(df.a, df.b, values=df.c, aggfunc=np.sum, normalize=True,
-               margins=True)
-
-.. _reshaping.pivot.margins:
-
-Adding margins (partial aggregates)
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-If you pass ``margins=True`` to ``pivot_table``, special ``All`` columns and
-rows will be added with partial group aggregates across the categories on the
-rows and columns:
+Finally, one can also add margins or normalize this output.
 
 .. ipython:: python
 
-   df.pivot_table(index=['A', 'B'], columns='C', margins=True, aggfunc=np.std)
+   pd.crosstab(df.A, df.B, values=df.C, aggfunc=np.sum, normalize=True,
+               margins=True)
 
 .. _reshaping.tile:
+.. _reshaping.tile.cut:
 
 Tiling
 ------
 
-.. _reshaping.tile.cut:
-
 The ``cut`` function computes groupings for the values of the input array and
 is often used to transform continuous variables to discrete or categorical
 variables:
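The reorganized crosstab material above (normalization, ``aggfunc``, margins) can be exercised together on the same example frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 2, 2], 'B': [3, 3, 4, 4, 4],
                   'C': [1, 1, np.nan, 1, 1]})

# normalize=True divides each cell by the grand total of the counts.
norm = pd.crosstab(df.A, df.B, normalize=True)

# A third Series plus aggfunc aggregates df.C within each (A, B) cell;
# the NaN in df.C is simply dropped from its cell's sum.
summed = pd.crosstab(df.A, df.B, values=df.C, aggfunc='sum')

# And both at once: aggregated, normalized, with 'All' margins appended.
both = pd.crosstab(df.A, df.B, values=df.C, aggfunc='sum',
                   normalize=True, margins=True)
```

In ``norm``, the (2, 4) cell is 3 of the 5 observations, i.e. 0.6; in ``summed`` the same cell sums the two non-NaN values of ``C`` to 2.0.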
@@ -491,7 +495,6 @@ variables:
 
    ages = np.array([10, 15, 13, 12, 23, 25, 28, 59, 60])
 
-
    pd.cut(ages, bins=3)
 
 If the ``bins`` keyword is an integer, then equal-width bins are formed.
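Beyond the integer-``bins`` form shown in the hunk, ``cut`` also accepts explicit bin edges; the edges and labels below are illustrative:

```python
import numpy as np
import pandas as pd

ages = np.array([10, 15, 13, 12, 23, 25, 28, 59, 60])

# An integer `bins` forms that many equal-width intervals over the data range.
cats = pd.cut(ages, bins=3)

# Explicit edges plus `labels` give the buckets readable names (made up here).
groups = pd.cut(ages, bins=[9, 18, 35, 70],
                labels=['youth', 'young adult', 'senior'])
```

With the explicit edges, the four ages at or below 18 land in the ``youth`` bucket.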

doc/source/whatsnew/v0.18.1.txt

Lines changed: 18 additions & 17 deletions
@@ -1,24 +1,25 @@
 .. _whatsnew_0181:
 
-v0.18.1 (April ??, 2016)
-------------------------
+v0.18.1 (May ??, 2016)
+----------------------
 
 This is a minor bug-fix release from 0.18.0 and includes a large number of
 bug fixes along with several new features, enhancements, and performance improvements.
 We recommend that all users upgrade to this version.
 
 Highlights include:
 
+- ``.groupby(...)`` has been enhanced to provide convenient syntax when working with ``.rolling(..)``, ``.expanding(..)`` and ``.resample(..)`` per group, see :ref:`here <whatsnew_0181.deferred_ops>`
+- ``pd.to_datetime()`` has gained the ability to assemble dates from a ``DataFrame``, see :ref:`here <whatsnew_0181.enhancements.assembling>`
 - Custom business hour offset, see :ref:`here <whatsnew_0181.enhancements.custombusinesshour>`.
-
+- Many bug fixes in the handling of ``sparse``, see :ref:`here <whatsnew_0181.sparse>`
 
 .. contents:: What's new in v0.18.1
     :local:
     :backlinks: none
 
 .. _whatsnew_0181.new_features:
 
-- ``.groupby(...)`` has been enhanced to provide convenient syntax when working with ``.rolling(..)``, ``.expanding(..)`` and ``.resample(..)`` per group, see :ref:`here <whatsnew_0181.deferred_ops>`
 
 New features
 ~~~~~~~~~~~~
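The two new highlight bullets can be sketched as follows; the APIs are as described in the release notes, the data is made up:

```python
import pandas as pd

# Assembling datetimes from a DataFrame's year/month/day columns
# (the pd.to_datetime() enhancement highlighted above).
parts = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5]})
when = pd.to_datetime(parts)

# Deferred per-group windowing: .rolling (or .expanding/.resample)
# chained directly after .groupby.
df = pd.DataFrame({'g': ['a', 'a', 'b', 'b'], 'v': [1.0, 2.0, 3.0, 4.0]})
rolled = df.groupby('g')['v'].rolling(2).mean()
```

Each group's first window is incomplete (NaN); the complete windows average the two values per group.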
@@ -149,6 +150,7 @@ Other Enhancements
 - ``pd.read_msgpack()`` now always gives writeable ndarrays even when compression is used (:issue:`12359`).
 - ``pd.read_msgpack()`` now supports serializing and de-serializing categoricals with msgpack (:issue:`12573`)
 - ``interpolate()`` now supports ``method='akima'`` (:issue:`7588`).
+- ``pd.read_excel()`` now accepts path objects (e.g. ``pathlib.Path``, ``py.path.local``) for the file path, in line with other ``read_*`` functions (:issue:`12655`)
 - ``Index.take`` now handles ``allow_fill`` and ``fill_value`` consistently (:issue:`12631`)
 - Added ``weekday_name`` as a component to ``DatetimeIndex`` and ``.dt`` accessor. (:issue:`11128`)
 
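The relocated ``read_excel`` bullet brings that reader in line with the rest of the ``read_*`` family's path-object support; a sketch using ``read_csv`` instead, so no Excel engine is required:

```python
import pathlib
import tempfile
import pandas as pd

# Write a tiny CSV, then read it back through a pathlib.Path (not a str).
path = pathlib.Path(tempfile.mkdtemp()) / 'data.csv'
path.write_text('x,y\n1,2\n3,4\n')
df = pd.read_csv(path)  # path objects are accepted across read_* functions
```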

@@ -375,11 +377,11 @@ New behaviour:
 In addition to this error change, several others have been made as well:
 
 - ``CParserError`` is now a ``ValueError`` instead of just an ``Exception`` (:issue:`12551`)
-- A ``CParserError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when the C engine cannot parse a column
-- A ``ValueError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when the C engine encounters a ``NaN`` value in an integer column
-- A ``ValueError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when ``true_values`` is specified, and the C engine encounters an element in a column containing unencodable bytes
-- ``pandas.parser.OverflowError`` exception has been removed and has been replaced with Python's built-in ``OverflowError`` exception
-- ``read_csv`` no longer allows a combination of strings and integers for the ``usecols`` parameter (:issue:`12678`)
+- A ``CParserError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when the C engine cannot parse a column (:issue:`12506`)
+- A ``ValueError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when the C engine encounters a ``NaN`` value in an integer column (:issue:`12506`)
+- A ``ValueError`` is now raised instead of a generic ``Exception`` in ``read_csv`` when ``true_values`` is specified, and the C engine encounters an element in a column containing unencodable bytes (:issue:`12506`)
+- ``pandas.parser.OverflowError`` exception has been removed and has been replaced with Python's built-in ``OverflowError`` exception (:issue:`12506`)
+- ``pd.read_csv()`` no longer allows a combination of strings and integers for the ``usecols`` parameter (:issue:`12678`)
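The tightened ``usecols`` validation described in the last bullet can be observed directly:

```python
import io
import pandas as pd

# Mixing strings and integers in usecols is rejected with a ValueError.
try:
    pd.read_csv(io.StringIO('a,b,c\n1,2,3\n'), usecols=['a', 1])
    raised = False
except ValueError:
    raised = True
```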
 
 .. _whatsnew_0181.deprecations:

@@ -459,23 +461,23 @@ Bug Fixes
 
 
 
-- Bug in ``read_csv`` with the C engine when specifying ``skiprows`` with newlines in quoted items (:issue:`10911`, `12775`)
-- Bug in ``DataFrame`` timezone lost when assigning tz-aware datetime ``Series`` with alignment (:issue `12981`)
+- Bug in ``read_csv`` with the C engine when specifying ``skiprows`` with newlines in quoted items (:issue:`10911`, :issue:`12775`)
+- Bug in ``DataFrame`` timezone lost when assigning tz-aware datetime ``Series`` with alignment (:issue:`12981`)
 
 
 
 
 - Bug in ``value_counts`` when ``normalize=True`` and ``dropna=True`` where nulls still contributed to the normalized count (:issue:`12558`)
 - Bug in ``Panel.fillna()`` ignoring ``inplace=True`` (:issue:`12633`)
-- Bug in ``read_csv`` when specifying ``names``, ```usecols``, and ``parse_dates`` simultaneously with the C engine (:issue:`9755`)
+- Bug in ``read_csv`` when specifying ``names``, ``usecols``, and ``parse_dates`` simultaneously with the C engine (:issue:`9755`)
 - Bug in ``read_csv`` when specifying ``delim_whitespace=True`` and ``lineterminator`` simultaneously with the C engine (:issue:`12912`)
 - Bug in ``Series.rename``, ``DataFrame.rename`` and ``DataFrame.rename_axis`` not treating ``Series`` as mappings to relabel (:issue:`12623`).
 - Clean in ``.rolling.min`` and ``.rolling.max`` to enhance dtype handling (:issue:`12373`)
 - Bug in ``groupby`` where complex types are coerced to float (:issue:`12902`)
 - Bug in ``Series.map`` raises ``TypeError`` if its dtype is ``category`` or tz-aware ``datetime`` (:issue:`12473`)
 
 
-- Bug in index coercion when falling back from ```RangeIndex``` construction (:issue:`12893`)
+- Bug in index coercion when falling back from ``RangeIndex`` construction (:issue:`12893`)
 
 - Bug in slicing subclassed ``DataFrame`` defined to return subclassed ``Series`` may return normal ``Series`` (:issue:`11559`)
 

@@ -494,9 +496,9 @@ Bug Fixes
 
 
 
-- Bug in ``fill_value`` is ignored if the argument to a binary operator is a constant (:issue `12723`)
+- Bug in ``fill_value`` is ignored if the argument to a binary operator is a constant (:issue:`12723`)
 
-- Bug in ``pd.read_html`` when using bs4 flavor and parsing table with a header and only one column (:issue `9178`)
+- Bug in ``pd.read_html`` when using bs4 flavor and parsing table with a header and only one column (:issue:`9178`)
 
 - Bug in ``pivot_table`` when ``margins=True`` and ``dropna=True`` where nulls still contributed to margin count (:issue:`12577`)
 - Bug in ``pivot_table`` when ``dropna=False`` where table index/column names disappear (:issue:`12133`)
@@ -505,5 +507,4 @@ Bug Fixes
 - Bug in ``Series.name`` when ``name`` attribute can be a hashable type (:issue:`12610`)
 - Bug in ``.describe()`` resets categorical columns information (:issue:`11558`)
 - Bug where ``loffset`` argument was not applied when calling ``resample().count()`` on a timeseries (:issue:`12725`)
-- ``pd.read_excel()`` now accepts path objects (e.g. ``pathlib.Path``, ``py.path.local``) for the file path, in line with other ``read_*`` functions (:issue:`12655`)
-- ``pd.read_excel()`` now accepts column names associated with keyword argument ``names``(:issue `12870`)
+- ``pd.read_excel()`` now accepts column names associated with keyword argument ``names`` (:issue:`12870`)
