Index repr changes to make them consistent #9901

Merged
merged 6 commits into from
May 9, 2015
5 changes: 1 addition & 4 deletions doc/source/advanced.rst
@@ -675,10 +675,7 @@ values NOT in the categories, similarly to how you can reindex ANY pandas index.
}).set_index('B')

In [11]: df3.index
Out[11]:
CategoricalIndex([u'a', u'a', u'b', u'b', u'c', u'a'],
categories=[u'a', u'b', u'c'],
ordered=False)
Out[11]: CategoricalIndex([u'a', u'a', u'b', u'b', u'c', u'a'], categories=[u'a', u'b', u'c'], ordered=False, name=u'B', dtype='category')

In [12]: pd.concat([df2,df3])
TypeError: categories must match existing categories when appending
51 changes: 50 additions & 1 deletion doc/source/whatsnew/v0.16.1.txt
@@ -13,7 +13,10 @@ Highlights include:
- New section on how-to-contribute to *pandas*, see :ref:`here <contributing>`
- Revised "Merge, join, and concatenate" documentation, including graphical examples to make it easier to understand each operation, see :ref:`here <merging>`
- New method ``sample`` for drawing random samples from Series, DataFrames and Panels. See :ref:`here <whatsnew_0161.enhancements.sample>`
- ``BusinessHour`` date-offset is now supported, see :ref:`here <timeseries.businesshour>`
- The default ``Index`` printing has changed to a more uniform format, see :ref:`here <whatsnew_0161.index_repr>`
- ``BusinessHour`` datetime-offset is now supported, see :ref:`here <timeseries.businesshour>`

- Further enhancement to the ``.str`` accessor to make string operations easier, see :ref:`here <whatsnew_0161.enhancements.string>`

.. contents:: What's new in v0.16.1
@@ -268,6 +271,52 @@ API changes

- By default, ``read_csv`` and ``read_table`` will now try to infer the compression type based on the file extension. Set ``compression=None`` to restore the previous behavior (no decompression). (:issue:`9770`)

.. _whatsnew_0161.index_repr:

Index Representation
~~~~~~~~~~~~~~~~~~~~

The string representation of ``Index`` and its sub-classes has now been unified. These will show a single-line display if there are few values; a wrapped multi-line display for many values (but fewer than ``display.max_seq_items``); and a truncated display (the head and tail of the data) if there are more than ``display.max_seq_items`` values. The formatting for ``MultiIndex`` is unchanged (a multi-line wrapped display). The display width responds to the option ``display.max_seq_items``, which defaults to 100. (:issue:`6482`)

Previous Behavior

.. code-block:: python

In [2]: pd.Index(range(4),name='foo')
Out[2]: Int64Index([0, 1, 2, 3], dtype='int64')

In [3]: pd.Index(range(104),name='foo')
Out[3]: Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, ...], dtype='int64')

In [4]: pd.date_range('20130101',periods=4,name='foo',tz='US/Eastern')
Out[4]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 00:00:00-05:00, ..., 2013-01-04 00:00:00-05:00]
Length: 4, Freq: D, Timezone: US/Eastern

In [5]: pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
Out[5]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-01-01 00:00:00-05:00, ..., 2013-04-14 00:00:00-04:00]
Length: 104, Freq: D, Timezone: US/Eastern

New Behavior

.. ipython:: python

pd.set_option('display.width',100)
pd.Index(range(4),name='foo')
pd.Index(range(25),name='foo')
pd.Index(range(104),name='foo')
pd.Index(['datetime', 'sA', 'sB', 'sC', 'flow', 'error', 'temp', 'ref', 'a_bit_a_longer_one']*2)
pd.CategoricalIndex(['a','bb','ccc','dddd'],ordered=True,name='foobar')
pd.CategoricalIndex(['a','bb','ccc','dddd']*10,ordered=True,name='foobar')
pd.CategoricalIndex(['a','bb','ccc','dddd']*100,ordered=True,name='foobar')
pd.CategoricalIndex(np.arange(1000),ordered=True,name='foobar')
pd.date_range('20130101',periods=4,name='foo',tz='US/Eastern')
pd.date_range('20130101',periods=25,name='foo',tz='US/Eastern')
pd.date_range('20130101',periods=104,name='foo',tz='US/Eastern')
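
The three display modes described in this section can be sketched in plain Python. This is only an illustration of the rule, not the pandas implementation: the helper name ``format_index_like``, its ``width`` parameter, and the choice of keeping at most 10 head and 10 tail items are all assumptions made for the example.

```python
import textwrap


def format_index_like(values, name=None, max_seq_items=100, width=80):
    """Hypothetical helper mimicking the single-line, wrapped and
    truncated displays. Not pandas code; the cut-offs are assumptions."""
    items = [repr(v) for v in values]
    if len(items) > max_seq_items:
        # Too many items: keep only the head and the tail of the data.
        n = max(1, min(max_seq_items // 2, 10))
        items = items[:n] + ["..."] + items[-n:]

    suffix = "" if name is None else ", name=%r" % name
    joined = ", ".join(items)
    one_line = "Index([%s]%s)" % (joined, suffix)
    if len(one_line) <= width:
        # Few values: single-line display.
        return one_line

    # Many values: wrap over several lines, aligned under the first item.
    body = textwrap.fill(joined, width=width, initial_indent="Index([",
                         subsequent_indent=" " * 7, break_on_hyphens=False)
    return "%s]%s)" % (body, suffix)


print(format_index_like(range(4), name="foo"))
print(format_index_like(range(25), name="foo", width=40))
print(format_index_like(range(104), name="foo"))
```

The sketch wraps on whitespace, so values whose ``repr`` contains spaces could be split across lines; a real formatter would need to treat each item as an unbreakable unit.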

.. _whatsnew_0161.deprecations:

Deprecations
28 changes: 18 additions & 10 deletions pandas/core/common.py
@@ -3132,7 +3132,7 @@ def in_ipython_frontend():
# working with straight ascii.


def _pprint_seq(seq, _nest_lvl=0, **kwds):
def _pprint_seq(seq, _nest_lvl=0, max_seq_items=None, **kwds):
"""
internal. pprinter for iterables. you should probably use pprint_thing()
rather than calling this directly.
@@ -3144,12 +3144,15 @@ def _pprint_seq(seq, _nest_lvl=0, **kwds):
else:
fmt = u("[%s]") if hasattr(seq, '__setitem__') else u("(%s)")

nitems = get_option("max_seq_items") or len(seq)
if max_seq_items is False:
nitems = len(seq)
else:
nitems = max_seq_items or get_option("max_seq_items") or len(seq)

s = iter(seq)
r = []
for i in range(min(nitems, len(seq))): # handle sets, no slicing
r.append(pprint_thing(next(s), _nest_lvl + 1, **kwds))
r.append(pprint_thing(next(s), _nest_lvl + 1, max_seq_items=max_seq_items, **kwds))
body = ", ".join(r)

if nitems < len(seq):
@@ -3160,7 +3163,7 @@ def _pprint_seq(seq, _nest_lvl=0, **kwds):
return fmt % body


def _pprint_dict(seq, _nest_lvl=0, **kwds):
def _pprint_dict(seq, _nest_lvl=0, max_seq_items=None, **kwds):
"""
internal. pprinter for iterables. you should probably use pprint_thing()
rather than calling this directly.
@@ -3170,11 +3173,14 @@ def _pprint_dict(seq, _nest_lvl=0, **kwds):

pfmt = u("%s: %s")

nitems = get_option("max_seq_items") or len(seq)
if max_seq_items is False:
nitems = len(seq)
else:
nitems = max_seq_items or get_option("max_seq_items") or len(seq)

for k, v in list(seq.items())[:nitems]:
pairs.append(pfmt % (pprint_thing(k, _nest_lvl + 1, **kwds),
pprint_thing(v, _nest_lvl + 1, **kwds)))
pairs.append(pfmt % (pprint_thing(k, _nest_lvl + 1, max_seq_items=max_seq_items, **kwds),
pprint_thing(v, _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)))

if nitems < len(seq):
return fmt % (", ".join(pairs) + ", ...")
@@ -3183,7 +3189,7 @@ def _pprint_dict(seq, _nest_lvl=0, **kwds):


def pprint_thing(thing, _nest_lvl=0, escape_chars=None, default_escapes=False,
quote_strings=False):
quote_strings=False, max_seq_items=None):
"""
This function is the sanctioned way of converting objects
to a unicode representation.
@@ -3202,6 +3208,8 @@ def pprint_thing(thing, _nest_lvl=0, escape_chars=None, default_escapes=False,
replacements
default_escapes : bool, default False
Whether the input escape characters replaces or adds to the defaults
max_seq_items : False, int, default None
Pass through to other pretty printers to limit sequence printing

Returns
-------
@@ -3240,11 +3248,11 @@ def as_escaped_unicode(thing, escape_chars=escape_chars):
return compat.text_type(thing)
elif (isinstance(thing, dict) and
_nest_lvl < get_option("display.pprint_nest_depth")):
result = _pprint_dict(thing, _nest_lvl, quote_strings=True)
result = _pprint_dict(thing, _nest_lvl, quote_strings=True, max_seq_items=max_seq_items)
elif is_sequence(thing) and _nest_lvl < \
get_option("display.pprint_nest_depth"):
result = _pprint_seq(thing, _nest_lvl, escape_chars=escape_chars,
quote_strings=quote_strings)
quote_strings=quote_strings, max_seq_items=max_seq_items)
elif isinstance(thing, compat.string_types) and quote_strings:
if compat.PY3:
fmt = "'%s'"
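
The ``max_seq_items`` handling added in this file can be summarised with a small standalone sketch. It assumes a plain dictionary ``OPTIONS`` as a stand-in for the pandas option registry, and the names ``resolve_nitems`` and ``pprint_sketch`` are hypothetical. It covers only lists, tuples and dicts and omits escaping, quoting and the nesting-depth limit, so it is a simplified model of the diff rather than the library code.

```python
# Stand-in for the pandas option registry (an assumption for this sketch);
# 100 mirrors the documented default of ``display.max_seq_items``.
OPTIONS = {"max_seq_items": 100}


def resolve_nitems(seq, max_seq_items=None):
    """Number of items to print, following the branch added in the diff."""
    if max_seq_items is False:
        # Explicit False: never truncate.
        return len(seq)
    # An int wins; None falls back to the option, then to len(seq).
    return max_seq_items or OPTIONS["max_seq_items"] or len(seq)


def pprint_sketch(thing, max_seq_items=None):
    """Simplified recursive printer that forwards max_seq_items."""
    if isinstance(thing, dict):
        nitems = resolve_nitems(thing, max_seq_items)
        pairs = ["%s: %s" % (pprint_sketch(k, max_seq_items),
                             pprint_sketch(v, max_seq_items))
                 for k, v in list(thing.items())[:nitems]]
        body = ", ".join(pairs)
        if nitems < len(thing):
            body += ", ..."
        return "{%s}" % body
    if isinstance(thing, (list, tuple)):
        nitems = resolve_nitems(thing, max_seq_items)
        it = iter(thing)  # iterate instead of slicing, as the original does
        parts = [pprint_sketch(next(it), max_seq_items)
                 for _ in range(min(nitems, len(thing)))]
        body = ", ".join(parts)
        if nitems < len(thing):
            body += ", ..."
        fmt = "[%s]" if isinstance(thing, list) else "(%s)"
        return fmt % body
    return str(thing)


# The limit applies at every nesting level because it is passed down.
print(pprint_sketch([[1, 2, 3], [4, 5, 6], [7, 8, 9]], max_seq_items=2))
# False disables truncation even when the option would apply.
print(pprint_sketch(list(range(150)), max_seq_items=False)[-10:])
```

Forwarding the argument on each recursive call is what lets a caller override the global option for one rendering without changing the option itself.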