
Add an ends function that shows both head and tail of the df #18691

Closed
icfly2 opened this issue Dec 8, 2017 · 7 comments · Fixed by #18749
Labels: Docs, Output-Formatting (__repr__ of pandas objects, to_string)

@icfly2

icfly2 commented Dec 8, 2017

Add the following function to pd.DataFrame and pd.Series:

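def ends(df, x=5):
    """Return both the head and the tail of a DataFrame or Series.

    Args:
        x (int): Optional number of rows to return from each of head and tail.
    """
    # A Series has no column axis, so report it as a single column.
    n_cols = df.shape[1] if df.ndim == 2 else 1
    print('{} rows x {} columns'.format(df.shape[0], n_cols))
    # pd.concat replaces df.append, which was removed in pandas 2.0.
    return pd.concat([df.head(x), df.tail(x)])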

Problem description

Often both the beginning and the end of a df are of interest, for example in a time series.

This leads to calling df.head() and df.tail() in two separate notebook cells, which is not only tedious but also clutters the notebook. A function that returns both, plus a printed count of rows and columns, would avoid this and make it easy to check whether the index matches the number of rows.

Example

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.rand(1500,6))
print(ends(df,2))

1500 rows x 6 columns

             0         1         2         3         4         5
0     0.949695  0.160928  0.434134  0.943103  0.477830  0.903479
1     0.736711  0.103746  0.028694  0.205910  0.226061  0.458452
1498  0.362950  0.586887  0.399681  0.115366  0.239049  0.386281
1499  0.018102  0.852198  0.880993  0.671604  0.705586  0.802237

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 8.1
machine: AMD64
processor: Intel64 Family 6 Model 61 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.0
pytest: 3.3.1
pip: 9.0.1
setuptools: 38.2.4
Cython: 0.26.1
numpy: 1.12.1
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 0.9.8
lxml: 3.8.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.13
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@jorisvandenbossche
Member

I thought we discussed this before, but I can't find such an issue.

That said, this is closely related to #17023: if we reduce the default number of rows, I think such a feature becomes less needed.

In general, we are not very keen on adding more methods to DataFrame, unless there is a very clear need.

@TomAugspurger
Contributor

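FWIW, my utils file has a helper built on a context manager for this:

import pandas as pd
from IPython.display import display  # built in to notebooks; needed in scripts

def show(df):
    # Temporarily cap the number of displayed rows, then restore the old setting.
    with pd.option_context("display.max_rows", 10):
        display(df)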

And I called it with df.pipe(show). Agreed that we shouldn't add a new method, especially given #17023.
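For readers outside a notebook, the same idea can be sketched without IPython. The helper below is hypothetical (the name `show_text` and its `max_rows` parameter are not from this thread or from pandas); it prints the truncated view and returns the frame, so it can sit in the middle of a method chain:

```python
import numpy as np
import pandas as pd


def show_text(df, max_rows=10):
    """Print a truncated view of df, then return df unchanged.

    Hypothetical helper for illustration; not part of pandas.
    """
    # The option is only changed inside the with block.
    with pd.option_context("display.max_rows", max_rows):
        print(df)
    return df


df = pd.DataFrame(np.random.rand(1500, 6))
# pipe passes df as the first argument and forwards the keyword argument.
same_df = df.pipe(show_text, max_rows=6)
```

Returning `df` is a design choice for chaining; the original `show` returns nothing, which is fine when it is the last call in a cell.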

@jreback
Contributor

jreback commented Dec 9, 2017

see #4193 and #1889

@jreback added the Output-Formatting (__repr__ of pandas objects, to_string) label Dec 9, 2017
@jreback
Contributor

jreback commented Dec 9, 2017

This seems to duplicate what we already have: by default the repr already shows both ends of a long frame. Use display.max_rows to control where the output is truncated.
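A minimal sketch of that behaviour, assuming a pandas version in which display.max_rows governs the truncated repr (newer versions also have display.min_rows, which can further limit the rows shown). The variable names are illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1500, 6))

# With max_rows below the frame length, the repr keeps the first and
# last rows and replaces the middle with a row of dots.
with pd.option_context("display.max_rows", 4):
    text = repr(df)

print(text)
```

The truncated repr also ends with a dimensions line, which covers the row and column count that the proposed ends function prints.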

@icfly2
Author

icfly2 commented Dec 13, 2017

Perhaps it would be worth adding a line to the docs, as this seems quite an obscure but very useful trick.

I would suggest referring to head in the docs for tail and vice versa, and adding the comment about display.max_rows.

I'm happy to try my hand at that addition if there is some general support for the idea.

@jreback
Contributor

jreback commented Dec 13, 2017

@icfly2 #18749 is already updating the docstrings, so it would be natural to add it there.

Love to have you contribute on other issues!

@adamrossnelson

adamrossnelson commented Aug 3, 2021

Worth a mention that this issue also relates to: #42837

The idea is that there is a way to preserve vertical screen space when inspecting 'tall' (many rows, few cols) data frames.

5 participants