Feature Request: .rand() to call random rows to complement .head() and .tail() #9569

Closed
nickeubank opened this issue Mar 1, 2015 · 1 comment · Fixed by #45302

Comments

@nickeubank
Contributor

.head() and .tail() are great tools for quick data interrogation, but when the data is sorted they are often far from representative. It would be great if there were a simple command to pull an arbitrary number of random rows and display them, as a more representative way to spot-check data.

It would behave something like:

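import numpy as np
import pandas as pd

def rand_rows(df, num_rows=5):
    # Draw row labels without replacement so no row is shown twice,
    # and cap the sample size at the number of rows available.
    size = min(num_rows, len(df))
    subset = np.random.choice(df.index.values, size=size, replace=False)
    return df.loc[subset]

a_data_frame = pd.DataFrame({'col1': range(10, 20), 'col2': range(20, 30)})
rand_rows(a_data_frame)
rand_rows(a_data_frame, 6)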
@TomAugspurger
Contributor

We already have an issue for that: #2419
It's just a matter of someone implementing it. Give it a go if you want to try! I don't think anyone has started.
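
For anyone landing on this thread later: the behavior requested above can be sketched with `DataFrame.sample`, assuming a pandas version that provides that method. The `rand_rows` name and the example frame come from the original post; the `seed` parameter is a hypothetical addition for reproducible spot checks, not part of the original request.

```python
import pandas as pd

a_data_frame = pd.DataFrame({'col1': range(10, 20), 'col2': range(20, 30)})

def rand_rows(df, num_rows=5, seed=None):
    # Cap the request at the frame length, since sampling without
    # replacement cannot return more rows than exist.
    n = min(num_rows, len(df))
    # `seed` is a hypothetical convenience argument; it is forwarded to
    # pandas' `random_state` so the same rows can be drawn again.
    return df.sample(n=n, random_state=seed)

sample_default = rand_rows(a_data_frame)       # 5 random rows
sample_six = rand_rows(a_data_frame, 6, seed=0)  # 6 random rows, repeatable
```

Sampling without replacement (the default for `sample`) keeps each displayed row distinct, which seems closer to the spot-check use case than drawing with replacement.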
