ENH: Implement DataFrame.value_counts #31247

Merged
merged 44 commits into from
Feb 26, 2020

@dsaxton (Member) commented Jan 23, 2020

This is picking up where #27350 left off because I think it'd be a nice feature to have. At least one thing that still needs to be done is implementing bins when we have only a single column in subset, in which case maybe we can just delegate to Series.value_counts.
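
The delegation idea mentioned above could look roughly like the following sketch. The helper name `value_counts_with_bins` and the sample data are hypothetical and not part of this PR; the sketch only relies on the existing `Series.value_counts(bins=...)`.

```python
import pandas as pd


def value_counts_with_bins(df, subset, bins):
    """Hypothetical helper: count binned values of a single column."""
    if len(subset) != 1:
        raise NotImplementedError("bins is only sketched for a single column")
    # Delegate to the existing Series method, which already supports bins.
    return df[subset[0]].value_counts(bins=bins)


# Made-up data for illustration.
df = pd.DataFrame({"a": [1, 2, 3, 10], "b": [0, 0, 1, 1]})
counts = value_counts_with_bins(df, ["a"], bins=2)
```

With more than one column in `subset`, this sketch raises instead of guessing how bins should apply across columns.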

@dsaxton dsaxton requested review from jreback and WillAyd January 23, 2020 14:40
@WillAyd (Member) left a comment

some minor nits on annotations but otherwise lgtm

@WillAyd WillAyd added the DataFrame DataFrame data structure label Jan 24, 2020
if subset is None:
    subset = self.columns.tolist()

# Some features not supported yet

Contributor commented:

i would remove these args then
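
For context, counting unique rows over `subset` can be sketched with public pandas API alone. This is a simplified illustration under assumptions, not the code under review: the function name `count_rows`, its defaults, and the sample data are made up for the example.

```python
import pandas as pd


def count_rows(df, subset=None, normalize=False, sort=True, ascending=False):
    """Illustrative stand-in for counting unique rows of a DataFrame."""
    if subset is None:
        subset = df.columns.tolist()
    # One group per distinct combination of values in the subset columns.
    counts = df.groupby(subset).size()
    if sort:
        counts = counts.sort_values(ascending=ascending)
    if normalize:
        # Convert absolute counts to proportions that sum to 1.
        counts = counts / counts.sum()
    return counts


# Made-up data for illustration.
df = pd.DataFrame({"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]})
counts = count_rows(df)
proportions = count_rows(df, normalize=True)
```

Only arguments that the body actually handles appear in the signature, which is in the spirit of the suggestion to drop unsupported ones.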

@jreback (Contributor) left a comment

cc @pandas-dev/pandas-core if any comments


.. versionadded:: 1.1.0

The returned Series will have a MultiIndex with one level per input

Contributor commented:

These should go in Notes
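
To illustrate the docstring sentence under review: assuming pandas 1.1 or later, calling `DataFrame.value_counts` on a two-column frame gives a Series indexed by a two-level MultiIndex. The sample data is made up for the example.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]})
counts = df.value_counts()

# One index level per input column, named after the columns.
levels = counts.index.nlevels
names = list(counts.index.names)
```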

@jreback (Contributor) commented Feb 9, 2020

also pls merge master

@jreback (Contributor) left a comment

small comments, ping on green.

@@ -0,0 +1,102 @@
import numpy as np

import pandas as pd

Contributor commented:

move to the methods/ subdir

@jreback jreback added this to the 1.1 milestone Feb 23, 2020

@dsaxton (Member, Author) commented Feb 23, 2020

Getting some seemingly unrelated test failures from test_value_counts_unique_nunique_null in pandas/tests/base/test_ops.py after merging master (I don't think any of this PR touches code tested there since the Series method is staying the same). Could this be related to the refactoring from #32046?

@dsaxton (Member, Author) commented Feb 25, 2020

@jreback Green after the CI fixes, thanks for the review

@jreback jreback merged commit 8b200c1 into pandas-dev:master Feb 26, 2020

@jreback (Contributor) commented Feb 26, 2020

thanks @dsaxton

very nice!

@dsaxton dsaxton deleted the df-val-counts branch February 26, 2020 02:24
roberthdevries pushed a commit to roberthdevries/pandas that referenced this pull request Mar 2, 2020
Labels: DataFrame (DataFrame data structure), Enhancement
Successfully merging this pull request may close these issues.

ENH: DataFrame.value_counts()
3 participants