[Improvement] Deterministic value_counts #15833
Comments
If you need this for testing, the easiest / best is simply to |
actually this is a duplicate of #12679 the guarantee on The
If you'd like to do a PR for #12679, that would be great.
Yes, I would need this for testing and would like to have the highest count of the data. With just getting the first item of the |
love to have a PR as above!
Code Sample, a copy-pastable example if possible
Problem description
Using value_counts in a test suite can be a problem when several of the resulting values have the same count, because their order changes from call to call, e.g.:
Expected Output
A stable, deterministic output, or optionally an additional sort of the keys when they have the same count.
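One possible workaround for tests, sketched under assumptions: the helper name `stable_value_counts` and the sample data are hypothetical and not part of pandas or of this issue. It breaks ties by sorting on the count (descending) and then on the key (ascending), so equal counts always come out in the same order. It assumes the keys are mutually comparable and contain no missing values.

```python
import pandas as pd


def stable_value_counts(series):
    """Hypothetical helper: value_counts with a deterministic tie-break."""
    counts = series.value_counts()
    # Put keys and counts side by side so both can take part in the sort.
    frame = pd.DataFrame({"key": counts.index, "count": counts.values})
    # Highest count first; among equal counts, smallest key first.
    frame = frame.sort_values(["count", "key"], ascending=[False, True])
    return pd.Series(frame["count"].values, index=frame["key"].values)


# "a" and "b" both occur twice, "c" and "d" once each.
result = stable_value_counts(pd.Series(["b", "a", "c", "a", "b", "d"]))
print(result)
```

Because the order depends only on the counts and the keys, repeated calls on the same data give the same result, which is usually what a test assertion needs.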
Output of pd.show_versions()
pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 2.0.0
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.5
pymysql: None
psycopg2: None
jinja2: 2.9.4
boto: 2.45.0
pandas_datareader: None