Rank by multiple columns #4311
Comments
You could set the columns as a MultiIndex and then just sort the index (though I think you have an issue about multi sort).
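A minimal sketch of the MultiIndex idea, assuming a toy frame with made-up columns `foo`, `bar` and `val` (none of this data is from the original report). Note that this only orders the rows; it does not produce a rank column by itself.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "foo": [2, 1, 1],
    "bar": ["a", "c", "b"],
    "val": [10, 20, 30],
})

# Move the ordering columns into a MultiIndex, then sort lexicographically
# on (foo, bar).
sorted_df = df.set_index(["foo", "bar"]).sort_index()
print(sorted_df)
```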
This was possible until 0.23, but is semi-broken currently. (Understandable, since I guess it's a bit of a hack.)
df['rankby'] = pd.Series(df[['foo', 'bar']].itertuples(index=False)).values
df['ranking'] = df.groupby('quux')['rankby'].rank(method='min')
Still works if you don't have a groupby.
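For reference, one way to get a min-style rank over several columns without building tuples is to sort, number the rows within each group, and then give tied rows the smallest number. This is only a sketch under assumptions: the frame and its values are made up, the column names `foo`, `bar` and `quux` are borrowed from the snippet above, and it is not what pandas does internally.

```python
import pandas as pd

# Hypothetical data; column names follow the snippet in this comment.
df = pd.DataFrame({
    "quux": ["a", "a", "a", "a", "b", "b"],
    "foo": [2, 1, 1, 1, 3, 3],
    "bar": ["x", "y", "y", "x", "z", "z"],
})

# 1. Order rows by the columns that define the ranking.
ordered = df.sort_values(["foo", "bar"])

# 2. Number rows within each quux group in that order (1-based).
pos = ordered.groupby("quux").cumcount() + 1

# 3. Rows with identical (quux, foo, bar) share the smallest position,
#    which mimics rank(method='min') and also works for string columns.
keys = [ordered["quux"], ordered["foo"], ordered["bar"]]
df["ranking"] = pos.groupby(keys).transform("min")  # aligns on the index
print(df)
```

The result is an integer column rather than the floats that `rank` returns.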
@imre-kerr this is a pretty old issue that, in the current iteration of pandas, mixes a few concepts. If you want row-numbering within a group you should use cumcount:
https://stackoverflow.com/questions/37997668/pandas-number-rows-within-group
The behavior you are referring to with rank would work if dealing only with numeric data. Rank within GroupBy operations on strings will raise an error, and in the future should also raise when called from a Series / Frame (see #19560).
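A small sketch of the cumcount approach, assuming a made-up frame where rows should be numbered within each `quux` group in `(foo, bar)` order. The data and the `row_number` column name are illustrative assumptions only.

```python
import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "quux": ["a", "a", "a", "b", "b"],
    "foo": [2, 1, 1, 5, 3],
    "bar": ["y", "z", "x", "p", "q"],
})

# cumcount numbers rows in the order it sees them, so sort first.
ordered = df.sort_values(["foo", "bar"])

# 0-based counter per group; add 1 for SQL-style ROW_NUMBER().
# Assignment aligns on the original index, so df keeps its row order.
df["row_number"] = ordered.groupby("quux").cumcount() + 1
print(df)
```

Unlike `rank`, this never produces ties: rows with identical keys still get distinct numbers.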
Closing as this issue is no longer relevant: rank can be used on numeric data in combination with groupby. For object data, cumcount can be used, though it would be up to the user to specify the desired order first.
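For the numeric case, a minimal sketch of rank combined with groupby, using made-up data (the values are assumptions; the column names `quux` and `foo` follow the earlier comment):

```python
import pandas as pd

# Hypothetical numeric data, for illustration only.
df = pd.DataFrame({
    "quux": ["a", "a", "a", "b", "b"],
    "foo": [10, 30, 10, 7, 5],
})

# Rank foo within each quux group; tied values share the lowest rank.
df["ranking"] = df.groupby("quux")["foo"].rank(method="min")
print(df)
```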
@WillAyd
I don't think this is possible atm, but it would be a nice enhancement, similar to how you can pass a list of columns to sort*.
http://stackoverflow.com/questions/17775935/sql-like-window-functions-in-pandas-row-numbering-in-python-pandas-dataframe
* probably there is a cheeky answer/hack to this question using sort...