Some refinements #5
@johnzangwill - friendly ping.
Thanks, Richard. This also raises the question: just how pathological data do I need to support?...
There will be another issue with this PR: just how to make Jeff happy!
@johnzangwill Certainly. I'm not concerned - it appears to me what you're doing is the correct way. Explaining why this is novel to pandas (no other op groups by all the remaining columns!) will go a long way, and I'll be happy to help out there. Wanted to get things in a good state first though.
I'll take a look at the reset index in the next few days.
I changed frame.py to cope with duplicate column labels. This makes
@rhshadrach I am not sure that I am winning this one pandas-dev#44755 (comment) which was started by you and your duplicate label examples. Please, either explain to Jeff why he is wrong, or tell me to just do it his way and forget about trapping duplicates in existing code. Thanks!
I've done some work and wanted to share; this seemed like the easiest way. No need to merge, just steal whatever you like. It is certainly a work in progress. The main change is moving away from name-based logic to positional logic. This allows the method to work when the columns have duplicate or unexpected names, e.g.
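As an illustrative stand-in (not the example from the original comment, and not code from the PR), here is a minimal sketch of why positional logic copes with duplicate labels where name-based logic does not. The frame and its values are made up for the illustration; only standard pandas indexing is used.

```python
import pandas as pd

# Hypothetical frame with a duplicated column label "a" (made-up data).
df = pd.DataFrame([[1, 2, 1], [1, 3, 1], [2, 2, 5]], columns=["a", "b", "a"])

# Name-based selection is ambiguous: it returns every column labelled "a",
# so the result is a two-column DataFrame rather than a single Series.
by_name = df["a"]
print(type(by_name).__name__, by_name.shape)

# Positional selection always yields exactly one column, whatever its label.
by_position = df.iloc[:, 0]
print(type(by_position).__name__, by_position.shape)
```

Code that assumes `df[name]` is a single column breaks on the first case, while code written against `df.iloc[:, i]` does not depend on the labels at all.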
It also avoids alignment when normalize is True, giving somewhat of a speedup in, e.g.
gives
Finally, it simplifies some of the logic involved with dropna and as_index.
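The point about avoiding alignment when normalize is True can be sketched roughly as follows. This is an assumption about the general idea, not the PR's actual implementation: dividing two Series makes pandas align their indexes first, whereas dividing the underlying NumPy arrays skips that step when the rows are already known to be in the same order. The data and variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group key "g" and a value column "v".
df = pd.DataFrame({"g": ["x", "x", "x", "y"], "v": [1, 1, 2, 1]})

# Count each (g, v) combination, then get the per-group total for each row.
counts = df.groupby(["g", "v"]).size()
totals = counts.groupby(level="g").transform("sum")

# Series / Series: pandas aligns the two indexes before dividing.
aligned = counts / totals

# Array / array: no alignment; valid here because `totals` was built
# from `counts` and therefore has the same row order.
positional = pd.Series(counts.to_numpy() / totals.to_numpy(), index=counts.index)

print(positional)
```

Both routes give the same proportions; the positional one simply does less work per call, which is where a speedup of the kind described would plausibly come from.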