PERF: Improve performance of CustomBusinessDay #8253

Closed
wants to merge 1 commit into from

Conversation

bjonen commented Sep 12, 2014

Closes #8236

CustomBusinessDay is now fast when a holiday calendar is supplied, and for increments other than +1 (multi-day and negative steps).

    import pandas as pd
    import numpy as np
    import datetime as dt

    date = pd.Timestamp('20120101')
    cbday = pd.offsets.CustomBusinessDay()
    date + cbday
    %timeit date + cbday
    10000 loops, best of 3: 17.2 µs per loop

    hdays = [dt.datetime(2013,1,1) for ele in range(1000)]
    cbdayh = pd.offsets.CustomBusinessDay.from_various(holidays=hdays)
    %timeit date + cbdayh
    %timeit date + 2 * cbdayh
    %timeit date - 1 * cbdayh
    %timeit date - 10 * cbday

    100000 loops, best of 3: 14.6 µs per loop
    100000 loops, best of 3: 19.7 µs per loop
    10000 loops, best of 3: 22.7 µs per loop
    10000 loops, best of 3: 22.3 µs per loop

Please check whether you agree with the way I now initialize custom offsets (using a classmethod).

bjonen commented Sep 12, 2014

I see the tests failing on several Python versions. Is there a quick way to run tox locally?

I tried tox --develop -e py34, but I get errors related to hashtable. My guess is that the C extensions built for the different Python versions conflict.

return dt

def from_various_func(cls,**kwds):
"""
Contributor:

why is this function needed? this completely changes the API.

Contributor Author:

Increments >1 are slow because a new instance of CustomBusinessDay is created whenever an operator is applied, e.g.

    date + 2 * cbday

Previously the constructor had to parse the holidays, which is very costly. The new function does the heavy lifting once, when you initialize CustomBusinessDay. After that, all computations are fast because the constructor is very short.

Leaving the API unchanged would mean changing the logic behind DateOffset.
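
To make the cost described above concrete, here is a minimal toy sketch. It is not the pandas implementation: SlowToyOffset, parse_calls and _parse are hypothetical names invented for illustration. It only shows how multiplying an offset builds a new instance and therefore repeats the holiday parsing in the constructor.

```python
import datetime as dt


class SlowToyOffset:
    """Toy model of an offset class (hypothetical, not pandas code)."""

    parse_calls = 0  # counts how often the holiday list is parsed

    def __init__(self, n=1, holidays=()):
        self.n = n
        self.raw_holidays = holidays
        # Costly step: runs on every construction, including the
        # implicit one triggered by `2 * offset`.
        self.holidays = frozenset(self._parse(h) for h in holidays)
        SlowToyOffset.parse_calls += 1

    @staticmethod
    def _parse(h):
        # Normalize datetimes to plain dates.
        return h.date() if isinstance(h, dt.datetime) else h

    def __rmul__(self, k):
        # Scaling returns a brand-new instance, so the holidays
        # are parsed again from scratch.
        return SlowToyOffset(n=self.n * k, holidays=self.raw_holidays)


hdays = [dt.datetime(2013, 1, 1) for _ in range(1000)]
offset = SlowToyOffset(holidays=hdays)   # first parse
doubled = 2 * offset                     # second parse, hidden in __rmul__
```

In this toy model every arithmetic expression such as 2 * offset pays the full parsing cost again, which is the behaviour the comment above attributes to the old constructor.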

Contributor:

You can simply pass in an already-created holidays instance, I think. You can't change the API like this: it is not intuitive and introduces a weird creation scheme. You might also be able to cache the holidays (in the class), which might be the best option.

Contributor Author:

OK, I agree that the proposed API is not intuitive. I will try to see whether we can cache the np.busdaycalendar in kwds.
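
A rough sketch of what caching the calendar in kwds could look like, under stated assumptions: ToyCustomBusinessDay, its builds counter and its apply method are hypothetical and do not reflect how pandas or the follow-up PR implement this. Only np.busdaycalendar and np.busday_offset are real NumPy calls.

```python
import numpy as np


class ToyCustomBusinessDay:
    """Hypothetical stand-in for an offset class, not pandas code."""

    builds = 0  # counts how often a busdaycalendar is constructed

    def __init__(self, n=1, **kwds):
        self.n = n
        self.kwds = kwds
        if "calendar" not in kwds:
            # Expensive step: parse the holidays once, then cache the
            # resulting calendar in kwds so copies can reuse it.
            kwds["calendar"] = np.busdaycalendar(
                holidays=kwds.get("holidays", [])
            )
            ToyCustomBusinessDay.builds += 1
        self.calendar = kwds["calendar"]

    def __rmul__(self, k):
        # The new instance receives the same kwds, including the cached
        # calendar, so no holiday parsing happens here.
        return ToyCustomBusinessDay(n=self.n * k, **self.kwds)

    def apply(self, date):
        # Shift `date` by n business days using the cached calendar.
        return np.busday_offset(
            date, self.n, roll="forward", busdaycal=self.calendar
        )


cbd = ToyCustomBusinessDay(holidays=["2013-01-01"])
two = 2 * cbd  # reuses cbd's calendar instead of rebuilding it
```

With this arrangement the public constructor signature stays the same, and only the first construction pays for parsing the holidays.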

jreback commented Sep 17, 2014

replaced by #8293

Successfully merging this pull request may close these issues.

CustomBusinessDay slow for increments <0 and >1