GroupBy like API for resample #1269

shoyer · 2017-02-14T17:46:02Z

Since we wrote resample in xarray, pandas updated resample to have a groupby-like API (e.g., df.resample('24H').mean() vs. the old df.resample('24H'), which used the mean by default).
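For reference, here is a minimal sketch of the groupby-like pandas call mentioned above. The data and column name are made up for illustration. It uses the daily rule "D" in place of '24H' and a Timedelta for the hourly spacing, since newer pandas releases deprecate the uppercase 'H' alias in favor of 'h'.

```python
import numpy as np
import pandas as pd

# Made-up example data: eight samples spaced six hours apart (two full days).
index = pd.date_range("2017-01-01", periods=8, freq=pd.Timedelta(hours=6))
df = pd.DataFrame({"x": np.arange(8.0)}, index=index)

# resample() returns a deferred, groupby-like object;
# the aggregation is chosen explicitly afterwards.
daily = df.resample("D").mean()

# Like a groupby object, the resampler can also be iterated group by group.
group_sizes = {label.date().isoformat(): len(chunk) for label, chunk in df.resample("D")}
```

Here daily holds one row per calendar day, and group_sizes maps each day to the number of samples that fell into it.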

It would be nice to redo the xarray resample API to match, e.g., ds.resample(time='24H').mean() vs ds.resample('time', '24H'). This would solve a few use cases, including grouped-resample arithmetic and iterating over groups, and would (mostly) take care of the need for pd.TimeGrouper support (#364). If we use **kwargs for matching dimension names, this could be done with a minimally painful deprecation cycle.
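One way the **kwargs idea could support both calling styles during a deprecation cycle is sketched below. This is only an illustration: parse_resample_args is a hypothetical helper, not xarray's actual code, and the warning category and messages are assumptions.

```python
import warnings


def parse_resample_args(*args, **indexer_kwargs):
    """Hypothetical helper: return (dim, freq) from either calling style."""
    if args and indexer_kwargs:
        raise TypeError("pass either (dim, freq) positionally or dim=freq, not both")
    if args:
        # Old style: resample('time', '24H'). Still accepted, but warns.
        if len(args) != 2:
            raise TypeError("the old calling style expects exactly (dim, freq)")
        warnings.warn(
            "resample(dim, freq) is deprecated; use resample(dim=freq) instead",
            FutureWarning,
            stacklevel=2,
        )
        dim, freq = args
        return dim, freq
    # New style: resample(time='24H'). The keyword name is the dimension.
    if len(indexer_kwargs) != 1:
        raise ValueError("resampling is only supported along a single dimension")
    ((dim, freq),) = indexer_kwargs.items()
    return dim, freq


new_style = parse_resample_args(time="24H")
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = parse_resample_args("time", "24H")
```

Because the old form is purely positional and the new form is purely keyword-based, the two can be told apart unambiguously, which is what keeps the deprecation cycle cheap.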


darothen · 2017-02-14T19:32:01Z

Let me dig into this a bit right now. My analysis project for this afternoon was already going to require digging into pandas' resampling in more depth anyways.

darothen · 2017-02-14T21:44:11Z

Assuming we want to stick with pd.TimeGrouper under the hood, the only sticking point I've come across so far is how to have the resulting Data{Array,set}GroupBy object "remember" the resampling dimension, e.g. if you have multi-dimensional data and want to compute time means you have to call

ds.resample(time='24H').mean('time')

or else mean will operate across all dimensions. Any thoughts, @shoyer?

max-sixty · 2017-02-15T18:47:20Z

Would be great to test for these sorts of issues if we redo this: #1269

max-sixty · 2017-02-15T18:49:32Z

the only sticking point I've come across so far is how to have the resulting Data{Array,set}GroupBy object "remember" the resampling dimension

I think an interface like ds.resample(time='24H').mean() would be much better. We could do that with a wrapper of pd.TimeGrouper that also had a dim field. Or inheritance 😨

darothen · 2017-02-15T18:59:17Z

@MaximilianR Oh, the interface is easy enough to do, even maintaining backwards-compatibility (already have that working). I was considering going the route done with GroupBy and the classes that compose it, like DatasetGroupBy... basically, we just record the wanted resampling dimension and inject the grouping/resampling operations we want. Also adds the ability to specialize methods like .first() and .last(), which is done under the current implementation.

But.... if there's a simpler way, that might be preferable!

shoyer · 2017-02-15T20:04:07Z

I think this could be done with minimal GroupBy subclasses to supply the default dimension argument for aggregation functions. All the machinery on groupby should already be there.
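A toy sketch of that idea, in plain Python rather than xarray's real classes: ToyGroupBy and ToyResample are hypothetical stand-ins, and each group is just a list of rows (the "time" axis) holding one value per "x" position. The subclass only remembers the resampling dimension and passes it as the default to the inherited reduction.

```python
from statistics import mean


class ToyGroupBy:
    """Toy stand-in for a groupby object; groups map a label to rows of values."""

    def __init__(self, groups):
        self.groups = groups  # {label: [[x0, x1, ...], ...]}, one row per time step

    def mean(self, dim=None):
        if dim is None:
            # No dimension given: reduce over every axis of each group.
            return {k: mean(v for row in rows for v in row) for k, rows in self.groups.items()}
        if dim == "time":
            # Reduce along time only, keeping one value per x position.
            return {k: [mean(col) for col in zip(*rows)] for k, rows in self.groups.items()}
        raise ValueError(f"unknown dimension {dim!r}")


class ToyResample(ToyGroupBy):
    """Remembers the resampling dimension and uses it as the reduction default."""

    def __init__(self, groups, dim):
        super().__init__(groups)
        self._dim = dim

    def mean(self, dim=None):
        return super().mean(self._dim if dim is None else dim)


groups = {"2017-01-01": [[1.0, 10.0], [3.0, 30.0]], "2017-01-02": [[5.0, 50.0]]}
plain = ToyGroupBy(groups).mean()                   # collapses time and x together
resampled = ToyResample(groups, dim="time").mean()  # collapses time only
```

With this arrangement, calling mean() on the resample object behaves like mean('time') on the plain groupby object, so the caller no longer has to repeat the dimension name.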


shoyer added contrib-help-wanted topic-pandas-like labels Feb 14, 2017

darothen mentioned this issue Feb 16, 2017

Groupby-like API for resampling #1272

Merged

9 tasks

darothen mentioned this issue Mar 24, 2017

Add 'count' as option for how in dataset resample #1327

Closed

jhamman closed this as completed in #1272 Sep 22, 2017

dcherian mentioned this issue Aug 12, 2018

New Resample-Syntax leading to cancellation of dimensions #2356

Closed
