Change grid level #18

benbovy · 2023-11-10T10:42:09Z

It would be useful to have utility methods to change the grid resolution, very much like this and this.

Proposed behavior

This would be pretty similar to Xarray .reindex(), but here DGGS-aware.

When downgrading the resolution, only the cell coordinate would change with child cell ids replaced by their parent cell id at the given resolution. The resulting coordinate has the same size but may have duplicate labels. Users could then perform aggregation with the method of their choice just by using Xarray's .groupby().
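A minimal sketch of the relabel-then-aggregate idea, assuming HEALPix cell ids in the nested scheme, where a parent id is the child id with two bits dropped per level. The helper name `parent_ids` is hypothetical and not part of xdggs, and plain NumPy stands in here for Xarray's .groupby():

```python
import numpy as np


def parent_ids(cell_ids, level, parent_level):
    """Map nested HEALPix cell ids at `level` to their ancestors at `parent_level`."""
    if parent_level > level:
        raise ValueError("parent_level must not exceed level")
    # Each level adds two bits to a nested id, so shifting them away yields the parent.
    return np.asarray(cell_ids, dtype=np.int64) >> (2 * (level - parent_level))


cell_ids = np.array([0, 1, 2, 3, 4, 5, 6, 7])
values = np.array([1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0])

# Relabeling only: same size as before, but with duplicate labels.
relabeled = parent_ids(cell_ids, level=2, parent_level=1)

# Aggregation is a separate, user-chosen step (an unweighted mean here).
unique_parents, inverse = np.unique(relabeled, return_inverse=True)
means = np.bincount(inverse, weights=values) / np.bincount(inverse)
```

With a Dataset, the last two lines would correspond to grouping by the relabeled cell coordinate and calling the aggregation of choice.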

When upgrading the resolution, the new cell coordinate has new labels (child cell ids) and the cell dimension may have an increased size, in which case the values of the data variables must be repeated according to the new cell ids along the cell dimension.
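The refinement direction can be sketched in the same way, again assuming nested HEALPix ids (each cell has four children per level, numbered `4 * parent + 0..3`). The function name `refine_nested` is made up for illustration:

```python
import numpy as np


def refine_nested(cell_ids, values, level, child_level):
    """Return child cell ids at `child_level` and the parent values repeated for them."""
    if child_level < level:
        raise ValueError("child_level must not be below level")
    n_children = 4 ** (child_level - level)
    cell_ids = np.asarray(cell_ids, dtype=np.int64)
    offsets = np.arange(n_children, dtype=np.int64)
    # Children of parent p occupy the contiguous id range [p * n, (p + 1) * n).
    child_ids = (cell_ids[:, None] * n_children + offsets).ravel()
    # Values are simply repeated, which is only valid for intensive quantities.
    child_values = np.repeat(np.asarray(values), n_children)
    return child_ids, child_values


child_ids, child_values = refine_nested([2, 5], [10.0, 20.0], level=1, child_level=2)
```

The cell dimension grows from 2 to 8 entries in this example, with each parent value copied to its four children.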

These .change_resolution() utility functions might actually be just what we need in order to align, merge or do other operations with multiple Datasets / DataArrays on the same DGGS but at different resolutions. Those are pretty simple and composable functions.

For simplicity, there would be no regridding or resampling involved here. There are two caveats, though:

- Extensive vs. intensive quantities: the behavior detailed above is correct for intensive quantities (i.e., independent of the cell area) but not for extensive quantities. For the latter, one generic solution could be to optionally output a "weights" coordinate (same dimension as the cells) computed from the cell areas and by counting duplicate cell ids. This weights coordinate could then be used to update the values of certain data variables (simple arithmetic) after upgrading the resolution. Unfortunately, in the case of resolution downgrading, weighted groupby is not yet supported in Xarray (see pydata/xarray#3937, "compose weighted with groupby, coarsen, resample, rolling etc.").
- In some grid systems (like H3), the boundaries of the cells do not match exactly across different resolutions. We might need more advanced regridding in this case, although the solutions above may already provide a good enough, first-order approximation.
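As an illustration of the "weights" idea for an extensive quantity after upgrading the resolution, here is a rough sketch. It assumes equal-area children (true for HEALPix, not for every DGGS), so the weight reduces to one over the number of duplicates of each parent id. In general the weight would be the ratio of child to parent cell area. The variable names are illustrative only:

```python
import numpy as np

# Parent id of each refined cell, and an extensive quantity (e.g. a count)
# that was naively repeated from parent to children.
parent_of_child = np.array([2, 2, 2, 2, 5, 5, 5, 5])
repeated = np.array([8.0, 8.0, 8.0, 8.0, 20.0, 20.0, 20.0, 20.0])

# Count how many children share each parent and turn that into a weight.
_, inverse, counts = np.unique(
    parent_of_child, return_inverse=True, return_counts=True
)
weights = 1.0 / counts[inverse]

# Simple arithmetic restores conservation: the children sum to the parent value.
redistributed = repeated * weights
```

After this step the total over all children equals the total over the original parents (28 in this example).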


VeckoTheGecko · 2025-05-14T15:26:56Z

Just had an in depth chat with @surgura about this. These are our thoughts:

When downgrading the resolution, only the cell coordinate would change with child cell ids replaced by their parent cell id at the given resolution. The resulting coordinate has the same size but may have duplicate labels. Users could then perform aggregation with the method of their choice just by using Xarray's .groupby().

This doesn't really make sense for us.
- Why would users want a dataset with just the relabeling instead of outright doing the groupby and applying an aggregator function? If users want to know the parent IDs for cells, they can use a different method.
- For all(?) use cases this would mean users have to do .dggs.change_resolution(level=2).groupby("cell_ids").mean() if they're downscaling or .dggs.change_resolution(level=6) if they're upscaling. The asymmetry in the calls between downscaling and upscaling seems unfriendly for the user.

Our proposal: ds.dggs.upscale/downscale/rescale

upscale(level: int) -> xr.DataArray | xr.Dataset
- does the upscaling to the level of interest, duplicating data from the parent to the child cells, returns dataset with level level
- Errors out if level < grid.level
- just returns if level == grid.level
downscale(level: int, agg: npfunc = np.mean) -> xr.DataArray | xr.Dataset
- does the downscaling to the level of interest, and does a (user provided, but defaulting to mean) non-weighted aggregation of the child to the parent cells, returns dataset with level level
- Errors if level > grid.level
- just returns if level == grid.level
rescale(level: int, downscale_agg: npfunc | None = None) -> xr.DataArray | xr.Dataset
- Calls upscale or downscale depending on whether level > grid.level or level < grid.level. Defaults the aggregator to that of downscale, but is customisable. Returns dataset with level level
-> This API seems clearer, and the user gets the output dataset in the format that they expect. Personally I really like the upscale/downscale/rescale naming since that seems quite clear to me (not sure if there is convention).
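To make the proposed behavior concrete, here is a rough sketch of the three functions operating on plain NumPy arrays rather than on a Dataset accessor. It assumes nested HEALPix ids and passes the current level explicitly as `grid_level`. None of this is existing xdggs code, and the array-based signatures are an assumption made for brevity:

```python
import numpy as np


def upscale(cell_ids, values, grid_level, level):
    """Duplicate parent values onto child cells at a finer `level`."""
    if level < grid_level:
        raise ValueError("upscale requires level >= grid level")
    if level == grid_level:
        return cell_ids, values
    n = 4 ** (level - grid_level)
    child_ids = (cell_ids[:, None] * n + np.arange(n)).ravel()
    return child_ids, np.repeat(values, n)


def downscale(cell_ids, values, grid_level, level, agg=np.mean):
    """Aggregate (unweighted) child values onto parent cells at a coarser `level`."""
    if level > grid_level:
        raise ValueError("downscale requires level <= grid level")
    if level == grid_level:
        return cell_ids, values
    parents = cell_ids >> (2 * (grid_level - level))
    unique = np.unique(parents)
    aggregated = np.array([agg(values[parents == p]) for p in unique])
    return unique, aggregated


def rescale(cell_ids, values, grid_level, level, downscale_agg=None):
    """Dispatch to upscale or downscale depending on the target level."""
    if level > grid_level:
        return upscale(cell_ids, values, grid_level, level)
    agg = np.mean if downscale_agg is None else downscale_agg
    return downscale(cell_ids, values, grid_level, level, agg=agg)


coarse_ids, coarse_vals = rescale(
    np.array([0, 1, 2, 3]), np.array([1.0, 2.0, 3.0, 6.0]), grid_level=1, level=0
)
fine_ids, fine_vals = rescale(np.array([1]), np.array([5.0]), grid_level=1, level=2)
```

In both calls the caller only states the target level, and the result already has the shape expected at that level.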

Anticipating questions:

Why upscale/downscale and not just rescale
- Having upscale and downscale also as part of the public API is nice sugar for users so that their code is more readable (after all, these are very different operations and allowing users to explicitly choose one execution path is nice)
What about non-HealPix grids? (or grids that aren't strictly hierarchical)
- idk, NotImplementedError for now. I think this functionality and API for now is super powerful to users and also quite easy to implement.
Why level?
- In line with discussion hierarchical operations on cell ids: parents #62 (comment) . I like it as well

Can I get cracking on this feature? Thoughts?

tinaok · 2025-05-14T15:47:38Z

@keewis @benbovy

benbovy · 2025-05-14T16:32:52Z

Can I get cracking on this feature? Thoughts?

Yes please go ahead!

I like your proposal very much.

keewis · 2025-05-14T16:46:24Z

I think there's two distinct (but interdependent) operations that both would make sense:

1. change the level of the cell ids (in line with hierarchical operations on cell ids: parents #62)
2. modify the data to change dimensions

(to be clear, I think 1 depends on 2, so I'd start with that)

The exact API will probably need some refinement (not everything that xarray provides has a numpy equivalent), but otherwise this looks fine. cdshealpix (rust) recently gained methods for hierarchical operations, and there's keewis/healpix-geo#24 which added low-level functions for 1 (I just didn't have time to make progress there recently)

strobpr · 2025-05-15T09:11:37Z

Apologies for intervening here very briefly, but I see the terminology that is being used above as quite critical:

'upscaling' and 'downscaling' are terms used in opposite way by different geoscience user communities. A much clearer choice would be 'coarsening' and 'refining'.

'resolution' is simply the wrong term to use when it comes to 'grid spacing'. The resolution of gridded data is to some extent related to (and limited by) the grid spacing, but the grid spacing is no useful indication of resolution (and there is no such thing as a 'grid resolution'). E.g. if you interpolate data onto a finer grid (smaller spacing) you do NOT change the resolution!!! I see an increasing misuse of these terms which undermines basic data understanding. A suitable neutral term would be re-gridding.

Please consider that words matter and help (or hamper) to understand concepts correctly. If possible stick as close as possible to DGGS terminology as proposed in the standard. If unsure please ask around before jumping on some jargon which might lead to further confusion and misunderstanding in our science.

Thanks!

d70-t · 2025-05-15T17:27:58Z

I think this would be a good feature. However I also agree that naming matters, and up/down isn't obviously defined.
I'm also wondering if we might want to support weighting? There are many cases where just some of the to-be aggregated cells contain data, and especially when doing multiple levels of aggregation across partially covered cells, un-weighted aggregation might return false results.
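A small numeric illustration of this concern, as a sketch only: two sibling cells are coarsened twice, one of them having data in a single child (NaN marks missing data). Carrying the count of valid children along as a weight is one possible approach, and is an assumption here rather than something agreed on in this thread:

```python
import numpy as np

# Rows are two sibling coarse cells, columns are their four fine children.
fine = np.array(
    [
        [1.0, np.nan, np.nan, np.nan],  # partially covered cell
        [3.0, 3.0, 3.0, 3.0],           # fully covered cell
    ]
)

# First aggregation step, remembering how many valid children fed each mean.
counts = np.sum(~np.isnan(fine), axis=1)
level1 = np.nansum(fine, axis=1) / counts

# Second aggregation step, with and without weights.
unweighted = level1.mean()
weighted = np.average(level1, weights=counts)

# Reference: aggregate all valid fine cells in a single step.
direct = np.nanmean(fine)
```

The unweighted two-step result (2.0) gives the single valid sample of the first cell as much influence as the four samples of the second, while the count-weighted result matches the single-step mean (2.6).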

benbovy · 2025-05-15T19:02:19Z

I also like coarsen and refine as method names. I'm wondering if we really need a rescale method?

(note about resolution: this issue is quite old and since then we changed it to level).

@keewis by API refinement are you thinking about ds.dggs.coarsen(...).mean() vs. ds.dggs.coarsen(..., agg=np.mean)? What are the pros and cons of these options here, apart from the first one being closer to Xarray's groupby, coarsen, rolling, etc.?

Agreed for supporting weights, this will also be useful if we eventually support DGGSs where the cell structures are not congruent across refinement levels (e.g., ISEA3H). Probably we'd need it too for refine, along with a pluggable partition function?

tinaok · 2025-05-16T00:31:06Z

cc @allixender, If you have comment from OGC DGGS SWG point of view for the naming issue.

VeckoTheGecko · 2025-05-16T09:02:06Z

I also like coarsen and refine as method names. I'm wondering if we really need a rescale method?

Agreed. Rescale is not particularly needed. I'll look to update my PR draft when I have more time (next weekend).

benbovy · 2025-05-16T09:08:14Z

What about a rescale or regrid method to remap data on a given absolute level value, while coarsen and refine both accept relative level values?

VeckoTheGecko · 2025-05-16T09:12:34Z

My proposal was only talking about absolute, but if we want to also support relative I think the following API would be nice (instead of a new method)

def coarsen(*, by: int, level: int):
    """
    by: relative
    level: absolute
    """

benbovy · 2025-05-16T11:05:05Z

nit: to_level seems even clearer than level.

keewis · 2025-05-16T11:41:14Z

what if we used to and by? Both are given in "level space" so I don't think people would be confused, and most importantly they read as "coarsen by {value}" and "coarsen to {value}", which feels pretty natural?
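One way the `to` / `by` pair could be resolved into a target level is sketched below. The helper name `resolve_coarsen_level` and the rule that exactly one of the two must be given are assumptions for illustration, not a decided API:

```python
def resolve_coarsen_level(current_level, *, to=None, by=None):
    """Return the absolute target level for 'coarsen to {value}' or 'coarsen by {value}'."""
    # Exactly one of the two keywords must be provided.
    if (to is None) == (by is None):
        raise ValueError("pass exactly one of `to` or `by`")
    target = to if to is not None else current_level - by
    # Coarsening can only move towards level 0, never to a finer level.
    if not 0 <= target <= current_level:
        raise ValueError(
            f"cannot coarsen from level {current_level} to level {target}"
        )
    return target


absolute = resolve_coarsen_level(6, to=4)
relative = resolve_coarsen_level(6, by=2)
```

Both calls resolve to level 4, which reads as "coarsen to 4" and "coarsen by 2" respectively.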

strobpr · 2025-05-16T16:35:30Z

'coarsen to' of course implies that you know which level you're at otherwise it might be void.

BTW 'refinement level' is the correct term for a specific grid instance in a DGGS from the SWG perspective. A possible issue with shortening it to 'level' is that there are many 'levels' in EO terminology. E.g. changing the 'refinement level' is different from changing the 'processing level'.

I believe I remember a discussion on 'resolution' somewhere on Github, but the issue tends to be forgotten and the term is hard to kill now that it's so widely (and wrongly) used.

Regarding changing the gridding of data (aka 're-sampling'), that is always critical and requires certain conditions to be done safely. E.g. coarsening is only allowed if the coarser zone is populated in a representative way by finer samples (ideally it is continuously sampled). If some samples are missing, that is okay as long as the remainder is not biased (no longer representative). Refining is even more complex, but that leads too far here...

Generally, please keep in mind that in the near future we will have to consider uncertainty in all operations we do with values (in fact 'data' always consists of 'value' and 'uncertainty', although the latter is often neglected). The effect of re-sampling on uncertainty is actually more important than the one on the values, and even if the values change only slightly, the uncertainty always increases with each re-sampling. That's why it should be limited to a minimum and done only with utmost care.

allixender · 2025-05-16T17:09:20Z

cc @allixender, If you have comment from OGC DGGS SWG point of view for the naming issue.

@strobpr is also a trusted advisor/contributor/member in the OGC DGGS SWG; his comments on level, refinement ratio / refinement level and resolution are on point.

The wording of level implies a better logic, because really we don't change the original data, and we are also not regridding in the traditional/conservative sense; we are mostly resampling and staying within the same DGGS grid logic. Thus, in a data fidelity sense, a user is only moving/zooming up or down. My 2 cents.

keewis · 2025-05-16T17:10:40Z

'coarsen to' of course implies that you know which level you're at otherwise it might be void.

xdggs has been built on the assumption that the level is known, so a lot of other things will fail if we don't know the level (and can't infer it from the cell ids)

I believe I remember a discussion on 'resolution' somewhere on Github

indeed, the discussion was mostly in #62, but also in #64 and #65. The conclusion was that in the context of a grid (and xdggs, at least, only deals with grids) the name level is unambiguous.

benbovy mentioned this issue Nov 13, 2023

Design docs #20

Merged

benbovy mentioned this issue May 22, 2024

cell hierarchy operations #38

Open

1 task

keewis mentioned this issue Jul 26, 2024

hierarchical operations on cell ids: parents #62

Draft

VeckoTheGecko mentioned this issue May 14, 2025

Add .downscale(), .upscale(), and .rescale() methods #141

Draft

4 tasks

benbovy changed the title ~~Change grid resolution~~ Change grid level May 19, 2025
