Skip to content

Leap time handling #445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 11, 2025
Merged

Leap time handling #445

merged 6 commits into from
Mar 11, 2025

Conversation

dsweber2
Copy link
Contributor

Checklist

Please:

  • Make sure this PR is against "dev", not "main".
  • Request a review from one of the current epipredict main reviewers:
    dajmcdon.
  • Make sure to bump the version number in DESCRIPTION and NEWS.md.
    Always increment the patch version number (the third number), unless you are
    making a release PR from dev to main, in which case increment the minor
    version number (the second number).
  • Describe changes made in NEWS.md, making sure breaking changes
    (backwards-incompatible changes to the documented interface) are noted.
    Collect the changes under the next release number (e.g. if you are on
    0.7.2, then write your changes under the 0.8 heading).
  • Consider pinning the epiprocess version in the DESCRIPTION file if
    • You anticipate breaking changes in epiprocess soon
    • You want to co-develop features in epipredict and epiprocess

Change explanations for reviewer

So weeks and days both have to deal with inconsistent numbers of days. What this does:

  1. separate out the "weird" day/week into a separate bin, labeled 999 and only use up to the "normal" number of days/weeks (so 365/52)
  2. if the window goes over the boundary (so includes both week 52 and week 1, or both day 59 and day 60), then add in the data from
  3. if the window is around the leap day/week, include the window days before and the window days after (so days 57, 58, 59, leap_day, 60, 61, 62)

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

  • Resolves #{issue number}

@dsweber2 dsweber2 requested a review from dajmcdon as a code owner February 27, 2025 23:28
@dsweber2 dsweber2 self-assigned this Feb 27, 2025
Copy link
Contributor

@dajmcdon dajmcdon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions, but I think there's a logic error when rolling with the iter. (Probably was there before as well)

R/step_climate.R Outdated
@@ -341,12 +350,63 @@ roll_modular_multivec <- function(col, .idx, weights, aggr, window_size, modulus
tidyr::nest(data = c(col, weights), .by = .idx)
out <- double(nrow(tib))
for (iter in seq_along(out)) {
entries <- (iter - window_size):(iter + window_size) %% modulus
# +1 from 1-indexing
entries <- ((iter - window_size):(iter + window_size) %% modulus)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
entries <- ((iter - window_size):(iter + window_size) %% modulus)
entries <- (iter - window_size):(iter + window_size) %% modulus

Don't need the outer paren do we?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm concerned this logic fails if given partial years. Same with keying on the iteration below. Suppose that I give you only data for January-March for 25 years (and ask for daily). Then max(iter)=91 so the leap branch will never happen. The same would be true if I removed the summer for every year and asked for epiweek climate. I think it needs to key on the .idx rather than iter.

The tests all presume you have every possible .idx covered and arranged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less than a full year of data is definitely a super pathological case. Switching it to matching .idx values rather than assuming the rows are right does seem to work though. I'm including a test in the next commit

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this resolved? (I don't think this is pathological at all. Giving flu seasons, August to April, every year for multiple years and cutting out the summer would break it.)

R/step_climate.R Outdated
# we need to grab just the window around the leap day on the leap day
if (iter == 366) {
entries <- ((59 - window_size):(59 + window_size - 1) %% modulus)
entries <- c(entries, 366)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need a

entries[entries == 0] <- 365

here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I suppose so, though it requires window_size = 60 to become relevant

dsweber2 and others added 2 commits March 5, 2025 12:43
Co-authored-by: Daniel McDonald <[email protected]>
Co-authored-by: Daniel McDonald <[email protected]>
@dsweber2 dsweber2 merged commit 7e10b52 into climatological Mar 11, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants