Skip to content

Move pipeline manual to repo #1983

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jul 9, 2024
Merged

Move pipeline manual to repo #1983

merged 15 commits into from
Jul 9, 2024

Conversation

nmdefries
Copy link
Contributor

@nmdefries nmdefries commented Jul 1, 2024

Description

Migrating from Gdrive, plus adding more detail about statistical review and documentation.

@nmdefries nmdefries requested a review from dsweber2 July 2, 2024 00:45
@nmdefries
Copy link
Contributor Author

@dsweber2 feel free to do as light or thorough a review as you want (or make changes directly if you prefer). I will be doing followup PRs to flesh out some of the sections.

@dsweber2
Copy link
Contributor

dsweber2 commented Jul 8, 2024

general suggestion: either do linebreaks at 80 characters, or put every sentence on a separate line (I generally prefer this strategy). This makes diffs easier to work with.

@nmdefries nmdefries force-pushed the ndefries/pipeline-manual branch from d36c69d to 8ecea58 Compare July 8, 2024 19:48

* `raw`: unsmoothed, _no longer used; if no smoothing is specified the signal is assumed to be "raw"_
* `7dav`: smoothed using a average over a rolling 7-day window; comes at the end of the name
* `smoothed`: smoothed using a more complex smoothing algorithm
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* `smoothed`: smoothed using a more complex smoothing algorithm
* `smoothed`: smoothed using a more complex smoothing algorithm; comes at the beginning of the name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to introduce these "tags" to describe how we've historically named signals, and also how we want to name signals in the future.

Example historical names:

JHU raw: confirmed_cumulative_prop
JHU smoothed: confirmed_7dav_cumulative_prop

Google raw: s01_raw_search
Google smoothed: s01_smoothed_search

Change smoothed: smoothed_adj_outpatient_covid

HHS raw: confirmed_admissions_covid_1d
HHS smoothed: confirmed_admissions_covid_1d_7dav

So "smoothed" shows up in various places. "7dav" has a similar meaning and in my mind should be in the same position. It also shows up in various positions in signal names but most consistently shows up at the end, as in the HHS smoothed signal.

Does that mean that we should say "smoothed" also comes at the end of the name, so that people use it like that in the future? I don't think any existing names put it at the end.

I haven't come up with an entire naming strategy, just some general tips. Do you have thoughts on naming format? Or should we leave it kind of open-ended?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, I guess I just didn't do a thorough enough survey to notice it showed up in the middle sometimes.

I would change the wording to be prescriptive if that's what you're trying to communicate, and saying it should be at the end does sound like a good idea.

It may be worth mentioning if we're mirroring another processed dataset, that keeping their naming conventions is probably preferable.

@dsweber2
Copy link
Contributor

dsweber2 commented Jul 8, 2024

retag me when you're happy w/it (or just want another review).
fun fact: if you're writing a review and new commits come in, it just disappears your saved comments without deleting them. Hopefully I got the duplicates.

@nmdefries nmdefries requested a review from dsweber2 July 9, 2024 20:21
Copy link
Contributor

@dsweber2 dsweber2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna mark as approved, though Minh's docs should get merged before this gets merged of course.

@nmdefries nmdefries merged commit 257179b into main Jul 9, 2024
16 checks passed
@nmdefries nmdefries deleted the ndefries/pipeline-manual branch July 9, 2024 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants