-
Notifications
You must be signed in to change notification settings - Fork 16
Move pipeline manual to repo #1983
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@dsweber2 feel free to do as light or thorough a review as you want (or make changes directly if you prefer). I will be doing followup PRs to flesh out some of the sections. |
general suggestion: either do linebreaks at 80 characters, or put every sentence on a separate line (I generally prefer this strategy). This makes diffs easier to work with. |
d36c69d
to
8ecea58
Compare
|
||
* `raw`: unsmoothed, _no longer used; if no smoothing is specified the signal is assumed to be "raw"_ | ||
* `7dav`: smoothed using a average over a rolling 7-day window; comes at the end of the name | ||
* `smoothed`: smoothed using a more complex smoothing algorithm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* `smoothed`: smoothed using a more complex smoothing algorithm | |
* `smoothed`: smoothed using a more complex smoothing algorithm; comes at the beginning of the name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to introduce these "tags" to describe how we've historically named signals, and also how we want to name signals in the future.
Example historical names:
JHU raw: confirmed_cumulative_prop
JHU smoothed: confirmed_7dav_cumulative_prop
Google raw: s01_raw_search
Google smoothed: s01_smoothed_search
Change smoothed: smoothed_adj_outpatient_covid
HHS raw: confirmed_admissions_covid_1d
HHS smoothed: confirmed_admissions_covid_1d_7dav
So "smoothed" shows up in various places. "7dav" has a similar meaning and in my mind should be in the same position. It also shows up in various positions in signal names but most consistently shows up at the end, as in the HHS smoothed signal.
Does that mean that we should say "smoothed" also comes at the end of the name, so that people use it like that in the future? I don't think any existing names put it at the end.
I haven't come up with an entire naming strategy, just some general tips. Do you have thoughts on naming format? Or should we leave it kind of open-ended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I guess I just didn't do a thorough enough survey to notice it showed up in the middle sometimes.
I would change the wording to be prescriptive if that's what you're trying to communicate, and saying it should be at the end does sound like a good idea.
It may be worth mentioning if we're mirroring another processed dataset, that keeping their naming conventions is probably preferable.
retag me when you're happy w/it (or just want another review). |
Co-authored-by: David Weber <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gonna mark as approved, though Minh's docs should get merged before this gets merged of course.
Description
Migrating from Gdrive, plus adding more detail about statistical review and documentation.