Visual checks: CPR num tests and test positivity at the county level #1513

nmdefries · 2022-02-04T18:06:00Z

Originating Asana task with line plots and choropleths for hospital admissions, test volume, and test positivity.

Outstanding question: Should we suppress test positivity for small counts? Small counts cause the signal to be highly variable.

Number of tests and test positivity at the county level need further visualization, stratified in some way to make plots easier to interpret.

nmdefries · 2022-02-11T15:22:01Z

Here are some additional county-level test positivity visualizations, broken down by high-medium-low test volume.

@ryantibs @krivard Let me know if any other plots would be useful.

In terms of variability of individual signals, things look okay (by eyeball) by the ~6th %ile mark. But if we used that as a threshold, we'd be discarding a lot of data. Is there precedent for this from any of the other indicators?

krivard · 2022-02-16T19:02:05Z

Can you produce the following:

| x axis | y axis | line color |
|---|---|---|
| day | number of counties with test volume <=Z | Z in 1, 2, 3, 4, 5, 10, 20, 50 |
| day | percentage of available counties with test volume <=Z | Z in 1, 2, 3, 4, 5, 10, 20, 50 |

where "available counties" are all counties with a numeric value reported for that day

> if we used that as a threshold, we'd be discarding a lot of data. Is there precedent for this from any of the other indicators?

There's definitely precedent; almost all our sample-based indicators do not report if a minimum sample size is not met. For testing, test volume === sample size.
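For reference, a minimal sketch of how the two requested series could be computed before plotting. The input shape (an iterable of `(day, county, test_volume)` tuples, with `None` for a missing value) and the function name `low_volume_summary` are assumptions made for illustration, not the pipeline's actual data structures.

```python
from collections import defaultdict

# Z values requested in the table above.
THRESHOLDS = (1, 2, 3, 4, 5, 10, 20, 50)


def low_volume_summary(rows, thresholds=THRESHOLDS):
    """Return {day: {Z: (count, percent)}} for counties with test volume <= Z.

    rows: iterable of (day, county, test_volume). A volume of None means the
    county reported no numeric value that day (assumed encoding), so it is
    not an "available county" and is left out of the denominator.
    """
    volumes_by_day = defaultdict(list)
    for day, _county, volume in rows:
        if volume is not None:
            volumes_by_day[day].append(volume)

    summary = {}
    for day, volumes in volumes_by_day.items():
        summary[day] = {}
        for z in thresholds:
            count = sum(1 for v in volumes if v <= z)
            summary[day][z] = (count, 100.0 * count / len(volumes))
    return summary
```

Each `(count, percent)` pair maps onto one point of the first and second plot respectively, with one line per Z.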

nmdefries · 2022-02-16T23:36:04Z

krivard · 2022-02-18T18:03:21Z

My gut feeling is to go with a threshold of 6 (i.e. drop everything with 5 or fewer total tests). That would give us a worst-case minimum nonzero test positivity of ~17% (1 positive out of 6 tests), which is higher than I'd like, but I also don't like the idea of suppressing more than 20% of our data even if it's only during the slow times.
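The worst-case figure follows from simple arithmetic: with a minimum of n tests, the smallest nonzero positivity is one positive result out of n. A quick sketch for comparing candidate thresholds (the function name is hypothetical):

```python
def min_nonzero_positivity(min_tests):
    """Smallest nonzero positivity, in percent, when at least `min_tests`
    tests are required: exactly one positive result among `min_tests`."""
    if min_tests < 1:
        raise ValueError("min_tests must be at least 1")
    return 100.0 / min_tests
```

For a threshold of 6 this gives 100/6, roughly 16.7%; a threshold of 10 would give 10%, at the cost of suppressing more rows.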

@ryantibs thoughts?

ryantibs · 2022-02-19T16:37:44Z

As a working solution to get us to move forward: this is fine with me. Thanks!

nmdefries · 2022-02-21T20:12:00Z

Most of the time (when not backfilling new signals), the pipeline processes a single new spreadsheet, for the current day.

Within a spreadsheet, test positivity is reported for a time period that overlaps with but is not the same as the period that test volume is reported for. For example, the 2022-01-07 spreadsheet reports test positivity for Dec 29-Jan 4 and test volume for Dec 25-31. We report these as 7-day averages assigned to the last day in the range, so positivity would be for Jan 4 and test volume for Dec 31.
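To make the mismatch concrete, here is a small sketch of the date arithmetic. The lags (3 days for positivity, 7 days for volume) are inferred from the single 2022-01-07 example above and are an assumption; they may not hold for every spreadsheet.

```python
from datetime import date, timedelta

# Assumed lags between the spreadsheet date and the last day of each
# reporting window, inferred only from the 2022-01-07 example.
POSITIVITY_END_LAG = timedelta(days=3)
VOLUME_END_LAG = timedelta(days=7)
WINDOW_SPAN = timedelta(days=6)  # a 7-day window, first and last day inclusive


def reference_windows(spreadsheet_date):
    """Return the (start, end) window for each signal in one spreadsheet.

    The end date is the reference date the 7-day average is assigned to.
    """
    positivity_end = spreadsheet_date - POSITIVITY_END_LAG
    volume_end = spreadsheet_date - VOLUME_END_LAG
    return {
        "positivity": (positivity_end - WINDOW_SPAN, positivity_end),
        "volume": (volume_end - WINDOW_SPAN, volume_end),
    }


windows = reference_windows(date(2022, 1, 7))
# Positivity covers 2021-12-29 through 2022-01-04;
# volume covers 2021-12-25 through 2021-12-31.
```

Under these assumed lags the two reference dates are 4 days apart.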

This means that sample size (test volume) values aren't the most appropriate to use for thresholding test positivity values found in the same spreadsheet. Potential approaches:

1. Do it anyway.
   - The simplest approach. The time periods are only a few days different.
2. Don't report new positivity values right away. Wait until test volume values for the right date range are published (4 days later), then threshold and publish positivity.
   - This requires re-processing the last few days of spreadsheets. Without additional logic, this will add duplicate entries to the API.
   - It's possible that a corresponding time period for test volume will never exist since spreadsheets aren't published on the weekend.
3. Do something else?

krivard · 2022-02-21T21:02:09Z

oh yuck, i'd forgotten about that.

@ryantibs this means that we can't easily fill in testing volume for the sample_size column in test positivity either. I'll add this as a discussion item for the next Leads meeting.

krivard · 2022-02-24T21:06:23Z

Discussion of decision and alternatives in PRD

TL;DR: given test positivity reference date X and test volume reference date Y coming from a single CPR file,

- generate test positivity signal files named for date X containing value from test positivity at X and stdev/sample_size from test volume at Y
- censor based on test volume at Y
- document this choice obsessively in code and API docs
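A minimal sketch of the censoring step described in the TL;DR, not the code merged in #1548. The names (`SignalRow`, `build_positivity_rows`), the dict-based inputs keyed by county, and the choice to drop counties with no volume at Y are all assumptions for illustration. The stdev column is left out, and the threshold of 6 is the working value agreed earlier in this thread.

```python
from dataclasses import dataclass

MIN_TESTS = 6  # working threshold: drop anything with 5 or fewer tests


@dataclass
class SignalRow:
    geo_id: str
    value: float        # test positivity at reference date X
    sample_size: float  # test volume at reference date Y (NOT date X)


def build_positivity_rows(positivity_at_x, volume_at_y, min_tests=MIN_TESTS):
    """Pair positivity at X with volume at Y from one CPR file and censor.

    positivity_at_x: {geo_id: positivity} for reference date X.
    volume_at_y: {geo_id: test volume} for reference date Y.
    Counties with no volume at Y are censored too (an assumption: without a
    sample size there is nothing to threshold on).
    """
    rows = []
    for geo_id, positivity in positivity_at_x.items():
        volume = volume_at_y.get(geo_id)
        if volume is None or volume < min_tests:
            continue  # censored based on test volume at Y
        rows.append(SignalRow(geo_id, positivity, volume))
    return rows
```

The resulting rows would go into the signal file named for date X, even though `sample_size` describes date Y.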

nmdefries · 2024-05-06T17:28:24Z

Add dsew keyword for searchability

nmdefries self-assigned this Feb 4, 2022

krivard mentioned this issue Feb 28, 2022

[cpr] Activate vaccination signals in production #1540

Merged

nmdefries mentioned this issue Mar 3, 2022

Filter out small test-volume signals in CPR #1548

Merged

krivard closed this as completed in #1548 Mar 9, 2022

nmdefries mentioned this issue Apr 26, 2024

nssp pipeline code #1952

Merged
