Skip to content

-Zinstrument-coverage showing missed coverage on _part_ of function signature (with proc-macro?) #86972

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
alex opened this issue Jul 8, 2021 · 22 comments
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) C-bug Category: This is a bug.

Comments

@alex
Copy link
Member

alex commented Jul 8, 2021

Starting with an apology that I have not yet figured out how to reduce this to something smaller than "our entire project".

We use -Zinstrument-coverage in pyca/cryptography for capturing coverage. With a recent upgrade to pyo3, we're seeing missing coverage for part of the function signature. A screenshot from codecov (https://codecov.io/gh/pyca/cryptography/src/120e804d8a36b4adfa71b2e5700a773ff9cfd19a/src/rust/src/ocsp.rs):

image

Having missing coverage on a function argument doesn't make a ton of sense.

The process by which we run this can be found here: https://github.com/pyca/cryptography/blob/main/.github/workflows/ci.yml#L207-L267

I am fearful that this is a challenging intersection between coverage and proc-macros, but as I said I regrettably haven't minimized this test case yet.

@alex alex added the C-bug Category: This is a bug. label Jul 8, 2021
@jonas-schievink jonas-schievink added the A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) label Jul 8, 2021
@alex
Copy link
Member Author

alex commented Jul 8, 2021

FYI: @richkadel

@richkadel
Copy link
Contributor

proc_macro is a very reasonable guess. Producing accurate coverage for macros, in general, is very hard to get perfect since they can generate almost anything. If you're able to reduce this at some point in the future, I'll try to take a deeper look, but I can't promise a quick fix.

@alex
Copy link
Member Author

alex commented Jul 8, 2021

I tried minimizing with the following macro:

#[proc_macro_attribute]
pub fn pyfunction(attr: proc_macro::TokenStream, input: proc_macro::TokenStream) -> proc_macro::TokenStream {
    let ast = syn::parse_macro_input!(input as syn::ItemFn);
    quote::quote!(
        #ast
    ).into()
}

This produces the following extremely confusing coverage report, where it seems that not only does the function not get covered, but also the test function isn't covered at all!

image

@alex
Copy link
Member Author

alex commented Jul 12, 2021

@davidhewitt this happened when we upgraded to pyo3 0.14. I skimmed that diff and couldn't see what might have led to a change in a coverage. Do you have any ideas?

@richkadel
Copy link
Contributor

Upgraded from which earlier version?

@alex
Copy link
Member Author

alex commented Jul 12, 2021

0.13.1 -- pyca/cryptography@120e804 is the commit that cause the loss of coverage

@richkadel
Copy link
Contributor

This produces the following extremely confusing coverage report, where it seems that not only does the function not get covered, but also the test function isn't covered at all!

Obviously I don't know how you invoked this, but based on the output you're showing here, it looks like you didn't invoke the test, when running the binary that produced those coverage results; in which case, the results are correct. The compiler compiled the test and the proc_macro, and shows the correct coverage spans, and if they were not executed, then 0 is correct.

I know that some proc_macros do work, but I'm sure the coverage results can be "fooled" by other proc macros. Things that factor into it are: Does the generated code (e.g., from quote!() or quote_spanned! produce a syntacticly similar result or not (for example, in this case, from something that looks like a function to something that is a function), and are the spans mapped correctly (or at least sensibly).

Since coverage has no insight into the semantic intent of a proc_macro, it may be impossible to infer a mapping that looks reasonable to the human eye.

In the case of your original example, since half of the function signature appears to have been executed, but the full function signature appears to have coverage, with the second half showing 0 executions, that likely means there are at least two execution paths (could be 2 or more functions, or could be branching logic in a single function, but my guess is the former).

My guess is there is a span produced for the full function signature for a function or closure that is not counted (covered by not called?) and another span produced for only part of the function signature (not covering the last argument or the return type) for some reason.

It could be that there is logic to why the proc_macro returns quoted result spans this way, but if not, then it may be just a less than ideal computation of the mapped span (a bug only noticeable in coverage results or perhaps in an diagnostic messages from the compiler) that, if corrected, would fix your coverage.

For example, I recently fixed a problem with a few instances of the following "mistake":

    let item = parse_macro_input!(item as syn::ItemFn);
    let syn::ItemFn { attrs, sig, vis: _, block } = item;
...
    let ret_type = sig.output;
    let inputs = sig.inputs;
    let span = sig.ident.span();
    let ident = sig.ident;
    let output = quote_spanned! {span=>
        fn #ident () #ret_type {
            async fn func(#inputs) #ret_type {      <-- These braces are redundant, and
                #block
            }                        <-- create an inconsistent span map of the function body
            ...
        }
    }

The #block includes the function body braces in the original source. The redundant braces in the proc_macro produce a function body that is in a different syntax context from the function signature, so coverage can't associate the function signature span with the function body span in the same way it does without a proc_macro, breaking the ability to show coverage results of anything in the original body!

This is pretty hard to figure out, and your problem is not that similar to this one, as far as I can tell (looking at the pyo library proc_macro code just a bit). No obvious fix here.

I'm not actively working on the coverage code right now (I have had to shift to other projects at work), but at some point I'd like to get some time to try to address problems like these by solving #85000. I don't know if this will work, but it's a possible general solution for some macro issues, and at the moment the only thing I can think of for your issue, since a root cause is not jumping out at me.

Maybe the proc_macro author can see how the macro spans could be improved?

@alex
Copy link
Member Author

alex commented Jul 12, 2021 via email

@richkadel
Copy link
Contributor

Is there a good way for us to debug the spans generated by a macro?

I'm not sure if there's a "good" way, but I believe there is a compiler --unpretty option to expand proc_macros. Other than that, I'm not sure what to suggest.

@alex
Copy link
Member Author

alex commented Jul 12, 2021

Ok, I've put a full reproducer in: https://github.com/alex/temp-rust-coverage

With teh current contents of the repo sh get-coverage.sh shows the all lines uncovered issue. If you comment out https://github.com/alex/temp-rust-coverage/blob/main/src/lib.rs#L1 then you get full coverage (as expected).

So somehow the no-op proc-macro is causing a loss of coverage, even in spans it doesn't touch. I can't imagine what could cause that. But also I can't imagine what I did wrong in this reproducer.

@richkadel
Copy link
Contributor

Thank you. This is helpful!

@richkadel
Copy link
Contributor

OK, I tried your sample, and I think the problem you're having is related to how you are running and extracting the results.

Your script works for me and does produce the results with missing coverage, but when I changed the first cargo test command to add --tests, I actually get correct coverage.

Dig into the documentation in the Rust Unstable Book to get more insights on this, but I think there are a couple of issues with your script.

  1. Without --tests you are running the unit tests and the doctests, and they run as separate binaries. Both produce default.profraw, so the doc tests are overwriting your coverage results. You need to set the LLVM_PROFILE_FILE with some descriptor value to distinguish them.
  2. Then you also need to add the -object prefix in llvm-cov in front of all binaries (including the doctest binaries, which you will need to save).

This should fix your test program, but that doesn't mean it will fix your original issue unless your coverage results from your original program are also being generated incorrectly.

Let me know if you can reproduce that.

Thanks!

@alex
Copy link
Member Author

alex commented Jul 14, 2021

Fascinating... I can confirm that with --tests everything works ok.

I don't really understand why having a proc-macro or not impacts this.

Unfortunately this doesn't impact the underlying issue that caused me to file the bug. But at least now I have a reproduction environment where I can continue to pursue the real issue here.

@alex
Copy link
Member Author

alex commented Jul 14, 2021

Ok, that repo has now been updated to reproduce the original issue here (partial coverage of arguments).

If you change foo's arguments to be on a single line, you'll that the line is now claimed to have coverage, but it's still really only partial coverage.

I suspect, but can't prove yet, that what's happening behind the scenes is that pyo3 is generating other code that's uses the spans of these arguments (which is common in proc-macros) so that if there's type-checking errors, the original line will be attributed.

@richkadel
Copy link
Contributor

For future reference, the results from your new sample are shown below as a screenshot, so the specific coverage spans show up (by character).

Also, FYI, I inserted a blank line at the top to reduce confusion. At least in my environment (using the llvm-cov build from rust sources), there appears to be a bug in the llvm-cov output. Line 1 is duplicated (on the same line). I'm ignoring that for now, but it's easier to see the coverage of the attribute this way.

pyo3 coverage issue

@alex
Copy link
Member Author

alex commented Jul 15, 2021

I haven't attempted to officially prove it by reverting the patch (since like 900 follow up patches rely on it)... but I'll bet a nickle that PyO3/pyo3#1440 is what introduced this issue, since it started generating code with spans on the ty of the args.

I don't know exactly how the spans work for rust's coverage, but I'll bet something is going on like:

  1. Rust generates one big span that covers the entire function definition. This is covered always (or maybe anytime the function is entered)
  2. pyo3 generates some code and assigns it spans that are subsets of those (covering individual arguments). These are not always run in practice.

The coverage machinery sees the big span as covered from (1) but the sub-span of that from (2) as not covered and draws something what we see in that screenshot.

Am I anywhere within 10 country miles of something plausible?

@richkadel
Copy link
Contributor

Yes plausible. Not sure but it fits what we see.

Did you notice this issue?

PyO3/pyo3#1506

A fix was submitted. Is that fix in the version you upgraded to? If not, just maybe that fix could help, but maybe a long shot.

@alex
Copy link
Member Author

alex commented Jul 15, 2021 via email

@alex
Copy link
Member Author

alex commented Jul 15, 2021

A git bisect tells me that in fact PyO3/pyo3@ce851ad is the commit that introduce the coverage issues.

Going to stare at the commit intently and see which part triggers it.

alex added a commit to alex/pyo3 that referenced this issue Jul 15, 2021
…hey are uncovered

rust-lang/rust#86972 describes this issue in some detail and links to a minimal reproducer

making this change, and this change alone, resolves the coverage issues
@alex
Copy link
Member Author

alex commented Jul 15, 2021

I've now sent PyO3/pyo3#1726 as a workaround.

@davidhewitt
Copy link
Contributor

Sorry I haven't participated in the discussion. Thanks both for all the digging, this is really interesting.

@alex
Copy link
Member Author

alex commented Jun 30, 2024

At this point I have no idea if this is current, I'm going to close this.

@alex alex closed this as not planned Won't fix, can't repro, duplicate, stale Jun 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-code-coverage Area: Source-based code coverage (-Cinstrument-coverage) C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

4 participants