Using gnu_asm_goto_with_outputs_full without optimizations returns uninitialized memory #60855

Closed
josephcsible opened this issue Feb 19, 2023 · 4 comments

@josephcsible

Consider this C code:

#if !__has_extension(gnu_asm_goto_with_outputs_full)
#error Need the gnu_asm_goto_with_outputs_full extension
#endif
int main(void) {
    int x = 123;
    asm goto("mov %1, %0\n\tjmp %l[label]" : "=r" (x) : "r" (45) : : label);
    x = 6;
    label:
    return x;
}

It's supposed to return 45. But when compiled on Clang trunk (4be1764) without optimizations, it produces this assembly:

main:                                   # @main
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $0, -4(%rbp)
        movl    $123, -8(%rbp)
        movl    $45, %eax
        movl    %eax, %eax
        jmp     .LBB0_3
        movl    %eax, -16(%rbp)                 # 4-byte Spill
        movl    %eax, -12(%rbp)                 # 4-byte Spill
        jmp     .LBB0_1
.LBB0_1:
        movl    -12(%rbp), %eax                 # 4-byte Reload
        movl    %eax, -8(%rbp)
        movl    $6, -8(%rbp)
.LBB0_2:
        movl    -8(%rbp), %eax
        popq    %rbp
        retq
.LBB0_3:                                # Block address taken
        movl    -16(%rbp), %eax                 # 4-byte Reload
        movl    %eax, -8(%rbp)
        jmp     .LBB0_2

This ends up returning uninitialized memory (the load from -16(%rbp) in the third-to-last line) instead of 45. If I compile at any optimization level other than -O0, it produces correct assembly.

https://godbolt.org/z/fnPcYY44G
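
For comparison, the intended control flow of the reproducer can be modeled without inline assembly. This is only a sketch: it assumes the asm body always copies its input into `x` and then always jumps to `label`, and the function name `expected_result` is made up for illustration.

```c
/* Plain-C model of the reproducer's intended behavior (no inline asm).
   Assumption: the asm body writes 45 to x and then always takes the jump. */
int expected_result(void) {
    int x = 123;
    x = 45;        /* models "mov %1, %0" with input 45 and output x */
    goto label;    /* models "jmp %l[label]" */
    x = 6;         /* fallthrough path, skipped because of the jump */
label:
    return x;
}
```

Calling `expected_result()` yields 45 in this model, which is the value the asm goto version should return as well.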

@josephcsible
Author

@nickdesaulniers asked me to assign this issue to him, but it looks like I don't have permission to do so.

@nickdesaulniers
Member

nickdesaulniers commented Feb 27, 2023

Thanks for the report.

IIUC, it looks like regallocfast is creating distinct "Frame Objects" for the two branches, but hoisting them into the block with the INLINEASM_BR. It probably should not do that if a MachineBasicBlock may contain an INLINEASM_BR.

Specifically, we're missing a store in the indirect edges, so we're loading uninitialized stack slots.
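
To make the missing store concrete, here is a small plain-C model of the two stack slots in the -O0 assembly from the report. It is a simulation, not LLVM code: the names `model_buggy`, `model_with_store`, and the `UNINIT` sentinel are hypothetical, and the second function shows only one way the store could be placed, not necessarily how regallocfast was or will be fixed.

```c
#include <stdbool.h>

#define UNINIT (-1)  /* sentinel standing in for an uninitialized stack slot */

/* Models the reported -O0 code: both spills sit after the jump inside the
   asm, so the indirect edge reloads a slot that was never written. */
int model_buggy(bool take_indirect_edge) {
    int slot_direct = UNINIT;    /* models -12(%rbp) */
    int slot_indirect = UNINIT;  /* models -16(%rbp) */
    int eax = 45;                /* register holding the asm output */
    int x = 123;

    if (take_indirect_edge)
        goto indirect;           /* models "jmp .LBB0_3" from the asm body */

    slot_indirect = eax;         /* spills emitted after the asm; */
    slot_direct = eax;           /* only the fallthrough path runs them */
    x = slot_direct;
    x = 6;
    return x;
indirect:
    x = slot_indirect;           /* reload of a slot with no prior store */
    return x;
}

/* Same model with a store on the indirect edge (assumed placement). */
int model_with_store(bool take_indirect_edge) {
    int slot_direct = UNINIT;
    int slot_indirect = UNINIT;
    int eax = 45;
    int x = 123;

    if (take_indirect_edge)
        goto indirect;

    slot_direct = eax;
    x = slot_direct;
    x = 6;
    return x;
indirect:
    slot_indirect = eax;         /* the store that is missing in the report */
    x = slot_indirect;
    return x;
}
```

In this model, `model_buggy(true)` returns the `UNINIT` sentinel while `model_with_store(true)` returns 45; both functions return 6 when the fallthrough path is taken.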

@nickdesaulniers
Member

nickdesaulniers added a commit that referenced this issue Mar 1, 2023
This test demonstrates an issue with callbr outputs being used along
indirect edges when using regallocfast.

Link: #60855

Differential Revision: https://reviews.llvm.org/D144906
@nickdesaulniers
Member

Thanks again for the report, @josephcsible. Please keep hammering!
