Debug info demangling workaround #45

Closed
wants to merge 12 commits into from


tempdragon

@tempdragon tempdragon commented Mar 30, 2024

TL;DR

This series of commits provides a temporary workaround for demangling in cg_gcc. It enables demangling in GDB for backtraces and function names, but DISABLES mangled-name printing, even with `set print demangle off`.

Dependency

Required BY rust-lang/gccjit.rs#31

Details

This part of code will be used even in the final version that supports both demangled and mangled names.

Currently this disables the generation of mangled-name information, but only for GCCJIT and only if the option GCC_JIT_BOOL_OPTION_MANGLED_FUNCTION_NAME is set, so there is NO overhead for those who don't use this option.
(In my case, when both the mangled name (as DW_AT_linkage_name) and the function name (as DW_AT_name) are generated, GDB prints only the DW_AT_name in a backtrace, regardless of the DW_AT_lang. However, this problem doesn't show up with LLVM-generated code.)

Further research is still needed into how exactly GDB decides whether to demangle function names.

@antoyo

antoyo commented Mar 30, 2024

Thanks!

Could you please explain why you decided to go for a function attribute instead of simply adding a new API gcc_jit_function_set_short_name?

@tempdragon
Author

tempdragon commented Mar 30, 2024

It seems to me that I will ultimately have to move the short name somewhere in the tree to make it accessible by the dwarf gen phase. It's either an attribute or a field in the tree struct.

Of course, it is also possible to generate the short name by demangling. This could potentially add computation time but reduce memory use.

You may well come up with a much better way than mine. If so, please tell me.

@antoyo

antoyo commented Mar 30, 2024

I would be very interested in seeing how the other frontends do this.
I'll take a look.

@tempdragon
Author

tempdragon commented Mar 30, 2024

LLVM generates it directly.
GCCRS seems to set the short name directly as the name. (It looks way too complicated to me to tailor cg_gcc names to something like this, and I didn't succeed at it: some strange linking and compilation problems emerged when I first tried. Maybe it makes sense to use the complete name as the function name and the passed function name as the DECL_ASSEMBLER_NAME.)

@tempdragon
Author

tempdragon commented Mar 30, 2024

Another problem with renaming the tree is that LLVM uses a (short name, mangled name) -> (DW_AT_name, DW_AT_linkage_name) mapping, and the short name it uses is effectively shorter than the complete name I am using in this series. I will use the same scheme once the investigation into GDB's demangling rules is done, but that seems to require some separate space to store the short name in.
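
To make the mapping above concrete, here is a minimal Rust sketch. Everything in it is hypothetical and for illustration only: `DebugNames`, `debug_names`, and the sample symbol are invented names, not part of cg_gcc, libgccjit, or LLVM. It also assumes the short name is simply the last `::`-separated segment of the path, which ignores generics and impl blocks.

```rust
/// Hypothetical pair of names attached to one function's debug entry.
#[derive(Debug, PartialEq)]
struct DebugNames {
    /// Would be emitted as DW_AT_name: the last path segment only.
    short_name: String,
    /// Would be emitted as DW_AT_linkage_name: the mangled symbol.
    linkage_name: String,
}

/// Split a path such as `my_crate::module::func` at the last `::`.
/// Assumption: a naive split is enough here; real paths may contain
/// `::` inside `<...>` (generics, trait impls), which this ignores.
fn debug_names(full_path: &str, mangled: &str) -> DebugNames {
    // `rsplit` always yields at least one item, so the fallback is
    // only there to avoid an `unwrap` on an `Option`.
    let short = full_path.rsplit("::").next().unwrap_or(full_path);
    DebugNames {
        short_name: short.to_string(),
        linkage_name: mangled.to_string(),
    }
}
```

With this sketch, the path `my_crate::module::func` yields the short name `func`, while the mangled symbol is passed through unchanged. The complete name used in this series corresponds to `full_path` rather than `short_name`.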

If the short name is found, set `DECL_ASSEMBLER_NAME` to the function
name and make the fndecl tree public.
@antoyo

antoyo commented Apr 1, 2024

Just so you know, it may take me a couple of days before I can take a look.

@tempdragon
Author

tempdragon commented Apr 1, 2024

It doesn't matter. It is always up to you to decide when to take a look. I will reduce my comments. Sorry if I disturbed you.

@antoyo

antoyo commented Apr 1, 2024

No, it's OK, you can continue commenting.
I just wanted to let you know that I cannot review right now :) .

@tempdragon
Author

I guess I misunderstood something and I need further investigation now.

@tempdragon
Author

tempdragon commented Apr 2, 2024

It seems that LLVM uses an incremental approach to demangled name construction.

I traced GDB using GDB itself, and noticed that LLVM-generated debuginfo only has the function name, without the namespace prefix, in DW_AT_name. However, GDB itself constructs the name from the namespaces rather than directly demangling the linkage name or taking it from DW_AT_name. (To replicate this, set breakpoints at the appropriate positions in new_symbol, dwarf2_physname, and determine_prefix in dwarf2/read.c while running a GDB that is debugging a Rust ELF generated by cg_gcc or cg_llvm.)

If that is true, implementing a similar mechanism would require major changes to code generation, since the two are tightly coupled now. So keeping the full name in DW_AT_name may be the only way to get similar functionality for now. However, that leaves the issue of GDB ignoring the mangled name.

In commit 906bb4c of GDB, Tom Tromey gave the reason for this: sometimes rustc gives a demangled name that differs from the one constructed from the hierarchy, causing some tests to fail. Maybe we should try to have libgccjit support namespaces in the long run.

@antoyo

antoyo commented Sep 3, 2024

Sorry, I completely forgot about this issue (you can ping me in these cases :) ).

What's the status of this? Are you waiting for my review?

@tempdragon
Author

No, I think I should close this. For personal reasons I will have to wait a long time (a year or so?) before I can really return to cg_gcc work. Thank you so much for your work on cg_gcc.

@tempdragon tempdragon closed this Sep 3, 2024
@antoyo

antoyo commented Sep 3, 2024

Ok, I understand. Thank you for your contributions.

Is this something I could reuse if I want to implement this myself?

@tempdragon
Author

tempdragon commented Sep 3, 2024

Probably not. I can't remember many details of this branch, and whoever implements this feature will probably have to repeat the same process and ideas. The successor to this module will probably need to debug GDB someday, as I did... This branch is for reference only.

If anyone is interested in taking over this part of the job, FYI, the point is to create a new data structure for debug namespaces (otherwise the gccjit context would have to be used, and we would have to rewrite a lot of code) that records the names of parent modules, so that each level of module is demangled correctly. GDB uses a tree-walking strategy rather than demangling to generate the full path and then the debug name of each symbol. (This is based on what I read from the GDB code.) You would then need to rebuild the whole module debug-name hierarchy on the cg_gcc side.
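
For anyone picking this up, here is a minimal Rust sketch of what such a namespace record could look like. It is only an illustration under assumptions: `DebugNamespace`, `NamespaceArena`, `add`, and `qualified_name` are hypothetical names, not existing cg_gcc or libgccjit APIs, and the parent walk merely imitates the tree-walking idea described above rather than GDB's actual code.

```rust
/// Hypothetical record of one module (namespace) level in the debug info.
struct DebugNamespace {
    name: String,
    /// Index of the parent namespace in the arena, or `None` for the crate root.
    parent: Option<usize>,
}

/// Hypothetical arena owning every namespace seen during code generation.
#[derive(Default)]
struct NamespaceArena {
    nodes: Vec<DebugNamespace>,
}

impl NamespaceArena {
    /// Register a namespace and return its index for later lookups.
    fn add(&mut self, name: &str, parent: Option<usize>) -> usize {
        self.nodes.push(DebugNamespace {
            name: name.to_string(),
            parent,
        });
        self.nodes.len() - 1
    }

    /// Walk from the given namespace up to the root, then join the
    /// collected segments with `::` to form the full path of a symbol.
    fn qualified_name(&self, namespace: usize, short_name: &str) -> String {
        let mut parts = vec![short_name];
        let mut current = Some(namespace);
        while let Some(index) = current {
            let node = &self.nodes[index];
            parts.push(node.name.as_str());
            current = node.parent;
        }
        // Segments were collected innermost first, so reverse them.
        parts.reverse();
        parts.join("::")
    }
}
```

Registering `my_crate` as the root and `module` as its child, then asking for `func` inside `module`, gives `my_crate::module::func`. The arena-with-indices layout is just one choice that avoids reference cycles; a real implementation would also need to attach each namespace to the corresponding debug entry.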
