Skip to content

Teach debuginfo to generate namespaces for each component of a path #1541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jdm opened this issue Jan 16, 2012 · 9 comments
Closed

Teach debuginfo to generate namespaces for each component of a path #1541

jdm opened this issue Jan 16, 2012 · 9 comments
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@jdm
Copy link
Contributor

jdm commented Jan 16, 2012

Currently function information is generated with the fully-qualified path. Instead, we should link to the parent module in the context field, using DW_AT_namespace to represent modules.

@ghost ghost assigned jdm Apr 12, 2012
@catamorphism
Copy link
Contributor

Clearing milestone. Just a bug, IMO. Re-nominate if you disagree.

@graydon
Copy link
Contributor

graydon commented Jun 17, 2013

@catamorphism Please adjust milestones via nomination-to-change, not direct edits. Milestones represent group decisions.

@graydon
Copy link
Contributor

graydon commented Jun 20, 2013

just a bug, removing milestone/nomination.

@bblum
Copy link
Contributor

bblum commented Aug 16, 2013

Not sure whether this is "enhancement" or "wrong", so tagging with both.

@michaelwoerister
Copy link
Member

I have a prototype for this. However, I stumbled upon a GDB peculiarity while working on the implementation. GDB seems to use the linkage_name attribute (if present) to find out which namespace a function is in. That is, it de-mangles the linkage name to produce ABC::BBB::uuu from _ZN3ABC3BBB3uuuEv. For C++ and Clang this works fine but rustc adds a hash value and version number to the linkage name. If we were to use these actual names, GDB would display the above function as void ABC::BBB::uuu::_723b201b7ff6bc3f::_0$x2e0(void) which clearly can't be the way to go.

If the linkage_name attribute is not present in the debug info, then GDB will display the name correctly. So, I would opt for not setting the linkage_name instead of either
(1) setting it correctly and obfuscating object names in the process or
(2) setting it without the hash and version suffix, leading to incorrect information in the debug info.

@jdm
Copy link
Contributor Author

jdm commented Aug 21, 2013

I'm happy to see what happens if we don't include a linkage name.

@michaelwoerister
Copy link
Member

OK, I've investigated this a bit more systematically now and I can't reproduce the 'no linkage name' behaviour. That is, with correct description of the namespace hierarchy but missing DW_AT_linkage_name attribute, GDB does not display function names correctly.

I've tested many different combinations and the only way to get GDB to display function names correctly seems to be via the linkage name attribute. That also means that we have to set the incorrect linkage name (missing hash and version number suffix).

There are still two design decisions to make that I would like to have some input on:
(1) Should we create a namespace for the crate name too?
E.g. should it be ast::Block or syntax::ast::Block? The first variant is more terse and correct most of the time but might lead to ambiguities in multi-crate scenarios.

(2) Is it important to set the correct source line attribute for namespaces?
Doing so would have quite some impact on the implementation because often it is necessary to create namespace entries for modules that have not been translated yet. However, we usually just have the ast_map::path to construct the entry from, which does not contain any code location information.

I see three alternatives here:

  1. Always set the line attribute to zero because it is not that important.
  2. Add another table to the ast_map that allows to find the parent of an AST node.
  3. Do a translation pre pass that only creates namespace entries and makes them easy to look up later.

Option (1) would be the most simple but creates slightly incorrect debug info.
Option (2) would increase the memory footprint and seems a bit heavy-handed.
Option (3) slightly complicates the usage protocol for the debuginfo module and relies on the assumption that items declared within a function can't escape the declaring function's scope (which is true now, I think).

I would prefer option (3).

Any comments?

@DiamondLovesYou
Copy link
Contributor

Rust's lifetimes assure us that option 3's criterion will always hold,
right?
On Aug 23, 2013 5:00 AM, "Michael Woerister" [email protected]
wrote:

OK, I've investigated this a bit more systematically now and I can't
reproduce the 'no linkage name' behaviour. That is, with correct
description of the namespace hierarchy but missing DW_AT_linkage_nameattribute, GDB does
not display function names correctly.

I've tested many different combinations and the only way to get GDB to
display function names correctly seems to be via the linkage name
attribute. That also means that we have to set the incorrect linkage name
(missing hash and version number suffix).

There are still two design decisions to make that I would like to have
some input on:
(1) Should we create a namespace for the crate name too?
E.g. should it be ast::Block or syntax::ast::Block? The first variant is
more terse and correct most of the time but might lead to ambiguities in
multi-crate scenarios.

(2) Is it important to set the correct source line attribute for
namespaces?
Doing so would have quite some impact on the implementation because often
it is necessary to create namespace entries for modules that have not been
translated yet. However, we usually just have the ast_map::path to
construct the entry from, which does not contain any code location
information.

I see three alternatives here:

  1. Always set the line attribute to zero because it is not that important.
  2. Add another table to the ast_map that allows to find the parent of an
    AST node.
  3. Do a translation pre pass that only creates namespace entries and
    makes them easy to look up later.

Option (1) would be the most simple but creates slightly incorrect debug
info.
Option (2) would increase the memory footprint and seems a bit
heavy-handed.
Option (3) slightly complicates the usage protocol for the debuginfo
module and relies on the assumption that items declared within a function
can't escape the declaring function's scope (which is true now, I think).

I would prefer option (3).

Any comments?


Reply to this email directly or view it on GitHubhttps://github.com//issues/1541#issuecomment-23154195
.

@michaelwoerister
Copy link
Member

Rust's lifetimes assure us that option 3's criterion will always hold, right?

In this case we are not concerned with values or references outliving the function scope they were defined in, but with types and functions not being visible outside the function they are defined in. For example, the following code is perfectly legal as far as the current rustc is concerned:

pub fn aaa() -> int {
    pub mod bbb {
        pub struct CCC {
            x: int
        }

        pub fn ddd() -> CCC {
            return CCC { x: 0 };
        }
    }

    return bbb::ddd().x;
}

That is, we have a module defined in the body of a function. This module again can contain arbitrary items. This is fine for the pre-pass algorithm as long as it can stop whenever it encounters a function. But if the following code were legal, the pre-pass algorithm would also have to handle functions:

// code from example above...

fn xxx() -> aaa::bbb::CCC {
   ...
}

But to fully instantiate function debug info, the pre-pass algorithm would also have to handle type debuginfo, which can lead to cyclic dependencies. This would make the whole algorithm a lot more complicated. Consider the following example:

mod aaa {
    fn bbb() -> ddd::eee::fff {
        struct ccc { ... }    
    }
}

mod ddd {
    fn eee -> aaa::bbb::ccc {
        struct fff { ... }
        ...
    }
}

We would start at aaa, then aaa::bbb, but for handling aaa::bbb, we would first have to handle ddd::eee::fff, so we have to remember that aaa::bbb is yet to handle and continue at ddd, then do ddd::eee. But here we need aaa::bbb::ccc, which can only be done after aaa::bbb is finished, which in turn is blocked ddd::eee::fff. Thus we have a cycle dependency and cannot proceed.

It would be doable, of course, by introducing stubs for functions and later patching those with the actual references. But I'd rather keep everything dead simple if possible. Since modules alone just form a tree they can't introduce cyclic dependencies, so a simple tree traversal would do the trick.

@bors bors closed this as completed in 67ed30c Sep 11, 2013
@jdm jdm removed their assignment Jun 16, 2014
bjorn3 added a commit to bjorn3/rust that referenced this issue Nov 9, 2024
Use a BufWriter in emit_module to reduce syscall overhead
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) C-enhancement Category: An issue proposing an enhancement or a PR with one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants