Are dwp files the right choice for split-debuginfo=packed on Linux? #105991

@mstange

@davidtwco @bjorn3 @philipc @Gankra @khuey

Hi all,
I was reading up on split dwarf and DWP files and came away rather confused.

What is a situation in which using split dwarf + DWP is preferable over the traditional way of splitting an ELF file into a binary and a debug file with the help of objcopy --only-keep-debug?

It seems to me that DWP files don't address either of these situations:

  1. For local development I want fast builds, and I don't mind keeping around a whole lot of intermediate files in order to have a working debugger.
  2. For releases I want a small binary and a single redistributable debug file.

On macOS and Windows, split-debuginfo=packed addresses the second case. The binary is small because all debug information has been removed from it. And the binary contains an ID (and on Windows also a file path) which allows looking up the debug file (i.e. the PDB file / dSYM bundle). I can drop the debug file in a big directory with the debug files from all my other releases, and it's easy to find the right one for a given binary based on its ID; on Windows I can even put the PDB file on a symbol server and my debugger will find it based on the information in the binary. Also, if I want to look up symbol information for an address, for example for a crash stack, I can get it from the debug file and don't need the binary.

DWP files do not appear to address the second use case at all:

  • The binary still contains a small amount of debug information (e.g. "skeleton units"). This information is needed to make sense of the information in the DWP file. So the binary is not as small as it could be. It also contains the debug information for the Rust standard library, which is not moved into the DWP file. If I want to strip off the debug information from the binary, I end up with three files: The stripped binary, the file with the skeleton + std debug information, and the DWP file.
  • The binary does not contain a pointer to the DWP file. It contains DWO IDs and paths to the original DWO files, but the point of the DWP file is that I shouldn't need those DWO files. So those paths are not useful. It sounds like gdb expects the DWP file to be located in the same directory as the binary, with the binary's filename plus a .dwp extension. If I have a binary and a corresponding DWP file, at least I can validate that it's the correct DWP file by checking for matching DWO IDs, I think.
  • Because the DWP file is identified by being in the same directory as the binary, I cannot drop it into a big directory with my other symbol files.
  • I cannot use a DWP file on a symbol server, for example with debuginfod, because afaik the DWP file has no build ID.
  • To resolve the symbol / debug info for an address, the DWP file is not sufficient on its own; I have to consult both the binary with the skeleton units + std debug info, as well as the DWP file.
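The name-based lookup described in the bullets above can be sketched as follows. This is a minimal illustration that assumes the rule really is "same directory, binary's filename plus a .dwp extension", as it appears to be for gdb; the helper names `expected_dwp_path` and `find_dwp` are hypothetical and not part of any tool.

```python
from pathlib import Path
from typing import Optional


def expected_dwp_path(binary: Path) -> Path:
    """Return the sibling path where the DWP file is expected.

    Assumption: the lookup rule is "binary's filename plus .dwp",
    in the same directory as the binary.
    """
    return binary.with_name(binary.name + ".dwp")


def find_dwp(binary: Path) -> Optional[Path]:
    """Return the DWP path if a file exists there, else None."""
    candidate = expected_dwp_path(binary)
    return candidate if candidate.is_file() else None
```

Because the result depends only on the binary's location and name, two releases of the same program map to the same DWP filename, which is why the files cannot share one flat symbol directory.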

Please correct me if any of the above is incorrect. I'm basing these statements on my read of the DebugFissionDWP document and on the fact that gimli's dwarfdump example requires the --dwp argument to be paired with a --dwo-parent argument. I haven't read the DWARF 5 spec yet, and I might also be mixing up GNU extension dwp with DWARF 5 dwp.

Anyway, all of this leaves me wondering whether it would have been a better choice to make split-debuginfo=packed behave in the "Linux distro packager" way: Compile everything without split dwarf, and then run the commands from #34651 (comment) to split the resulting file into a binary and a .debug file. This would satisfy all the use cases I listed above:

  • The binary contains no debug information.
  • The binary and the debug file have the same ELF build ID.
  • The binary contains a "debuglink" with the filename of the debug file.
  • I can put the debug file on a debuginfod server, and it is found by gdb / perf / etc. by its build ID.
  • To symbolicate an address, I only need the debug file.
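For illustration, the split and the build-ID lookup from the list above could look roughly like this. It is a sketch under assumptions: the objcopy sequence is the commonly used one and may differ in detail from the commands in the linked comment, the binary is assumed to have been linked with a build ID, and the function names and the `.debug` suffix are hypothetical choices. The `.build-id` directory layout and the `/buildid/<id>/debuginfo` URL follow the conventions used by gdb and debuginfod.

```python
from pathlib import Path
from typing import List


def split_debug_commands(binary: str) -> List[List[str]]:
    """Build the objcopy invocations that split `binary` in two.

    Assumption: the debug file is named `<binary>.debug`.
    """
    debug_file = binary + ".debug"
    return [
        # 1. Copy only the debug sections into the separate debug file.
        ["objcopy", "--only-keep-debug", binary, debug_file],
        # 2. Remove the debug sections from the binary itself.
        ["objcopy", "--strip-debug", binary],
        # 3. Record the debug file's name (the "debuglink") in the binary.
        ["objcopy", f"--add-gnu-debuglink={debug_file}", binary],
    ]


def build_id_debug_path(build_id_hex: str, debug_root: str = "/usr/lib/debug") -> Path:
    """Path of a debug file in a build-ID-indexed symbol directory."""
    build_id_hex = build_id_hex.lower()
    # First two hex digits form the subdirectory, the rest the filename.
    return Path(debug_root) / ".build-id" / build_id_hex[:2] / (build_id_hex[2:] + ".debug")


def debuginfod_url(server: str, build_id_hex: str) -> str:
    """URL a debuginfod client would request for the debug file."""
    return f"{server.rstrip('/')}/buildid/{build_id_hex.lower()}/debuginfo"
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Unlike the name-based DWP lookup, both lookup functions take only the build ID, so debug files from many releases can sit in one directory or behind one server.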

Also, just to clarify, I think split dwarf is perfectly fine for split-debuginfo=unpacked. My issue is only with split-debuginfo=packed.

Thoughts?

    Labels

    A-debuginfo: Area: Debugging information in compiled programs (DWARF, PDB, etc.)
    C-discussion: Category: Discussion or questions that doesn't represent real issues.
    T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
