DEBUGINFOD based DWP acquisition for LLDB #70996
@llvm/pr-subscribers-lldb @llvm/pr-subscribers-debuginfo

Author: Kevin Frei (kevinfrei)

Changes

I've plumbed the LLVM DebugInfoD client into LLDB, and added automatic downloading of DWP files to the SymbolFileDWARF.cpp plugin. If you have DEBUGINFOD_URLS set to a space delimited set of web servers, LLDB will try to use them as a last resort when searching for DWP files. If you do not have that environment variable set, nothing should be changed. There's also a setting, per @clayborg 's suggestion, that will override the environment variable, or can be used instead of the environment variable. The setting is why I also needed to add an API to the llvm-debuginfod library.

Test Plan:

Suggestions are welcome here. I should probably have some positive and negative tests, but I wanted to get the diff up for people who have a clue what they're doing to rip it to pieces before spending too much time validating the initial implementation.

Full diff: https://github.com/llvm/llvm-project/pull/70996.diff

10 Files Affected:
diff --git a/lldb/include/lldb/Target/Target.h b/lldb/include/lldb/Target/Target.h
index 82045988018b606..cd5c88767c900d1 100644
--- a/lldb/include/lldb/Target/Target.h
+++ b/lldb/include/lldb/Target/Target.h
@@ -258,6 +258,8 @@ class TargetProperties : public Properties {
bool GetDebugUtilityExpression() const;
+ Args GetDebugInfoDURLs() const;
+
private:
// Callbacks for m_launch_info.
void Arg0ValueChangedCallback();
@@ -270,6 +272,7 @@ class TargetProperties : public Properties {
void DisableASLRValueChangedCallback();
void InheritTCCValueChangedCallback();
void DisableSTDIOValueChangedCallback();
+ void DebugInfoDURLsChangedCallback();
// Settings checker for target.jit-save-objects-dir:
void CheckJITObjectsDir();
diff --git a/lldb/source/Core/CoreProperties.td b/lldb/source/Core/CoreProperties.td
index 92884258347e9be..865030b0133bbb2 100644
--- a/lldb/source/Core/CoreProperties.td
+++ b/lldb/source/Core/CoreProperties.td
@@ -4,7 +4,7 @@ let Definition = "modulelist" in {
def EnableExternalLookup: Property<"enable-external-lookup", "Boolean">,
Global,
DefaultTrue,
- Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched.">;
+ Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched. If all other methods fail, and the DEBUGINFOD_URLS environment variable is specified, the Debuginfod protocol is used to acquire symbols from a compatible Debuginfod service.">;
def EnableBackgroundLookup: Property<"enable-background-lookup", "Boolean">,
Global,
DefaultFalse,
diff --git a/lldb/source/Core/Debugger.cpp b/lldb/source/Core/Debugger.cpp
index 21f71e449ca5ed0..9a3e82f3e6a2adf 100644
--- a/lldb/source/Core/Debugger.cpp
+++ b/lldb/source/Core/Debugger.cpp
@@ -61,6 +61,8 @@
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/iterator.h"
+#include "llvm/Debuginfod/Debuginfod.h"
+#include "llvm/Debuginfod/HTTPClient.h"
#include "llvm/Support/DynamicLibrary.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Process.h"
@@ -594,6 +596,9 @@ lldb::DWIMPrintVerbosity Debugger::GetDWIMPrintVerbosity() const {
void Debugger::Initialize(LoadPluginCallbackType load_plugin_callback) {
assert(g_debugger_list_ptr == nullptr &&
"Debugger::Initialize called more than once!");
+ // We might be using the Debuginfod service, so we have to initialize the
+ // HTTPClient *before* any new threads start.
+ llvm::HTTPClient::initialize();
g_debugger_list_mutex_ptr = new std::recursive_mutex();
g_debugger_list_ptr = new DebuggerList();
g_thread_pool = new llvm::ThreadPool(llvm::optimal_concurrency());
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index ee7164d2f050ed1..c036963a1ec6e87 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -4325,6 +4325,7 @@ const std::shared_ptr<SymbolFileDWARFDwo> &SymbolFileDWARF::GetDwpSymbolFile() {
module_spec.GetSymbolFileSpec() =
FileSpec(m_objfile_sp->GetModule()->GetFileSpec().GetPath() + ".dwp");
+ module_spec.GetUUID() = m_objfile_sp->GetUUID();
FileSpecList search_paths = Target::GetDefaultDebugFileSearchPaths();
FileSpec dwp_filespec =
Symbols::LocateExecutableSymbolFile(module_spec, search_paths);
diff --git a/lldb/source/Symbol/CMakeLists.txt b/lldb/source/Symbol/CMakeLists.txt
index cec49b8b2cb4b63..91569f103cf86c8 100644
--- a/lldb/source/Symbol/CMakeLists.txt
+++ b/lldb/source/Symbol/CMakeLists.txt
@@ -48,6 +48,7 @@ add_lldb_library(lldbSymbol NO_PLUGIN_DEPENDENCIES
lldbHost
lldbTarget
lldbUtility
+ LLVMDebuginfod
LINK_COMPONENTS
Support
diff --git a/lldb/source/Symbol/LocateSymbolFile.cpp b/lldb/source/Symbol/LocateSymbolFile.cpp
index 66ee7589ac60499..907287f4b4100b8 100644
--- a/lldb/source/Symbol/LocateSymbolFile.cpp
+++ b/lldb/source/Symbol/LocateSymbolFile.cpp
@@ -25,6 +25,8 @@
#include "lldb/Utility/UUID.h"
#include "llvm/ADT/SmallSet.h"
+#include "llvm/Debuginfod/Debuginfod.h"
+#include "llvm/Object/BuildID.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/ThreadPool.h"
@@ -396,8 +398,22 @@ Symbols::LocateExecutableSymbolFile(const ModuleSpec &module_spec,
}
}
}
-
- return LocateExecutableSymbolFileDsym(module_spec);
+ FileSpec dsym_bundle = LocateExecutableSymbolFileDsym(module_spec);
+ if (dsym_bundle)
+ return dsym_bundle;
+
+ // If we didn't find anything by looking locally, let's try Debuginfod.
+ if (module_uuid.IsValid() && llvm::canUseDebuginfod()) {
+ llvm::object::BuildID build_id(module_uuid.GetBytes());
+ llvm::Expected<std::string> result =
+ llvm::getCachedOrDownloadDebuginfo(build_id);
+ if (result)
+ return FileSpec(*result);
+ // An error is just fine, here...
+ consumeError(result.takeError());
+ }
+ // Just return the empty FileSpec if nothing was found.
+ return dsym_bundle;
}
void Symbols::DownloadSymbolFileAsync(const UUID &uuid) {
diff --git a/lldb/source/Target/Target.cpp b/lldb/source/Target/Target.cpp
index 5f8756c57675c95..1c0ead3677ea386 100644
--- a/lldb/source/Target/Target.cpp
+++ b/lldb/source/Target/Target.cpp
@@ -65,6 +65,7 @@
#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/SetVector.h"
+#include "llvm/Debuginfod/Debuginfod.h"
#include <memory>
#include <mutex>
@@ -4180,7 +4181,8 @@ TargetProperties::TargetProperties(Target *target)
ePropertyInheritTCC, [this] { InheritTCCValueChangedCallback(); });
m_collection_sp->SetValueChangedCallback(
ePropertyDisableSTDIO, [this] { DisableSTDIOValueChangedCallback(); });
-
+ m_collection_sp->SetValueChangedCallback(
+ ePropertyDebugInfoDURLs, [this] { DebugInfoDURLsChangedCallback(); });
m_collection_sp->SetValueChangedCallback(
ePropertySaveObjectsDir, [this] { CheckJITObjectsDir(); });
m_experimental_properties_up =
@@ -4892,6 +4894,21 @@ void TargetProperties::SetDebugUtilityExpression(bool debug) {
SetPropertyAtIndex(idx, debug);
}
+Args TargetProperties::GetDebugInfoDURLs() const {
+ Args urls;
+ m_collection_sp->GetPropertyAtIndexAsArgs(ePropertyDebugInfoDURLs, urls);
+ return urls;
+}
+
+void TargetProperties::DebugInfoDURLsChangedCallback() {
+ Args urls = GetDebugInfoDURLs();
+ llvm::SmallVector<llvm::StringRef> dbginfod_urls;
+ std::transform(urls.begin(), urls.end(), std::back_inserter(dbginfod_urls),
+ [](const auto &obj) { return obj.ref(); });
+ llvm::setDefaultDebuginfodUrls(dbginfod_urls);
+}
+
+
// Target::TargetEventData
Target::TargetEventData::TargetEventData(const lldb::TargetSP &target_sp)
diff --git a/lldb/source/Target/TargetProperties.td b/lldb/source/Target/TargetProperties.td
index 154a6e5919ab0cd..c21c9d86c416c34 100644
--- a/lldb/source/Target/TargetProperties.td
+++ b/lldb/source/Target/TargetProperties.td
@@ -195,6 +195,10 @@ let Definition = "target" in {
def DebugUtilityExpression: Property<"debug-utility-expression", "Boolean">,
DefaultFalse,
Desc<"Enable debugging of LLDB-internal utility expressions.">;
+ def DebugInfoDURLs: Property<"debuginfod-urls", "Array">,
+ Global,
+ ElementType<"String">,
+ Desc<"A list of valid debuginfod server URLs that can be used to locate symbol files.">;
}
let Definition = "process_experimental" in {
diff --git a/llvm/include/llvm/Debuginfod/Debuginfod.h b/llvm/include/llvm/Debuginfod/Debuginfod.h
index ec7f5691dda4fbf..9351af27cc5fe2c 100644
--- a/llvm/include/llvm/Debuginfod/Debuginfod.h
+++ b/llvm/include/llvm/Debuginfod/Debuginfod.h
@@ -46,6 +46,10 @@ bool canUseDebuginfod();
/// environment variable.
SmallVector<StringRef> getDefaultDebuginfodUrls();
+/// Sets the list of debuginfod server URLs to query. This overrides the
+/// environment variable DEBUGINFOD_URLS.
+void setDefaultDebuginfodUrls(SmallVector<StringRef> URLs);
+
/// Finds a default local file caching directory for the debuginfod client,
/// first checking DEBUGINFOD_CACHE_PATH.
Expected<std::string> getDefaultDebuginfodCacheDirectory();
diff --git a/llvm/lib/Debuginfod/Debuginfod.cpp b/llvm/lib/Debuginfod/Debuginfod.cpp
index fa4c1a0499f059e..a74dfc5900cdaf5 100644
--- a/llvm/lib/Debuginfod/Debuginfod.cpp
+++ b/llvm/lib/Debuginfod/Debuginfod.cpp
@@ -47,6 +47,10 @@ namespace llvm {
using llvm::object::BuildIDRef;
+SmallVector<StringRef> DebuginfodUrls;
+
+bool DebuginfodUrlsSet = false;
+
static std::string uniqueKey(llvm::StringRef S) {
return utostr(xxh3_64bits(S));
}
@@ -62,15 +66,25 @@ bool canUseDebuginfod() {
}
SmallVector<StringRef> getDefaultDebuginfodUrls() {
- const char *DebuginfodUrlsEnv = std::getenv("DEBUGINFOD_URLS");
- if (DebuginfodUrlsEnv == nullptr)
- return SmallVector<StringRef>();
-
- SmallVector<StringRef> DebuginfodUrls;
- StringRef(DebuginfodUrlsEnv).split(DebuginfodUrls, " ");
+ if (!DebuginfodUrlsSet) {
+ // Only read from the environment variable if the user hasn't already
+ // set the value
+ const char *DebuginfodUrlsEnv = std::getenv("DEBUGINFOD_URLS");
+ if (DebuginfodUrlsEnv != nullptr) {
+ StringRef(DebuginfodUrlsEnv).split(DebuginfodUrls, " ", -1, false);
+ }
+ DebuginfodUrlsSet = true;
+ }
return DebuginfodUrls;
}
+// Override the default debuginfod URL list.
+void setDefaultDebuginfodUrls(SmallVector<StringRef> URLs) {
+ DebuginfodUrls.clear();
+ DebuginfodUrls.insert(DebuginfodUrls.begin(), URLs.begin(), URLs.end());
+ DebuginfodUrlsSet = true;
+}
+
/// Finds a default local file caching directory for the debuginfod client,
/// first checking DEBUGINFOD_CACHE_PATH.
Expected<std::string> getDefaultDebuginfodCacheDirectory() {
✅ With the latest revision this PR passed the C/C++ code formatter. |
pretty nice feature!
Out of curiosity, why did you choose the delimiter as ' ' instead of something like ';'?
I would like to see a new setting added to lldb that should be the default way to enable the debuginfod support. Just adding something to lldb/source/Target/TargetProperties.td like:
def DebuginfodURLs: Property<"debuginfod-urls", "Args">,
Global,
DefaultStringValue<"">,
Desc<"A list of debuginfod URLs that will be linearly called to search for debug info.">;
Doesn't need to live in "target.*" though.
If this was added in target, then we could do:
|
First off, thank you for working on this. The functionality The one caveat is that the current implementation is really tied to a platform. All the |
Yes, that specific kind of refactoring seemed like a good idea, but given that this is my first real foray into the LLDB space, I didn't want to bite off that much work to start with. Once I've got the full DEBUGINFOD capabilities plumbed, refactoring it out into a SymbolServer plug-in makes lots of sense. You can also read about the full DebugInfoD protocol (it's not very complicated) from the RedHat blog introducing it in 2019. |
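For readers who have not seen the protocol mentioned above: a debuginfod client asks a server for debug info with a plain HTTP GET on a path of the form /buildid/<hex build ID>/debuginfo. The following is only a minimal sketch of how such a request URL could be assembled. The helper names buildIdToHex and debuginfoUrl are hypothetical and are not part of LLVM or LLDB, and the server name used in the usage note is a placeholder.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: hex-encode a build ID using lowercase digits,
// which is the form used in debuginfod request paths.
std::string buildIdToHex(const std::vector<std::uint8_t> &id) {
  static const char digits[] = "0123456789abcdef";
  std::string out;
  out.reserve(id.size() * 2);
  for (std::uint8_t byte : id) {
    out.push_back(digits[byte >> 4]);   // high nibble
    out.push_back(digits[byte & 0x0f]); // low nibble
  }
  return out;
}

// Hypothetical helper: build "<server>/buildid/<hex>/debuginfo".
// Trailing slashes on the server URL are dropped so the path has
// exactly one separator.
std::string debuginfoUrl(std::string server,
                         const std::vector<std::uint8_t> &id) {
  while (!server.empty() && server.back() == '/')
    server.pop_back();
  return server + "/buildid/" + buildIdToHex(id) + "/debuginfo";
}
```

Calling debuginfoUrl("https://example.invalid/", {0xab, 0x0c}) yields https://example.invalid/buildid/ab0c/debuginfo. The in-tree client in llvm/lib/Debuginfod handles caching, timeouts, and trying multiple servers, none of which this sketch attempts.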
I'd be happy to help with that and I'm sure @clayborg wouldn't mind providing guidance in this space.
That's fine if the new code can be more self contained. This patch is adding things like a |
Because that's how the environment variable works. It was less a choice and more 🤷 Also, ChatGPT tells me that URLs can include semicolons, so maybe it's not a great idea to use that as a delimiter? (I don't think Debuginfod server URLs can include semicolons, but I know they can't include spaces...)
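To make the delimiter discussion concrete, here is a small sketch of splitting a DEBUGINFOD_URLS-style value into individual server URLs. It assumes only the standard library; the function name splitDebuginfodUrls is hypothetical. Note one deliberate difference from the patch: the diff splits on the single space character and drops empty pieces, whereas this sketch splits on any run of whitespace.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a DEBUGINFOD_URLS-style string into URLs.
// operator>> skips any run of whitespace, so repeated or trailing
// spaces never produce empty entries.
std::vector<std::string> splitDebuginfodUrls(const std::string &value) {
  std::vector<std::string> urls;
  std::istringstream in(value);
  std::string url;
  while (in >> url)
    urls.push_back(url);
  return urls;
}
```

Because the result owns its strings, it does not depend on the lifetime of the buffer returned by std::getenv, which is one thing to keep in mind when a list of non-owning string references is stored globally.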
Are you asking me to create a SymbolServer class for this change, or do you want me to get this diff polished & landed, then create the abstraction before adding anything more? (Either is fine: I just can't quite tell what you're looking for right now) I also agree: it made me somewhat uncomfortable adding those dependencies where they were added. Even just hiding them in a SymbolServer.h file would help.
I'd also recommend having a LLDB setting reuse the DEBUGINFOD_URLS syntax, just to avoid creating an unnecessary difference between the LLDB and GDB syntax. |
I was suggesting the latter. As I was thinking about this a bit more and found some spare time, I gave the plugin conversion a try. I've created a PR (#71151) with the initial scaffolding and
|
FWIW the "plugin-ification" has landed so this should be able to move forward as a plugin. |
7409896
to
27f526a
Compare
I updated the diff with the plugin-ified version of the work. It's much cleaner with no debugger-wide changes to speak of (Thanks for the set up for that, @JDevlieghere!). I did not add Debuginfod logging in the LLDB part of the code, as I intend to add diagnostic debugging to the llvm-debuginfod library instead, such that all users will benefit from it. |
Looks like I missed committing some staged edits. Hold on before reviewing... |
d09be01
to
e9b6240
Compare
Looks good, a few suggestions in llvm/lib/Debuginfod/Debuginfod.cpp. I will let Jonas comment if the settings are done the right way in the plug-in manager. Plug-in code looks fine.
This breaks targeting macOS versions older than 10.12 for unnecessary reasons:
You should be able to use llvm::sys::RWMutex so please switch to that. See https://reviews.llvm.org/D138423 for an example. |
Noooooooooooo! I'll get this fixed up ASAP. Should I open an issue, or just put up a new PR? (Thanks for the example!) |
This change broke building against LLVM dylib:
Full build log: |
Sounds like I get to dig in deeper to LLVM's build, now. Hooray 🤣 |
@mgorny Can you please point me at where to find all the configuration for those builds? I'm stuck on a weirdo CentOS build (or MacOS) and can't manage to get my CMake configuration stuff to let me build a dylib/so. I'll get on this first thing tomorrow. |
@mgorny I've burned 3 hours this morning trying to get the visible configuration flags from that log file to function before my patch on (my weirdo CentOS 9) Linux box, or cause problems after my patch on macOS, and failed on both fronts. I'm stuck until you can get me the contents of |
Sounds like some configs don't create |
Nothing that I could find. The library is written such that if the configuration doesn't include either libcurl or httplib then all the functions just fail to do anything, so from what I can tell there'd never be a reason to disable it. |
I'm sorry for not getting back to you earlier. The log relevant to building LLVM itself is this (22M uncompressed): sys-devel:llvm-18.0.0_pre20231206:20231206-072428.log.gz I think the most relevant part is:
We don't install At this point, I think the only reasonable solution here is to include |
I've also confirmed that installing
I suspect that even if we hadn't used the dylib at all and installed all static libraries, it would be impossible to build LLDB against this install of LLVM as it wouldn't know about |
Is there some documentation that both me & google are missing somewhere? I'm trying to read between the lines of the scenario that's broken, with nothing but a log (that refers to a bunch of pretty important configuration files from somewhere I can't find) with a repo configuration that, again, I can't figure out. You say that we can't "build against LLVM as a dylib". Is this trying to build LLDB stand-alone, without a full LLVM repo around it? I'm also assuming you're actually talking about an SO and not a dylib, since the build is on Linux, not macOS. I really want to help, but I'm new to this space, so I'm missing a bunch of tribal knowledge. Alternatively, if you have a repro, and can fix it by adding LLVMDebuginfod to libLLVM, that seems reasonable to me, given the design of LLVMDebuginfod (small, falls back to no functionality without libcurl, etc...) I've burned all day today trying to get a build & repro going, and have failed (my build is quite complicated). I'll try this on a personal machine tomorrow which should be simpler (though a couple orders of magnitude slower) than my work behemoth...
"Dylib" is LLVM-speech for And yeah, if you make a patch to add LLVMDebuginfod to libLLVM, I can test that here. |
I've spent 5 or 6 more hours fighting this and I'm fully stuck on "No Repro". I have a libLLVM-git18.so built, independent of building LLDB. Then I configured LLDB to build against that as a standalone entity and validated that it worked (all on my personal machine, which is running Kali Linux under WSL2: basically just Ubuntu). Below, you'll find shell scripts that I've been using to configure the build (both for LLVM and then for the standalone LLDB build). Please tell me what I need to change in these scripts to repro the problem.
and this one for the lldb build:
|
Did you |
Ran the first script, ninja'd, sudo ninja install-distribution, then ran the second script and ninja'd again. I don't see the build depending on what's been installed, as it looks like setting LLVM_DIR makes the .so dependency on the binary sitting in the rel-llvm/lib directory instead.
I'm afraid that log's non-verbose (i.e. missing |
I'll try to get this together tonight or tomorrow morning. |
doBuild.sh:
llvm-config.sh:
lldb-config.sh:
And, here's the output of that whole mess: |
It's linking to |
Ping. |
Sorry: nice long holiday break. So calling that library a "left over from a previous build" confuses me. What's the point of the |
Ah, sorry, I didn't notice that. We're not passing |
I've returned to this after getting some other work up for a PR, and I'm stuck again. If I remove LLVM_DIR, the thing doesn't get anywhere. It explicitly asks for LLVM_DIR, and if I work through that, then it asks for Clang_DIR. I'm getting frustrated because it seems like the configuration you're stuck on isn't supported (anymore: I'm guessing it's a holdover from before the monorepo). I just can't find any documentation for building the way you're describing. Everything says to use LLVM_DIR. So, if I set LLVM_DIR, it (correctly) links against libDebuginfod.a just fine. And if I don't have LLVM_DIR set, there's a whole lot of stuff that fails to work properly. Is there some docker container I can spin up with your configuration already set up or something? I'm completely stuck again.
No, I don't have a "ready" configuration. You could start off Perhaps I should just send the "obvious" fix without bothering you. I can reproduce it anytime, and I doubt there's any other fix possible than adding |
Sure. That's probably the solution. I just can't get a reliable repro, so I can only validate that it doesn't break the mainstream monorepo build. I'm happy to shepherd any diff you've indicated can tell me it's fixed on your configuration. I truly appreciate the different worlds, here. You toss out "install a particular build of gentoo, which you've probably never done" while complaining about having to clone a repo I have on 5 different machines 🤣 |
I'm sorry for the complexity. I've finally gotten around to figuring it out, and it turns out it's a problem with CMake files: #79305 fixes it for me. |