Skip to content

DEBUGINFOD based DWP acquisition for LLDB #70996

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Dec 4, 2023
Merged

Conversation

kevinfrei
Copy link
Contributor

I've plumbed the LLVM DebugInfoD client into LLDB, and added automatic downloading of DWP files to the SymbolFileDWARF.cpp plugin. If you have DEBUGINFOD_URLS set to a space delimited set of web servers, LLDB will try to use them as a last resort when searching for DWP files. If you do not have that environment variable set, nothing should be changed. There's also a setting, per @clayborg 's suggestion, that will override the environment variable, or can be used instead of the environment variable. The setting is why I also needed to add an API to the llvm-debuginfod library

Test Plan:

Suggestions are welcome here. I should probably have some positive and negative tests, but I wanted to get the diff up for people who have a clue what they're doing to rip it to pieces before spending too much time validating the initial implementation.

@llvmbot
Copy link
Member

llvmbot commented Nov 1, 2023

@llvm/pr-subscribers-lldb

@llvm/pr-subscribers-debuginfo

Author: Kevin Frei (kevinfrei)

Changes

I've plumbed the LLVM DebugInfoD client into LLDB, and added automatic downloading of DWP files to the SymbolFileDWARF.cpp plugin. If you have DEBUGINFOD_URLS set to a space delimited set of web servers, LLDB will try to use them as a last resort when searching for DWP files. If you do not have that environment variable set, nothing should be changed. There's also a setting, per @clayborg 's suggestion, that will override the environment variable, or can be used instead of the environment variable. The setting is why I also needed to add an API to the llvm-debuginfod library

Test Plan:

Suggestions are welcome here. I should probably have some positive and negative tests, but I wanted to get the diff up for people who have a clue what they're doing to rip it to pieces before spending too much time validating the initial implementation.


Full diff: https://github.com/llvm/llvm-project/pull/70996.diff

10 Files Affected:

  • (modified) lldb/include/lldb/Target/Target.h (+3)
  • (modified) lldb/source/Core/CoreProperties.td (+1-1)
  • (modified) lldb/source/Core/Debugger.cpp (+5)
  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+1)
  • (modified) lldb/source/Symbol/CMakeLists.txt (+1)
  • (modified) lldb/source/Symbol/LocateSymbolFile.cpp (+18-2)
  • (modified) lldb/source/Target/Target.cpp (+18-1)
  • (modified) lldb/source/Target/TargetProperties.td (+4)
  • (modified) llvm/include/llvm/Debuginfod/Debuginfod.h (+4)
  • (modified) llvm/lib/Debuginfod/Debuginfod.cpp (+20-6)
diff --git a/lldb/include/lldb/Target/Target.h b/lldb/include/lldb/Target/Target.h
index 82045988018b606..cd5c88767c900d1 100644
--- a/lldb/include/lldb/Target/Target.h
+++ b/lldb/include/lldb/Target/Target.h
@@ -258,6 +258,8 @@ class TargetProperties : public Properties {
 
   bool GetDebugUtilityExpression() const;
 
+  Args GetDebugInfoDURLs() const;
+
 private:
   // Callbacks for m_launch_info.
   void Arg0ValueChangedCallback();
@@ -270,6 +272,7 @@ class TargetProperties : public Properties {
   void DisableASLRValueChangedCallback();
   void InheritTCCValueChangedCallback();
   void DisableSTDIOValueChangedCallback();
+  void DebugInfoDURLsChangedCallback();
 
   // Settings checker for target.jit-save-objects-dir:
   void CheckJITObjectsDir();
diff --git a/lldb/source/Core/CoreProperties.td b/lldb/source/Core/CoreProperties.td
index 92884258347e9be..865030b0133bbb2 100644
--- a/lldb/source/Core/CoreProperties.td
+++ b/lldb/source/Core/CoreProperties.td
@@ -4,7 +4,7 @@ let Definition = "modulelist" in {
   def EnableExternalLookup: Property<"enable-external-lookup", "Boolean">,
     Global,
     DefaultTrue,
-    Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched.">;
+    Desc<"Control the use of external tools and repositories to locate symbol files. Directories listed in target.debug-file-search-paths and directory of the executable are always checked first for separate debug info files. Then depending on this setting: On macOS, Spotlight would be also used to locate a matching .dSYM bundle based on the UUID of the executable. On NetBSD, directory /usr/libdata/debug would be also searched. On platforms other than NetBSD directory /usr/lib/debug would be also searched. If all other methods fail, and the DEBUGINFOD_URLS environment variable is specified, the Debuginfod protocol is used to acquire symbols from a compatible Debuginfod service.">;
   def EnableBackgroundLookup: Property<"enable-background-lookup", "Boolean">,
     Global,
     DefaultFalse,
diff --git a/lldb/source/Core/Debugger.cpp b/lldb/source/Core/Debugger.cpp
index 21f71e449ca5ed0..9a3e82f3e6a2adf 100644
--- a/lldb/source/Core/Debugger.cpp
+++ b/lldb/source/Core/Debugger.cpp
@@ -61,6 +61,8 @@
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/iterator.h"
+#include "llvm/Debuginfod/Debuginfod.h"
+#include "llvm/Debuginfod/HTTPClient.h"
 #include "llvm/Support/DynamicLibrary.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Process.h"
@@ -594,6 +596,9 @@ lldb::DWIMPrintVerbosity Debugger::GetDWIMPrintVerbosity() const {
 void Debugger::Initialize(LoadPluginCallbackType load_plugin_callback) {
   assert(g_debugger_list_ptr == nullptr &&
          "Debugger::Initialize called more than once!");
+  // We might be using the Debuginfod service, so we have to initialize the
+  // HTTPClient *before* any new threads start.
+  llvm::HTTPClient::initialize();
   g_debugger_list_mutex_ptr = new std::recursive_mutex();
   g_debugger_list_ptr = new DebuggerList();
   g_thread_pool = new llvm::ThreadPool(llvm::optimal_concurrency());
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index ee7164d2f050ed1..c036963a1ec6e87 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -4325,6 +4325,7 @@ const std::shared_ptr<SymbolFileDWARFDwo> &SymbolFileDWARF::GetDwpSymbolFile() {
     module_spec.GetSymbolFileSpec() =
         FileSpec(m_objfile_sp->GetModule()->GetFileSpec().GetPath() + ".dwp");
 
+    module_spec.GetUUID() = m_objfile_sp->GetUUID();
     FileSpecList search_paths = Target::GetDefaultDebugFileSearchPaths();
     FileSpec dwp_filespec =
         Symbols::LocateExecutableSymbolFile(module_spec, search_paths);
diff --git a/lldb/source/Symbol/CMakeLists.txt b/lldb/source/Symbol/CMakeLists.txt
index cec49b8b2cb4b63..91569f103cf86c8 100644
--- a/lldb/source/Symbol/CMakeLists.txt
+++ b/lldb/source/Symbol/CMakeLists.txt
@@ -48,6 +48,7 @@ add_lldb_library(lldbSymbol NO_PLUGIN_DEPENDENCIES
     lldbHost
     lldbTarget
     lldbUtility
+    LLVMDebuginfod
 
   LINK_COMPONENTS
     Support
diff --git a/lldb/source/Symbol/LocateSymbolFile.cpp b/lldb/source/Symbol/LocateSymbolFile.cpp
index 66ee7589ac60499..907287f4b4100b8 100644
--- a/lldb/source/Symbol/LocateSymbolFile.cpp
+++ b/lldb/source/Symbol/LocateSymbolFile.cpp
@@ -25,6 +25,8 @@
 #include "lldb/Utility/UUID.h"
 
 #include "llvm/ADT/SmallSet.h"
+#include "llvm/Debuginfod/Debuginfod.h"
+#include "llvm/Object/BuildID.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/ThreadPool.h"
 
@@ -396,8 +398,22 @@ Symbols::LocateExecutableSymbolFile(const ModuleSpec &module_spec,
       }
     }
   }
-
-  return LocateExecutableSymbolFileDsym(module_spec);
+  FileSpec dsym_bundle = LocateExecutableSymbolFileDsym(module_spec);
+  if (dsym_bundle)
+    return dsym_bundle;
+
+  // If we didn't find anything by looking locally, let's try Debuginfod.
+  if (module_uuid.IsValid() && llvm::canUseDebuginfod()) {
+    llvm::object::BuildID build_id(module_uuid.GetBytes());
+    llvm::Expected<std::string> result =
+        llvm::getCachedOrDownloadDebuginfo(build_id);
+    if (result)
+      return FileSpec(*result);
+    // An error is just fine, here...
+    consumeError(result.takeError());
+  }
+  // Just return the empty FileSpec if nothing was found.
+  return dsym_bundle;
 }
 
 void Symbols::DownloadSymbolFileAsync(const UUID &uuid) {
diff --git a/lldb/source/Target/Target.cpp b/lldb/source/Target/Target.cpp
index 5f8756c57675c95..1c0ead3677ea386 100644
--- a/lldb/source/Target/Target.cpp
+++ b/lldb/source/Target/Target.cpp
@@ -65,6 +65,7 @@
 
 #include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/SetVector.h"
+#include "llvm/Debuginfod/Debuginfod.h"
 
 #include <memory>
 #include <mutex>
@@ -4180,7 +4181,8 @@ TargetProperties::TargetProperties(Target *target)
         ePropertyInheritTCC, [this] { InheritTCCValueChangedCallback(); });
     m_collection_sp->SetValueChangedCallback(
         ePropertyDisableSTDIO, [this] { DisableSTDIOValueChangedCallback(); });
-
+    m_collection_sp->SetValueChangedCallback(
+        ePropertyDebugInfoDURLs, [this] { DebugInfoDURLsChangedCallback(); });
     m_collection_sp->SetValueChangedCallback(
         ePropertySaveObjectsDir, [this] { CheckJITObjectsDir(); });
     m_experimental_properties_up =
@@ -4892,6 +4894,21 @@ void TargetProperties::SetDebugUtilityExpression(bool debug) {
   SetPropertyAtIndex(idx, debug);
 }
 
+Args TargetProperties::GetDebugInfoDURLs() const {
+  Args urls;
+  m_collection_sp->GetPropertyAtIndexAsArgs(ePropertyDebugInfoDURLs, urls);
+  return urls;
+}
+
+void TargetProperties::DebugInfoDURLsChangedCallback() {
+  Args urls = GetDebugInfoDURLs();
+  llvm::SmallVector<llvm::StringRef> dbginfod_urls;
+  std::transform(urls.begin(), urls.end(), dbginfod_urls.end(),
+                 [](const auto &obj) { return obj.ref(); });
+  llvm::setDefaultDebuginfodUrls(dbginfod_urls);
+}
+
+
 // Target::TargetEventData
 
 Target::TargetEventData::TargetEventData(const lldb::TargetSP &target_sp)
diff --git a/lldb/source/Target/TargetProperties.td b/lldb/source/Target/TargetProperties.td
index 154a6e5919ab0cd..c21c9d86c416c34 100644
--- a/lldb/source/Target/TargetProperties.td
+++ b/lldb/source/Target/TargetProperties.td
@@ -195,6 +195,10 @@ let Definition = "target" in {
   def DebugUtilityExpression: Property<"debug-utility-expression", "Boolean">,
     DefaultFalse,
     Desc<"Enable debugging of LLDB-internal utility expressions.">;
+  def DebugInfoDURLs: Property<"debuginfod-urls", "Array">,
+    Global,
+    ElementType<"String">,
+    Desc<"A list valid debuginfod server URLs that can be used to locate symbol files.">;
 }
 
 let Definition = "process_experimental" in {
diff --git a/llvm/include/llvm/Debuginfod/Debuginfod.h b/llvm/include/llvm/Debuginfod/Debuginfod.h
index ec7f5691dda4fbf..9351af27cc5fe2c 100644
--- a/llvm/include/llvm/Debuginfod/Debuginfod.h
+++ b/llvm/include/llvm/Debuginfod/Debuginfod.h
@@ -46,6 +46,10 @@ bool canUseDebuginfod();
 /// environment variable.
 SmallVector<StringRef> getDefaultDebuginfodUrls();
 
+/// Sets the list of debuginfod server URLs to query. This overrides the
+/// environment variable DEBUGINFOD_URLS.
+void setDefaultDebuginfodUrls(SmallVector<StringRef> URLs);
+
 /// Finds a default local file caching directory for the debuginfod client,
 /// first checking DEBUGINFOD_CACHE_PATH.
 Expected<std::string> getDefaultDebuginfodCacheDirectory();
diff --git a/llvm/lib/Debuginfod/Debuginfod.cpp b/llvm/lib/Debuginfod/Debuginfod.cpp
index fa4c1a0499f059e..a74dfc5900cdaf5 100644
--- a/llvm/lib/Debuginfod/Debuginfod.cpp
+++ b/llvm/lib/Debuginfod/Debuginfod.cpp
@@ -47,6 +47,10 @@ namespace llvm {
 
 using llvm::object::BuildIDRef;
 
+SmallVector<StringRef> DebuginfodUrls;
+
+bool DebuginfodUrlsSet = false;
+
 static std::string uniqueKey(llvm::StringRef S) {
   return utostr(xxh3_64bits(S));
 }
@@ -62,15 +66,25 @@ bool canUseDebuginfod() {
 }
 
 SmallVector<StringRef> getDefaultDebuginfodUrls() {
-  const char *DebuginfodUrlsEnv = std::getenv("DEBUGINFOD_URLS");
-  if (DebuginfodUrlsEnv == nullptr)
-    return SmallVector<StringRef>();
-
-  SmallVector<StringRef> DebuginfodUrls;
-  StringRef(DebuginfodUrlsEnv).split(DebuginfodUrls, " ");
+  if (!DebuginfodUrlsSet) {
+    // Only read from the environment variable if the user hasn't already
+    // set the value
+    const char *DebuginfodUrlsEnv = std::getenv("DEBUGINFOD_URLS");
+    if (DebuginfodUrlsEnv != nullptr) {
+      StringRef(DebuginfodUrlsEnv).split(DebuginfodUrls, " ", -1, false);
+    }
+    DebuginfodUrlsSet = true;
+  }
   return DebuginfodUrls;
 }
 
+// Override the default debuginfod URL list.
+void setDefaultDebuginfodUrls(SmallVector<StringRef> URLs) {
+  DebuginfodUrls.clear();
+  DebuginfodUrls.insert(DebuginfodUrls.begin(), URLs.begin(), URLs.end());
+  DebuginfodUrlsSet = true;
+}
+
 /// Finds a default local file caching directory for the debuginfod client,
 /// first checking DEBUGINFOD_CACHE_PATH.
 Expected<std::string> getDefaultDebuginfodCacheDirectory() {

Copy link

github-actions bot commented Nov 1, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Member

@walter-erquinigo walter-erquinigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty nice feature!

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why did you choose the delimiter as ' ' instead of something like ';'?

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a new setting added to lldb that should be the default way to enable the debuginfod support. Just adding something to lldb/source/Target/TargetProperties.td like:

  def DebuginfodURLs: Property<"debuginfod-urls", "Args">,
    Global,
    DefaultStringValue<"">,
    Desc<"A list of debuginfod URLs that will be linearly called to search for debug info.">;

Doesn't need to live in "target.*" though.

@clayborg
Copy link
Collaborator

clayborg commented Nov 2, 2023

I would like to see a new setting added to lldb that should be the default way to enable the debuginfod support. Just adding something to lldb/source/Target/TargetProperties.td like:

  def DebuginfodURLs: Property<"debuginfod-urls", "Args">,
    Global,
    DefaultStringValue<"">,
    Desc<"A list of debuginfod URLs that will be linearly called to search for debug info.">;

Doesn't need to live in "target.*" though.

If this was added in target, then we could do:

(lldb) settings set target.debuginfod-urls http://url1 http://url2 http://url3

@JDevlieghere
Copy link
Member

JDevlieghere commented Nov 2, 2023

First off, thank you for working on this. debuginfod has been on my radar since support was added to LLVM and I was curious at which point someone was going to add support for it to LLDB. I wasn't super familiar with what exactly it provides, and in case others here aren't either, I found the following documentation very helpful: https://llvm.org/doxygen/Debuginfod_8h.html.
TL;DR: it gives you a way to get sources, (symbol rich) executables and debug info files.

The functionality debuginfod offers seems very similar to dsymForUUID (see https://lldb.llvm.org/use/symbols.html). We have series of functions such as LocateExecutableObjectFile and LocateExecutableSymbolFile in LocateSymbolFile.h that abstract over this. It seems appropriate that we use the same abstractions for debuginfod which avoids the special casing this patch introduces and a lot of things will "just work".

The one caveat is that the current implementation is really tied to a platform. All the dsymForUUID stuff is implemented in LocateSymbolFileMacOSX.cpp. I think we might need to move this to a Plugin model where we can have multiple implementations. The plugin name SymbolVendor is already taken so I would call this the SymbolServer plugin. Converting the current model to a plugin should be done in separate patch and myself or other Apple folks can make sure this doesn't break the dsymForUUID paths (we have some tests but not enough given how critical this is for us). With that done, it should be trivial to add a plugin implementation that calls into debuginfod

@kevinfrei
Copy link
Contributor Author

First off, thank you for working on this. debuginfod has been on my radar since support was added to LLVM and I was curious at which point someone was going to add support for it to LLDB. I wasn't super familiar with what exactly it provides, and in case others here aren't either, I found the following documentation very helpful: https://llvm.org/doxygen/Debuginfod_8h.html. TL;DR: it gives you a way to get sources, (symbol rich) executables and debug info files.

The functionality debuginfod offers seems very similar to dsymForUUID (see https://lldb.llvm.org/use/symbols.html). We have series of functions such as LocateExecutableObjectFile and LocateExecutableSymbolFile in LocateSymbolFile.h that abstract over this. It seems appropriate that we use the same abstractions for debuginfod which avoids the special casing this patch introduces and a lot of things will "just work".

The one caveat is that the current implementation is really tied to a platform. All the dsymForUUID stuff is implemented in LocateSymbolFileMacOSX.cpp. I think we might need to move this to a Plugin model where we can have multiple implementations. The plugin name SymbolVendor is already taken so I would call this the SymbolServer plugin. Converting the current model to a plugin should be done in separate patch and myself or other Apple folks can make sure this doesn't break the dsymForUUID paths (we have some tests but not enough given how critical this is for us). With that done, it should be trivial to add a plugin implementation that calls into debuginfod

Yes, that specific kind of refactoring seemed like a good idea, but given that this is my first real foray into the LLDB space, I didn't want to bite off that much work to start with. Once I've got the full DEBUGINFOD capabilities plumbed, refactoring it out into a SymbolServer plug-in makes lots of sense.

You can also read about the full DebugInfoD protocol (it's not very complicated) from the RedHat blog introducing it in 2019.

@JDevlieghere
Copy link
Member

Yes, that specific kind of refactoring seemed like a good idea, but given that this is my first real foray into the LLDB space, I didn't want to bite off that much work to start with.

I'd be happy to help with that and I'm sure @clayborg wouldn't mind providing guidance in this space.

Once I've got the full DEBUGINFOD capabilities plumbed, refactoring it out into a SymbolServer plug-in makes lots of sense.

That's fine if the new code can be more self contained. This patch is adding things like a HTTPClient to the debugger or a dependency on Debuginfod.h in Target. I don't think either of those things belong there. At the very least should be abstracted behind LocateSymbol. All its functions are static which makes that hard, which is why I suggested symbol server plugin. Maybe as an intermediate step we can start by turning that class into something that can be instantiated?

@kevinfrei
Copy link
Contributor Author

kevinfrei commented Nov 2, 2023

Out of curiosity, why did you choose the delimiter as ' ' instead of something like ';'?

Because that's how the environment variable works. It was less a choice and more 🤷

Also, ChatGPT tells me that URL's can include semicolons, so maybe it's not a great idea to use that as a delimiter? (I don't think Debuginfod server URL's can include semicolons, but I know they can't include spaces...)

@kevinfrei
Copy link
Contributor Author

Yes, that specific kind of refactoring seemed like a good idea, but given that this is my first real foray into the LLDB space, I didn't want to bite off that much work to start with.

I'd be happy to help with that and I'm sure @clayborg wouldn't mind providing guidance in this space.

Once I've got the full DEBUGINFOD capabilities plumbed, refactoring it out into a SymbolServer plug-in makes lots of sense.

That's fine if the new code can be more self contained. This patch is adding things like a HTTPClient to the debugger or a dependency on Debuginfod.h in Target. I don't think either of those things belong there. At the very least should be abstracted behind LocateSymbol. All its functions are static which makes that hard, which is why I suggested symbol server plugin. Maybe as an intermediate step we can start by turning that class into something that can be instantiated?

Are you asking me to create a SymbolServer class for this change, or do you want me to get this diff polished & landed, then create the abstraction before adding anything more? (Either is fine: I just can't quite tell what you're looking for right now)

I also agree: it made me someone uncomfortable adding those dependencies where they were added. Even just hiding them in a SymbolServer.h file would help.

@mysterymath
Copy link
Contributor

Out of curiosity, why did you choose the delimiter as ' ' instead of something like ';'?

Because that's how the environment variable works. It was less a choice and more 🤷

Also, ChatGPT tells me that URL's can include semicolons, so maybe it's not a great idea to use that as a delimiter? (I don't think Debuginfod server URL's can include semicolons, but I know they can't include spaces...)

I'd also recommend having a LLDB setting reuse the DEBUGINFOD_URLS syntax, just to avoid creating an unnecessary difference between the LLDB and GDB syntax.

@JDevlieghere
Copy link
Member

Are you asking me to create a SymbolServer class for this change, or do you want me to get this diff polished & landed, then create the abstraction before adding anything more? (Either is fine: I just can't quite tell what you're looking for right now)

I was suggesting the latter.

As I was thinking about this a bit more and found some spare time, I gave the plugin conversion a try. I've created a PR (#71151) with the initial scaffolding and LocateExecutableObjectFile moved into the plugin. Once that's complete, everything you're adding should fit perfectly in a new Debuginfod SymbolLocator plugin:

  • The call to llvm::HTTPClient::initialize() can be done in the plugin instantiation.
  • Plugins can have their own settings so we can set the URLs there.
  • Because of the way plugins work, they are tried in the order defined in the CMakeLists.txt, so if you make sure the debuginfod comes last, it'll be the fallback automatically.

@JDevlieghere
Copy link
Member

FWIW the "plugin-ification" has landed so this should be able to move forward as a plugin.

@kevinfrei
Copy link
Contributor Author

I updated the diff with the plugin-ified version of the work. It's much cleaner with no debugger-wide changes to speak of (Thanks for the set up for that, @JDevlieghere!). I did not add Debuginfod logging in the LLDB part of the code, as I intend to add diagnostic debugging to the llvm-debuginfod library instead, such that all users will benefit from it.

@kevinfrei
Copy link
Contributor Author

Looks like I missed committing some staged edits. Hold on before reviewing...

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few suggestions in llvm/lib/Debuginfod/Debuginfod.cpp. I will let Jonas comment if the settings are done the right way in the plug-in manager. Plug-in code looks fine.

@kevinfrei kevinfrei deleted the debuginfod branch December 4, 2023 19:47
@nico
Copy link
Contributor

nico commented Dec 4, 2023

This breaks targeting macOS versions older than 10.12 for unnecessary reasons:

../../llvm/lib/Debuginfod/Debuginfod.cpp:54:6: error: 'shared_mutex' is unavailable: introduced in macOS 10.12
std::shared_mutex UrlsMutex;
     ^

You should be able to use llvm::sys::RWMutex so please switch to that. See https://reviews.llvm.org/D138423 for an example.

@kevinfrei
Copy link
Contributor Author

This breaks targeting macOS versions older than 10.12 for unnecessary reasons:

../../llvm/lib/Debuginfod/Debuginfod.cpp:54:6: error: 'shared_mutex' is unavailable: introduced in macOS 10.12
std::shared_mutex UrlsMutex;
     ^

You should be able to use llvm::sys::RWMutex so please switch to that. See https://reviews.llvm.org/D138423 for an example.

Noooooooooooo!

I'll get this fixed up ASAP. Should I open an issue, or just put up a new PR? (Thanks for the example!)

@mgorny
Copy link
Member

mgorny commented Dec 6, 2023

This change broke building against LLVM dylib:

/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lLLVMDebuginfod: No such file or directory
collect2: error: ld returned 1 exit status
samu: subcommand failed

Full build log:
dev-util:lldb-18.0.0_pre20231206:20231206-163149.log

@kevinfrei
Copy link
Contributor Author

This change broke building against LLVM dylib:

/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lLLVMDebuginfod: No such file or directory
collect2: error: ld returned 1 exit status
samu: subcommand failed

Full build log: dev-util:lldb-18.0.0_pre20231206:20231206-163149.log

Sounds like I get to dig in deeper to LLVM's build, now. Hooray 🤣

@kevinfrei
Copy link
Contributor Author

This change broke building against LLVM dylib:

/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lLLVMDebuginfod: No such file or directory
collect2: error: ld returned 1 exit status
samu: subcommand failed

Full build log: dev-util:lldb-18.0.0_pre20231206:20231206-163149.log

@mgorny Can you please point me at where to find all the configuration for those builds? I'm stuck on a weirdo CentOS build (or MacOS) and can't manage to get my CMake configuration stuff to let me build a dylib/so. I'll get on this first thing tomorrow.

@kevinfrei
Copy link
Contributor Author

This change broke building against LLVM dylib:

/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lLLVMDebuginfod: No such file or directory
collect2: error: ld returned 1 exit status
samu: subcommand failed

Full build log: dev-util:lldb-18.0.0_pre20231206:20231206-163149.log

@mgorny Can you please point me at where to find all the configuration for those builds? I'm stuck on a weirdo CentOS build (or MacOS) and can't manage to get my CMake configuration stuff to let me build a dylib/so. I'll get on this first thing tomorrow.

This change broke building against LLVM dylib:

/usr/lib/gcc/x86_64-pc-linux-gnu/13/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lLLVMDebuginfod: No such file or directory
collect2: error: ld returned 1 exit status
samu: subcommand failed

Full build log: dev-util:lldb-18.0.0_pre20231206:20231206-163149.log

@mgorny I've burned 3 hours this morning trying to get the visible configuration flags from that log file to function before my patch on (my weirdo CentOS 9) Linux box, or cause problems after my patch on macOS, and failed on both fronts. I'm stuck until you can get me the contents of gentoo_common_config.cmake (or tell me where to find it...)

@clayborg
Copy link
Collaborator

clayborg commented Dec 7, 2023

Sounds like some configs don't create LLVMDebuginfod target possibly because it isn't enabled. Is there a cmake option to disable LLVMDebuginfod or some magic in the Cmake config file for this target?

@kevinfrei
Copy link
Contributor Author

Sounds like some configs don't create LLVMDebuginfod target possibly because it isn't enabled. Is there a cmake option to disable LLVMDebuginfod or some magic in the Cmake config file for this target?

Nothing that I could find. The library is written such that if the configuration doesn't include either libcurl or httplib then all the functions just fail to do anything, so from what I can tell there'd never be a reason to disable it.

@mgorny
Copy link
Member

mgorny commented Dec 10, 2023

I'm sorry for not getting back to you earlier. The log relevant to building LLVM itself is this (22M uncompressed):

sys-devel:llvm-18.0.0_pre20231206:20231206-072428.log.gz

I think the most relevant part is:

-DLLVM_DISTRIBUTION_COMPONENTS="LLVM;LTO;Remarks;llvm-config;cmake-exports;llvm-headers;LLVMDemangle;LLVMSupport;LLVMTableGen;llvm_gtest;llvm_gtest_main;LLVMTestingAnnotations;LLVMTestingSupport;llvm-tblgen;FileCheck;llvm-PerfectShuffle;count;not;yaml-bench;UnicodeNameMappingGenerator;bugpoint;dsymutil;llc;lli;lli-child-target;llvm-addr2line;llvm-ar;llvm-as;llvm-bcanalyzer;llvm-bitcode-strip;llvm-c-test;llvm-cat;llvm-cfi-verify;llvm-config;llvm-cov;llvm-cvtres;llvm-cxxdump;llvm-cxxfilt;llvm-cxxmap;llvm-debuginfo-analyzer;llvm-debuginfod-find;llvm-diff;llvm-dis;llvm-dlltool;llvm-dwarfdump;llvm-dwarfutil;llvm-dwp;llvm-exegesis;llvm-extract;llvm-gsymutil;llvm-ifs;llvm-install-name-tool;llvm-jitlink;llvm-jitlink-executor;llvm-lib;llvm-libtool-darwin;llvm-link;llvm-lipo;llvm-lto;llvm-lto2;llvm-mc;llvm-mca;llvm-ml;llvm-modextract;llvm-mt;llvm-nm;llvm-objcopy;llvm-objdump;llvm-opt-report;llvm-otool;llvm-pdbutil;llvm-profdata;llvm-profgen;llvm-ranlib;llvm-rc;llvm-readelf;llvm-readobj;llvm-readtapi;llvm-reduce;llvm-remarkutil;llvm-rtdyld;llvm-sim;llvm-size;llvm-split;llvm-stress;llvm-strings;llvm-strip;llvm-symbolizer;llvm-tli-checker;llvm-undname;llvm-windres;llvm-xray;obj2yaml;opt;sancov;sanstats;split-file;verify-uselistorder;yaml2obj;opt-viewer;docs-dsymutil-man;docs-llvm-dwarfdump-man;docs-llvm-man;docs-llvm-html;LLVMgold;llvm-debuginfod"

We don't install LLVMDebuginfod.a at all right now. I suppose I could try doing that but last we tried to do something similar, all hell broke loose because of its dependency on other libraries from LLVM. It's a mess, I'm afraid.

At this point, I think the only reasonable solution here is to include LLVMDebuginfod in libLLVM.

@mgorny
Copy link
Member

mgorny commented Dec 10, 2023

I've also confirmed that installing LLVMDebuginfod target won't help since it references other internal targets that are created at build-time and not included as part of the installed CMake configuration:

CMake Error at /usr/lib/llvm/18/lib64/cmake/llvm/LLVMExports.cmake:90 (set_target_properties):
  The link interface of target "LLVMDebuginfod" contains:

    httplib::httplib

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /usr/lib/llvm/18/lib64/cmake/llvm/LLVMConfig.cmake:355 (include)
  cmake/modules/LLDBStandalone.cmake:9 (find_package)
  CMakeLists.txt:34 (include)

I suspect that even if we hadn't used the dylib at all and installed all static libraries, it would be impossible to build LLDB against this install of LLVM as it wouldn't know about httplib::httplib — that is, unless LLVM was built without httplib support.

@kevinfrei
Copy link
Contributor Author

Is there some documentation that both me & google are missing somewhere? I'm trying to read between the lines of the scenario that's broken, with nothing but a log (that refers to a bunch of pretty important configuration files from somewhere I can't find) with a repo configuration that, again, I can't figure out. You say that we can't "build against LLMV as a dylib". Is this trying to build LLDB stand-alone, without a full LLVM repo around it? I'm also assuming you're actually talking about an SO and not a dylib, since the build is on Linux, not macOS. I really want to help, but I'm recently new to this space, so I'm missing a bunch of tribal knowledge.

Alternatively, if you have a repro, and can fix it by adding LLVMDebuginfod to libLLVM, that seems reasonable to me, given the design of LLVMDebuginfod (small, falls back to no functionality without libcurl, etc...) I've burned all day today trying to get a build & repro going, and have failed (my build is quite complicated). I'll try this on a personal machine tomorrow which should be simpler (though a couple orders of magnitude slower) than my work behemoth...

@mgorny
Copy link
Member

mgorny commented Dec 12, 2023

"Dylib" is LLVM-speech for libLLVM.so ("shared libraries" refer to individual split libraries, while "dylib" refers to the monolith), as in -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON. And yes, that + standalone build with full source repository but linking against LLVM+Clang installed with install-distribution target.

And yeah, if you make a patch to add LLVMDebuginfod to libLLVM, I can test that here.

@kevinfrei
Copy link
Contributor Author

I've spent 5 or 6 more hours fighting this and I'm fully stuck on "No Repro". I have a libLLVM-git18.so built, independent of building LLDB. Then I configured LLDB to build against that as a standalone entity and validated that it worked (all on my personal machine, which is running Kali Linux under WSL2: basically just Ubuntu). Below, you'll find shell scripts that I've been using to configure the build (both for LLVM and then for the standalone LLDB build). Please tell me what I need to change in these scripts to repro the problem.

#!/bin/sh
# Configure LLVM for subsequent use by an LLDB build
cmake -G Ninja \
  -B rel-llvm \
  -S llvm/llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS=clang \
  -DCMAKE_INSTALL_PREFIX=/usr \
  -DLLVM_APPEND_VC_REV=OFF \
  -DCMAKE_INSTALL_PREFIX=/usr/lib/llvm/18 \
  -DBUILD_SHARED_LIBS=OFF \
  -DLLVM_BUILD_LLVM_DYLIB=ON \
  -DLLVM_LINK_LLVM_DYLIB=ON \
  -DLLVM_TARGETS_TO_BUILD="AArch64;X86" \
  -DLLVM_INCLUDE_BENCHMARKS=OFF \
  -DLLVM_ENABLE_ZSTD=yes \
  -DLLVM_ENABLE_CURL=yes \
  -DLLVM_HOST_TRIPLE=i686-pc-linux-gnu \
  -DLLVM_DISTRIBUTION_COMPONENTS="LLVM;LTO;Remarks;llvm-config;cmake-exports;llvm-headers;LLVMDemangle;LLVMSupport;LLVMTableGen;llvm-tblgen;dsymutil;llc;lli;llvm-addr2line;llvm-ar;llvm-as;llvm-bcanalyzer;llvm-bitcode-strip;llvm-c-test;llvm-cat;llvm-cfi-verify;llvm-config;llvm-cov;llvm-cvtres;llvm-cxxdump;llvm-cxxfilt;llvm-cxxmap;llvm-debuginfo-analyzer;llvm-debuginfod-find;llvm-diff;llvm-dis;llvm-dlltool;llvm-dwarfdump;llvm-dwarfutil;llvm-dwp;llvm-exegesis;llvm-extract;llvm-gsymutil;llvm-ifs;llvm-install-name-tool;llvm-jitlink;llvm-lib;llvm-libtool-darwin;llvm-link;llvm-lipo;llvm-lto;llvm-lto2;llvm-mc;llvm-mca;llvm-ml;llvm-modextract;llvm-mt;llvm-nm;llvm-objcopy;llvm-objdump;llvm-opt-report;llvm-otool;llvm-pdbutil;llvm-profdata;llvm-profgen;llvm-ranlib;llvm-rc;llvm-readelf;llvm-readobj;llvm-readtapi;llvm-reduce;llvm-remarkutil;llvm-rtdyld;llvm-sim;llvm-size;llvm-split;llvm-stress;llvm-strings;llvm-strip;llvm-symbolizer;llvm-tli-checker;llvm-undname;llvm-windres;llvm-xray;opt;sancov;sanstats;verify-uselistorder;opt-viewer;llvm-debuginfod"

and this one for the lldb build:

#!/bin/sh
cmake -G Ninja \
  -B rel-lldb \
  -S llvm/lldb \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="AArch64;X86" \
  -DCMAKE_INSTALL_PREFIX=/usr \
  -DLLVM_ENABLE_LIBEDIT=yes \
  -DLLVM_ENABLE_TERMINFO=yes \
  -DLLVM_ENABLE_LIBXML2=yes \
  -DLLVM_ENABLE_EH=ON \
  -DLLVM_ENABLE_RTTI=ON \
  -DLLVM_ENABLE_CURL=yes \
  -DLLVM_HOST_TRIPLE=i686-pc-linux-gnu \
  -DPython3_EXECUTABLE=/usr/bin/python3 \
  -DLLVM_DIR=/home/freik/src/rel-llvm/lib/cmake/llvm

@mgorny
Copy link
Member

mgorny commented Dec 12, 2023

Did you ninja install-distribution rather than regular install? Did it pick the right installation? Could you attach the build logs?

@kevinfrei
Copy link
Contributor Author

kevinfrei commented Dec 12, 2023

Ran the first script, ninja'd, sudo nina install-distribution, then ran the second script and ninja'd again.

I don't see the build depending on what's been installed, as it looks like setting LLVM_DIR makes the .so dependency on the binary sitting in the rel-llvm/lib directory instead.

build.log.gz

@mgorny
Copy link
Member

mgorny commented Dec 14, 2023

I'm afraid that log's non-verbose (i.e. missing -v), so it doesn't tell me which libraries are being linked. Also, I don't think I'll be able to tell much without logs for all of cmake, build and install of both LLVM and LLDB.

@kevinfrei
Copy link
Contributor Author

I'll try to get this together tonight or tomorrow morning.

@kevinfrei
Copy link
Contributor Author

doBuild.sh:

#!/bin/sh
./llvm-config.sh
cd rel-llvm
ninja -v
sudo ninja -v install-distribution
cd ..
./lldb-config.sh
cd rel-lldb
ninja -v

llvm-config.sh:

#!/bin/sh
cmake -G Ninja \
  -B rel-llvm \
  -S llvm/llvm \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_PROJECTS=clang \
  -DCMAKE_INSTALL_PREFIX=/usr \
  -DLLVM_APPEND_VC_REV=OFF \
  -DCMAKE_INSTALL_PREFIX=/usr/lib/llvm/18 \
  -DBUILD_SHARED_LIBS=OFF \
  -DLLVM_BUILD_LLVM_DYLIB=ON \
  -DLLVM_LINK_LLVM_DYLIB=ON \
  -DLLVM_TARGETS_TO_BUILD="AArch64;X86" \
  -DLLVM_INCLUDE_BENCHMARKS=OFF \
  -DLLVM_ENABLE_ZSTD=yes \
  -DLLVM_ENABLE_CURL=yes \
  -DLLVM_HOST_TRIPLE=i686-pc-linux-gnu \
  -DLLVM_DISTRIBUTION_COMPONENTS="LLVM;LTO;Remarks;llvm-config;cmake-exports;llvm-headers;LLVMDemangle;LLVMSupport;LLVMTableGen;llvm-tblgen;dsymutil;llc;lli;llvm-addr2line;llvm-ar;llvm-as;llvm-bcanalyzer;llvm-bitcode-strip;llvm-c-test;llvm-cat;llvm-cfi-verify;llvm-config;llvm-cov;llvm-cvtres;llvm-cxxdump;llvm-cxxfilt;llvm-cxxmap;llvm-debuginfo-analyzer;llvm-debuginfod-find;llvm-diff;llvm-dis;llvm-dlltool;llvm-dwarfdump;llvm-dwarfutil;llvm-dwp;llvm-exegesis;llvm-extract;llvm-gsymutil;llvm-ifs;llvm-install-name-tool;llvm-jitlink;llvm-lib;llvm-libtool-darwin;llvm-link;llvm-lipo;llvm-lto;llvm-lto2;llvm-mc;llvm-mca;llvm-ml;llvm-modextract;llvm-mt;llvm-nm;llvm-objcopy;llvm-objdump;llvm-opt-report;llvm-otool;llvm-pdbutil;llvm-profdata;llvm-profgen;llvm-ranlib;llvm-rc;llvm-readelf;llvm-readobj;llvm-readtapi;llvm-reduce;llvm-remarkutil;llvm-rtdyld;llvm-sim;llvm-size;llvm-split;llvm-stress;llvm-strings;llvm-strip;llvm-symbolizer;llvm-tli-checker;llvm-undname;llvm-windres;llvm-xray;opt;sancov;sanstats;verify-uselistorder;opt-viewer;llvm-debuginfod"

lldb-config.sh:

#!/bin/sh
cmake -G Ninja \
  -B rel-lldb \
  -S llvm/lldb \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_TARGETS_TO_BUILD="AArch64;X86" \
  -DCMAKE_INSTALL_PREFIX=/usr \
  -DLLVM_ENABLE_LIBEDIT=yes \
  -DLLVM_ENABLE_TERMINFO=yes \
  -DLLVM_ENABLE_LIBXML2=yes \
  -DLLVM_ENABLE_EH=ON \
  -DLLVM_ENABLE_RTTI=ON \
  -DLLVM_ENABLE_CURL=yes \
  -DLLVM_HOST_TRIPLE=i686-pc-linux-gnu \
  -DPython3_EXECUTABLE=/usr/bin/python3 \
  -DLLVM_DIR=/home/freik/src/rel-llvm/lib/cmake/llvm

And, here's the output of that whole mess:
hopefully-enough.log.gz

@mgorny
Copy link
Member

mgorny commented Dec 15, 2023

It's linking to /home/freik/src/rel-llvm/lib/libLLVMDebuginfod.a, i.e. the leftover build directory in your case. It doesn't exist anymore in our case. Try removing it, that should trigger the error.

@mgorny
Copy link
Member

mgorny commented Jan 4, 2024

Ping.

@kevinfrei
Copy link
Contributor Author

Sorry: nice long holiday break. So calling that library a "left over from a previous build" confuses me. What's the point of the -DLLVM_DIR=/home/freik/src/rel-llvm/lib/cmake/llvm argument to the second cmake config if not to instruct the build where to find stuff from the full LLVM build? If I delete that library, I can, indeed, reproduce the issue [now working on it] but that also leads me to believe that my repro still isn't actually doing what your configuration is. Because of that, I have no confidence that what I do to fix the issue is going to actually fix it 😞

@mgorny
Copy link
Member

mgorny commented Jan 6, 2024

Ah, sorry, I didn't notice that. We're not passing LLVM_DIR at all. LLDB is able to find installed LLVM & Clang via PATH, and that's all it needed.

@kevinfrei
Copy link
Contributor Author

I've returned to this after getting some other work up for a PR, and I'm stuck again. If I remove LLVM_DIR, the thing doesn't get anywhere. It explicitly asks for LLVM_DIR, if I work through that, then it asks for Clang_DIR. I'm getting frustrated because it seems like the configuration you're stuck on isn't supported (anymore: I'm guessing it's a holdover from before the monorepo). I just can't find any documentation for building the way you're describing. Everything says to use LLVM_DIR. So, if I set LLVM_DIR, it (correctly) links against libDebuginfod.a just fine. And if I don't have LLVM_DIR set, there'a whole lot of stuff that fails to work properly.

Is there some docker container I can spin up with your configuration already setup or something? I'm completely stuck again.

@mgorny
Copy link
Member

mgorny commented Jan 19, 2024

LLVM_DIR and Clang_DIR are merely hints to find_package(). If you need to pass them, it simply means you haven't set PATH correctly and CMake can't find installed LLVM/Clang. However, that shouldn't make any difference as long as you provide paths to installed LLVM/Clang, and not to the source/build directory. That said, you may actually need to remove the build directory because I don't know if things don't leak through.

No, I don't have a "ready" configuration. You could start off gentoo/stage3 and install LLVM/Clang/LLDB live ebuilds to reproduce but that's a lot of effort, and I don't think it's worth your time.

Perhaps I should just send the "obvious" fix without bothering you. I can reproduce it anytime, and I doubt there's any other fix possible than adding LLVMDebuginfod library to libLLVM. I just need a lot of energy to start contributing to LLVM because it's painful.

@kevinfrei
Copy link
Contributor Author

Sure. That's probably the solution. I just can't get a reliable repro, so I can only validate that it doesn't break the mainstream monorepo build. I'm happy to shepherd any diff you've indicated can tell me it's fixed on your configuration.

I truly appreciate the different worlds, here. You toss out "install a particular build of gentoo, which you've probably never done" while complaining about having to clone a repo I have on 5 different machines 🤣

@mgorny
Copy link
Member

mgorny commented Jan 24, 2024

I'm sorry for the complexity. I've finally gotten around to figuring it out, and it turns out it's a problem with CMake files: #79305 fixes it for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.