Skip to content

[LLDB] Add more helper functions to CompilerType class (second try). #73472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Dec 14, 2023

Conversation

cmtice
Copy link
Contributor

@cmtice cmtice commented Nov 27, 2023

This adds 23 new helper functions to LLDB's CompilerType class, things like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a few).

It also has run clang-format on the files CompilerType.{h,cpp}.

These helper functions are needed as part of the implementation for the Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).

This adds 23 new helper functions to LLDB's CompilerType class, things
like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a
few).

It also has run clang-format on the files CompilerType.{h,cpp}.

These helper functions are needed as part of the implementation for
the Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).
@cmtice cmtice requested a review from JDevlieghere as a code owner November 27, 2023 01:29
@llvmbot llvmbot added the lldb label Nov 27, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 27, 2023

@llvm/pr-subscribers-lldb

Author: None (cmtice)

Changes

This adds 23 new helper functions to LLDB's CompilerType class, things like IsSmartPtrType, IsPromotableIntegerType,
GetNumberofNonEmptyBaseClasses, and GetTemplateArgumentType (to name a few).

It also has run clang-format on the files CompilerType.{h,cpp}.

These helper functions are needed as part of the implementation for the Data Inspection Language, (see
https://discourse.llvm.org/t/rfc-data-inspection-language/69893).


Full diff: https://github.com/llvm/llvm-project/pull/73472.diff

2 Files Affected:

  • (modified) lldb/include/lldb/Symbol/CompilerType.h (+51-5)
  • (modified) lldb/source/Symbol/CompilerType.cpp (+269-16)
diff --git a/lldb/include/lldb/Symbol/CompilerType.h b/lldb/include/lldb/Symbol/CompilerType.h
index 0a9533a1ac0efc1..a3331ad3269c01d 100644
--- a/lldb/include/lldb/Symbol/CompilerType.h
+++ b/lldb/include/lldb/Symbol/CompilerType.h
@@ -112,9 +112,7 @@ class CompilerType {
 
   /// Tests.
   /// \{
-  explicit operator bool() const {
-    return m_type_system.lock() && m_type;
-  }
+  explicit operator bool() const { return m_type_system.lock() && m_type; }
 
   bool IsValid() const { return (bool)*this; }
 
@@ -194,6 +192,54 @@ class CompilerType {
   bool IsTypedefType() const;
 
   bool IsVoidType() const;
+
+  bool IsSmartPtrType() const;
+
+  bool IsInteger() const;
+
+  bool IsFloat() const;
+
+  bool IsEnumerationType() const;
+
+  bool IsUnscopedEnumerationType() const;
+
+  bool IsIntegerOrUnscopedEnumerationType() const;
+
+  bool IsSigned() const;
+
+  bool IsNullPtrType() const;
+
+  bool IsBoolean() const;
+
+  bool IsEnumerationIntegerTypeSigned() const;
+
+  bool IsScalarOrUnscopedEnumerationType() const;
+
+  bool IsPromotableIntegerType() const;
+
+  bool IsPointerToVoid() const;
+
+  bool IsRecordType() const;
+
+  bool IsVirtualBase(CompilerType target_base, CompilerType *virtual_base,
+                     bool carry_virtual = false) const;
+
+  bool IsContextuallyConvertibleToBool() const;
+
+  bool IsBasicType() const;
+
+  std::string TypeDescription();
+
+  bool CompareTypes(CompilerType rhs) const;
+
+  const char *GetTypeTag();
+
+  uint32_t GetNumberOfNonEmptyBaseClasses();
+
+  CompilerType GetTemplateArgumentType(uint32_t idx);
+
+  CompilerType GetSmartPtrPointeeType();
+
   /// \}
 
   /// Type Completion.
@@ -436,8 +482,8 @@ class CompilerType {
                      ExecutionContextScope *exe_scope);
 
   /// Dump to stdout.
-  void DumpTypeDescription(lldb::DescriptionLevel level =
-                           lldb::eDescriptionLevelFull) const;
+  void DumpTypeDescription(
+      lldb::DescriptionLevel level = lldb::eDescriptionLevelFull) const;
 
   /// Print a description of the type to a stream. The exact implementation
   /// varies, but the expectation is that eDescriptionLevelFull returns a
diff --git a/lldb/source/Symbol/CompilerType.cpp b/lldb/source/Symbol/CompilerType.cpp
index 78cc8dad94a9c5f..854d6cab01b508e 100644
--- a/lldb/source/Symbol/CompilerType.cpp
+++ b/lldb/source/Symbol/CompilerType.cpp
@@ -54,7 +54,7 @@ bool CompilerType::IsArrayType(CompilerType *element_type_ptr, uint64_t *size,
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->IsArrayType(m_type, element_type_ptr, size,
-                                      is_incomplete);
+                                         is_incomplete);
 
   if (element_type_ptr)
     element_type_ptr->Clear();
@@ -157,7 +157,8 @@ bool CompilerType::IsBlockPointerType(
     CompilerType *function_pointer_type_ptr) const {
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
-      return type_system_sp->IsBlockPointerType(m_type, function_pointer_type_ptr);
+      return type_system_sp->IsBlockPointerType(m_type,
+                                                function_pointer_type_ptr);
   return false;
 }
 
@@ -249,7 +250,7 @@ bool CompilerType::IsPossibleDynamicType(CompilerType *dynamic_pointee_type,
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->IsPossibleDynamicType(m_type, dynamic_pointee_type,
-                                                check_cplusplus, check_objc);
+                                                   check_cplusplus, check_objc);
   return false;
 }
 
@@ -302,6 +303,256 @@ bool CompilerType::IsBeingDefined() const {
   return false;
 }
 
+bool CompilerType::IsSmartPtrType() const {
+  // These regular expressions cover shared, unique and weak pointers both from
+  // stdlibc++ and libc+++.
+
+  static llvm::Regex k_libcxx_std_unique_ptr_regex(
+      "^std::__[[:alnum:]]+::unique_ptr<.+>(( )?&)?$");
+  static llvm::Regex k_libcxx_std_shared_ptr_regex(
+      "^std::__[[:alnum:]]+::shared_ptr<.+>(( )?&)?$");
+  static llvm::Regex k_libcxx_std_weak_ptr_regex(
+      "^std::__[[:alnum:]]+::weak_ptr<.+>(( )?&)?$");
+  //
+  static llvm::Regex k_libcxx_std_unique_ptr_regex_2(
+      "^std::unique_ptr<.+>(( )?&)?$");
+  static llvm::Regex k_libcxx_std_shared_ptr_regex_2(
+      "^std::shared_ptr<.+>(( )?&)?$");
+  static llvm::Regex k_libcxx_std_weak_ptr_regex_2(
+      "^std::weak_ptr<.+>(( )?&)?$");
+  //
+  llvm::StringRef name = GetTypeName();
+  return k_libcxx_std_unique_ptr_regex.match(name) ||
+         k_libcxx_std_shared_ptr_regex.match(name) ||
+         k_libcxx_std_weak_ptr_regex.match(name) ||
+         k_libcxx_std_unique_ptr_regex_2.match(name) ||
+         k_libcxx_std_shared_ptr_regex_2.match(name) ||
+         k_libcxx_std_weak_ptr_regex_2.match(name);
+}
+
+bool CompilerType::IsInteger() const {
+  // This is used when you don't care about the signedness of the integer.
+  bool is_signed;
+  return IsIntegerType(is_signed);
+}
+
+bool CompilerType::IsFloat() const {
+  uint32_t count = 0;
+  bool is_complex = false;
+  return IsFloatingPointType(count, is_complex);
+}
+
+bool CompilerType::IsEnumerationType() const {
+  // This is used when you don't care about the signedness of the enum.
+  bool is_signed;
+  return IsEnumerationType(is_signed);
+}
+
+bool CompilerType::IsUnscopedEnumerationType() const {
+  return IsEnumerationType() && !IsScopedEnumerationType();
+}
+
+bool CompilerType::IsIntegerOrUnscopedEnumerationType() const {
+  return IsInteger() || IsUnscopedEnumerationType();
+}
+
+bool CompilerType::IsSigned() const {
+  if (IsEnumerationType()) {
+    return IsEnumerationIntegerTypeSigned();
+  }
+  return GetTypeInfo() & lldb::eTypeIsSigned;
+}
+
+bool CompilerType::IsNullPtrType() const {
+  return GetCanonicalType().GetBasicTypeEnumeration() ==
+         lldb::eBasicTypeNullPtr;
+}
+
+bool CompilerType::IsBoolean() const {
+  return GetCanonicalType().GetBasicTypeEnumeration() == lldb::eBasicTypeBool;
+}
+
+bool CompilerType::IsEnumerationIntegerTypeSigned() const {
+  if (IsValid()) {
+    return GetEnumerationIntegerType().GetTypeInfo() & lldb::eTypeIsSigned;
+  }
+  return false;
+}
+
+bool CompilerType::IsScalarOrUnscopedEnumerationType() const {
+  return IsScalarType() || IsUnscopedEnumerationType();
+}
+
+bool CompilerType::IsPromotableIntegerType() const {
+  // Unscoped enums are always considered as promotable, even if their
+  // underlying type does not need to be promoted (e.g. "int").
+  if (IsUnscopedEnumerationType()) {
+    return true;
+  }
+
+  switch (GetCanonicalType().GetBasicTypeEnumeration()) {
+  case lldb::eBasicTypeBool:
+  case lldb::eBasicTypeChar:
+  case lldb::eBasicTypeSignedChar:
+  case lldb::eBasicTypeUnsignedChar:
+  case lldb::eBasicTypeShort:
+  case lldb::eBasicTypeUnsignedShort:
+  case lldb::eBasicTypeWChar:
+  case lldb::eBasicTypeSignedWChar:
+  case lldb::eBasicTypeUnsignedWChar:
+  case lldb::eBasicTypeChar16:
+  case lldb::eBasicTypeChar32:
+    return true;
+
+  default:
+    return false;
+  }
+}
+
+bool CompilerType::IsPointerToVoid() const {
+  if (!IsValid())
+    return false;
+
+  return IsPointerType() &&
+         GetPointeeType().GetBasicTypeEnumeration() == lldb::eBasicTypeVoid;
+}
+
+bool CompilerType::IsRecordType() const {
+  if (!IsValid())
+    return false;
+
+  return GetCanonicalType().GetTypeClass() &
+         (lldb::eTypeClassClass | lldb::eTypeClassStruct |
+          lldb::eTypeClassUnion);
+}
+
+// Checks whether `target_base` is a virtual base of `type` (direct or
+// indirect). If it is, stores the first virtual base type on the path from
+// `type` to `target_type`.
+bool CompilerType::IsVirtualBase(CompilerType target_base,
+                                 CompilerType *virtual_base,
+                                 bool carry_virtual) const {
+  if (CompareTypes(target_base)) {
+    return carry_virtual;
+  }
+
+  if (!carry_virtual) {
+    uint32_t num_virtual_bases = GetNumVirtualBaseClasses();
+    for (uint32_t i = 0; i < num_virtual_bases; ++i) {
+      uint32_t bit_offset;
+      auto base = GetVirtualBaseClassAtIndex(i, &bit_offset);
+      if (base.IsVirtualBase(target_base, virtual_base,
+                             /*carry_virtual*/ true)) {
+        if (virtual_base) {
+          *virtual_base = base;
+        }
+        return true;
+      }
+    }
+  }
+
+  uint32_t num_direct_bases = GetNumDirectBaseClasses();
+  for (uint32_t i = 0; i < num_direct_bases; ++i) {
+    uint32_t bit_offset;
+    auto base = GetDirectBaseClassAtIndex(i, &bit_offset);
+    if (base.IsVirtualBase(target_base, virtual_base, carry_virtual)) {
+      return true;
+    }
+  }
+
+  return false;
+}
+
+bool CompilerType::IsContextuallyConvertibleToBool() const {
+  return IsScalarType() || IsUnscopedEnumerationType() || IsPointerType() ||
+         IsNullPtrType() || IsArrayType();
+}
+
+bool CompilerType::IsBasicType() const {
+  return GetCanonicalType().GetBasicTypeEnumeration() !=
+         lldb::eBasicTypeInvalid;
+}
+
+std::string CompilerType::TypeDescription() {
+  auto name = GetTypeName();
+  auto canonical_name = GetCanonicalType().GetTypeName();
+  if (name.IsEmpty() || canonical_name.IsEmpty()) {
+    return "''"; // should not happen
+  }
+  if (name == canonical_name) {
+    return llvm::formatv("'{0}'", name);
+  }
+  return llvm::formatv("'{0}' (aka '{1}')", name, canonical_name);
+}
+
+bool CompilerType::CompareTypes(CompilerType rhs) const {
+  if (*this == rhs)
+    return true;
+
+  const ConstString name = GetFullyUnqualifiedType().GetTypeName();
+  const ConstString rhs_name = rhs.GetFullyUnqualifiedType().GetTypeName();
+  return name == rhs_name;
+}
+
+const char *CompilerType::GetTypeTag() {
+  switch (GetTypeClass()) {
+    // clang-format off
+    case lldb::eTypeClassClass:       return "class";
+    case lldb::eTypeClassEnumeration: return "enum";
+    case lldb::eTypeClassStruct:      return "struct";
+    case lldb::eTypeClassUnion:       return "union";
+    // clang-format on
+  default:
+    return "unknown";
+  }
+}
+
+uint32_t CompilerType::GetNumberOfNonEmptyBaseClasses() {
+  // Go through the base classes and count non-empty ones.
+  uint32_t ret = 0;
+  uint32_t num_direct_bases = GetNumDirectBaseClasses();
+
+  for (uint32_t i = 0; i < num_direct_bases; ++i) {
+    uint32_t bit_offset;
+    CompilerType base_type = GetDirectBaseClassAtIndex(i, &bit_offset);
+    if (base_type.GetNumFields() > 0 ||
+        base_type.GetNumberOfNonEmptyBaseClasses() > 0) {
+      ret += 1;
+    }
+  }
+  return ret;
+}
+
+CompilerType CompilerType::GetTemplateArgumentType(uint32_t idx) {
+  CompilerType empty_type;
+  if (!IsValid())
+    return empty_type;
+
+  CompilerType type;
+  const bool expand_pack = true;
+  switch (GetTemplateArgumentKind(idx, true)) {
+  case lldb::eTemplateArgumentKindType:
+    type = GetTypeTemplateArgument(idx, expand_pack);
+    break;
+  case lldb::eTemplateArgumentKindIntegral:
+    type = GetIntegralTemplateArgument(idx, expand_pack)->type;
+    break;
+  default:
+    break;
+  }
+  if (type.IsValid())
+    return type;
+  return empty_type;
+}
+
+CompilerType CompilerType::GetSmartPtrPointeeType() {
+  assert(IsSmartPtrType() &&
+         "the type should be a smart pointer (std::unique_ptr, std::shared_ptr "
+         "or std::weak_ptr");
+
+  return GetTemplateArgumentType(0);
+}
+
 // Type Completion
 
 bool CompilerType::GetCompleteType() const {
@@ -336,9 +587,9 @@ ConstString CompilerType::GetDisplayTypeName() const {
 uint32_t CompilerType::GetTypeInfo(
     CompilerType *pointee_or_element_compiler_type) const {
   if (IsValid())
-  if (auto type_system_sp = GetTypeSystem())
-    return type_system_sp->GetTypeInfo(m_type,
-                                       pointee_or_element_compiler_type);
+    if (auto type_system_sp = GetTypeSystem())
+      return type_system_sp->GetTypeInfo(m_type,
+                                         pointee_or_element_compiler_type);
   return 0;
 }
 
@@ -362,8 +613,9 @@ void CompilerType::SetCompilerType(lldb::TypeSystemWP type_system,
   m_type = type;
 }
 
-void CompilerType::SetCompilerType(CompilerType::TypeSystemSPWrapper type_system,
-                                   lldb::opaque_compiler_type_t type) {
+void CompilerType::SetCompilerType(
+    CompilerType::TypeSystemSPWrapper type_system,
+    lldb::opaque_compiler_type_t type) {
   m_type_system = type_system.GetSharedPointer();
   m_type = type;
 }
@@ -589,7 +841,7 @@ uint32_t CompilerType::GetNumChildren(bool omit_empty_base_classes,
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->GetNumChildren(m_type, omit_empty_base_classes,
-                                       exe_ctx);
+                                            exe_ctx);
   return 0;
 }
 
@@ -601,8 +853,7 @@ lldb::BasicType CompilerType::GetBasicTypeEnumeration() const {
 }
 
 void CompilerType::ForEachEnumerator(
-    std::function<bool(const CompilerType &integer_type,
-                       ConstString name,
+    std::function<bool(const CompilerType &integer_type, ConstString name,
                        const llvm::APSInt &value)> const &callback) const {
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
@@ -623,7 +874,8 @@ CompilerType CompilerType::GetFieldAtIndex(size_t idx, std::string &name,
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->GetFieldAtIndex(m_type, idx, name, bit_offset_ptr,
-                                        bitfield_bit_size_ptr, is_bitfield_ptr);
+                                             bitfield_bit_size_ptr,
+                                             is_bitfield_ptr);
   return CompilerType();
 }
 
@@ -647,7 +899,7 @@ CompilerType::GetDirectBaseClassAtIndex(size_t idx,
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->GetDirectBaseClassAtIndex(m_type, idx,
-                                                    bit_offset_ptr);
+                                                       bit_offset_ptr);
   return CompilerType();
 }
 
@@ -657,7 +909,7 @@ CompilerType::GetVirtualBaseClassAtIndex(size_t idx,
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->GetVirtualBaseClassAtIndex(m_type, idx,
-                                                     bit_offset_ptr);
+                                                        bit_offset_ptr);
   return CompilerType();
 }
 
@@ -738,7 +990,7 @@ size_t CompilerType::GetIndexOfChildMemberWithName(
   if (IsValid() && !name.empty()) {
     if (auto type_system_sp = GetTypeSystem())
       return type_system_sp->GetIndexOfChildMemberWithName(
-        m_type, name, omit_empty_base_classes, child_indexes);
+          m_type, name, omit_empty_base_classes, child_indexes);
   }
   return 0;
 }
@@ -772,7 +1024,8 @@ std::optional<CompilerType::IntegralTemplateArgument>
 CompilerType::GetIntegralTemplateArgument(size_t idx, bool expand_pack) const {
   if (IsValid())
     if (auto type_system_sp = GetTypeSystem())
-      return type_system_sp->GetIntegralTemplateArgument(m_type, idx, expand_pack);
+      return type_system_sp->GetIntegralTemplateArgument(m_type, idx,
+                                                         expand_pack);
   return std::nullopt;
 }
 

"^std::__[[:alnum:]]+::shared_ptr<.+>(( )?&)?$");
static llvm::Regex k_libcxx_std_weak_ptr_regex(
"^std::__[[:alnum:]]+::weak_ptr<.+>(( )?&)?$");
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"^std::shared_ptr<.+>(( )?&)?$");
static llvm::Regex k_libcxx_std_weak_ptr_regex_2(
"^std::weak_ptr<.+>(( )?&)?$");
//
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

bool CompilerType::IsInteger() const {
// This is used when you don't care about the signedness of the integer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should this be a Doxygen comment (///) in the header. Also applies to comments in the functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 376 to 378
if (IsValid()) {
return GetEnumerationIntegerType().GetTypeInfo() & lldb::eTypeIsSigned;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No braces around single line if.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (for entire change).


default:
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This likely needs a llvm_unreachable("All cases handled above."); to appease gcc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 435 to 437
if (CompareTypes(target_base)) {
return carry_virtual;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No braces

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many other instances of this in the code below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (everywhere, I think).

auto name = GetTypeName();
auto canonical_name = GetCanonicalType().GetTypeName();
if (name.IsEmpty() || canonical_name.IsEmpty()) {
return "''"; // should not happen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not happen as in we can guarantee that (in which case this should be an assert) or as in this would only happen for broken input?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this will only happen for broken input; I'll update the comment to make that clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 499 to 504
// clang-format off
case lldb::eTypeClassClass: return "class";
case lldb::eTypeClassEnumeration: return "enum";
case lldb::eTypeClassStruct: return "struct";
case lldb::eTypeClassUnion: return "union";
// clang-format on
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really worth disabling clang format for? It seems like this would just end up on the line below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the formatting & removed the comments.

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems okay to me in general. Left some inline comments and I also agree with all of Jonas's inline comments I saw too.

// These regular expressions cover shared, unique and weak pointers both from
// stdlibc++ and libc+++.

static llvm::Regex k_libcxx_std_unique_ptr_regex(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the prefix k mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It indicates the that these are constants. I can remove it if you like?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, I don't mind it being there. Just hadn't seen it before. If they're constants though, then please mark them const as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -302,6 +303,256 @@ bool CompilerType::IsBeingDefined() const {
return false;
}

bool CompilerType::IsSmartPtrType() const {
// These regular expressions cover shared, unique and weak pointers both from
// stdlibc++ and libc+++.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it covers both, I would suggest changing the names of the variables to not have the word "libcxx" in them. Comments can get out of date or otherwise be misleading, this one feels confusing to read through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Renamed the libstdc++ variable names to contain 'libstdc++' instead of 'libc++'.


bool CompilerType::IsInteger() const {
// This is used when you don't care about the signedness of the integer.
bool is_signed;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initialize this to something even if it's unused. Reasoning: somebody may refactor this in the future in such a way that this becomes a problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 429 to 430
// Checks whether `target_base` is a virtual base of `type` (direct or
// indirect). If it is, stores the first virtual base type on the path from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it would be more useful as a doxygen comment in the header.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should also explain what virtual_base and carry_virtual are

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (mostly). I still need to check on 'carry_virtual' and add a comment about it.

if (name == canonical_name) {
return llvm::formatv("'{0}'", name);
}
return llvm::formatv("'{0}' (aka '{1}')", name, canonical_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Instead of "'foo' (aka 'bar')" maybe something like

"'foo' (canonically referred to as 'bar')"?

My reasoning is 'aka' does not specify where this information is "known" or who knows it. Maybe canonical is also the wrong word but I think it would be useful to tell folks more than just "this type is also known as this other type". Maybe it's just me though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

CompilerType CompilerType::GetSmartPtrPointeeType() {
assert(IsSmartPtrType() &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An assertion feels a little strong here, lldb crashing because somebody called the wrong function in the wrong place is a bit extreme to me.

In my opinion, the fact that GetSmartPtrPointeeType can fail if this isn't a smart pointer means that we should change the interface. Instead of returning a CompilerType, maybe std::optional or llvm::Expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Instead of asserting, if it's not a smart pointer I call GetPointeeType.

@@ -436,8 +482,8 @@ class CompilerType {
ExecutionContextScope *exe_scope);

/// Dump to stdout.
void DumpTypeDescription(lldb::DescriptionLevel level =
lldb::eDescriptionLevelFull) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the desire to fix formatting issues, but let's avoid doing them in a big patch like this. If things needs to be reverted, as this patch was, such a formatting change should not need to be reverted.

This can also cause unnecessary work for downstream forks in case of merge conflicts (the first patch, the revert, the second patch, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I agree with you. GIthub pull request creation did not give me any choice: It would not allow me to create a PR that did not pass clang-format. If you know of some way to bypass that requirement, PLEASE tell me!!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's ugly, but there is this:

// clang-format off
...
// clang-format on

I am sure that you knew that and were asking for a nicer workaround. I, too, would be interested in learning such a trick, if it existed.

Copy link
Collaborator

@DavidSpickett DavidSpickett Nov 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would not allow me to create a PR that did not pass clang-format.

Can you be more specific here, you could not open a PR at all without it being formatted? If that's the case it doesn't seem like what anyone would have intended.

I've definitely seen PRs that are opened then after that the bot will comment that it needs formatting.

Your other workaround here would be to do a PR to reformat these files, then rebase this onto that after landing it. Review has already started though, so maybe do that as the last thing after approvals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DavidSpickett: I created the PR, sort of, but... if you look at the bottom of this PR (for example) there's a small section that says "All Checks Passed". If you open that, you will see that one of the checks is a clang-format check. My original PR did not pass that check, so github would not "finish" creating the PR, i.e. it would not allow me to ask for reviews on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be wrong here, but AFAICT the bot should only check for formatting changes inside the diff of the commit, not inside the entire file. If this is not what the bot is doing, it is something we can probably improve upon.

In the meantime though, my suggestion is to manually inspect the bot output: if it is complaining about lines not touched by this commit, just ignore that. This is favorable to editing lines that are not related to the patch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this comment in another PR

NB: not applying all the clang-format recommendations as it affects lines I'm not editing, and I don't want to pollute git-blame

That makes me think the bot is not running clang-format-diff? https://clang.llvm.org/docs/ClangFormat.html#script-for-patch-reformatting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: For this PR, do you want me to un-do the clang-format changes that don't relate to my changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -54,7 +54,7 @@ bool CompilerType::IsArrayType(CompilerType *element_type_ptr, uint64_t *size,
if (IsValid())
if (auto type_system_sp = GetTypeSystem())
return type_system_sp->IsArrayType(m_type, element_type_ptr, size,
is_incomplete);
is_incomplete);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue with formatting as mentioned before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -157,7 +157,8 @@ bool CompilerType::IsBlockPointerType(
CompilerType *function_pointer_type_ptr) const {
if (IsValid())
if (auto type_system_sp = GetTypeSystem())
return type_system_sp->IsBlockPointerType(m_type, function_pointer_type_ptr);
return type_system_sp->IsBlockPointerType(m_type,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue with formatting as mentioned before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -249,7 +250,7 @@ bool CompilerType::IsPossibleDynamicType(CompilerType *dynamic_pointee_type,
if (IsValid())
if (auto type_system_sp = GetTypeSystem())
return type_system_sp->IsPossibleDynamicType(m_type, dynamic_pointee_type,
check_cplusplus, check_objc);
check_cplusplus, check_objc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same issue with formatting as mentioned before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

"^std::unique_ptr<.+>(( )?&)?$");
static llvm::Regex k_libcxx_std_shared_ptr_regex_2(
"^std::shared_ptr<.+>(( )?&)?$");
static llvm::Regex k_libcxx_std_weak_ptr_regex_2(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make these consts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


bool CompilerType::IsInteger() const {
// This is used when you don't care about the signedness of the integer.
bool is_signed;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1


default:
return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 435 to 437
if (CompareTypes(target_base)) {
return carry_virtual;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many other instances of this in the code below

Comment on lines 429 to 430
// Checks whether `target_base` is a virtual base of `type` (direct or
// indirect). If it is, stores the first virtual base type on the path from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Comment on lines 429 to 430
// Checks whether `target_base` is a virtual base of `type` (direct or
// indirect). If it is, stores the first virtual base type on the path from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should also explain what virtual_base and carry_virtual are

@@ -194,6 +192,54 @@ class CompilerType {
bool IsTypedefType() const;

bool IsVoidType() const;

bool IsSmartPtrType() const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some methods like this one could benefit from a Doxygen comment that explains what the semantics are.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

bool IsVirtualBase(CompilerType target_base, CompilerType *virtual_base,
bool carry_virtual = false) const;

bool IsContextuallyConvertibleToBool() const;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems extremely specific to the C language family. I don't think that's a problem, but it would be nice to at least document that this may be only defined in TypeSystemClang?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done? I think?


CompilerType GetTemplateArgumentType(uint32_t idx);

CompilerType GetSmartPtrPointeeType();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be some overlap between this and the C++ shared_ptr synthetic child provider. I think it would be good if either the formatter uses this functionality or if the intended user of this function uses the synthetic children instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on this a little bit more please. I'm still learning the LLDB type system, and I don't quite understand what you're asking me to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at the functions in LibcxxSharedPtrSyntheticFrontend (iPlugins/Language/CPlusPlus) -- I assume this is what you were referring to. None of the functions there do exactly what this function does, so I can't use one of them instead. The only place I saw there that might be able to use this function was in GetChildAtIndex... I could replace 'valobj_sp->GetCompilerType().GetTypeTemplateArgument(0).GetPointerType()' with 'valobj_sp->GetCompilerType().GetSmartPtrPointeeType()', but it looks like the current call would return something like "* FooType", whereas my call would return "FooType", so they're not quite the same, either.

So I don't see how to do what you're asking.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you are adding a function that takes a CompilerType holding an STL smart pointer and ask for the pointee type. We also have synthetic child formatters used by frame variable that allow evaluating *my_shared_ptr. From that I was assuming that there must be some overlap between what the dataformatter provides and what this function does.
What I don't know is whether synthetic children provided by a formatter are a good fit for what you are trying to do, but maybe the inverse is true, and the existing data formatter should be implemented in terms of this new GetSmartPtrPointeeType() API.
I just want to make sure we're not introducing two places in LLDB that have to know about STL smart pointers while covering slightly different functionality.

Copy link
Collaborator

@jimingham jimingham Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this function returns the type of the shared pointer, whereas the synthetic child providers return values of that type. So if anything you would implement the formatter on top of this type recognition (though that only gets you a little way there, you also have to find where the value is in the smart pointer.)

It's a little awkward that these API's speak of "smart pointer" when they really mean "the system library provided memory managing smart pointer". But I can't think of a good name to make this explicit.

However, that brings us round to the synthetic child support for "smart pointers". That is a feature that's not limited to the standard library's notion of a smart pointer. It really just tells lldb how to interpret the -> in:

(lldb) frame var foo->bar

The implementation of this feature doesn't require any knowledge of what the arrow operator for foo does (it is not even required to have one). For instance, you could use it to have lldb log uses of the -> operator for this type in the expression parser, or whatever. The only requirement is that you define a synthetic child provider that emulates the -> operator, returning a ValueObject from the appropriate API in the provider.

So at some point someone could redo the STL smart pointer comprehension in lldb to use the synthetic child arrow operator emulation, that's a much broader feature than what this API is promising.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mostly trying to say that the ValueObject support for "faking the arrow operator effects" is orthogonal to having the type system contain the notion of "smart pointers".

I also think the whole naming of this is confusing. What is a "smart pointer"? The IsSmartPtr states some intent about this:

/// This determines if the type is a shared, unique or weak pointer, either /// from stdlibc++ or libc+++. bool IsSmartPtrType() const;

But that is also a little odd. This is a general type system query but doesn't in fact have anything to do with the type system. Rather with a particular implementation in the C++ STL - which is orthogonal to the C++ type system... So it seems like an awfully C++ specific API to have in the Typesystem.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a comment saying "This is defined only for C++ types" in a function from CompilerType.h seems wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can remove these helper functions from the CompilerType class and put them into my parser class for DIL instead, if you think that would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarification: I meant "I can remove the helper functions related to smart pointers."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I've removed them now.

Address most of the reviewer comments:
  - Fix formatting issues
  - Add/Move Doxygen function comments in header file.
  - Undo clang-format changes to code that I didn't touch.
  - Other requested fixes.
Update comment for "IsVirtualBase" member function as requested.
@cmtice
Copy link
Contributor Author

cmtice commented Dec 5, 2023

I believe i have addressed all the review comments. Please let me know if there's anything else I need to do and/or (hopefully) approve this so I can merge it.

Thanks!

@jimingham
Copy link
Collaborator

jimingham commented Dec 6, 2023 via email

Remove smart pointer functions from the new helper functions for
CompilerType class.
@cmtice
Copy link
Contributor Author

cmtice commented Dec 7, 2023

I've removed the Smart pointer functions from this PR.

Are there any other changes that I need to make? If not, could someone approve this?

@cmtice
Copy link
Contributor Author

cmtice commented Dec 13, 2023

Ping! Could somebody please either approve this PR or tell me what other changes they would like to see? Pretty please?

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems okay to me... Quick question: Why do you need CompilerType::GetTemplateArgumentType? This seems pretty specific to C++. Is there a more general concept we could capture? Like something related to "Generics"?

@cmtice
Copy link
Contributor Author

cmtice commented Dec 13, 2023

@bulbazord Actually, I think you're right. I use that in my smart pointer type code. I will pull that out as well (I overlooked it before).

Remove GetTemplateArgumentType from list of new helper functions (it's
too C++-specific).
@cmtice
Copy link
Contributor Author

cmtice commented Dec 13, 2023

Ok, I have now removed CompilerType::GetTemplateArgumentType from the new helper functions.

Copy link
Member

@bulbazord bulbazord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems okay to me. I don't see anything in here that wouldn't make sense in at least some other languages (other than C++).

if (IsEnumerationType())
return IsEnumerationIntegerTypeSigned();

return GetTypeInfo() & lldb::eTypeIsSigned;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function looks like it's working around a bug in GetTypeInfo?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(i.e., could this be fixed by modifying TypeSystemClang::GetTypeInfo to return the sign bit on enums?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrian-prantl Actually I think it already works properly without special-casing enums (I went back and looked and could not figure out why I had done this). I've cleaned it up now and removed the special casing. Are you OK with me committing this now?

Clean up implementation of IsSigned function.
Copy link
Collaborator

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing all my feedback!

@cmtice cmtice merged commit e692d08 into llvm:main Dec 14, 2023
@cmtice cmtice deleted the compiler-type branch December 11, 2024 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants