Skip to content

lldb::SBType::GetFieldAtIndex returning static field members. #55040

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
siggi-alpheus opened this issue Apr 22, 2022 · 16 comments
Closed

lldb::SBType::GetFieldAtIndex returning static field members. #55040

siggi-alpheus opened this issue Apr 22, 2022 · 16 comments
Labels

Comments

@siggi-alpheus
Copy link
Contributor

siggi-alpheus commented Apr 22, 2022

I had a hiccup trying to figure the virtual pointer layout of this class from google/glog. The heuristic used in the types.py example looks for an initial field that starts at a pointer-size offset into the class to infer that the class has a virtual table pointer. When the first field of a class is static, it appears this heuristic will fail, as the static member returns a zero offset.

I haven't found any good way to identify or filter out static members, it appears the data isn't there at all.
Maybe SBTypeMember needs an IsStatic() function, or perhaps GetOffsetInBytes() could return a sentinel value to indicate that the member does not have this pointer offset?

Here's what the "offending" code looks like:

class LogFileObject : public base::Logger {
 public:
  LogFileObject(LogSeverity severity, const char* base_filename);
  ~LogFileObject();
...
 private:
  static const uint32 kRolloverAttemptFrequency = 0x20;

  Mutex lock_;
...
};
@llvmbot
Copy link
Member

llvmbot commented Apr 22, 2022

@llvm/issue-subscribers-lldb

@jimingham
Copy link
Collaborator

jimingham commented Apr 22, 2022 via email

@siggi-alpheus
Copy link
Contributor Author

There's something deeper going on here and I'm having trouble creating a minimized repro.
I'm using the C++ API, retrieving fields with SBType::GetFieldAtIndex(), and I'm definitely getting the static member that way. I can repro the same behaviour using the python API:

import lldb

d = lldb.SBDebugger.Create()
t = d.CreateTarget("bazel-bin/symbols/assay_types")

l = t.FindFirstType('google::(anonymous namespace)::LogFileObject')
for f in l.fields:
  print(str(f))

outputs:

+0: (typedef gflags::uint32) kRolloverAttemptFrequency
+8: (class Mutex {
...
}) lock_
+56: (_Bool) base_filename_selected_
...
+200: (typedef google::glog_internal_namespace_::WallTime) start_time_

The DWARF has this:

0x000fac6d:         DW_TAG_member
                      DW_AT_name	("kRolloverAttemptFrequency")
                      DW_AT_decl_file	("/proc/self/cwd/external/com_github_google_glog/src/logging.cc")
                      DW_AT_decl_line	(455)
                      DW_AT_decl_column	(0x17)
                      DW_AT_type	(0x000f900b "const uint32")
                      DW_AT_declaration	(true)
                      DW_AT_const_value	(0x20)

0x000fac7c:         DW_TAG_member
                      DW_AT_name	("lock_")
                      DW_AT_decl_file	("/proc/self/cwd/external/com_github_google_glog/src/logging.cc")
                      DW_AT_decl_line	(457)
                      DW_AT_decl_column	(0x09)
                      DW_AT_type	(0x000f712f "Mutex")
                      DW_AT_data_member_location	(0x08)

I guess somehow filtering the static members is derailing for this particular type. I'll see if I can repro this in a local build in a debugger - or maybe this has been fixed already. I'm running V12:

$ lldb --version
lldb version 12.0.0

@jimingham
Copy link
Collaborator

jimingham commented Apr 22, 2022 via email

@jimingham
Copy link
Collaborator

jimingham commented Apr 22, 2022 via email

@siggi-alpheus
Copy link
Contributor Author

Interesting, I see my attempted repro has the DW_AT_external tag:

0x000023ed:     DW_TAG_member
                  DW_AT_name	("kZero")
                  DW_AT_decl_file	("/workspaces/thresher/repro.cc")
                  DW_AT_decl_line	(6)
                  DW_AT_decl_column	(0x19)
                  DW_AT_type	(0x000001c3 "const uint32_t")
                  DW_AT_external	(true)
                  DW_AT_accessibility	(DW_ACCESS_public)
                  DW_AT_declaration	(true)
                  DW_AT_const_value	(0x20)

Out of curiosity - is there a good reason why not to filter on the presence of the DW_AT_data_member_location tag?

@siggi-alpheus
Copy link
Contributor Author

Sorry for being a little confusing.

Not at all - thanks for taking a look and explaining things to me.

@jimingham
Copy link
Collaborator

jimingham commented Apr 22, 2022 via email

@labath
Copy link
Collaborator

labath commented Apr 25, 2022

Out of curiosity - is there a good reason why not to filter on the presence of the DW_AT_data_member_location tag?
That's a good question, but for one of our clang DWARF experts, of which I am not one...

Adding some DWARF experts (@dwblaikie).

I think there is some confusion in the DWARF spec about the meaning of DW_AT_external in this context. The spec first says this:

A DW_AT_external attribute, which is a flag, if the name of a variable is visible outside
of its enclosing compilation unit.

which makes the attribute sound completely unrelated to static-ness. However, then it goes on (in an non-normative note) with:

The definitions of C++ static data members of structures or classes are represented by
variable entries flagged as external. Both file static and local variables in C and C++
are represented by non-external variable entries.

At first glance that may make sense, since the static keyword (the "usual" way of making something non-external) has a slightly different meaning in a class, so one cannot make a static member non-external that way. However, one can also make something non-external by putting it inside an anonymous namespace, and that method can be applied to static members as well.

So, I would expect that a static member inside an anonymous namespace does not have the external attribute. However, clang does not do that -- it still emits the attribute. OTOH, gcc does not emit the attribute, so I would be inclined to consider the clang behavior a bug, and try to determine static-ness based on the presence of the DW_AT_data_member_location (and DW_AT_data_bit_offset) attribute.

@siggi-alpheus
Copy link
Contributor Author

siggi-alpheus commented Apr 25, 2022

So, I would expect that a static member inside an anonymous namespace does not have the external attribute.

Ah, interesting. When I was trying to come up with a repro, I missed the fact that the anonymous namespace might be significant.

This repros the problem:

// gcc -g -fPIC repro.cc -lstdc++
namespace {

class C {
 public:
  static const int kZero = 0;

  virtual ~C() = default;

  int member_;
};

}  // namespace

int main() {
  C c;
  return c.member_ + C::kZero;
}

assay.py:

import lldb
d = lldb.SBDebugger.Create()
t = d.CreateTarget("a.out")
l = t.FindFirstType('(anonymous namespace)::C')
for f in l.fields:
  print(str(f))
% python3 assay.py
+0: (const int) kZero
+8: (int) member_

Here's the DWARF for the class:

0x00000032:     DW_TAG_class_type
                  DW_AT_name	("C")
                  DW_AT_byte_size	(0x10)
                  DW_AT_decl_file	("/workspaces/thresher/repro.cc")
                  DW_AT_decl_line	(4)
                  DW_AT_decl_column	(0x07)
                  DW_AT_containing_type	(0x00000032)
                  DW_AT_sibling	(0x000000b3)

0x00000041:       DW_TAG_subprogram
                    DW_AT_name	("C")
                    DW_AT_artificial	(true)
                    DW_AT_accessibility	(DW_ACCESS_public)
                    DW_AT_declaration	(true)
                    DW_AT_object_pointer	(0x0000004d)
                    DW_AT_sibling	(0x00000058)

0x0000004d:         DW_TAG_formal_parameter
                      DW_AT_type	(0x000000c1 "C*")
                      DW_AT_artificial	(true)

0x00000052:         DW_TAG_formal_parameter
                      DW_AT_type	(0x000000cc "const C&")

0x00000057:         NULL

0x00000058:       DW_TAG_subprogram
                    DW_AT_name	("C")
                    DW_AT_artificial	(true)
                    DW_AT_accessibility	(DW_ACCESS_public)
                    DW_AT_declaration	(true)
                    DW_AT_object_pointer	(0x00000064)
                    DW_AT_sibling	(0x0000006a)

0x00000064:         DW_TAG_formal_parameter
                      DW_AT_type	(0x000000c1 "C*")
                      DW_AT_artificial	(true)

0x00000069:         NULL

0x0000006a:       DW_TAG_member
                    DW_AT_name	("_vptr.C")
                    DW_AT_type	(0x000000e9 "__vtbl_ptr_type*")
                    DW_AT_data_member_location	(0x00)
                    DW_AT_artificial	(true)
                    DW_AT_accessibility	(DW_ACCESS_public)

0x00000075:       DW_TAG_member
                    DW_AT_name	("kZero")
                    DW_AT_decl_file	("/workspaces/thresher/repro.cc")
                    DW_AT_decl_line	(6)
                    DW_AT_decl_column	(0x14)
                    DW_AT_type	(0x000000e4 "const int")
                    DW_AT_accessibility	(DW_ACCESS_public)
                    DW_AT_declaration	(true)
                    DW_AT_const_value	(0x00)

0x00000083:       DW_TAG_subprogram
                    DW_AT_name	("~C")
                    DW_AT_decl_file	("/workspaces/thresher/repro.cc")
                    DW_AT_decl_line	(8)
                    DW_AT_decl_column	(0x0b)
                    DW_AT_virtuality	(DW_VIRTUALITY_virtual)
                    DW_AT_containing_type	(0x00000032)
                    DW_AT_accessibility	(DW_ACCESS_public)
                    DW_AT_declaration	(true)
                    DW_AT_defaulted	(DW_DEFAULTED_in_class)
                    DW_AT_object_pointer	(0x00000099)
                    DW_AT_sibling	(0x000000a4)

0x00000099:         DW_TAG_formal_parameter
                      DW_AT_type	(0x000000c1 "C*")
                      DW_AT_artificial	(true)

0x0000009e:         DW_TAG_formal_parameter
                      DW_AT_type	(0x000000dd "int")
                      DW_AT_artificial	(true)

0x000000a3:         NULL

0x000000a4:       DW_TAG_member
                    DW_AT_name	("member_")
                    DW_AT_decl_file	("/workspaces/thresher/repro.cc")
                    DW_AT_decl_line	(10)
                    DW_AT_decl_column	(0x07)
                    DW_AT_type	(0x000000dd "int")
                    DW_AT_data_member_location	(0x08)
                    DW_AT_accessibility	(DW_ACCESS_public)

0x000000b2:       NULL

@siggi-alpheus siggi-alpheus changed the title lldb::SBTypeMember has no way to distinguish static fields lldb::SBType::GetFieldAtIndex returning static field members. Apr 25, 2022
@siggi-alpheus
Copy link
Contributor Author

I've uploaded a straw man patch that does appear to filter this out. Whether it's a correct fix, I can't say :).

@dwblaikie
Copy link
Collaborator

re: @labath's reading. Yeah, that all sounds right - maybe some history about how the non-normative DWARF spec text could help, but I guess it was just loose language.

The GCC behavior probably comes from a simple implementation that inspects the linkage of the variable when deciding whether to put DW_AT_external on it, without consideration for member, etc.

Regardless of what the spec says, supporting what GCC does is probably a good idea - and yeah, it seems better to determine non-static members by the presence/absence of of DW_AT_data_member_location rather than DW_AT_external. (I guess in theory a data member might not have a location that can be expressed? (like a local or global variable can have no location) but in practice that seems to not be an issue?)

(as for the original question: Trying to determine if a class is dynamic/has virtual functions based on the offset of the first non-virtual member is probably not the best way to go - instead looking for the virtuality property on the class would be a good idea)

@siggi-alpheus
Copy link
Contributor Author

(as for the original question: Trying to determine if a class is dynamic/has virtual functions based on the offset of the first non-virtual member is probably not the best way to go - instead looking for the virtuality property on the class would be a good idea)

The example script tests the IsPolymorphicClass() property too. Maybe that's a sufficient check to know that the class has a virtual table pointer at this pointer offset 0?

@dwblaikie
Copy link
Collaborator

(as for the original question: Trying to determine if a class is dynamic/has virtual functions based on the offset of the first non-virtual member is probably not the best way to go - instead looking for the virtuality property on the class would be a good idea)

The example script tests the IsPolymorphicClass() property too. Maybe that's a sufficient check to know that the class has a virtual table pointer at this pointer offset 0?

Not sure that guarantees the offset/location of the vtable - I guess that might depend on the inheritance hierarchy, but I'm not sure. I think computing the offset as a side effect of inspecting fields is probably not ideal.

In the DWARF itself there's a DW_TAG_member that's marked DW_AT_artificial and named "_vptr$" in the base of that virtual hierarchy (might not be the ultimate base - since the polymorphic class may derive from some non-polymorphic type). Maybe that field/properties are accessible through the LLDB API.

@siggi-alpheus
Copy link
Contributor Author

In the DWARF itself there's a DW_TAG_member that's marked DW_AT_artificial and named "_vptr$" in the base of that virtual hierarchy (might not be the ultimate base - since the polymorphic class may derive from some non-polymorphic type). Maybe that field/properties are accessible through the LLDB API.

That'd be ideal, though it looks as though LLDB/Clang reconstructs the vtable layout of a class, rather than reading it from the symbols. I'll see if I can coax that info out of the API.

labath pushed a commit that referenced this issue May 9, 2022
See [[ #55040 | issue 55040 ]] where static members of classes declared in the anonymous namespace are incorrectly returned as member fields from lldb::SBType::GetFieldAtIndex(). It appears that attrs.member_byte_offset contains a sentinel value for members that don't have a DW_AT_data_member_location.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D124409
@siggi-alpheus
Copy link
Contributor Author

Not sure that guarantees the offset/location of the vtable - I guess that might depend on the inheritance hierarchy, but I'm not sure. I think computing the offset as a side effect of inspecting fields is probably not ideal.

@dwblaikie it looks like any class that IsPolymorphic() will have a vtable at offset zero. However, I find that Chrome uses [[no_unique_address]], leading to member variables that overlap the virtual table(!!!) in the symbols I'm working with.
I wonder whether/how LLDB copes with this...

mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
See [[ llvm/llvm-project#55040 | issue 55040 ]] where static members of classes declared in the anonymous namespace are incorrectly returned as member fields from lldb::SBType::GetFieldAtIndex(). It appears that attrs.member_byte_offset contains a sentinel value for members that don't have a DW_AT_data_member_location.

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D124409
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants