Skip to content

[clang] Fix non-deterministic infinite recursion... #118288

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 10, 2024

Conversation

alejandro-alvarez-sonarsource
Copy link
Contributor

...in ASTContext::getAutoTypeInternal

Given

template < typename >
concept C1 = true;

template < typename , auto >
concept C2 = true;

template < C1 auto V, C2< V > auto>
struct S;

Both C1 auto V and C2<V> auto end on the set AutoType, the former being a template parameter for the latter.

Since the hashing is not deterministic (i.e., pointers are hashed), every now and then, both will end on the same bucket. Given that FoldingSet recomputes the FoldingSetID for each node in the target bucket on lookup, this triggers an infinite recursion:

  1. Look for X in AutoTypes
  2. Let's assume it would be in bucket N, so it iterates over nodes in that bucket. Let's assume the first is C2<V> auto.
  3. Computes the FoldingSetID for this one, which requires the profile of its template parameters, so they are visited.
  4. In some frames below, we end on the same FoldingSet, and, by chance, C1 auto V would be in bucket N too.
  5. But the first node in the bucket is C2<V> auto for which we need to profile C1 auto V
  6. ... stack overflow!

No step individually does anything wrong, but in general, FoldingSet seems not to be re-entrant, and this fact is hidden behind many nested calls.

With this change, we store the AutoTypes inside a DenseMap instead. The FoldingSetID is computed once only and then kept as the map's key, avoiding the need to do recursive lookups.

We also now make sure the key for the inserted AutoType is the same as the key used for lookup. Before, this was not the case, and it caused also non-deterministic parsing errors.

Fixes #110231

...in `ASTContext::getAutoTypeInternal`

Given

```cpp
template < typename >
concept C1 = true;

template < typename , auto >
concept C2 = true;

template < C1 auto V, C2< V > auto>
struct S;
```

Both `C1 auto V` and `C2<V> auto` end on the set `AutoType`, the former
being a template parameter for the latter.

Since the hashing is not deterministic (i.e., pointers are hashed),
every now and then, both will end on the same bucket.
Given that `FoldingSet` recomputes the `FoldingSetID` for each node in
the target bucket on lookup, this triggers an infinite recursion:

1. Look for `X` in `AutoTypes`
2. Let's assume it would be in bucket N, so it iterates over nodes in
that bucket. Let's assume the first is `C2<V> auto`.
3. Computes the `FoldingSetID` for this one, which requires the profile
of its template parameters, so they are visited.
4. In some frames below, we end on the same `FoldingSet`, and, by
chance, `C1 auto V` would be in bucket N too.
5. But the first node in the bucket is `C2<V> auto` for which we need to
profile `C1 auto V`
6. ... stack overflow!

No step individually does anything wrong, but in general, `FoldingSet`
seems not to be re-entrant, and this fact is hidden behind many nested
calls.

With this change, we store the `AutoType`s inside a `DenseMap` instead.
The `FoldingSetID` is computed once only and then kept as the map's key,
avoiding the need to do recursive lookups.

We also now make sure the key for the inserted `AutoType` is the same as
the key used for lookup. Before, this was not the case, and it caused
also non-deterministic parsing errors.

Fixes llvm#110231
Copy link

graphite-app bot commented Dec 2, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “FP Bundles” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Dec 2, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 2, 2024

@llvm/pr-subscribers-clang

Author: Alejandro Álvarez Ayllón (alejandro-alvarez-sonarsource)

Changes

...in ASTContext::getAutoTypeInternal

Given

template &lt; typename &gt;
concept C1 = true;

template &lt; typename , auto &gt;
concept C2 = true;

template &lt; C1 auto V, C2&lt; V &gt; auto&gt;
struct S;

Both C1 auto V and C2&lt;V&gt; auto end on the set AutoType, the former being a template parameter for the latter.

Since the hashing is not deterministic (i.e., pointers are hashed), every now and then, both will end on the same bucket. Given that FoldingSet recomputes the FoldingSetID for each node in the target bucket on lookup, this triggers an infinite recursion:

  1. Look for X in AutoTypes
  2. Let's assume it would be in bucket N, so it iterates over nodes in that bucket. Let's assume the first is C2&lt;V&gt; auto.
  3. Computes the FoldingSetID for this one, which requires the profile of its template parameters, so they are visited.
  4. In some frames below, we end on the same FoldingSet, and, by chance, C1 auto V would be in bucket N too.
  5. But the first node in the bucket is C2&lt;V&gt; auto for which we need to profile C1 auto V
  6. ... stack overflow!

No step individually does anything wrong, but in general, FoldingSet seems not to be re-entrant, and this fact is hidden behind many nested calls.

With this change, we store the AutoTypes inside a DenseMap instead. The FoldingSetID is computed once only and then kept as the map's key, avoiding the need to do recursive lookups.

We also now make sure the key for the inserted AutoType is the same as the key used for lookup. Before, this was not the case, and it caused also non-deterministic parsing errors.

Fixes #110231


Full diff: https://github.com/llvm/llvm-project/pull/118288.diff

4 Files Affected:

  • (modified) clang/include/clang/AST/ASTContext.h (+5-1)
  • (modified) clang/include/clang/AST/Type.h (+1-1)
  • (modified) clang/lib/AST/ASTContext.cpp (+33-10)
  • (added) clang/test/Parser/gh110231.cpp (+12)
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index 89fcb6789d880a..1e89a6805ce9c6 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -245,7 +245,11 @@ class ASTContext : public RefCountedBase<ASTContext> {
   mutable llvm::FoldingSet<ObjCObjectPointerType> ObjCObjectPointerTypes;
   mutable llvm::FoldingSet<DependentUnaryTransformType>
     DependentUnaryTransformTypes;
-  mutable llvm::ContextualFoldingSet<AutoType, ASTContext&> AutoTypes;
+  // An AutoType can have a dependency on another AutoType via its template
+  // arguments. Since both dependent and dependency are on the same set,
+  // we can end up in an infinite recursion when looking for a node if we used
+  // a `FoldingSet`, since both could end up in the same bucket.
+  mutable llvm::DenseMap<llvm::FoldingSetNodeID, AutoType *> AutoTypes;
   mutable llvm::FoldingSet<DeducedTemplateSpecializationType>
     DeducedTemplateSpecializationTypes;
   mutable llvm::FoldingSet<AtomicType> AtomicTypes;
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index 90a52b1dcbf624..7a10aedbcb6437 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -6551,7 +6551,7 @@ class DeducedType : public Type {
 
 /// Represents a C++11 auto or C++14 decltype(auto) type, possibly constrained
 /// by a type-constraint.
-class AutoType : public DeducedType, public llvm::FoldingSetNode {
+class AutoType : public DeducedType {
   friend class ASTContext; // ASTContext creates these
 
   ConceptDecl *TypeConstraintConcept;
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 80e8c5b9df58e7..f2cb9d3c57b5e9 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -112,6 +112,27 @@ enum FloatingRank {
   Ibm128Rank
 };
 
+template <> struct llvm::DenseMapInfo<llvm::FoldingSetNodeID> {
+  static FoldingSetNodeID getEmptyKey() { return FoldingSetNodeID{}; }
+
+  static FoldingSetNodeID getTombstoneKey() {
+    FoldingSetNodeID id;
+    for (size_t i = 0; i < sizeof(id) / sizeof(unsigned); ++i) {
+      id.AddInteger(std::numeric_limits<unsigned>::max());
+    }
+    return id;
+  }
+
+  static unsigned getHashValue(const FoldingSetNodeID &Val) {
+    return Val.ComputeHash();
+  }
+
+  static bool isEqual(const FoldingSetNodeID &LHS,
+                      const FoldingSetNodeID &RHS) {
+    return LHS == RHS;
+  }
+};
+
 /// \returns The locations that are relevant when searching for Doc comments
 /// related to \p D.
 static SmallVector<SourceLocation, 2>
@@ -899,7 +920,7 @@ ASTContext::ASTContext(LangOptions &LOpts, SourceManager &SM,
       FunctionProtoTypes(this_(), FunctionProtoTypesLog2InitSize),
       DependentTypeOfExprTypes(this_()), DependentDecltypeTypes(this_()),
       TemplateSpecializationTypes(this_()),
-      DependentTemplateSpecializationTypes(this_()), AutoTypes(this_()),
+      DependentTemplateSpecializationTypes(this_()),
       DependentBitIntTypes(this_()), SubstTemplateTemplateParmPacks(this_()),
       DeducedTemplates(this_()), ArrayParameterTypes(this_()),
       CanonTemplateTemplateParms(this_()), SourceMgr(SM), LangOpts(LOpts),
@@ -6294,12 +6315,13 @@ QualType ASTContext::getAutoTypeInternal(
     return getAutoDeductType();
 
   // Look in the folding set for an existing type.
-  void *InsertPos = nullptr;
   llvm::FoldingSetNodeID ID;
-  AutoType::Profile(ID, *this, DeducedType, Keyword, IsDependent,
+  bool IsDeducedDependent =
+      !DeducedType.isNull() && DeducedType->isDependentType();
+  AutoType::Profile(ID, *this, DeducedType, Keyword, IsDependent || IsDeducedDependent,
                     TypeConstraintConcept, TypeConstraintArgs);
-  if (AutoType *AT = AutoTypes.FindNodeOrInsertPos(ID, InsertPos))
-    return QualType(AT, 0);
+  if (auto const AT_iter = AutoTypes.find(ID); AT_iter != AutoTypes.end())
+    return QualType(AT_iter->getSecond(), 0);
 
   QualType Canon;
   if (!IsCanon) {
@@ -6314,10 +6336,6 @@ QualType ASTContext::getAutoTypeInternal(
         Canon =
             getAutoTypeInternal(QualType(), Keyword, IsDependent, IsPack,
                                 CanonicalConcept, CanonicalConceptArgs, true);
-        // Find the insert position again.
-        [[maybe_unused]] auto *Nothing =
-            AutoTypes.FindNodeOrInsertPos(ID, InsertPos);
-        assert(!Nothing && "canonical type broken");
       }
     }
   }
@@ -6331,8 +6349,13 @@ QualType ASTContext::getAutoTypeInternal(
                    : TypeDependence::None) |
           (IsPack ? TypeDependence::UnexpandedPack : TypeDependence::None),
       Canon, TypeConstraintConcept, TypeConstraintArgs);
+#ifndef NDEBUG
+  llvm::FoldingSetNodeID InsertedID;
+  AT->Profile(InsertedID, *this);
+  assert(InsertedID == ID && "ID does not match");
+#endif
   Types.push_back(AT);
-  AutoTypes.InsertNode(AT, InsertPos);
+  AutoTypes.try_emplace(ID, AT);
   return QualType(AT, 0);
 }
 
diff --git a/clang/test/Parser/gh110231.cpp b/clang/test/Parser/gh110231.cpp
new file mode 100644
index 00000000000000..b1405517505ff7
--- /dev/null
+++ b/clang/test/Parser/gh110231.cpp
@@ -0,0 +1,12 @@
+// RUN: seq 100 | xargs -Ifoo %clang_cc1 -std=c++20 -fsyntax-only -verify %s
+// expected-no-diagnostics
+// This is a regression test for a non-deterministic stack-overflow.
+
+template < typename >
+concept C1 = true;
+
+template < typename , auto >
+concept C2 = true;
+
+template < C1 auto V, C2< V > auto>
+struct S;

Copy link

github-actions bot commented Dec 2, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a reasonable solution for this, though I'd very much like to see the compile-time perf results for this before committing.

@nikic , is this something we can do quickly?

@nikic
Copy link
Contributor

nikic commented Dec 9, 2024

@erichkeane
Copy link
Collaborator

@alejandro-alvarez-sonarsource
Copy link
Contributor Author

Thanks! Can you do the merge? I do not have the rights.

Also, would it make sense to backport to 19?

@cor3ntin cor3ntin merged commit c2d7e96 into llvm:main Dec 10, 2024
8 checks passed
@alejandro-alvarez-sonarsource alejandro-alvarez-sonarsource deleted the aa/fix_gh110231 branch December 12, 2024 12:59
@lbonn
Copy link

lbonn commented Dec 16, 2024

Given that the original report is a regression that prevents using the current version of mp-units, I would advocate for a backport.

As I understand, the procedure is to send the cherry-pick command that will be picked up by llvmbot so that the resulting MR can be considered for triage. However, I am hesitant to trigger it myself as I am neither an llvm contributor nor the author.

@alejandro-alvarez-sonarsource could you maybe trigger the bot?

@nikic
Copy link
Contributor

nikic commented Dec 16, 2024

This is an ABI-breaking fix, so it cannot be backported in this form.

@hubert-reinterpretcast
Copy link
Collaborator

This is an ABI-breaking fix, so it cannot be backported in this form.

@nikic, can you explain this statement? Is this something that can be documented in the release notes: https://github.com/llvm/llvm-project/commits/main/clang/docs/ReleaseNotes.rst?

@nikic
Copy link
Contributor

nikic commented Dec 20, 2024

@hubert-reinterpretcast This patch modifies structure layout in exported headers, changing the libclang-cpp ABI. LLVM patch releases cannot break API or ABI compatibility, see the last point in https://llvm.org/docs/HowToReleaseLLVM.html#release-patch-rules.

@hubert-reinterpretcast
Copy link
Collaborator

LLVM patch releases cannot break API or ABI compatibility

Thanks @nikic for the explanation. I thought you meant ABI compatibility in the "user program" sense.

hanickadot added a commit to hanickadot/llvm-project that referenced this pull request May 4, 2025
[clang] Fix non-deterministic infinite recursion... (llvm#118288)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[clang++-19][regression] "error: type constraint differs in template redeclaration"
7 participants