[pseudo] remove most of clang-pseudo #80081

Closed

sam-mccall
Collaborator

This was never completed; in particular, we still wanted:

  • disambiguation of all grammatical ambiguity, e.g. by cross-referencing
    reused identifiers
  • heuristic symbol resolution
  • conversion to syntax trees

The parts still used by clangd remain and will be dealt with later.

See https://discourse.llvm.org/t/removing-pseudo-parser/71131/5
Original design doc: https://docs.google.com/document/d/1eGkTOsFja63wsv8v0vd5JdoTonj-NlN3ujGF0T7xDbM/edit

@llvmbot
Member

llvmbot commented Jan 30, 2024

@llvm/pr-subscribers-clangd

@llvm/pr-subscribers-clang-tools-extra

Author: Sam McCall (sam-mccall)

Changes

This was never completed; in particular, we still wanted:

  • disambiguation of all grammatical ambiguity, e.g. by cross-referencing
    reused identifiers
  • heuristic symbol resolution
  • conversion to syntax trees

The parts still used by clangd remain and will be dealt with later.

See https://discourse.llvm.org/t/removing-pseudo-parser/71131/5
Original design doc: https://docs.google.com/document/d/1eGkTOsFja63wsv8v0vd5JdoTonj-NlN3ujGF0T7xDbM/edit


Patch is 329.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/80081.diff

79 Files Affected:

  • (modified) clang-tools-extra/clangd/ClangdServer.h (+1)
  • (modified) clang-tools-extra/pseudo/CMakeLists.txt (-6)
  • (modified) clang-tools-extra/pseudo/README.md (+7)
  • (removed) clang-tools-extra/pseudo/benchmarks/Benchmark.cpp (-156)
  • (removed) clang-tools-extra/pseudo/benchmarks/CMakeLists.txt (-9)
  • (removed) clang-tools-extra/pseudo/fuzzer/CMakeLists.txt (-16)
  • (removed) clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp (-82)
  • (removed) clang-tools-extra/pseudo/fuzzer/Main.cpp (-16)
  • (removed) clang-tools-extra/pseudo/gen/CMakeLists.txt (-11)
  • (removed) clang-tools-extra/pseudo/gen/Main.cpp (-172)
  • (removed) clang-tools-extra/pseudo/include/CMakeLists.txt (-31)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/Disambiguate.h (-64)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/Forest.h (-236)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/GLR.h (-170)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/Language.h (-64)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/cli/CLI.h (-35)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/cxx/CXX.h (-91)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/grammar/Grammar.h (-230)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRGraph.h (-196)
  • (removed) clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h (-278)
  • (modified) clang-tools-extra/pseudo/lib/CMakeLists.txt (-8)
  • (removed) clang-tools-extra/pseudo/lib/Disambiguate.cpp (-48)
  • (removed) clang-tools-extra/pseudo/lib/Forest.cpp (-199)
  • (removed) clang-tools-extra/pseudo/lib/GLR.cpp (-772)
  • (removed) clang-tools-extra/pseudo/lib/cli/CLI.cpp (-54)
  • (removed) clang-tools-extra/pseudo/lib/cli/CMakeLists.txt (-15)
  • (removed) clang-tools-extra/pseudo/lib/cxx/CMakeLists.txt (-15)
  • (removed) clang-tools-extra/pseudo/lib/cxx/CXX.cpp (-452)
  • (removed) clang-tools-extra/pseudo/lib/cxx/cxx.bnf (-776)
  • (removed) clang-tools-extra/pseudo/lib/grammar/CMakeLists.txt (-10)
  • (removed) clang-tools-extra/pseudo/lib/grammar/Grammar.cpp (-190)
  • (removed) clang-tools-extra/pseudo/lib/grammar/GrammarBNF.cpp (-362)
  • (removed) clang-tools-extra/pseudo/lib/grammar/LRGraph.cpp (-265)
  • (removed) clang-tools-extra/pseudo/lib/grammar/LRTable.cpp (-79)
  • (removed) clang-tools-extra/pseudo/lib/grammar/LRTableBuild.cpp (-121)
  • (modified) clang-tools-extra/pseudo/test/CMakeLists.txt (-2)
  • (removed) clang-tools-extra/pseudo/test/check-cxx-bnf.test (-2)
  • (removed) clang-tools-extra/pseudo/test/crash/backslashes.c (-4)
  • (removed) clang-tools-extra/pseudo/test/cxx/capture-list.cpp (-23)
  • (removed) clang-tools-extra/pseudo/test/cxx/contextual-keywords.cpp (-9)
  • (removed) clang-tools-extra/pseudo/test/cxx/dangling-else.cpp (-22)
  • (removed) clang-tools-extra/pseudo/test/cxx/decl-specfier-seq.cpp (-27)
  • (removed) clang-tools-extra/pseudo/test/cxx/declarator-function.cpp (-9)
  • (removed) clang-tools-extra/pseudo/test/cxx/declarator-var.cpp (-9)
  • (removed) clang-tools-extra/pseudo/test/cxx/declator-member-function.cpp (-9)
  • (removed) clang-tools-extra/pseudo/test/cxx/empty-member-declaration.cpp (-7)
  • (removed) clang-tools-extra/pseudo/test/cxx/empty-member-spec.cpp (-13)
  • (removed) clang-tools-extra/pseudo/test/cxx/keyword.cpp (-12)
  • (removed) clang-tools-extra/pseudo/test/cxx/literals.cpp (-43)
  • (removed) clang-tools-extra/pseudo/test/cxx/mixed-designator.cpp (-27)
  • (removed) clang-tools-extra/pseudo/test/cxx/nested-name-specifier.cpp (-28)
  • (removed) clang-tools-extra/pseudo/test/cxx/parameter-decl-clause.cpp (-14)
  • (removed) clang-tools-extra/pseudo/test/cxx/predefined-identifier.cpp (-5)
  • (removed) clang-tools-extra/pseudo/test/cxx/recovery-func-parameters.cpp (-13)
  • (removed) clang-tools-extra/pseudo/test/cxx/recovery-init-list.cpp (-13)
  • (removed) clang-tools-extra/pseudo/test/cxx/structured-binding.cpp (-6)
  • (removed) clang-tools-extra/pseudo/test/cxx/template-empty-type-parameter.cpp (-3)
  • (removed) clang-tools-extra/pseudo/test/cxx/unsized-array.cpp (-7)
  • (removed) clang-tools-extra/pseudo/test/fuzzer.cpp (-4)
  • (removed) clang-tools-extra/pseudo/test/glr-variant-start.cpp (-9)
  • (removed) clang-tools-extra/pseudo/test/glr.cpp (-30)
  • (removed) clang-tools-extra/pseudo/test/html-forest.c (-8)
  • (removed) clang-tools-extra/pseudo/test/lex.c (-42)
  • (removed) clang-tools-extra/pseudo/test/lr-build-basic.test (-32)
  • (removed) clang-tools-extra/pseudo/test/lr-build-conflicts.test (-49)
  • (removed) clang-tools-extra/pseudo/test/strip-directives.c (-49)
  • (removed) clang-tools-extra/pseudo/tool/CMakeLists.txt (-29)
  • (removed) clang-tools-extra/pseudo/tool/ClangPseudo.cpp (-243)
  • (removed) clang-tools-extra/pseudo/tool/HTMLForest.cpp (-192)
  • (removed) clang-tools-extra/pseudo/tool/HTMLForest.css (-93)
  • (removed) clang-tools-extra/pseudo/tool/HTMLForest.html (-15)
  • (removed) clang-tools-extra/pseudo/tool/HTMLForest.js (-290)
  • (modified) clang-tools-extra/pseudo/unittests/CMakeLists.txt (-8)
  • (removed) clang-tools-extra/pseudo/unittests/CXXTest.cpp (-30)
  • (removed) clang-tools-extra/pseudo/unittests/DisambiguateTest.cpp (-111)
  • (removed) clang-tools-extra/pseudo/unittests/ForestTest.cpp (-180)
  • (removed) clang-tools-extra/pseudo/unittests/GLRTest.cpp (-789)
  • (removed) clang-tools-extra/pseudo/unittests/GrammarTest.cpp (-213)
  • (removed) clang-tools-extra/pseudo/unittests/LRTableTest.cpp (-76)
diff --git a/clang-tools-extra/clangd/ClangdServer.h b/clang-tools-extra/clangd/ClangdServer.h
index a416602251428..7d0a1c65b8e38 100644
--- a/clang-tools-extra/clangd/ClangdServer.h
+++ b/clang-tools-extra/clangd/ClangdServer.h
@@ -168,6 +168,7 @@ class ClangdServer {
     std::vector<std::string> QueryDriverGlobs;
 
     // Whether the client supports folding only complete lines.
+    // FIXME: we currently do not behave differently based on this flag.
     bool LineFoldingOnly = false;
 
     FeatureModuleSet *FeatureModules = nullptr;
diff --git a/clang-tools-extra/pseudo/CMakeLists.txt b/clang-tools-extra/pseudo/CMakeLists.txt
index 24bc1530bb7d6..2bc0f92d063cc 100644
--- a/clang-tools-extra/pseudo/CMakeLists.txt
+++ b/clang-tools-extra/pseudo/CMakeLists.txt
@@ -1,11 +1,5 @@
 include_directories(include)
-include_directories(${CMAKE_CURRENT_BINARY_DIR}/include)
-add_subdirectory(include)
-add_subdirectory(gen)
 add_subdirectory(lib)
-add_subdirectory(tool)
-add_subdirectory(fuzzer)
-add_subdirectory(benchmarks)
 if(CLANG_INCLUDE_TESTS)
   add_subdirectory(unittests)
   add_subdirectory(test)
diff --git a/clang-tools-extra/pseudo/README.md b/clang-tools-extra/pseudo/README.md
index 0958f5d500e7f..b5984fdcdc097 100644
--- a/clang-tools-extra/pseudo/README.md
+++ b/clang-tools-extra/pseudo/README.md
@@ -1,3 +1,10 @@
+# Removed
+
+This was never completed and most of the implementation has been removed.
+This document remains for historical interest, for now.
+
+See https://docs.google.com/document/d/1eGkTOsFja63wsv8v0vd5JdoTonj-NlN3ujGF0T7xDbM/edit
+
 # clang pseudoparser
 
 This directory implements an approximate heuristic parser for C++, based on the
diff --git a/clang-tools-extra/pseudo/benchmarks/Benchmark.cpp b/clang-tools-extra/pseudo/benchmarks/Benchmark.cpp
deleted file mode 100644
index 087ab6c250e39..0000000000000
--- a/clang-tools-extra/pseudo/benchmarks/Benchmark.cpp
+++ /dev/null
@@ -1,156 +0,0 @@
-//===--- Benchmark.cpp -  clang pseudoparser benchmarks ---------*- C++ -*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// Benchmark for the overall pseudoparser performance, it also includes other
-// important pieces of the pseudoparser (grammar compliation, LR table build
-// etc).
-//
-// Note: make sure to build the benchmark in Release mode.
-//
-// Usage:
-//   tools/clang/tools/extra/pseudo/benchmarks/ClangPseudoBenchmark \
-//      --grammar=../clang-tools-extra/pseudo/lib/cxx.bnf \
-//      --source=../clang/lib/Sema/SemaDecl.cpp
-//
-//===----------------------------------------------------------------------===//
-
-#include "benchmark/benchmark.h"
-#include "clang-pseudo/Bracket.h"
-#include "clang-pseudo/DirectiveTree.h"
-#include "clang-pseudo/Forest.h"
-#include "clang-pseudo/GLR.h"
-#include "clang-pseudo/Token.h"
-#include "clang-pseudo/cli/CLI.h"
-#include "clang-pseudo/grammar/Grammar.h"
-#include "clang-pseudo/grammar/LRTable.h"
-#include "clang/Basic/LangOptions.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/ErrorOr.h"
-#include "llvm/Support/MemoryBuffer.h"
-#include "llvm/Support/raw_ostream.h"
-#include <string>
-
-using llvm::cl::desc;
-using llvm::cl::opt;
-using llvm::cl::Required;
-
-static opt<std::string> Source("source", desc("Source file"), Required);
-
-namespace clang {
-namespace pseudo {
-namespace bench {
-namespace {
-
-const std::string *SourceText = nullptr;
-const Language *Lang = nullptr;
-
-void setup() {
-  auto ReadFile = [](llvm::StringRef FilePath) -> std::string {
-    llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> GrammarText =
-        llvm::MemoryBuffer::getFile(FilePath);
-    if (std::error_code EC = GrammarText.getError()) {
-      llvm::errs() << "Error: can't read file '" << FilePath
-                   << "': " << EC.message() << "\n";
-      std::exit(1);
-    }
-    return GrammarText.get()->getBuffer().str();
-  };
-  SourceText = new std::string(ReadFile(Source));
-  Lang = &getLanguageFromFlags();
-}
-
-static void buildSLR(benchmark::State &State) {
-  for (auto _ : State)
-    LRTable::buildSLR(Lang->G);
-}
-BENCHMARK(buildSLR);
-
-TokenStream lexAndPreprocess() {
-  clang::LangOptions LangOpts = genericLangOpts();
-  TokenStream RawStream = pseudo::lex(*SourceText, LangOpts);
-  auto DirectiveStructure = DirectiveTree::parse(RawStream);
-  chooseConditionalBranches(DirectiveStructure, RawStream);
-  TokenStream Cook =
-      cook(DirectiveStructure.stripDirectives(RawStream), LangOpts);
-  auto Stream = stripComments(Cook);
-  pairBrackets(Stream);
-  return Stream;
-}
-
-static void lex(benchmark::State &State) {
-  clang::LangOptions LangOpts = genericLangOpts();
-  for (auto _ : State)
-    clang::pseudo::lex(*SourceText, LangOpts);
-  State.SetBytesProcessed(static_cast<uint64_t>(State.iterations()) *
-                          SourceText->size());
-}
-BENCHMARK(lex);
-
-static void pairBrackets(benchmark::State &State) {
-  clang::LangOptions LangOpts = genericLangOpts();
-  auto Stream = clang::pseudo::lex(*SourceText, LangOpts);
-  for (auto _ : State)
-    pairBrackets(Stream);
-  State.SetBytesProcessed(static_cast<uint64_t>(State.iterations()) *
-                          SourceText->size());
-}
-BENCHMARK(pairBrackets);
-
-static void preprocess(benchmark::State &State) {
-  clang::LangOptions LangOpts = genericLangOpts();
-  TokenStream RawStream = clang::pseudo::lex(*SourceText, LangOpts);
-  for (auto _ : State) {
-    auto DirectiveStructure = DirectiveTree::parse(RawStream);
-    chooseConditionalBranches(DirectiveStructure, RawStream);
-    stripComments(
-        cook(DirectiveStructure.stripDirectives(RawStream), LangOpts));
-  }
-  State.SetBytesProcessed(static_cast<uint64_t>(State.iterations()) *
-                          SourceText->size());
-}
-BENCHMARK(preprocess);
-
-static void glrParse(benchmark::State &State) {
-  SymbolID StartSymbol = *Lang->G.findNonterminal("translation-unit");
-  TokenStream Stream = lexAndPreprocess();
-  for (auto _ : State) {
-    pseudo::ForestArena Forest;
-    pseudo::GSS GSS;
-    pseudo::glrParse(ParseParams{Stream, Forest, GSS}, StartSymbol, *Lang);
-  }
-  State.SetBytesProcessed(static_cast<uint64_t>(State.iterations()) *
-                          SourceText->size());
-}
-BENCHMARK(glrParse);
-
-static void full(benchmark::State &State) {
-  SymbolID StartSymbol = *Lang->G.findNonterminal("translation-unit");
-  for (auto _ : State) {
-    TokenStream Stream = lexAndPreprocess();
-    pseudo::ForestArena Forest;
-    pseudo::GSS GSS;
-    pseudo::glrParse(ParseParams{Stream, Forest, GSS}, StartSymbol, *Lang);
-  }
-  State.SetBytesProcessed(static_cast<uint64_t>(State.iterations()) *
-                          SourceText->size());
-}
-BENCHMARK(full);
-
-} // namespace
-} // namespace bench
-} // namespace pseudo
-} // namespace clang
-
-int main(int argc, char *argv[]) {
-  benchmark::Initialize(&argc, argv);
-  llvm::cl::ParseCommandLineOptions(argc, argv);
-  clang::pseudo::bench::setup();
-  benchmark::RunSpecifiedBenchmarks();
-  return 0;
-}
diff --git a/clang-tools-extra/pseudo/benchmarks/CMakeLists.txt b/clang-tools-extra/pseudo/benchmarks/CMakeLists.txt
deleted file mode 100644
index 859db991403cd..0000000000000
--- a/clang-tools-extra/pseudo/benchmarks/CMakeLists.txt
+++ /dev/null
@@ -1,9 +0,0 @@
-add_benchmark(ClangPseudoBenchmark Benchmark.cpp)
-
-target_link_libraries(ClangPseudoBenchmark
-  PRIVATE
-  clangPseudo
-  clangPseudoCLI
-  clangPseudoGrammar
-  LLVMSupport
-  )
diff --git a/clang-tools-extra/pseudo/fuzzer/CMakeLists.txt b/clang-tools-extra/pseudo/fuzzer/CMakeLists.txt
deleted file mode 100644
index e1d79873471f0..0000000000000
--- a/clang-tools-extra/pseudo/fuzzer/CMakeLists.txt
+++ /dev/null
@@ -1,16 +0,0 @@
-set(LLVM_LINK_COMPONENTS
-  FuzzerCLI
-  Support
-  )
-
-add_llvm_fuzzer(clang-pseudo-fuzzer
-  Fuzzer.cpp
-  DUMMY_MAIN Main.cpp
-  )
-
-target_link_libraries(clang-pseudo-fuzzer
-  PRIVATE
-  clangPseudo
-  clangPseudoCLI
-  clangPseudoGrammar
-  )
diff --git a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp b/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
deleted file mode 100644
index 87b9d15480cc3..0000000000000
--- a/clang-tools-extra/pseudo/fuzzer/Fuzzer.cpp
+++ /dev/null
@@ -1,82 +0,0 @@
-//===-- Fuzzer.cpp - Fuzz the pseudoparser --------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-#include "clang-pseudo/DirectiveTree.h"
-#include "clang-pseudo/Forest.h"
-#include "clang-pseudo/GLR.h"
-#include "clang-pseudo/Token.h"
-#include "clang-pseudo/cli/CLI.h"
-#include "clang-pseudo/grammar/Grammar.h"
-#include "clang-pseudo/grammar/LRTable.h"
-#include "clang/Basic/LangOptions.h"
-#include "llvm/ADT/StringRef.h"
-#include "llvm/Support/MemoryBuffer.h"
-#include "llvm/Support/raw_ostream.h"
-#include <algorithm>
-
-namespace clang {
-namespace pseudo {
-namespace {
-
-class Fuzzer {
-  clang::LangOptions LangOpts = clang::pseudo::genericLangOpts();
-  bool Print;
-
-public:
-  Fuzzer(bool Print) : Print(Print) {}
-
-  void operator()(llvm::StringRef Code) {
-    std::string CodeStr = Code.str(); // Must be null-terminated.
-    auto RawStream = lex(CodeStr, LangOpts);
-    auto DirectiveStructure = DirectiveTree::parse(RawStream);
-    clang::pseudo::chooseConditionalBranches(DirectiveStructure, RawStream);
-    // FIXME: strip preprocessor directives
-    auto ParseableStream =
-        clang::pseudo::stripComments(cook(RawStream, LangOpts));
-
-    clang::pseudo::ForestArena Arena;
-    clang::pseudo::GSS GSS;
-    const Language &Lang = getLanguageFromFlags();
-    auto &Root =
-        glrParse(clang::pseudo::ParseParams{ParseableStream, Arena, GSS},
-                 *Lang.G.findNonterminal("translation-unit"), Lang);
-    if (Print)
-      llvm::outs() << Root.dumpRecursive(Lang.G);
-  }
-};
-
-Fuzzer *Fuzz = nullptr;
-
-} // namespace
-} // namespace pseudo
-} // namespace clang
-
-extern "C" {
-
-// Set up the fuzzer from command line flags:
-//  -print                     - used for testing the fuzzer
-int LLVMFuzzerInitialize(int *Argc, char ***Argv) {
-  bool PrintForest = false;
-  auto ConsumeArg = [&](llvm::StringRef Arg) -> bool {
-    if (Arg == "-print") {
-      PrintForest = true;
-      return true;
-    }
-    return false;
-  };
-  *Argc = std::remove_if(*Argv + 1, *Argv + *Argc, ConsumeArg) - *Argv;
-
-  clang::pseudo::Fuzz = new clang::pseudo::Fuzzer(PrintForest);
-  return 0;
-}
-
-int LLVMFuzzerTestOneInput(uint8_t *Data, size_t Size) {
-  (*clang::pseudo::Fuzz)(llvm::StringRef(reinterpret_cast<char *>(Data), Size));
-  return 0;
-}
-}
diff --git a/clang-tools-extra/pseudo/fuzzer/Main.cpp b/clang-tools-extra/pseudo/fuzzer/Main.cpp
deleted file mode 100644
index 542a3007a399f..0000000000000
--- a/clang-tools-extra/pseudo/fuzzer/Main.cpp
+++ /dev/null
@@ -1,16 +0,0 @@
-//===--- Main.cpp - Entry point to sanity check the fuzzer ----------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-
-#include "llvm/FuzzMutate/FuzzerCLI.h"
-
-extern "C" int LLVMFuzzerInitialize(int *, char ***);
-extern "C" int LLVMFuzzerTestOneInput(const uint8_t *, size_t);
-int main(int argc, char *argv[]) {
-  return llvm::runFuzzerOnInputs(argc, argv, LLVMFuzzerTestOneInput,
-                                 LLVMFuzzerInitialize);
-}
diff --git a/clang-tools-extra/pseudo/gen/CMakeLists.txt b/clang-tools-extra/pseudo/gen/CMakeLists.txt
deleted file mode 100644
index 3dd615a558751..0000000000000
--- a/clang-tools-extra/pseudo/gen/CMakeLists.txt
+++ /dev/null
@@ -1,11 +0,0 @@
-set(LLVM_LINK_COMPONENTS Support)
-list(REMOVE_ITEM LLVM_COMMON_DEPENDS clang-tablegen-targets)
-
-add_clang_executable(clang-pseudo-gen
-  Main.cpp
-  )
-
-target_link_libraries(clang-pseudo-gen
-  PRIVATE
-  clangPseudoGrammar
-  )
diff --git a/clang-tools-extra/pseudo/gen/Main.cpp b/clang-tools-extra/pseudo/gen/Main.cpp
deleted file mode 100644
index 25cb26563837a..0000000000000
--- a/clang-tools-extra/pseudo/gen/Main.cpp
+++ /dev/null
@@ -1,172 +0,0 @@
-//===--- Main.cpp - Compile BNF grammar -----------------------------------===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// This is a tool to compile a BNF grammar, it is used by the build system to
-// generate a necessary data bits to statically construct core pieces (Grammar,
-// LRTable etc) of the LR parser.
-//
-//===----------------------------------------------------------------------===//
-
-#include "clang-pseudo/grammar/Grammar.h"
-#include "llvm/ADT/StringExtras.h"
-#include "llvm/Support/CommandLine.h"
-#include "llvm/Support/FileSystem.h"
-#include "llvm/Support/FormatVariadic.h"
-#include "llvm/Support/MemoryBuffer.h"
-#include "llvm/Support/ToolOutputFile.h"
-#include <algorithm>
-
-using llvm::cl::desc;
-using llvm::cl::init;
-using llvm::cl::opt;
-using llvm::cl::Required;
-using llvm::cl::value_desc;
-using llvm::cl::values;
-
-namespace {
-enum EmitType {
-  EmitSymbolList,
-  EmitGrammarContent,
-};
-
-opt<std::string> Grammar("grammar", desc("Parse a BNF grammar file."),
-                         Required);
-opt<EmitType>
-    Emit(desc("which information to emit:"),
-         values(clEnumValN(EmitSymbolList, "emit-symbol-list",
-                           "Print nonterminal symbols (default)"),
-                clEnumValN(EmitGrammarContent, "emit-grammar-content",
-                           "Print the BNF grammar content as a string")));
-
-opt<std::string> OutputFilename("o", init("-"), desc("Output"),
-                                value_desc("file"));
-
-std::string readOrDie(llvm::StringRef Path) {
-  llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> Text =
-      llvm::MemoryBuffer::getFile(Path);
-  if (std::error_code EC = Text.getError()) {
-    llvm::errs() << "Error: can't read grammar file '" << Path
-                 << "': " << EC.message() << "\n";
-    ::exit(1);
-  }
-  return Text.get()->getBuffer().str();
-}
-} // namespace
-
-namespace clang {
-namespace pseudo {
-namespace {
-
-// Mangles a symbol name into a valid identifier.
-//
-// These follow names in the grammar fairly closely:
-//   nonterminal: `ptr-declarator` becomes `ptr_declarator`;
-//   punctuator: `,` becomes `COMMA`;
-//   keyword: `INT` becomes `INT`;
-//   terminal: `IDENTIFIER` becomes `IDENTIFIER`;
-std::string mangleSymbol(SymbolID SID, const Grammar &G) {
-  static auto &TokNames = *new std::vector<std::string>{
-#define TOK(X) llvm::StringRef(#X).upper(),
-#define KEYWORD(Keyword, Condition) llvm::StringRef(#Keyword).upper(),
-#include "clang/Basic/TokenKinds.def"
-      };
-  if (isToken(SID))
-    return TokNames[symbolToToken(SID)];
-  std::string Name = G.symbolName(SID).str();
-  // translation-unit -> translation_unit
-  std::replace(Name.begin(), Name.end(), '-', '_');
-  return Name;
-}
-
-// Mangles the RHS of a rule definition into a valid identifier.
-// 
-// These are unique only for a fixed LHS.
-// e.g. for the grammar rule `ptr-declarator := ptr-operator ptr-declarator`,
-// it is `ptr_operator__ptr_declarator`.
-std::string mangleRule(RuleID RID, const Grammar &G) {
-  const auto &R = G.lookupRule(RID);
-  std::string MangleName = mangleSymbol(R.seq().front(), G);
-  for (SymbolID S : R.seq().drop_front()) {
-    MangleName.append("__");
-    MangleName.append(mangleSymbol(S, G));
-  }
-  return MangleName;
-}
-
-} // namespace
-} // namespace pseudo
-} // namespace clang
-
-int main(int argc, char *argv[]) {
-  llvm::cl::ParseCommandLineOptions(argc, argv, "");
-
-  std::string GrammarText = readOrDie(Grammar);
-  std::vector<std::string> Diags;
-  auto G = clang::pseudo::Grammar::parseBNF(GrammarText, Diags);
-
-  if (!Diags.empty()) {
-    llvm::errs() << llvm::join(Diags, "\n");
-    return 1;
-  }
-
-  std::error_code EC;
-  llvm::ToolOutputFile Out{OutputFilename, EC, llvm::sys::fs::OF_None};
-  if (EC) {
-    llvm::errs() << EC.message() << '\n';
-    return 1;
-  }
-
-  switch (Emit) {
-  case EmitSymbolList:
-    Out.os() << R"cpp(
-#ifndef NONTERMINAL
-#define NONTERMINAL(NAME, ID)
-#endif
-#ifndef RULE
-#define RULE(LHS, RHS, ID)
-#endif
-#ifndef EXTENSION
-#define EXTENSION(NAME, ID)
-#endif
-)cpp";
-    for (clang::pseudo::SymbolID ID = 0; ID < G.table().Nonterminals.size();
-         ++ID) {
-      Out.os() << llvm::formatv("NONTERMINAL({0}, {1})\n",
-                                clang::pseudo::mangleSymbol(ID, G), ID);
-      for (const clang::pseudo::Rule &R : G.rulesFor(ID)) {
-        clang::pseudo::RuleID RID = &R - G.table().Rules.data();
-        Out.os() << llvm::formatv("RULE({0}, {1}, {2})\n",
-                                  clang::pseudo::mangleSymbol(R.Target, G),
-                                  clang::pseudo::mangleRule(RID, G), RID);
-      }
-    }
-    for (clang::pseudo::ExtensionID EID = 1 /*skip the sentinel 0 value*/;
-         EID < G.table().AttributeValues.size(); ++EID) {
-      llvm::StringRef Name = G.table().AttributeValues[EID];
-      assert(!Name.empty());
-      Out.os() << llvm::formatv("EXTENSION({0}, {1})\n", Name, EID);
-    }
-    Out.os() << R"cpp(
-#undef NONTERMINAL
-#undef RULE
-#undef EXTENSION
-)cpp";
-    break;
-  case EmitGrammarContent:
-    for (llvm::StringRef Line : llvm::split(GrammarText, '\n')) {
-      Out.os() << '"';
-      Out.os().write_escaped((Line + "\n").str());
-      Out.os() << "\"\n";
-    }
-    break;
-  }
-
-  Out.keep();
-
-  return 0;
-}
diff --git a/clang-tools-extra/pseudo/include/CMakeLists.txt b/clang-tools-extra/pseudo/include/CMakeLists.txt
deleted file mode 100644
index 2334cfa12e337..0000000000000
--- a/clang-tools-extra/pseudo/include/CMakeLists.txt
+++ /dev/null
@@ -1,31 +0,0 @@
-# The cxx.bnf grammar file
-set(cxx_bnf ${CMAKE_CURRENT_SOURCE_DIR}/../lib/cxx/cxx.bnf)
-
-setup_host_tool(clang-pseudo-gen CLANG_PSEUDO_GEN pseudo_gen pseudo_gen_target)
-
-# Generate inc files.
-set(cxx_symbols_inc ${CMAKE_CURRENT_BINARY_DIR}/CXXSymbols.inc)
-add_custom_command(OUTPUT ${cxx_symbols_inc}
-   COMMAND "${pseudo_gen}"
-     --grammar ${cxx_bnf}
-     --emit-symbol-list
-     -o ${cxx_symbols_inc}
-   COMMENT "Generating nonterminal symbol file for cxx grammar..."
-   DEPENDS ${pseudo_gen_target} ${cxx_bnf}
-   VERBATIM)
-
-set(cxx_bnf_inc ${CMAKE_CURRENT_BINARY_DIR}/CXXBNF.inc)
-add_custom_command(OUTPUT ${cxx_bnf_inc}
-   COMMAND "${pseudo_gen}"
-     --grammar ${cxx_bnf}
-     --emit-grammar-content
-     -o ${cxx_bnf_inc}
-   COMMENT "Generating bnf string file for cxx grammar..."
-   DEPENDS ${pseudo_gen_target} ${cxx_bnf}
-   VERBATIM)
-
-# add_custom_command does not create a new target, we need to deine a target
-# explicitly, so that other targets can depend on it.
-add_custom_target(cxx_gen
-    DEPENDS ${cxx_symbols_inc} ${cxx_bnf_inc}
-    VERBATIM)
diff --git a/clang-tools-extra/pseudo/include/clang-pseudo/Disambiguate.h b/clang-tools-extra/pseudo/include/clang-pseudo/Disambiguate.h
deleted file mode 100644
index 5f3a22c9cabb3..0000000000000
--- a/clang-tools-extra/pseudo/include/clang-pseudo/Disambiguate.h
+++ /dev/null
@@ -1,64 +0,0 @@
-//===--- Disambiguate.h - Find the best tree in the forest -------*- C++-*-===//
-//
-// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
-// See https://llvm.org/LICENSE.txt for license information.
-// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-//
-//===----------------------------------------------------------------------===//
-//
-// A GLR parse forest represents every possible parse tree fo...
[truncated]
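
For readers skimming the truncated patch: the removed `gen/Main.cpp` documents how grammar names were mangled into C++ identifiers (`ptr-declarator` becomes `ptr_declarator`, and the right-hand side of `ptr-declarator := ptr-operator ptr-declarator` becomes `ptr_operator__ptr_declarator`). The following is a minimal standalone sketch of that naming scheme, not the removed code itself. It assumes plain strings instead of `SymbolID`/`RuleID`, handles nonterminals only (the token-name table for punctuators and keywords is left out), and the names `mangleNonterminal` and `mangleRuleRHS` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not the removed API): turns a grammar nonterminal name
// into a valid C++ identifier by replacing every '-' with '_',
// e.g. "translation-unit" -> "translation_unit".
std::string mangleNonterminal(std::string Name) {
  std::replace(Name.begin(), Name.end(), '-', '_');
  return Name;
}

// Hypothetical helper (not the removed API): joins the mangled right-hand-side
// symbols of a rule with "__". As the removed comment notes, the result is
// unique only for a fixed left-hand side.
std::string mangleRuleRHS(const std::vector<std::string> &RHS) {
  assert(!RHS.empty() && "a rule needs at least one RHS symbol");
  std::string Mangled = mangleNonterminal(RHS.front());
  for (std::size_t I = 1; I < RHS.size(); ++I) {
    Mangled += "__";
    Mangled += mangleNonterminal(RHS[I]);
  }
  return Mangled;
}
```

Under these assumptions, `mangleRuleRHS({"ptr-operator", "ptr-declarator"})` yields `ptr_operator__ptr_declarator`, matching the example in the removed comment.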

@hokein
Collaborator

hokein commented Jan 31, 2024

We have some internal uses of the library. To make integration easier, I think it is better to do an internal cleanup first and then land this patch.

@AaronBallman
Collaborator

These changes are no longer needed because we landed ed8f788, so closing this PR.
