clang::ASTWriter can create a crashing PCH if an incorrect hasErrors value is passed #53952

Closed
@TestingPlant

With the following code:

#include "clang/Frontend/ASTUnit.h"
#include "clang/Serialization/ASTWriter.h"
#include "clang/Serialization/InMemoryModuleCache.h"
#include "clang/Tooling/Tooling.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/Bitstream/BitstreamWriter.h"
#include <fstream>
#include <memory>
#include <sstream>
#include <string>

int main() {
	std::ifstream codeFile("input.cc");
	std::stringstream code;
	code << codeFile.rdbuf();

	const std::unique_ptr<clang::ASTUnit> astUnit = clang::tooling::buildASTFromCode(code.str());

	const bool hasErrors = false; // This will not cause a crash if this is true
	llvm::SmallString<128> pchData;
	llvm::BitstreamWriter pchDataStream(pchData);
	clang::InMemoryModuleCache moduleCache;
	clang::ASTWriter astWriter(pchDataStream, pchData, moduleCache, {}, false);
	astWriter.WriteAST(astUnit->getSema(), "", nullptr, "", hasErrors);

	std::ofstream file("input.gch", std::ios::binary | std::ios::out);
	file << static_cast<std::string>(pchData);
}

and the below file:

// input.cc
int main() {
	foo(FOO);
}

clang will crash when using the input.gch file generated from the code.

Output:

$ ./a.out                         
input.cc:3:6: error: use of undeclared identifier 'FOO'    
        foo(FOO);                                          
            ^
$ clang input.gch
/tmp/input.cc:3:2: error: cannot compile this l-value expression yet
        foo(FOO);
        ^~~
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /usr/bin/clang-13 -cc1 -triple x86_64-pc-linux-gnu -emit-obj -mrelax-all --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name input.gch -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=all -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -tune-cpu generic -debugger-tuning=gdb -fcoverage-compilation-dir=/tmp -resource-dir /usr/lib/clang/13.0.1 -fdebug-compilation-dir=/tmp -ferror-limit 19 -stack-protector 2 -fgnuc-version=4.2.1 -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/input-11f595.o -x precompiled-header input.gch
1.	<eof> parser at end of file
2.	/tmp/input.cc:2:5: LLVM IR generation of declaration 'main'
3.	/tmp/input.cc:2:5: Generating code for declaration 'main'
 #0 0x00007fd23586cea7 (/usr/lib/libLLVM-13.so+0xba6ea7)
 #1 0x00007fd23586a6a6 (/usr/lib/libLLVM-13.so+0xba46a6)
 #2 0x00007fd234920da0 __restore_rt sigaction.c:0:0
 #3 0x00007fd235a39b48 llvm::PointerType::get(llvm::Type*, unsigned int) (/usr/lib/libLLVM-13.so+0xd73b48)
 #4 0x00007fd23cdec3bd clang::CodeGen::CodeGenFunction::EmitUnsupportedLValue(clang::Expr const*, char const*) (/usr/lib/libclang-cpp.so.13+0x180e3bd)
 #5 0x00007fd23ce020e5 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/usr/lib/libclang-cpp.so.13+0x18240e5)
 #6 0x00007fd23ce12222 clang::CodeGen::CodeGenFunction::EmitCallee(clang::Expr const*) (/usr/lib/libclang-cpp.so.13+0x1834222)
 #7 0x00007fd23ce1266b clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/usr/lib/libclang-cpp.so.13+0x183466b)
 #8 0x00007fd23ce1dbbe (/usr/lib/libclang-cpp.so.13+0x183fbbe)
 #9 0x00007fd23ce57e52 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/usr/lib/libclang-cpp.so.13+0x1879e52)
#10 0x00007fd23ce11cdf clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/usr/lib/libclang-cpp.so.13+0x1833cdf)
#11 0x00007fd23ce11eb6 clang::CodeGen::CodeGenFunction::EmitIgnoredExpr(clang::Expr const*) (/usr/lib/libclang-cpp.so.13+0x1833eb6)
#12 0x00007fd23cf1f972 clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/usr/lib/libclang-cpp.so.13+0x1941972)
#13 0x00007fd23cf20822 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/usr/lib/libclang-cpp.so.13+0x1942822)
#14 0x00007fd23cf78446 clang::CodeGen::CodeGenFunction::EmitFunctionBody(clang::Stmt const*) (/usr/lib/libclang-cpp.so.13+0x199a446)
#15 0x00007fd23cf98d40 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/usr/lib/libclang-cpp.so.13+0x19bad40)
#16 0x00007fd23cfa1d78 clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/lib/libclang-cpp.so.13+0x19c3d78)
#17 0x00007fd23cf9fb75 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/usr/lib/libclang-cpp.so.13+0x19c1b75)
#18 0x00007fd23cfc01bf (/usr/lib/libclang-cpp.so.13+0x19e21bf)
#19 0x00007fd23d00f5a4 (/usr/lib/libclang-cpp.so.13+0x1a315a4)
#20 0x00007fd23cf2c779 (/usr/lib/libclang-cpp.so.13+0x194e779)
#21 0x00007fd23d3de019 (/usr/lib/libclang-cpp.so.13+0x1e00019)
#22 0x00007fd23d3683ab non-virtual thunk to clang::ASTReader::StartTranslationUnit(clang::ASTConsumer*) (/usr/lib/libclang-cpp.so.13+0x1d8a3ab)
#23 0x00007fd23bfcc0f6 clang::ParseAST(clang::Sema&, bool, bool) (/usr/lib/libclang-cpp.so.13+0x9ee0f6)
#24 0x00007fd23d5548b9 clang::FrontendAction::Execute() (/usr/lib/libclang-cpp.so.13+0x1f768b9)
#25 0x00007fd23d4fabbf clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/usr/lib/libclang-cpp.so.13+0x1f1cbbf)
#26 0x00007fd23d59ff20 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/usr/lib/libclang-cpp.so.13+0x1fc1f20)
#27 0x000055fe56ac784c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/usr/bin/clang-13+0x1684c)
#28 0x000055fe56ac9c2d (/usr/bin/clang-13+0x18c2d)
#29 0x000055fe56abe185 main (/usr/bin/clang-13+0xd185)
#30 0x00007fd23490bb25 __libc_start_main (/usr/lib/libc.so.6+0x27b25)
#31 0x000055fe56ac048e _start (/usr/bin/clang-13+0xf48e)
clang-13: error: unable to execute command: Segmentation fault (core dumped)
clang-13: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 13.0.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
clang-13: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

Activity

changed the title from "clang::ASTWriter can create a crahsing PCH if an incorrect hasErrors value is passed" to "clang::ASTWriter can create a crashing PCH if an incorrect hasErrors value is passed" on Feb 19, 2022
added labels clang:codegen (IR generation bugs: mangling, exceptions, etc.) and crash (Prefer [crash-on-valid] or [crash-on-invalid]), and removed a label, on Feb 19, 2022
llvmbot commented on Feb 19, 2022
Member

@llvm/issue-subscribers-clang-codegen

Aadi-Mittal-2004 commented on Mar 9, 2023

Hey there, I am new to open source. Could you please explain the issue in detail so I can start working on it?

phyBrackets commented on Mar 9, 2023
Member

Hey @Aadi-Mittal-2004, if you are new to LLVM or open source in general, I'd suggest going through the LLVM contribution guidelines, which you can find here: https://llvm.org/docs/Contributing.html

About the issue:
The problem is passing an incorrect hasErrors value to ASTWriter::WriteAST, which can lead to a crashing PCH (precompiled header). Here, hasErrors is set to false even though the source code actually contains errors. In that case, ASTWriter may generate an invalid PCH that causes a crash when the PCH is later used for compilation. When hasErrors is set to true, ASTWriter will write the AST with compiler errors. This means the generated PCH will contain information about the errors that were encountered during semantic analysis, so when it is later used in a compilation it will probably not cause a crash.
So I think we need to check and handle the hasErrors state correctly before any major action.
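
To make the point above concrete, here is a minimal sketch of deriving hasErrors from the diagnostics state instead of hard-coding it. It assumes that clang::DiagnosticsEngine::hasUncompilableErrorOccurred() is the right signal for this flag; that is an assumption, not something confirmed in this thread. FakeDiagnostics and deriveHasErrors are hypothetical names used only so the snippet compiles without the Clang headers.

```cpp
#include <cassert>

// Hypothetical stand-in for clang::DiagnosticsEngine, present only so this
// sketch builds without Clang. It mirrors the one query the helper needs.
struct FakeDiagnostics {
  bool sawUncompilableError;
  bool hasUncompilableErrorOccurred() const { return sawUncompilableError; }
};

// Hypothetical helper: compute the hasErrors argument from the diagnostics
// state rather than passing a constant. It accepts any type that exposes
// hasUncompilableErrorOccurred().
template <typename Diagnostics>
bool deriveHasErrors(const Diagnostics &diags) {
  return diags.hasUncompilableErrorOccurred();
}
```

In the reproducer above, the hard-coded `const bool hasErrors = false;` could then become `const bool hasErrors = deriveHasErrors(astUnit->getDiagnostics());`, after checking that astUnit is not null. For the input.cc shown, the undeclared-identifier error should make this true, which is the case the original report says does not crash.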

rajkumarananthu commented on Aug 28, 2023
Contributor

Hi,

If no one is working on this issue, I would like to take it up. Can someone please assign the issue to me? As I am not in the contributor list, I am not able to assign it to myself.

@phyBrackets Thanks for the contributing guidelines and the detailed explanation of the scenario here.

Thanks
Rajkumar Ananthu.

rajkumarananthu commented on Sep 22, 2023
Contributor

Hi @TestingPlant @phyBrackets @danix800

I am trying to reproduce the issue and find its root cause, but I am stuck on a linker error. Can any of you help me figure this out?

I am new to the LLVM project. I managed to build clang and llvm properly, and I tried a few other things with the build to get more exposure to how things work.

But when I try to compile and link the code given above, I get the following linker errors:

/usr/bin/ld: /tmp/cch79L0i.o:(.data.rel+0x0): undefined reference to `llvm::EnableABIBreakingChecks'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::MallocAllocator::Deallocate(void const*, unsigned long, unsigned long)':
test.cc:(.text._ZN4llvm15MallocAllocator10DeallocateEPKvmm[_ZN4llvm15MallocAllocator10DeallocateEPKvmm]+0x2f): undefined reference to `llvm::deallocate_buffer(void*, unsigned long, unsigned long)'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::SmallString<128u>::operator std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >() const':
test.cc:(.text._ZNK4llvm11SmallStringILj128EEcvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEv[_ZNK4llvm11SmallStringILj128EEcvNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEv]+0x38): undefined reference to `llvm::SmallVectorBase<unsigned long>::size() const'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::SmallVectorTemplateCommon<char, void>::end()':
test.cc:(.text._ZN4llvm25SmallVectorTemplateCommonIcvE3endEv[_ZN4llvm25SmallVectorTemplateCommonIcvE3endEv]+0x28): undefined reference to `llvm::SmallVectorBase<unsigned long>::size() const'
/usr/bin/ld: /tmp/cch79L0i.o: in function `llvm::SmallVectorTemplateCommon<std::unique_ptr<clang::PCHContainerReader, std::default_delete<clang::PCHContainerReader> >, void>::end()':
test.cc:(.text._ZN4llvm25SmallVectorTemplateCommonISt10unique_ptrIN5clang18PCHContainerReaderESt14default_deleteIS3_EEvE3endEv[_ZN4llvm25SmallVectorTemplateCommonISt10unique_ptrIN5clang18PCHContainerReaderESt14default_deleteIS3_EEvE3endEv]+0x28): undefined reference to `llvm::SmallVectorBase<unsigned int>::size() const'

I am using the following command to compile the input C++ program and generate the executable (bld_ninja is my build directory):

g++ -I clang/include -I bld_ninja/tools/clang/include -I bld_ninja/include -I llvm/include test.cc $CLANGLIBS -std=c++17   -fno-exceptions -funwind-tables -fno-rtti -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_LIBCPP_ENABLE_HARDENED_MODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -L bld_ninja/lib

$CLANGLIBS lists the clang libraries in the correct link order:

export CLANGLIBS="-lclangTooling -lclangFrontendTool -lclangFrontend -lclangDriver -lclangSerialization -lclangCodeGen -lclangParse -lclangSema -lclangStaticAnalyzerFrontend -lclangStaticAnalyzerCheckers -lclangStaticAnalyzerCore -lclangAnalysis -lclangARCMigrate -lclangRewrite -lclangRewriteFrontend -lclangEdit -lclangAST -lclangLex -lclangBasic -lcurses"

Based on the errors above, I assume the LLVM libraries are not linked properly, and I am not sure in which order the LLVM libraries have to be listed. I tried using llvm-config to get them, but that did not solve my problem.

Can anyone help me with this?

Thanks
Rajkumar Ananthu

phyBrackets commented on Sep 25, 2023
Member

Hi, I'm not exactly sure. Did you try building with $(llvm-config --cxxflags) $(llvm-config --ldflags)? Or you might want to use -DLLVM_DISABLE_ABI_BREAKING_CHECKS_ENFORCING=OFF.

rajkumarananthu commented on Sep 25, 2023
Contributor

@phyBrackets I have tried this and it is still the same issue. I have also tried passing absolute paths, with the same result.

rajkumarananthu commented on Oct 2, 2023
Contributor

Hi Team,

I followed this thread: https://stackoverflow.com/questions/8607432/link-fails-with-clang-llvm-using-g

and used this makefile to compile: https://github.com/loarabia/Clang-tutorial/blob/master/makefile

Then, for the further linker errors I was getting, I followed the LLVM thread https://discourse.llvm.org/t/undefined-reference-only-when-including-astmatchers/67687 and added libclang-cpp.so to the $CLANGLIBS list, which solved my issue.

Now I am able to reproduce the issue. I will work on finding the root cause and post any updates here.

Thank you for the time and support!
