[3.0] Non-determinism in self-host on x86-32 linux #11572
Comments
The CommandLine.o and Path.o will be ignored, similar to how gcc ignores some harmless diffs in their .o files. The Linux ones are a bit more worrisome. Can you generate the .s files for one of those? If it's something like different debugging information, we can ignore it (and maybe get the tests to ignore it as well...). |
It seems to be real. Here are the differences between SparseBitVectorTest 2448c2448
|
Oh my...this is bad. Looks like non-determinism has snuck in. |
This is looking worse and worse. I tried having the clang build in phase1 |
I can reproduce this with a Release+Asserts build, compiling Function.o. There is a difference in the machine code coming out of isel in the function _ZNK4llvm8Function14getIntrinsicIDEv: there is an extra jump table (jt4), and some virtual register numbers are off by two:
@@ -84733,12 +84733,12 @@ define i32 @_ZNK4llvm8Function14getIntri
BB#0: derived from LLVM BB %entry
Later on some BB numbers are also switched.
-BB#6266: derived from LLVM BB %_ZNK4llvm9StringRefixEj.exit9479
The IR printed by -print-isel-output is identical, although that printout doesn't include use-list orders. This is probably not a problem with the clang frontend, my guess is the selectiondag switch lowering. |
Duncan, can you tell if this is a proper nondeterminism issue (i.e., the same compiler is producing different results in different runs), or if it is simply the phase 1 and 2 compilers producing different, but consistent output? I suspect that this is related to floating point code in switch lowering. Would gcc build the phase1 compiler with x87 floats while clang builds phase2 with SSE floats? |
Enabling debug output from SelectionDAGBuilder.cpp, I get:
@@ -52719,32 +52719,10 @@
Which translates to this code in handleJTSwitchCase() making different decisions:
APInt Range = ComputeRange(First, Last);
DEBUG(dbgs() << "Lowering jump table\n"
This is definitely caused by gcc using x87 instructions while clang defaults to SSE. |
Is it possible to use APFloat instead of roundToDouble(), or would that result in the same thing? |
This particular test is pretty simple to do in pure integers, so that's what I'll do. |
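A minimal sketch of how a 40% density check could be written with integers only, so that the result no longer depends on whether the host compiler uses x87 or SSE floating point. This is an illustration, not the code committed in r143006; the function name `isDenseEnough` and the clamping strategy are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not LLVM's actual code): decide whether a switch is
// dense enough for a jump table without using floating point.
// Density is numCases / range; requiring at least 40% is the same as
// requiring numCases * 10 >= range * 4.
bool isDenseEnough(std::uint64_t numCases, std::uint64_t range) {
    assert(range > 0 && "range must cover at least one value");
    // Clamp both operands so the multiplications below cannot overflow.
    const std::uint64_t limit = std::numeric_limits<std::uint64_t>::max() / 10;
    const std::uint64_t n = std::min(numCases, limit);
    const std::uint64_t r = std::min(range, limit);
    return n * 10 >= r * 4;
}
```

For example, `isDenseEnough(4, 10)` returns true on every host, because the comparison 4 * 10 >= 10 * 4 is exact in integer arithmetic.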
Fixed in r143006. Duncan, please verify. Bill, please merge. |
Merged in. Thanks! |
With a release build, the phase1 and phase2 compilers each seems |
I will reopen after the next round of release testing if this is still an issue. |
I would expect Phase 1 and Phase 2 to have different .o files. Phase 2 and Phase 3 should have identical .o files, though. (That is .o files for the compiler itself. If the Phase 1 and Phase 2 compilers are generating different code for the same program, then that doesn't sound good.) |
Phase1 and phase2 were generating different .o files. Hopefully Jakob fixed |
The problem is gcc and clang have different defaults for floating point code generation on i386-linux. The Phase1 compiler is built by gcc using x87 floating point instructions (80-bit precision). It thinks that 4.0/10 < 0.4. The Phase2 compiler is built by clang using SSE floating point instructions (64-bit). It thinks that 4.0/10 >= 0.4. This difference means that the Phase1 and Phase2 compilers disagree about when to create a jump table when building the Phase2 and Phase3 compilers. A fourth phase would be identical to Phase3. |
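The disagreement can be reproduced outside LLVM. The sketch below is illustrative only: it uses `long double` as a stand-in for x87 register precision, which assumes a platform where `long double` is wider than `double` (as on x86 Linux), and all function names are hypothetical.

```cpp
#include <cassert>
#include <limits>

// Emulates SSE-style evaluation: the quotient is rounded to a 64-bit double
// before it is compared against the double constant 0.4.
bool densityBelowThresholdDouble(double numCases, double range) {
    assert(range != 0.0);
    volatile double density = numCases / range;  // volatile forces a store as double
    return density < 0.4;
}

// Emulates x87-style evaluation: the quotient keeps extra precision, while
// the threshold 0.4 is still the value of a double constant.
bool densityBelowThresholdExtended(double numCases, double range) {
    assert(range != 0.0);
    long double density =
        static_cast<long double>(numCases) / static_cast<long double>(range);
    return density < static_cast<long double>(0.4);
}

// Reports whether long double really is wider than double on this platform.
bool hasExtendedPrecision() {
    return std::numeric_limits<long double>::digits >
           std::numeric_limits<double>::digits;
}
```

Where `hasExtendedPrecision()` is true, `densityBelowThresholdExtended(4.0, 10.0)` returns true while `densityBelowThresholdDouble(4.0, 10.0)` returns false. The double constant 0.4 is rounded slightly upward, so a more precise quotient lands just below it.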
Exactly the same problem occurs in rc2. |
grepping for double in lib/CodeGen shows that there are still a bunch of
While this should be fixed, I don't think it can be considered release critical: |
Do the compare differences go away if we pass -msse to gcc when building stage1? If not, do they go away if we do a 4 stage bootstrap? |
log of 4-stage bootstrap on x86_64-darwin11
Search for the string "Comparing objects from stage" in the log. On objects that differ, I try to examine the disassembly, strings dumps, and resolve away differences in file/path names. I have not been able to figure out why the TG-generated .inc files differ in 2 vs. 3. Let me know if I can tar up any files for examination. |
These random failures in the stage2/stage3 comparison for the llvm 3.4.1 bootstrap (as currently built in fink for llvm34-3.4.1-0a) appear to be a side-effect of the implicit use of -stdlib=libstdc++ in stage1 on 10.7/10.8 and -stdlib=libc++ in stage2 and stage3 (as those compilers are built against the libc++ from the matching llvm release). I have never seen a stage4 occur in that llvm34 bootstrap on 10.9 which defaults to -stdlib=libc++ in stage1. Explicitly setting -DCMAKE_CXX_FLAGS="-fno-common -std=c++11 -stdlib=libc++" in stage1 on 10.7 with Xcode 4.6.3 and 10.8 with Xcode 5.1.1 eliminates the occurrences of the stage2/stage3 comparison failure and the stage3 bootstrap always succeeds. Since the system libc++ support in 10.7/10.8 is (currently) sufficient to bootstrap llvm 3.5svn with its new requirement for a c++ shared library that supports c++-11, the use of -std=c++11 -stdlib=libc++ for the stage1 bootstrap appears to be the solution going forward. It would be nice if Apple could update the system libc++ on 10.8 to a newer compatibility version matching the llvm svn release used in Xcode 5.1.1. |
Concerning the 3 or 4-stage bootstrap on darwin11+: |
Old build issue |
Extended Description
Using the script utils/release/test-release.sh to do a Release clang self-host,
on 64 bit linux I see the following:
Comparing Phase 2 and Phase 3 files
file CommandLine.o differs between phase 2 and phase 3
file Path.o differs between phase 2 and phase 3
This is normal: these two files have the path and/or date embedded in them
(that should be fixed if possible, but that's another story).
However on 32 bit linux I see:
Comparing Phase 2 and Phase 3 files
file SparseBitVectorTest.o differs between phase 2 and phase 3
file CXType.o differs between phase 2 and phase 3
file LiveVariables.o differs between phase 2 and phase 3
file PrintfFormatString.o differs between phase 2 and phase 3
file FormatString.o differs between phase 2 and phase 3
file CGDebugInfo.o differs between phase 2 and phase 3
file Lexer.o differs between phase 2 and phase 3
file ExprConstant.o differs between phase 2 and phase 3
file ASTContext.o differs between phase 2 and phase 3
file Expr.o differs between phase 2 and phase 3
file SemaChecking.o differs between phase 2 and phase 3
file SemaExprMember.o differs between phase 2 and phase 3
file SemaType.o differs between phase 2 and phase 3
file ParseDecl.o differs between phase 2 and phase 3
file ParseExprCXX.o differs between phase 2 and phase 3
file ParseStmt.o differs between phase 2 and phase 3
file ClangAttrEmitter.o differs between phase 2 and phase 3
file ClangDiagnosticsEmitter.o differs between phase 2 and phase 3
file MSP430InstrInfo.o differs between phase 2 and phase 3
file TargetData.o differs between phase 2 and phase 3
file AlphaISelDAGToDAG.o differs between phase 2 and phase 3
file X86ISelLowering.o differs between phase 2 and phase 3
file X86FloatingPoint.o differs between phase 2 and phase 3
file X86CodeEmitter.o differs between phase 2 and phase 3
file X86InstrInfo.o differs between phase 2 and phase 3
file X86FastISel.o differs between phase 2 and phase 3
file X86DisassemblerDecoder.o differs between phase 2 and phase 3
file X86AsmParser.o differs between phase 2 and phase 3
file X86AsmBackend.o differs between phase 2 and phase 3
file X86MCCodeEmitter.o differs between phase 2 and phase 3
file BlackfinISelLowering.o differs between phase 2 and phase 3
file BlackfinRegisterInfo.o differs between phase 2 and phase 3
file CBackend.o differs between phase 2 and phase 3
file ARMCodeEmitter.o differs between phase 2 and phase 3
file ARMFastISel.o differs between phase 2 and phase 3
file ARMBaseInstrInfo.o differs between phase 2 and phase 3
file ARMExpandPseudoInsts.o differs between phase 2 and phase 3
file ARMDisassembler.o differs between phase 2 and phase 3
file ARMInstPrinter.o differs between phase 2 and phase 3
file ARMAsmParser.o differs between phase 2 and phase 3
file SystemZISelLowering.o differs between phase 2 and phase 3
file SPUISelDAGToDAG.o differs between phase 2 and phase 3
file SPUISelLowering.o differs between phase 2 and phase 3
file MBlazeDisassembler.o differs between phase 2 and phase 3
file MBlazeAsmParser.o differs between phase 2 and phase 3
file Lint.o differs between phase 2 and phase 3
file ConstantFolding.o differs between phase 2 and phase 3
file LiveVariables.o differs between phase 2 and phase 3
file LiveIntervalAnalysis.o differs between phase 2 and phase 3
file ShrinkWrapping.o differs between phase 2 and phase 3
file PHIElimination.o differs between phase 2 and phase 3
file MachineVerifier.o differs between phase 2 and phase 3
file AsmPrinterDwarf.o differs between phase 2 and phase 3
file DwarfException.o differs between phase 2 and phase 3
file DAGCombiner.o differs between phase 2 and phase 3
file SelectionDAGBuilder.o differs between phase 2 and phase 3
file JITDwarfEmitter.o differs between phase 2 and phase 3
file SimplifyCFG.o differs between phase 2 and phase 3
file AsmLexer.o differs between phase 2 and phase 3
file Function.o differs between phase 2 and phase 3
file ConstantFold.o differs between phase 2 and phase 3
file AsmWriter.o differs between phase 2 and phase 3
file Module.o differs between phase 2 and phase 3
file Instructions.o differs between phase 2 and phase 3
file regexec.o differs between phase 2 and phase 3
file APInt.o differs between phase 2 and phase 3
file CommandLine.o differs between phase 2 and phase 3
file Path.o differs between phase 2 and phase 3
file SubtargetEmitter.o differs between phase 2 and phase 3