A deep dive into Clang's source file compilation
2023-09-24 15:00:00 Author: maskray.me

Clang is a C/C++ compiler that generates LLVM IR and utilizes LLVM to generate relocatable object files. Using the classic three-stage compiler structure, the stages can be described as follows:

C/C++ =(front end)=> LLVM IR =(middle end)=> LLVM IR (optimized) =(back end)=> relocatable object file

If we follow the representation of functions and instructions, a more detailed diagram looks like this:

C/C++ =(front end)=> LLVM IR =(middle end)=> LLVM IR (optimized) =(instruction selector)=> MachineInstr =(AsmPrinter)=> MCInst =(assembler)=> relocatable object file

LLVM and Clang are designed as a collection of libraries. This post describes how different libraries work together to create the final relocatable object file. I will focus on how a function goes through the multiple compilation stages.


Compiler frontend

The compiler frontend primarily comprises the following libraries:

  • clangDriver
  • clangFrontend
  • clangParse and clangSema
  • clangCodeGen

Let's use a C++ source file as an example.

% cat a.cc
template <typename T>
T div(T a, T b) {
  return a / b;
}

__attribute__((noinline))
int foo(int a, int b, int c) {
  int s = a + b;
  return div(s, c);
}

int main() {
  return foo(3, 2, 1);
}
% clang++ -g a.cc

The entry point of the Clang executable is implemented in clang/tools/driver/. clang_main creates a clang::driver::Driver instance, calls BuildCompilation to construct a clang::driver::Compilation instance, and then calls ExecuteCompilation.

clangDriver

clangDriver parses the command line arguments, constructs compilation actions, assigns actions to tools, generates commands for these tools, and executes the commands.

You may read Compiler driver and cross compilation for additional information.

BuildCompilation
  getToolchain
  HandleImmediateArgs
  BuildInputs
  BuildActions
    handleArguments
  BuildJobs
    BuildJobsForAction
      ToolChain::SelectTool
      Clang::ConstructJob
        Clang::RenderTargetOptions
        renderDebugOptions
ExecuteCompilation
  ExecuteJobs
    ExecuteJob
      CC1Command::Execute
        cc1_main

For clang++ -g a.cc, clangDriver identifies the following phases: preprocessor, compiler (C++ to LLVM IR), backend, assembler, and linker. The first several phases can be performed by one single clang::driver::tools::Clang object (also known as Clang cc1), while the final phase requires an external program (the linker).

% clang++ -g a.cc '-###'
...
"/tmp/Rel/bin/clang-18" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-obj" ...
"/usr/bin/ld" "-pie" ... -o a.out ... /tmp/a-f58f75.o ...

cc1_main in clangDriver calls ExecuteCompilerInvocation defined in clangFrontend.
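To make the phase-to-job mapping more concrete, here is a toy model. It is only a sketch and not clangDriver's code: `Phase`, `Job`, `toolFor`, and `planJobs` are hypothetical names, and the rule that everything before linking belongs to Clang cc1 is hard-coded as an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the driver's phases; not clangDriver's real types.
enum class Phase { Preprocess, Compile, Backend, Assemble, Link };

struct Job {
  std::string tool;           // which program runs the job
  std::vector<Phase> phases;  // phases folded into this job
};

// Assumption: everything before linking is handled by "clang -cc1".
static std::string toolFor(Phase p) {
  return p == Phase::Link ? "ld" : "clang -cc1";
}

// Fold consecutive phases handled by the same tool into a single job.
std::vector<Job> planJobs(const std::vector<Phase> &phases) {
  std::vector<Job> jobs;
  for (Phase p : phases) {
    std::string tool = toolFor(p);
    if (jobs.empty() || jobs.back().tool != tool)
      jobs.push_back({tool, {}});
    jobs.back().phases.push_back(p);
  }
  return jobs;
}
```

Feeding the five phases of `clang++ -g a.cc` into `planJobs` yields two jobs, which corresponds to the two command lines printed by `-###`.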

clangFrontend

clangFrontend defines CompilerInstance, which manages various classes, including CompilerInvocation, DiagnosticsEngine, TargetInfo, FileManager, SourceManager, Preprocessor, ASTContext, ASTConsumer, and Sema.

ExecuteCompilerInvocation
  CreateFrontendAction
  ExecuteAction
    FrontendAction::BeginSourceFile
      CompilerInstance::createFileManager
      CompilerInstance::createSourceManager
      CompilerInstance::createPreprocessor
      CompilerInstance::createASTContext
      CreateWrappedASTConsumer
        BackendConsumer::BackendConsumer
          CodeGenerator::CodeGenerator
      CompilerInstance::setASTConsumer
        CodeGeneratorImpl::Initialize
          CodeGenModule::CodeGenModule
    FrontendAction::Execute
      FrontendAction::ExecuteAction => CodeGenAction::ExecuteAction
        ASTFrontendAction::ExecuteAction
          CompilerInstance::createSema
          ParseAST
    FrontendAction::EndSourceFile

In ExecuteCompilerInvocation, a FrontendAction is created based on the CompilerInstance argument and then executed. When using the -emit-obj option, the selected FrontendAction is an EmitObjAction, which is derived from CodeGenAction.

During FrontendAction::BeginSourceFile, several classes mentioned earlier are created, and a BackendConsumer is also established. The BackendConsumer serves as a wrapper around CodeGenerator, which is another derivative of ASTConsumer. Finally, in FrontendAction::BeginSourceFile, CompilerInstance::setASTConsumer is called to create a CodeGenModule object, responsible for managing an LLVM IR module.

In FrontendAction::Execute, CodeGenAction::ExecuteAction is invoked. This function handles LLVM IR input files itself; for other inputs, it calls the base function ASTFrontendAction::ExecuteAction, which, in essence, triggers the entry point of clangParse: ParseAST.
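The dispatch between these classes is a classic template-method pattern. The following is a minimal sketch with hypothetical `Toy*` classes (none of them exist in Clang). The `inputIsLLVMIR` flag is an assumption standing in for however the real code decides whether the input is an LLVM IR file.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for FrontendAction.
class ToyFrontendAction {
public:
  virtual ~ToyFrontendAction() = default;
  std::vector<std::string> log;  // records the order of calls

  // Mirrors the BeginSourceFile / Execute / EndSourceFile sequence.
  void run() {
    log.push_back("BeginSourceFile");
    ExecuteAction();  // virtual dispatch picks the derived action
    log.push_back("EndSourceFile");
  }

protected:
  virtual void ExecuteAction() = 0;
};

// Hypothetical stand-in for ASTFrontendAction: parses the source.
class ToyASTFrontendAction : public ToyFrontendAction {
protected:
  void ExecuteAction() override { log.push_back("ParseAST"); }
};

// Hypothetical stand-in for CodeGenAction.
class ToyCodeGenAction : public ToyASTFrontendAction {
public:
  bool inputIsLLVMIR = false;  // assumption: set by the caller

protected:
  void ExecuteAction() override {
    if (inputIsLLVMIR) {
      log.push_back("compile LLVM IR input");
      return;
    }
    // Not an IR file: defer to the base class, which parses the source.
    ToyASTFrontendAction::ExecuteAction();
  }
};
```

Calling `run()` on a `ToyCodeGenAction` with the default flag records BeginSourceFile, ParseAST, and EndSourceFile in that order.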

clangParse and clangSema

clangParse consumes tokens from clangLex and invokes parser actions, many of which are named Act*, defined in clangSema. clangSema performs semantic analysis and generates AST nodes.
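The division of labor can be illustrated with a tiny parser for sums of identifiers. This is a sketch under heavy simplification: `Node`, `ToySema`, and `parseSum` are hypothetical, and the two `ActOn*` methods merely imitate the naming style of Sema's parser actions.

```cpp
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Node {
  std::string text;
  std::unique_ptr<Node> lhs, rhs;
};

// Hypothetical stand-in for Sema: checks semantics and builds nodes.
struct ToySema {
  std::set<std::string> declared;
  std::vector<std::string> errors;

  std::unique_ptr<Node> ActOnIdentifier(const std::string &name) {
    if (!declared.count(name))
      errors.push_back("use of undeclared identifier '" + name + "'");
    auto n = std::make_unique<Node>();
    n->text = name;
    return n;
  }

  std::unique_ptr<Node> ActOnBinaryOp(std::unique_ptr<Node> l,
                                      std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->text = "+";
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
  }
};

// Hypothetical stand-in for the parser: it only recognizes syntax
// (identifier ('+' identifier)*) and delegates everything else to ToySema.
std::unique_ptr<Node> parseSum(const std::vector<std::string> &tokens,
                               ToySema &sema) {
  if (tokens.empty())
    return nullptr;
  std::size_t i = 0;
  auto result = sema.ActOnIdentifier(tokens[i++]);
  while (i + 1 < tokens.size() && tokens[i] == "+") {
    auto rhs = sema.ActOnIdentifier(tokens[i + 1]);
    result = sema.ActOnBinaryOp(std::move(result), std::move(rhs));
    i += 2;
  }
  return result;
}
```

The parser never inspects `declared`; diagnosing an undeclared identifier is entirely the job of the semantic actions.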

ParseAST
  ParseFirstTopLevelDecl
    Sema::ActOnStartOfTranslationUnit
    ParseTopLevelDecl
      ParseDeclarationOrFunctionDefinition
        ParseDeclOrFunctionDefInternal
          ParseDeclGroup
            ParseFunctionDefinition
              ParseFunctionStatementBody
                ParseCompoundStatementBody
                  ParseStatementOrDeclaration
                    ParseStatementOrDeclarationAfterAttributes
                      Sema::ActOnDeclStmt
                  Sema::ActOnCompoundStmt
                Sema::ActOnFinishFunctionBody
            Sema::ConvertDeclToDeclGroup
  BackendConsumer::HandleTopLevelDecl
  BackendConsumer::HandleTranslationUnit

In the end, we get a full AST (actually a misnomer as the representation is not abstract, not only about syntax, and is not a tree). ParseAST calls virtual functions HandleTopLevelDecl and HandleTranslationUnit.

clangCodeGen

BackendConsumer defined in clangCodeGen overrides HandleTopLevelDecl and HandleTranslationUnit to perform LLVM IR and machine code generation.

BackendConsumer::HandleTopLevelDecl
  CodeGenModule::EmitTopLevelDecl
    CodeGenModule::EmitGlobal
      CodeGenModule::EmitGlobalDefinition
        CodeGenModule::EmitGlobalFunctionDefinition
          CodeGenFunction::CodeGenFunction
          CodeGenFunction::GenerateCode
            CodeGenFunction::StartFunction
            CodeGenFunction::EmitFunctionBody
BackendConsumer::HandleTranslationUnit
  setupLLVMOptimizationRemarks
  EmitBackendOutput
    EmitAssemblyHelper::EmitAssembly
      EmitAssemblyHelper::RunOptimizationPipeline
        PassBuilder::buildPerModuleDefaultPipeline // There are other build*Pipeline alternatives
        MPM.run(*TheModule, MAM);
      EmitAssemblyHelper::RunCodegenPipeline
        EmitAssemblyHelper::AddEmitPasses
          LLVMTargetMachine::addPassesToEmitFile
        CodeGenPasses.run(*TheModule);

BackendConsumer::HandleTopLevelDecl generates LLVM IR for each top-level declaration. This means that Clang generates IR one function at a time.

BackendConsumer::HandleTranslationUnit invokes EmitBackendOutput to create an LLVM IR file, an assembly file, or a relocatable object file. EmitBackendOutput establishes an optimization pipeline and a machine code generation pipeline.
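The two callbacks form a simple protocol between the parser and the consumer. Below is a minimal sketch of that protocol. `ToyConsumer`, `ToyBackendConsumer`, and `toyParseAST` are hypothetical, and the "IR" is just a string per declaration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for ASTConsumer.
struct ToyConsumer {
  virtual ~ToyConsumer() = default;
  virtual void HandleTopLevelDecl(const std::string &decl) = 0;
  virtual void HandleTranslationUnit() = 0;
};

// Hypothetical stand-in for BackendConsumer: emits "IR" per declaration and
// runs the "backend" once the whole translation unit has been seen.
struct ToyBackendConsumer : ToyConsumer {
  std::vector<std::string> irFunctions;
  bool backendRan = false;

  void HandleTopLevelDecl(const std::string &decl) override {
    irFunctions.push_back("define @" + decl);
  }
  void HandleTranslationUnit() override { backendRan = true; }
};

// Hypothetical stand-in for ParseAST: one callback per top-level
// declaration, then a final callback for the translation unit.
void toyParseAST(const std::vector<std::string> &topLevelDecls,
                 ToyConsumer &consumer) {
  for (const std::string &decl : topLevelDecls)
    consumer.HandleTopLevelDecl(decl);
  consumer.HandleTranslationUnit();
}
```

The point of the design is that IR generation is incremental while optimization and machine code generation wait until the module is complete.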

Now let's explore CodeGenFunction::EmitFunctionBody. Generating IR for a variable declaration and a return statement involves the following functions, among others:

EmitFunctionBody
  EmitCompoundStmtWithoutScope
    EmitStmt
      EmitSimpleStmt
        EmitDeclStmt
          EmitDecl
            EmitVarDecl
      EmitStopPoint
      EmitReturnStmt
        EmitScalarExpr
          ScalarExprEmitter::EmitBinOps

After generating the LLVM IR, clangCodeGen proceeds to execute EmitAssemblyHelper::RunOptimizationPipeline to perform middle-end optimizations and subsequently EmitAssemblyHelper::RunCodegenPipeline to generate machine code.

Compiler middle end

EmitAssemblyHelper::RunOptimizationPipeline creates a pass manager to schedule the middle-end optimization pipeline. This pass manager executes numerous optimization passes and analyses.

The option -mllvm -print-pipeline-passes provides insight into these passes:

% clang -c -O1 -mllvm -print-pipeline-passes a.c
annotation2metadata,forceattrs,declare-to-assign,inferattrs,coro-early,...
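Conceptually, a pass manager is an ordered list of named transformations applied to a module. The sketch below is a toy version of that idea. `ToyModule`, `ToyPassManager`, and `removeDead` are hypothetical, and the real llvm::ModulePassManager additionally handles analyses, invalidation, and nesting.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A toy "module": just a list of instruction strings.
using ToyModule = std::vector<std::string>;

// Hypothetical miniature pass manager.
class ToyPassManager {
  std::vector<std::pair<std::string, std::function<void(ToyModule &)>>> passes;

public:
  void addPass(std::string name, std::function<void(ToyModule &)> run) {
    passes.emplace_back(std::move(name), std::move(run));
  }

  // Comma-separated pass names, in the spirit of -print-pipeline-passes.
  std::string printPipeline() const {
    std::string out;
    for (const auto &p : passes) {
      if (!out.empty())
        out += ',';
      out += p.first;
    }
    return out;
  }

  // Run every pass in order over the module.
  void run(ToyModule &m) const {
    for (const auto &p : passes)
      p.second(m);
  }
};

// Example pass: drop instructions whose text starts with "dead".
inline void removeDead(ToyModule &m) {
  ToyModule kept;
  for (const std::string &inst : m)
    if (inst.rfind("dead", 0) != 0)
      kept.push_back(inst);
  m = std::move(kept);
}
```

Registering `removeDead` under a made-up name such as "toy-dce" and calling `run` removes the marked instructions, while `printPipeline` returns the registered names joined by commas.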

Compiler back end

The demarcation between the middle end and the back end may not be entirely distinct. Within LLVMTargetMachine::addPassesToEmitFile, several IR passes are scheduled. It's reasonable to consider these IR passes as part of the middle end, while the phase beginning with instruction selection can be regarded as the actual back end.

Here is an overview of LLVMTargetMachine::addPassesToEmitFile:

LLVMTargetMachine::addPassesToEmitFile
  addPassesToGenerateCode
    TargetPassConfig::addISelPasses
      TargetPassConfig::addIRPasses => X86PassConfig::addIRPasses
      TargetPassConfig::addCodeGenPrepare # -O1 or above
      TargetPassConfig::addPassesToHandleExceptions
      TargetPassConfig::addISelPrepare
        TargetPassConfig::addPreISel => X86PassConfig::addPreISel
        addPass(createCallBrPass());
        addPass(createPrintFunctionPass(...)); # if -print-isel-input
        addPass(createVerifierPass());
      TargetPassConfig::addCoreISelPasses # SelectionDAG or GlobalISel
    TargetPassConfig::addMachinePasses
  LLVMTargetMachine::addAsmPrinter
  PM.add(createPrintMIRPass(Out)); // if -stop-before or -stop-after
  PM.add(createFreeMachineFunctionPass());

These IR and machine passes are scheduled by the legacy pass manager. The option -mllvm -debug-pass=Structure provides insight into these passes:

clang -c -O1 a.c -mllvm -debug-pass=Structure

Instruction selector

There are three instruction selectors: SelectionDAG, FastISel, and GlobalISel. FastISel is integrated within the SelectionDAG framework.

For most targets, FastISel is the default for clang -O0 while SelectionDAG is the default for optimized builds. However, for most AArch64 -O0 configurations, GlobalISel is the default.

SelectionDAG

See https://llvm.org/docs/WritingAnLLVMBackend.html#instruction-selector.

SelectionDAG: normal code path
LLVM IR =(visit)=> SDNode =(DAGCombiner,LegalizeTypes,DAGCombiner,Legalize,DAGCombiner,Select,Schedule)=> MachineInstr

SelectionDAG: FastISel (fast but not optimal)
LLVM IR =(FastISel)=> MachineInstr

TargetPassConfig::addCoreISelPasses
  addInstSelector(); // add an instance of a target-specific derived class of SelectionDAGISel
  addPass(&FinalizeISelID);

SelectionDAGISel::runOnMachineFunction
  TargetMachine::resetTargetOptions
  SelectionDAGISel::SelectAllBasicBlocks
    SelectionDAGISel::SelectBasicBlock
      SelectionDAGBuilder::visit
      SelectionDAGISel::CodeGenAndEmitDAG
        CurDAG->Combine(BeforeLegalizeTypes, AA, OptLevel);
        Changed = CurDAG->LegalizeTypes();
        if (Changed)
          CurDAG->Combine(AfterLegalizeTypes, AA, OptLevel);
        Changed = CurDAG->LegalizeVectors();
        if (Changed) {
          CurDAG->LegalizeTypes();
          CurDAG->Combine(AfterLegalizeVectorOps, AA, OptLevel);
        }
        CurDAG->Legalize();
        DoInstructionSelection
          PreprocessISelDAG()
          Select
            SelectCode
        Scheduler->Run(CurDAG, FuncInfo->MBB);
        Scheduler->EmitSchedule(FuncInfo->InsertPt);
          EmitNode
            CreateMachineInstr

Each backend implements a derived class of SelectionDAGISel. For example, the X86 backend implements X86DAGToDAGISel and overrides runOnMachineFunction to set up variables like X86Subtarget and then invokes the base function SelectionDAGISel::runOnMachineFunction.

SelectionDAGISel creates a SelectionDAGBuilder. For each basic block, SelectionDAGISel::SelectBasicBlock iterates over all IR instructions and calls SelectionDAGBuilder::visit on them, creating a new SDNode for each Value that becomes part of the DAG.

The initial DAG may contain types and operations that are not natively supported by the target. SelectionDAGISel::CodeGenAndEmitDAG invokes LegalizeTypes and Legalize to convert unsupported types and operations to supported ones.
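As an illustration of what "converting an unsupported type" means, consider a hypothetical target whose widest legal integer type is i32. A 64-bit add then has to be expanded into two 32-bit adds plus a carry. The sketch below computes this on plain integers; `Halves`, `split`, `join`, and `expandedAdd64` are hypothetical helpers, whereas the real LegalizeTypes rewrites DAG nodes rather than computing values.

```cpp
#include <cstdint>

// A 64-bit value split into two legal 32-bit halves.
struct Halves {
  std::uint32_t lo;
  std::uint32_t hi;
};

Halves split(std::uint64_t v) {
  return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}

std::uint64_t join(Halves h) {
  return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}

// "Expanded" 64-bit add using only 32-bit operations plus a carry bit.
Halves expandedAdd64(Halves a, Halves b) {
  Halves r;
  r.lo = a.lo + b.lo;                           // wraps modulo 2^32
  std::uint32_t carry = r.lo < a.lo ? 1u : 0u;  // unsigned overflow check
  r.hi = a.hi + b.hi + carry;
  return r;
}
```

For any inputs, `join(expandedAdd64(split(x), split(y)))` should agree with the native 64-bit sum `x + y`, including wraparound.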

For llvm.memset, the call stack may resemble the following:

SelectionDAGBuilder::visit
  SelectionDAGBuilder::visitCall
    SelectionDAGBuilder::visitIntrinsicCall
      SelectionDAG::getMemset

ScheduleDAGSDNodes::EmitSchedule emits the machine code (MachineInstrs) in the scheduled order.

FastISel, typically used for clang -O0, represents a fast path of SelectionDAG that generates less optimized machine code.

When FastISel is enabled, SelectAllBasicBlocks tries to skip SelectBasicBlock and select instructions with FastISel. However, FastISel only handles a subset of IR instructions. For unhandled instructions, SelectAllBasicBlocks falls back to SelectBasicBlock to handle the remaining instructions in the basic block.
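The fallback behavior can be sketched as follows. This is a toy model, not LLVM code: `SelectionStats`, `fastSelect`, and `selectBlock` are hypothetical, the set of opcodes the fast path supports is made up, and the traversal order within the block is simplified.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct SelectionStats {
  int fast = 0;  // instructions selected by the fast path
  int slow = 0;  // instructions selected by the fallback path
};

// Assumption: the fast path understands only these opcodes.
inline bool fastSelect(const std::string &opcode) {
  static const std::set<std::string> supported = {"add", "load", "store",
                                                  "ret"};
  return supported.count(opcode) != 0;
}

// Select one basic block: use the fast path until it hits an instruction it
// cannot handle, then hand the rest of the block to the slow path.
SelectionStats selectBlock(const std::vector<std::string> &block) {
  SelectionStats stats;
  std::size_t i = 0;
  while (i < block.size() && fastSelect(block[i])) {
    ++stats.fast;
    ++i;
  }
  stats.slow = static_cast<int>(block.size() - i);
  return stats;
}
```

A block consisting only of supported opcodes never touches the slow path, which is where the compile-time savings at -O0 come from.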

GlobalISel

GlobalISel is a newer instruction selection framework that operates on the entire function, in contrast to the basic block view of SelectionDAG. It is designed to offer better performance and modularity.

Instead of SDNode, the dedicated intermediate representation used by the SelectionDAG framework, GlobalISel uses generic MachineInstrs.

LLVM IR =(IRTranslator)=> generic MachineInstr =(Legalizer,RegBankSelect,GlobalInstructionSelect)=> MachineInstr

TargetPassConfig::addCoreISelPasses
  addIRTranslator();
  addPreLegalizeMachineIR();
  addLegalizeMachineIR();
  addPreRegBankSelect();
  addRegBankSelect();
  addPreGlobalInstructionSelect();
  addGlobalInstructionSelect();
  // Pass to reset the MachineFunction if the ISel failed.
  addInstSelector();
  addPass(&FinalizeISelID);

Machine passes

TargetPassConfig::addMachinePasses
  TargetPassConfig::addMachineSSAOptimization
  TargetPassConfig::addPreRegAlloc
  TargetPassConfig::addOptimizedRegAlloc
  TargetPassConfig::addPostRegAlloc
  addPass(createPrologEpilogInserterPass());
  TargetPassConfig::addMachineLateOptimization
  TargetPassConfig::addPreSched2
  TargetPassConfig::addPreEmitPass
  // basic block section related passes
  TargetPassConfig::addPreEmitPass2

AsmPrinter

The target-specific AsmPrinter pass converts MachineInstrs to MCInsts and emits them to an MCStreamer.

MC

Clang has the capability to output either assembly code or an object file. Generating an object file directly without involving an assembler is referred to as "direct object emission".

To provide a unified interface, MCStreamer is created to handle the emission of both assembly code and object files. The two primary subclasses of MCStreamer are MCAsmStreamer and MCObjectStreamer, responsible for emitting assembly code and machine code respectively.

LLVMAsmPrinter calls the MCStreamer API to emit assembly code or machine code.
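The value of the unified interface is that the producer does not need to know which output it is writing. Here is a minimal sketch of that idea. All `Toy*` types and `emitFunction` are hypothetical, and only two single-byte x86 encodings (`nop` = 0x90, `ret` = 0xC3) are modeled.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical miniature of MCInst: just an opcode name.
struct ToyInst {
  std::string opcode;
};

// Hypothetical miniature of MCStreamer: one interface, two outputs.
struct ToyStreamer {
  virtual ~ToyStreamer() = default;
  virtual void emitInstruction(const ToyInst &inst) = 0;
};

// Like MCAsmStreamer: pretty-prints instructions as text.
struct ToyAsmStreamer : ToyStreamer {
  std::string text;
  void emitInstruction(const ToyInst &inst) override {
    text += "\t" + inst.opcode + "\n";
  }
};

// Like MCObjectStreamer: encodes instructions into bytes.
struct ToyObjectStreamer : ToyStreamer {
  std::vector<std::uint8_t> bytes;
  void emitInstruction(const ToyInst &inst) override {
    // Only two single-byte x86 encodings are modeled here.
    if (inst.opcode == "nop")
      bytes.push_back(0x90);
    else if (inst.opcode == "ret")
      bytes.push_back(0xC3);
  }
};

// Plays the role of AsmPrinter: it does not care which streamer it talks to.
void emitFunction(ToyStreamer &out) {
  out.emitInstruction({"nop"});
  out.emitInstruction({"ret"});
}
```

Passing a `ToyAsmStreamer` to `emitFunction` accumulates text, while passing a `ToyObjectStreamer` accumulates encoded bytes, with no change to `emitFunction` itself.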

In the case of an assembly input file, LLVM creates an MCAsmParser object (LLVMMCParser) and a target-specific MCTargetAsmParser object. The MCAsmParser is responsible for tokenizing the input, parsing assembler directives, and invoking the MCTargetAsmParser to parse an instruction. Both the MCAsmParser and MCTargetAsmParser objects can call the MCStreamer API to emit assembly code or machine code.

For an instruction parsed by the MCTargetAsmParser, if the streamer is an MCAsmStreamer, the MCInst will be pretty-printed. If the streamer is an MCELFStreamer (other object file formats are similar), MCELFStreamer::emitInstToData will use ${Target}MCCodeEmitter from LLVM${Target}Desc to encode the MCInst, emit its byte sequence, and record needed relocations. An ELFObjectWriter object is used to write the relocatable object file.
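To show what "encode the MCInst and record needed relocations" amounts to, here is a toy encoder for an x86-64 `call` to an external symbol. It is a sketch only: `ToyReloc`, `ToySection`, `emitCall`, and `emitRet` are hypothetical, and I am assuming the common case where such a call is described by an R_X86_64_PLT32 relocation with addend -4.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical miniature of a relocation record (ELF RELA style).
struct ToyReloc {
  std::uint64_t offset;  // where in the section the fixup applies
  std::string symbol;    // symbol the linker must resolve
  std::string type;      // relocation type name
  std::int64_t addend;
};

struct ToySection {
  std::vector<std::uint8_t> bytes;
  std::vector<ToyReloc> relocs;
};

// Encode x86-64 "call symbol": opcode 0xE8 followed by a 32-bit displacement.
// The displacement is unknown until link time, so emit zeros and record a
// relocation. The addend is -4 because the displacement is relative to the
// end of the 4-byte field.
void emitCall(ToySection &sec, const std::string &symbol) {
  sec.bytes.push_back(0xE8);
  sec.relocs.push_back({sec.bytes.size(), symbol, "R_X86_64_PLT32", -4});
  for (int i = 0; i < 4; ++i)
    sec.bytes.push_back(0x00);  // placeholder displacement
}

// Encode x86-64 "ret": a single byte with nothing to relocate.
void emitRet(ToySection &sec) { sec.bytes.push_back(0xC3); }
```

After `emitCall(sec, "foo")` followed by `emitRet(sec)`, the section holds six bytes and one relocation at offset 1, which is the kind of information an object writer would serialize into the section contents and its relocation section.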

You may read my post Assemblers for more information about the LLVM integrated assembler.


Source: https://maskray.me/blog/2023-09-24-a-deep-dive-into-clang-source-file-compilation