Compiler driver and cross compilation
2021-03-28 16:00:00 Author: maskray.me

Compiler driver

The gcc program is a compiler driver. It invokes other programs to do the work of compiling (cc1, cc1plus), assembling (GNU as), and linking (collect2). The behavior is controlled by spec strings, which are provided by a plain-text spec file.

You can run gcc -dumpspecs to dump the built-in spec file. It is complex, but the main idea is the construction of the cc1, assembler, and linker command lines. Note: the interaction with the assembler and the linker should be clear from the output.

The g++ program is another compiler driver. It additionally links against the C++ standard library and treats .c, .h and .i files as C++ unless -x is used. The two programs are otherwise equivalent.

You can specify -specs= to override built-in directives. Here is a spec file derived from musl-gcc:

*srcdir:
/tmp/musl

*prefix:
/tmp/musl/Debug

%rename cpp_options old_cpp_options

*cpp_options:
-nostdinc -isystem %(srcdir)/arch/x86_64 -isystem %(srcdir)/arch/generic -isystem %(srcdir)/include -isystem %(prefix)/obj/include -isystem include%s %(old_cpp_options)

*cc1:
%(cc1_cpu) -isystem %(srcdir)/arch/x86_64 -isystem %(srcdir)/arch/generic -isystem %(srcdir)/include -isystem %(prefix)/obj/include -nostdinc -isystem include%s

*link_libgcc:
-L%(prefix)/lib -L .%s

*libgcc:
libgcc.a%s %:if-exists(libgcc_eh.a%s)

*startfile:
%{!shared: %(prefix)/lib/Scrt1.o} %(prefix)/lib/crti.o crtbeginS.o%s

*endfile:
crtendS.o%s %(prefix)/lib/crtn.o

*link:
-dynamic-linker %(prefix)/lib/libc.so -nostdlib %{shared:-shared} %{static:-static} %{rdynamic:-export-dynamic}

This makes it easy to try out musl on a glibc-based system.
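
To make the structure of such a file concrete, here is a minimal Python sketch of how `*name:` sections, `%rename`, and `%(name)` references fit together. It is an illustration only, not GCC's parser: `parse_specs` and `expand` are hypothetical helpers, and `%{...}`, `%s`, and `%:function(...)` are deliberately left untouched.

```python
import re

def parse_specs(text, builtin=None):
    """Parse '*name:' sections and '%rename old new' commands into a dict."""
    specs = dict(builtin or {})
    name = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("%rename"):
            _, old, new = stripped.split()
            # Assumption: renaming a spec that does not exist yields an empty one.
            specs[new] = specs.pop(old, "")
            name = None
        elif stripped.startswith("*") and stripped.endswith(":"):
            name = stripped[1:-1]
            specs[name] = ""
        elif not stripped:
            name = None  # a blank line ends the current section
        elif name is not None:
            specs[name] = (specs[name] + " " + stripped).strip()
    return specs

def expand(specs, name):
    """Recursively substitute %(other) references; other directives are kept."""
    return re.sub(r"%\((\w+)\)", lambda m: expand(specs, m.group(1)), specs[name])

SPEC = """\
*prefix:
/tmp/musl/Debug

%rename cpp_options old_cpp_options

*cpp_options:
-nostdinc -isystem %(prefix)/obj/include %(old_cpp_options)

*link_libgcc:
-L%(prefix)/lib -L .%s
"""

# "-DBUILTIN" stands in for whatever the built-in cpp_options spec contains.
specs = parse_specs(SPEC, builtin={"cpp_options": "-DBUILTIN"})
print(expand(specs, "cpp_options"))
print(expand(specs, "link_libgcc"))
```

The `%rename` line is what lets the new `cpp_options` wrap the built-in one instead of replacing it.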

While a spec file can control some behaviors of gcc, many behaviors (target preferences) are guarded by macros that are configured at build time. It is quite common for toolchain developers to experiment with different configure options.

The Clang driver is similar to the gcc program in concept but does more. You can specify clang --target=aarch64-linux-gnu to get aarch64 defaults. The driver translates the other options you specify into cc1 options. In many cases you can observe the differences between two targets by comparing their cc1 command lines. This design makes testing easy. It is recommended to test features with cc1 options and place the target-specific behavior tests under clang/test/Driver/.

static CodeGenOptions::FramePointerKind
getFramePointerKind(const ArgList &Args, const llvm::Triple &Triple) {
  Arg *A = Args.getLastArg(options::OPT_fomit_frame_pointer,
                           options::OPT_fno_omit_frame_pointer);
  bool OmitFP = A && A->getOption().matches(options::OPT_fomit_frame_pointer);
  bool NoOmitFP =
      A && A->getOption().matches(options::OPT_fno_omit_frame_pointer);
  bool OmitLeafFP = Args.hasFlag(options::OPT_momit_leaf_frame_pointer,
                                 options::OPT_mno_omit_leaf_frame_pointer,
                                 Triple.isAArch64() || Triple.isPS4CPU());
  if (NoOmitFP || mustUseNonLeafFramePointerForTarget(Triple) ||
      (!OmitFP && useFramePointerForTargetByDefault(Args, Triple))) {
    if (OmitLeafFP)
      return CodeGenOptions::FramePointerKind::NonLeaf;
    return CodeGenOptions::FramePointerKind::All;
  }
  return CodeGenOptions::FramePointerKind::None;
}

CodeGenOptions::FramePointerKind FPKeepKind = getFramePointerKind(Args, RawTriple);
const char *FPKeepKindStr = nullptr;
switch (FPKeepKind) {
case CodeGenOptions::FramePointerKind::None:
  FPKeepKindStr = "-mframe-pointer=none";
  break;
case CodeGenOptions::FramePointerKind::NonLeaf:
  FPKeepKindStr = "-mframe-pointer=non-leaf";
  break;
case CodeGenOptions::FramePointerKind::All:
  FPKeepKindStr = "-mframe-pointer=all";
  break;
}
CmdArgs.push_back(FPKeepKindStr);
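
The decision above can be restated as a small pure function, which makes the driver-option to cc1-option mapping easy to test. This is a sketch in Python, not Clang code: the three boolean parameters stand in for the target queries (`Triple.isAArch64() || Triple.isPS4CPU()`, `mustUseNonLeafFramePointerForTarget`, `useFramePointerForTargetByDefault`) and are assumptions supplied by the caller.

```python
from enum import Enum

class FramePointerKind(Enum):
    NONE = "none"
    NON_LEAF = "non-leaf"
    ALL = "all"

def last_of(args, positive, negative):
    """Return the last occurrence of either option in args, or None."""
    found = None
    for arg in args:
        if arg in (positive, negative):
            found = arg
    return found

def frame_pointer_kind(args, omit_leaf_default, must_use_fp, use_fp_by_default):
    last = last_of(args, "-fomit-frame-pointer", "-fno-omit-frame-pointer")
    omit_fp = last == "-fomit-frame-pointer"
    no_omit_fp = last == "-fno-omit-frame-pointer"
    leaf = last_of(args, "-momit-leaf-frame-pointer", "-mno-omit-leaf-frame-pointer")
    # With neither leaf option present, the target default applies.
    omit_leaf = omit_leaf_default if leaf is None else leaf == "-momit-leaf-frame-pointer"
    if no_omit_fp or must_use_fp or (not omit_fp and use_fp_by_default):
        return FramePointerKind.NON_LEAF if omit_leaf else FramePointerKind.ALL
    return FramePointerKind.NONE

def cc1_option(kind):
    """Spell the decision the way the driver passes it to cc1."""
    return f"-mframe-pointer={kind.value}"

print(cc1_option(frame_pointer_kind([], False, False, True)))
```

Keeping the target knowledge in the driver and the mechanism behind a single cc1 option is what allows a feature test to pass `-mframe-pointer=` directly without naming a target.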

Input kind and output kind

The driver recognizes the file name suffix to determine the compilation pipeline.

  • *.c: C source code which must be preprocessed
  • *.cc *.cpp: C++ source code which must be preprocessed
  • *.h: C header file to precompile
  • *.hh *.hpp: C++ header file to precompile
  • *.i: C source code which should not be preprocessed
  • *.ii: C++ source code which should not be preprocessed
  • ...
  • other: object file to be fed straight into linking

gcc a.c performs preprocessing/analysis/compiling/assembly generation/assembling/linking. gcc a.i skips preprocessing. g++ a.cc b.cc performs every phase before linking for each input file and does a link on all object files.
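
The suffix rule can be sketched as a lookup table. The Python below is illustrative only: `classify` is a hypothetical helper, the language names are the values accepted by `-x`, and the table covers just the suffixes listed above.

```python
import os

# suffix -> (language as spelled for -x, whether preprocessing is still needed)
INPUT_KINDS = {
    ".c": ("c", True),
    ".cc": ("c++", True),
    ".cpp": ("c++", True),
    ".h": ("c-header", True),
    ".hh": ("c++-header", True),
    ".hpp": ("c++-header", True),
    ".i": ("cpp-output", False),
    ".ii": ("c++-cpp-output", False),
}

def classify(filename):
    """Return (kind, needs_preprocessing); unknown suffixes go to the linker."""
    suffix = os.path.splitext(filename)[1]
    return INPUT_KINDS.get(suffix, ("object", False))

for name in ("a.c", "a.i", "b.cc", "libfoo.a"):
    print(name, classify(name))
```

Anything the table does not recognize, such as `a.o` or `libfoo.a`, falls through to the "object file" case and is passed to the link step unchanged.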

Some options can cause the driver/compiler to dispatch/do less work. The most common ones are:

  • -E: preprocess
  • -fsyntax-only/clang cc1 -emit-ast: semantic analysis
  • -S: compile, emit assembly
  • -c: compile, emit object file
  • default: link
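
As a rough model, each of these options names the last phase to run. The Python sketch below encodes that; the phase names are labels chosen for this example, and the rule that the earliest stopping point wins when several options are given is an assumption of the sketch, not something the drivers document in this form.

```python
PHASES = ["preprocess", "analyze", "compile", "assemble", "link"]

# Option -> last phase that still runs.
STOP_AFTER = {
    "-E": "preprocess",
    "-fsyntax-only": "analyze",
    "-S": "compile",
    "-c": "assemble",
}

def phases_to_run(args):
    """Return the phases performed for one source input under the given options."""
    last = len(PHASES) - 1  # default: go all the way to linking
    for arg in args:
        if arg in STOP_AFTER:
            last = min(last, PHASES.index(STOP_AFTER[arg]))
    return PHASES[: last + 1]

print(phases_to_run(["-c", "a.c"]))
```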

Clang has an integrated assembler which is enabled by default for most cases. When it is enabled, clang -c and clang -S just choose the different streamers (assembly vs object file). clang -S -fno-integrated-as may behave differently because certain features may be integrated assembler only, or only supported by very new GNU as. I added -fbinutils-version= to give users a choice not to worry about old GNU as/ld.

GCC does not have an integrated assembler. -c causes GCC to additionally feed the assembly to GNU as.

Debugging

-v and -### can print the command lines. -### skips execution.

% /tmp/RelA/bin/clang a.c '-###'
clang version 13.0.0
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/RelA/bin
"/tmp/RelA/bin/clang-13" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "--mrelax-relocations" "-disable-free" "-main-file-name" "a.c" "-mrelocation-model" "static" "-mframe-pointer=all" "-fmath-errno" "-fno-rounding-math" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-tune-cpu" "generic" "-debugger-tuning=gdb" "-fcoverage-compilation-dir=/tmp/c" "-resource-dir" "/tmp/RelA/lib/clang/13.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include" "-internal-isystem" "/tmp/RelA/lib/clang/13.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir=/tmp/c" "-ferror-limit" "19" "-fgnuc-version=4.2.1" "-fcolor-diagnostics" "-faddrsig" "-D__GCC_HAVE_DWARF2_CFI_ASM=1" "-o" "/tmp/a-04ff16.o" "-x" "c" "a.c"
"/usr/local/bin/ld" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "a.out" "/lib/x86_64-linux-gnu/crt1.o" "/lib/x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/10/crtbegin.o" "-L/usr/lib/gcc/x86_64-linux-gnu/10" "-L/usr/lib/gcc/x86_64-linux-gnu/10/../../../../lib64" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/../lib64" "-L/usr/lib/gcc/x86_64-linux-gnu/10/../../.." "-L/tmp/RelA/bin/../lib" "-L/lib" "-L/usr/lib" "/tmp/a-04ff16.o" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/lib/gcc/x86_64-linux-gnu/10/crtend.o" "/lib/x86_64-linux-gnu/crtn.o"

Include and library paths

In GCC, the built-in include paths are computed by cc1/cc1plus (global variable include_prefixes). The rules are very complex; -v prints the resulting search order.

Vanilla GCC

--enable-multi-arch is the default for native builds if glibc supports it.

Let's look at a multiarch build.

# Configured with --disable-bootstrap --enable-languages=c,c++ --disable-multilib.
% /tmp/opt/gcc-debug/bin/gcc --print-multiarch
x86_64-linux-gnu
% /tmp/opt/gcc-debug/bin/gcc --print-multi-os
../lib64
% /tmp/opt/gcc-debug/bin/gcc --print-multi-lib
.;
% /tmp/opt/gcc-debug/bin/gcc -fsyntax-only a.cc -v
...
#include "..." search starts here:
#include <...> search starts here:
/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../include/c++/11.0.1
/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../include/c++/11.0.1/x86_64-pc-linux-gnu
/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../include/c++/11.0.1/backward
/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/include
/usr/local/include/x86_64-linux-gnu # affected by sysroot, multiarch, usually nonexistent
/usr/local/include # affected by sysroot
/tmp/opt/gcc-debug/include
/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/include-fixed
/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../x86_64-pc-linux-gnu/include
/usr/include/x86_64-linux-gnu # affected by sysroot, multiarch
/usr/include # affected by sysroot
...
% /tmp/opt/gcc-debug/bin/g++ a.cc '-###' |& sed -E 's/ "?-[iIL]/\n&/g'
...
-L/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1
-L/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../../../lib64
-L/lib/x86_64-linux-gnu # affected by sysroot, multiarch
-L/lib/../lib64 # affected by sysroot
-L/usr/lib/x86_64-linux-gnu # affected by sysroot, multiarch
-L/usr/lib/../lib64 # affected by sysroot
-L/tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/../../..
# -L$sysroot/lib if sysroot is not "" or "/"
# -L$sysroot/usr/lib if sysroot is not "" or "/"
...

Some paths are relative to the GCC installation:

  • The first three search paths (include/c++) are for libstdc++. Debian's patched native gcc alters these search paths.
  • /tmp/opt/gcc-debug/lib/gcc/x86_64-pc-linux-gnu/11.0.1/include refers to GCC's private headers.

The others are relative to sysroot. I have annotated the lines in the output. Due to multiarch, $sysroot/usr/local/include and $sysroot/usr/include are preceded by their $multiarch counterparts. That is the main point: different architectures have separate include directories while they can share some common directories. However, the library directories cannot really be shared, and the common directories just cause issues. The problem in practice is that Debian has local multiarch patches which do things differently, and the differences seem entirely unnecessary to me. Read on.
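
The sysroot-relative part of the include list can be sketched in a few lines of Python. `sysroot_include_dirs` is a hypothetical helper that only models the four annotated entries; in the real output they are interleaved with the GCC-relative directories.

```python
def sysroot_include_dirs(sysroot="", multiarch=""):
    """Return the sysroot-relative include directories in search order."""
    dirs = []
    for base in ("/usr/local/include", "/usr/include"):
        if multiarch:
            # The multiarch directory is searched before the shared one.
            dirs.append(f"{sysroot}{base}/{multiarch}")
        dirs.append(f"{sysroot}{base}")
    return dirs

print(sysroot_include_dirs(multiarch="x86_64-linux-gnu"))
print(sysroot_include_dirs(sysroot="/opt/sysroot"))  # hypothetical sysroot, no multiarch
```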

Let's see the output of a vanilla --disable-multi-arch native compiler. The sysroot directory should be clear from the output.

# Configured with --disable-multi-arch.
% many=/tmp/glibc-many
% $many/install/compilers/x86_64-linux-gnu/bin/x86_64-glibc-linux-gnu-g++ --print-multiarch

% $many/install/compilers/x86_64-linux-gnu/bin/x86_64-glibc-linux-gnu-g++ --print-multi-os
../lib64
% $many/install/compilers/x86_64-linux-gnu/bin/x86_64-glibc-linux-gnu-g++ --print-multi-lib
.;
32;@m32
x32;@mx32
% $many/install/compilers/x86_64-linux-gnu/bin/x86_64-glibc-linux-gnu-g++ -fsyntax-only a.cc -v
...
#include "..." search starts here:
#include <...> search starts here:
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/../../../../x86_64-glibc-linux-gnu/include/c++/10.2.1
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/../../../../x86_64-glibc-linux-gnu/include/c++/10.2.1/x86_64-glibc-linux-gnu
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/../../../../x86_64-glibc-linux-gnu/include/c++/10.2.1/backward
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/include
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/sysroot/usr/local/include
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/include-fixed
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/../../../../x86_64-glibc-linux-gnu/include
/tmp/glibc-many/install/compilers/x86_64-linux-gnu/sysroot/usr/include
...
% $many/install/compilers/x86_64-linux-gnu/bin/x86_64-glibc-linux-gnu-g++ a.cc '-###' |& sed -E 's/ "?-[iIL]/\n&/g'
...
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/../../../../x86_64-glibc-linux-gnu/lib/../lib64
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/sysroot/lib/../lib64
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/sysroot/usr/lib/../lib64
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/lib/gcc/x86_64-glibc-linux-gnu/10.2.1/../../../../x86_64-glibc-linux-gnu/lib
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/sysroot/lib
-L/tmp/glibc-many/install/compilers/x86_64-linux-gnu/sysroot/usr/lib

Let's see the output of a vanilla --disable-multi-arch cross compiler.

% many=/tmp/glibc-many
% $many/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-g++ --print-multiarch

% $many/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-g++ --print-multi-os
../lib64
% $many/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-g++ --print-multi-lib
.;
% $many/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-g++ -fsyntax-only a.cc -v
...
#include "..." search starts here:
#include <...> search starts here:
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/../../../../aarch64-glibc-linux-gnu/include/c++/10.2.1
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/../../../../aarch64-glibc-linux-gnu/include/c++/10.2.1/aarch64-glibc-linux-gnu
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/../../../../aarch64-glibc-linux-gnu/include/c++/10.2.1/backward
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/include
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot/usr/local/include
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/include-fixed
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/../../../../aarch64-glibc-linux-gnu/include
/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot/usr/include
...
% $many/install/compilers/aarch64-linux-gnu/bin/aarch64-glibc-linux-gnu-g++ a.cc '-###' |& sed -E 's/ "?-[iIL]/\n&/g'
...
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/../../../../aarch64-glibc-linux-gnu/lib/../lib64
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot/lib/../lib64
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot/usr/lib/../lib64
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/lib/gcc/aarch64-glibc-linux-gnu/10.2.1/../../../../aarch64-glibc-linux-gnu/lib
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot/lib
-L/tmp/glibc-many/install/compilers/aarch64-linux-gnu/sysroot/usr/lib
...
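
Comparing dumps like these by eye gets tedious, so a small script helps. The Python sketch below extracts the `#include <...>` search list from `-v` output and the `-L` directories from a single linker command line. Both function names are made up for this example; the only things assumed about the tools are the two marker lines GCC prints around the search list and shell-style quoting of the command line.

```python
import shlex

def include_dirs_from_v(output):
    """Collect the directories listed after '#include <...> search starts here:'."""
    dirs, active = [], False
    for line in output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            active = True
        elif line.startswith("End of search list."):
            break
        elif active and line.strip():
            dirs.append(line.strip())
    return dirs

def lib_dirs(command_line):
    """Collect -L directories from one linker command line."""
    return [arg[2:] for arg in shlex.split(command_line)
            if arg.startswith("-L") and len(arg) > 2]

V_OUTPUT = """\
#include "..." search starts here:
#include <...> search starts here:
 /usr/include/c++/10
 /usr/include
End of search list.
"""
print(include_dirs_from_v(V_OUTPUT))
print(lib_dirs('"/usr/bin/ld" "-L/lib" "-L/usr/lib" "a.o" "-lc"'))
```

Running both functions on the output of two compilers and diffing the resulting lists shows at a glance which directories differ.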

Debian

Debian likes to have different opinions. (ⓛ ω ⓛ *) Debian uses multiarch but its native compiler has different search paths.

Because MULTILIB_OSDIRNAMES is patched (with gcc-multilib-multiarch.diff or gcc-multiarch.diff), the upstream ../lib64 becomes ../lib.

% g++ --print-multiarch
x86_64-linux-gnu
% g++ --print-multi-os
../lib
% g++ --print-multi-lib
.;
32;@m32
x32;@mx32
% g++ -fsyntax-only a.cc -v
...
ignoring duplicate directory "/usr/include/x86_64-linux-gnu/c++/10"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/10/include-fixed"
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/10/../../../../x86_64-linux-gnu/include"
#include "..." search starts here:
#include <...> search starts here:
/usr/include/c++/10
/usr/include/x86_64-linux-gnu/c++/10
/usr/include/c++/10/backward
/usr/lib/gcc/x86_64-linux-gnu/10/include
/usr/local/include/x86_64-linux-gnu # affected by sysroot, usually nonexistent
/usr/local/include # affected by sysroot
/usr/include/x86_64-linux-gnu # affected by sysroot
/usr/include # affected by sysroot
...
% g++ a.cc '-###' |& sed -E 's/ "?-[iIL]/\n&/g'
...
-L/usr/lib/gcc/x86_64-linux-gnu/10
-L/usr/lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-gnu
-L/usr/lib/gcc/x86_64-linux-gnu/10/../../../../lib
-L/lib/x86_64-linux-gnu
-L/lib/../lib
-L/usr/lib/x86_64-linux-gnu
-L/usr/lib/../lib
-L/usr/lib/gcc/x86_64-linux-gnu/10/../../..
...

Cross compiler. The libstdc++ search paths are not altered.

% aarch64-linux-gnu-g++ --print-multiarch
aarch64-linux-gnu
% aarch64-linux-gnu-g++ --print-multi-os
../lib
% aarch64-linux-gnu-g++ --print-multi-lib
.;
% aarch64-linux-gnu-g++ -fsyntax-only a.cc -v
...
ignoring nonexistent directory "/usr/lib/gcc-cross/aarch64-linux-gnu/10/include-fixed"
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/c++/10
/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/c++/10/aarch64-linux-gnu
/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/c++/10/backward
/usr/lib/gcc-cross/aarch64-linux-gnu/10/include
/usr/local/include/aarch64-linux-gnu # affected by sysroot, usually nonexistent
/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include # Debian specific, g++-multiarch-incdir.diff
/usr/include/aarch64-linux-gnu # affected by sysroot, usually nonexistent
/usr/include # affected by sysroot
...
% aarch64-linux-gnu-g++ a.cc '-###' |& sed -E 's/ "?-[iIL]/\n&/g'
...
-L/usr/lib/gcc-cross/aarch64-linux-gnu/10
-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/lib/../lib
-L/lib/aarch64-linux-gnu # affected by sysroot, Debian specific
-L/lib/../lib # affected by sysroot
-L/usr/lib/aarch64-linux-gnu # affected by sysroot, Debian specific
-L/usr/lib/../lib # affected by sysroot
-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/lib
...

I hope the various dumps give you some idea of what multiarch and multi-os are. multilib is for integrating -m32/-mx32 functionality into an x86-64 targeted compiler, and similar situations. This section does not touch multilib.

multi-os looks very broken to me. The ../lib64 or ../lib makes no sense.
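
One way to see the redundancy is to normalize the `-L` entries lexically. The Python sketch below does this for the Debian native list shown earlier; `unique_normalized` is a hypothetical helper, and `posixpath.normpath` only collapses `..` textually, so symlinks (for example a merged /usr) are ignored.

```python
import posixpath

def unique_normalized(paths):
    """Lexically normalize each path and drop later duplicates, keeping order."""
    return list(dict.fromkeys(posixpath.normpath(p) for p in paths))

DEBIAN_NATIVE = [
    "/usr/lib/gcc/x86_64-linux-gnu/10",
    "/usr/lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-gnu",
    "/usr/lib/gcc/x86_64-linux-gnu/10/../../../../lib",
    "/lib/x86_64-linux-gnu",
    "/lib/../lib",
    "/usr/lib/x86_64-linux-gnu",
    "/usr/lib/../lib",
    "/usr/lib/gcc/x86_64-linux-gnu/10/../../..",
]

for path in unique_normalized(DEBIAN_NATIVE):
    print(path)
```

With ../lib as the multi-os suffix, several of the eight entries normalize to the same directory, which is why the suffix adds noise rather than information on such a system.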

Arch Linux

aarch64-linux-gnu-gcc --print-sysroot prints /usr/aarch64-linux-gnu. Compilers for different architectures have disjoint include paths. This can cause some redundancy.

Clang

In Clang, the include paths are computed by the driver.

You can specify --target= to ask for cross compiling. Clang will happily detect system GCC installations and add appropriate include and library paths.

Note that Clang before 13.0.0 incorrectly assumes that cross gcc follows the Debian native gcc behavior.

% ~/Stable/bin/clang++ --target=aarch64-linux-gnu '-###' a.cc |& sed -E 's/ "?-[iIL]/\n&/g'
...
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/c++/10"
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/aarch64-linux-gnu/c++/10" # nonexistent
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/aarch64-linux-gnu/c++/10" # nonexistent, strangely duplicated
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/c++/10/backward"
"-internal-isystem" "/usr/local/include"
"-internal-isystem" "/usr/local/google/home/maskray/Stable/lib/clang/13.0.0/include"
"-internal-externc-isystem" "/include"
"-internal-externc-isystem" "/usr/include"
...
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../aarch64-linux-gnu"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../lib64"
"-L/lib/aarch64-linux-gnu"
"-L/lib/../lib64"
"-L/usr/lib/aarch64-linux-gnu"
"-L/usr/lib/../lib64"
"-L/usr/lib/aarch64-linux-gnu/../../lib64"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/lib"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../.."
"-L/usr/local/google/home/maskray/Stable/bin/../lib"
"-L/lib"
"-L/usr/lib"

Note that "-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/aarch64-linux-gnu/c++/10" refers to a nonexistent directory, so compiling a file that includes C++ headers leads to an error like this:

/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../include/c++/10/iostream:38:10: fatal error: 'bits/c++config.h' file not found
#include <bits/c++config.h>
^~~~~~~~~~~~~~~~~~

I have fixed the problem in 13.0.0 and cleaned up unneeded search paths. My guideline is to make Clang able to pick up both vanilla and Debian gcc libstdc++/start files.
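
The failure mode is simply an include search that never reaches the directory holding the target-specific headers. The Python sketch below reproduces it in a temporary directory; `find_header` and `demo` are hypothetical helpers, and the directory names are shortened versions of the Debian cross layout shown above.

```python
import os
import tempfile

def find_header(header, include_dirs):
    """Return the first existing include_dir/header, like an #include <...> lookup."""
    for directory in include_dirs:
        candidate = os.path.join(directory, header)
        if os.path.isfile(candidate):
            return candidate
    return None

def demo():
    with tempfile.TemporaryDirectory() as root:
        # Where the target-specific libstdc++ headers actually live.
        good = os.path.join(root, "aarch64-linux-gnu/include/c++/10/aarch64-linux-gnu")
        os.makedirs(os.path.join(good, "bits"))
        open(os.path.join(good, "bits", "c++config.h"), "w").close()
        # The native-style directory that was searched instead; never created here.
        bad = os.path.join(root, "include/aarch64-linux-gnu/c++/10")
        found_before = find_header("bits/c++config.h", [bad]) is not None
        found_after = find_header("bits/c++config.h", [good]) is not None
        return found_before, found_after

print(demo())
```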

 "-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/c++/10"
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/c++/10/aarch64-linux-gnu"
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include/c++/10/backward"
"-internal-isystem" "/usr/local/include"
"-internal-isystem" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/include"
"-internal-isystem" "/tmp/RelA/lib/clang/13.0.0/include"
"-internal-externc-isystem" "/include"
"-internal-externc-isystem" "/usr/include"
...
"/usr/bin/aarch64-linux-gnu-ld" "-EL" "--eh-frame-hdr" "-m" "aarch64linux" "-dynamic-linker" "/lib/ld-linux-aarch64.so.1" "-o" "a.out" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/lib/crt1.o" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/lib/crti.o" "/usr/lib/gcc-cross/aarch64-linux-gnu/10/crtbegin.o"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../lib64"
"-L/lib/aarch64-linux-gnu" # affected by sysroot, multiarch
"-L/lib/../lib64" # affected by sysroot
"-L/usr/lib/aarch64-linux-gnu" # affected by sysroot, multiarch
"-L/usr/lib/../lib64" # affected by sysroot
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../../../aarch64-linux-gnu/lib"
"-L/usr/lib/gcc-cross/aarch64-linux-gnu/10/../../.."
"-L/tmp/RelA/bin/../lib" # So that -lc++ -lc++abi can pick up libc++/libc++abi built together with clang
"-L/lib" # affected by sysroot, always added (unlike gcc)
"-L/usr/lib" # affected by sysroot, always added (unlike gcc)

In Clang, --sysroot= additionally changes where Clang detects GCC installations ($sysroot and $sysroot/usr). So the include/library paths for libstdc++/crtbegin/crtend will change as well.

You may specify --gcc-toolchain= to override the prefix used to detect GCC installations.

If you link a program with a compiler driver (clang/gcc) in a standard way (not -nostdlib), the following components are usually on the linker command line.

  • crt1.o (glibc/musl): -no-pie/-pie/-static-pie
    • crt1.o: -no-pie, -static
    • Scrt1.o: -pie
    • rcrt1.o: -static-pie
    • gcrt1.o: -pg
  • crti.o (glibc/musl)
  • crtbegin.o
    • crtbegin.o: -no-pie
    • crtbeginS.o: -pie, -shared, -static-pie
    • crtbeginT.o: -static
  • user input
  • -lstdc++
  • Some combination of -lc -lgcc_s -lgcc -lgcc_eh
  • crtend.o
    • crtend.o: -no-pie, -static
    • crtendS.o: -pie, -shared, -static-pie
  • crtn.o (glibc/musl)
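
The ordering can be sketched for the two most common modes, -no-pie and -pie, following the order seen in the clang -### output earlier (crtend.o comes before crtn.o). This Python model is a simplification: `link_inputs` is a hypothetical helper, the library list is reduced to a fixed placeholder, and -static, -static-pie and -shared are left out.

```python
def link_inputs(user_inputs, pie=False, cxx=False):
    """Return the startup files, user inputs and libraries in command-line order."""
    crt1 = "Scrt1.o" if pie else "crt1.o"
    begin = "crtbeginS.o" if pie else "crtbegin.o"
    end = "crtendS.o" if pie else "crtend.o"
    libs = (["-lstdc++"] if cxx else []) + ["-lgcc", "-lgcc_s", "-lc"]
    # crti.o and crtn.o bracket everything that contributes to .init/.fini.
    return [crt1, "crti.o", begin, *user_inputs, *libs, end, "crtn.o"]

print(" ".join(link_inputs(["a.o"])))
print(" ".join(link_inputs(["a.o", "b.o"], pie=True, cxx=True)))
```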

crt1.o

This file is only used by executables.

In glibc, the file is -r linked from csu/start.c csu/abi-note.c csu/init.c csu/static-reloc.c. It used to call __libc_start_main with arguments main, __libc_csu_init, __libc_csu_fini (defined by libc_nonshared.a(elf-init.oS)). From BZ #23323 onwards, on most architectures, start.S:_start calls __libc_start_main with two zero arguments instead, and __libc_csu_init and __libc_csu_fini are moved into csu/libc-start.c.

In musl, this file calls __libc_start_main with main, _init, and _fini.

crti.o/crtn.o

crti.o defines _init in the .init section and _fini in the .fini section. The defined _init (_fini) is a fragment which is expected to be concatenated with other files and finally crtn.o to get the full definition.

In glibc x86-64, sysdeps/x86_64/crti.S and sysdeps/x86_64/crtn.S provide the definitions:

# crti.o
Disassembly of section .init:

0000000000000000 <_init>:
0: 48 83 ec 08 subq $8, %rsp
4: 48 8b 05 00 00 00 00 movq (%rip), %rax # b <_init+0xb>
0000000000000007: R_X86_64_REX_GOTPCRELX __gmon_start__-0x4
b: 48 85 c0 testq %rax, %rax
e: 74 02 je 0x12 <_init+0x12>
10: ff d0 callq *%rax

Disassembly of section .fini:

0000000000000000 <_fini>:
0: 48 83 ec 08 subq $8, %rsp

# crtn.o
Disassembly of section .init:

0000000000000000 <.init>:
0: addq $8, %rsp
4: retq

Disassembly of section .fini:

0000000000000000 <.fini>:
0: addq $8, %rsp
4: retq
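
The fragment scheme relies on the linker concatenating same-named input sections in command-line order. The Python sketch below models that with lists of instruction strings. `link_section` is a hypothetical helper, `middle.o` and its hook are invented for illustration, and the `__gmon_start__` check from crti.o is left out for brevity.

```python
def link_section(name, objects):
    """Concatenate each object's contribution to section `name` in link order."""
    merged = []
    for _object_name, sections in objects:
        merged.extend(sections.get(name, []))
    return merged

CRTI = ("crti.o", {".init": ["subq $8, %rsp"], ".fini": ["subq $8, %rsp"]})
CRTN = ("crtn.o", {".init": ["addq $8, %rsp", "retq"],
                   ".fini": ["addq $8, %rsp", "retq"]})
# Hypothetical object that contributes a call to the middle of _init.
MIDDLE = ("middle.o", {".init": ["callq hypothetical_hook"]})

for insn in link_section(".init", [CRTI, MIDDLE, CRTN]):
    print(insn)
```

If crtn.o were placed before `middle.o`, the `retq` would come first and the hook would never run, which is one reason the scheme is fragile.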

crti.o calls __gmon_start__ (gmon profiling system) if defined. This is used by gcc -pg.

The linker defines DT_INIT if _init (default value for -init) is defined, and DT_FINI if _fini is defined.

The section fragment idea is fragile. On RISC-V, DT_INIT/_init is not used. crti.o and crtn.o have no code/.init/.fini.

crtbegin.o/crtend.o

libgcc/crtstuff.c

If __LIBGCC_INIT_ARRAY_SECTION_ASM_OP__ is not defined and __LIBGCC_INIT_SECTION_ASM_OP__ is defined (HAVE_INITFINI_ARRAY_SUPPORT is 1 in $builddir/gcc/auto-host.h),

  • crtend.o defines a .init section which calls __do_global_ctors_aux. __do_global_ctors_aux calls the static constructors in the .ctors section.
  • crtbegin.o defines a .fini section which calls __do_global_dtors_aux. __do_global_dtors_aux calls the static destructors in the .dtors section.
  • crtbegin.o defines .ctors and .dtors with a single -1 value.
  • crtend.o defines .ctors and .dtors with a single 0 value.
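
The -1 and 0 values act as sentinels around the function pointers contributed by the other object files. The Python sketch below models the two walks over a list laid out as [-1, f1, ..., fn, 0]. The backward walk for .ctors and the forward walk for .dtors follow my reading of crtstuff.c, so treat the direction as an assumption of this sketch.

```python
def do_global_ctors_aux(ctors):
    """Walk backwards from the entry before the 0 terminator until the -1 marker."""
    i = len(ctors) - 2
    while ctors[i] != -1:
        ctors[i]()
        i -= 1

def do_global_dtors_aux(dtors):
    """Walk forwards from the entry after the -1 marker until the 0 terminator."""
    i = 1
    while dtors[i] != 0:
        dtors[i]()
        i += 1

order = []
ctors = [-1, lambda: order.append("ctor_a"), lambda: order.append("ctor_b"), 0]
dtors = [-1, lambda: order.append("dtor_a"), lambda: order.append("dtor_b"), 0]
do_global_ctors_aux(ctors)
do_global_dtors_aux(dtors)
print(order)
```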

On modern distributions, __LIBGCC_INIT_ARRAY_SECTION_ASM_OP__ is 0 and crtend.o contains no .text/.ctors/.dtors.

glibc startup sequence

The call chains below are flattened into lists.

Dynamically linked executable

In rtld:

  • sysdeps/x86_64/dl-machine.h:_start
  • elf/rtld.c:_dl_start
  • sysdeps/x86_64/dl-machine.h:_dl_start_user
  • elf/dl-init.c:_dl_init
  • Jump to the main executable e_entry

In the main executable:

  • sysdeps/x86_64/start.S:_start
  • csu/libc-start.c:__libc_start_main, the SHARED branch
  • (if ELF_INITFINI is defined) Run DT_INIT
  • Run DT_INIT_ARRAY
  • Run main
  • Run exit
  • stdlib/exit.c:__run_exit_handlers

Statically linked executable

In the main executable:

  • sysdeps/x86_64/start.S:_start
  • csu/libc-start.c:__libc_start_main, the !SHARED branch
  • _dl_relocate_static_pie
  • ARCH_SETUP_IREL
  • ARCH_SETUP_TLS
  • csu/libc-start.c:call_init
    • Run [__preinit_array_start, __preinit_array_end)
    • (if ELF_INITFINI is defined) Run _init
    • Run [__init_array_start, __init_array_end)
  • Run main
  • Run exit
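
The tail of this sequence (call_init, main, exit) can be sketched as a plain function. The Python below is a model, not glibc code: `run_static_startup` and its parameters are hypothetical, relocation and TLS setup are omitted, and exit is reduced to running registered handlers in reverse order of registration.

```python
def run_static_startup(main, preinit_array=(), init=None, init_array=(),
                       exit_handlers=()):
    """Model call_init, main and exit for a statically linked executable."""
    for fn in preinit_array:   # [__preinit_array_start, __preinit_array_end)
        fn()
    if init is not None:       # only when ELF_INITFINI is defined
        init()
    for fn in init_array:      # [__init_array_start, __init_array_end)
        fn()
    status = main()
    for fn in reversed(exit_handlers):
        fn()
    return status

trace = []
status = run_static_startup(
    main=lambda: trace.append("main") or 0,
    preinit_array=[lambda: trace.append("preinit")],
    init=lambda: trace.append("_init"),
    init_array=[lambda: trace.append("init_array[0]"),
                lambda: trace.append("init_array[1]")],
    exit_handlers=[lambda: trace.append("handler registered first"),
                   lambda: trace.append("handler registered last")],
)
print(trace, status)
```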

musl startup sequence

For a dynamically linked executable, the rtld process:

  • arch/x86_64/crt_arch.h:_dlstart
  • ldso/dlstart.c:_dlstart_c
  • ldso/dynlink.c:__dls2 relocate rtld
  • ldso/dynlink.c:__dls2b setup early thread pointer
  • ldso/dynlink.c:__dls3
  • Jump to the main executable e_entry

In the main executable:

  • arch/x86_64/crt_arch.h:_start
  • crt/crt1.c:_start_c
  • src/env/__libc_start_main.c:__libc_start_main
  • __init_libc initialize auxv/TLS/stack protector/etc
  • libc_start_main_stage2
  • __libc_start_init
  • exit(main(argc, argv, envp));

__libc_start_init has different behaviors for dynamically and statically linked executables. For a dynamically linked executable: it runs DT_INIT (unless NO_LEGACY_INITFINI) then DT_INIT_ARRAY. Note: libc.so has a dummy _init.

GCC

  • exec_prefixes
  • startfile_prefixes
  • include_prefixes

-B $prefix adds $prefix to all three lists.

for_each_path
  if (!skip_multi_dir) {
    for (pl = paths->plist; pl; pl = pl->next) {
      Try pl->prefix + "x86_64-linux-gnu/10/"
      if (pl->require_machine_suffix == 2) { // rare
        Try pl->prefix + "x86_64-linux-gnu/"
      }
      if (pl->require_machine_suffix == 0 && multiarch_dir) { // rare
        Try pl->prefix + "x86_64-linux-gnu/"
      }
      if (!pl->os_multilib || !skip_multi_os_dir) {
        Try pl->prefix + "x86_64-linux-gnu/"
      }
    }
  }

Debian

https://wiki.debian.org/Multiarch/Tuples

% gcc -print-multi-lib
.;
32;@m32
x32;@mx32

Clang


Source: http://maskray.me/blog/2021-03-28-compiler-driver-and-cross-compilation