Compressed debug sections
2022-01-23 16:00:00 Author: maskray.me

In a Clang/GCC -g1 or -g2 build, the debug information is often much larger than text sections. Some assemblers and linkers offer an optional feature which compresses debug sections.

History

In 2007-11, Craig Silverstein added --compress-debug-sections=zlib to gold. When the option was specified, gold compressed the content of a .debug* section with zlib and changed the section name to .debug*.zlib.$uncompressed_size.

In 2008-04, Craig Silverstein changed the format and contributed "Patch to handle compressed sections" to gdb. The compressed section was renamed to .zdebug*.

In 2010-06, Cary Coutant added --compress-debug-sections to gas and added reading support to objdump and readelf.

"ELF Section Compression" has a nice summary of the .zdebug format. The article lists some problems with the format, which led to a new format standardized by the generic ELF ABI in 2012. I recommend that folks interested in the ELF format read this article. My thinking about implementing ELF features has been influenced by profound discussions like this article and other discussions on the generic ABI mailing list.

In Solaris 11.2, its linker introduced -z compress-sections to compress candidate sections.

The generic ABI format led to modification to the existing assembler and linker options in binutils. In 2015-04, H.J. Lu added --compress-debug-sections=[none|zlib|zlib-gnu|zlib-gabi] to gas and added --compress-debug-sections=[none|zlib|zlib-gnu|zlib-gabi] to GNU ld. In 2015-07, H.J. Lu added --compress-debug-sections=[none|zlib|zlib-gnu|zlib-gabi] to gold. zlib and zlib-gnu indicated the .zdebug format while zlib-gabi indicated the generic ABI format.

In 2015-07, [PATCH] Make default compression gABI compliant (milestone: binutils 2.26) changed zlib to indicate the generic ABI format.

The assembler and linker --compress-debug-sections= options are long and difficult to use. In 2014-06, Rainer Orth added -gz and -gz=[none|zlib|zlib-gnu] to GCC. For the assembly action, it passes --compress-debug-sections= to as. The link action is similar.

Usage

For object generation, specify the compiler driver option -gz to enable compression and select the default format (zlib with the Elf64_Chdr header specified by the generic ABI). In this phase, -gz implies -Wa,--compress-debug-sections=zlib.
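To make the generic ABI format concrete, here is a minimal Python sketch of how the content of a compressed section is laid out: an Elf64_Chdr header followed by a zlib stream. The helper names compress_section and uncompress_section are hypothetical, the sketch assumes a little-endian ELF64 object, and it ignores the section header itself (which would additionally carry the SHF_COMPRESSED flag).

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1  # ch_type value defined by the generic ABI

# Elf64_Chdr: ch_type (u32), ch_reserved (u32), ch_size (u64), ch_addralign (u64).
# "<" assumes a little-endian object file.
ELF64_CHDR = struct.Struct("<IIQQ")


def compress_section(content: bytes, addralign: int = 1) -> bytes:
    """Hypothetical helper: build the bytes of a gABI-style compressed section."""
    header = ELF64_CHDR.pack(ELFCOMPRESS_ZLIB, 0, len(content), addralign)
    return header + zlib.compress(content)


def uncompress_section(data: bytes) -> bytes:
    """Hypothetical helper: recover the original section content."""
    ch_type, _reserved, ch_size, _addralign = ELF64_CHDR.unpack_from(data)
    if ch_type != ELFCOMPRESS_ZLIB:
        raise ValueError(f"unsupported ch_type {ch_type}")
    content = zlib.decompress(data[ELF64_CHDR.size:])
    if len(content) != ch_size:
        raise ValueError("ch_size does not match the uncompressed size")
    return content
```

Because ch_size records the uncompressed size up front, a consumer can allocate the output buffer before running the decompressor.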

For linking, specify -gz to enable compression and select the default format. In this phase, -gz implies -Wl,--compress-debug-sections=zlib. If you need uncompression for linker input but don't need compression for the linker output, don't bother with -gz. The linker recognizes compressed input and uncompresses it automatically.

You may not want to use -gz if you combine assembly and linking in one step (gcc -g -gz a.c). The intermediate .o file is not saved, so the debug sections compressed by the assembler are immediately uncompressed by the linker, which wastes effort.

Pros and cons

The pros are obvious: compression decreases the size of relocatable object files, linked images, or both.

Compression imposes memory usage costs at many stages of development. The assembler, the linker, and the debugger need to allocate additional memory to hold the compressed or uncompressed data.

For object generation, the performance depends on the environment. On one hand, compression performs more work and creates uncompression work for the linker. On the other hand, compressed data is smaller, so I/O is faster.

For linking, compression may greatly increase link time. In a project, a .o file may go into multiple linked images. The same data may be compressed multiple times, but it is nearly impossible to reuse a compression result because every linked image has different post-relocation content.

Compressed debug sections may make the debugger significantly slower.

Linkers

Uncompressing input sections is embarrassingly parallel.
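As a rough illustration of why this parallelizes so easily, each section is an independent zlib stream, so a pool of workers can inflate them with no coordination. The following Python sketch uses a hypothetical uncompress_all helper and an arbitrary thread count; it is not how any particular linker is written.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List


def uncompress_all(sections: List[bytes], threads: int = 8) -> List[bytes]:
    """Hypothetical helper: inflate independent zlib streams concurrently.

    Each input is a complete zlib stream, so no state is shared between tasks.
    pool.map preserves the input order in its results.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(zlib.decompress, sections))
```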

Compressing input sections is a major bottleneck. I tested a -DCMAKE_BUILD_TYPE=Debug build of clang with 265MiB SHF_ALLOC sections and 920MiB uncompressed debug sections. If I specify --compress-debug-sections=zlib in a --threads=1 link, "Compress debug sections" takes 2/3 of the link time. In a --threads=8 link, "Compress debug sections" takes nearly 70% of the link time.

We basically have four choices to improve the situation.

  • tune zlib parameters
  • alternative compression format
  • more optimized library
  • parallel divide-and-conquer

The first was done. ld.lld switched to compression level 1 (Z_BEST_SPEED) for -O1 (default). The previous default (6) was known to not compress much while taking too much time.
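To see the trade-off on your own data, a small Python sketch like the following measures output size and wall time at level 1 (Z_BEST_SPEED) and level 6. The helper names measure and compare_levels are hypothetical, and the synthetic input in the demo is only a stand-in: real DWARF sections will compress differently.

```python
import time
import zlib
from typing import Dict, Tuple


def measure(data: bytes, level: int) -> Tuple[int, float]:
    """Return (compressed size in bytes, elapsed seconds) for one zlib level."""
    start = time.perf_counter()
    out = zlib.compress(data, level)
    return len(out), time.perf_counter() - start


def compare_levels(data: bytes, levels=(zlib.Z_BEST_SPEED, 6)) -> Dict[int, Tuple[int, float]]:
    """Map each compression level to its (size, seconds) measurement."""
    return {level: measure(data, level) for level in levels}


if __name__ == "__main__":
    # Synthetic stand-in for a debug section; replace with real section bytes.
    sample = b"".join(b"DW_TAG_subprogram %d\0" % i for i in range(200000))
    for level, (size, seconds) in compare_levels(sample).items():
        print(f"level {level}: {size} bytes in {seconds:.3f}s")
```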

The second choice is not really feasible. zstd is better than zlib in all metrics: compression speed, decompression speed, and compression ratio. However, it is not standardized (the only flag the generic ABI specifies is ELFCOMPRESS_ZLIB). It does not have debug producer/consumer support. Using a better format is an ecosystem issue that requires significant undertaking and stakeholder buy-in.

The third choice is difficult for a linker like lld. Importing a library into llvm-project has a significantly high barrier. A new CMake configuration has bad discoverability and benefits very few groups. libdeflate is efficient and seems to do a good job, but I do not know how much of an efficiency gain would justify an import.

The fourth choice is feasible. Rui Ueyama told me that mold optimizes --compress-debug-sections=zlib with sharding. I researched a bit. pigz has a great comment about how it leverages multi-threading: https://github.com/madler/pigz/blob/master/pigz.c. It has some sophisticated features that a linker may not need:

  • unless -i is specified, the last 32KiB (window size) from the previous chunk is used as a dictionary (deflateSetDictionary) to improve compression ratio.
  • sync marker even in the absence of --rsyncable

I created [ELF] Parallelize --compress-debug-sections=zlib to improve the algorithm.
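The sharding idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the code from the lld patch or from mold: the 1 MiB shard size, the helper names, and the thread count are all hypothetical. Each shard is compressed independently as raw deflate without a dictionary from the previous shard; every shard but the last ends with a sync flush so that its output stops on a byte boundary and the pieces can simply be concatenated between a zlib header and an Adler-32 trailer.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

SHARD_SIZE = 1 << 20  # assumed shard size (1 MiB); a tuning knob, not a standard


def _deflate_shard(shard: bytes, level: int, last: bool) -> bytes:
    # wbits=-15 selects raw deflate: no zlib header or trailer per shard.
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    out = comp.compress(shard)
    # Z_SYNC_FLUSH ends on a byte boundary without marking the final block;
    # only the last shard terminates the deflate stream with Z_FINISH.
    return out + comp.flush(zlib.Z_FINISH if last else zlib.Z_SYNC_FLUSH)


def parallel_zlib_compress(data: bytes, level: int = 1, threads: int = 8,
                           shard_size: int = SHARD_SIZE) -> bytes:
    """Hypothetical helper: produce one valid zlib stream from parallel shards."""
    shards = [data[i:i + shard_size] for i in range(0, len(data), shard_size)] or [b""]
    last_index = len(shards) - 1
    with ThreadPoolExecutor(max_workers=threads) as pool:
        bodies = list(pool.map(
            lambda job: _deflate_shard(job[1], level, job[0] == last_index),
            enumerate(shards)))
    header = b"\x78\x01"  # CMF/FLG: deflate, 32KiB window, valid check bits
    # Checksum of the uncompressed data, big-endian. Computed serially here;
    # a real implementation could combine per-shard checksums instead.
    trailer = zlib.adler32(data).to_bytes(4, "big")
    return header + b"".join(bodies) + trailer
```

The result is an ordinary zlib stream, so existing consumers can read it unchanged (zlib.decompress accepts it as is). The price is a slightly worse compression ratio, since each shard starts with an empty window.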


Source: https://maskray.me/blog/2022-01-23-compressed-debug-sections