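What is a calling convention?
In computer science, a calling convention is a scheme that defines how a subroutine receives parameters from its caller and how it returns a result.
The assembly code below uses interrupt 0x10 to print the string pointed to by BP to the display; the string length is stored in CX. The interrupt handler reads its arguments from registers such as CX and BP, writes the string Hello World into video memory, and the text then appears on the screen.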
hello: DB "Hello World" ; store string
start:
MOV AH, 0x13 ; select BIOS video function 0x13 (write string) in AH
MOV CX, 11 ; move length of string in cx
MOV BX, 0 ; mov 0 to bx, so we can move it to es
MOV ES, BX ; move segment start of string to es, 0
MOV BP, OFFSET hello ; move start offset of string in bp
MOV DL, 0 ; start writing from col 0
int 0x10 ; BIOS interrupt
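This code uses Intel syntax. The other common notation is AT&T syntax, and the Plan 9 syntax used by Go closely resembles AT&T.
// Source: 逆向工程权威指南
Intel syntax: <instruction> <destination operand> <source operand>
AT&T syntax: <instruction> <source operand> <destination operand>
One way to keep them straight: with Intel syntax, imagine an equals sign between the two operands; with AT&T syntax, imagine a right arrow (→) between them.
AT&T: register names are prefixed with a percent sign (%) and immediate numbers with a dollar sign ($).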
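The operand-order rule can be made concrete with a toy converter. The sketch below is only an illustration: the function names toATT and decorate are hypothetical, it handles nothing beyond bare registers and decimal immediates, and it does not add AT&T size suffixes such as the "l" in "movl".

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// toATT rewrites a two-operand instruction from Intel order
// ("op dst, src") into AT&T order ("op src, dst").
// Anything that is not "op a, b" is returned unchanged.
func toATT(intel string) string {
	fields := strings.SplitN(strings.TrimSpace(intel), " ", 2)
	if len(fields) != 2 {
		return intel // no operands, nothing to reorder
	}
	operands := strings.Split(fields[1], ",")
	if len(operands) != 2 {
		return intel
	}
	dst := decorate(strings.TrimSpace(operands[0]))
	src := decorate(strings.TrimSpace(operands[1]))
	return fmt.Sprintf("%s %s, %s", fields[0], src, dst)
}

// decorate adds the AT&T prefix: "$" for immediates, "%" for registers.
func decorate(operand string) string {
	if operand != "" && unicode.IsDigit(rune(operand[0])) {
		return "$" + operand
	}
	return "%" + operand
}

func main() {
	fmt.Println(toATT("mov eax, 1"))   // mov $1, %eax
	fmt.Println(toATT("add eax, edx")) // add %edx, %eax
}
```

Reading the results with the mnemonic above: Intel "mov eax, 1" reads as eax = 1, while AT&T "mov $1, %eax" reads as 1 → eax.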
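Calling conventions for high-level languages vary by language (C, Go), by architecture (x86, x86_64), and by compiler version (go1.16, go1.17); arguments and return values may travel in registers or on the stack.
Whatever the language, the approach to analyzing its calling convention is the same: compile the source to assembly, then examine how arguments are passed during a call.
This article covers the calling conventions of C/C++ and Go. C/C++ sits closer to the hardware, so it makes the mechanics easier to see.
C source is compiled with gcc and C++ source with g++; g++ is the C++ driver of GCC.
Compile the source file 1.cc into the assembly file 1.s.
• -m32/-m64 generates 32-bit or 64-bit code
• -S emits assembly instead of an object file
• -fverbose-asm adds extra commentary to the assembly, including the corresponding source statements
• -masm=intel gcc emits AT&T syntax by default; use this flag to get Intel syntax instead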
gcc -m32 -S -fverbose-asm 1.cc -o 1.s
// 1.cc
int sum(int a, int b) {
    return a + b;
}

int main() {
    int c = sum(1, 2);
    return c;
}
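The 32-bit assembly follows, in AT&T syntax. This article focuses on the calling sequence, so later listings omit unrelated content: the many directives (lines beginning with a dot) can be skipped, and so can most comments.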
.file "1.cc"
# GNU C++14 (Debian 10.2.1-6) version 10.2.1 20210110 (x86_64-linux-gnu)
# compiled by GNU C version 10.2.1 20210110, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.0, isl version isl-0.23-GMP
# GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
# options passed: -imultilib 32 -imultiarch i386-linux-gnu -D_GNU_SOURCE
# 1.cc -m32 -mtune=generic -march=i686 -auxbase-strip 1.s -fverbose-asm
# -fasynchronous-unwind-tables
# options enabled: -fPIC -fPIE -faggressive-loop-optimizations
# -fallocation-dce -fasynchronous-unwind-tables -fauto-inc-dec
# -fdelete-null-pointer-checks -fdwarf2-cfi-asm -fearly-inlining
# -feliminate-unused-debug-symbols -feliminate-unused-debug-types
# -fexceptions -ffp-int-builtin-inexact -ffunction-cse -fgcse-lm
# -fgnu-unique -fident -finline-atomics -fipa-stack-alignment
# -fira-hoist-pressure -fira-share-save-slots -fira-share-spill-slots
# -fivopts -fkeep-static-consts -fleading-underscore -flifetime-dse
# -fmath-errno -fmerge-debug-strings -fpcc-struct-return -fpeephole -fplt
# -fprefetch-loop-arrays -fsched-critical-path-heuristic
# -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
# -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
# -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fschedule-fusion
# -fsemantic-interposition -fshow-column -fshrink-wrap-separate
# -fsigned-zeros -fsplit-ivs-in-unroller -fssa-backprop -fstdarg-opt
# -fstrict-volatile-bitfields -fsync-libcalls -ftrapping-math -ftree-cselim
# -ftree-forwprop -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
# -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop
# -ftree-reassoc -ftree-scev-cprop -funit-at-a-time -funwind-tables
# -fverbose-asm -fzero-initialized-in-bss -m32 -m80387 -m96bit-long-double
# -malign-stringops -mavx256-split-unaligned-load
# -mavx256-split-unaligned-store -mfancy-math-387 -mfp-ret-in-387 -mglibc
# -mieee-fp -mlong-double-80 -mno-red-zone -mno-sse4 -mpush-args -msahf
# -mstv -mtls-direct-seg-refs -mvzeroupper
.text
.globl _Z3sumii
.type _Z3sumii, @function
_Z3sumii:
.LFB0:
.cfi_startproc
pushl %ebp #
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp #,
.cfi_def_cfa_register 5
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# 1.cc:3: return a + b;
movl 8(%ebp), %edx # a, tmp85
movl 12(%ebp), %eax # b, tmp86
addl %edx, %eax # tmp85, _3
# 1.cc:4: }
popl %ebp #
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size _Z3sumii, .-_Z3sumii
.globl main
.type main, @function
main:
.LFB1:
.cfi_startproc
pushl %ebp #
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp #,
.cfi_def_cfa_register 5
subl $16, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# 1.cc:7: int c = sum(1, 2);
pushl $2 #
pushl $1 #
call _Z3sumii #
addl $8, %esp #,
movl %eax, -4(%ebp) # tmp85, c
# 1.cc:8: return c;
movl -4(%ebp), %eax # c, _4
# 1.cc:9: }
leave
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE1:
.size main, .-main
.section .text.__x86.get_pc_thunk.ax,"axG",@progbits,__x86.get_pc_thunk.ax,comdat
.globl __x86.get_pc_thunk.ax
.hidden __x86.get_pc_thunk.ax
.type __x86.get_pc_thunk.ax, @function
__x86.get_pc_thunk.ax:
.LFB2:
.cfi_startproc
movl (%esp), %eax #,
ret
.cfi_endproc
.LFE2:
.ident "GCC: (Debian 10.2.1-6) 10.2.1 20210110"
.section .note.GNU-stack,"",@progbits
Compile main.go and generate assembly:
> go tool compile -N -l -S main.go
where:
• -N disables optimizations
• -l disables inlining
• -S prints the assembly listing
main.go
package main

import "fmt"

func main() {
	c := add(1, 2)
	fmt.Println(c)
}
func add(a, b int) int {
	fmt.Printf("%p", &a)
	return a + b
}
The generated assembly is long; see the Go assembly guide (Go 汇编详解) listed in the references for background on Go assembly.
main.main STEXT size=188 args=0x0 locals=0x60 funcid=0x0 align=0x0
0x0000 00000 (main.go:5) TEXT main.main(SB), ABIInternal, $96-0
0x0000 00000 (main.go:5) CMPQ SP, 16(R14)
0x0004 00004 (main.go:5) PCDATA $0, $-2
0x0004 00004 (main.go:5) JLS 178
0x000a 00010 (main.go:5) PCDATA $0, $-1
0x000a 00010 (main.go:5) SUBQ $96, SP
0x000e 00014 (main.go:5) MOVQ BP, 88(SP)
0x0013 00019 (main.go:5) LEAQ 88(SP), BP
0x0018 00024 (main.go:5) FUNCDATA $0, gclocals·J5F+7Qw7O7ve2QcWC7DpeQ==(SB)
0x0018 00024 (main.go:5) FUNCDATA $1, gclocals·bDfKCdmtOiGIuJz/x+yQyQ==(SB)
0x0018 00024 (main.go:5) FUNCDATA $2, main.main.stkobj(SB)
0x0018 00024 (main.go:6) MOVL $1, AX
0x001d 00029 (main.go:6) MOVL $2, BX
0x0022 00034 (main.go:6) PCDATA $1, $0
0x0022 00034 (main.go:6) CALL main.add(SB)
0x0027 00039 (main.go:6) MOVQ AX, main.c+24(SP)
0x002c 00044 (main.go:7) MOVUPS X15, main..autotmp_1+48(SP)
0x0032 00050 (main.go:7) LEAQ main..autotmp_1+48(SP), CX
0x0037 00055 (main.go:7) MOVQ CX, main..autotmp_3+40(SP)
0x003c 00060 (main.go:7) MOVQ main.c+24(SP), AX
0x0041 00065 (main.go:7) PCDATA $1, $1
0x0041 00065 (main.go:7) CALL runtime.convT64(SB)
0x0046 00070 (main.go:7) MOVQ AX, main..autotmp_4+32(SP)
0x004b 00075 (main.go:7) MOVQ main..autotmp_3+40(SP), DI
0x0050 00080 (main.go:7) TESTB AL, (DI)
0x0052 00082 (main.go:7) LEAQ type.int(SB), CX
0x0059 00089 (main.go:7) MOVQ CX, (DI)
0x005c 00092 (main.go:7) LEAQ 8(DI), CX
0x0060 00096 (main.go:7) PCDATA $0, $-2
0x0060 00096 (main.go:7) CMPL runtime.writeBarrier(SB), $0
0x0067 00103 (main.go:7) JEQ 107
0x0069 00105 (main.go:7) JMP 113
0x006b 00107 (main.go:7) MOVQ AX, 8(DI)
0x006f 00111 (main.go:7) JMP 123
0x0071 00113 (main.go:7) MOVQ CX, DI
0x0074 00116 (main.go:7) CALL runtime.gcWriteBarrier(SB)
0x0079 00121 (main.go:7) JMP 123
0x007b 00123 (main.go:7) PCDATA $0, $-1
0x007b 00123 (main.go:7) MOVQ main..autotmp_3+40(SP), AX
0x0080 00128 (main.go:7) TESTB AL, (AX)
0x0082 00130 (main.go:7) JMP 132
0x0084 00132 (main.go:7) MOVQ AX, main..autotmp_2+64(SP)
0x0089 00137 (main.go:7) MOVQ $1, main..autotmp_2+72(SP)
0x0092 00146 (main.go:7) MOVQ $1, main..autotmp_2+80(SP)
0x009b 00155 (main.go:7) MOVL $1, BX
0x00a0 00160 (main.go:7) MOVQ BX, CX
0x00a3 00163 (main.go:7) PCDATA $1, $0
0x00a3 00163 (main.go:7) CALL fmt.Println(SB)
0x00a8 00168 (main.go:8) MOVQ 88(SP), BP
0x00ad 00173 (main.go:8) ADDQ $96, SP
0x00b1 00177 (main.go:8) RET
0x00b2 00178 (main.go:8) NOP
0x00b2 00178 (main.go:5) PCDATA $1, $-1
0x00b2 00178 (main.go:5) PCDATA $0, $-2
0x00b2 00178 (main.go:5) CALL runtime.morestack_noctxt(SB)
0x00b7 00183 (main.go:5) PCDATA $0, $-1
0x00b7 00183 (main.go:5) JMP 0
0x0000 49 3b 66 10 0f 86 a8 00 00 00 48 83 ec 60 48 89 I;f.......H..`H.
0x0010 6c 24 58 48 8d 6c 24 58 b8 01 00 00 00 bb 02 00 l$XH.l$X........
0x0020 00 00 e8 00 00 00 00 48 89 44 24 18 44 0f 11 7c .......H.D$.D..|
0x0030 24 30 48 8d 4c 24 30 48 89 4c 24 28 48 8b 44 24 $0H.L$0H.L$(H.D$
0x0040 18 e8 00 00 00 00 48 89 44 24 20 48 8b 7c 24 28 ......H.D$ H.|$(
0x0050 84 07 48 8d 0d 00 00 00 00 48 89 0f 48 8d 4f 08 ..H......H..H.O.
0x0060 83 3d 00 00 00 00 00 74 02 eb 06 48 89 47 08 eb .=.....t...H.G..
0x0070 0a 48 89 cf e8 00 00 00 00 eb 00 48 8b 44 24 28 .H.........H.D$(
0x0080 84 00 eb 00 48 89 44 24 40 48 c7 44 24 48 01 00 ....H.D$@H.D$H..
0x0090 00 00 48 c7 44 24 50 01 00 00 00 bb 01 00 00 00 ..H.D$P.........
0x00a0 48 89 d9 e8 00 00 00 00 48 8b 6c 24 58 48 83 c4 H.......H.l$XH..
0x00b0 60 c3 e8 00 00 00 00 e9 44 ff ff ff `.......D...
rel 3+0 t=23 type.int+0
rel 35+4 t=7 main.add+0
rel 66+4 t=7 runtime.convT64+0
rel 85+4 t=14 type.int+0
rel 98+4 t=14 runtime.writeBarrier+-1
rel 117+4 t=7 runtime.gcWriteBarrier+0
rel 164+4 t=7 fmt.Println+0
rel 179+4 t=7 runtime.morestack_noctxt+0
main.add STEXT size=244 args=0x10 locals=0x78 funcid=0x0 align=0x0
0x0000 00000 (main.go:9) TEXT main.add(SB), ABIInternal, $120-16
0x0000 00000 (main.go:9) CMPQ SP, 16(R14)
0x0004 00004 (main.go:9) PCDATA $0, $-2
0x0004 00004 (main.go:9) JLS 210
0x000a 00010 (main.go:9) PCDATA $0, $-1
0x000a 00010 (main.go:9) SUBQ $120, SP
0x000e 00014 (main.go:9) MOVQ BP, 112(SP)
0x0013 00019 (main.go:9) LEAQ 112(SP), BP
0x0018 00024 (main.go:9) FUNCDATA $0, gclocals·J5F+7Qw7O7ve2QcWC7DpeQ==(SB)
0x0018 00024 (main.go:9) FUNCDATA $1, gclocals·ojjeUbKZdec9+9S2K3feng==(SB)
0x0018 00024 (main.go:9) FUNCDATA $2, main.add.stkobj(SB)
0x0018 00024 (main.go:9) FUNCDATA $5, main.add.arginfo1(SB)
0x0018 00024 (main.go:9) MOVQ AX, main.a+128(SP)
0x0020 00032 (main.go:9) MOVQ BX, main.b+136(SP)
0x0028 00040 (main.go:9) MOVQ $0, main.~r0+40(SP)
0x0031 00049 (main.go:9) LEAQ type.int(SB), AX
0x0038 00056 (main.go:9) PCDATA $1, $0
0x0038 00056 (main.go:9) CALL runtime.newobject(SB)
0x003d 00061 (main.go:9) MOVQ AX, main.&a+64(SP)
0x0042 00066 (main.go:9) MOVQ main.a+128(SP), CX
0x004a 00074 (main.go:9) MOVQ CX, (AX)
0x004d 00077 (main.go:10) MOVQ main.&a+64(SP), CX
0x0052 00082 (main.go:10) MOVQ CX, main..autotmp_3+56(SP)
0x0057 00087 (main.go:10) MOVUPS X15, main..autotmp_4+72(SP)
0x005d 00093 (main.go:10) LEAQ main..autotmp_4+72(SP), CX
0x0062 00098 (main.go:10) MOVQ CX, main..autotmp_6+48(SP)
0x0067 00103 (main.go:10) TESTB AL, (CX)
0x0069 00105 (main.go:10) MOVQ main..autotmp_3+56(SP), DX
0x006e 00110 (main.go:10) LEAQ type.*int(SB), BX
0x0075 00117 (main.go:10) MOVQ BX, main..autotmp_4+72(SP)
0x007a 00122 (main.go:10) MOVQ DX, main..autotmp_4+80(SP)
0x007f 00127 (main.go:10) TESTB AL, (CX)
0x0081 00129 (main.go:10) JMP 131
0x0083 00131 (main.go:10) MOVQ CX, main..autotmp_5+88(SP)
0x0088 00136 (main.go:10) MOVQ $1, main..autotmp_5+96(SP)
0x0091 00145 (main.go:10) MOVQ $1, main..autotmp_5+104(SP)
0x009a 00154 (main.go:10) LEAQ go.string."%p"(SB), AX
0x00a1 00161 (main.go:10) MOVL $2, BX
0x00a6 00166 (main.go:10) MOVL $1, DI
0x00ab 00171 (main.go:10) MOVQ DI, SI
0x00ae 00174 (main.go:10) PCDATA $1, $1
0x00ae 00174 (main.go:10) CALL fmt.Printf(SB)
0x00b3 00179 (main.go:11) MOVQ main.&a+64(SP), DX
0x00b8 00184 (main.go:11) MOVQ (DX), AX
0x00bb 00187 (main.go:11) ADDQ main.b+136(SP), AX
0x00c3 00195 (main.go:11) MOVQ AX, main.~r0+40(SP)
0x00c8 00200 (main.go:11) MOVQ 112(SP), BP
0x00cd 00205 (main.go:11) ADDQ $120, SP
0x00d1 00209 (main.go:11) RET
0x00d2 00210 (main.go:11) NOP
0x00d2 00210 (main.go:9) PCDATA $1, $-1
0x00d2 00210 (main.go:9) PCDATA $0, $-2
0x00d2 00210 (main.go:9) MOVQ AX, 8(SP)
0x00d7 00215 (main.go:9) MOVQ BX, 16(SP)
0x00dc 00220 (main.go:9) NOP
0x00e0 00224 (main.go:9) CALL runtime.morestack_noctxt(SB)
0x00e5 00229 (main.go:9) MOVQ 8(SP), AX
0x00ea 00234 (main.go:9) MOVQ 16(SP), BX
0x00ef 00239 (main.go:9) PCDATA $0, $-1
0x00ef 00239 (main.go:9) JMP 0
0x0000 49 3b 66 10 0f 86 c8 00 00 00 48 83 ec 78 48 89 I;f.......H..xH.
0x0010 6c 24 70 48 8d 6c 24 70 48 89 84 24 80 00 00 00 l$pH.l$pH..$....
0x0020 48 89 9c 24 88 00 00 00 48 c7 44 24 28 00 00 00 H..$....H.D$(...
0x0030 00 48 8d 05 00 00 00 00 e8 00 00 00 00 48 89 44 .H...........H.D
0x0040 24 40 48 8b 8c 24 80 00 00 00 48 89 08 48 8b 4c $@H..$....H..H.L
0x0050 24 40 48 89 4c 24 38 44 0f 11 7c 24 48 48 8d 4c $@H.L$8D..|$HH.L
0x0060 24 48 48 89 4c 24 30 84 01 48 8b 54 24 38 48 8d $HH.L$0..H.T$8H.
0x0070 1d 00 00 00 00 48 89 5c 24 48 48 89 54 24 50 84 .....H.\$HH.T$P.
0x0080 01 eb 00 48 89 4c 24 58 48 c7 44 24 60 01 00 00 ...H.L$XH.D$`...
0x0090 00 48 c7 44 24 68 01 00 00 00 48 8d 05 00 00 00 .H.D$h....H.....
0x00a0 00 bb 02 00 00 00 bf 01 00 00 00 48 89 fe e8 00 ...........H....
0x00b0 00 00 00 48 8b 54 24 40 48 8b 02 48 03 84 24 88 ...H.T$@H..H..$.
0x00c0 00 00 00 48 89 44 24 28 48 8b 6c 24 70 48 83 c4 ...H.D$(H.l$pH..
0x00d0 78 c3 48 89 44 24 08 48 89 5c 24 10 0f 1f 40 00 x.H.D$.H.\$...@.
0x00e0 e8 00 00 00 00 48 8b 44 24 08 48 8b 5c 24 10 e9 .....H.D$.H.\$..
0x00f0 0c ff ff ff ....
rel 3+0 t=23 type.*int+0
rel 52+4 t=14 type.int+0
rel 57+4 t=7 runtime.newobject+0
rel 113+4 t=14 type.*int+0
rel 157+4 t=14 go.string."%p"+0
rel 175+4 t=7 fmt.Printf+0
rel 225+4 t=7 runtime.morestack_noctxt+0
go.cuinfo.producer.<unlinkable> SDWARFCUINFO dupok size=0
0x0000 2d 4e 20 2d 6c 20 72 65 67 61 62 69 -N -l regabi
go.cuinfo.packagename.main SDWARFCUINFO dupok size=0
0x0000 6d 61 69 6e main
main..inittask SNOPTRDATA size=32
0x0000 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................
0x0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
rel 24+8 t=1 fmt..inittask+0
go.string."%p" SRODATA dupok size=2
0x0000 25 70 %p
runtime.nilinterequal·f SRODATA dupok size=8
0x0000 00 00 00 00 00 00 00 00 ........
rel 0+8 t=1 runtime.nilinterequal+0
runtime.memequal64·f SRODATA dupok size=8
0x0000 00 00 00 00 00 00 00 00 ........
rel 0+8 t=1 runtime.memequal64+0
runtime.gcbits.01 SRODATA dupok size=1
0x0000 01 .
type..namedata.*interface {}- SRODATA dupok size=15
0x0000 00 0d 2a 69 6e 74 65 72 66 61 63 65 20 7b 7d ..*interface {}
type.*interface {} SRODATA dupok size=56
0x0000 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................
0x0010 3b fc f8 8f 08 08 08 36 00 00 00 00 00 00 00 00 ;......6........
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0030 00 00 00 00 00 00 00 00 ........
rel 24+8 t=1 runtime.memequal64·f+0
rel 32+8 t=1 runtime.gcbits.01+0
rel 40+4 t=5 type..namedata.*interface {}-+0
rel 48+8 t=1 type.interface {}+0
runtime.gcbits.02 SRODATA dupok size=1
0x0000 02 .
type.interface {} SRODATA dupok size=80
0x0000 10 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 ................
0x0010 39 7a 09 0f 02 08 08 14 00 00 00 00 00 00 00 00 9z..............
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
rel 24+8 t=1 runtime.nilinterequal·f+0
rel 32+8 t=1 runtime.gcbits.02+0
rel 40+4 t=5 type..namedata.*interface {}-+0
rel 44+4 t=-32763 type.*interface {}+0
rel 56+8 t=1 type.interface {}+80
type..namedata.*[]interface {}- SRODATA dupok size=17
0x0000 00 0f 2a 5b 5d 69 6e 74 65 72 66 61 63 65 20 7b ..*[]interface {
0x0010 7d }
type.*[]interface {} SRODATA dupok size=56
0x0000 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................
0x0010 9d 9c 0e 59 08 08 08 36 00 00 00 00 00 00 00 00 ...Y...6........
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0030 00 00 00 00 00 00 00 00 ........
rel 24+8 t=1 runtime.memequal64·f+0
rel 32+8 t=1 runtime.gcbits.01+0
rel 40+4 t=5 type..namedata.*[]interface {}-+0
rel 48+8 t=1 type.[]interface {}+0
type.[]interface {} SRODATA dupok size=56
0x0000 18 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................
0x0010 76 de 99 0d 02 08 08 17 00 00 00 00 00 00 00 00 v...............
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0030 00 00 00 00 00 00 00 00 ........
rel 32+8 t=1 runtime.gcbits.01+0
rel 40+4 t=5 type..namedata.*[]interface {}-+0
rel 44+4 t=-32763 type.*[]interface {}+0
rel 48+8 t=1 type.interface {}+0
type..namedata.*[1]interface {}- SRODATA dupok size=18
0x0000 00 10 2a 5b 31 5d 69 6e 74 65 72 66 61 63 65 20 ..*[1]interface
0x0010 7b 7d {}
type.[1]interface {} SRODATA dupok size=72
0x0000 10 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 ................
0x0010 6e 20 6a 3d 02 08 08 11 00 00 00 00 00 00 00 00 n j=............
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0040 01 00 00 00 00 00 00 00 ........
rel 24+8 t=1 runtime.nilinterequal·f+0
rel 32+8 t=1 runtime.gcbits.02+0
rel 40+4 t=5 type..namedata.*[1]interface {}-+0
rel 44+4 t=-32763 type.*[1]interface {}+0
rel 48+8 t=1 type.interface {}+0
rel 56+8 t=1 type.[]interface {}+0
type.*[1]interface {} SRODATA dupok size=56
0x0000 08 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 ................
0x0010 a8 0e 57 36 08 08 08 36 00 00 00 00 00 00 00 00 ..W6...6........
0x0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0030 00 00 00 00 00 00 00 00 ........
rel 24+8 t=1 runtime.memequal64·f+0
rel 32+8 t=1 runtime.gcbits.01+0
rel 40+4 t=5 type..namedata.*[1]interface {}-+0
rel 48+8 t=1 type.[1]interface {}+0
type..importpath.fmt. SRODATA dupok size=5
0x0000 00 03 66 6d 74 ..fmt
gclocals·J5F+7Qw7O7ve2QcWC7DpeQ== SRODATA dupok size=8
0x0000 02 00 00 00 00 00 00 00 ........
gclocals·bDfKCdmtOiGIuJz/x+yQyQ== SRODATA dupok size=10
0x0000 02 00 00 00 07 00 00 00 00 02 ..........
main.main.stkobj SRODATA static size=24
0x0000 01 00 00 00 00 00 00 00 d8 ff ff ff 10 00 00 00 ................
0x0010 10 00 00 00 00 00 00 00 ........
rel 20+4 t=5 runtime.gcbits.02+0
gclocals·ojjeUbKZdec9+9S2K3feng== SRODATA dupok size=10
0x0000 02 00 00 00 08 00 00 00 00 04 ..........
main.add.stkobj SRODATA static size=24
0x0000 01 00 00 00 00 00 00 00 d8 ff ff ff 10 00 00 00 ................
0x0010 10 00 00 00 00 00 00 00 ........
rel 20+4 t=5 runtime.gcbits.02+0
main.add.arginfo1 SRODATA static dupok size=5
0x0000 00 08 08 08 ff .....
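Taking C on Linux as the example, the x86 and x86_64 architectures use different calling conventions.
Three calling conventions are common for C on 32-bit x86:
• cdecl: arguments are pushed right to left, and the caller cleans up the stack space used by the arguments
• stdcall: arguments are pushed right to left, and the callee cleans up the stack space used by the arguments
• fastcall: the first two arguments are passed in ecx and edx, the rest are pushed right to left, and the callee cleans up the stack space used by the arguments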
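The three rules can be summarized as a small model. The sketch below is a simplification that assumes every argument is a 4-byte int or pointer; the type callPlan and the function planCall are hypothetical names used only for illustration.

```go
package main

import "fmt"

const wordSize = 4 // bytes per int or pointer argument on 32-bit x86

// callPlan describes how one call is set up under a given convention.
type callPlan struct {
	RegArgs      int    // leading arguments passed in ecx, edx
	StackArgs    int    // arguments pushed on the stack, right to left
	CleanedBy    string // who releases the argument stack space
	CleanupBytes int    // N in "addl $N, %esp" (caller) or "ret $N" (callee)
}

// planCall models cdecl, stdcall and fastcall for nargs word-sized arguments.
// The boolean result is false for an unknown convention name.
func planCall(convention string, nargs int) (callPlan, bool) {
	switch convention {
	case "cdecl":
		return callPlan{RegArgs: 0, StackArgs: nargs, CleanedBy: "caller", CleanupBytes: nargs * wordSize}, true
	case "stdcall":
		return callPlan{RegArgs: 0, StackArgs: nargs, CleanedBy: "callee", CleanupBytes: nargs * wordSize}, true
	case "fastcall":
		regs := nargs
		if regs > 2 {
			regs = 2
		}
		stack := nargs - regs
		return callPlan{RegArgs: regs, StackArgs: stack, CleanedBy: "callee", CleanupBytes: stack * wordSize}, true
	}
	return callPlan{}, false
}

func main() {
	for _, cc := range []string{"cdecl", "stdcall", "fastcall"} {
		p, _ := planCall(cc, 3)
		fmt.Printf("%-8s regs=%d stack=%d cleaned by %s (%d bytes)\n",
			cc, p.RegArgs, p.StackArgs, p.CleanedBy, p.CleanupBytes)
	}
}
```

For a three-argument function the model yields 12, 12 and 4 bytes, which correspond to the addl $12, %esp, ret $12 and ret $4 instructions in the listings of this article.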
The source of x86_cdecl.cc follows; __attribute__((cdecl)) declares that the function uses this calling convention.
The main function calls sum to compute 1+2+3 and returns the result.
__attribute__((cdecl)) int sum(int a, int b, int c) {
    int d = a + b + c;
    return d;
}

int main() {
    int c = sum(1, 2, 3);
    return c;
}
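Before calling _Z3sumiii, the assembly pushes the three arguments with three pushl instructions. _Z3sumiii returns its result in eax, and once it returns, main executes addl $12, %esp to release the stack space occupied by the arguments.
You may wonder: _Z3sumiii also allocates 16 bytes of stack with subl $16, %esp, so where is that released?
The answer is the leave instruction, which is equivalent to: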
movl %ebp, %esp
popl %ebp
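To see why those two instructions undo the whole prologue, the frame setup and teardown can be simulated. The sketch below is a toy model, not an emulator: the machine type, its methods, and the starting register values are all hypothetical.

```go
package main

import "fmt"

// machine is a toy model of the two registers and the stack memory
// involved in a 32-bit x86 function prologue and epilogue.
type machine struct {
	esp, ebp uint32
	mem      map[uint32]uint32
}

func (m *machine) push(v uint32) {
	m.esp -= 4
	m.mem[m.esp] = v
}

func (m *machine) pop() uint32 {
	v := m.mem[m.esp]
	m.esp += 4
	return v
}

// prologue models: pushl %ebp; movl %esp, %ebp; subl $locals, %esp
func (m *machine) prologue(locals uint32) {
	m.push(m.ebp)
	m.ebp = m.esp
	m.esp -= locals
}

// leave models: movl %ebp, %esp; popl %ebp
func (m *machine) leave() {
	m.esp = m.ebp
	m.ebp = m.pop()
}

// roundTrip runs a prologue followed by leave and returns the final registers.
func roundTrip(esp, ebp, locals uint32) (uint32, uint32) {
	m := &machine{esp: esp, ebp: ebp, mem: map[uint32]uint32{}}
	m.prologue(locals)
	m.leave()
	return m.esp, m.ebp
}

func main() {
	esp, ebp := roundTrip(0x1000, 0x1040, 16)
	fmt.Printf("esp=%#x ebp=%#x\n", esp, ebp) // esp=0x1000 ebp=0x1040
}
```

Because leave resets esp from ebp, the amount subtracted for locals never has to be added back explicitly; both registers end up with the values they had on entry.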
The assembly is as follows:
_Z3sumiii:
pushl %ebp #
movl %esp, %ebp #,
subl $16, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# x86_cdecl.cc:2: int d = a + b + c;
movl 8(%ebp), %edx # a, tmp86
movl 12(%ebp), %eax # b, tmp87
addl %eax, %edx # tmp87, _1
# x86_cdecl.cc:2: int d = a + b + c;
movl 16(%ebp), %eax # c, tmp91
addl %edx, %eax # _1, tmp90
movl %eax, -4(%ebp) # tmp90, d
# x86_cdecl.cc:3: return d;
movl -4(%ebp), %eax # d, _6
# x86_cdecl.cc:4: }
leave
ret
main:
pushl %ebp #
movl %esp, %ebp #,
subl $16, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# x86_cdecl.cc:7: int c = sum(1, 2, 3);
pushl $3 # arguments are pushed right to left
pushl $2 #
pushl $1 #
call _Z3sumiii #
addl $12, %esp # caller cleans up the stack
movl %eax, -4(%ebp) # tmp85, c
# x86_cdecl.cc:8: return c;
movl -4(%ebp), %eax # c, _4
# x86_cdecl.cc:9: }
leave
ret
__x86.get_pc_thunk.ax:
movl (%esp), %eax #,
ret
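stdcall (Standard Call) is similar to cdecl, with one difference: before returning, the callee executes a "RET x" instruction, rather than a plain "RET", to pop the arguments off the stack. Here x = number of arguments × pointer size.
The source of x86_stdecl.cc uses __attribute__((stdcall)) to declare the convention.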
__attribute__((stdcall)) int sum(int a, int b, int c) {
    int d = a + b + c;
    return d;
}

int main() {
    int c = sum(1, 2, 3);
    return c;
}
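In the assembly, main pushes the three arguments with three pushl instructions and does not clean up the stack after _Z3sumiii returns.
_Z3sumiii itself releases the argument space with ret $12 and returns its result in eax.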
_Z3sumiii:
pushl %ebp #
movl %esp, %ebp #,
subl $16, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# x86_stdecl.cc:2: int d = a + b + c;
movl 8(%ebp), %edx # a, tmp86
movl 12(%ebp), %eax # b, tmp87
addl %eax, %edx # tmp87, _1
# x86_stdecl.cc:2: int d = a + b + c;
movl 16(%ebp), %eax # c, tmp91
addl %edx, %eax # _1, tmp90
movl %eax, -4(%ebp) # tmp90, d
# x86_stdecl.cc:3: return d;
movl -4(%ebp), %eax # d, _6
# x86_stdecl.cc:4: }
leave
ret $12 # callee cleans up the stack
main:
pushl %ebp #
movl %esp, %ebp #,
subl $16, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# x86_stdecl.cc:7: int c = sum(1, 2, 3);
pushl $3 # arguments are pushed right to left
pushl $2 #
pushl $1 #
call _Z3sumiii #
movl %eax, -4(%ebp) # tmp85, c
# x86_stdecl.cc:8: return c;
movl -4(%ebp), %eax # c, _4
# x86_stdecl.cc:9: }
leave
ret
__x86.get_pc_thunk.ax:
movl (%esp), %eax #,
ret
The source of x86_fastcall.cc uses __attribute__((fastcall)) to declare the convention.
__attribute__((fastcall)) int sum(int a, int b, int c) {
    int d = a + b + c;
    return d;
}

int main() {
    int c = sum(1, 2, 3);
    return c;
}
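In the assembly, the first two arguments are passed in ecx and edx and the remaining one is pushed on the stack. Before returning, _Z3sumiii releases the stack space of the pushed argument with ret $4.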
_Z3sumiii:
pushl %ebp #
movl %esp, %ebp #,
subl $24, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
movl %ecx, -20(%ebp) # a, a
movl %edx, -24(%ebp) # b, b
# x86_fastcall.cc:2: int d = a + b + c;
movl -20(%ebp), %edx # a, tmp86
movl -24(%ebp), %eax # b, tmp87
addl %eax, %edx # tmp87, _1
# x86_fastcall.cc:2: int d = a + b + c;
movl 8(%ebp), %eax # c, tmp91
addl %edx, %eax # _1, tmp90
movl %eax, -4(%ebp) # tmp90, d
# x86_fastcall.cc:3: return d;
movl -4(%ebp), %eax # d, _6
# x86_fastcall.cc:4: }
leave
ret $4 # callee cleans up the stack
main:
pushl %ebp #
movl %esp, %ebp #,
subl $16, %esp #,
call __x86.get_pc_thunk.ax #
addl $_GLOBAL_OFFSET_TABLE_, %eax # tmp82,
# x86_fastcall.cc:7: int c = sum(1, 2, 3);
pushl $3 # first two go in ecx and edx, the rest are pushed right to left
movl $2, %edx #,
movl $1, %ecx #,
call _Z3sumiii #
movl %eax, -4(%ebp) # tmp85, c
# x86_fastcall.cc:8: return c;
movl -4(%ebp), %eax # c, _4
# x86_fastcall.cc:9: }
leave
ret
__x86.get_pc_thunk.ax:
movl (%esp), %eax #,
ret
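On 64-bit Linux, the first six integer or pointer arguments of a function are passed in the registers rdi, rsi, rdx, rcx, r8, and r9; any further arguments are pushed on the stack right to left.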
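The rule can be written down as a lookup. The sketch below covers integer and pointer arguments only and assumes the callee uses the standard frame (pushq %rbp; movq %rsp, %rbp), so the first stack argument sits at 16(%rbp). It ignores any alignment padding a compiler may add, and the names argLocation and callerCleanupBytes are hypothetical.

```go
package main

import "fmt"

// Registers used, in order, for the first six integer or pointer arguments.
var intArgRegs = []string{"rdi", "rsi", "rdx", "rcx", "r8", "r9"}

// argLocation reports where the callee finds argument i (0-based).
// Stack arguments start at 16(%rbp): 0(%rbp) holds the saved rbp and
// 8(%rbp) holds the return address.
func argLocation(i int) string {
	if i < len(intArgRegs) {
		return "%" + intArgRegs[i]
	}
	return fmt.Sprintf("%d(%%rbp)", 16+8*(i-len(intArgRegs)))
}

// callerCleanupBytes is the stack space the caller must release after
// the call, counting 8 bytes per stack-passed argument.
func callerCleanupBytes(nargs int) int {
	extra := nargs - len(intArgRegs)
	if extra < 0 {
		extra = 0
	}
	return 8 * extra
}

func main() {
	for i := 0; i < 7; i++ {
		fmt.Printf("arg %d -> %s\n", i+1, argLocation(i))
	}
	fmt.Println("caller releases", callerCleanupBytes(7), "bytes")
}
```

For a seven-argument function the seventh argument lands at 16(%rbp) and the caller releases 8 bytes. With 32-bit int arguments the compiler uses the low halves of the same registers (edi, esi, and so on).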
The source of x86_64.cc needs no __attribute__ declaration.
int sum(int a, int b, int c, int d, int e, int f, int g) {
    int h = a + b + c + d + e + f + g;
    return h;
}

int main() {
    int c = sum(1, 2, 3, 4, 5, 6, 7);
    return c;
}
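The assembly passes arguments in registers first and pushes the overflow onto the stack; main is responsible for releasing the stack space used by the arguments of the callee _Z3sumiiiiiii.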
_Z3sumiiiiiii:
pushq %rbp #
movq %rsp, %rbp #,
movl %edi, -20(%rbp) # a, a
movl %esi, -24(%rbp) # b, b
movl %edx, -28(%rbp) # c, c
movl %ecx, -32(%rbp) # d, d
movl %r8d, -36(%rbp) # e, e
movl %r9d, -40(%rbp) # f, f
# x86_64.cc:2: int h = a + b + c + d + e + f + g;
movl -20(%rbp), %edx # a, tmp89
movl -24(%rbp), %eax # b, tmp90
addl %eax, %edx # tmp90, _1
# x86_64.cc:2: int h = a + b + c + d + e + f + g;
movl -28(%rbp), %eax # c, tmp91
addl %eax, %edx # tmp91, _2
# x86_64.cc:2: int h = a + b + c + d + e + f + g;
movl -32(%rbp), %eax # d, tmp92
addl %eax, %edx # tmp92, _3
# x86_64.cc:2: int h = a + b + c + d + e + f + g;
movl -36(%rbp), %eax # e, tmp93
addl %eax, %edx # tmp93, _4
# x86_64.cc:2: int h = a + b + c + d + e + f + g;
movl -40(%rbp), %eax # f, tmp94
addl %eax, %edx # tmp94, _5
# x86_64.cc:2: int h = a + b + c + d + e + f + g;
movl 16(%rbp), %eax # g, tmp98
addl %edx, %eax # _5, tmp97
movl %eax, -4(%rbp) # tmp97, h
# x86_64.cc:3: return h;
movl -4(%rbp), %eax # h, _14
# x86_64.cc:4: }
popq %rbp #
ret
main:
pushq %rbp #
movq %rsp, %rbp #,
subq $16, %rsp #,
# x86_64.cc:7: int c = sum(1, 2, 3, 4, 5, 6, 7);
pushq $7 #
movl $6, %r9d #,
movl $5, %r8d #,
movl $4, %ecx #,
movl $3, %edx #,
movl $2, %esi #,
movl $1, %edi #,
call _Z3sumiiiiiii #
addq $8, %rsp #,
movl %eax, -4(%rbp) # tmp84, c
# x86_64.cc:8: return c;
movl -4(%rbp), %eax # c, _4
# x86_64.cc:9: }
leave
ret
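Go 1.17 passes function arguments and results in registers instead of on the stack, which improves performance by about 5% and shrinks compiled binaries by about 2%. The release notes state: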
Go 1.17 implements a new way of passing function arguments and results using registers instead of the stack. Benchmarks for a representative set of Go packages and programs show performance improvements of about 5%, and a typical reduction in binary size of about 2%.
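The same source file is therefore compiled with both Go 1.16 and Go 1.17 to compare how each passes arguments.
The source file below defines a function test that takes 11 parameters and returns each of them incremented by one.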
1: package main
2:
3: func test(a, b, c, d, e, f, g, h, i, j, k int) (
4: int, int, int, int, int, int, int, int, int, int, int,
5: ) {
6: return a + 1, b + 1, c + 1, d + 1, e + 1, f + 1, g + 1, h + 1, i + 1, j + 1, k + 1
7: }
8:
9: func main() {
10: _, _, _, _, _, _, _, _, _, _, _ = test(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)
11: }
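Platform: x86_64, with 8-byte addresses.
Pseudo-registers used by Go:
• SB: the static base pointer, i.e. the start of the program's address space. It is typically used when declaring functions and global variables.
• FP: points to the first argument passed by the caller (the last one pushed). Argument values are accessed as symbol+offset(FP). Plan 9 pseudo-registers must be written with a symbol, otherwise the assembler reports an error.
• SP: there is a pseudo SP register and a hardware SP register. The form symbol+offset(SP) refers to the pseudo SP (sometimes loosely called BP), while a bare offset(SP) refers to the hardware SP. The pseudo SP points to the end of the first local variable of the current frame; the hardware SP points to the end of the whole function frame.
• PC: corresponds to the ip register on x86 and to rip on amd64.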
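Compile command:
> CGO_ENABLED=0 /home/hsp/sdk/go1.16.4/bin/go tool compile -N -l -S call_convention.go
The assembly is shown below. From it we can see:
• main allocates 184 bytes of stack, enough for 23 8-byte slots, and saves the current BP on the stack (the compiler does not emit push or pop here; it stores values with a move to offset(SP))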
0x0018 00024 (call_convention.go:9) SUBQ $184, SP
0x001f 00031 (call_convention.go:9) MOVQ BP, 176(SP)
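• main fills in the 11 arguments of test, 1 through 11, from the top of the stack (SP) toward the bottom (BP); the resulting layout is the same as pushing the arguments right to left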
0x002f 00047 (call_convention.go:10) MOVQ $1, (SP)
0x0037 00055 (call_convention.go:10) MOVQ $2, 8(SP)
0x0040 00064 (call_convention.go:10) MOVQ $3, 16(SP)
0x0049 00073 (call_convention.go:10) MOVQ $4, 24(SP)
0x0052 00082 (call_convention.go:10) MOVQ $5, 32(SP)
0x005b 00091 (call_convention.go:10) MOVQ $6, 40(SP)
0x0064 00100 (call_convention.go:10) MOVQ $7, 48(SP)
0x006d 00109 (call_convention.go:10) MOVQ $8, 56(SP)
0x0076 00118 (call_convention.go:10) MOVQ $9, 64(SP)
0x007f 00127 (call_convention.go:10) MOVQ $10, 72(SP)
0x0088 00136 (call_convention.go:10) MOVQ $11, 80(SP)
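• when main calls test, CALL pushes the return address (the address of the next instruction) just below (SP); inside test, whose own frame size is 0, it sits at 0(SP), so the arguments appear at 8(SP) through 88(SP)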
0x0091 00145 (call_convention.go:10) CALL "".test(SB)
• test stores its 11 results in 96(SP) through 176(SP), relative to its own SP; the results are likewise laid out as if pushed right to left
0x0000 00000 (call_convention.go:3) MOVQ $0, "".~r11+96(SP)
0x0009 00009 (call_convention.go:3) MOVQ $0, "".~r12+104(SP)
0x0012 00018 (call_convention.go:3) MOVQ $0, "".~r13+112(SP)
0x001b 00027 (call_convention.go:3) MOVQ $0, "".~r14+120(SP)
0x0024 00036 (call_convention.go:3) MOVQ $0, "".~r15+128(SP)
0x0030 00048 (call_convention.go:3) MOVQ $0, "".~r16+136(SP)
0x003c 00060 (call_convention.go:3) MOVQ $0, "".~r17+144(SP)
0x0048 00072 (call_convention.go:3) MOVQ $0, "".~r18+152(SP)
0x0054 00084 (call_convention.go:3) MOVQ $0, "".~r19+160(SP)
0x0060 00096 (call_convention.go:3) MOVQ $0, "".~r20+168(SP)
0x006c 00108 (call_convention.go:3) MOVQ $0, "".~r21+176(SP)
• after test returns, main restores the saved BP and releases its stack frame
0x0096 00150 (call_convention.go:11) MOVQ 176(SP), BP
0x009e 00158 (call_convention.go:11) ADDQ $184, SP
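To summarize: in the stack memory allocated under Go 1.16, going from the bottom of the stack to the top, the results come first and the arguments second, each group laid out as if pushed right to left. The stack memory allocated by main is released by main itself.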
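The offsets in the Go 1.16 listing follow a simple pattern, sketched below. The formulas assume that every argument and result is an 8-byte int, that the callee has a zero-size frame (as test does, $0-176), and that the caller keeps no locals besides the saved BP. All function names are hypothetical.

```go
package main

import "fmt"

const slot = 8 // bytes per int on amd64

// calleeArgOffset is the offset of argument i (0-based) from the
// callee's hardware SP; slot 0 holds the return address.
func calleeArgOffset(i int) int { return slot + slot*i }

// calleeResultOffset is the offset of result j (0-based) from the
// callee's hardware SP; results sit directly above the arguments.
func calleeResultOffset(nargs, j int) int { return slot + slot*nargs + slot*j }

// callerFrameSize is the frame the caller allocates: one slot per
// argument and per result, plus one slot for the saved BP.
func callerFrameSize(nargs, nresults int) int { return slot*(nargs+nresults) + slot }

func main() {
	const n = 11
	fmt.Printf("a at %d(SP), k at %d(SP)\n", calleeArgOffset(0), calleeArgOffset(n-1))
	fmt.Printf("first result at %d(SP), last at %d(SP)\n",
		calleeResultOffset(n, 0), calleeResultOffset(n, n-1))
	fmt.Printf("caller frame: %d bytes\n", callerFrameSize(n, n))
}
```

With 11 arguments and 11 results this gives 8(SP) through 88(SP) for the arguments, 96(SP) through 176(SP) for the results, and a 184-byte frame for main, matching the listing.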
Stack frame diagram at the point where main calls test:
Full assembly listing:
"".test STEXT nosplit size=285 args=0xb0 locals=0x0 funcid=0x0
0x0000 00000 (call_convention.go:3) TEXT "".test(SB), NOSPLIT|ABIInternal, $0-176
0x0000 00000 (call_convention.go:3) MOVQ $0, "".~r11+96(SP)
0x0009 00009 (call_convention.go:3) MOVQ $0, "".~r12+104(SP)
0x0012 00018 (call_convention.go:3) MOVQ $0, "".~r13+112(SP)
0x001b 00027 (call_convention.go:3) MOVQ $0, "".~r14+120(SP)
0x0024 00036 (call_convention.go:3) MOVQ $0, "".~r15+128(SP)
0x0030 00048 (call_convention.go:3) MOVQ $0, "".~r16+136(SP)
0x003c 00060 (call_convention.go:3) MOVQ $0, "".~r17+144(SP)
0x0048 00072 (call_convention.go:3) MOVQ $0, "".~r18+152(SP)
0x0054 00084 (call_convention.go:3) MOVQ $0, "".~r19+160(SP)
0x0060 00096 (call_convention.go:3) MOVQ $0, "".~r20+168(SP)
0x006c 00108 (call_convention.go:3) MOVQ $0, "".~r21+176(SP)
0x0078 00120 (call_convention.go:6) MOVQ "".a+8(SP), AX
0x007d 00125 (call_convention.go:6) INCQ AX
0x0080 00128 (call_convention.go:6) MOVQ AX, "".~r11+96(SP)
0x0085 00133 (call_convention.go:6) MOVQ "".b+16(SP), AX
0x008a 00138 (call_convention.go:6) INCQ AX
0x008d 00141 (call_convention.go:6) MOVQ AX, "".~r12+104(SP)
0x0092 00146 (call_convention.go:6) MOVQ "".c+24(SP), AX
0x0097 00151 (call_convention.go:6) INCQ AX
0x009a 00154 (call_convention.go:6) MOVQ AX, "".~r13+112(SP)
0x009f 00159 (call_convention.go:6) MOVQ "".d+32(SP), AX
0x00a4 00164 (call_convention.go:6) INCQ AX
0x00a7 00167 (call_convention.go:6) MOVQ AX, "".~r14+120(SP)
0x00ac 00172 (call_convention.go:6) MOVQ "".e+40(SP), AX
0x00b1 00177 (call_convention.go:6) INCQ AX
0x00b4 00180 (call_convention.go:6) MOVQ AX, "".~r15+128(SP)
0x00bc 00188 (call_convention.go:6) MOVQ "".f+48(SP), AX
0x00c1 00193 (call_convention.go:6) INCQ AX
0x00c4 00196 (call_convention.go:6) MOVQ AX, "".~r16+136(SP)
0x00cc 00204 (call_convention.go:6) MOVQ "".g+56(SP), AX
0x00d1 00209 (call_convention.go:6) INCQ AX
0x00d4 00212 (call_convention.go:6) MOVQ AX, "".~r17+144(SP)
0x00dc 00220 (call_convention.go:6) MOVQ "".h+64(SP), AX
0x00e1 00225 (call_convention.go:6) INCQ AX
0x00e4 00228 (call_convention.go:6) MOVQ AX, "".~r18+152(SP)
0x00ec 00236 (call_convention.go:6) MOVQ "".i+72(SP), AX
0x00f1 00241 (call_convention.go:6) INCQ AX
0x00f4 00244 (call_convention.go:6) MOVQ AX, "".~r19+160(SP)
0x00fc 00252 (call_convention.go:6) MOVQ "".j+80(SP), AX
0x0101 00257 (call_convention.go:6) INCQ AX
0x0104 00260 (call_convention.go:6) MOVQ AX, "".~r20+168(SP)
0x010c 00268 (call_convention.go:6) MOVQ "".k+88(SP), AX
0x0111 00273 (call_convention.go:6) INCQ AX
0x0114 00276 (call_convention.go:6) MOVQ AX, "".~r21+176(SP)
0x011c 00284 (call_convention.go:6) RET
"".main STEXT size=176 args=0x0 locals=0xb8 funcid=0x0
0x0000 00000 (call_convention.go:9) TEXT "".main(SB), ABIInternal, $184-0
0x0000 00000 (call_convention.go:9) MOVQ (TLS), CX
0x0009 00009 (call_convention.go:9) LEAQ -56(SP), AX
0x000e 00014 (call_convention.go:9) CMPQ AX, 16(CX)
0x0012 00018 (call_convention.go:9) PCDATA $0, $-2
0x0012 00018 (call_convention.go:9) JLS 166
0x0018 00024 (call_convention.go:9) PCDATA $0, $-1
0x0018 00024 (call_convention.go:9) SUBQ $184, SP
0x001f 00031 (call_convention.go:9) MOVQ BP, 176(SP)
0x0027 00039 (call_convention.go:9) LEAQ 176(SP), BP
0x002f 00047 (call_convention.go:9) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x002f 00047 (call_convention.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x002f 00047 (call_convention.go:10) MOVQ $1, (SP)
0x0037 00055 (call_convention.go:10) MOVQ $2, 8(SP)
0x0040 00064 (call_convention.go:10) MOVQ $3, 16(SP)
0x0049 00073 (call_convention.go:10) MOVQ $4, 24(SP)
0x0052 00082 (call_convention.go:10) MOVQ $5, 32(SP)
0x005b 00091 (call_convention.go:10) MOVQ $6, 40(SP)
0x0064 00100 (call_convention.go:10) MOVQ $7, 48(SP)
0x006d 00109 (call_convention.go:10) MOVQ $8, 56(SP)
0x0076 00118 (call_convention.go:10) MOVQ $9, 64(SP)
0x007f 00127 (call_convention.go:10) MOVQ $10, 72(SP)
0x0088 00136 (call_convention.go:10) MOVQ $11, 80(SP)
0x0091 00145 (call_convention.go:10) PCDATA $1, $0
0x0091 00145 (call_convention.go:10) CALL "".test(SB)
0x0096 00150 (call_convention.go:11) MOVQ 176(SP), BP
0x009e 00158 (call_convention.go:11) ADDQ $184, SP
0x00a5 00165 (call_convention.go:11) RET
0x00a6 00166 (call_convention.go:11) NOP
0x00a6 00166 (call_convention.go:9) PCDATA $1, $-1
0x00a6 00166 (call_convention.go:9) PCDATA $0, $-2
0x00a6 00166 (call_convention.go:9) CALL runtime.morestack_noctxt(SB)
0x00ab 00171 (call_convention.go:9) PCDATA $0, $-1
0x00ab 00171 (call_convention.go:9) JMP 0
Compile command:
> CGO_ENABLED=0 /home/hsp/sdk/go1.17.4/bin/go tool compile -N -l -S call_convention.go > ca
What happens, step by step:
• main allocates 112 bytes of stack and saves the current BP
0x0006 00006 (call_convention.go:9) SUBQ $112, SP
0x000a 00010 (call_convention.go:9) MOVQ BP, 104(SP)
• main passes the first 9 arguments in AX, BX, CX, DI, SI, R8, R9, R10, R11; the remaining 2 arguments are placed on the stack, laid out as if pushed right to left
0x0014 00020 (call_convention.go:10) MOVQ $10, (SP)
0x001c 00028 (call_convention.go:10) MOVQ $11, 8(SP)
0x0025 00037 (call_convention.go:10) MOVL $1, AX
0x002a 00042 (call_convention.go:10) MOVL $2, BX
0x002f 00047 (call_convention.go:10) MOVL $3, CX
0x0034 00052 (call_convention.go:10) MOVL $4, DI
0x0039 00057 (call_convention.go:10) MOVL $5, SI
0x003e 00062 (call_convention.go:10) MOVL $6, R8
0x0044 00068 (call_convention.go:10) MOVL $7, R9
0x004a 00074 (call_convention.go:10) MOVL $8, R10
0x0050 00080 (call_convention.go:10) MOVL $9, R11
• main calls test; CALL pushes the return address, i.e. the address of the instruction after the call (MOVQ 104(SP), BP), onto the stack
0x0056 00086 (call_convention.go:10) CALL "".test(SB)
0x005b 00091 (call_convention.go:11) MOVQ 104(SP), BP
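• test allocates an 80-byte stack frame, saves main's BP at 72(SP), and points BP at that slot; its arguments and locals are still addressed relative to SP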
0x0000 00000 (call_convention.go:3) SUBQ $80, SP
0x0004 00004 (call_convention.go:3) MOVQ BP, 72(SP)
0x0009 00009 (call_convention.go:3) LEAQ 72(SP), BP
• When test has finished its work, it restores BP and releases its own stack frame
0x0187 00391 (call_convention.go:6) MOVQ 72(SP), BP
0x018c 00396 (call_convention.go:6) ADDQ $80, SP
• After test returns to main, main restores BP and releases its own stack frame
0x005b 00091 (call_convention.go:11) MOVQ 104(SP), BP
0x0060 00096 (call_convention.go:11) ADDQ $112, SP
Stack layout just before test releases its frame and returns:
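The offsets used in the Go 1.17 listing are enough to rebuild this layout. The sketch below is a reconstruction, not the original figure: the slot descriptions and the helper name testFrameLayout are illustrative assumptions, and every offset is relative to test's SP after its prologue (SUBQ $80, SP) has run.

```go
package main

import "fmt"

// slot is one 8-byte stack slot, addressed relative to test's SP
// after the prologue (SUBQ $80, SP) has run.
type slot struct {
	offset int
	desc   string
}

// testFrameLayout rebuilds the stack picture from the offsets that appear
// in the Go 1.17 listing. Hypothetical helper, for illustration only.
func testFrameLayout() []slot {
	var s []slot
	// test's frame: temporaries for results ~r19 at 0(SP) up to ~r11 at 64(SP).
	for i := 0; i < 9; i++ {
		s = append(s, slot{i * 8, fmt.Sprintf("test: temp for result ~r%d", 19-i)})
	}
	s = append(s, slot{72, "test: saved BP of main"})
	s = append(s, slot{80, "return address pushed by CALL"})
	// main's frame starts here: main's 0(SP) is test's 88(SP).
	s = append(s, slot{88, "main: stack argument j"})
	s = append(s, slot{96, "main: stack argument k"})
	s = append(s, slot{104, "main: stack result ~r20"})
	s = append(s, slot{112, "main: stack result ~r21"})
	// Spill slots that test uses for the register arguments a..i.
	for i, name := range "abcdefghi" {
		s = append(s, slot{120 + i*8, "main: spill slot for " + string(name)})
	}
	s = append(s, slot{192, "main: saved BP of main's caller"})
	return s
}

func main() {
	// Print from high to low addresses, as stack diagrams usually do.
	l := testFrameLayout()
	for i := len(l) - 1; i >= 0; i-- {
		fmt.Printf("%3d(SP)  %s\n", l[i].offset, l[i].desc)
	}
}
```

The slots add up as expected: test's 80-byte frame plus the 8-byte return address put main's 0(SP) at test's 88(SP), and main's 112-byte frame ends with its saved BP at 104(SP), which is 192(SP) as seen from test.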
The assembly is listed below. Operands of the form MOVQ AX, "".a+120(SP) are addressed through the real (hardware) SP register.
"".test STEXT nosplit size=401 args=0x68 locals=0x50 funcid=0x0
0x0000 00000 (call_convention.go:3) TEXT "".test(SB), NOSPLIT|ABIInternal, $80-104
0x0000 00000 (call_convention.go:3) SUBQ $80, SP
0x0004 00004 (call_convention.go:3) MOVQ BP, 72(SP)
0x0009 00009 (call_convention.go:3) LEAQ 72(SP), BP
0x000e 00014 (call_convention.go:3) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x000e 00014 (call_convention.go:3) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x000e 00014 (call_convention.go:3) FUNCDATA $5, "".test.arginfo1(SB)
0x000e 00014 (call_convention.go:3) MOVQ AX, "".a+120(SP)
0x0013 00019 (call_convention.go:3) MOVQ BX, "".b+128(SP)
0x001b 00027 (call_convention.go:3) MOVQ CX, "".c+136(SP)
0x0023 00035 (call_convention.go:3) MOVQ DI, "".d+144(SP)
0x002b 00043 (call_convention.go:3) MOVQ SI, "".e+152(SP)
0x0033 00051 (call_convention.go:3) MOVQ R8, "".f+160(SP)
0x003b 00059 (call_convention.go:3) MOVQ R9, "".g+168(SP)
0x0043 00067 (call_convention.go:3) MOVQ R10, "".h+176(SP)
0x004b 00075 (call_convention.go:3) MOVQ R11, "".i+184(SP)
0x0053 00083 (call_convention.go:3) MOVQ $0, "".~r11+64(SP)
0x005c 00092 (call_convention.go:3) MOVQ $0, "".~r12+56(SP)
0x0065 00101 (call_convention.go:3) MOVQ $0, "".~r13+48(SP)
0x006e 00110 (call_convention.go:3) MOVQ $0, "".~r14+40(SP)
0x0077 00119 (call_convention.go:3) MOVQ $0, "".~r15+32(SP)
0x0080 00128 (call_convention.go:3) MOVQ $0, "".~r16+24(SP)
0x0089 00137 (call_convention.go:3) MOVQ $0, "".~r17+16(SP)
0x0092 00146 (call_convention.go:3) MOVQ $0, "".~r18+8(SP)
0x009b 00155 (call_convention.go:3) MOVQ $0, "".~r19(SP)
0x00a3 00163 (call_convention.go:3) MOVQ $0, "".~r20+104(SP)
0x00ac 00172 (call_convention.go:3) MOVQ $0, "".~r21+112(SP)
0x00b5 00181 (call_convention.go:6) MOVQ "".a+120(SP), DX
0x00ba 00186 (call_convention.go:6) INCQ DX
0x00bd 00189 (call_convention.go:6) MOVQ DX, "".~r11+64(SP)
0x00c2 00194 (call_convention.go:6) MOVQ "".b+128(SP), DX
0x00ca 00202 (call_convention.go:6) INCQ DX
0x00cd 00205 (call_convention.go:6) MOVQ DX, "".~r12+56(SP)
0x00d2 00210 (call_convention.go:6) MOVQ "".c+136(SP), DX
0x00da 00218 (call_convention.go:6) INCQ DX
0x00dd 00221 (call_convention.go:6) MOVQ DX, "".~r13+48(SP)
0x00e2 00226 (call_convention.go:6) MOVQ "".d+144(SP), DX
0x00ea 00234 (call_convention.go:6) INCQ DX
0x00ed 00237 (call_convention.go:6) MOVQ DX, "".~r14+40(SP)
0x00f2 00242 (call_convention.go:6) MOVQ "".e+152(SP), DX
0x00fa 00250 (call_convention.go:6) INCQ DX
0x00fd 00253 (call_convention.go:6) MOVQ DX, "".~r15+32(SP)
0x0102 00258 (call_convention.go:6) MOVQ "".f+160(SP), DX
0x010a 00266 (call_convention.go:6) INCQ DX
0x010d 00269 (call_convention.go:6) MOVQ DX, "".~r16+24(SP)
0x0112 00274 (call_convention.go:6) MOVQ "".g+168(SP), DX
0x011a 00282 (call_convention.go:6) INCQ DX
0x011d 00285 (call_convention.go:6) MOVQ DX, "".~r17+16(SP)
0x0122 00290 (call_convention.go:6) MOVQ "".h+176(SP), DX
0x012a 00298 (call_convention.go:6) INCQ DX
0x012d 00301 (call_convention.go:6) MOVQ DX, "".~r18+8(SP)
0x0132 00306 (call_convention.go:6) MOVQ "".i+184(SP), DX
0x013a 00314 (call_convention.go:6) INCQ DX
0x013d 00317 (call_convention.go:6) MOVQ DX, "".~r19(SP)
0x0141 00321 (call_convention.go:6) MOVQ "".j+88(SP), DX
0x0146 00326 (call_convention.go:6) INCQ DX
0x0149 00329 (call_convention.go:6) MOVQ DX, "".~r20+104(SP)
0x014e 00334 (call_convention.go:6) MOVQ "".k+96(SP), DX
0x0153 00339 (call_convention.go:6) INCQ DX
0x0156 00342 (call_convention.go:6) MOVQ DX, "".~r21+112(SP)
0x015b 00347 (call_convention.go:6) MOVQ "".~r11+64(SP), AX
0x0160 00352 (call_convention.go:6) MOVQ "".~r12+56(SP), BX
0x0165 00357 (call_convention.go:6) MOVQ "".~r13+48(SP), CX
0x016a 00362 (call_convention.go:6) MOVQ "".~r14+40(SP), DI
0x016f 00367 (call_convention.go:6) MOVQ "".~r15+32(SP), SI
0x0174 00372 (call_convention.go:6) MOVQ "".~r16+24(SP), R8
0x0179 00377 (call_convention.go:6) MOVQ "".~r17+16(SP), R9
0x017e 00382 (call_convention.go:6) MOVQ "".~r18+8(SP), R10
0x0183 00387 (call_convention.go:6) MOVQ "".~r19(SP), R11
0x0187 00391 (call_convention.go:6) MOVQ 72(SP), BP
0x018c 00396 (call_convention.go:6) ADDQ $80, SP
0x0190 00400 (call_convention.go:6) RET
"".main STEXT size=108 args=0x0 locals=0x70 funcid=0x0
0x0000 00000 (call_convention.go:9) TEXT "".main(SB), ABIInternal, $112-0
0x0000 00000 (call_convention.go:9) CMPQ SP, 16(R14)
0x0004 00004 (call_convention.go:9) PCDATA $0, $-2
0x0004 00004 (call_convention.go:9) JLS 101
0x0006 00006 (call_convention.go:9) PCDATA $0, $-1
0x0006 00006 (call_convention.go:9) SUBQ $112, SP
0x000a 00010 (call_convention.go:9) MOVQ BP, 104(SP)
0x000f 00015 (call_convention.go:9) LEAQ 104(SP), BP
0x0014 00020 (call_convention.go:9) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0014 00020 (call_convention.go:9) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0014 00020 (call_convention.go:10) MOVQ $10, (SP)
0x001c 00028 (call_convention.go:10) MOVQ $11, 8(SP)
0x0025 00037 (call_convention.go:10) MOVL $1, AX
0x002a 00042 (call_convention.go:10) MOVL $2, BX
0x002f 00047 (call_convention.go:10) MOVL $3, CX
0x0034 00052 (call_convention.go:10) MOVL $4, DI
0x0039 00057 (call_convention.go:10) MOVL $5, SI
0x003e 00062 (call_convention.go:10) MOVL $6, R8
0x0044 00068 (call_convention.go:10) MOVL $7, R9
0x004a 00074 (call_convention.go:10) MOVL $8, R10
0x0050 00080 (call_convention.go:10) MOVL $9, R11
0x0056 00086 (call_convention.go:10) PCDATA $1, $0
0x0056 00086 (call_convention.go:10) CALL "".test(SB)
0x005b 00091 (call_convention.go:11) MOVQ 104(SP), BP
0x0060 00096 (call_convention.go:11) ADDQ $112, SP
0x0064 00100 (call_convention.go:11) RET
0x0065 00101 (call_convention.go:11) NOP
0x0065 00101 (call_convention.go:9) PCDATA $1, $-1
0x0065 00101 (call_convention.go:9) PCDATA $0, $-2
0x0065 00101 (call_convention.go:9) CALL runtime.morestack_noctxt(SB)
0x006a 00106 (call_convention.go:9) PCDATA $0, $-1
0x006a 00106 (call_convention.go:9) JMP 0
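Calling conventions differ across language versions and architectures.
For C/C++, the convention depends on the architecture (x86 vs. x86_64):
• x86 lets you choose among stdcall, cdecl, fastcall and others. cdecl passes arguments on the stack and the caller frees the argument space; stdcall passes arguments on the stack and the callee frees it; fastcall passes the first arguments in registers and the rest on the stack, with the callee freeing the stack space.
• x86_64 passes arguments in registers first and the rest on the stack; the caller frees the argument space.
For Go, the convention depends on the release (Go 1.16 vs. Go 1.17):
• Go 1.16 passes all arguments on the stack; the caller frees the argument space.
• Go 1.17 passes arguments in registers first and the rest on the stack; the caller frees the argument space.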
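The difference between the two Go releases can be modeled in a few lines. This is a simplified sketch of what the listings show for word-sized integer arguments on amd64 only; the function name argLocations and the useRegs switch are made up for illustration, and floating-point values, structs, results, and spill slots are deliberately ignored.

```go
package main

import "fmt"

// intArgRegs is the order in which the Go 1.17 listing hands out
// integer argument registers on amd64.
var intArgRegs = []string{"AX", "BX", "CX", "DI", "SI", "R8", "R9", "R10", "R11"}

// argLocations reports where each of n word-sized integer arguments is placed.
// useRegs=false models the Go 1.16 listing (everything on the stack);
// useRegs=true models the Go 1.17 listing (registers first, then the stack).
// Hypothetical helper: it is not part of the Go toolchain.
func argLocations(n int, useRegs bool) []string {
	locs := make([]string, 0, n)
	stackOff := 0 // offset of the next stack-assigned argument, from the caller's SP
	for i := 0; i < n; i++ {
		if useRegs && i < len(intArgRegs) {
			locs = append(locs, intArgRegs[i])
			continue
		}
		locs = append(locs, fmt.Sprintf("%d(SP)", stackOff))
		stackOff += 8
	}
	return locs
}

func main() {
	fmt.Println("go1.16:", argLocations(11, false))
	fmt.Println("go1.17:", argLocations(11, true))
}
```

For the 11-argument call in this article, the Go 1.16 model places the arguments at 0(SP) through 80(SP), while the Go 1.17 model uses the nine registers and leaves only the last two arguments at 0(SP) and 8(SP), matching the MOVQ and MOVL sequences in the two main listings.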
Recommended reading