Weekly Talk (51) - Debugging Multithreaded Programs with gdb

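Preface

This post summarizes gdb techniques for debugging multithreaded programs.

Debugging multiple threads with gdb

The test program creates two threads. Each thread runs a loop that increments a value and calls sleep on every pass.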
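For readers who want something to experiment with before opening the real source, here is a minimal sketch of a program with the shape described above. It is not the author's test_sem.c: the names worker_args_t, worker, and run_two_workers are hypothetical and chosen only for illustration.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical per-thread arguments; not taken from test_sem.c. */
typedef struct {
    const char  *name;       /* label printed by the worker */
    int          iterations; /* how many times to increment */
    unsigned int pause_sec;  /* seconds to sleep on each pass */
    int          counter;    /* value incremented by the loop */
} worker_args_t;

/* Thread body: increment a counter in a loop, sleeping on each pass. */
static void *worker(void *arg)
{
    worker_args_t *w = (worker_args_t *)arg;
    for (int i = 0; i < w->iterations; i++) {
        w->counter++;               /* convenient line for a breakpoint */
        printf("%s: counter=%d\n", w->name, w->counter);
        sleep(w->pause_sec);
    }
    return NULL;
}

/* Start two workers, wait for both, and return the sum of their counters.
 * Returns -1 if a thread could not be created. */
int run_two_workers(int iterations, unsigned int pause_a, unsigned int pause_b)
{
    worker_args_t a = { "worker-a", iterations, pause_a, 0 };
    worker_args_t b = { "worker-b", iterations, pause_b, 0 };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, worker, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, worker, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    /* Each counter is written only by its own thread and read after join. */
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.counter + b.counter;
}
```

Calling run_two_workers(5, 1, 2) from main and building with gcc -g -pthread should give a process with three threads (main plus the two workers), which is enough to try the commands in the rest of this post.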

Viewing thread information

info threads

(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff7faa740 (LWP 2778) "sem" clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
* 2 Thread 0x7ffff7bff640 (LWP 2781) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2782) "sem" __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
(gdb) bt
#0 enqueue (arg=0x7fffffffe300) at test_sem.c:48
#1 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#2 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81


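Each thread has three IDs:

The pthread ID assigned by the pthread library, i.e. the value returned by pthread_self(). In the listing: Thread 0x7ffff7faa740
The thread ID assigned by the Linux kernel, i.e. the value returned by gettid(). In the listing: LWP 2778
The ID assigned by GDB. Unless stated otherwise, this is the thread ID that GDB commands expect. In the listing: the leading 1, 2, 3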

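In the info threads output above, the * marks thread 2 as the current thread. You can run ps -eT | grep sem in another terminal to see the same threads; the second column (SPID) matches the LWP values.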

root@keep-VirtualBox:~# ps -eT | grep sem
2778 2778 pts/1 00:00:00 sem
2778 2781 pts/1 00:00:00 sem
2778 2782 pts/1 00:00:00 sem
root@keep-VirtualBox:~#

By default, GDB commands apply to the current thread. For example, running bt (backtrace) at this point prints the call stack of thread 2.
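To relate the first two IDs to code, the sketch below prints the pthread ID and the kernel thread ID from inside a thread. It is an illustration, not part of test_sem.c: kernel_tid, report_ids, and spawned_thread_tid are hypothetical names, and the raw SYS_gettid syscall is used on the assumption that the glibc gettid() wrapper may not be available on older systems.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the caller, the number GDB shows as "LWP".
 * The raw syscall is used because older glibc has no gettid() wrapper. */
pid_t kernel_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Thread body: print both IDs and hand the kernel TID back through arg. */
static void *report_ids(void *arg)
{
    pid_t *out = (pid_t *)arg;
    *out = kernel_tid();
    /* Casting pthread_t to unsigned long is a Linux/glibc assumption. */
    printf("pthread ID 0x%lx, kernel TID (LWP) %d\n",
           (unsigned long)pthread_self(), (int)*out);
    return NULL;
}

/* Run report_ids in a new thread and return that thread's kernel TID,
 * or -1 if the thread could not be created. */
pid_t spawned_thread_tid(void)
{
    pthread_t t;
    pid_t tid = -1;

    if (pthread_create(&t, NULL, report_ids, &tid) != 0)
        return -1;
    pthread_join(t, NULL);
    return tid;
}
```

In the main thread, kernel_tid() equals getpid(), which is why the first ps -eT row above shows 2778 in both columns. The GDB-assigned ID has no counterpart here because it exists only inside the debugger.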

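Switching the current thread

The thread command switches the current thread; for example, thread 1 makes thread 1 the current thread.

Running commands in the selected thread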

(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7faa740 (LWP 2778))]
#0 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
62 ../sysdeps/unix/sysv/linux/x86_64/clone3.S: No such file or directory.
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2778) "sem" clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
2 Thread 0x7ffff7bff640 (LWP 2781) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2782) "sem" __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
(gdb) bt
#0 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1 0x00007ffff7d26a91 in __GI___clone_internal (cl_args=cl_args@entry=0x7fffffffe0b0,
func=func@entry=0x7ffff7c947d0 <start_thread>, arg=arg@entry=0x7ffff73fe640)
at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2 0x00007ffff7c946d9 in create_thread (pd=pd@entry=0x7ffff73fe640, attr=attr@entry=0x7fffffffe1d0,
stopped_start=stopped_start@entry=0x7fffffffe1ce, stackaddr=stackaddr@entry=0x7ffff6bfe000, stacksize=8388352,
thread_ran=thread_ran@entry=0x7fffffffe1cf) at ./nptl/pthread_create.c:295
#3 0x00007ffff7c95200 in __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>,
start_routine=<optimized out>, arg=<optimized out>) at ./nptl/pthread_create.c:828
#4 0x00005555555554ad in main () at test_sem.c:77
(gdb)

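Running a command in specified threads

thread apply [thread-id-list | all] command runs a command against the specified threads.

For example:
thread apply all bt: print the backtrace of every thread
thread apply 3 bt: print the backtrace of thread 3
thread apply 2-3 bt: print the backtraces of threads 2 and 3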

(gdb) thread apply all bt

Thread 3 (Thread 0x7ffff73fe640 (LWP 2782) "sem"):
#0 __sleep (seconds=2) at ../sysdeps/posix/sleep.c:34
#1 0x00005555555552d9 in dequeue (arg=0x7fffffffe300) at test_sem.c:31
#2 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 2 (Thread 0x7ffff7bff640 (LWP 2781) "sem"):
#0 enqueue (arg=0x7fffffffe300) at test_sem.c:48
#1 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#2 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x7ffff7faa740 (LWP 2778) "sem"):
#0 clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:62
#1 0x00007ffff7d26a91 in __GI___clone_internal (cl_args=cl_args@entry=0x7fffffffe0b0, func=func@entry=0x7ffff7c947d0 <start_thread>, arg=arg@entry=0x7ffff73fe640) at ../sysdeps/unix/sysv/linux/clone-internal.c:54
#2 0x00007ffff7c946d9 in create_thread (pd=pd@entry=0x7ffff73fe640, attr=attr@entry=0x7fffffffe1d0, stopped_start=stopped_start@entry=0x7fffffffe1ce, stackaddr=stackaddr@entry=0x7ffff6bfe000, stacksize=8388352, thread_ran=thread_ran@entry=0x7fffffffe1cf) at ./nptl/pthread_create.c:295
#3 0x00007ffff7c95200 in __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at ./nptl/pthread_create.c:828
#4 0x00005555555554ad in main () at test_sem.c:77
(gdb)


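Setting breakpoints on specific threads

break LOCATION # set a breakpoint at LOCATION; applies to all threads
break LOCATION thread THREAD-ID # set a breakpoint at LOCATION; applies only to the given thread
break LOCATION thread THREAD-ID if CONDITION # set a conditional breakpoint at LOCATION; applies only to the given thread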

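Controlling thread creation and exit messages

By default, gdb prints a message whenever a thread is created or exits. This output can be turned off with the following command.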

set print thread-events on/off

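Command shortcuts

taas command is equivalent to thread apply all -s command
tfaas command is equivalent to thread apply all -s -- frame apply all -s command

tfaas is very useful. Sometimes you remember the name of a variable or parameter but have forgotten, or never knew, which function it lives in. In that case run tfaas p var_name. It tries the print in every frame of every thread, and because -s silences errors, only the frames where var_name is in scope produce output. For example: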

(gdb) tfaas p enq

Thread 1 (Thread 0x7ffff7faa740 (LWP 2956) "sem"):
#4 0x0000555555555499 in main () at test_sem.c:93
$1 = 140737349940800
(gdb)


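Two modes for controlling program execution

To make multithreaded programs easier to debug, GDB provides two modes that control how the program executes:

All-Stop Mode: whenever one thread stops, for whatever reason, all other threads stop at the same time.
Non-Stop Mode: when one thread stops, the other threads keep running normally.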

All-Stop Mode

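GDB is in All-Stop Mode by default. This makes debugging somewhat harder; for example, single-stepping is not 100% precise. Resuming the program lets every thread run, so after a step command you may find that the program has stopped in a different thread.

Use set scheduler-locking mode to control which threads are allowed to run when execution resumes.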

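# mode
# off -- no locking; all threads run when execution resumes
# on -- lock to the current thread; after continue, step, next, finish, etc. only the current thread runs and the others stay stopped
# step -- lock only while single-stepping; other threads stay stopped during step but run freely on commands such as continue, until, and finish
# replay -- behaves like on while replaying a recorded execution (reverse debugging) and like off otherwise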


(gdb) show scheduler-locking
Mode for locking scheduler during execution is "replay".
(gdb) set scheduler-locking on
(gdb) show scheduler-locking
Mode for locking scheduler during execution is "on".
(gdb)


Non-Stop Mode

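In Non-Stop Mode, stopping one thread does not affect the others. For example, when a thread hits a breakpoint, only that thread stops and the rest keep running. Likewise, pressing Ctrl+C while the program is running interrupts only one thread.

Enabling Non-Stop Mode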

set pagination off
set non-stop on

set non-stop off # switch back to All-Stop Mode

show non-stop

(gdb) set non-stop on
(gdb) show non-stop
Controlling the inferior in non-stop mode is on.
(gdb) b enqueue
Breakpoint 1 at 0x138d: file test_sem.c, line 48.
(gdb) r
Starting program: /media/VM_SHARE/code/blog_code/condition/sem
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7bff640 (LWP 2930)]
[New Thread 0x7ffff73fe640 (LWP 2931)]

Thread 2 "sem" hit Breakpoint 1, enqueue (arg=0x7fffffffe300) at test_sem.c:48
48 thrd_args_t *thrd_args = (thrd_args_t *)arg;
(gdb) set print thread-events off
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2927) "sem" (running)
2 Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2931) "sem" (running)
(gdb)
(gdb) interrupt -a
(gdb)
Thread 3 "sem" stopped.
0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
78 ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.

Thread 1 "sem" stopped.
__futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=2930, futex_word=0x7ffff7bff910) at ./nptl/futex-internal.c:57
57 ./nptl/futex-internal.c: No such file or directory.
info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2927) "sem" __futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265,
expected=2930, futex_word=0x7ffff7bff910) at ./nptl/futex-internal.c:57
2 Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2931) "sem" 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0,
flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
(gdb)


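As the first info threads listing shows, when the breakpoint in enqueue is hit, only thread 2 stops; threads 1 and 3 keep running.

The interrupt -a command stops all threads.

Running commands in the background

Just as appending & to a shell command runs it in the background, appending & to a GDB execution command such as run or continue runs that command in the background. gdb can then keep accepting commands. For example, you can run &, then use interrupt to stop a thread and go on inspecting the thread state.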

(gdb) c&
Continuing.
(gdb) info threads
Id Target Id Frame
* 1 Thread 0x7ffff7faa740 (LWP 2927) "sem" (running)
2 Thread 0x7ffff7bff640 (LWP 2930) "sem" enqueue (arg=0x7fffffffe300) at test_sem.c:48
3 Thread 0x7ffff73fe640 (LWP 2931) "sem" 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0,
flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
(gdb) thread apply all bt

Thread 3 (Thread 0x7ffff73fe640 (LWP 2931) "sem"):
#0 0x00007ffff7ce57f8 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
#1 0x00007ffff7cea677 in __GI___nanosleep (req=req@entry=0x7ffff73fddf0, rem=rem@entry=0x7ffff73fddf0) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2 0x00007ffff7cea5ae in __sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#3 0x00005555555552d9 in dequeue (arg=0x7fffffffe300) at test_sem.c:31
#4 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#5 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 2 (Thread 0x7ffff7bff640 (LWP 2930) "sem"):
#0 __sleep (seconds=1) at ../sysdeps/posix/sleep.c:34
#1 0x00005555555553b5 in enqueue (arg=0x7fffffffe300) at test_sem.c:52
#2 0x00007ffff7c94ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#3 0x00007ffff7d26a40 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Thread 1 (Thread 0x7ffff7faa740 (LWP 2927) "sem"):
Selected thread is running.


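More

This post mainly records how to use a handful of commands so they are easy to look up later. We often know a command exists because we have used it or read about it, yet cannot recall how to use it. Writing a blog post helps with that.

Test program source: https://gitee.com/fishmwei/blog_code/blob/master/gdb/test_sem.c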


Take action, and you won't be left reacting!

Feel free to follow my WeChat official account (in WeChat, search for fishmwei) to get in touch.

Follow me

Blog: https://fishmwei.github.io

Juejin profile: https://juejin.cn/user/2084329776486919