Skip to content
GitLab
Projects Groups Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
  • D dynamorio
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 1,467
    • Issues 1,467
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 44
    • Merge requests 44
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Packages and registries
    • Packages and registries
    • Package Registry
    • Infrastructure Registry
  • Monitor
    • Monitor
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • DynamoRIO
  • dynamorio
  • Issues
  • #4075
Closed
Open
Issue created Feb 02, 2020 by Derek Bruening@derekbrueningContributor

DEADLOCK when nudge sent: thread_initexit_lock has no owner but lock_requests==4

The client.nudge_test test on 32-bit seems to hang quite a bit. I reproduced one such hang here. Others were from asserts (#4074 (closed)) and the test harness not handling app exits from asserts.

Running the client.nudge_test command:

10:26 PM ~/dr/git/build_x86_dbg_tests
% "C:/Program Files/CMake/bin/cmake.exe" -D toolbindir=D:/derek/dr/git/build_x86_dbg_tests/bin32 -D cmd="D:/derek/dr/git/build_x86_dbg_tests/bin32/run_in_bg@-out@D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-out@D:/derek/dr/git/build_x86_dbg_tests/bin32/drrun.exe@-pidfile@D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-pid@-quiet@-debug@-use_dll@D:/derek/dr/git/build_x86_dbg_tests/lib32/debug/dynamorio.dll@-exit0@-stderr_mask@0xC@-msgbox_mask@12@-dumpcore_mask@0x7d@-staged@-code_api@-c@D:/derek/dr/git/build_x86_dbg_tests/suite/tests/bin/client.nudge_test.dll.dll@--@D:/derek/dr/git/build_x86_dbg_tests/suite/tests/bin/win32.infloop.exe" -D out=D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-out -D "nudge=-client;0;10" -D clear= -D pidfile=D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-pid -P D:/derek/dr/git/src/suite/tests/runall.cmake
Running command: |D:/derek/dr/git/build_x86_dbg_tests/bin32/run_in_bg;-out;D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-out;D:/derek/dr/git/build_x86_dbg_tests/bin32/drrun.exe;-pidfile;D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-pid;-quiet;-debug;-use_dll;D:/derek/dr/git/build_x86_dbg_tests/lib32/debug/dynamorio.dll;-exit0;-stderr_mask;0xC;-msgbox_mask;12;-dumpcore_mask;0x7d;-staged;-code_api;-c;D:/derek/dr/git/build_x86_dbg_tests/suite/tests/bin/client.nudge_test.dll.dll;--;D:/derek/dr/git/build_x86_dbg_tests/suite/tests/bin/win32.infloop.exe|
Command returned 0
Found D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-pid: pid is 15524
Found D:/derek/dr/git/build_x86_dbg_tests/suite/tests/client.nudge_test-out
Sending nudge -nudge_pid;15524;0;10

I once saw infloop hanging and not accepting the nudge:

0:000> ~*kn

.  0  Id: 3ca4.46dc Suspend: 1 Teb: 00dd0000 Unfrozen
 # ChildEBP RetAddr  
00 1eafe588 5af63824 ntdll!NtWaitForSingleObject+0xc
01 1eafe5a0 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 1eafe84c 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 1eafe878 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 1eafe890 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 1eafe8a0 5ae12c41 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 1eafe900 5ae00b85 dynamorio!move_lazy_list_to_pending_delete+0xb1 [d:\derek\dr\git\src\core\vmareas.c @ 9183] 
07 1eafe95c 5acb7c9e dynamorio!add_to_lazy_deletion_list+0x535 [d:\derek\dr\git\src\core\vmareas.c @ 9313] 
08 1eafe9c0 5adee11d dynamorio!fragment_remove_shared_no_flush+0x69e [d:\derek\dr\git\src\core\fragment.c @ 3242] 
09 1eafec10 5ade60dc dynamorio!end_and_emit_trace+0x287d [d:\derek\dr\git\src\core\monitor.c @ 1546] 
0a 1eafef18 5ad3dab3 dynamorio!monitor_cache_enter+0x168c [d:\derek\dr\git\src\core\monitor.c @ 2069] 
0b 1eafeff4 1eb2220e dynamorio!d_r_dispatch+0x383 [d:\derek\dr\git\src\core\dispatch.c @ 197] 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0c 00eff01c 00000000 0x1eb2220e

   1  Id: 3ca4.3fb0 Suspend: 1 Teb: 00dd3000 Unfrozen
 # ChildEBP RetAddr  
00 0132fa88 77066359 dynamorio!interception_code_array+0x1266
01 0132fa98 77e77b74 KERNEL32!BaseThreadInitThunk+0x19
02 0132faf4 77e77b44 ntdll!__RtlUserThreadStart+0x2f
03 0132fb04 00000000 ntdll!_RtlUserThreadStart+0x1b

   2  Id: 3ca4.3af4 Suspend: 1 Teb: 00dd6000 Unfrozen
 # ChildEBP RetAddr  
00 0142f99c 77066359 dynamorio!interception_code_array+0x1266
01 0142f9ac 77e77b74 KERNEL32!BaseThreadInitThunk+0x19
02 0142fa08 77e77b44 ntdll!__RtlUserThreadStart+0x2f
03 0142fa18 00000000 ntdll!_RtlUserThreadStart+0x1b

   3  Id: 3ca4.37d8 Suspend: 1 Teb: 00dd9000 Unfrozen
 # ChildEBP RetAddr  
00 0152fbe8 77066359 dynamorio!interception_code_array+0x1266
01 0152fbf8 77e77b74 KERNEL32!BaseThreadInitThunk+0x19
02 0152fc54 77e77b44 ntdll!__RtlUserThreadStart+0x2f
03 0152fc64 00000000 ntdll!_RtlUserThreadStart+0x1b

   4  Id: 3ca4.2e68 Suspend: 1 Teb: 00ddc000 Unfrozen
WARNING: Stack pointer is outside the normal stack bounds. Stack unwinding can be inaccurate.
 # ChildEBP RetAddr  
00 1ea7e720 5af63824 ntdll!NtWaitForSingleObject+0xc
01 1ea7e738 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 1ea7e9e4 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 1ea7ea10 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 1ea7ea28 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 1ea7ea38 5ac9df04 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 1ea7eaa8 5af29e35 dynamorio!dynamo_thread_init+0x124 [d:\derek\dr\git\src\core\dynamo.c @ 2247] 
07 1ea7ed50 5af5dae2 dynamorio!thread_attach_setup+0xe5 [d:\derek\dr\git\src\core\win32\os.c @ 2787] 
08 1ea7ed64 5b0d5361 dynamorio!thread_attach_takeover_callee+0x12 [d:\derek\dr\git\src\core\win32\callback.c @ 8725] 
09 1ea7ed6c 00000000 dynamorio!interception_code_array+0x1361

   5  Id: 3ca4.3bbc Suspend: 1 Teb: 00ddf000 Unfrozen
 # ChildEBP RetAddr  
00 0325ec64 5af63824 ntdll!NtWaitForSingleObject+0xc
01 0325ec7c 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 0325ef28 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 0325ef54 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 0325ef6c 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 0325ef7c 5ac9df04 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 0325efec 5af53ad6 dynamorio!dynamo_thread_init+0x124 [d:\derek\dr\git\src\core\dynamo.c @ 2247] 
07 0325f370 5af542f2 dynamorio!intercept_new_thread+0x126 [d:\derek\dr\git\src\core\win32\callback.c @ 3139] 
08 0325f38c 5b0d4cac dynamorio!intercept_ldr_init+0x82 [d:\derek\dr\git\src\core\win32\callback.c @ 3390] 
09 00000000 00000000 dynamorio!interception_code_array+0xcac
0:000> !runaway
 User Mode Time
  Thread       Time
    0:46dc     0 days 0:00:00.562
    5:3bbc     0 days 0:00:00.000
    4:2e68     0 days 0:00:00.000
    3:37d8     0 days 0:00:00.000
    2:3af4     0 days 0:00:00.000
    1:3fb0     0 days 0:00:00.000
0:000> g
Break-in sent, waiting 4 seconds...
WARNING: Break-in timed out, suspending.
         This is usually caused by another thread holding the loader lock
(3ca4.46dc): Wake debugger - code 80000007 (first chance)
eax=00000000 ebx=1ead5040 ecx=00000001 edx=00000004 esi=00000000 edi=00000000
eip=77e81d9c esp=1eafe58c ebp=1eafe5a0 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
ntdll!NtWaitForSingleObject+0xc:
77e81d9c c20c00          ret     0Ch
0:000> !runaway
 User Mode Time
  Thread       Time
    0:46dc     0 days 0:00:00.562
    6:4510     0 days 0:00:00.000
    5:3bbc     0 days 0:00:00.000
    4:2e68     0 days 0:00:00.000
    3:37d8     0 days 0:00:00.000
    2:3af4     0 days 0:00:00.000
    1:3fb0     0 days 0:00:00.000
0:000> ~*kn

.  0  Id: 3ca4.46dc Suspend: 1 Teb: 00dd0000 Unfrozen
 # ChildEBP RetAddr  
00 1eafe588 5af63824 ntdll!NtWaitForSingleObject+0xc
01 1eafe5a0 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 1eafe84c 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 1eafe878 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 1eafe890 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 1eafe8a0 5ae12c41 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 1eafe900 5ae00b85 dynamorio!move_lazy_list_to_pending_delete+0xb1 [d:\derek\dr\git\src\core\vmareas.c @ 9183] 
07 1eafe95c 5acb7c9e dynamorio!add_to_lazy_deletion_list+0x535 [d:\derek\dr\git\src\core\vmareas.c @ 9313] 
08 1eafe9c0 5adee11d dynamorio!fragment_remove_shared_no_flush+0x69e [d:\derek\dr\git\src\core\fragment.c @ 3242] 
09 1eafec10 5ade60dc dynamorio!end_and_emit_trace+0x287d [d:\derek\dr\git\src\core\monitor.c @ 1546] 
0a 1eafef18 5ad3dab3 dynamorio!monitor_cache_enter+0x168c [d:\derek\dr\git\src\core\monitor.c @ 2069] 
0b 1eafeff4 1eb2220e dynamorio!d_r_dispatch+0x383 [d:\derek\dr\git\src\core\dispatch.c @ 197] 
WARNING: Frame IP not in any known module. Following frames may be wrong.
0c 00eff01c 00000000 0x1eb2220e

   1  Id: 3ca4.3fb0 Suspend: 1 Teb: 00dd3000 Unfrozen
 # ChildEBP RetAddr  
00 0132fa88 77066359 dynamorio!interception_code_array+0x1266
01 0132fa98 77e77b74 KERNEL32!BaseThreadInitThunk+0x19
02 0132faf4 77e77b44 ntdll!__RtlUserThreadStart+0x2f
03 0132fb04 00000000 ntdll!_RtlUserThreadStart+0x1b

   2  Id: 3ca4.3af4 Suspend: 1 Teb: 00dd6000 Unfrozen
 # ChildEBP RetAddr  
00 0142f99c 77066359 dynamorio!interception_code_array+0x1266
01 0142f9ac 77e77b74 KERNEL32!BaseThreadInitThunk+0x19
02 0142fa08 77e77b44 ntdll!__RtlUserThreadStart+0x2f
03 0142fa18 00000000 ntdll!_RtlUserThreadStart+0x1b

   3  Id: 3ca4.37d8 Suspend: 1 Teb: 00dd9000 Unfrozen
 # ChildEBP RetAddr  
00 0152fbe8 77066359 dynamorio!interception_code_array+0x1266
01 0152fbf8 77e77b74 KERNEL32!BaseThreadInitThunk+0x19
02 0152fc54 77e77b44 ntdll!__RtlUserThreadStart+0x2f
03 0152fc64 00000000 ntdll!_RtlUserThreadStart+0x1b

   4  Id: 3ca4.2e68 Suspend: 1 Teb: 00ddc000 Unfrozen
WARNING: Stack pointer is outside the normal stack bounds. Stack unwinding can be inaccurate.
 # ChildEBP RetAddr  
00 1ea7e720 5af63824 ntdll!NtWaitForSingleObject+0xc
01 1ea7e738 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 1ea7e9e4 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 1ea7ea10 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 1ea7ea28 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 1ea7ea38 5ac9df04 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 1ea7eaa8 5af29e35 dynamorio!dynamo_thread_init+0x124 [d:\derek\dr\git\src\core\dynamo.c @ 2247] 
07 1ea7ed50 5af5dae2 dynamorio!thread_attach_setup+0xe5 [d:\derek\dr\git\src\core\win32\os.c @ 2787] 
08 1ea7ed64 5b0d5361 dynamorio!thread_attach_takeover_callee+0x12 [d:\derek\dr\git\src\core\win32\callback.c @ 8725] 
09 1ea7ed6c 00000000 dynamorio!interception_code_array+0x1361

   5  Id: 3ca4.3bbc Suspend: 1 Teb: 00ddf000 Unfrozen
 # ChildEBP RetAddr  
00 0325ec64 5af63824 ntdll!NtWaitForSingleObject+0xc
01 0325ec7c 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 0325ef28 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 0325ef54 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 0325ef6c 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 0325ef7c 5ac9df04 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 0325efec 5af53ad6 dynamorio!dynamo_thread_init+0x124 [d:\derek\dr\git\src\core\dynamo.c @ 2247] 
07 0325f370 5af542f2 dynamorio!intercept_new_thread+0x126 [d:\derek\dr\git\src\core\win32\callback.c @ 3139] 
08 0325f38c 5b0d4cac dynamorio!intercept_ldr_init+0x82 [d:\derek\dr\git\src\core\win32\callback.c @ 3390] 
09 00000000 00000000 dynamorio!interception_code_array+0xcac

   6  Id: 3ca4.4510 Suspend: 1 Teb: 00de2000 Unfrozen
 # ChildEBP RetAddr  
00 0335ed24 5af63824 ntdll!NtWaitForSingleObject+0xc
01 0335ed3c 5af32dac dynamorio!nt_wait_event_with_timeout+0x14 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3744] 
02 0335efe8 5af25d5d dynamorio!os_wait_event+0x25c [d:\derek\dr\git\src\core\win32\os.c @ 8462] 
03 0335f014 5ad5d37e dynamorio!mutex_wait_contended_lock+0xad [d:\derek\dr\git\src\core\win32\os.c @ 8570] 
04 0335f02c 5ad5d1de dynamorio!d_r_mutex_lock_app+0x9e [d:\derek\dr\git\src\core\utils.c @ 891] 
05 0335f03c 5ac9df04 dynamorio!d_r_mutex_lock+0xe [d:\derek\dr\git\src\core\utils.c @ 904] 
06 0335f0ac 5af53ad6 dynamorio!dynamo_thread_init+0x124 [d:\derek\dr\git\src\core\dynamo.c @ 2247] 
07 0335f430 5af542f2 dynamorio!intercept_new_thread+0x126 [d:\derek\dr\git\src\core\win32\callback.c @ 3139] 
08 0335f44c 5b0d4cac dynamorio!intercept_ldr_init+0x82 [d:\derek\dr\git\src\core\win32\callback.c @ 3390] 
09 00000000 00000000 dynamorio!interception_code_array+0xcac
0:000> dt thread_initexit_lock
dynamorio!thread_initexit_lock
   +0x000 lock_requests    : 0n4
   +0x004 contended_event  : 0x00000128 Void
   +0x008 name             : 0x5afb84b8  "thread_initexit_lock(mutex)@D:\derek\dr\git\src\core\dynamo.c:276"
   +0x00c rank             : 4
   +0x010 owner            : 0
   +0x014 owning_dcontext  : (null) 
   +0x018 prev_owned_lock  : (null) 
   +0x01c count_times_acquired : 2
   +0x020 count_times_contended : 0xa
   +0x024 count_times_spin_pause : 0
   +0x028 max_contended_requests : 0
   +0x02c count_times_spin_only : 3
   +0x030 prev_process_lock : 0x5b0e7158 _mutex_t
   +0x034 next_process_lock : 0x5b0e70b8 _mutex_t
   +0x038 callstack        : [4] (null) 
   +0x048 app_lock         : 0 ''
   +0x049 deleted          : 0 ''

So no owner but lock_requests==4.

The interception_code_array+0x1266 is the start of the sequence for thread_attach_takeover_callee. There is a loop later for initstack, but this is the start. Why aren't these threads being scheduled? They must be the 3 we see at the end ("<Failed to suspend attached-but-never-scheduled thread 12244>"): special system threads.

Assignee
Assign to
Time tracking