DynamoRIO / dynamorio · Issues · #2933
Closed

Issue created Apr 16, 2018 by Derek Bruening @derekbruening (Contributor)

drreg state restore problems with DR spill slots

First, drreg fails to properly identify DR spill slots, because it assumes the TLS slots are ordered low to high.
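To illustrate the faulty assumption (a hypothetical sketch with made-up helper names, not drreg's actual code): deducing the slot by dividing the TLS offset bakes in a low-to-high layout, whereas comparing against the operand DR itself reports for each slot does not:

    /* Hypothetical sketch, not drreg's actual code. */
    #include "dr_api.h"

    /* WRONG: assumes slot n lives at offset n * sizeof(reg_t) from the TLS
     * base, i.e., that DR lays its slots out low to high; DR makes no such
     * guarantee. */
    static dr_spill_slot_t
    tls_offs_to_slot_buggy(uint tls_offs)
    {
        return (dr_spill_slot_t)(tls_offs / sizeof(reg_t));
    }

    /* Safer: compare against the offset DR itself reports for each
     * directly-addressable spill slot. */
    static bool
    tls_offs_to_slot(void *drcontext, uint tls_offs, dr_spill_slot_t *slot_out)
    {
        dr_spill_slot_t slot;
        for (slot = SPILL_SLOT_1; slot <= dr_max_opnd_accessible_spill_slot();
             slot++) {
            opnd_t slot_opnd = dr_reg_spill_slot_opnd(drcontext, slot);
            if (opnd_get_disp(slot_opnd) == (int)tls_offs) {
                *slot_out = slot;
                return true;
            }
        }
        return false; /* not a client-visible DR spill slot */
    }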

Second, once that bug is fixed, drreg starts restoring registers it wasn't restoring before (because it had thought they weren't its spills), which leads to an assert:

<Application /work/dr/git/build_x64_dbg_tests/clients/bin64/tool.drcacheoff.burst_threads (16044) DynamoRIO usage error : dr_read_saved_reg(): drcontext does not belong to current thread>

#5  0x00000000004b2803 in external_error (file=0x73aac8 "/work/dr/git/src/core/lib/instrument.c", line=5734, 
    msg=0x73c910 "dr_read_saved_reg(): drcontext does not belong to current thread") at /work/dr/git/src/core/utils.c:200
#6  0x00000000005be4f1 in dr_read_saved_reg (drcontext=0x475c2080, slot=SPILL_SLOT_1) at /work/dr/git/src/core/lib/instrument.c:5733
#7  0x00000000006d6580 in get_spilled_value (drcontext=0x475c2080, slot=4) at /work/dr/git/src/ext/drreg/drreg.c:253
#8  0x00000000006db73e in drreg_event_restore_state (drcontext=0x475c2080, restore_memory=0 '\000', info=0x7f3feacec130)
    at /work/dr/git/src/ext/drreg/drreg.c:1767
#9  0x00000000006def9f in drmgr_restore_state_event (drcontext=0x475c2080, restore_memory=0 '\000', info=0x7f3feacec130)
    at /work/dr/git/src/ext/drmgr/drmgr.c:1864
#10 0x00000000005b54dc in instrument_restore_state (dcontext=0x475c2080, restore_memory=false, info=0x7f3feacec130)
    at /work/dr/git/src/core/lib/instrument.c:1715
#11 0x00000000005c91c6 in recreate_app_state_internal (tdcontext=0x475c2080, mcontext=0x7f3feacec800, just_pc=false, owning_f=0x0, 
    restore_memory=false) at /work/dr/git/src/core/translate.c:1208
#12 0x00000000005c9598 in recreate_app_state (tdcontext=0x475c2080, mcontext=0x7f3feacec800, restore_memory=false, f=0x0)
    at /work/dr/git/src/core/translate.c:1331
#13 0x00000000005a2c99 in at_safe_spot (trec=0x474ae128, mc=0x7f3feacec800, desired_state=THREAD_SYNCH_SUSPENDED_VALID_MCONTEXT)
    at /work/dr/git/src/core/synch.c:616
#14 0x00000000005a4156 in synch_with_thread (id=16047, block=false, hold_initexit_lock=true, caller_state=THREAD_SYNCH_NONE, 
    desired_state=THREAD_SYNCH_SUSPENDED_VALID_MCONTEXT, flags=17) at /work/dr/git/src/core/synch.c:1034
#15 0x00000000005a5801 in synch_with_all_threads (desired_synch_state=THREAD_SYNCH_SUSPENDED_VALID_MCONTEXT, 
    threads_out=0x7f3feacece90, num_threads_out=0x7f3feacece8c, cur_state=THREAD_SYNCH_NO_LOCKS_NO_XFER, flags=20)
    at /work/dr/git/src/core/synch.c:1414
#16 0x00000000005a7baf in detach_on_permanent_stack (internal=true, do_cleanup=true) at /work/dr/git/src/core/synch.c:2058

That assert should be removed: being passed another thread's dcontext is a valid usage scenario under synchall.

The underlying issue is that drreg sees what looks like an rax spill with no matching restore:

   0x4ef7373f:  movabs %rax,%gs:0x8
   0x4ef7374a:  lahf   
   0x4ef7374b:  seto   %al
   0x4ef7374e:  cmp    %gs:0x8,%rcx
   0x4ef73757:  jne    0x4ef3830f

Here is the source, the x64 stay-on-trace cmp:

 +267  L3                      65 48 a3 00 00 00 00 mov    %rax -> %gs:0x00[8byte]
                               00 00 00 00
 +278  L3                      48 b8 a4 45 61 4c 60 mov    $0x00007f604c6145a4 -> %rax
                               7f 00 00
 +288  L3                      65 48 a3 08 00 00 00 mov    %rax -> %gs:0x08[8byte]
                               00 00 00 00
 +299  L3                      9f                   lahf    -> %ah
 +300  L3                      0f 90 c0             seto    -> %al
 +303  L3                      65 48 3b 0c 25 08 00 cmp    %rcx %gs:0x08[8byte]
                               00 00
 +312  L4 @0x0000000054505698  0f 85 19 63 f8 ff    jnz    $0x00000000543f8f0f <shared_trace_cmp_ret>
 +318  L3                      65 48 8b 0c 25 10 00 mov    %gs:0x10[8byte] -> %rcx
                               00 00
 +327  L3                      65 48 a1 00 00 00 00 mov    %gs:0x00[8byte] -> %rax
                               00 00 00 00

So that's not a spill: DR is just using that TLS slot instead of a third scratch register. That's TLS_REG1_SLOT, which is the third and final slot exposed to clients.

Note that drreg already handles DR slots not lasting across app instrs by restoring non-lazily, but it looks like we have fundamental issues with state restore and DR slots. For simple DR uses like a local spill and restore (such as for rip-rel mangling, though that uses a slot not exposed to clients), a state restore in between would simply restore the register twice, once by DR and once by drreg, with no ill effects.

But we have this no-restore DR use for the trace cmp, and another tool component could be using DR slots too.

Is the only safe fix to require enough dedicated drreg-only raw-TLS slots that drreg never uses DR's slots at all? Just avoiding the third slot would reduce the likelihood but not eliminate the theoretical problems.
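If dedicated slots are the answer, the existing drreg_init options already allow reserving them; a minimal sketch (the count of 4 is illustrative, and "enough" depends on a tool's peak number of simultaneously reserved registers):

    #include "dr_api.h"
    #include "drreg.h"

    static void
    tool_init(void)
    {
        /* Reserve dedicated drreg-owned raw-TLS slots up front so drreg
         * does not fall back to DR's spill slots; this only helps if the
         * count truly covers the worst case. */
        drreg_options_t ops = { sizeof(ops), 4 /* num_spill_slots */,
                                false /* conservative */ };
        if (drreg_init(&ops) != DRREG_SUCCESS)
            DR_ASSERT(false);
    }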

It would be nice if, at least for recreation (as opposed to stored-info state restores), we could combine extra info during recreation with the raw mcontext snapshot.
