DynamoRIO issue #3956
Issue created Nov 20, 2019 by Assad Hashmi (@AssadHashmi, Contributor)

Multi-threading failure (HANG) on AArch64. ASSERT in utils.c: !lock->owner

We're seeing intermittent hangs with an AArch64 guest binary compiled with OpenMP. All the indications are that it is a multi-threading bug in DR. The hang happens with a DR release build. With a debug build, the following assert fires and the process exits, so it never gets as far as hanging:

<Application /path/to/test_case.exe (44407).  Internal Error: DynamoRIO debug check failure: /path/to/dynamorio/core/utils.c:576 !lock->owner

Which happens in:

static void
deadlock_avoidance_lock(mutex_t *lock, bool acquired, bool ownable)
{
    if (acquired) {
        . . .
        if (ownable) {
            ASSERT(!lock->owner);
            lock->owner = d_r_get_thread_id();
            lock->owning_dcontext = get_thread_private_dcontext();
        }
        . . .
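
As a reading aid, here is a minimal single-threaded sketch of the invariant that assert checks. The type toy_mutex_t and the toy_* functions are hypothetical stand-ins and not DR code, and the sketch does not claim to identify the actual cause. It only shows that the check fails whenever a thread acquires an ownable lock while an earlier owner's ID is still recorded, for example if the owner field is cleared (or that write becomes visible) too late relative to the release, or if two threads both believe they hold the lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for DR's mutex_t: only the fields needed here. */
typedef struct {
    bool held;  /* toy lock state, not DR's real lock word */
    long owner; /* 0 means "no owner recorded" */
} toy_mutex_t;

/* Acquire the toy lock. Returns true if the acquire-side check
 * (!lock->owner) passes, false where ASSERT(!lock->owner) would fire. */
static bool
toy_lock(toy_mutex_t *lock, long tid)
{
    lock->held = true;
    if (lock->owner != 0)
        return false; /* a previous owner is still recorded */
    lock->owner = tid;
    return true;
}

/* Correct release: clear the owner before the lock becomes available. */
static void
toy_unlock(toy_mutex_t *lock)
{
    lock->owner = 0;
    lock->held = false;
}

/* Broken release: the lock becomes available while owner is still set. */
static void
toy_unlock_stale_owner(toy_mutex_t *lock)
{
    lock->held = false;
}
```

In this toy model, a toy_lock() that follows toy_unlock() passes the check, while a toy_lock() that follows toy_unlock_stale_owner() fails it. In a real multi-threaded run the same symptom could come from ordering or mutual-exclusion problems rather than from a literally missing store.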

The guest binary is built with armclang and linked to the Arm Performance Libraries on RHEL7.5:

armclang -fopenmp -armpl=lp64,parallel test_case.c -o test_case.exe

It fails without clients:

drrun ./test_case.exe

These also appear during the -debug run:

<get_memory_info mismatch! (can happen if os combines entries in /proc/pid/maps)
        os says: 0x0000fffde4000000-0x0000fffe0c000000 prot=0x00000000
        cache says: 0x0000fffde4000000-0x0000fffe08000000 prot=0x00000000

<ran out of stolen fd space>

It takes between 3 and 60 runs to get the assert to fire, and it only seems to fail on ThunderX2 machines.
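
Because the failure is intermittent, repeating the run until it fails can be automated. Below is a rough POSIX sketch of such a loop. The helper name first_failing_run is hypothetical, and it assumes that a failing debug run ends with a non-zero exit status or a signal, which is an assumption and not something confirmed above. It has no timeout, so it would not detect the release-build hang.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the command in argv up to max_runs times.
 * Returns the 1-based number of the first run that did not exit with
 * status 0 (including death by signal), 0 if every run succeeded,
 * or -1 if fork() or waitpid() failed. */
static int
first_failing_run(char *const argv[], int max_runs)
{
    for (int run = 1; run <= max_runs; run++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127); /* exec failed; counts as a failing run */
        }
        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
            return run;
    }
    return 0;
}
```

For this report, argv could be something like { "drrun", "-debug", "./test_case.exe", NULL } with max_runs set somewhat above 60.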

Running with -loglevel 3 gives the following thread statistics:

(Begin) Thread statistics @6735 global, 0 thread fragments (0:05.953):
                      BB fragments targeted by IBL (thread):                 3
                               Fcache exits, total (thread):                 4
                            Fcache exits, from BBs (thread):                 4
             Fcache exits, total indirect branches (thread):                 3
         Fcache exits, non-trace indirect branches (thread):                 3
   Fcache exits, ind target in cache but not table (thread):                 3
             Fcache exits, from BB, ind target ... (thread):                 3
              Fcache exits, BB->BB, ind target ... (thread):                 3
             Fcache exits, dir target not in cache (thread):                 1
                                Special heap units (thread):                 1
                           Peak special heap units (thread):                 1
             Current special heap capacity (bytes) (thread):             65536
                Peak special heap capacity (bytes) (thread):             65536
                              Heap headers (bytes) (thread):                56
                          Heap align space (bytes) (thread):                12
                     Peak heap align space (bytes) (thread):                12
                     Heap bucket pad space (bytes) (thread):              1136
                Peak heap bucket pad space (bytes) (thread):              1136
                            Heap allocs in buckets (thread):                15
                        Heap allocs variable-sized (thread):                 7
                             Total reserved memory (thread):            393216
                        Peak total reserved memory (thread):            393216
               Guard pages, reserved virtual pages (thread):                 4
          Peak guard pages, reserved virtual pages (thread):                 4
                    Current stack capacity (bytes) (thread):             65536
                       Peak stack capacity (bytes) (thread):             65536
                              Heap claimed (bytes) (thread):             17192
                         Peak heap claimed (bytes) (thread):             17192
                     Current heap capacity (bytes) (thread):             65536
                        Peak heap capacity (bytes) (thread):             65536
              Current total memory from OS (bytes) (thread):            393216
                 Peak total memory from OS (bytes) (thread):            393216
                      Current vmm blocks for stack (thread):                 3
                         Peak vmm blocks for stack (thread):                 3
               Current vmm blocks for special heap (thread):                 3
                  Peak vmm blocks for special heap (thread):                 3
                  Our virtual memory blocks in use (thread):                 6
             Peak our virtual memory blocks in use (thread):                 6
             Allocations using multiple vmm blocks (thread):                 2
                Blocks used for multi-block allocs (thread):                 6
         Current vmm virtual memory in use (bytes) (thread):            393216
            Peak vmm virtual memory in use (bytes) (thread):            393216
                              Number of safe reads (thread):                17
(End) Thread statistics

Does anything look unusual?

There is a lot of other thread-related tracing in the logs, but I don't know what to look for.

Clearly ownable and lock->owner are contradicting each other. Where and when could that be happening?

Thanks
