DynamoRIO issue #4060
Issue created Jan 27, 2020 by Derek Bruening (@derekbruening)

Switch raw2trace from a per-instr hashtable to a per-block hashtable

For #3977 I changed raw2trace to store per-instruction data with the block start PC as part of the key, to handle duplicated instructions in multiple blocks. However, raw2trace was still doing a hashtable lookup for every individual instruction, and that lookup took 10% of CPU time (with i/o disabled). Worse, with the best C++ STL table that lookup rose to 28% of the time (see #2056 (closed) for the initial measurements; I re-measured with the new block,instr keys) and showed a significant slowdown in total run time:

hashtable_t in HEAD:

6.39user 0.09system 0:06.49elapsed 99%CPU (0avgtext+0avgdata 10964maxresident)k
6.33user 0.07system 0:06.41elapsed 99%CPU (0avgtext+0avgdata 10964maxresident)k
6.45user 0.06system 0:06.57elapsed 99%CPU (0avgtext+0avgdata 10904maxresident)k

c++11 std::unordered_map.find, max_load_factor 0.5, init size 1<<16, custom hash+cmp:

7.61user 0.09system 0:07.71elapsed 99%CPU (0avgtext+0avgdata 11360maxresident)k
7.51user 0.04system 0:07.55elapsed 99%CPU (0avgtext+0avgdata 11516maxresident)k
7.44user 0.07system 0:07.60elapsed 99%CPU (0avgtext+0avgdata 11428maxresident)k
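
For readers who want to picture the measured C++ configuration, here is a minimal sketch of a per-instruction `std::unordered_map` keyed by (block start PC, instr PC), with an initial bucket count of 1<<16 and a max load factor of 0.5. The names `instr_key_t`, `instr_info_t`, `instr_key_hash_t`, `instr_key_equal_t`, and `make_instr_table` are hypothetical, and the hash mix is only illustrative; it is not the code or the hash function used for the timings above.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical key: (block start PC, instruction PC).
using instr_key_t = std::pair<std::uint64_t, std::uint64_t>;

// Hypothetical per-instruction payload.
struct instr_info_t {
    int length;
};

// Custom hash: combines the two PCs. The mixing constant is illustrative only.
struct instr_key_hash_t {
    std::size_t
    operator()(const instr_key_t &key) const
    {
        std::size_t h1 = std::hash<std::uint64_t>()(key.first);
        std::size_t h2 = std::hash<std::uint64_t>()(key.second);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

// Custom comparison: both the block PC and the instr PC must match.
struct instr_key_equal_t {
    bool
    operator()(const instr_key_t &a, const instr_key_t &b) const
    {
        return a.first == b.first && a.second == b.second;
    }
};

using instr_table_t = std::unordered_map<instr_key_t, instr_info_t, instr_key_hash_t,
                                         instr_key_equal_t>;

// Builds a table with at least 1<<16 buckets and a max load factor of 0.5.
inline instr_table_t
make_instr_table()
{
    instr_table_t table(1 << 16);
    table.max_load_factor(0.5f);
    return table;
}
```

With this layout every instruction costs one `find` on a two-part key, which is the per-instr lookup cost that the rest of this issue aims to remove.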

As part of #3316 (closed) I'm trying to remove all reliance on the full DR library from raw2trace. The use of the drcontainers hashtable_t is one such reliance. Since it would be rather hacky to try to make a version of drcontainers with no DR dependencies (it has persistence support and uses DR locks), I'm instead proposing refactoring the raw2trace hashtables to store per-block info, query per block, remember the last block, and store per-instr info in a vector inside the block. All callers have access to the instr count and instr index, provided we store the index in a note for the elision walk.
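
The proposed layout can be sketched roughly as follows: a table keyed by block start PC, a cached pointer to the last block queried, and a vector of per-instr records indexed by position within the block. This is a simplified illustration under assumed names (`block_cache_t`, `block_summary_t`, `instr_summary_t`, and the `table_lookups` counter are all hypothetical), not the actual raw2trace code.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical per-instruction record.
struct instr_summary_t {
    std::uint64_t pc;
    int length;
};

// Hypothetical per-block record: instrs are stored in block order, so the
// instr index within the block is the vector index.
struct block_summary_t {
    std::uint64_t start_pc;
    std::vector<instr_summary_t> instrs;
};

class block_cache_t {
public:
    // Creates (or resets) the record for a block with instr_count instrs.
    block_summary_t *
    add_block(std::uint64_t start_pc, std::size_t instr_count)
    {
        block_summary_t &block = table_[start_pc];
        block.start_pc = start_pc;
        block.instrs.assign(instr_count, instr_summary_t{ 0, 0 });
        return &block;
    }

    // Returns the block, consulting the table only when the request is not
    // for the most recently found block.
    block_summary_t *
    lookup_block(std::uint64_t start_pc)
    {
        if (last_block_ != nullptr && last_block_->start_pc == start_pc)
            return last_block_;
        ++table_lookups_;
        auto it = table_.find(start_pc);
        if (it == table_.end())
            return nullptr;
        // Pointers to unordered_map elements stay valid across rehashes.
        last_block_ = &it->second;
        return last_block_;
    }

    // Per-instr access is a vector index once the block is known.
    instr_summary_t *
    lookup_instr(std::uint64_t start_pc, std::size_t index)
    {
        block_summary_t *block = lookup_block(start_pc);
        if (block == nullptr || index >= block->instrs.size())
            return nullptr;
        return &block->instrs[index];
    }

    // Number of times the hashtable itself was consulted.
    std::size_t
    table_lookups() const
    {
        return table_lookups_;
    }

private:
    std::unordered_map<std::uint64_t, block_summary_t> table_;
    block_summary_t *last_block_ = nullptr;
    std::size_t table_lookups_ = 0;
};
```

Walking all instrs of one block through `lookup_instr` hits the table once and then only indexes the vector, which is the property the proposal relies on.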

I implemented this and the speedup is nice (remember this is w/o i/o):

5.22user 0.11system 0:05.34elapsed 99%CPU (0avgtext+0avgdata 10728maxresident)k
5.21user 0.05system 0:05.27elapsed 99%CPU (0avgtext+0avgdata 10704maxresident)k
5.30user 0.06system 0:05.37elapsed 99%CPU (0avgtext+0avgdata 10664maxresident)k

The time spent in hashtable_lookup is now 0.95%. So we should be able to swap to the C++ table without noticeable overhead, making it easier to move raw2trace to use drdecode.
