metaseq merge request !164

Singleton checkpoint needs to include decoder.version to run correctly

Merged Administrator requested to merge github/fork/patrickvonplaten/patch-1 into main Jun 21, 2022

Created by: patrickvonplaten

If we don't transfer "decoder.version" to the singleton checkpoint, a very sneaky bug occurs. It was found by @thomasw21 as part of this PR: https://github.com/huggingface/transformers/pull/17785

If the decoder.version param is not present in the state_dict, then loading the singleton checkpoint sets the loaded layer_norm to None here: https://github.com/facebookresearch/metaseq/blob/e0c4f6b0e4c523906ad8d561f727e3f2ac3a8e73/metaseq/models/transformer.py#L932
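To make the failure mode concrete, here is a minimal torch-free sketch of the pattern. It is not the metaseq code: `ToyDecoder`, the placeholder string for the layer norm, the fallback version of 1, and the `<= 2` threshold are all assumptions chosen for illustration. Only the outcome (a missing version key leaves `layer_norm` as None) comes from the description above.

```python
class ToyDecoder:
    """Toy stand-in for a decoder; hypothetical, not metaseq code."""

    def __init__(self):
        # Placeholder for a real final LayerNorm module.
        self.layer_norm = "final_layer_norm"

    def upgrade_state_dict(self, state_dict, prefix="decoder"):
        version_key = f"{prefix}.version"
        # Assumption: a missing key silently falls back to version 1.
        version = state_dict.get(version_key, [1.0])[0]
        if version <= 2:
            # Assumption: old-format checkpoints have no final layer norm,
            # so the module is dropped without any warning.
            self.layer_norm = None
        return state_dict


# Checkpoint that carries the version key: the layer norm survives.
with_version = ToyDecoder()
with_version.upgrade_state_dict({"decoder.version": [3.0]})

# Checkpoint without the key: the layer norm is silently removed.
without_version = ToyDecoder()
without_version.upgrade_state_dict({})
```

Nothing raises in the second case, which is why the bug is easy to miss: the model loads and runs, just without its final layer norm.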

So it's absolutely crucial that we include this variable.
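A rough sketch of what "including the variable" could look like when building a singleton state dict, plus a guard that fails loudly if the key is missing. The helpers `build_singleton` and `check_singleton` are hypothetical, and plain lists stand in for tensors; the real conversion script may merge shards differently.

```python
VERSION_KEY = "decoder.version"


def build_singleton(shards):
    """Merge per-shard state dicts (toy: concatenate lists) and keep the version key."""
    singleton = {}
    for name in shards[0]:
        if name == VERSION_KEY:
            continue  # not a sharded weight, handled below
        singleton[name] = [x for shard in shards for x in shard[name]]
    # Carry the version over explicitly so it is not lost during merging.
    if VERSION_KEY in shards[0]:
        singleton[VERSION_KEY] = list(shards[0][VERSION_KEY])
    return singleton


def check_singleton(state_dict):
    """Raise instead of letting a loader fall back to a default version."""
    if VERSION_KEY not in state_dict:
        raise KeyError(f"{VERSION_KEY} missing from singleton checkpoint")


shards = [
    {"decoder.version": [3.0], "w": [1, 2]},
    {"decoder.version": [3.0], "w": [3, 4]},
]
singleton = build_singleton(shards)
check_singleton(singleton)  # passes: the key was carried over
```

A check like this could run right after conversion, so a checkpoint without the key is rejected before it is uploaded.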

I will update all of the converted HF checkpoints at https://huggingface.co/models?other=opt_metasq later today, and then I think we can be sure that OPT works correctly :partying_face:
