Created by: stephenroller
--arch transformer_lm and --tensor-init-on-gpu are incompatible (at least with FSDP).
Using the two together throws an exception about mixing fp32 and fp16.