100% agree, Paul, and nice points. There's very little info about anything the model can do. I guess they don't have inference results yet, or the results aren't remarkable enough to release.

I thought the same thing. Parameter count isn't the only metric for comparing two models, and it may not even be the best one in many cases. But because I had no way to back up the claim that "maybe M6's parameters don't have the quality of GPT-3's," I preferred not to speculate on that aspect.