Saying "they can't be compared" is not the same as saying "there's no info to compare." In the case of parameters, you claim that because GPT-3 is dense and M6 is sparse, they can't be compared (which I didn't question, though I could). In the case of performance, there's no info to claim either comparability or incomparability.

I don't see the point of the criticism given that I didn't make any quantitative claim about performance.

