EfficientNetV2: Smaller Models and Faster Training

Training-aware neural architecture search, combined with scaling, is used to find a family of models whose size can be tuned by parameter count and FLOPs. Improvements over the previous EfficientNets:

  1. progressive training, which starts on small images with weak regularization and progressively moves to larger images with stronger regularization (more augmentation), speeding up training while preserving generalization.
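  2. combining the depthwise 3x3 convolution and the 1x1 expansion convolution of an MBConv block into a single full 3x3 convolution (Fused-MBConv) in the early stages, which costs more FLOPs but runs faster on accelerators such as TPUs and GPUs.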
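
The progressive training in item 1 can be sketched as a per-stage schedule that linearly interpolates image size and regularization strength from a starting value to a final value. This is a minimal sketch, not the authors' implementation: the function name `progressive_schedule`, the number of stages, and the size and magnitude ranges are illustrative assumptions, and the regularization value stands in for whatever knob is used (augmentation magnitude, dropout rate, and so on).

```python
def progressive_schedule(num_stages, size_range, reg_range):
    """Return a list of (image_size, reg_strength) pairs, one per stage.

    Both quantities are linearly interpolated from their start value
    (first stage) to their end value (last stage). All names and ranges
    here are hypothetical, chosen only for illustration.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    size_start, size_end = size_range
    reg_start, reg_end = reg_range
    schedule = []
    for stage in range(num_stages):
        # Fraction of the way through training: 0.0 at the first stage,
        # 1.0 at the last. A single stage uses the final values directly.
        t = stage / (num_stages - 1) if num_stages > 1 else 1.0
        image_size = int(round(size_start + (size_end - size_start) * t))
        reg_strength = reg_start + (reg_end - reg_start) * t
        schedule.append((image_size, reg_strength))
    return schedule


# Illustrative values (assumed, not taken from the text above):
# 4 stages, image size 128 -> 300, augmentation magnitude 5 -> 15.
schedule = progressive_schedule(4, (128, 300), (5.0, 15.0))
for stage, (image_size, reg_strength) in enumerate(schedule):
    print(f"stage {stage}: image_size={image_size}, reg={reg_strength:.2f}")
```

In a training loop, each stage would resize its input batches to `image_size` and configure the augmentation or dropout with `reg_strength` before running that stage's share of the epochs.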