ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design

The authors look beyond accuracy per FLOP as the measure of efficiency: they avoid group convolution and keep channel widths as uniform as possible from layer to layer. This leads them to a new architecture with better practical performance.
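
To see why channel width matters even when the FLOP count is fixed, here is a minimal sketch. It assumes the simplified cost model for a 1x1 convolution on an h x w feature map: FLOPs = h*w*c_in*c_out, and memory access cost (MAC) = h*w*(c_in + c_out) + c_in*c_out, counting elements read and written. The function name, the 56 x 56 feature map, and the channel pairs are illustrative choices, not values taken from this summary.

```python
def conv1x1_cost(h, w, c_in, c_out):
    """Return (flops, mac) for a 1x1 convolution under the simplified model.

    flops: multiply-accumulate count, h * w * c_in * c_out
    mac:   elements moved = input map + output map + weights
    """
    flops = h * w * c_in * c_out
    mac = h * w * (c_in + c_out) + c_in * c_out
    return flops, mac


# Illustrative setup: every pair has the same c_in * c_out, so FLOPs are equal.
H, W = 56, 56
channel_pairs = [(128, 128), (64, 256), (32, 512), (16, 1024)]

rows = []
for c_in, c_out in channel_pairs:
    flops, mac = conv1x1_cost(H, W, c_in, c_out)
    rows.append((c_in, c_out, flops, mac))
    print(f"c_in={c_in:5d} c_out={c_out:5d} flops={flops} mac={mac}")
```

Under this model every pair costs the same number of FLOPs, but the balanced pair (128, 128) moves the fewest elements through memory, and the cost grows as the input and output widths diverge. This is one way FLOPs alone can misjudge real speed.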