SENet
Squeeze-and-Excitation Networks
Squeeze-and-Excitation (SE) blocks take a spatial block of activations with multiple channels,
e.g. 3x3x64.
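Average each channel over its spatial positions (the squeeze), => 1x1x64,
then apply a dense ReLU layer, => 1x1x(64/r) (the paper narrows this layer by a reduction ratio r, 16 by default, giving 1x1x4 here),
then a dense layer with sigmoid activation (the excitation), => 1x1x64.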
These per-channel weights, each between 0 and 1, are then multiplied channel-wise with the original block of activations, => 3x3x64.
The block serves as a data-dependent channel-weighting function.
It seems pretty effective, especially for small networks.
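
The squeeze, excitation, and rescale steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: the function name se_block, the weight names w1/b1/w2/b2, the random initialization, and the reduction ratio r are all assumptions made for the example (set r = 1 for a full-width hidden layer).

```python
import numpy as np

def se_block(x, w1, b1, w2, b2):
    """Apply a Squeeze-and-Excitation block to x of shape (H, W, C).

    w1: (C, C_hidden), b1: (C_hidden,)  -> dense + ReLU
    w2: (C_hidden, C), b2: (C,)         -> dense + sigmoid
    Returns the rescaled activations and the per-channel weights.
    """
    # Squeeze: average each channel over its spatial positions -> (C,)
    z = x.mean(axis=(0, 1))
    # Excitation, part 1: dense layer with ReLU -> (C_hidden,)
    h = np.maximum(z @ w1 + b1, 0.0)
    # Excitation, part 2: dense layer with sigmoid -> (C,), values in (0, 1)
    s = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))
    # Scale: broadcast the per-channel weights over the spatial dimensions
    return x * s, s

# Hypothetical example matching the 3x3x64 block above, with r = 16.
rng = np.random.default_rng(0)
H, W, C, r = 3, 3, 64, 16
x = rng.standard_normal((H, W, C))
w1 = rng.standard_normal((C, C // r)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C // r, C)) * 0.1
b2 = np.zeros(C)

y, s = se_block(x, w1, b1, w2, b2)
print(y.shape, s.shape)
```

Because s depends on x through the channel averages, the same block produces different channel weights for different inputs, which is what makes the weighting data-dependent.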