or: Learning to See by Looking at Noise

Introduces a RotNet-style pretraining method for ConvNets:
Generate random but spatially coherent images ("scribbles") on the fly.
The input to the network is a pair of randomly cropped image patches, to be classified as either:

  • both cropped from the same scribble
  • cropped from separate scribbles

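The pair-sampling setup above can be sketched as follows. The scribble generator here (a few random walks on a blank canvas) and the image/crop sizes are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def make_scribble(size=64, n_strokes=3, stroke_len=200, rng=None):
    """A random but spatially coherent image: a few random-walk
    strokes on a blank canvas (assumed stand-in generator)."""
    rng = rng or np.random.default_rng()
    img = np.zeros((size, size), dtype=np.float32)
    for _ in range(n_strokes):
        y, x = rng.integers(0, size, 2)
        for _ in range(stroke_len):
            img[y, x] = 1.0
            y = int(np.clip(y + rng.integers(-1, 2), 0, size - 1))
            x = int(np.clip(x + rng.integers(-1, 2), 0, size - 1))
    return img

def random_crop(img, crop=32, rng=None):
    """Uniformly sample a crop x crop patch from the image."""
    rng = rng or np.random.default_rng()
    h, w = img.shape
    y = rng.integers(0, h - crop + 1)
    x = rng.integers(0, w - crop + 1)
    return img[y:y + crop, x:x + crop]

def make_pair(rng=None):
    """Sample one training example: two patches plus a binary label,
    1 if both were cropped from the same scribble, 0 otherwise."""
    rng = rng or np.random.default_rng()
    same = rng.random() < 0.5
    a = make_scribble(rng=rng)
    b = a if same else make_scribble(rng=rng)
    return random_crop(a, rng=rng), random_crop(b, rng=rng), int(same)
```

A network then takes the two patches and is trained with a standard binary cross-entropy loss on the label.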
The resulting features don't transfer as well as ImageNet-classification pretraining, but it's interesting how spatial coherence alone gets you most of the way there.