Time-Contrastive Networks: Self-Supervised Learning from Video

Trains a convnet with a triplet loss on multi-view video so that the learned embedding captures joint configurations shared by humans and robots.
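
A minimal sketch of the triplet loss mentioned above, assuming embeddings are plain NumPy vectors standing in for the convnet's output. In the time-contrastive setup, the anchor and positive would be frames captured at the same moment from different viewpoints, and the negative a frame from a different moment in the same video. The function name `triplet_loss`, the margin of 0.2, and the use of squared Euclidean distance are illustrative assumptions, not values taken from this note.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge triplet loss on squared Euclidean distances, averaged over the batch.

    Each argument has shape (batch, dim). The margin value is an assumption.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # anchor-to-positive distance
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # anchor-to-negative distance
    # Penalize triplets where the positive is not closer than the negative by `margin`.
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))


# Toy embeddings (hypothetical values, not real network outputs).
anchor = np.array([[0.0, 0.0]])    # frame at time t, view 1
positive = np.array([[0.1, 0.0]])  # frame at time t, view 2
negative = np.array([[1.0, 0.0]])  # frame at a different time, view 1

print(triplet_loss(anchor, positive, negative))  # well-separated triplet
print(triplet_loss(anchor, negative, positive))  # roles swapped: loss is positive
```

The loss is zero once the positive sits closer to the anchor than the negative by at least the margin, so training only pushes on triplets that still violate that ordering.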