or: Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

This paper introduces the “Hidden Unit that Adaptively Remembers and Forgets”, later known as the Gated Recurrent Unit, or GRU.

see comparison