GRU // Lexicon

or: Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation

This paper introduces the “Hidden Unit that Adaptively Remembers and Forgets”, later known as the Gated Recurrent Unit, or GRU.