or: An Empirical Exploration of Recurrent Network Architectures

The authors search for replacements for the LSTM and GRU cells. Their MUT1 cell improves on them only marginally, which suggests that LSTM and GRU are already pretty solid.

See also: a comparison of recurrent units.