Bit Diffusion
or: Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
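The authors adapt a continuous diffusion model to generate categorical or discrete numeric data. Each discrete value is encoded as a series of bits, ordered from most to least significant, and the model generates those bits as real numbers ("analog bits") that are thresholded back to 0/1 at the end. The bits are generated jointly by the diffusion process, not autoregressively.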
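To make the bit representation concrete, here is a minimal sketch of the encode/decode round trip for one numeric value. The function names are hypothetical, and the mapping of bits to -1/+1 with a threshold at zero is an assumption made for illustration, not a transcription of the authors' code. The "noisy" list is only a stand-in for an imperfect model output.

```python
def int_to_analog_bits(value, num_bits=8, scale=1.0):
    """Encode a non-negative integer as analog bits, most significant bit first."""
    if not 0 <= value < 2 ** num_bits:
        raise ValueError("value does not fit in num_bits")
    bits = [(value >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]
    # Assumed convention: map {0, 1} to {-scale, +scale} so the bits
    # can be treated as real numbers by a continuous diffusion model.
    return [(2 * b - 1) * scale for b in bits]


def analog_bits_to_int(analog_bits):
    """Threshold real-valued bits at zero and reassemble the integer."""
    value = 0
    for x in analog_bits:
        value = (value << 1) | (1 if x > 0 else 0)
    return value


pixel = 200
clean = int_to_analog_bits(pixel)
# Stand-in for a model output: same signs as the clean bits, perturbed magnitudes.
noisy = [x * 0.6 + 0.1 for x in clean]
recovered = analog_bits_to_int(noisy)
```

As long as each generated value lands on the correct side of the threshold, the original integer is recovered exactly, which is why the model's real-valued outputs do not need to be precise.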
They also substitute their diffusion model for the transformer decoder in an image captioning model. Because each word token is encoded as a series of bits, the output per token has dimension log2(K), rounded up, instead of being a K-sized 1-hot vector, where K is the vocab size.
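The size difference between a bit code and a 1-hot vector can be checked with a few lines. This is a sketch only: the vocabulary size of 32,000 and the helper names are assumptions chosen for illustration, not values taken from these notes.

```python
def bits_per_token(vocab_size):
    """Smallest number of bits that can index vocab_size tokens, i.e. ceil(log2 K)."""
    return max(1, (vocab_size - 1).bit_length())


def token_to_bits(token_id, vocab_size):
    """Binary code for a token id, most significant bit first."""
    width = bits_per_token(vocab_size)
    return [(token_id >> shift) & 1 for shift in range(width - 1, -1, -1)]


def bits_to_token(bits):
    """Inverse of token_to_bits."""
    token_id = 0
    for b in bits:
        token_id = (token_id << 1) | b
    return token_id


vocab_size = 32_000  # hypothetical K
width = bits_per_token(vocab_size)      # per-token output size with bits
one_hot_width = vocab_size              # per-token output size with a 1-hot vector
code = token_to_bits(12_345, vocab_size)
```

With these assumed numbers, each token needs 15 outputs instead of 32,000. One caveat of this encoding: when K is not a power of two, some bit patterns decode to ids outside the vocabulary, so a real system needs a policy for those.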