How to have long-term dependencies and
still be first-order Markov
We introduce hidden states to get a hidden
Markov model:
The next hidden state depends only on the
current hidden state, but a hidden state can
carry along information from more than one
time step in the past.
The current symbol depends only on the
current hidden state.
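
The two rules above can be sketched as a small sampler. This is only an illustrative sketch, not a reference implementation: the function name sample_hmm, the two-state setup, and all probability values are assumptions made up for the example. The hidden state is "sticky", so the symbols it emits stay correlated over many time steps even though each step looks only at the current hidden state.

```python
import random


def sample_hmm(initial, transition, emission, length, seed=0):
    """Sample (hidden_states, symbols) from a discrete hidden Markov model.

    initial[i]       : probability of starting in hidden state i
    transition[i][j] : probability of moving from hidden state i to j
    emission[i][k]   : probability of emitting symbol k in hidden state i
    (All names and shapes here are assumptions for this sketch.)
    """
    rng = random.Random(seed)
    states, symbols = [], []
    state = rng.choices(range(len(initial)), weights=initial)[0]
    for _ in range(length):
        states.append(state)
        # The current symbol depends only on the current hidden state.
        symbol = rng.choices(range(len(emission[state])),
                             weights=emission[state])[0]
        symbols.append(symbol)
        # The next hidden state depends only on the current hidden state.
        state = rng.choices(range(len(transition[state])),
                            weights=transition[state])[0]
    return states, symbols


# Hypothetical parameters: two sticky hidden states, each favouring one symbol.
initial = [0.5, 0.5]
transition = [[0.95, 0.05],
              [0.05, 0.95]]
emission = [[0.9, 0.1],
            [0.1, 0.9]]

states, symbols = sample_hmm(initial, transition, emission, length=20)
print("hidden: ", states)
print("symbols:", symbols)
```

Because the hidden state rarely switches, a symbol seen many steps ago is still informative about the next symbol, although neither rule looks further back than the current hidden state.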