Last time, we introduced Markov models, which describe the probabilities of a sequence of random variables.
In a Hidden Markov Model (HMM), we do not observe the state sequence itself. We observe only the emitted symbols, and the states behind them are known to us only through probabilities. For this reason, when working with an HMM we need to consider all the paths that might have been taken through the model.
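Summing over all possible hidden paths sounds expensive, but the forward algorithm does it efficiently by reusing shared prefixes. The sketch below uses an invented two-state toy HMM (the states, alphabet, and all probabilities are my own illustration, not from the text) and checks the forward recursion against brute-force enumeration of every path:

```python
# A minimal sketch of "considering all the paths" through an HMM.
# The two-state toy model here is invented for illustration only.
from itertools import product

states = ["hot", "cold"]            # hidden states S
alphabet = ["small", "large"]       # output alphabet K
pi = {"hot": 0.6, "cold": 0.4}      # initial state probabilities (Pi)
A = {"hot":  {"hot": 0.7, "cold": 0.3},   # state transition probabilities
     "cold": {"hot": 0.4, "cold": 0.6}}
B = {"hot":  {"small": 0.2, "large": 0.8},  # symbol emission probabilities
     "cold": {"small": 0.8, "large": 0.2}}

def forward(obs):
    """P(obs), summed over every hidden path, in O(T * |S|^2) time."""
    alpha = {s: pi[s] * B[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: sum(alpha[r] * A[r][s] for r in states) * B[s][o]
                 for s in states}
    return sum(alpha.values())

def brute_force(obs):
    """The same quantity, by explicitly enumerating all |S|^T paths."""
    total = 0.0
    for path in product(states, repeat=len(obs)):
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total
```

The brute-force version makes the point concrete: the probability of an observed sequence is the sum over every way the hidden states could have produced it, and the forward algorithm computes that same sum without enumerating the paths.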
Manning and Schütze note that "HMMs are useful when one can think of underlying events probabilistically generating surface events." Thus,
- Parts of speech (or other classifiers) might be one such type of underlying series of events generating the actual words of a text.
- HMMs are one of a class of models for which there exist efficient methods of training through use of the Expectation Maximization (EM) algorithm.
- HMMs can be used to generate parameters for linear interpolation of n-gram models.
HMMs are specified by five factors:
- A set of states (S).
- An output alphabet (K).
- Initial state probabilities (Π).
- State transition probabilities (A).
- Symbol emission probabilities (B).
Once an HMM is specified, one can easily set up a computer program to simulate the running of a Markov process and to produce an output sequence. The important point here is that the program operates to simulate a Markov process. The "real" Markov process, as it were, is a given set of data, such as a text, that we assume was generated through a hidden Markov process.
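Such a simulation can be set up directly from the five factors above. The sketch below uses an invented two-state toy model (all names and numbers are my own illustration): draw an initial state from Π, then at each step emit a symbol from B and move to the next state according to A.

```python
# A minimal sketch of simulating an HMM once its five factors
# (S, K, Pi, A, B) are specified. The toy numbers are invented.
import random

states = ["hot", "cold"]                  # S: set of states
alphabet = ["small", "large"]             # K: output alphabet
pi = {"hot": 0.6, "cold": 0.4}            # Pi: initial state probabilities
A = {"hot":  {"hot": 0.7, "cold": 0.3},   # A: state transition probabilities
     "cold": {"hot": 0.4, "cold": 0.6}}
B = {"hot":  {"small": 0.2, "large": 0.8},  # B: symbol emission probabilities
     "cold": {"small": 0.8, "large": 0.2}}

def draw(dist, rng):
    """Sample one key from a {outcome: probability} dict."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]

def simulate(T, seed=None):
    """Run the Markov process for T steps; return (state path, outputs)."""
    rng = random.Random(seed)
    path, outputs = [], []
    state = draw(pi, rng)                 # start state from Pi
    for _ in range(T):
        path.append(state)
        outputs.append(draw(B[state], rng))  # emit a symbol from B
        state = draw(A[state], rng)          # transition according to A
    return path, outputs
```

Run forward, the program produces an output sequence; the "real" process runs the other way, where we are handed the output sequence (the text) and the state path stays hidden.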
I don't think we should ignore the distinction being made between simulated and real Markov processes. It's a key operation for producing knowledge of the real, and I suspect that with further consideration we would agree it carries several assumptions that also hold in the domains of computerization generally, new media, and language.