Monday, April 25, 2011

Blogging Markov Models: What Is a Markov Model?

Continuing a romp through Chapter 9 of Foundations of Statistical Natural Language Processing (1999) by Christopher D. Manning and Hinrich Schütze.

Section 9.1 defines Markov models and introduces several key terms. Basically, a Markov model describes the probabilities of random variables in a sequence. One of the conditions defining Markov models is that to predict a future random variable, all we need to know is the value of the present one; given the present random variable, we don't need to know the values of all the past elements.

Thus, if I have a finite set of elements, such as an alphabet (and, I imagine, attendant marks), the next letter in the chain will be conditioned to a degree by the present letter. A robust Markov model will be able to generate valid sequences of letters. The set of elements is called the state space, and the possible future elements are transitions.

If a text can be encoded as a Markov process, then we can calculate the probability of a certain sequence of states in that text. And while we use the present state to predict the transition state, we can extend the present backward and use a number of previous states to determine what we expect the next state to be.

Next up: Hidden Markov Models

No comments: