Markov Chains: The Sad Case of Mr. Markov and What’s His Face

“Whaaa? Markov Chains? aint nobody got time fo dat!”
“Be silent!”

They say that if you have a room full of monkeys bashing on typewriters, you’ll eventually get The Bible/Hamlet/the IKEA catalog. But they say a lot of things, so we’re going to make these monkeys.

And not just any old type of monkeys – smart monkeys. Because today, my dear face-owning anonymous reader, you’re going to learn about a smart monkey – a Markov Chain.

But first, let’s take any monkey – how would it generate text?

Naive Monkey

This is your average monkey. He sees a keyboard, goes “Fuck hell, this is not a banana!” and starts bashing away at random. Here’s a letter I received from a certain Mr. Banana:

ct0 msyv3v9.k,ah8,9o3iu w2lrqhr hq8hm0u9vacog5.l46 ~ Mr. Banana

If only poems could speak.

His method has the advantage of being easy to implement. In fact, here it is:

var generate = function (len, chars) {
    var ret = '',
        charsLen = chars.length;

    while (len--) {
        ret += chars[Math.floor(Math.random() * charsLen)];
    }

    return ret;
};

It has the disadvantage of being utter bullshit. To generate the two-letter word 'in', it had to run approximately 1.5k times. Seriously, try it:

var goal  = 'in',
    len   = goal.length,
    chars = '0123456789 abcdefghijklmnopqrstuvwxyz,.';

for (var i = 0; generate(len, chars) !== goal; i++);
console.log(i);
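
A quick sanity check on that number, given the 40-character alphabet above:

//each attempt produces a specific 2-letter string with probability (1/40)^2,
//so on average we need 40^2 attempts before we hit 'in'
console.log(Math.pow(40, 2)); //1600, i.e. roughly 1.5k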

At that rate, we can forget our dreams about a Hemnes pine desk! Whatever shall we do!?

Wait! Is that a monkey? No! Is that a naive monkey!? No! It’s a

Markovian Monkey

This monkey had an epiphany:

If I see which letters are more likely to follow other letters, then I can make an educated guess! ~ Mr. Banana II

A great idea, Mr. Banana II, and how oddly fitting to our subject!

Note: The following section contains Math and may contain traces of trigonometry. If you don’t care for the theory, skip ahead to the next section (the implementation).

Wikipedia tells me that a Markov Chain is a probabilistic data structure: a state machine that transitions randomly, and whose next step depends only on the current state.

An example 2-state Markov Chain

A Markov Chain is a bunch of states which satisfy the Markov Property. The accurate definition of that property is annoying, so here’s the gist of it: Memorylessness. The next state depends only on the current state, not on any of the earlier ones. So we can say:

P(Xn+1 = x | X1 = x1, X2 = x2, …, Xn = xn) = P(Xn+1 = x | Xn = xn)

The Markov Property

We’re not like the Capulets, for we care not for the ancestry (deny thy father and refuse thy name, or if thou wilt not be but sworn my love, and I’ll no longer be a Capulet…).

We can see that property in the example above (try it out, don’t believe me):

Markov Property as seen on TV

And so forth.
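
If you want a concrete (and completely made-up) example: say the chain has two states, Sunny and Rainy, and from Sunny we move to Rainy with probability 0.3. Then

P(tomorrow = Rainy | today = Sunny, yesterday = Rainy) = P(tomorrow = Rainy | today = Sunny) = 0.3

Yesterday’s weather contributes nothing; only today’s state matters.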

But you don’t have to fully comprehend that to utilize a Markov Chain, so you can just go ahead to

The Implementation

Implementing a Markov Chain is storing a dictionary: The key is the current state, the value is an array of possible next states. Whenever you want to move to the next state, you look up the current state and pick a random state leading from it. Code speaks louder than words:

{
    state0 : [nextState0, nextState1, ...],
    ...
}
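
The “pick a random state” part is literally one line (a sketch; current here stands for whatever state you’re in):

//pick a uniformly random successor of the current state
var next = memory[current][Math.floor(Math.random() * memory[current].length)];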

And then you’re done. How awesome is that!? Let’s get to it! First, we need somewhere to store the memory. I know!

var memory = {};

Perfect. Now, to the learning function!

var learn = function (txt) {
    var parts = txt.split(''),
        prev  = ''; //beginning-of-word

    parts.reduce(learnLetter, memory);

    function learnLetter (mem, letter) {
        if (!mem[prev]) {
            mem[prev] = [];
        }
        mem[prev].push(letter);
        prev = letter;

        return mem;
    }
};

(Note that here we keep track of the starting letter. That is optional, but I prefer it for greater accuracy. Also note that I will represent this special non-letter with ε, epsilon.)

Each iteration, we append the current letter to the previous state’s list of possible successors (initializing that list if necessary), and the current letter then becomes the new previous state.
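
Feed it a word and see what sticks:

learn('abc');
console.log(memory);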

After the sample input 'abc', memory will be:

    {
        '' : ['a'],
        a  : ['b'],
        b  : ['c']
    }

State representation after sample input

And what if we now teach it something a bit spicier, say 'acaaa'?

    {
        '' : ['a', 'a'],
        a  : ['b', 'c', 'a', 'a'],
        b  : ['c'],
        c  : ['a']
    }

State representation after second sample input

Now that our machine can grow, we need to query it. Like learning, that is also a simple procedure, whose logic is apparent in the state machines above: Given a current state, pick a next state randomly from its list of attached states (assumes Array.prototype.random, available in sources later):

var ask = function (len, state) {
    if (!len) {
        return state;
    }

    //beginning-of-word
    if (!state) {
        state = '';
    }

    var nextAvailable = memory[state] || [''],
        next = nextAvailable.random();

    //we don't have anywhere to go
    if (!next) {
        return state;
    }
    return state + ask(len-1, next);
};
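
A quick smoke test (this assumes the Array.prototype.random helper, which appears with the full sources near the end of the post):

learn('abc');
learn('acaaa');

console.log(ask(5)); //e.g. "acaab"; the output varies between runs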

You probably won’t get very impressive results with the memory it currently has, but do remember that the chain is a statistical model – that is, the more info you give it, the more accurate it will be.

But now let’s expand our implementation, as it’s currently a bit weak. We should wrap it in an object and make it a little more flexible. Currently, we only learn individual letters, but we can easily expand that into words!

var markov = {
    memory    : {},
    separator : '',

    learn : function (txt) {
        var mem = this.memory;
        this.breakText(txt, learnPart);

        function learnPart (key, value) {
            if (!mem[key]) {
                mem[key] = [];
            }
            mem[key].push(value);

            return mem;
        }
    },

    ask : function (seed) {
        //beginning-of-word
        if (!seed) {
            seed = '';
        }

        return this.step(seed, [seed]).join(this.separator);
    },

    step : function (state, ret) {
        var nextAvailable = this.memory[state] || [''],
            next = nextAvailable.random();

        //we don't have anywhere to go
        if (!next) {
            return ret;
        }

        ret.push(next);
        return this.step(next, ret);
    },

    breakText : function (txt, cb) {
        var parts = txt.split(this.separator),
            prev = '';

        parts.forEach(step);
        cb(prev, ''); //end-of-word

        function step (next) {
            cb(prev, next);
            prev = next;
        }
    }
};

Note some points of interest. Regarding the separator: In learn, we split the input by the separator, and in ask we join the resulting array back together with that same separator.

In addition, we now also keep track of end-of-word. I also disposed of the stupid length argument – who needs that anyway? We’ve also extracted functionality – learn now delegates the tokenization to breakText, and ask delegates to step.

So how can we teach it words?

markov.separator = ' ';
markov.learn('mary had a little lamb');
console.log(markov.ask());
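
With only one sentence learned, every state has exactly one successor, so the answer is deterministic:

> markov.ask();
" mary had a little lamb"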

(Note the leading space. Fixing that will be left as an exercise to the reader.)

What we made is called a first-order (or order 1) Markov Chain, because it relies on one previous state. But that makes our monkey weird – it may be able to identify that r is never followed by x, but it fails to realize that zex doesn’t ever appear! We need to make amends!

Here’s an example for an Order-2 machine (forgive the ugliness):

Example of an Order-2 Markov Chain
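
To make the idea concrete, here’s roughly what the memory of an order-2 letter chain looks like after learning 'abcab' (each state is now the pair of preceding letters, with ε standing in for beginning/end as before):

{
    'ε,ε' : ['a'],
    'ε,a' : ['b'],
    'a,b' : ['c', 'ε'], //'ab' occurs twice: once mid-word, once at the very end
    'b,c' : ['a'],
    'c,a' : ['b']
}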

To make an Order-n chain, we have to make very few changes to our code. There’s only one important difference: we now deal with arrays instead of single values. That creates two problems:

  1. For input (learning & seedless queries), we have to generate the starting state with n epsilons
  2. When moving on the chain, we need to use arrays, not single values

The first problem is simple to solve. We add another function to our nice object:

order : 2, //for an order-2 chain

genInitial : function () {
    var ret = [];

    for (
        var i = 0;
        i < this.order;
        ret.push(''), i++
    );

    return ret;
}
To fix (1), we need to change two places: The default value of seed in markov.ask, and the initial value of prev in markov.breakText.

ask : function (seed) {
    if (!seed) {
        seed = this.genInitial();
    }

    return seed.concat(this.step(seed, [])).join(this.separator);
}

//inside of markov.breakText, replace where relevant
    prev = this.genInitial();

Fixing (2) is also rather trivial. We only have to change the step functionality.

//at the end of markov.step
var nextState = state.slice(1);
nextState.push(next);
return this.step(nextState, ret);

//in markov.breakText
function step (next) {
    cb(prev, next);
    prev.shift();
    prev.push(next);
}
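
In both places the state now slides like a window: we drop the oldest element and append the newest one. For instance:

var state = ['a', 'b'],
    nextState = state.slice(1); //['b']

nextState.push('c'); //nextState is now ['b', 'c']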

…and we’re done.

The Finale

var markov = {
    memory    : {},
    separator : '',
    order     : 2,

    learn : function (txt) {
        var mem = this.memory;
        this.breakText(txt, learnPart);

        function learnPart (key, value) {
            if (!mem[key]) {
                mem[key] = [];
            }
            mem[key].push(value);

            return mem;
        }
    },

    ask : function (seed) {
        if (!seed) {
            seed = this.genInitial();
        }

        return seed.concat(this.step(seed, [])).join(this.separator);
    },

    step : function (state, ret) {
        var nextAvailable = this.memory[state] || [''],
            next = nextAvailable.random();

        //we don't have anywhere to go
        if (!next) {
            return ret;
        }

        ret.push(next);

        var nextState = state.slice(1);
        nextState.push(next);
        return this.step(nextState, ret);
    },

    breakText : function (txt, cb) {
        var parts = txt.split(this.separator),
            prev = this.genInitial();

        parts.forEach(step);
        cb(prev, '');

        function step (next) {
            cb(prev, next);
            prev.shift();
            prev.push(next);
        }
    },

    genInitial : function () {
        var ret = [];

        for (
            var i = 0;
            i < this.order;
            ret.push(''), i++
        );

        return ret;
    }
};

Array.prototype.random = function () {
    return this[Math.floor(Math.random() * this.length)];
};

Examples

> markov.learn('yesterday i ate pizza and it was delicious');
> markov.ask();
"yesterdered i orday it was derdelicious"
> markov.ask(['d', 'e']);
"dered pizza and pizza and i ordered pizza and pizza and it was delicious"

> markov.memory = {}; //basically, reinitialize
> markov.separator = ' '; //operate on words
> markov.learn('the quick brown fox jumped over the lazy dog');
> markov.ask();
"  the quick brown fox jumped over the lazy dog"
//we'll continue receiving the same answer

> markov.memory = {}; //let's go back a bit...
> markov.separator = ''; markov.order = 4;
> markov.learn('yesterday i ate pizza and it was delicious');
> markov.ask();
"yesterday i ate pizza and it was delicious"

Explain to yourself why the last two cases happen (why the chain reproduces its input verbatim).

Now we’ll convert the following state-machine representation of a Markov Chain to code (not practical, but a fun exercise):

Example state machine

We see that, first of all, it’s an Order 1 chain (each state holds a single chunk of information). the is always the leading word, followed by falcon in 2 out of 3 cases and by snake in the rest, and so forth.

> markov.order = 1;
> markov.separator = ' ';
> markov.learn('the falcon');
> markov.learn('the falcon likes the snake');

We begin with the simplest case: the -> falcon -> ε. Then, we add a mapping so that:

  • the -> falcon
  • falcon -> likes
  • likes -> the
  • the -> snake
  • snake -> ε
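
If you inspect markov.memory at this point, it should look something like this:

{
    ''     : ['the', 'the'],
    the    : ['falcon', 'falcon', 'snake'],
    falcon : ['', 'likes'],
    likes  : ['the'],
    snake  : ['']
}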

And that’s that.

Undoubtedly you’re tired of hearing my voice by now. So I’m going to stop talking soon. Give yourself a pat on the back, you deserve it!

Exercises left for you, my most loyal and beautiful reader (have I said how lovely you look today? Seriously, you pulled today off):

  1. Fix the output formatting (use a separator other than the empty string to see the problem)
  2. Make a front-end for learning/asking. Bonus: Make it able to consume books.
  3. Change the chain’s order, see what effect it has.
  4. After teaching it a lot of things and playing with Orders, try and fine-tune it so it produces the most English-like words. Try and generate the first sentence in the Bible (“In the beginning God created the heavens and the earth“, capitalization optional). You’ll probably need to tweak the ask function to accept a maximum length parameter.
  5. Right now, as new inputs arrive, the possible-next-states array of each state grows as well, and we count on duplications to provide the probability. That doesn’t scale too well. Change it so there are no duplications, while the chain’s behavior remains the same.

Clarification for the last exercise: We now have, for instance, this:

a : ['a', 'b', 'a', 'a']

Which is sub-optimal. Turn it into something better.
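
One possible direction (a sketch, not the only answer) is to keep counts instead of duplicates and weight the random pick by them:

//instead of  a : ['a', 'b', 'a', 'a']
a : { a: 3, b: 1 } //'a' follows 'a' three times for every one 'b'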

January 21st, 2013
