Simple Hidden Markov Models for Haskell
The hiddenmarkov-haskell
library implements a simple discretely distributed Hidden Markov Model (DDHMM) data structure.
To go along with the HMM strucure, this library includes algorithms for solving the basic problems around HMMs.
This includes implementations for:
- the forward-backward algorithm for calculating the probability of a specific string emission,
- the Viterbi algorithm for calculating the most likely hidden state sequence that generates a given string,
- and the Baum-Welch algorithm for calculating the transition probabilities for the hidden states of the HMM.
To be able to compile the application,
install the Haskell Stack.
Ensure that the stack
executable is in the system or user PATH
variable.
To download dependencies to the project directory and compile sources to binary, run stack build
in the console of your choice.
To compile on source code changes and execute the program contained in Main.hs
, run
stack run -- [ARGS]
When running the compiled application downloaded from the Releases section, simply run
> .\hiddenmarkov-haskell-exe.exe [ARGS]
The application offers the following execution modes:
print [FILE]
- print the HMM data structure defined inFILE
generate [FILE] [LENGTH]
- generate an emission string of lengthLENGTH
using the HMM defined inFILE
forward [FILE] [STRING]
- calculate the forward probability of realizationSTRING
given an HMM defined inFILE
backward [FILE] [STRING]
- calculate the backward probability of realizationSTRING
given an HMM defined inFILE
viterbi [FILE] [STRING]
- calculate the most likely hidden state sequence generating realizationSTRING
given an HMM defined inFILE
baumWelch [FILE] [STRING] [LIMIT]
- perform Baum-Welch estimation on the HMM defined inFILE
to maximize expectation ofSTRING
until the sum of all parameters is smaller thanLIMIT
(convergence limit)
To define an HMM for use with the application, you will need the following parameters:
- initial state distribution
- number of hidden states and transition matrix
- emission alphabet (characters) and emission probabilities by state
The following is an example for an HMM with three hidden states and four emission characters.
INIT 0.4 0.2 0.4
TRANS 0 0.3 0.2 0.5
TRANS 1 0.5 0.4 0.1
TRANS 2 0.8 0.1 0.1
EMIT A C G T
EMITSTATE 0 0.3 0.2 0.5 0.0
EMITSTATE 1 0.2 0.0 0.1 0.7
EMITSTATE 2 0.0 0.5 0.0 0.5
hiddenmarkov-haskell
includes a few unit tests.
These were written using the HSpec
library.
To run them, call stack test
.