Inprobability theory,aMarkov modelis astochastic modelused tomodelpseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes theMarkov property). Generally, this assumption enables reasoning and computation with the model that would otherwise beintractable.For this reason, in the fields ofpredictive modellingandprobabilistic forecasting,it is desirable for a given model to exhibit the Markov property.

Introduction

edit

There are four common Markov models used in different situations, depending on whether every sequential state is observable or not, and whether the system is to be adjusted on the basis of observations made:

Markov models
System state is fully observable System state is partially observable
System is autonomous Markov chain Hidden Markov model
System is controlled Markov decision process Partially observable Markov decision process

Markov chain

edit

The simplest Markov model is theMarkov chain.It models the state of a system with arandom variablethat changes through time. In this context, the Markov property indicates that the distribution for this variable depends only on the distribution of a previous state. An example use of a Markov chain isMarkov chain Monte Carlo,which uses the Markov property to prove that a particular method for performing arandom walkwill sample from thejoint distribution.

Hidden Markov model

edit

Ahidden Markov modelis a Markov chain for which the state is only partially observable or noisily observable. In other words, observations are related to the state of the system, but they are typically insufficient to precisely determine the state. Several well-known algorithms for hidden Markov models exist. For example, given a sequence of observations, theViterbi algorithmwill compute the most-likely corresponding sequence of states, theforward algorithmwill compute the probability of the sequence of observations, and theBaum–Welch algorithmwill estimate the starting probabilities, the transition function, and the observation function of a hidden Markov model.

One common use is forspeech recognition,where the observed data is thespeech audiowaveformand the hidden state is the spoken text. In this example, the Viterbi algorithm finds the most likely sequence of spoken words given the speech audio.

Markov decision process

edit

AMarkov decision processis a Markov chain in which state transitions depend on the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that will maximize some utility with respect to expected rewards.

Partially observable Markov decision process

edit

Apartially observable Markov decision process(POMDP) is a Markov decision process in which the state of the system is only partially observed. POMDPs are known to beNP complete,but recent approximation techniques have made them useful for a variety of applications, such as controlling simple agents or robots.[1]

Markov random field

edit

AMarkov random field,or Markov network, may be considered to be a generalization of a Markov chain in multiple dimensions. In a Markov chain, state depends only on the previous state in time, whereas in a Markov random field, each state depends on its neighbors in any of multiple directions. A Markov random field may be visualized as a field or graph of random variables, where the distribution of each random variable depends on the neighboring variables with which it is connected. More specifically, the joint distribution for any random variable in the graph can be computed as the product of the "clique potentials" of all the cliques in the graph that contain that random variable. Modeling a problem as a Markov random field is useful because it implies that the joint distributions at each vertex in the graph may be computed in this manner.

Hierarchical Markov models

edit

Hierarchical Markov models can be applied to categorize human behavior at various levels of abstraction. For example, a series of simple observations, such as a person's location in a room, can be interpreted to determine more complex information, such as in what task or activity the person is performing. Two kinds of Hierarchical Markov Models are theHierarchical hidden Markov model[2]and the Abstract Hidden Markov Model.[3]Both have been used for behavior recognition[4]and certain conditional independence properties between different levels of abstraction in the model allow for faster learning and inference.[3][5]

Tolerant Markov model

edit

A Tolerant Markov model (TMM) is a probabilistic-algorithmic Markov chain model.[6]It assigns the probabilities according to a conditioning context that considers the last symbol, from the sequence to occur, as the most probable instead of the true occurring symbol. A TMM can model three different natures: substitutions, additions or deletions. Successful applications have been efficiently implemented in DNA sequences compression.[6][7]

Markov-chain forecasting models

edit

Markov-chains have been used as a forecasting methods for several topics, for example price trends,[8]wind power[9]andsolar irradiance.[10]The Markov-chain forecasting models utilize a variety of different settings, from discretizing the time-series[9]to hidden Markov-models combined with wavelets[8]and the Markov-chain mixture distribution model (MCM).[10]

See also

edit

References

edit
  1. ^Kaelbling, L. P.; Littman, M. L.; Cassandra, A. R. (1998)."Planning and acting in partially observable stochastic domains".Artificial Intelligence.101(1–2): 99–134.CiteSeerX10.1.1.390.8474.doi:10.1016/S0004-3702(98)00023-X.ISSN0004-3702.
  2. ^Fine, S.; Singer, Y. (1998)."The hierarchical hidden markov model: Analysis and applications".Machine Learning.32(1): 41–62.doi:10.1023/A:1007469218079.
  3. ^abBui, H. H.; Venkatesh, S.; West, G. (2002)."Policy recognition in the abstract hidden markov model".Journal of Artificial Intelligence Research.17:451–499.doi:10.1613/jair.839.hdl:10536/DRO/DU:30044252.
  4. ^Theocharous, G. (2002).Hierarchical Learning and Planning in Partially Observable Markov Decision Processes(PhD). Michigan State University.
  5. ^Luhr, S.; Bui, H. H.; Venkatesh, S.; West, G. A. W. (2003)."Recognition of Human Activity through Hierarchical Stochastic Learning".PERCOM '03 Proceedings of the First IEEE International Conference on Pervasive Computing and Communications.pp. 416–422.CiteSeerX10.1.1.323.928.doi:10.1109/PERCOM.2003.1192766.ISBN978-0-7695-1893-0.S2CID13938580.
  6. ^abPratas, D.; Hosseini, M.; Pinho, A. J. (2017). "Substitutional tolerant Markov models for relative compression of DNA sequences".PACBB 2017 – 11th International Conference on Practical Applications of Computational Biology & Bioinformatics, Porto, Portugal.pp. 265–272.doi:10.1007/978-3-319-60816-7_32.ISBN978-3-319-60815-0.
  7. ^Pratas, D.; Pinho, A. J.; Ferreira, P. J. S. G. (2016). "Efficient compression of genomic sequences".Data Compression Conference (DCC), 2016.IEEE. pp. 231–240.doi:10.1109/DCC.2016.60.ISBN978-1-5090-1853-6.S2CID14230416.
  8. ^abde Souza e Silva, E.G.; Legey, L.F.L.; de Souza e Silva, E.A. (2010)."Forecasting oil price trends using wavelets and hidden Markov models".Energy Economics.32.
  9. ^abCarpinone, A; Giorgio, M; Langella, R.; Testa, A. (2015)."Markov chain modeling for very-short-term wind power forecasting".Electric Power Systems Research.122:152–158.doi:10.1016/j.epsr.2014.12.025.
  10. ^abMunkhammar, J.; van der Meer, D.W.; Widén, J. (2019). "Probabilistic forecasting of high-resolution clear-sky index time-series using a Markov-chain mixture distribution model".Solar Energy.184:688–695.doi:10.1016/j.solener.2019.04.014.S2CID146076100.