Multidimensional hidden Markov model applied to image and video analysis

Jiten, Joakim

Recent progress and prospects in cognitive vision, multimedia, human-computer interaction, communications and the Web call for, and can profit from, advanced image and video analysis. Image classification is perhaps the most important part of digital image analysis. The objective is to identify the visual features occurring in an image and to describe them in terms of distinct classes or themes. Traditional classification methods analyze independent blocks of an image, which results in a context-free formalism. However, there is fairly widespread agreement that observations should be presented as collections of features that appear in a given mutual position or shape. We therefore employ a new, efficient algorithm that models context in images with a 2-D hidden Markov model (HMM). The difficulty in applying a 2-D HMM to images is the computational complexity, which grows exponentially with the number of image blocks. The main technical contribution of this thesis is a way of estimating the parameters of a 2-D HMM in O(whN^2) complexity instead of O(wN^(2h)), where N is the number of states in the model and w and h are the width and height of the image, respectively. We investigate the performance of the proposed model (DT HMM) and search for its operating point. To introduce both global and local context into the model, the DT HMM was extended to multiple image resolutions. The results indicate that the deficiency recorded earlier can be overcome and that the model's performance is comparable to that of other known algorithms. Finally, we demonstrate the versatility of the model by presenting applications such as classification, segmentation and object tracking.
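
To give a feel for the gap between the two complexity bounds quoted in the abstract, the following minimal sketch compares them numerically. It assumes the bounds read as w·h·N^2 and w·N^(2h), ignores constant factors, and uses hypothetical function names and an arbitrary example size (an 8 x 8 grid with 4 states); it is not the estimation algorithm itself.

```python
def dt_hmm_cost(w: int, h: int, n_states: int) -> int:
    """Order-of-magnitude operation count for the O(w*h*N^2) bound (constants ignored)."""
    return w * h * n_states ** 2


def full_2d_hmm_cost(w: int, h: int, n_states: int) -> int:
    """Order-of-magnitude operation count for the O(w*N^(2h)) bound (constants ignored)."""
    return w * n_states ** (2 * h)


# Hypothetical example size: 8 x 8 image blocks, 4 hidden states.
w, h, n_states = 8, 8, 4
print(dt_hmm_cost(w, h, n_states))       # 8 * 8 * 4^2  = 1024
print(full_2d_hmm_cost(w, h, n_states))  # 8 * 4^16     = 34359738368
```

Even at this small size the exponential bound is about seven orders of magnitude larger, which illustrates why a polynomial-time estimate matters for images with many blocks.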

Data Science
© ENST Paris. Personal use of this material is permitted.