Vol. 15 (2006): Abstracts of Papers
Megyesi Z., Kós G., Chetverikov D.:
3D reconstruction from images by normal aided matching.
MGV vol. 15, no. 1, 2006, pp. 3-28.
3D models play an increasingly important role in today's computer applications. As a result, there is a need for flexible and easy-to-use measuring devices that produce 3D models of real-world objects. 3D scene reconstruction is a quickly evolving field of computer vision, which aims at creating 3D models from images of a scene. Although many problems of the reconstruction process have been solved, the use of photographs as an information source involves some practical difficulties. Therefore, accurate and dense 3D reconstruction remains a challenging task. We discuss dense matching of surfaces in the case when the images are taken from a wide baseline camera setup. Some recent studies use a region-growing based dense matching framework, and improve accuracy through estimating the apparent distortion by local affine transformations. In this paper we present a way of using pre-calculated calibration data to improve precision. We demonstrate that the new method produces a more accurate model.
Key words: image based reconstruction, 3D scene reconstruction, stereo, wide-baseline, dense matching.
MCFI-based animation tweening algorithm for 2D parametric motion flow/optical flow.
MGV vol. 15, no. 1, 2006, pp. 29-49.
In hand drawn animation, some of the animations are created at very low frame rates (12 frames per second) because of a limited budget. A simple technique for increasing frame rates is animation tweening. In fact, most animation tweening applications process only vector-based objects. Further, there is no algorithm which can automatically define extensive modelling and animation of 3D objects for the animation tweening process. In this work, a raster-based animation tweening algorithm which does not require extensive modelling and animation of 3D objects is presented. The algorithm is based on motion-compensated frame rate up-conversion (MC-FRC), whose performance depends heavily on motion estimation (ME). A more accurate flow field, which is a combination of 2D parametric motion flow field and optical flow field, is developed. The results obtained for various types of hand drawn animation scenes are illustrated. The experimental results show that higher quality in-between frames can be obtained by the proposed algorithm.
Key words: computer animation, image processing, computer vision.
Practical camera calibration and image rectification in monocular road traffic applications.
MGV vol. 15, no. 1, 2006, pp. 51-71.
In this paper we follow a practical approach to the problem of camera calibration and image formation in a typical computer vision based traffic monitoring system. Our study starts by analysing in detail the capturing scenario and the undesirable effects resulting from the application of the pin-hole camera model to traffic images. Based on the acquired experience, we present an integral method to obtain the transformation matrices required to calibrate the camera and to construct in real time a rectified and sub-sampled image where perspective effects on the road plane have been removed. The motivation for this perspective-free image is to reduce the amount of data which must be computed (while avoiding loss of relevant information), remove influences from objects external to the capturing area and simplify the operation of the subsequent detection and tracking stages. The latter statement is justified since vehicle shapes can now be approximated by rectangles, the distance between any two neighbouring points on the road plane remains constant all over the image, and parallel trajectories are restored in the rectified image.
Key words: camera calibration, image rectification, road traffic monitoring, non-linear sub-sampling, perspective correction, real-time.
Hillion A., Roux C., Donescu I., Avaro O.:
Generalized second-order invariance in texture modeling.
MGV vol. 15, no. 1, 2006, pp. 73-97.
In image processing, micro-textures are generally represented as homogeneous random fields, the term "homogeneous" indicating a second-order stationary random process. However, such a formulation is restrictive, and does not allow for the processing of anisotropic textures. The aim of this paper is to study a generalization of second-order stationarity to second-order invariance under a group of transforms, in order to apply this generalization to texture modeling and analysis. The general formulation of second-order homogeneity or G-invariance is given in relation to the framework of group theory. Two approaches are derived, taking into consideration transitive groups and generalized translations. For the latter approach, an important particular case is outlined, in which a second-order G-invariant random field X can be associated one-to-one with a second-order stationary random field. Some examples of interesting groups of transforms are given. Finally, Cholesky factorization is applied for the synthesis of random fields showing the generalized invariance property.
Key words: random field, second-order stationarity, group orbits, texture modeling, texture synthesis.
He Z., Harada K.:
Solving point-feature labeling placement problem by parallel Hopfield neural network on GPU graphics card.
MGV vol. 15, no. 1, 2006, pp. 99-120.
This paper discusses the application of parallel Hopfield neural networks in solving the point-feature labeling placement (PFLP) problem by using programmable graphics hardware found in a commodity PC. We focus on two aspects. The first aspect concerns mapping the PFLP onto a parallel Hopfield neural network. The second aspect is the detailed method of implementing the parallel Hopfield neural network on graphics hardware. We demonstrate the effectiveness of implementing the parallel Hopfield network by solving the PFLP problem. Moreover, our proposal makes use of the advantages of the parallel Hopfield network on low-cost platforms.
Key words: parallel computing, Hopfield neural network, GPU, PFLP.
Kwasnicka H., Paradowski M.:
Fast image auto-annotation with discretized feature distance measures.
MGV vol. 15, no. 2, 2006, pp. 123-140.
A new model for the image auto-annotation task is presented. The model can be classified as a fast image auto-annotation one. The main idea behind the model is to avoid various problems with feature space clustering. Neither the image segmentation nor the auto-annotation process uses any clustering algorithms. The method presented here simulates continuous feature space analysis with very dense discretization. The paper presents the new approach and discusses the results achieved with it.
Key words: image auto-annotation, fast auto-annotation, image feature extraction.
Les Z., Les M.:
Shape understanding system: 3D interpretation as a part of the visual concept formation.
MGV vol. 15, no. 2, 2006, pp. 141-175.
This paper presents a new method of interpretation of the 2D visual objects in terms of 3D geometrical or real world objects. The 3D interpretation of the visual objects depends on the class to which a given object is assigned. Each class has its own 3D interpretation method. The interpretation methods for selected classes are described and the results of testing of these methods are presented. It is shown that a visual object can be interpreted as a 3D object by assigning it to one of the shape classes. The main novelty of the presented method is that the process of interpretation is related to a visual concept represented as a set of symbolic names of the shape classes. The visual concept, which is one of the components of the category of a visual object, makes it possible to represent knowledge about the visual object in the form of a categorical structure. The presented results are part of research aimed at developing a shape understanding method that will be able to perform complex visual tasks connected with visual thinking.
Key words: shape understanding, object recognition, visual concept, visual reasoning.
Variants of pattern recognition and image understanding method used in telescope guiding system.
MGV vol. 15, no. 2, 2006, pp. 177-195.
This paper presents a suggestion for using image understanding (IU) methods in a telescope guiding system, which is an example of a real-time system. In the presented approach, the guiding system uses a knowledge base to describe a time-varying situation and respond to it adequately. The large domain of understanding (the space of image features and time) is split into subspaces where the understanding tasks are simplified. The use of the IU idea to control the telescope motors and to optimize the system parameters is also discussed in the paper.
Key words: object recognition, object visual identification, computer vision, learning in vision.
Miyazaki R., Harada K.:
Mesh construction based on differential surface property.
MGV vol. 15, no. 2, 2006, pp. 197-210.
In this paper, we propose a novel method for extraction of a feature from range data to create a simplified triangular mesh. The feature extracted by our method is based on the surface property. In the past, vertices on a highly curved surface were extracted from range data, and the extracted vertices were triangulated to construct a simplified triangular mesh. However, the vertices extracted with such a method were not suitable for creating a simplified triangular mesh in some cases. For example, when the shape shows less change in one direction, like a parabolic shape, too many vertices are extracted by the traditional method. We introduce a surface property defined by a combination of the mean curvature and the Gaussian curvature. Vertices are classified into three categories according to the surface property. Then, the structures consisting of vertices and edges are extracted suitably for constructing a simplified triangular mesh. Following that, we can create the desirable triangulation through constrained Delaunay triangulation using the extracted structures.
Key words: feature extraction, surface property, mesh simplification, mesh segmentation.
Ramakrishnan S., Selvan S.:
Wavelet-based modeling of singular values for image texture classification.
MGV vol. 15, no. 2, 2006, pp. 211-225.
A new algorithm based on the wavelet packet transform is proposed for the classification of image textures. Energy matrices are formed from subband coefficients of the wavelet packet transform. Singular value decomposition is then employed on the energy matrices. The probability density function of singular values is modeled as an exponential distribution, and the model parameter is estimated using the maximum likelihood estimation technique. The model parameter, one for each subband, is used to form the feature vector. Classification is carried out using the Kullback-Leibler Distance (KLD). Performance of the algorithm is compared with model-based and feature-based methods in terms of the signal-to-noise ratio and the classification rate. Experimental results show that the proposed algorithm achieves a better classification rate in noisy environments.
Key words: wavelet packet transform, image texture classification, singular value decomposition.
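The modeling and classification steps named in the abstract above can be illustrated with a minimal sketch. It assumes the singular values of each subband's energy matrix are already available, uses the standard maximum likelihood estimate of an exponential rate (the reciprocal of the sample mean), and uses the closed-form Kullback-Leibler divergence between two exponential distributions. The function names and the choice to sum divergences over subbands are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def exponential_rate_mle(singular_values):
    """MLE of the rate of an exponential distribution: 1 / sample mean."""
    values = np.asarray(singular_values, dtype=float)
    return 1.0 / values.mean()

def kld_exponential(rate_p, rate_q):
    """Closed-form KL divergence KL(Exp(rate_p) || Exp(rate_q))."""
    return np.log(rate_p / rate_q) + rate_q / rate_p - 1.0

def texture_distance(features_a, features_b):
    """Sum of per-subband divergences between two feature vectors of rates.

    Summing over subbands is an assumption made for this sketch.
    """
    a = np.asarray(features_a, dtype=float)
    b = np.asarray(features_b, dtype=float)
    return float(np.sum(kld_exponential(a, b)))
```

Under these assumptions, a query texture would be assigned to the class whose stored feature vector gives the smallest `texture_distance`.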
Koprowski R., Wrobel Z.:
The automatic measurement of a staining reaction level.
MGV vol. 15, no. 2, 2006, pp. 227-238.
The paper presents an application of an algorithm for automatic segmentation of biological cell structures. The algorithm, based on two morphological operations -- conditional opening and conditional closing, is described in detail in (Koprowski R., Wrobel Z.: Automatic segmentation of biological cell structures based on conditional opening or closing. Machine Graphics and Vision, 14(3), 285-307). The results of the segmentation of biological cell structure images as well as the evaluation of a staining reaction saturation level and its metrological properties are studied.
New Books Notes
MGV vol. 15, no. 2, 2006, pp. 239-240.
Robert Cierniak: Tomografia komputerowa: Budowa urzadzen CT, algorytmy rekonstrukcyjne
(Computed tomography: CT engineering, Reconstructive algorithms, in Polish).
Published by Akademicka Oficyna Wydawnicza EXIT, Warsaw 2005.
Proc. International Conference on Computer Vision and Graphics (ICCVG 2006),
Warsaw, Poland, September 25-27, 2006.
Special issue editor: Konrad Wojciechowski.
MGV vol. 15, no. 3/4, 2006, pp. 243-244.
Al-Hamadi A., Niese R., Panning A., Michaelis B.:
Toward robust face analysis method of non-cooperative persons in stereo color image sequences.
MGV vol. 15, no. 3/4, 2006, pp. 245-254.
[The winner of the Best Young Scientist's Presentation Award.]
This paper describes a novel method for analyzing single faces of non-cooperative persons on the basis of stereoscopic color images. The challenges arise from the fact that the persons observed are non-cooperative, which in turn complicates further processing such as facial feature extraction and tracking in image sequences. In our method, face detection is based on color-driven clustering of 3D points derived from stereo. A mesh model is registered with a post-processed face cluster, using a variant of the Iterative Closest Point algorithm (ICP). The pose is derived from correspondence. Then, the pose and model information are used for face normalization and facial feature localization. Automatic extraction of facial features is carried out using modified Active Shape Models (ASM). In contrast to the simple ASM, this work introduces two modifications, which lead to greater stability and robustness. The results show that stereo and color are powerful cues for finding the face and its pose, and for facial feature extraction under a wide range of poses, illumination types and expressions (PIE).
Key words: facial feature extraction, active shape models, stereo and color image analysis.
Andrysiak T., Choras M.:
Algorithms for stereovision disparity calculation in the moment space.
MGV vol. 15, no. 3/4, 2006, pp. 255-264.
This article presents various theoretical and experimental approaches to the problem of stereo matching and disparity estimation. We propose to calculate stereo disparity in the moment space, but we also present numerical and correlation-based methods. In order to calculate the disparity vector, we decided to use the discrete orthogonal moments of Tchebichef, Zernike and Legendre. In our research in stereo disparity estimation, all of those moments were tested and compared. Experimental results confirm the effectiveness of the presented methods for determining stereo disparity and stereo matching for robotics and machine vision applications.
Key words: stereovision, disparity calculation, moments space.
Arasteh S., Hung C.C.:
Color and texture image segmentation using uniform local binary patterns.
MGV vol. 15, no. 3/4, 2006, pp. 265-274.
The paper describes a new algorithm for image segmentation based on the color and texture features. The Uniform Local Binary Pattern (ULBP) method is used to extract texture features. Color features are defined based on the pixels' color bands. Image segmentation is carried out using the K-means algorithm on feature vectors, including color and texture features. The distance measure is defined as a function of the color and texture feature vector distances from the K-means defined centers. The weighting parameter is used to adjust the relative contribution of the color and texture features. The proposed algorithm is applied to color images in the RGB, HSV and IHLS color spaces. Experimental results show that the proposed algorithm yields good performance in combining color and texture features to distinguish different texture patterns. In particular, for textures with high color contrast, the results are prominent. The main advantage of the method is its speed and simplicity, which are inherited from the K-means algorithm.
Key words: color and texture segmentation, local binary patterns.
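The weighted combination of color and texture distances described in the abstract above can be sketched as a single K-means assignment step. This is a minimal illustration under assumptions: the Euclidean distance, the convex combination `weight * d_color + (1 - weight) * d_texture`, and all function and parameter names are hypothetical and not taken from the paper, and the texture features are assumed to be precomputed.

```python
import numpy as np

def combined_distance(color, texture, color_centers, texture_centers, weight):
    """Distance of every pixel to every cluster center.

    color:   (n, dc) per-pixel color features
    texture: (n, dt) per-pixel texture features (assumed precomputed)
    weight:  value in [0, 1]; 1 uses color only, 0 uses texture only
    Returns an (n, k) array of combined distances.
    """
    d_color = np.linalg.norm(color[:, None, :] - color_centers[None, :, :], axis=2)
    d_texture = np.linalg.norm(texture[:, None, :] - texture_centers[None, :, :], axis=2)
    return weight * d_color + (1.0 - weight) * d_texture

def assign_clusters(color, texture, color_centers, texture_centers, weight):
    """Assign each pixel to the cluster with the smallest combined distance."""
    distances = combined_distance(color, texture, color_centers, texture_centers, weight)
    return distances.argmin(axis=1)
```

A full segmentation would alternate this assignment with recomputing both sets of centers as the means of their assigned pixels, as in ordinary K-means.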
Bando T., Oshima T.:
Reproduction of global impression of landscape images by improved texture reconstruction methods.
MGV vol. 15, no. 3/4, 2006, pp. 275-284.
In order to reproduce the atmosphere or the global impression of the landscape images we have created model patterns by reconstructing the texture of the whole images. We have first segmented full-color landscape images into patches of the same color after converting them into 216-color images, and then analyzed the size and the center of gravity of each color patch. We have created five improved reconstructed model patterns from the data in the color patches. Some of the reconstructed texture patterns are similar in the global appearance to the original landscape images, although the details of the original images have been completely destroyed, to the extent that it is difficult to understand what these images represent at all. Each of the reconstructed patterns has been evaluated in order to find good methods for reproducing the atmosphere of the landscape. The patterns reconstructed using the proposed methods, which take into consideration the shape and position of the original color patches, demonstrate very high quality. The results suggest that the reconstructed global texture of the whole image creates an atmosphere or a global impression of the landscape images.
Key words: landscape, texture reconstruction, atmosphere.
Bojar K., Nieniewski M.:
Modelling the spectrum of the Fourier transform of the texture in the solar EIT images.
MGV vol. 15, no. 3/4, 2006, pp. 285-295.
We present a phenomenological parametric model for the spectrum of the discrete Fourier transform (DFT) of the images obtained from the Extreme UV Imaging Telescope (EIT) of the SOHO mission. As this spectrum decays very fast, we model its logarithm rather than the original spectrum. The proposed model is rotation-invariant. The vicinity of the direct current (DC) component of the logarithm of the spectrum is modelled by the sum of two exponential functions, while the region of high frequencies is modelled by a single exponential function summed with a constant. We discuss a method for fitting the model to the experimental data, show the results of numerical experiments, and discuss various measures of goodness of the fit. The fitting of the described model was carried out for a sequence of images covering one year, and the time evolution of the measures of goodness of fit is also presented.
Key words: texture, DFT, modelling, EIT solar images.
Boyer V., Sobczyk D.:
A Model to Blend Renderings.
MGV vol. 15, no. 3/4, 2006, pp. 297-304.
We propose a model to blend renderings. It consists in mixing different kinds of rendering techniques in the same frame to enhance the visualisation of information on large scenes. This method can be implemented in real-time during the rendering process using GPU programming. Moreover, the rendering techniques used and the key points defined by the user can be interactively changed. In this paper we present the model, a new non-photorealistic rendering technique and images produced by our method.
Key words: non-photorealistic rendering, GPU.
Robust Filtering in a Laser Scanner Point Cloud.
MGV vol. 15, no. 3/4, 2006, pp. 305-310.
This paper presents a new effective and robust approach to noise reduction in a three-dimensional data measurement algorithm. In the literature, there are numerous algorithms for noise reduction. The proposed filter class is based on the nonparametric estimation of the probability density function in a sliding filter sphere. The main idea of the applied algorithm is to maximize the distance between points in the three-dimensional space - the nearest neighbors in the sliding 3D sphere.
Key words: 3D pre-processing algorithms, 3D acquisition, cloud of points filtering.
Cinque L., Sangineto E., Tanimoto S.:
Recognition of articulated robots in the RoboCup domain.
MGV vol. 15, no. 3/4, 2006, pp. 311-320.
This paper presents a system for articulated object recognition. The system has been tested in the RoboCup domain (four-legged league), an international competition among autonomous dog-like robots playing soccer. Nevertheless, the proposed method does not depend on any specific domain, but is thought to be applicable to generic objects composed of rigid subparts linked by rotational and/or translational joints.
Key words: articulated object recognition, RoboCup, deformable shapes, model-based approach.
Cuesta-Frau D., Hernández-Fenollosa M. A., Vicedo-Payá J., Jiménez-López F.:
Particle measurement in scanning electron microscopy images.
MGV vol. 15, no. 3/4, 2006, pp. 321-328.
This paper describes the segmentation of nanoparticles of ZnO obtained by mechanical milling. Segmentation of objects in images is a common application of computer vision methods. In contrast to manual segmentation, these techniques are fast, objective, and accurate. We describe in this paper a method based on such techniques aimed at segmenting the particles in a microscopic image of ZnO in order to obtain an approximation of the grain size, and a measure of the homogeneity, in a non-supervised way. The images are obtained using scanning electron microscopy and then preprocessed to enhance the contrast and to reduce the noise. Next, an edge detection algorithm is applied to obtain the boundaries of the particles. Finally, the particles that satisfy a specific criterion are extracted and measured, and their measure is taken as an approximation of the particle size.
Key words: image processing, edge detection, pattern matching.
Hardware-software system for acceleration of image processing operations.
MGV vol. 15, no. 3/4, 2006, pp. 329-337.
The paper presents the design and architecture of a hybrid software/hardware system for acceleration of image processing. The front end consists of a software interface that defines the basic data structures and exchange mechanisms for connecting to external software. The back end consists of a hardware board which is responsible for acceleration of image computations. Thus, the two main components follow the handle/body concept, which allows modifications to the implementation without changes in interfaces. This flexibility allows for better resource usage and faster development, and facilitates system extensions. In this paper we present the design and implementation issues for the system, and discuss its run-time performance for selected image operations.
Key words: image processing library, hardware acceleration.
Detection and segmentation of moving vehicles and trains using Gaussian mixtures, shadow detection and morphological processing.
MGV vol. 15, no. 3/4, 2006, pp. 339-348.
The solution presented in this paper combines background modelling, shadow detection and morphological and temporal processing into a single system responsible for detection and segmentation of moving objects recorded with a static camera. Vehicles and trains are detected based on their pixel-level difference with respect to a continually updated background model, using a Gaussian mixture calculated separately for every pixel. The shadow detection method utilizes a colour model which allows for estimating chromatic and brightness differences between the pixel colour and the background model. Morphological and temporal operations performed on binary images denoting moving objects include connecting the components, closing and temporal filtering. Experiments carried out involve employing implemented algorithms to detect vehicles and trains in video sequences. The results achieved are described and illustrated in figures.
Key words: image processing, vehicle detection and segmentation, mixture of Gaussians, shadow detection, morphological operations.
El-Etriby Sh., Al-Hamadi A., Michaelis B.:
Dense depth map reconstruction by phase difference-based algorithm under influence of perspective distortion.
MGV vol. 15, no. 3/4, 2006, pp. 349-361.
This paper describes a novel phase difference-based algorithm applied to the corresponding points in two views, which takes into account the surface perspective distortion (foreshortening). The challenges arise from the fact that stereo images are acquired from slightly different views. Therefore, the surface of a projected image is more compressed and occupies a smaller area in one view. Since the projective distortion region cannot be estimated in terms of a fixed-size matching algorithm, we suggest using a local spatial frequency representation model to address this problem. Instead of matching intensities directly, a Gabor scale-space expansion (scalogram) is used. The scalogram expresses the filter output as a function of the spatial position and the principal wavelength to which the filter is tuned. The phase difference at the corresponding points in the two images is used to find the disparity value. The suggested algorithm provides an analytical closed-form expression for the perspective foreshortening effect. The foreshortening factor is verified to overcome the perspective distortion region. The efficiency and performance of the suggested algorithm for dense depth map reconstruction are demonstrated on the basis of analysis of rectified real images. Hence, our proposed method has a superior performance in comparison with other conventional methods.
Key words: stereo matching, disparity estimation, 3D reconstruction, perspective distortion.
Fedoseev V., Chernov V.:
Cryptography and canonical number systems in quadratic fields.
MGV vol. 15, no. 3/4, 2006, pp. 363-372.
This paper proposes an encryption method based on representation of messages in the canonical number systems (CNS) in quadratic fields. The essence of the encryption method is conversion of the representation of integers from the conventional number system to their representation in the CNS in a certain quadratic field. A sufficiently wide range of CNS with a given number of digits ensures resistance of the method to "accidental guessing" of the secret keys. The nonlinear nature of the conversion ensures its resistance to frequency analysis.
Key words: canonic number systems, cryptography, frequency analysis.
Florek A., Piascik T.:
Efficient object description and recognition based on shape signature.
MGV vol. 15, no. 3/4, 2006, pp. 373-380.
A simple and efficient approach to recognition and classification of planar shapes is proposed. This approach is based on comparison of areas of dynamically adapted centroid signatures. The usefulness of our approach for assessing the similarity of convex and non-convex objects alike, as well as objects containing openings, is shown.
Key words: shape representation, signature, object recognition.
Grundland M., Vohra R., Williams G.P., Dodgson N.A.:
Nonlinear multiresolution image blending.
MGV vol. 15, no. 3/4, 2006, pp. 381-390.
We study contrast enhancement for multiresolution image blending. In image compositing, image stitching, and image fusion, a blending operator combines coefficients of a pixel array, an image pyramid, a wavelet decomposition, or a gradient domain representation. Linear interpolation reduces variation and thereby causes contrast loss, while coefficient selection increases variation and thereby causes color distortion. Offering a continuous range of possibilities between these standard alternatives, the signed weighted power mean enables the user to calibrate the contrast of composite images.
Key words: multiresolution contrast enhancement, image compositing, image blending, image fusion, image stitching, image pyramid, image editing, video editing, cross dissolve, digital art.
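The trade-off described in the abstract above, between linear interpolation and coefficient selection, can be illustrated with one plausible form of a signed weighted power mean. The exact definition used in the paper is not given in the abstract, so the formula below is an assumption: each coefficient is raised to the power `p` with its sign preserved, the results are blended with weight `w`, and the signed power is inverted. The function names are hypothetical.

```python
import numpy as np

def signed_power(x, p):
    """Raise the magnitude of x to the power p while keeping the sign of x."""
    return np.sign(x) * np.abs(x) ** p

def signed_weighted_power_mean(a, b, w, p):
    """Blend coefficients a and b with weight w in [0, 1] and exponent p > 0.

    p = 1 reduces to linear interpolation (1 - w) * a + w * b.
    Large p approaches selecting the coefficient of larger magnitude.
    This form is an assumption made for illustration.
    """
    blended = (1.0 - w) * signed_power(a, p) + w * signed_power(b, p)
    return signed_power(blended, 1.0 / p)
```

With `p = 1` the blend of 2 and 4 at `w = 0.5` is their average, while for a large exponent the result moves toward the coefficient with the larger magnitude, which is the contrast-preserving end of the range.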
Hidot S., Lafaye J.-Y., Saint-Jean C.:
Discriminant factor analysis for movement recognition: application to dance.
MGV vol. 15, no. 3/4, 2006, pp. 391-399.
In this paper, we present an example of operator discriminant factor analysis applied to the study of movement. Our goal is to contribute to the automatic recognition of typical movements and give analytic elements to discuss the differences between the various types. As a signature for the movements, we select the 'Relational Covariance Operator' and explain our motivation. Discriminant analysis cannot be directly applied to covariance operators; so, in order to deal with a well conditioned problem and avoid useless heavy computation, we proceed in two steps. We first achieve a variant of principal components analysis on covariance operators by using convenient metrics, and then apply plain linear discriminant analysis on the resulting axis. Such a method allows a simple and powerful interpretation of the movement typology. We also present an experiment on motion-captured movements performed by dancers from the 'Ballet Atlantique Régine Chopinot'.
Key words: discriminant analysis, operator analysis, movement signature, movement recognition.
Hnat K., Porquet D., Merillou S., Ghazanfarpour D.:
Real-time wetting of porous media.
MGV vol. 15, no. 3/4, 2006, pp. 401-413.
Studying light reflection properties is a crucial factor in achieving a high degree of realism in image synthesis. Considered as a challenge in itself, it becomes even more complicated when dealing with specific changes in appearance due to external factors. Among these changes, one of the most common is wetting: surfaces appear darker and more specular as their wetting level increases. Such a phenomenon is of great visual importance in outdoor scenes under rainfall, for example. This change in appearance is mainly due to the porous nature of surfaces. In this paper, we propose to handle a porous surface BRDF post-process model in real-time and to extend it to account for wetting, with simple and intuitive parameters.
Key words: shading, porosity, weathering, wetting, real-time rendering.
Ivanov Y., Hamid R.:
Weighted ensemble boosting for robust activity recognition in video.
MGV vol. 15, no. 3/4, 2006, pp. 415-427.
In this paper we introduce a novel approach to classifier combination, which we term Weighted Ensemble Boosting. We apply the proposed algorithm to the problem of activity recognition in video, and compare its performance to different classifier combination methods. These include Approximate Bayesian Combination, Boosting, Feature Stacking, and the more traditional Sum and Product rules. Our proposed Weighted Ensemble Boosting algorithm combines the Bayesian averaging strategy with the boosting framework, finding useful conjunctive feature combinations and achieving a lower error rate than the traditional boosting algorithm. The method demonstrates a comparable level of stability with respect to the classifier selection pool. We show the performance of our technique for a set of 6 types of classifiers in an office setting, detecting 7 classes of typical office activities.
Key words: video analysis, activity recognition, boosting, ensemble methods.
Iwanowski M., Huk A.:
Satellite image pansharpening by chrominance propagation combined with kernel interpolation.
MGV vol. 15, no. 3/4, 2006, pp. 429-438.
This paper describes a method for propagation-based spatial interpolation of missing color information in satellite images. Most of the land-observation satellites produce two types of imagery for every scene -- multispectral and panchromatic. The first kind is characterized by lower spatial resolution but higher spectral one, while the second is a graylevel image at higher resolution. In order to get full color visualization of a scene, the high-resolution panchromatic image must be combined with the low-resolution color information taken from multispectral bands. This process is called pansharpening. In this paper, a new method for pansharpening is proposed which combines chrominance propagation with kernel interpolation. Thanks to the propagation step, the method properly reconstructs color information and does not blur the edges on color channels.
Key words: remote sensing, pansharpening, interpolation.
Jiménez-López F., Cuesta-Frau D., Linares-Pellicer J., Micó-Tormos P.:
A comparative study of local thresholding methods for document image binarization.
MGV vol. 15, no. 3/4, 2006, pp. 439-450.
This paper provides a comparative study of several local thresholding techniques applied in the binarization of document images with text and graphs. First, the methods are described, and then a quantitative comparative analysis is carried out. Ground truth images are used in the experiments to provide an objective measure, and computational costs are also compared.
Key words: image processing, image binarization.
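As an illustration of what a local thresholding method does, the sketch below uses a Niblack-style rule, in which the threshold at each pixel is the local mean plus `k` times the local standard deviation. The abstract does not list the methods that were compared, so the choice of this rule, the window radius and the value of `k` are assumptions made only for the example.

```python
from statistics import mean, pstdev

def niblack_binarize(image, radius=1, k=-0.2):
    """Return a 0/1 image: 1 where a pixel is darker than its local threshold.

    image: list of rows of gray levels; the window is clipped at the borders.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            # Local threshold: mean + k * standard deviation of the window.
            threshold = mean(window) + k * pstdev(window)
            out[y][x] = 1 if image[y][x] < threshold else 0
    return out

# A single dark pixel on a bright background is marked as foreground.
page = [[200, 200, 200],
        [200, 50, 200],
        [200, 200, 200]]
print(niblack_binarize(page))
```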
Kasprzak W., Szynkiewicz W., Czajka L.:
Rubik's cube reconstruction from single view for service robots.
MGV vol. 15, no. 3/4, 2006, pp. 451-459.
The Rubik's cube puzzle is seen as a benchmark for service robots. In such an application, a computer vision subsystem is required to locate the object in space and to determine the configuration of its colored cells. This paper presents a robust algorithm for Rubik's cube reconstruction from a single view in real time. An issue of special interest is to obtain a good trade-off between the quality of results and the computational complexity of the algorithm.
Key words: color analysis, shape description, segment grouping, service robots, surface orientation, 3-D object recognition.
Korbel P., Slot K.:
Cellular neural network-based object recognition with deformable grids.
MGV vol. 15, no. 3/4, 2006, pp. 461-469.
The following paper presents an idea for a parallel implementation of the deformable grid paradigm within the framework of Cellular Neural Networks. Parallel processing may alleviate the problem of high complexity of deformable template matching and significantly speed up object recognition tasks. The paper presents details of a CNN-based implementation of the basic element of the deformable grid-based image processing, which is image-grid matching. Estimated execution speed of the CNN-based method and recognition rates achieved in the experiments make the method an attractive framework for applications such as high-speed coarse object classification.
Key words: object recognition, image processing, cellular neural networks.
Ksantini R., Ziou D., Colin B., Dubeau F.:
Weighted pseudo-metric for a fast CBIR method.
MGV vol. 15, no. 3/4, 2006, pp. 471-480.
In this paper, a simple and fast querying method for content-based image retrieval is presented. In order to measure the similarity degree between two color images both quickly and effectively, we use a weighted pseudo-metric employing one-dimensional Daubechies decomposition and compression of the extracted feature vectors. In order to improve the discriminatory capacity of the pseudo-metric, we compute its weights using separately a classical logistic regression model and a Bayesian logistic regression model. The Bayesian logistic regression model was shown to be significantly better than the classical logistic regression model at improving the retrieval performance. Experimental results are reported on the WANG and ZuBuD color image databases proposed by [Deselaers T., Keysers D., Ney H.: Classification error rate for quantitative evaluation of content-based image retrieval systems. 17th International Conference on Pattern Recognition (ICPR'04), 2, pp. 505-508, Cambridge, UK].
Key words: color image retrieval, weighted pseudo-metric, logistic regression models.
Léon P.-F., Skapin X., Meseure P.:
Topologically-based animation for describing geological evolution.
MGV vol. 15, no. 3/4, 2006, pp. 481-491.
This paper presents a topologically-based animation system which aims at representing the topological evolution of structured objects over time. Its robustness and versatility both rely on the $n$-dimensional generalised map formalism. The animation is modelled as a series of maps ordered in time and represents each topological modification of the structure. We define several dedicated topological operations, which are translated into a script defining the animation. We finally show the usefulness of the approach by means of a specific application in geology, namely representation of a subsoil evolution in 2D.
Key words: topologically-based animation, generalised maps, geology.
Minimizing CPU Usage in Soft Shadow Volumes Algorithm.
MGV vol. 15, no. 3/4, 2006, pp. 493-503.
In this paper we present methods for reducing the CPU burden associated with the original implementation of the hardware-accelerated soft shadow volumes algorithm. We propose and implement two hardware-accelerated wedge construction methods: using the programmable vertex and fragment pipelines. New algorithms reducing the number of draw calls needed to render the shadow region (to three, two or even one) are also described. Tests performed using the current consumer level hardware show that our algorithms are no longer CPU-bound, and therefore can be over 2 times faster than the original algorithm.
Key words: shading, soft shadows, shadow volumes, graphics hardware.
Marusiak K., Szczepanski M.:
Video based road traffic detection and analysis.
MGV vol. 15, no. 3/4, 2006, pp. 505-514.
In this article an autonomous vehicle detection algorithm is presented. The described technique is capable of accurate vehicle detection, can work with different cameras and is fairly immune to illumination changes. It is based on the idea of an adaptive background model and the background subtraction method for detecting motion. The application's performance tests were conducted using real life video sequences and gave satisfactory results.
Key words: video processing, motion detection, traffic monitoring, background adaptation, background detection, DirectShow.
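The combination of an adaptive background model and background subtraction can be sketched as follows. This is a minimal illustration that assumes a running-average background and a fixed difference threshold; the abstract does not specify the update rule, so `alpha`, `threshold` and both function names are hypothetical.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background by more than the threshold."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

background = [[100, 100], [100, 100]]
frame = [[100, 200], [100, 100]]        # one pixel changed, e.g. by a passing vehicle
print(foreground_mask(background, frame))
background = update_background(background, frame)
```

A small `alpha` lets the model follow slow illumination changes while keeping briefly visible objects out of the background.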
Mischler D., Romaniuk B., Benassarou A., Bittar E.:
Robust 4D segmentation of cells in confocal images.
MGV vol. 15, no. 3/4, 2006, pp. 515-524.
We present a method to automatically extract the evolution of the cell envelope in 4D confocal images. Our method is based on 4DReAM, a tracking system consisting of a previously presented deformable surface model, which can change its topology. The process consists in attracting the model for each volume of a 4D series towards an iso-surface of interest, and towards the image gradients. We then statistically estimate the characteristics of the cell surface on the nodes of the model, and reconstruct it. We show detailed results on the segmentation of the cell envelope during mitosis.
Key words: deformable model, 4D tracking, living cells, automatic segmentation, gradient comparison, differential operators.
Mokrzycki W.S., Salamonczyk A.:
Generating 3D multiview exact polyhedron representation by scanning faces surroundings.
MGV vol. 15, no. 3/4, 2006, pp. 525-536.
This article presents 3D multiview exact models that are complete representations of a polyhedron obtained by means of the faces surroundings scanning algorithm. The algorithm uses the viewing sphere and the perspective projection concepts (known as the K-M view space model). Such models will be used, together with a scene depth map, for visual identification. We develop the concept and give an algorithm for face-dependent generation of 3D exact views polyhedron representation.
Key words: visual object identification, depth map, precise 3D multiview polyhedron models, viewing sphere with central projection, models completion state of multiview representation.
Mousa M., Chaine R., Akkouche S.:
Frequency-based representation of 3D point-based surfaces using spherical harmonics.
MGV vol. 15, no. 3/4, 2006, pp. 537-546.
In this paper, we propose a precise frequency-based representation for oriented point-based surfaces using spherical harmonics. The representation can be useful in many applications, such as filtering, progressive transmission and coding of 3D surfaces. The basic computation in our approach is the spherical harmonics transform of local spherical radial functions induced by a set of points. An important feature of our approach is that the calculations are performed directly on local 2D triangulations of the point-based surface without any prior space voxelization. This property ensures that the complexity of our computation of the spherical harmonics transform is linear in the number of triangles in the local patch. We present some experimental results which demonstrate our technique.
Key words: spherical harmonics, point-based surface, direct simplex-based computation, surface reconstruction, geometric texture.
Nacereddine N., Hamami L., Ziou D., Tridi M.:
Probabilistic deformable models for weld defect contour estimation in radiography.
MGV vol. 15, no. 3/4, 2006, pp. 547-556.
This paper describes a novel method for segmentation of weld defects in radiographic images. Contour estimation is formulated as a statistical estimation problem, where both the contour and the observation model parameters are unknown. Our approach can be described as a region-based maximum likelihood formulation of parametric deformable contours. This formulation provides robustness against poor image quality, and allows simultaneous estimation of the contour parameters together with other parameters of the model. Implementation is performed by a deterministic iterative algorithm with minimal user intervention. The results demonstrate very good performance of this contour estimation approach.
Key words: Gaussian and Rayleigh distributions, contour estimation, maximum likelihood, parametric deformable contours.
Nacereddine N., Hamami L., Ziou D.:
Thresholding techniques and their performance evaluation for weld defect detection in radiographic testing.
MGV vol. 15, no. 3/4, 2006, pp. 557-566.
In non-destructive testing with radiography, a perfect knowledge of the weld defect shape is an essential step to appreciate the quality of the weld and make a decision on its acceptance or rejection. Because of the complex nature of the considered images, and in order that the detected defect region represents the real defect as accurately as possible, the choice of the thresholding methods must be made judiciously. In this paper, performance criteria are used to conduct a comparative study of the thresholding methods based on the gray level histogram, the 2D histogram and the locally adaptive approach to weld defect detection in radiographic images.
Key words: 1D and 2D histogram, locally adaptive approach, performance criteria, radiographic image, thresholding, weld defect.
Lip-reading with discriminative deformable models.
MGV vol. 15, no. 3/4, 2006, pp. 567-575.
The following paper describes a novel lip-reading method developed for the purpose of isolated word recognition. The method is based on a concept of a discriminative deformable model, which represents an image analysis method derived from the deformable grid paradigm. The discriminative deformable model is used to characterize the lip shape at each frame of the video sequence. The information extracted from the consecutive frames is next analyzed using the Hidden Markov Models. The proposed visual speech recognition method is tested using the Polish digits recognition task.
Key words: lip-reading, deformable grid, speech recognition.
Pelc L., Kwolek B.:
Recognition of actions in meeting videos using timed automata.
MGV vol. 15, no. 3/4, 2006, pp. 577-584.
This paper addresses the problem of action recognition in meeting videos. Declarative knowledge provided graphically by the user, together with person positions extracted by a tracking algorithm, is used to generate the data for recognition. The actions have been formally specified using timed automata. The specification was verified on the basis of simulation tests as well as an analysis. The tracking is accomplished using a particle filter built on cues such as color, gradient and shape.
Key words: action recognition, vision-based people tracking, timed automata.
An effective edge-directed frequency filter for removal of aliasing in upsampled images.
MGV vol. 15, no. 3/4, 2006, pp. 585-598.
Raster images can have a range of various distortions related to their raster structure. Their upsampling might in effect substantially reveal the raster structure of the original image, known as aliasing. The upsampling itself may introduce aliasing into the upsampled image as well. The presented method attempts to remove the aliasing using frequency filters based on discrete fast Fourier transform, and applied directionally in certain regions placed along the edges in the image.
As opposed to some anisotropic smoothing methods, the presented algorithm aims to selectively reduce the aliasing only, preserving the sharpness of image details.
The method can be used as a post-processing filter along with various upsampling algorithms. It was experimentally shown that the method can improve the visual quality of the upsampled images.
Key words: aliasing, upsampling, frequency filter, edge detection.
Frame rate up-conversion using region-based optical flow.
MGV vol. 15, no. 3/4, 2006, pp. 599-606.
Frame rate up-conversion is widely used for converting between the standards and increasing the frame rate of low bit rate compression video conferences. The raster-based animation tweening algorithm cannot be directly applied to frame rate up-conversion, because the crucial procedure of the algorithm, the mean shift-based image segmentation method, cannot handle complex characteristics of a real world video image. In some cases, the background and foreground may be merged into one segment. In this work, a synergistic image segmentation method is used according to the particular needs of a real world video image. The experimental results illustrate that higher quality in-between frames can be obtained by the proposed algorithm.
Key words: image processing, computer vision.
Stachera J., Rokita P.:
Fractal-based hierarchical mip-pyramid texture compression.
MGV vol. 15, no. 3/4, 2006, pp. 607-619.
As the level of realism of computer-generated images increases with the number and resolution of textures, we face the problem of limited hardware resources. Additionally, filtering methods require multiple accesses and extra memory space for texture representation, thus severely reducing the memory space and bandwidth, the most common example being the mip-mapping technique. We propose a hierarchical texture compression algorithm for real-time decompression on the GPU. Our algorithm is characterised by low computational complexity, random access and a hierarchical structure, which allows access to the first three levels of an encoded mip-map pyramid. The hierarchical texture compression algorithm HiTC is based on a block-wise approach, where each block is subject to local fractal transform and further effectively coded by one level of the Laplacian Pyramid.
Key words: texture compression, fractal compression, Laplacian pyramid, mip-mapping.
Mesh representation of simply connected 3D objects in visualisation of melting phenomena.
MGV vol. 15, no. 3/4, 2006, pp. 621-630.
This paper discusses some aspects of the physics approach to simulation of the melting phenomena: energy distribution in a volumetric model, substance vaporization and their relation to geometric properties of a model. A new simulation method for melting objects is presented. The method consists in mesh deformations which allow simulation of melting phenomena with preservation of visual effects similar to those achieved in the volumetric approach. Moreover, a few examples of melted models are included to show visual effects achievable with the help of the method. Meshes used with the method should be simply connected sets. This means there are no holes in them, and that they constitute a single object through the whole simulation process. Furthermore, a new method for determining whether a point is situated inside a given polygon is shown.
Key words: mesh, melting, polygon inclusion.
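For context on the polygon inclusion problem mentioned above, the classic even-odd (ray casting) test is sketched below. This is the standard textbook approach, shown only as a baseline; it is not the new method proposed in the paper, and the function name is hypothetical.

```python
def point_in_polygon(point, polygon):
    """Even-odd test: cast a ray to the right and count edge crossings.

    polygon: list of (x, y) vertices in order; the last vertex connects to the first.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), square), point_in_polygon((5, 2), square))
```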
Genetic filters for video noise reduction.
MGV vol. 15, no. 3/4, 2006, pp. 631-638.
This paper describes a genetic programming approach to video filtering, based on evolution programming for image sequence filter design. Each gene represents a filter weight. The intra- and interframe approaches are shown and discussed.
Key words: genetic algorithms, image enhancement, nonlinear filters.
Local features (also known as interest points, keypoints, etc.) are a popular and powerful tool for matching images and detecting partially occluded objects. While the problems of photometric distortions of images and rotational invariance of the features have satisfactory solutions, satisfactorily simple scale-invariant algorithms do not exist yet. Generally, either computationally complex methods of scale-space (multi-scale approach) are used, or the correct scale is estimated using additional mechanisms. The paper proposes a new category of keypoints that can be used to develop a simple scale-invariant method for detecting known objects in analyzed images. Keypoints are defined as locations at which selected moment-based parameters are consistent over a wide range of different-size circular patches around the keypoint. While the database of known objects (i.e. the keypoints and their descriptions) is still built using a multi-scale approach, analyzed images are scanned using only a single-scale window and its sub-window. The paper focuses on the keypoint building and keypoint matching principles. Higher-level issues of hypotheses building and verification (regarding the presence of objects in analyzed images) are only briefly discussed.
Key words: local features, keypoints, moment invariants, scale-invariance, geometric approximation.
Tworzydlo J., Skabek K., Luchowski L., Winiarczyk R.:
Bringing into register incomplete range images of cultural artifacts.
MGV vol. 15, no. 3/4, 2006, pp. 649-658.
Finding solutions for preserving cultural heritage and historic and architectural sites as 3D models is an important problem today. Building 3D models of real objects is also an important research issue in virtual reality. Since a 3D range scanner, used to obtain 3D data, is a line-of-sight instrument, in most cases it is necessary to scan the object from multiple viewpoints to completely reconstruct it. The acquired range images need to be brought into register to build one coherent model. This article presents two registration methods for incomplete range images. The first is based on the assumption that the range images being registered have a significant overlap with some feature points. The second needs additional 2D photographs of the scene to perform the right registration, but does not require the range images to overlap. The presented methods are semi-automatic and are based on the analysis of redundant and uncertain data.
Key words: 3D scanner, registration of range images.
Wcislo R., Zelechowski M.:
Hardware based real-time non-rigid body animation.
MGV vol. 15, no. 3/4, 2006, pp. 659-664.
We describe a method for animation of elastic objects based on a mass-spring system with a pressure model. The method employs two different implementation approaches: CPU- and GPU-based. Three-dimensional, macroscopic objects with a volume of arbitrary geometry are considered. The main goal of the presented work is to make use of a modern graphics card equipped with a programmable GPU (Graphics Processing Unit).
Key words: non-rigid, real-time, animation, hardware based.
Potential field based camera collisions detection in a static 3D environment.
MGV vol. 15, no. 3/4, 2006, pp. 665-672.
Collision detection methods usually need a long pre-calculation stage or require difficult and time-consuming real-time computation. Moreover, their effectiveness decreases as the complexity of the scene increases. The seemingly promising solutions presented in literature which are based on potential fields do not offer a satisfactory functionality. This paper introduces a method which provides a new potential field construction affecting camera movement, which lets the viewer reach objects without constraints and protects the user from getting into their structure. Additionally, the proposed method comprises an easily executable pre-calculation stage, and represents a solution independent of the scene-complexity.
Key words: collision detection, potential field, navigation.
Zhang S., Chen Y.:
Image denoising based on wavelet support vector regression.
MGV vol. 15, no. 3/4, 2006, pp. 673-680.
Denoising is an important application of image processing. We have constructed a denoising system which learns an optimal mapping from the input data to denoised data. The Morlet wavelet was used as the kernel function to construct the wavelet support vector machine. The noisy image data is mapped to denoised values by wavelet support vector regression. The results show that denoising via wavelet support vector regression can perform better than Gaussian smoothing, median filtering and average filtering on the experimental image, and that it also performs better than Gaussian radial basis function support vector regression.
Key words: image denoising, support vector regression, wavelet analysis, function approximation.
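A wavelet kernel of the kind described above can be sketched as a product of one-dimensional Morlet-type wavelets. The abstract does not give the formula; the form below, with mother wavelet cos(1.75u)·exp(-u²/2) and a single dilation parameter, is a commonly used wavelet SVM kernel and is assumed here for illustration only.

```python
from math import cos, exp

def morlet_wavelet_kernel(x, z, dilation=1.0):
    """Translation-invariant wavelet kernel built from a Morlet-type mother wavelet.

    K(x, z) = prod_i cos(1.75 * u_i) * exp(-u_i**2 / 2), with u_i = (x_i - z_i) / dilation.
    """
    value = 1.0
    for xi, zi in zip(x, z):
        u = (xi - zi) / dilation
        value *= cos(1.75 * u) * exp(-u * u / 2.0)
    return value

# The kernel equals 1 for identical inputs and decays (with oscillation) as they move apart.
print(morlet_wavelet_kernel([1.0, 2.0], [1.0, 2.0]))
print(morlet_wavelet_kernel([0.0], [1.0]))
```

A function of this shape could be supplied as a custom kernel to a support vector regression implementation that accepts callable kernels.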
Zabinski T., Grygiel T., Kwolek B.:
Design and implementation of visual feedback for an active tracking.
MGV vol. 15, no. 3/4, 2006, pp. 681-690.
Active visual tracking is used to direct the attention of the camera to an object and maintain it in the camera's field of view. A steered camera is used to decrease relative motion of the target in the image plane. This leads to better performance of the mean-shift based tracking algorithm, which requires the object tracked in the current and the previous frame to overlap. A classical PID controller and a nonlinear fuzzy controller have been tested in steering the camera head.
Key words: active vision, vision-based tracking, visual servoing.
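A classical PID controller of the kind tested for steering the camera head can be sketched in a few lines. This is a generic discrete-time form, not the authors' implementation; the gains, the time step and the interpretation of the error as the target's pixel offset from the image centre are assumptions.

```python
class PID:
    """Discrete PID controller: output = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first sample, since there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Hypothetical use: error is the horizontal offset (in pixels) of the tracked
# object from the image centre; the output is a pan velocity command.
pan = PID(kp=0.4, ki=0.05, kd=0.01)
command = pan.step(error=25.0, dt=0.04)
```

Driving the offset towards zero keeps the target near the image centre, which supports the frame-to-frame overlap that the mean-shift tracker requires.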