Publicacions CVC -- Query Results


	Records
	Author	Sergio Escalera; Vassilis Athitsos; Isabelle Guyon
	Title	Challenges in multimodal gesture recognition			Type	Journal Article
	Year	2016	Publication	Journal of Machine Learning Research	Abbreviated Journal	JMLR
	Volume	17	Issue		Pages	1-54
	Keywords	Gesture Recognition; Time Series Analysis; Multimodal Data Analysis; Computer Vision; Pattern Recognition; Wearable sensors; Infrared Cameras; Kinect™
	Abstract	This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the Kinect™ revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor	Zhuowen Tu
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	HuPBA;MILAB;			Approved	no
	Call Number	Admin @ si @ EAG2016			Serial	2764



	Author	Miquel Ferrer; I. Bardaji; Ernest Valveny; Dimosthenis Karatzas; Horst Bunke
	Title	Median Graph Computation by Means of Graph Embedding into Vector Spaces			Type	Book Chapter
	Year	2013	Publication	Graph Embedding for Pattern Analysis	Abbreviated Journal
	Volume		Issue		Pages	45-72
	Keywords
	Abstract	In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors are a common option. That is, a set of numerical features describing relevant properties of the pattern is computed and arranged in vector form. The main advantages of this kind of representation are computational simplicity and a sound mathematical foundation. Thus, a large number of operations are available to work with vectors, and a large repository of algorithms for pattern analysis and classification exists. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant.
	Address
	Corporate Author				Thesis
	Publisher	Springer New York	Place of Publication		Editor	Yun Fu; Yunqian Ma
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-4614-4456-5	Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ FBV2013			Serial	2421



	Author	Fadi Dornaika; Bogdan Raducanu
	Title	Subtle Facial Expression Recognition in Still Images and Videos			Type	Book Chapter
	Year	2011	Publication	Advances in Face Image Analysis: Techniques and Technologies	Abbreviated Journal
	Volume		Issue	14	Pages	259-277
	Keywords
	Abstract	This chapter addresses the recognition of basic facial expressions. It has three main contributions. First, the authors introduce a view- and texture-independent scheme that exploits facial action parameters estimated by an appearance-based 3D face tracker. They represent the learned facial actions associated with different facial expressions by time series. Two dynamic recognition schemes are proposed: (1) the first is based on conditional predictive models and on an analysis-synthesis scheme, and (2) the second is based on examples allowing straightforward use of machine learning approaches. Second, the authors propose an efficient recognition scheme based on the detection of keyframes in videos. Third, the authors compare the dynamic scheme with a static one based on analyzing individual snapshots and show that in general the former performs better than the latter. The authors then provide evaluations of performance using Linear Discriminant Analysis (LDA), Non-parametric Discriminant Analysis (NDA), and Support Vector Machines (SVM).
	Address
	Corporate Author				Thesis
	Publisher	IGI-Global	Place of Publication	New York, USA	Editor	Yu-Jin Zhang
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-6152-0991-0	Medium
	Area		Expedition		Conference
	Notes	OR;MV			Approved	no
	Call Number	Admin @ si @ DoR2011			Serial	1751



	Author	Sergio Vera; Miguel Angel Gonzalez Ballester; Debora Gil
	Title	Optimal Medial Surface Generation for Anatomical Volume Representations			Type	Book Chapter
	Year	2012	Publication	Abdominal Imaging. Computational and Clinical Applications	Abbreviated Journal	LNCS
	Volume	7601	Issue		Pages	265-273
	Keywords	Medial surface representation; volume reconstruction
	Abstract	Medial representations are a widely used technique in abdominal organ shape representation and parametrization. Those methods require good medial manifolds as a starting point. Any medial surface used to parametrize a volume should be simple enough to allow an easy manipulation and complete enough to allow an accurate reconstruction of the volume. Obtaining good quality medial surfaces is still a problem with current iterative thinning methods. This forces the usage of generic, pre-calculated medial templates that are adapted to the final shape at the cost of a drop in volume reconstruction. This paper describes an operator for generation of medial structures that generates clean and complete manifolds well suited for their further use in medial representations of abdominal organ volumes. While being simpler than thinning surfaces, experiments show its high performance in volume reconstruction and preservation of medial surface main branching topology.
	Address	Nice, France
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor	Hiroyuki Yoshida; David Hawkes; Michael W. Vannier
	Language		Summary Language		Original Title
	Series Editor		Series Title	Lecture Notes in Computer Science	Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-642-33611-9	Medium
	Area		Expedition		Conference	STACOM
	Notes	IAM			Approved	no
	Call Number	IAM @ iam @ VGG2012b			Serial	1988



	Author	Aura Hernandez-Sabate; Debora Gil
	Title	The Benefits of IVUS Dynamics for Retrieving Stable Models of Arteries			Type	Book Chapter
	Year	2012	Publication	Intravascular Ultrasound	Abbreviated Journal
	Volume		Issue		Pages	185-206
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher	Intech	Place of Publication		Editor	Yasuhiro Honda
	Language	English	Summary Language	English	Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-953-307-900-4	Medium
	Area		Expedition		Conference
	Notes	IAM; ADAS			Approved	no
	Call Number	IAM @ iam @ HeG2012			Serial	1684



	Author	Jaume Gibert; Ernest Valveny; Horst Bunke
	Title	Dimensionality Reduction for Graph of Words Embedding			Type	Conference Article
	Year	2011	Publication	8th IAPR-TC-15 International Workshop. Graph-Based Representations in Pattern Recognition	Abbreviated Journal
	Volume	6658	Issue		Pages	22-31
	Keywords
	Abstract	The Graph of Words Embedding consists in mapping every graph of a given dataset to a feature vector by counting unary and binary relations between node attributes of the graph. While it shows good properties in classification problems, it suffers from high dimensionality and sparsity. These two issues are addressed in this article. Two well-known techniques for dimensionality reduction, kernel principal component analysis (kPCA) and independent component analysis (ICA), are applied to the embedded graphs. We discuss their performance compared to the classification of the original vectors on three different public databases of graphs.
	Address	Münster, Germany
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor	Xiaoyi Jiang; Miquel Ferrer; Andrea Torsello
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-642-20843-0	Medium
	Area		Expedition		Conference	GbRPR
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ GVB2011a			Serial	1743



	Author	Jordi Gonzalez
	Title	Human Sequence Evaluation: the Key-frame Approach			Type	Book Whole
	Year	2004	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher		Place of Publication		Editor	Xavier Roca; Javier Varona
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes				Approved	no
	Call Number	ISE @ ise @ Gon2004			Serial	362



	Author	Naila Murray
	Title	Predicting Saliency and Aesthetics in Images: A Bottom-up Perspective			Type	Book Whole
	Year	2012	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In Part 1 of the thesis, we hypothesize that salient and non-salient image regions can be estimated to be the regions which are enhanced or assimilated in standard low-level color image representations. We prove this hypothesis by adapting a low-level model of color perception into a saliency estimation model. This model shares the three main steps found in many successful models for predicting attention in a scene: convolution with a set of filters, a center-surround mechanism and spatial pooling to construct a saliency map. For such models, integrating spatial information and justifying the choice of various parameter values remain open problems. Our saliency model inherits a principled selection of parameters as well as an innate spatial pooling mechanism from the perception model on which it is based. This pooling mechanism has been fitted using psychophysical data acquired in color-luminance setting experiments. The proposed model outperforms the state-of-the-art at the task of predicting eye-fixations from two datasets. After demonstrating the effectiveness of our basic saliency model, we introduce an improved image representation, based on geometrical grouplets, that enhances complex low-level visual features such as corners and terminations, and suppresses relatively simpler features such as edges. With this improved image representation, the performance of our saliency model in predicting eye-fixations increases for both datasets. In Part 2 of the thesis, we investigate the problem of aesthetic visual analysis. While a great deal of research has been conducted on hand-crafting image descriptors for aesthetics, little attention so far has been dedicated to the collection, annotation and distribution of ground truth data. Because image aesthetics is complex and subjective, existing datasets, which have few images and few annotations, have significant limitations. 
To address these limitations, we have introduced a new large-scale database for conducting Aesthetic Visual Analysis, which we call AVA. AVA contains more than 250,000 images, along with a rich variety of annotations. We investigate how the wealth of data in AVA can be used to tackle the challenge of understanding and assessing visual aesthetics by looking into several problems relevant for aesthetic analysis. We demonstrate that by leveraging the data in AVA, and using generic low-level features such as SIFT and color histograms, we can exceed state-of-the-art performance in aesthetic quality prediction tasks. Finally, we entertain the hypothesis that low-level visual information in our saliency model can also be used to predict visual aesthetics by capturing local image characteristics such as feature contrast, grouping and isolation, characteristics thought to be related to universal aesthetic laws. We use the weighted center-surround responses that form the basis of our saliency model to create a feature vector that describes aesthetics. We also introduce a novel color space for fine-grained color representation. We then demonstrate that the resultant features achieve state-of-the-art performance on aesthetic quality classification. As such, a promising contribution of this thesis is to show that several vision experiences – low-level color perception, visual saliency and visual aesthetics estimation – may be successfully modeled using a unified framework. This suggests a similar architecture in area V1 for both color perception and saliency and adds evidence to the hypothesis that visual aesthetics appreciation is driven in part by low-level cues.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Xavier Otazu; Maria Vanrell
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ Mur2012			Serial	2212



	Author	Jesus Jaime Moreno Escobar
	Title	Perceptual Criteria on Image Compression			Type	Book Whole
	Year	2011	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Nowadays, digital images are used in many areas of everyday life, but they tend to be big. This increased amount of information leads to the problem of image data storage. For example, it is common to represent a color pixel as a 24-bit number, where the red, green, and blue channels employ 8 bits each. Consequently, this kind of color pixel can specify one of 2^24 ≈ 16.78 million colors. Therefore, an image at a resolution of 512 × 512 that allocates 24 bits per pixel occupies 786,432 bytes. That is why image compression is important. An important feature of image compression is that it can be lossy or lossless. A compressed image is acceptable provided these losses of image information are not perceived by the eye. It is possible to assume that a portion of this information is redundant. Lossless image compression means that the decoded image is mathematically identical to the image that was encoded. Lossy image compression needs to identify two features inside the image: the redundancy and the irrelevancy of information. Thus, lossy compression modifies the image data in such a way that, when they are encoded and decoded, the recovered image is similar enough to the original one. How similar the recovered image is to the original image is defined prior to the compression process, and it depends on the implementation to be performed. In lossy compression, current image compression schemes remove information considered irrelevant by using mathematical criteria. One of the problems of these schemes is that although the numerical quality of the compressed image is low, it shows a high visual image quality, e.g. it does not show a lot of visible artifacts. This is because these mathematical criteria, used to remove information, do not take into account whether the viewed information is perceived by the Human Visual System.
Therefore, the aim of an image compression scheme designed to obtain images that do not show artifacts, although their numerical quality can be low, is to eliminate the information that is not visible to the Human Visual System. Hence, this Ph.D. thesis proposes to exploit the visual redundancy existing in an image by reducing those features that can be unperceivable for the Human Visual System. First, we define an image quality assessment, which is highly correlated with the psychophysical experiments performed by human observers. The proposed CwPSNR metric weights the well-known PSNR by using a particular perceptual low-level model of the Human Visual System, e.g. the Chromatic Induction Wavelet Model (CIWaM). Second, we propose an image compression algorithm (called Hi-SET), which exploits the high correlation and self-similarity of pixels in a given area or neighborhood by means of a fractal function. Hi-SET possesses the main features that modern image compressors have, that is, it is an embedded coder, which allows a progressive transmission. Third, we propose a perceptual quantizer (ρSQ), which is a modification of the uniform scalar quantizer. The ρSQ is applied to a pixel set in a certain Wavelet sub-band, that is, a global quantization. Unlike this, the proposed modification allows performing a local pixel-by-pixel forward and inverse quantization, introducing into this process a perceptual distortion which depends on the surrounding spatial information of the pixel. Combining the ρSQ method with the Hi-SET image compressor, we define a perceptual image compressor, called ΦSET. Finally, a coding method for Region of Interest areas is presented, ρGBbBShift, which perceptually weights pixels in these areas and maintains only the more important perceivable features in the rest of the image.
Results presented in this report show that CwPSNR is the best-ranked image quality method when it is applied to the most common image compression distortions such as JPEG and JPEG2000. CwPSNR shows the best correlation with the judgement of human observers, which is based on the results of psychophysical experiments obtained for relevant image quality databases such as TID2008, LIVE, CSIQ and IVC. Furthermore, the Hi-SET coder obtains better results both for compression ratios and perceptual image quality than the JPEG2000 coder and other coders that use a Hilbert Fractal for image compression. Hence, when the proposed perceptual quantization is introduced to the Hi-SET coder, our compressor improves its numerical and perceptual efficiency. When the ρGBbBShift method applied to Hi-SET is compared against the MaxShift method applied to the JPEG2000 standard and Hi-SET, the images coded by our ROI method get the best results when the overall image quality is estimated. Both the proposed perceptual quantization and the ρGBbBShift method are generalized algorithms that can be applied to other Wavelet-based image compression algorithms such as JPEG2000, SPIHT or SPECK.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Xavier Otazu
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-938351-3-2	Medium
	Area		Expedition		Conference
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ Mor2011			Serial	1786
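The storage figures quoted in the abstract of the record above (24 bits per pixel, about 16.78 million colors, 786,432 bytes for a 512 × 512 image) can be checked with a few lines of arithmetic. The sketch below is only an illustration of that calculation; the function names `distinct_colors` and `raw_size_bytes` are hypothetical and do not come from the thesis.

```python
# Illustrative check of the uncompressed-image arithmetic in the abstract.
# Function names are hypothetical; they are not taken from the thesis.

def distinct_colors(bits_per_pixel: int) -> int:
    """Number of values a pixel with the given bit depth can represent."""
    return 2 ** bits_per_pixel


def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of a width x height image (8 bits per byte)."""
    return width * height * bits_per_pixel // 8


# Three channels (red, green, blue) at 8 bits each give 24 bits per pixel.
bits_per_pixel = 3 * 8

print(distinct_colors(bits_per_pixel))           # 16777216, i.e. about 16.78 million
print(raw_size_bytes(512, 512, bits_per_pixel))  # 786432
```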



	Author	Xim Cerda-Company
	Title	Understanding color vision: from psychophysics to computational modeling			Type	Book Whole
	Year	2019	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	In this PhD we have approached human color vision from two different points of view: psychophysics and computational modeling. First, we have evaluated 15 different tone-mapping operators (TMOs). We have conducted two experiments that consider two different criteria: the first one evaluates the local relationships among intensity levels and the second one evaluates the global appearance of the tone-mapped images w.r.t. the physical one (presented side by side). We conclude that the rankings depend on the criterion and they are not correlated. Considering both criteria, the best TMOs are KimKautz (Kim and Kautz, 2008) and Krawczyk (Krawczyk, Myszkowski, and Seidel, 2005). Another conclusion is that more standardized evaluation criteria are needed to make a fair comparison among TMOs. Secondly, we have conducted several psychophysical experiments to study color induction. We have studied two different properties of the visual stimuli: temporal frequency and luminance spatial distribution. To study the temporal frequency we defined equiluminant stimuli composed of both uniform and striped surrounds and we flashed them varying the flash duration. For uniform surrounds, the results show that color induction depends on both the flash duration and the inducer’s chromaticity. As expected, in all chromatic conditions color contrast was induced. In contrast, for striped surrounds, we expected to induce color assimilation, but we observed color contrast or no induction. Since similar but not equiluminant striped stimuli induce color assimilation, we concluded that luminance differences could be a key factor to induce color assimilation. Thus, in a subsequent study, we have studied the luminance differences’ effect on color assimilation. We varied the luminance difference between the target region and its inducers and we observed that color assimilation depends on both this difference and the inducer’s chromaticity.
For red-green condition (where the first inducer is red and the second one is green), color assimilation occurs in almost all luminance conditions. Instead, for green-red condition, color assimilation never occurs. Purple-lime and lime-purple chromatic conditions show that luminance difference is a key factor to induce color assimilation. When the target is darker than its surround, color assimilation is stronger in purple-lime, while when the target is brighter, color assimilation is stronger in lime-purple (’mirroring’ effect). Moreover, we evaluated whether color assimilation is due to luminance or brightness differences. Similarly to equiluminance condition, when the stimuli are equibrightness no color assimilation is induced. Our results support the hypothesis that mutual-inhibition plays a major role in color perception, or at least in color induction. Finally, we have defined a new firing rate model of color processing in the V1 parvocellular pathway. We have modeled two different layers of this cortical area: layers 4Cb and 2/3. Our model is a recurrent dynamic computational model that considers both excitatory and inhibitory cells and their lateral connections. Moreover, it considers the existent laminar differences and the cells’ variety. Thus, we have modeled both single- and double-opponent simple cells and complex cells, which are a pool of double-opponent simple cells. A set of sinusoidal drifting gratings have been used to test the architecture. In these gratings we have varied several spatial properties such as temporal and spatial frequencies, grating’s area and orientation. To reproduce the electrophysiological observations, the architecture has to consider the existence of non-oriented double-opponent cells in layer 4Cb and the lack of lateral connections between single-opponent cells. 
Moreover, we have tested our lateral connections simulating the center-surround modulation and we have reproduced physiological measurements where for high contrast stimulus, the result of the lateral connections is inhibitory, while it is facilitatory for low contrast stimulus.
	Address	March 2019
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Xavier Otazu
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-948531-4-2	Medium
	Area		Expedition		Conference
	Notes	NEUROBIT			Approved	no
	Call Number	Admin @ si @ Cer2019			Serial	3259
