toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Fei Yang; Luis Herranz; Yongmei Cheng; Mikhail Mozerov edit   pdf
url  doi
openurl 
  Title Slimmable compressive autoencoders for practical neural image compression Type Conference Article
  Year 2021 Publication 34th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages (down) 4996-5005  
  Keywords  
  Abstract Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. However, the resulting models are also heavy, computationally demanding and generally optimized for a single rate, limiting their practical use. Focusing on practical image compression, we propose slimmable compressive autoencoders (SlimCAEs), where rate (R) and distortion (D) are jointly optimized for different capacities. Once trained, encoders and decoders can be executed at different capacities, leading to different rates and complexities. We show that a successful implementation of SlimCAEs requires suitable capacity-specific RD tradeoffs. Our experiments show that SlimCAEs are highly flexible models that provide excellent rate-distortion performance, variable rate, and dynamic adjustment of memory, computational cost and latency, thus addressing the main requirements of practical image compression.  
  Address Virtual; June 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ YHC2021 Serial 3569  
Permanent link to this record
 

 
Author Jaykishan Patel; Alban Flachot; Javier Vazquez; David H. Brainard; Thomas S. A. Wallis; Marcus A. Brubaker; Richard F. Murray edit  url
openurl 
  Title A deep convolutional neural network trained to infer surface reflectance is deceived by mid-level lightness illusions Type Journal Article
  Year 2023 Publication Journal of Vision Abbreviated Journal JV  
  Volume 23 Issue 9 Pages (down) 4817-4817  
  Keywords  
  Abstract A long-standing view is that lightness illusions are by-products of strategies employed by the visual system to stabilize its perceptual representation of surface reflectance against changes in illumination. Computationally, one such strategy is to infer reflectance from the retinal image, and to base the lightness percept on this inference. CNNs trained to infer reflectance from images have proven successful at solving this problem under limited conditions. To evaluate whether these CNNs provide suitable starting points for computational models of human lightness perception, we tested a state-of-the-art CNN on several lightness illusions, and compared its behaviour to prior measurements of human performance. We trained a CNN (Yu & Smith, 2019) to infer reflectance from luminance images. The network had a 30-layer hourglass architecture with skip connections. We trained the network via supervised learning on 100K images, rendered in Blender, each showing randomly placed geometric objects (surfaces, cubes, tori, etc.), with random Lambertian reflectance patterns (solid, Voronoi, or low-pass noise), under randomized point+ambient lighting. The renderer also provided the ground-truth reflectance images required for training. After training, we applied the network to several visual illusions. These included the argyle, Koffka-Adelson, snake, White’s, checkerboard assimilation, and simultaneous contrast illusions, along with their controls where appropriate. The CNN correctly predicted larger illusions in the argyle, Koffka-Adelson, and snake images than in their controls. It also correctly predicted an assimilation effect in White's illusion. It did not, however, account for the checkerboard assimilation or simultaneous contrast effects. These results are consistent with the view that at least some lightness phenomena are by-products of a rational approach to inferring stable representations of physical properties from intrinsically ambiguous retinal images. Furthermore, they suggest that CNN models may be a promising starting point for new models of human lightness perception.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MACO; CIC Approved no  
  Call Number Admin @ si @ PFV2023 Serial 3890  
Permanent link to this record
 

 
Author Jaume Garcia; Albert Andaluz; Debora Gil; Francesc Carreras edit   pdf
url  doi
isbn  openurl
  Title Decoupled External Forces in a Predictor-Corrector Segmentation Scheme for LV Contours in Tagged MR Images Type Conference Article
  Year 2010 Publication 32nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society Abbreviated Journal  
  Volume Issue Pages (down) 4805-4808  
  Keywords  
  Abstract Computation of functional regional scores requires proper identification of LV contours. On one hand, manual segmentation is robust, but it is time consuming and requires high expertise. On the other hand, the tag pattern in TMR sequences is a problem for automatic segmentation of LV boundaries. We propose a segmentation method based on a predictorcorrector (Active Contours – Shape Models) scheme. Special stress is put in the definition of the AC external forces. First, we introduce a semantic description of the LV that discriminates myocardial tissue by using texture and motion descriptors. Second, in order to ensure convergence regardless of the initial contour, the external energy is decoupled according to the orientation of the edges in the image potential. We have validated the model in terms of error in segmented contours and accuracy of regional clinical scores.  
  Address Buenos Aires (Argentina)  
  Corporate Author IEEE EMB Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1557-170X ISBN 978-1-4244-4123-5 Medium  
  Area Expedition Conference EMBC  
  Notes IAM Approved no  
  Call Number IAM @ iam @ GAG2010 Serial 1514  
Permanent link to this record
 

 
Author Zhijie Fang; Antonio Lopez edit   pdf
url  doi
openurl 
  Title Intention Recognition of Pedestrians and Cyclists by 2D Pose Estimation Type Journal Article
  Year 2019 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 21 Issue 11 Pages (down) 4773 - 4783  
  Keywords  
  Abstract Anticipating the intentions of vulnerable road users (VRUs) such as pedestrians and cyclists is critical for performing safe and comfortable driving maneuvers. This is the case for human driving and, thus, should be taken into account by systems providing any level of driving assistance, from advanced driver assistant systems (ADAS) to fully autonomous vehicles (AVs). In this paper, we show how the latest advances on monocular vision-based human pose estimation, i.e. those relying on deep Convolutional Neural Networks (CNNs), enable to recognize the intentions of such VRUs. In the case of cyclists, we assume that they follow traffic rules to indicate future maneuvers with arm signals. In the case of pedestrians, no indications can be assumed. Instead, we hypothesize that the walking pattern of a pedestrian allows to determine if he/she has the intention of crossing the road in the path of the ego-vehicle, so that the ego-vehicle must maneuver accordingly (e.g. slowing down or stopping). In this paper, we show how the same methodology can be used for recognizing pedestrians and cyclists' intentions. For pedestrians, we perform experiments on the JAAD dataset. For cyclists, we did not found an analogous dataset, thus, we created our own one by acquiring and annotating videos which we share with the research community. Overall, the proposed pipeline provides new state-of-the-art results on the intention recognition of VRUs.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ FaL2019 Serial 3305  
Permanent link to this record
 

 
Author Ciprian Corneanu; Meysam Madadi; Sergio Escalera; Aleix M. Martinez edit   pdf
url  doi
openurl 
  Title What does it mean to learn in deep networks? And, how does one detect adversarial attacks? Type Conference Article
  Year 2019 Publication 32nd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
  Volume Issue Pages (down) 4752-4761  
  Keywords  
  Abstract The flexibility and high-accuracy of Deep Neural Networks (DNNs) has transformed computer vision. But, the fact that we do not know when a specific DNN will work and when it will fail has resulted in a lack of trust. A clear example is self-driving cars; people are uncomfortable sitting in a car driven by algorithms that may fail under some unknown, unpredictable conditions. Interpretability and explainability approaches attempt to address this by uncovering what a DNN models, i.e., what each node (cell) in the network represents and what images are most likely to activate it. This can be used to generate, for example, adversarial attacks. But these approaches do not generally allow us to determine where a DNN will succeed or fail and why. i.e., does this learned representation generalize to unseen samples? Here, we derive a novel approach to define what it means to learn in deep networks, and how to use this knowledge to detect adversarial attacks. We show how this defines the ability of a network to generalize to unseen testing samples and, most importantly, why this is the case.  
  Address California; June 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPR  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ CME2019 Serial 3332  
Permanent link to this record
 

 
Author Felipe Codevilla; Matthias Muller; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy edit   pdf
doi  openurl
  Title End-to-end Driving via Conditional Imitation Learning Type Conference Article
  Year 2018 Publication IEEE International Conference on Robotics and Automation Abbreviated Journal  
  Volume Issue Pages (down) 4693 - 4700  
  Keywords  
  Abstract Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL  
  Address Brisbane; Australia; May 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICRA  
  Notes ADAS; 600.116; 600.124; 600.118 Approved no  
  Call Number Admin @ si @ CML2018 Serial 3108  
Permanent link to this record
 

 
Author Angel Morera; Angel Sanchez; A. Belen Moreno; Angel Sappa; Jose F. Velez edit   pdf
url  openurl
  Title SSD vs. YOLO for Detection of Outdoor Urban Advertising Panels under Multiple Variabilities Type Journal Article
  Year 2020 Publication Sensors Abbreviated Journal SENS  
  Volume 20 Issue 16 Pages (down) 4587  
  Keywords  
  Abstract This work compares Single Shot MultiBox Detector (SSD) and You Only Look Once (YOLO) deep neural networks for the outdoor advertisement panel detection problem by handling multiple and combined variabilities in the scenes. Publicity panel detection in images offers important advantages both in the real world as well as in the virtual one. For example, applications like Google Street View can be used for Internet publicity and when detecting these ads panels in images, it could be possible to replace the publicity appearing inside the panels by another from a funding company. In our experiments, both SSD and YOLO detectors have produced acceptable results under variable sizes of panels, illumination conditions, viewing perspectives, partial occlusion of panels, complex background and multiple panels in scenes. Due to the difficulty of finding annotated images for the considered problem, we created our own dataset for conducting the experiments. The major strength of the SSD model was the almost elimination of False Positive (FP) cases, situation that is preferable when the publicity contained inside the panel is analyzed after detecting them. On the other side, YOLO produced better panel localization results detecting a higher number of True Positive (TP) panels with a higher accuracy. Finally, a comparison of the two analyzed object detection models with different types of semantic segmentation networks and using the same evaluation metrics is also included.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU; 600.130; 601.349; 600.122 Approved no  
  Call Number Admin @ si @ MSM2020 Serial 3452  
Permanent link to this record
 

 
Author Xinhang Song; Shuqiang Jiang; Luis Herranz edit   pdf
doi  openurl
  Title Combining Models from Multiple Sources for RGB-D Scene Recognition Type Conference Article
  Year 2017 Publication 26th International Joint Conference on Artificial Intelligence Abbreviated Journal  
  Volume Issue Pages (down) 4523-4529  
  Keywords Robotics and Vision; Vision and Perception  
  Abstract Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained with a large RGB dataset and then fine tuned to the respective target RGB and depth datasets. These approaches have several limitations: 1) only use low-level filters learned from RGB data, thus not being able to exploit properly depth-specific patterns, and 2) RGB and depth features are only combined at high-levels but rarely at lower-levels. In this paper, we propose a framework that leverages both knowledge acquired from large RGB datasets together with depth-specific cues learned from the limited depth data, obtaining more effective multi-source and multi-modal representations. We propose a multi-modal combination method that selects discriminative combinations of layers from the different source models and target modalities, capturing both high-level properties of the task and intrinsic low-level properties of both modalities.  
  Address Melbourne; Australia; August 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IJCAI  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ SJH2017b Serial 2966  
Permanent link to this record
 

 
Author Klara Janousckova; Jiri Matas; Lluis Gomez; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Text Recognition – Real World Data and Where to Find Them Type Conference Article
  Year 2020 Publication 25th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages (down) 4489-4496  
  Keywords  
  Abstract We present a method for exploiting weakly annotated images to improve text extraction pipelines. The approach uses an arbitrary end-to-end text recognition system to obtain text region proposals and their, possibly erroneous, transcriptions. The method includes matching of imprecise transcriptions to weak annotations and an edit distance guided neighbourhood search. It produces nearly error-free, localised instances of scene text, which we treat as “pseudo ground truth” (PGT). The method is applied to two weakly-annotated datasets. Training with the extracted PGT consistently improves the accuracy of a state of the art recognition model, by 3.7% on average, across different benchmark datasets (image domains) and 24.5% on one of the weakly annotated datasets 1 1 Acknowledgements. The authors were supported by Czech Technical University student grant SGS20/171/0HK3/3TJ13, the MEYS VVV project CZ.02.1.01/0.010.0J16 019/0000765 Research Center for Informatics, the Spanish Research project TIN2017-89779-P and the CERCA Programme / Generalitat de Catalunya.  
  Address Virtual; January 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ JMG2020 Serial 3557  
Permanent link to this record
 

 
Author Joan Marti; Jose Miguel Benedi; Ana Maria Mendonça; Joan Serrat edit  openurl
  Title Pattern Recognition and Image Analysis Type Book Whole
  Year 2007 Publication 3rd Iberian Conference Abbreviated Journal  
  Volume 6669 Issue Pages (down) 4477-4478  
  Keywords  
  Abstract  
  Address Girona (Spain)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IbPRIA  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ MBM2007 Serial 994  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: