toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Rozenn Dhayot; Fernando Vilariño; Gerard Lacey edit  doi
openurl 
  Title Improving the Quality of Color Colonoscopy Videos Type Journal Article
  Year 2008 Publication EURASIP Journal on Image and Video Processing Abbreviated Journal EURASIP JIVP  
  Volume (down) 139429 Issue 1 Pages 1-9  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes MV;SIAI Approved no  
  Call Number fernando @ fernando @ Serial 2422  
Permanent link to this record
 

 
Author Yael Tudela; Ana Garcia Rodriguez; Gloria Fernandez Esparrach; Jorge Bernal edit  url
doi  openurl
  Title Towards Fine-Grained Polyp Segmentation and Classification Type Conference Article
  Year 2023 Publication Workshop on Clinical Image-Based Procedures Abbreviated Journal  
  Volume (down) 14242 Issue Pages 32-42  
  Keywords Medical image segmentation; Colorectal Cancer; Vision Transformer; Classification  
  Abstract Colorectal cancer is one of the main causes of cancer death worldwide. Colonoscopy is the gold standard screening tool as it allows lesion detection and removal during the same procedure. During the last decades, several efforts have been made to develop CAD systems to assist clinicians in lesion detection and classification. Regarding the latter, and in order to be used in the exploration room as part of resect and discard or leave-in-situ strategies, these systems must identify correctly all different lesion types. This is a challenging task, as the data used to train these systems presents great inter-class similarity, high class imbalance, and low representation of clinically relevant histology classes such as serrated sessile adenomas.

In this paper, a new polyp segmentation and classification method, Swin-Expand, is introduced. Based on Swin-Transformer, it uses a simple and lightweight decoder. The performance of this method has been assessed on a novel dataset, comprising 1126 high-definition images representing the three main histological classes. Results show a clear improvement in both segmentation and classification performance, also achieving competitive results when tested in public datasets. These results confirm that both the method and the data are important to obtain more accurate polyp representations.
 
  Address Vancouver; October 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MICCAIW  
  Notes ISE Approved no  
  Call Number Admin @ si @ TGF2023 Serial 3837  
Permanent link to this record
 

 
Author Patricia Suarez; Dario Carpio; Angel Sappa edit  url
openurl 
  Title A Deep Learning Based Approach for Synthesizing Realistic Depth Maps Type Conference Article
  Year 2023 Publication 22nd International Conference on Image Analysis and Processing Abbreviated Journal  
  Volume (down) 14234 Issue Pages 369–380  
  Keywords  
  Abstract This paper presents a novel cycle generative adversarial network (CycleGAN) architecture for synthesizing high-quality depth maps from a given monocular image. The proposed architecture uses multiple loss functions, including cycle consistency, contrastive, identity, and least square losses, to enable the generation of realistic and high-fidelity depth maps. The proposed approach addresses this challenge by synthesizing depth maps from RGB images without requiring paired training data. Comparisons with several state-of-the-art approaches are provided showing the proposed approach overcome other approaches both in terms of quantitative metrics and visual quality.  
  Address Udine; Italia; Setember 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIAP  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ SCS2023 Serial 3968  
Permanent link to this record
 

 
Author Gisel Bastidas-Guacho; Patricio Moreno; Boris X. Vintimilla; Angel Sappa edit  url
doi  openurl
  Title Application on the Loop of Multimodal Image Fusion: Trends on Deep-Learning Based Approaches Type Conference Article
  Year 2023 Publication 13th International Conference on Pattern Recognition Systems Abbreviated Journal  
  Volume (down) 14234 Issue Pages 25–36  
  Keywords  
  Abstract Multimodal image fusion allows the combination of information from different modalities, which is useful for tasks such as object detection, edge detection, and tracking, to name a few. Using the fused representation for applications results in better task performance. There are several image fusion approaches, which have been summarized in surveys. However, the existing surveys focus on image fusion approaches where the application on the loop of multimodal image fusion is not considered. On the contrary, this study summarizes deep learning-based multimodal image fusion for computer vision (e.g., object detection) and image processing applications (e.g., semantic segmentation), that is, approaches where the application module leverages the multimodal fusion process to enhance the final result. Firstly, we introduce image fusion and the existing general frameworks for image fusion tasks such as multifocus, multiexposure and multimodal. Then, we describe the multimodal image fusion approaches. Next, we review the state-of-the-art deep learning multimodal image fusion approaches for vision applications. Finally, we conclude our survey with the trends of task-driven multimodal image fusion.  
  Address Guayaquil; Ecuador; July 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPRS  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ BMV2023 Serial 3932  
Permanent link to this record
 

 
Author Mickael Coustaty; Alicia Fornes edit  url
openurl 
  Title Document Analysis and Recognition – ICDAR 2023 Workshops Type Book Whole
  Year 2023 Publication Document Analysis and Recognition – ICDAR 2023 Workshops Abbreviated Journal  
  Volume (down) 14194 Issue 2 Pages  
  Keywords  
  Abstract  
  Address San Jose; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ CoF2023 Serial 3852  
Permanent link to this record
 

 
Author Pau Torras; Mohamed Ali Souibgui; Sanket Biswas; Alicia Fornes edit  url
openurl 
  Title Segmentation-Free Alignment of Arbitrary Symbol Transcripts to Images Type Conference Article
  Year 2023 Publication Document Analysis and Recognition – ICDAR 2023 Workshops Abbreviated Journal  
  Volume (down) 14193 Issue Pages 83-93  
  Keywords Historical Manuscripts; Symbol Alignment  
  Abstract Developing arbitrary symbol recognition systems is a challenging endeavour. Even using content-agnostic architectures such as few-shot models, performance can be substantially improved by providing a number of well-annotated examples into training. In some contexts, transcripts of the symbols are available without any position information associated to them, which enables using line-level recognition architectures. A way of providing this position information to detection-based architectures is finding systems that can align the input symbols with the transcription. In this paper we discuss some symbol alignment techniques that are suitable for low-data scenarios and provide an insight on their perceived strengths and weaknesses. In particular, we study the usage of Connectionist Temporal Classification models, Attention-Based Sequence to Sequence models and we compare them with the results obtained on a few-shot recognition system.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ TSS2023 Serial 3850  
Permanent link to this record
 

 
Author Adarsh Tiwari; Sanket Biswas; Josep Llados edit  url
openurl 
  Title Can Pre-trained Language Models Help in Understanding Handwritten Symbols? Type Conference Article
  Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume (down) 14193 Issue Pages 199–211  
  Keywords  
  Abstract The emergence of transformer models like BERT, GPT-2, GPT-3, RoBERTa, T5 for natural language understanding tasks has opened the floodgates towards solving a wide array of machine learning tasks in other modalities like images, audio, music, sketches and so on. These language models are domain-agnostic and as a result could be applied to 1-D sequences of any kind. However, the key challenge lies in bridging the modality gap so that they could generate strong features beneficial for out-of-domain tasks. This work focuses on leveraging the power of such pre-trained language models and discusses the challenges in predicting challenging handwritten symbols and alphabets.  
  Address San Jose; CA; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ TBL2023 Serial 3908  
Permanent link to this record
 

 
Author George Tom; Minesh Mathew; Sergi Garcia Bordils; Dimosthenis Karatzas; CV Jawahar edit  url
openurl 
  Title Reading Between the Lanes: Text VideoQA on the Road Type Conference Article
  Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume (down) 14192 Issue Pages 137–154  
  Keywords VideoQA; scene text; driving videos  
  Abstract Text and signs around roads provide crucial information for drivers, vital for safe navigation and situational awareness. Scene text recognition in motion is a challenging problem, while textual cues typically appear for a short time span, and early detection at a distance is necessary. Systems that exploit such information to assist the driver should not only extract and incorporate visual and textual cues from the video stream but also reason over time. To address this issue, we introduce RoadTextVQA, a new dataset for the task of video question answering (VideoQA) in the context of driver assistance. RoadTextVQA consists of 3, 222 driving videos collected from multiple countries, annotated with 10, 500 questions, all based on text or road signs present in the driving videos. We assess the performance of state-of-the-art video question answering models on our RoadTextVQA dataset, highlighting the significant potential for improvement in this domain and the usefulness of the dataset in advancing research on in-vehicle support systems and text-aware multimodal question answering. The dataset is available at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtextvqa.  
  Address San Jose; CA; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ TMG2023 Serial 3906  
Permanent link to this record
 

 
Author Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol edit  url
openurl 
  Title Accelerating Transformer-Based Scene Text Detection and Recognition via Token Pruning Type Conference Article
  Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume (down) 14192 Issue Pages 106-121  
  Keywords Scene Text Detection; Scene Text Recognition; Transformer Acceleration  
  Abstract Scene text detection and recognition is a crucial task in computer vision with numerous real-world applications. Transformer-based approaches are behind all current state-of-the-art models and have achieved excellent performance. However, the computational requirements of the transformer architecture makes training these methods slow and resource heavy. In this paper, we introduce a new token pruning strategy that significantly decreases training and inference times without sacrificing performance, striking a balance between accuracy and speed. We have applied this pruning technique to our own end-to-end transformer-based scene text understanding architecture. Our method uses a separate detection branch to guide the pruning of uninformative image features, which significantly reduces the number of tokens at the input of the transformer. Experimental results show how our network is able to obtain competitive results on multiple public benchmarks while running at significantly higher speeds.  
  Address San Jose; CA; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ GKR2023a Serial 3907  
Permanent link to this record
 

 
Author Francesc Net; Marc Folia; Pep Casals; Lluis Gomez edit  url
openurl 
  Title Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections Type Conference Article
  Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume (down) 14191 Issue Pages 3-17  
  Keywords Image deduplication; Near-duplicate images detection; Transductive Learning; Photographic Archives; Deep Learning  
  Abstract This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case scenario, where a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time spent on manual annotation by archivists. This real use case differs from laboratory settings as the deployment dataset is available in advance, allowing the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). Our approach involves pre-training a deep neural network on a large dataset and then fine-tuning the network on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods in the task of near-duplicate image detection in the UKBench and an in-house private dataset.  
  Address San Jose; CA; USA; August 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ NFC2023 Serial 3859  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: