toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
  Records Links
Author Mohamed Ali Souibgui; Alicia Fornes; Yousri Kessentini; Beata Megyesi edit  doi
openurl 
  Title Few shots are all you need: A progressive learning approach for low resource handwritten text recognition Type Journal Article
  Year (down) 2022 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 160 Issue Pages 43-49  
  Keywords  
  Abstract Handwritten text recognition in low resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. In this paper, we propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation process, by requiring only a few images of each alphabet symbols. The method consists of detecting all the symbols of a given alphabet in a textline image and decoding the obtained similarity scores to the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from an alphabet, which could differ from the alphabet of the target domain. A second training step is then applied to reduce the gap between the source and the target data. Since this retraining would require annotation of thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unlabeled data. The evaluation on different datasets shows that our model can lead to competitive results with a significant reduction in human effort. The code will be publicly available in the following repository: https://github.com/dali92002/HTRbyMatching  
  Address  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.162; 602.230 Approved no  
  Call Number Admin @ si @ SFK2022 Serial 3736  
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal edit   pdf
doi  openurl
  Title DocEnTr: An End-to-End Document Image Enhancement Transformer Type Conference Article
  Year (down) 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1699-1705  
  Keywords Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads  
  Abstract Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Conducted experiments show a superiority of the proposed model compared to the state-of the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR  
  Address August 21-25, 2022 , Montréal Québec  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ SBJ2022 Serial 3730  
Permanent link to this record
 

 
Author Mohamed Ramzy Ibrahim; Robert Benavente; Felipe Lumbreras; Daniel Ponsa edit   pdf
doi  openurl
  Title 3DRRDB: Super Resolution of Multiple Remote Sensing Images using 3D Residual in Residual Dense Blocks Type Conference Article
  Year (down) 2022 Publication CVPR 2022 Workshop on IEEE Perception Beyond the Visible Spectrum workshop series (PBVS, 18th Edition) Abbreviated Journal  
  Volume Issue Pages  
  Keywords Training; Solid modeling; Three-dimensional displays; PSNR; Convolution; Superresolution; Pattern recognition  
  Abstract The rapid advancement of Deep Convolutional Neural Networks helped in solving many remote sensing problems, especially the problems of super-resolution. However, most state-of-the-art methods focus more on Single Image Super-Resolution neglecting Multi-Image Super-Resolution. In this work, a new proposed 3D Residual in Residual Dense Blocks model (3DRRDB) focuses on remote sensing Multi-Image Super-Resolution for two different single spectral bands. The proposed 3DRRDB model explores the idea of 3D convolution layers in deeply connected Dense Blocks and the effect of local and global residual connections with residual scaling in Multi-Image Super-Resolution. The model tested on the Proba-V challenge dataset shows a significant improvement above the current state-of-the-art models scoring a Corrected Peak Signal to Noise Ratio (cPSNR) of 48.79 dB and 50.83 dB for Near Infrared (NIR) and RED Bands respectively. Moreover, the proposed 3DRRDB model scores a Corrected Structural Similarity Index Measure (cSSIM) of 0.9865 and 0.9909 for NIR and RED bands respectively.  
  Address New Orleans, USA; 19 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes MSIAU; 600.130 Approved no  
  Call Number Admin @ si @ IBL2022 Serial 3693  
Permanent link to this record
 

 
Author Nil Ballus; Bhalaji Nagarajan; Petia Radeva edit  url
doi  openurl
  Title Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition Type Conference Article
  Year (down) 2022 Publication 10th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 13256 Issue Pages  
  Keywords Self-supervised; Contrastive learning; Food recognition  
  Abstract Self-supervised Learning has been showing upbeat performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state of the art frameworks, namely, SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method we call Opt-SSL that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our proposed framework enables the self-learning networks to learn better visual representations.  
  Address Aveiro; Portugal; May 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IbPRIA  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ BNR2022 Serial 3782  
Permanent link to this record
 

 
Author Oriol Ramos Terrades; Albert Berenguel; Debora Gil edit   pdf
doi  openurl
  Title A Flexible Outlier Detector Based on a Topology Given by Graph Communities Type Journal Article
  Year (down) 2022 Publication Big Data Research Abbreviated Journal BDR  
  Volume 29 Issue Pages 100332  
  Keywords Classification algorithms; Detection algorithms; Description of feature space local structure; Graph communities; Machine learning algorithms; Outlier detectors  
  Abstract Outlier detection is essential for optimal performance of machine learning methods and statistical predictive models. Their detection is especially determinant in small sample size unbalanced problems, since in such settings outliers become highly influential and significantly bias models. This particular experimental settings are usual in medical applications, like diagnosis of rare pathologies, outcome of experimental personalized treatments or pandemic emergencies. In contrast to population-based methods, neighborhood based local approaches compute an outlier score from the neighbors of each sample, are simple flexible methods that have the potential to perform well in small sample size unbalanced problems. A main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters, like the number of neighbors.
This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide with a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world and synthetic data sets show that our approach outperforms, both, local and global strategies in multi and single view settings.
 
  Address August 28, 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; IAM; 600.140; 600.121; 600.139; 600.145; 600.159 Approved no  
  Call Number Admin @ si @ RBG2022a Serial 3718  
Permanent link to this record
 

 
Author Parichehr Behjati Ardakani edit  isbn
openurl 
  Title Towards Efficient and Robust Convolutional Neural Networks for Single Image Super-Resolution Type Book Whole
  Year (down) 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Single image super-resolution (SISR) is an important task in image processing which aims to enhance the resolution of imaging systems. Recently, SISR has witnessed great strides with the rapid development of deep learning. Recent advances in SISR are mostly devoted to designing deeper and wider networks to enhance their representation learning capacity. However, as the depth of networks increases, deep learning-based methods are faced with the challenge of computational complexity in practice. Moreover, most existing methods rarely leverage the intermediate features and also do not discriminate the computation of features by their frequencial components, thereby achieving relatively low performance. Aside from the aforementioned problems, another desired ability is to upsample images to arbitrary scales using a single model. Most current SISR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. In this thesis, we address the aforementioned issues and propose solutions to them: i) We present a novel frequency-based enhancement block which treats different frequencies in a heterogeneous way and also models inter-channel dependencies, which consequently enrich the output feature. Thus it helps the network generate more discriminative representations by explicitly recovering finer details. ii) We introduce OverNet which contains two main parts: a lightweight feature extractor that follows a novel recursive framework of skip and dense connections to reduce low-level feature degradation, and an overscaling module that generates an accurate SR image by internally constructing an overscaled intermediate representation of the output features. Then, to solve the problem of reconstruction at arbitrary scale factors, we introduce a novel multi-scale loss, that allows the simultaneous training of all scale factors using a single model. iii) We propose a directional variance attention network which leverages a novel attention mechanism to enhance features in different channels and spatial regions. Moreover, we introduce a novel procedure for using attention mechanisms together with residual blocks to facilitate the preservation of finer details. Finally, we demonstrate that our approaches achieve considerably better performance than previous state-of-the-art methods, in terms of both quantitative and visual quality.  
  Address April, 2022  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Jordi Gonzalez;Xavier Roca;Pau Rodriguez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-124793-1-7 Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Beh2022 Serial 3713  
Permanent link to this record
 

 
Author Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez edit  doi
openurl 
  Title Frequency-based Enhancement Network for Efficient Super-Resolution Type Journal Article
  Year (down) 2022 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 10 Issue Pages 57383-57397  
  Keywords Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution  
  Abstract Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet  
  Address 18 May 2022  
  Corporate Author Thesis  
  Publisher IEEE Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ BRF2022a Serial 3747  
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Dario Carpio; Henry Velesaca; Francisca Burgos; Patricia Urdiales edit   pdf
url  doi
openurl 
  Title Deep Learning Based Shrimp Classification Type Conference Article
  Year (down) 2022 Publication 17th International Symposium on Visual Computing Abbreviated Journal  
  Volume 13598 Issue Pages 36–45  
  Keywords Pigmentation; Color space; Light weight network  
  Abstract This work proposes a novel approach based on deep learning to address the classification of shrimp (Pennaeus vannamei) into two classes, according to their level of pigmentation accepted by shrimp commerce. The main goal of this actual study is to support the shrimp industry in terms of price and process. An efficient CNN architecture is proposed to perform image classification through a program that could be set other in mobile devices or in fixed support in the shrimp supply chain. The proposed approach is a lightweight model that uses HSV color space shrimp images. A simple pipeline shows the most important stages performed to determine a pattern that identifies the class to which they belong based on their pigmentation. For the experiments, a database acquired with mobile devices of various brands and models has been used to capture images of shrimp. The results obtained with the images in the RGB and HSV color space allow for testing the effectiveness of the proposed model.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ISVC  
  Notes MSIAU; no proj Approved no  
  Call Number Admin @ si @ SAC2022 Serial 3772  
Permanent link to this record
 

 
Author Patricia Suarez; Dario Carpio; Angel Sappa; Henry Velesaca edit   pdf
url  doi
openurl 
  Title Transformer based Image Dehazing Type Conference Article
  Year (down) 2022 Publication 16th IEEE International Conference on Signal Image Technology & Internet Based System Abbreviated Journal  
  Volume Issue Pages  
  Keywords atmospheric light; brightness component; computational cost; dehazing quality; haze-free image  
  Abstract This paper presents a novel approach to remove non homogeneous haze from real images. The proposed method consists mainly of image feature extraction, haze removal, and image reconstruction. To accomplish this challenging task, we propose an architecture based on transformers, which have been recently introduced and have shown great potential in different computer vision tasks. Our model is based on the SwinIR an image restoration architecture based on a transformer, but by modifying the deep feature extraction module, the depth level of the model, and by applying a combined loss function that improves styling and adapts the model for the non-homogeneous haze removal present in images. The obtained results prove to be superior to those obtained by state-of-the-art models.  
  Address Dijon; France; October 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference SITIS  
  Notes MSIAU; no proj Approved no  
  Call Number Admin @ si @ SCS2022 Serial 3803  
Permanent link to this record
 

 
Author Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados edit  doi
openurl 
  Title Table detection in business document images by message passing networks Type Journal Article
  Year (down) 2022 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 127 Issue Pages 108641  
  Keywords  
  Abstract Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.  
  Address July 2022  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.162; 600.121 Approved no  
  Call Number Admin @ si @ RGR2022 Serial 3729  
Permanent link to this record
Select All    Deselect All
 |   | 
Details

Save Citations:
Export Records: