|
Records |
Links |
|
Author |
Emanuele Vivoli; Ali Furkan Biten; Andres Mafla; Dimosthenis Karatzas; Lluis Gomez |
|
|
Title |
MUST-VQA: MUltilingual Scene-text VQA |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Proceedings European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
13804 |
Issue |
|
Pages |
345–358 |
|
|
Keywords |
Visual question answering; Scene text; Translation robustness; Multilingual models; Zero-shot transfer; Power of language models |
|
|
Abstract |
In this paper, we present a framework for Multilingual Scene Text Visual Question Answering that deals with new languages in a zero-shot fashion. Specifically, we consider the task of Scene Text Visual Question Answering (STVQA) in which the question can be asked in different languages and is not necessarily aligned with the scene text language. Thus, we first introduce a natural step towards a more generalized version of STVQA: MUST-VQA. Accounting for this, we discuss two evaluation scenarios in the constrained setting, namely IID and zero-shot, and we demonstrate that the models can perform on a par in the zero-shot setting. We further provide extensive experimentation and show the effectiveness of adapting multilingual language models to STVQA tasks. |
|
|
Address |
Tel-Aviv; Israel; October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCVW |
|
|
Notes |
DAG; 302.105; 600.155; 611.002 |
Approved |
no |
|
|
Call Number |
Admin @ si @ VBM2022 |
Serial |
3770 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas |
|
|
Title |
Out-of-Vocabulary Challenge Report |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Proceedings European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
13804 |
Issue |
|
Pages |
359–375 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents the final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of scene text instances unseen at training time. The competition compiles a collection of public scene text datasets comprising 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks: end-to-end and cropped scene text recognition. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential resource for developing scene text models that achieve more robust and generalized predictions. |
|
|
Address |
Tel-Aviv; Israel; October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCVW |
|
|
Notes |
DAG; 600.155; 302.105; 611.002 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GMB2022 |
Serial |
3771 |
|
Permanent link to this record |
|
|
|
|
Author |
Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai |
|
|
Title |
Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks |
Type |
Conference Article |
|
Year |
2022 |
Publication |
17th European Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
13804 |
Issue |
|
Pages |
329–344 |
|
|
Keywords |
|
|
|
Abstract |
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-031-25068-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV-TiE |
|
|
Notes |
DAG; 600.162; 600.140; 110.312 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBC2022 |
Serial |
3795 |
|
Permanent link to this record |
|
|
|
|
Author |
Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli |
|
|
Title |
A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
3–12 |
|
|
Keywords |
N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections |
|
|
Abstract |
Despite recent advances in automatic text recognition, performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of labelled data available to train data-hungry Handwritten Text Recognition (HTR) models. Keyword Spotting (KWS) systems provide a valid alternative to HTR due to their lower error rate, but they are usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-grams) that requires a small amount of labelled training data. We show that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections, yielding promising results in this direction. |
|
|
Address |
December 04 – 07, 2022; Hyderabad, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBS2022 |
Serial |
3733 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Pau Riba; Alicia Fornes |
|
|
Title |
Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
171–184 |
|
|
Keywords |
Object detection; Optical music recognition; Graph neural network |
|
|
Abstract |
During the last decades, the performance of optical music recognition has steadily improved. However, despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which makes their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition: first, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown great performance in similar applications. Our methodology consists of three steps. First, we detect each isolated/atomic symbol (those that cannot be decomposed into further graphical primitives) and the primitives that form a musical symbol. Then, we build the graph taking the notehead as root node and, as leaves, those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results. |
|
|
Address |
December 04 – 07, 2022; Hyderabad, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.162; 600.140; 602.230 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2022b |
Serial |
3740 |
|
Permanent link to this record |
|
|
|
|
Author |
Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds) |
|
|
Title |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022 |
Type |
Book Whole |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
ICFHR 2022, Hyderabad, India, December 4–7, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Utkarsh Porwal; Alicia Fornes; Faisal Shafait |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-031-21648-0 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ PFS2022 |
Serial |
3809 |
|
Permanent link to this record |
|
|
|
|
Author |
Patricia Suarez; Angel Sappa; Dario Carpio; Henry Velesaca; Francisca Burgos; Patricia Urdiales |
|
|
Title |
Deep Learning Based Shrimp Classification |
Type |
Conference Article |
|
Year |
2022 |
Publication |
17th International Symposium on Visual Computing |
Abbreviated Journal |
|
|
|
Volume |
13598 |
Issue |
|
Pages |
36–45 |
|
|
Keywords |
Pigmentation; Color space; Lightweight network |
|
|
Abstract |
This work proposes a novel approach based on deep learning to address the classification of shrimp (Penaeus vannamei) into two classes, according to the level of pigmentation accepted by shrimp commerce. The main goal of this study is to support the shrimp industry in terms of price and process. An efficient CNN architecture is proposed to perform image classification through a program that could be deployed either on mobile devices or on fixed supports in the shrimp supply chain. The proposed approach is a lightweight model that uses shrimp images in the HSV color space. A simple pipeline shows the most important stages performed to determine a pattern that identifies the class to which the shrimp belong based on their pigmentation. For the experiments, a database of shrimp images acquired with mobile devices of various brands and models has been used. The results obtained with the images in the RGB and HSV color spaces allow for testing the effectiveness of the proposed model. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ISVC |
|
|
Notes |
MSIAU; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ SAC2022 |
Serial |
3772 |
|
Permanent link to this record |
|
|
|
|
Author |
Asma Bensalah; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados |
|
|
Title |
Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
|
|
Volume |
13424 |
Issue |
|
Pages |
336–348 |
|
|
Keywords |
Neurorehabilitation; Upper-limb; Movement classification; Movement smoothness; Deep learning; Jerk |
|
|
Abstract |
Assessing the quality of movements of post-stroke patients during the rehabilitation phase is vital, given that there is no standard stroke rehabilitation plan for all patients. In fact, the plan depends mainly on the patient’s functional independence and their progress over the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that first recognises patients’ movements by means of a shallow deep learning architecture and then measures the movement quality using the jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired by the Fugl-Meyer assessment, a widely used upper-limb clinical assessment scale for stroke patients. We show that it is possible to detect the contrast between the movements of healthy subjects and those of patients in terms of smoothness, and to reach conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case. |
|
|
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BFC2022 |
Serial |
3738 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Asma Bensalah; Cristina Carmona_Duarte; Jialuo Chen; Miguel A. Ferrer; Andreas Fischer; Josep Llados; Cristina Martin; Eloy Opisso; Rejean Plamondon; Anna Scius-Bertrand; Josep Maria Tormos |
|
|
Title |
The RPM3D Project: 3D Kinematics for Remote Patient Monitoring |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
|
|
Volume |
13424 |
Issue |
|
Pages |
217–226 |
|
|
Keywords |
Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics |
|
|
Abstract |
This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/), a neurorehabilitation hospital, showing promising results. Our work could have a great impact on remote healthcare applications, improving medical efficiency and reducing healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases. |
|
|
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FBC2022 |
Serial |
3739 |
|
Permanent link to this record |
|
|
|
|
Author |
Nil Ballus; Bhalaji Nagarajan; Petia Radeva |
|
|
Title |
Opt-SSL: An Enhanced Self-Supervised Framework for Food Recognition |
Type |
Conference Article |
|
Year |
2022 |
Publication |
10th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
13256 |
Issue |
|
Pages |
|
|
|
Keywords |
Self-supervised; Contrastive learning; Food recognition |
|
|
Abstract |
Self-supervised learning has shown promising performance in several computer vision tasks. The popular contrastive methods make use of a Siamese architecture with different loss functions. In this work, we go deeper into two very recent state-of-the-art frameworks, namely SimSiam and Barlow Twins. Inspired by them, we propose a new self-supervised learning method, which we call Opt-SSL, that combines both image and feature contrasting. We validate the proposed method on the food recognition task, showing that our framework enables self-supervised networks to learn better visual representations. |
|
|
Address |
Aveiro; Portugal; May 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
MILAB; not mentioned |
Approved |
no |
|
|
Call Number |
Admin @ si @ BNR2022 |
Serial |
3782 |
|
Permanent link to this record |