@pucpr.br
PPGIA
Pontifícia Universidade Católica do Paraná (PUCPR)
Computer Vision and Pattern Recognition, Artificial Intelligence, Computer Science
Bruna Rossetto Delazeri, Andre Gustavo Hochuli, Jean Paul Barddal, Alessandro Lameiras Koerich, and Alceu de Souza Britto
Springer Science and Business Media LLC
Maria Eduarda Maciel Pinto, Alceu De Souza Britto, and Andre Gustavo Hochuli
IEEE
Audio replay attacks present a significant challenge to automatic speaker verification systems (ASVs), emphasizing the need for effective detection methods. Traditionally, embedding-based approaches, such as those leveraging Convolutional Neural Networks (CNNs), have been used. However, dissimilarity-based methods emerge as a promising alternative, offering potential advantages in detecting subtle differences between genuine and spoofed audio. This study evaluates dissimilarity strategies for detecting genuine versus spoofed audio signals using a well-known benchmark dataset and established metrics, including accuracy and Equal Error Rate (EER). We provide a comparative performance assessment of various CNN architectures and dissimilarity strategies, finding that while dissimilarity approaches are competitive with embedding-based methods, the Dissimilarity Vectors strategy outperforms the Dissimilarity Space strategy.
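As a rough illustration of the concepts named in this abstract, the sketch below builds a dissimilarity vector from two embeddings and approximates the Equal Error Rate from two sets of scores. It assumes that the dissimilarity vector is the element-wise absolute difference of two embeddings and that higher scores mean "more likely genuine"; both are assumptions for illustration and are not confirmed by the abstract. The function names are hypothetical.

```python
def dissimilarity_vector(embedding_a, embedding_b):
    """Element-wise absolute difference of two embeddings (assumed definition)."""
    return [abs(a - b) for a, b in zip(embedding_a, embedding_b)]


def equal_error_rate(genuine_scores, spoof_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Assumption: a higher score means "more likely genuine".
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine sample scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: a spoofed sample scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # The EER is reported where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

With few samples the two error rates rarely cross exactly, so this sketch reports their average at the closest threshold rather than interpolating.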
Paulo Luza Alves, André Hochuli, Luiz Eduardo de Oliveira, and Paulo Lisboa de Almeida
IEEE
When deploying large-scale machine learning models for smart city applications, such as image-based parking lot monitoring, data often must be sent to a central server to perform classification tasks. This is challenging for the city's infrastructure, where image-based applications require transmitting large volumes of data, necessitating complex network and hardware infrastructures to process the data. To address this issue in image-based parking space classification, we propose creating a robust ensemble of classifiers to serve as Teacher models. These Teacher models are distilled into lightweight and specialized Student models that can be deployed directly on edge devices. The knowledge is distilled to the Student models through pseudo-labeled samples generated by the Teacher model, which are utilized to fine-tune the Student models on the target scenario. Our results show that the Student models, with 26 times fewer parameters than the Teacher models, achieved an average accuracy of 96.6% on the target test datasets, surpassing the Teacher models, which attained an average accuracy of 95.3%.
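The pseudo-labeling step described in this abstract might look roughly like the sketch below. It assumes the Teacher ensemble is combined by averaging per-sample probabilities of the "occupied" class and adds an optional confidence filter; the averaging rule, the `min_confidence` parameter, and the function name are assumptions for illustration, not details given in the abstract.

```python
def pseudo_label(teacher_probs, min_confidence=0.0):
    """Turn Teacher ensemble outputs into pseudo-labels for Student fine-tuning.

    teacher_probs: one list per Teacher, each holding P(occupied) per sample.
    Returns (sample_index, label) pairs, where 1 = occupied and 0 = empty.
    """
    n_teachers = len(teacher_probs)
    n_samples = len(teacher_probs[0])
    labeled = []
    for i in range(n_samples):
        # Ensemble by averaging the Teachers' probabilities (assumed rule).
        mean_p = sum(t[i] for t in teacher_probs) / n_teachers
        label = 1 if mean_p >= 0.5 else 0
        # Confidence of the chosen class; the filter is a hypothetical extra.
        confidence = max(mean_p, 1.0 - mean_p)
        if confidence >= min_confidence:
            labeled.append((i, label))
    return labeled
```

The returned pairs would then serve as the training set for fine-tuning a Student model on unlabeled images from the target parking lot.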
Andre Gustavo Hochuli, Jean Paul Barddal, Gillian Cezar Palhano, Leonardo Matheus Mendes, and Paulo Ricardo Lisboa de Almeida
IEEE
Searching for available parking spots in high-density urban centers is a stressful task for drivers that can be mitigated by systems that know in advance the nearest parking space available. To this end, image-based systems offer cost advantages over other sensor-based alternatives (e.g., ultrasonic sensors), requiring less physical infrastructure for installation and maintenance. Despite recent deep learning advances, deploying intelligent parking monitoring is still a challenge since most approaches involve collecting and labeling large amounts of data, which is laborious and time-consuming. Our study aims to uncover the challenges in creating a global framework, trained using publicly available labeled parking lot images, that performs accurately across diverse scenarios, enabling parking space monitoring as a ready-to-use system that can be deployed in a new environment. Through exhaustive experiments involving different datasets and deep learning architectures, including fusion strategies and ensemble methods, we found that models trained on diverse datasets can achieve 95% accuracy without the burden of data annotation and model training on the target parking lot.
Paulo R. Lisboa de Almeida, Jeovane Honório Alves, Luiz S. Oliveira, Andre Gustavo Hochuli, João V. Fröhlich, and Rodrigo A. Krauel
IEEE
Smart-parking solutions use sensors, cameras, and data analysis to improve parking efficiency and reduce traffic congestion. Computer vision-based methods have been used extensively in recent years to tackle the problem of parking lot management, but most works assume that the parking spots are manually labeled, impacting the cost and feasibility of deployment. To fill this gap, this work presents an automatic parking space detection method, which receives a sequence of images of a parking lot and returns a list of coordinates identifying the detected parking spaces. The proposed method employs instance segmentation to identify cars and, using vehicle occurrence, generates a heat map of parking spaces. The results using twelve different subsets from the PKLot and CNRPark-EXT parking lot datasets show that the method achieved an AP25 score of up to 95.60% and an AP50 score of up to 79.90%.
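The idea of accumulating vehicle occurrences into a heat map can be sketched as follows. For simplicity, this sketch uses axis-aligned boxes instead of the instance segmentation masks mentioned in the abstract, and it marks cells as likely parking area with a hypothetical `min_fraction` threshold; both choices are assumptions for illustration.

```python
def occupancy_heat_map(frames, height, width):
    """Count, per pixel, in how many detections a vehicle covered it.

    frames: one list of boxes per image; each box is (x0, y0, x1, y1),
    half-open, in pixel coordinates. Boxes stand in for segmentation masks.
    """
    heat = [[0] * width for _ in range(height)]
    for boxes in frames:
        for x0, y0, x1, y1 in boxes:
            # Clip each box to the image before accumulating.
            for y in range(max(y0, 0), min(y1, height)):
                for x in range(max(x0, 0), min(x1, width)):
                    heat[y][x] += 1
    return heat


def hot_cells(heat, n_frames, min_fraction=0.5):
    """Return (x, y) cells covered in at least min_fraction of the frames."""
    return [
        (x, y)
        for y, row in enumerate(heat)
        for x, count in enumerate(row)
        if count / n_frames >= min_fraction
    ]
```

Grouping the resulting hot cells into individual parking-space coordinates would be a further step that this sketch does not cover.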
Matheus Moresco, Alceu De S. Britto, Yandre M. G. Costa, Luciano J. Senger, and Andre G. Hochuli
IEEE
Plant species classification using leaf images is challenging due to the lack of annotations, imbalanced classes, and similarities in the data representation. For such problems, Siamese Neural Networks (SNNs) have been used to overcome these bottlenecks in several contexts. In light of this, this work evaluates different architectures trained in a Siamese manner for classifying plant species from leaf images. In addition, we combined features from the intermediate convolutional layers to improve representations. Experiments on the well-known Flavia and MalayaKew databases have shown that the fusion of intermediate features results in a relevant gain in performance.
Andre G. Hochuli, Alceu S. Britto, Paulo R. L. de Almeida, Williams B. S. Alves, and Fabio M. C. Cagni
IEEE
When using vision-based approaches to classify individual parking spaces as occupied or empty, human experts often need to annotate the locations and label a training set containing images collected in the target parking lot to fine-tune the system. We propose investigating three annotation types (polygons, bounding boxes, and fixed-size squares), providing different data representations of the parking spaces. The rationale is to elucidate the best trade-off between manual annotation precision and model performance. We also investigate the number of annotated parking spaces necessary to fine-tune a pre-trained model in the target parking lot. Experiments using the PKLot dataset show that it is possible to fine-tune a model to the target parking lot with fewer than 1,000 labeled samples, using low-precision annotations such as fixed-size squares.
Andre G. Hochuli, Alceu S. Britto Jr, David A. Saji, José M. Saavedra, Robert Sabourin, and Luiz S. Oliveira
Elsevier BV
Mengqiao Zhao, Andre Gustavo Hochuli, and Abbas Cheddad
Springer International Publishing
Andre G. Hochuli, Alceu S. Britto, Jean P. Barddal, Robert Sabourin, and Luiz E. S. Oliveira
IEEE
An end-to-end solution for handwritten numeral string recognition is proposed, in which the numeral string is considered as composed of objects automatically detected and recognized by a YOLO-based model. The main contribution of this paper is to avoid heuristic-based methods for string preprocessing and segmentation, the need for task-oriented classifiers, and the use of specific constraints related to the string length. A robust experimental protocol based on several numeral string datasets, including one composed of historical documents, has shown that the proposed method is a feasible end-to-end solution for numeral string recognition. In addition, it considerably reduces the complexity of the string recognition task since it drops classical steps, in particular preprocessing, segmentation, and a set of classifiers devoted to strings of a specific length.
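One way to picture how detected digits become a numeral string is the sketch below. It assumes each detection carries a digit class, a confidence score, and a horizontal center, and that the string is read by ordering detections from left to right; the dictionary keys, the `min_score` filter, and the ordering rule are assumptions for illustration, not the paper's documented procedure.

```python
def detections_to_string(detections, min_score=0.5):
    """Compose a numeral string from per-digit detections.

    detections: dicts with hypothetical keys "digit", "score", "x_center".
    """
    # Discard low-confidence detections (hypothetical filter).
    kept = [d for d in detections if d["score"] >= min_score]
    # Read the digits from left to right (assumed ordering rule).
    kept.sort(key=lambda d: d["x_center"])
    return "".join(str(d["digit"]) for d in kept)
```

Because the string length is simply the number of detections kept, no length-specific classifier is needed in this sketch.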
Andre G. Hochuli, Luiz S. Oliveira, Alceu de Souza Britto, and Robert Sabourin
IEEE
This paper presents segmentation-free strategies for the recognition of handwritten numeral strings of unknown length. A synthetic dataset of touching numeral strings of 2, 3, and 4 digits was created to train end-to-end solutions based on Convolutional Neural Networks. A robust experimental protocol is used to show that the proposed segmentation-free methods may reach state-of-the-art performance without suffering the heavy burden of over-segmentation-based methods. In addition, the results confirm the importance of introducing contextual information in the design of end-to-end solutions, such as the proposed length classifier when recognizing numeral strings.
A.G. Hochuli, L.S. Oliveira, A.S. Britto Jr, and R. Sabourin
Elsevier BV
Andre G. Hochuli, Alceu S. Britto, and Alessandro L. Koerich
IEEE
This article presents a novel approach for detecting non-conventional events in video scenes. The approach analyzes video from a security camera in real time to detect, segment, and track moving objects, and then classifies each object's movement as conventional or non-conventional. For each tracked object in the scene, features such as position, speed, changes in direction, and changes in bounding box size are extracted to form a feature vector. At the classification step, the feature vectors generated from moving objects are matched, almost in real time, against previously labeled reference feature vectors stored in a database, and an algorithm based on the instance-based learning paradigm classifies the object's movement as conventional or non-conventional. Experimental results on video clips from two databases (Parking Lot and CAVIAR) show that the proposed approach detects non-conventional events with accuracies between 77% and 82%.
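The classification step described in this abstract belongs to the instance-based learning family, of which k-nearest neighbors is a common example. The sketch below uses k-NN with Euclidean distance and majority voting as a stand-in; the abstract does not state which instance-based algorithm, distance, or value of k was used, so these are assumptions, as are the function name and the toy feature vectors.

```python
import math
from collections import Counter


def knn_classify(query, references, k=3):
    """Label a feature vector by majority vote among its k nearest references.

    references: (feature_vector, label) pairs that were labeled beforehand.
    """
    # Rank the stored reference vectors by Euclidean distance to the query.
    nearest = sorted(references, key=lambda ref: math.dist(query, ref[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

In practice the features (position, speed, direction changes, bounding box size) have different scales, so normalizing them before computing distances would likely be necessary.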
A. G. Hochuli, L. E. S. Oliveira, A. S. Britto, and A. L. Koerich
Springer Berlin Heidelberg