Ivana Sixtová

Scopus Publications

VISAnt: Unsupervised Data Exploration with Chernoff Faces
Ivana Sixtova, Ladislav Peska, Jakub Lokoč, David Bernhauer, Tomas Skopal
Lecture Notes in Computer Science, 2026
Unified Visual-Aware Representations for Data Analytics
Ladislav Peška, Ivana Sixtová, D. Hoksza, D. Bernhauer, Jakub Lokoč, et al.
IEEE Access, 2025
One of the characteristics of big data is its internal complexity and variety manifested in many types of datasets that are to be managed, searched, or analyzed. In their natural forms, some data entities are unstructured, such as texts or multimedia objects, while some are structured but too complex (e.g., high-dimensional tabular data). Due to the many different forms of data managed in many domain-specific problems, there are many different data representations used – tailored to a specific data form, domain and task. In this paper, we propose a framework for universal visual representations of complex data. The desired property of the visualizations is the ability to visually encode the semantic features of the original data. Hence, processing of visualizations (images) by generic deep learning models results in deep feature vectors that could be uniformly used in standard data retrieval/analytics tasks. Specifically, we develop a semi-automated transfer learning pipeline for transforming arbitrary input tabular data into visual representations. The visual representations serve data analytics tasks performed by human users and also act as universal data representations used in machine learning models for automated tasks. We show in a large study that visual representations of complex data are effective in a number of domains, and we also propose a recommender to help with the parameterization of the entire pipeline for certain domains and use cases. In summary, the proposed framework enables rapid prototyping of data representations (in an arbitrary domain) using a shared concept – visual representations applicable in data analytics using generic deep learning models.
Visualizations for universal deep-feature representations: survey and taxonomy
Tomáš Skopal, Ladislav Peška, David Hoksza, Ivana Sixtová, David Bernhauer
Knowledge and Information Systems, 2024
In data science and content-based retrieval, we find many domain-specific techniques that employ a data processing pipeline with two fundamental steps. First, data entities are represented by some visualizations, while in the second step, the visualizations are used with a machine learning model to extract deep features. Deep convolutional neural networks (DCNN) became the standard and reliable choice. The purpose of using DCNN is either a specific classification task or just a deep feature representation of visual data for additional processing (e.g., similarity search). Whereas the deep feature extraction is a domain-agnostic step in the pipeline (inference of an arbitrary visual input), the visualization design itself is domain-dependent and ad hoc for every use case. In this paper, we survey and analyze many instances of data visualizations used with deep learning models (mostly DCNN) for domain-specific tasks. Based on the analysis, we synthesize a taxonomy that provides a systematic overview of visualization techniques suitable for usage with the models. The aim of the taxonomy is to enable the future generalization of the visualization design process to become completely domain-agnostic, leading to the automation of the entire feature extraction pipeline. As the ultimate goal, such an automated pipeline could lead to universal deep feature data representations for content-based retrieval.
Data analytics framework for sparse longitudinal structured biomedical data
Ivana Sixtová, Tomáš Uher, Tomáš Skopal
Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023, 2023
An increasing amount of data is stored in electronic health records originating from laboratory, imaging, and clinical examinations. However, the automated employment of machine learning algorithms for clinical decision tasks is still limited in the case of long-term medical structured data, such as the observations of patients suffering from multiple sclerosis, including numerical laboratory results and volumes derived from brain MRI segmentation. The main reason is the complexity of these data caused by high dimensionality, irregular temporal nature, and incompleteness in both time and observation dimensions. This study introduces a comprehensive automated framework designed for an end-to-end analysis of longitudinal structured biomedical data. It comprises a preprocessing component, which includes several methods for regularization and missing values imputation. Next, a prediction component suitable for various classification and regression tasks features a range of traditional machine learning and deep neural network models. Finally, the data visualization component based on the Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE) identifies the patterns in these complex data. Evaluation of this framework was conducted on a real-world dataset involving patients with multiple sclerosis, addressing tasks such as classifying the patient’s disability state and predicting the patient’s future disability score. Additionally, with the data visualization techniques, the study demonstrates that even incomplete long-term medical time series data can unveil valuable insights.
Visual Representations for Data Analytics: User Study
Ladislav Peska, Ivana Sixtova, David Hoksza, David Bernhauer, Tomas Skopal
Communications in Computer and Information Science, 2023
Machine and human interpretable patient visualizations
Ivana Sixtova, Tomas Skopal, David Hoksza, Jakub Matejik, Tomas Uher
Proceedings - 2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022, 2022
A growing amount of data is stored in electronic health records, which are crucial for the clinical decision-making process. A large part of these data has a tabular form, consisting of numerical and categorical values originating from various laboratory examinations and sensors. Unlike medical images and clinical notes, tabular data lack higher semantics and, combined with the high dimensionality and heterogeneity, their interpretation by a human is challenging. On the other hand, we have witnessed superior performance of deep convolutional neural network (DCNN) models in the visual medical domain. In this paper, we propose visual representations of complex tabular medical data readable simultaneously by humans and machines. To show that these representations can encode the patient’s data semantics effectively, we use them to fine-tune a DCNN to predict the disability level of patients suffering from multiple sclerosis. Our experiments show that the visual models could match the performance of non-visual models. Moreover, the visual representations add the benefit of summarizing complex information about the patient’s state to a human.
Sentence diagrams: Their evaluation and combination
LAW 2014 - 8th Linguistic Annotation Workshop, in conjunction with COLING 2014 - Proceedings of the Workshop, 2020