Ivana Sixtová

@mff.cuni.cz

Department of Software Engineering
Charles University

5

Scopus Publications

  • Visualizations for universal deep-feature representations: survey and taxonomy
    Tomáš Skopal, Ladislav Peška, David Hoksza, Ivana Sixtová, and David Bernhauer

    Springer Science and Business Media LLC
    In data science and content-based retrieval, we find many domain-specific techniques that employ a data processing pipeline with two fundamental steps. First, data entities are represented by some visualizations, while in the second step, the visualizations are used with a machine learning model to extract deep features. Deep convolutional neural networks (DCNN) became the standard and reliable choice. The purpose of using DCNN is either a specific classification task or just a deep feature representation of visual data for additional processing (e.g., similarity search). Whereas the deep feature extraction is a domain-agnostic step in the pipeline (inference of an arbitrary visual input), the visualization design itself is domain-dependent and ad hoc for every use case. In this paper, we survey and analyze many instances of data visualizations used with deep learning models (mostly DCNN) for domain-specific tasks. Based on the analysis, we synthesize a taxonomy that provides a systematic overview of visualization techniques suitable for usage with the models. The aim of the taxonomy is to enable the future generalization of the visualization design process to become completely domain-agnostic, leading to the automation of the entire feature extraction pipeline. As the ultimate goal, such an automated pipeline could lead to universal deep feature data representations for content-based retrieval.

  • Data analytics framework for sparse longitudinal structured biomedical data
    Ivana Sixtová, Tomáš Uher, and Tomáš Skopal

    IEEE
    An increasing amount of data is stored in electronic health records originating from laboratory, imaging, and clinical examinations. However, the automated employment of machine learning algorithms for clinical decision tasks is still limited in the case of long-term medical structured data, such as the observations of patients suffering from multiple sclerosis, including numerical laboratory results and volumes derived from brain MRI segmentation. The main reason is the complexity of these data caused by high dimensionality, irregular temporal nature, and incompleteness in both time and observation dimensions. This study introduces a comprehensive automated framework designed for an end-to-end analysis of longitudinal structured biomedical data. It comprises a preprocessing component, which includes several methods for regularization and missing values imputation. Following, a prediction component suitable for various classification and regression tasks features a range of traditional machine learning and deep neural network models. Finally, the data visualization component based on the Potential of Heat-diffusion for Affinity-based Trajectory identifies the patterns in these complex data. Evaluation of this framework was conducted on a real-world dataset involving patients with multiple sclerosis, addressing tasks such as classifying the patient’s disability state and predicting the patient’s future disability score. Additionally, with the data visualization techniques, the study demonstrates that even incomplete long-term medical time series data can unveil valuable insights.

  • Visual Representations for Data Analytics: User Study
    Ladislav Peška, Ivana Sixtová, David Hoksza, David Bernhauer, and Tomáš Skopal

    Springer Nature Switzerland

  • Machine and human interpretable patient visualizations
    Ivana Sixtová, Tomáš Skopal, David Hoksza, Jakub Matejik, and Tomáš Uher

    IEEE
    A growing amount of data is stored in electronic health records, which are crucial for the clinical decision-making process. A large part of these data has a tabular form, consisting of numerical and categorical values originating from various laboratory examinations and sensors. Unlike medical images and clinical notes, tabular data lack higher semantics and, combined with the high dimensionality and heterogeneity, their interpretation by a human is challenging. On the other hand, we have witnessed superior performance of deep convolutional neural network (DCNN) models in the visual medical domain. In this paper, we propose visual representations of complex tabular medical data readable simultaneously by humans and machines. To show that these representations can encode the patient’s data semantics effectively, we use them to fine-tune a DCNN to predict the disability level of patients suffering from multiple sclerosis. Our experiments show that the visual models could match the performance of non-visual models. Moreover, the visual representations add the benefit of summarizing complex information about the patient’s state to a human.

  • Sentence diagrams: Their evaluation and combination