Vlad-Mihai Sima

@illumina.com

Emerging Solutions
Illumina

RESEARCH INTERESTS

genomics, data analysis, federated analysis

34

Scopus Publications

1778

Scholar Citations

16

Scholar h-index

20

Scholar i10-index

Scopus Publications

  • Comparative analysis of system-level acceleration techniques in bioinformatics: A case study of accelerating the Smith-Waterman Algorithm for BWA-MEM
    Ernst Joachim Houtgast, Vlad-Mihai Sima, Koen Bertels, Zaid Al-Ars
    Proceedings 2018 IEEE 18th International Conference on Bioinformatics and Bioengineering Bibe 2018, 2018
    Bioinformatics workloads are characterized by huge data sets and complex algorithms, requiring enormous data processing and making high performance heterogeneous computation platforms such as FPGAs and GPUs highly relevant. We compare three accelerated implementations of the widely used BWA-MEM genomic mapping tool as a case study on design-time optimization for heterogeneous architectures: BWA-MEM-CUDA, BWA-MEM-OpenCL, and BWA-MEM-VHDL, each using an optimized Smith-Waterman algorithm implementation. Optimization of design-time is important because of the significant development effort of such implementations: BWA-MEM-CUDA and BWA-MEM-OpenCL require 5-7x more lines of code to express the Smith-Waterman algorithm, while BWA-MEM-VHDL requires more than 40x as many lines of code. Similar differences hold for required implementation time, ranging from one month for BWA-MEM-OpenCL to six months for BWA-MEM-VHDL. The advantages and disadvantages of each implementation are described using both quantitative and qualitative metrics, and recommendations are given for future algorithm implementations.
  • Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths
    Ernst Joachim Houtgast, Vlad-Mihai Sima, Koen Bertels, Zaid Al-Ars
    Computational Biology and Chemistry, 2018
    We present our work on hardware accelerated genomics pipelines, using either FPGAs or GPUs to accelerate execution of BWA-MEM, a widely-used algorithm for genomic short read mapping. The mapping stage can take up to 40% of overall processing time for genomics pipelines. Our implementation offloads the Seed Extension function, one of the main BWA-MEM computational functions, onto an accelerator. Sequencers typically output reads with a length of 150 base pairs. However, read length is expected to increase in the near future. Here, we investigate the influence of read length on BWA-MEM performance using data sets with read length up to 400 base pairs, and introduce methods to ameliorate the impact of longer read length. For the industry-standard 150 base pair read length, our implementation achieves an up to two-fold increase in overall application-level performance for systems with at most twenty-two logical CPU cores. Longer read length requires commensurately bigger data structures, which directly impacts accelerator efficiency. The two-fold performance increase is sustained for read length of at most 250 base pairs. To improve performance, we perform a classification of the inefficiency of the underlying systolic array architecture. By eliminating idle regions as much as possible, efficiency is improved by up to +95%. Moreover, adaptive load balancing intelligently distributes work between host and accelerator to ensure use of an accelerator always results in performance improvement, which in GPU-constrained scenarios provides up to +45% more performance.
  • High performance streaming Smith-Waterman implementation with implicit synchronization on Intel FPGA using OpenCL
    Ernst Houtgast, Vlad-Mihai Sima, Zaid Al-Ars
    Proceedings 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering Bibe 2017, 2017
    The Smith-Waterman algorithm is widely used in bioinformatics and is often used as a benchmark of FPGA performance. Here we present our highly optimized Smith-Waterman implementation on Intel FPGAs using OpenCL. Our implementation is both faster and more efficient than other current Smith-Waterman implementations, obtaining a theoretical performance of 214 GCUPS. Moreover, due to the streaming, implicit synchronizing nature of our implementation, which streams alignments and places no restrictions on the number of alignments in flight, it achieves 99.8% of this performance in practice, almost three times as fast as previous implementations. The expressiveness of OpenCL results in a significant reduction in lines of code, and in a significant reduction of development time compared to programming in regular hardware description languages.
  • A Survey and Evaluation of FPGA High-Level Synthesis Tools
    Razvan Nane, Vlad-Mihai Sima, Christian Pilato, Jongsok Choi, Blair Fort, et al.
    IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, 2016
    High-level synthesis (HLS) is increasingly popular for the design of high-performance and energy-efficient heterogeneous systems, shortening time-to-market and addressing today's system complexity. HLS allows designers to work at a higher-level of abstraction by using a software program to specify the hardware functionality. Additionally, HLS is particularly interesting for designing field-programmable gate array circuits, where hardware implementations can be easily refined and replaced in the target device. Recent years have seen much activity in the HLS research community, with a plethora of HLS tool offerings, from both industry and academia. All these tools may have different input languages, perform different internal optimizations, and produce results of different quality, even for the very same input description. Hence, it is challenging to compare their performance and understand which is the best for the hardware to be implemented. We present a comprehensive analysis of recent HLS tools, as well as overview the areas of active interest in the HLS research community. We also present a first-published methodology to evaluate different HLS tools. We use our methodology to compare one commercial and three academic tools on a common set of C benchmarks, aiming at performing an in-depth evaluation in terms of performance and the use of resources.
  • Power-Efficient Accelerated Genomic Short Read Mapping on Heterogeneous Computing Platforms
    Ernst Joachim Houtgast, Vlad-Mihai Sima, Giacomo Marchiori, Koen Bertels, Zaid Al-Ars
    Proceedings 24th IEEE International Symposium on Field Programmable Custom Computing Machines Fccm 2016, 2016
    We propose a novel FPGA-accelerated BWA-MEM implementation, a popular tool for genomic data mapping. The performance and power-efficiency of the FPGA implementation on the single Xilinx Virtex-7 Alpha Data add-in card is compared against a software-only baseline system. By offloading the Seed Extension phase onto the FPGA, a two-fold speedup in overall application-level performance is achieved and a 1.6x gain in power-efficiency. To facilitate platform and tool-agnostic comparisons, the base pairs per Joule unit is introduced as a measure of power-efficiency. The FPGA design is able to map up to 34 thousand base pairs per Joule.
  • Heterogeneous hardware/software acceleration of the BWA-MEM DNA alignment algorithm
    Nauman Ahmed, Vlad-Mihai Sima, Ernst Houtgast, Koen Bertels, Zaid Al-Ars
    2015 IEEE ACM International Conference on Computer Aided Design Iccad 2015, 2016
    The fast decrease in cost of DNA sequencing has resulted in an enormous growth in available genome data, and hence led to an increasing demand for fast DNA analysis algorithms used for diagnostics of genetic disorders, such as cancer. One of the most computationally intensive steps in the analysis is represented by the DNA read alignment. In this paper, we present an accelerated version of BWA-MEM, one of the most popular read alignment algorithms, by implementing a heterogeneous hardware/software optimized version on the Convey HC2ex platform. A challenging factor of the BWA-MEM algorithm is the fact that it consists of not one, but three computationally intensive kernels: SMEM generation, suffix array lookup and local Smith-Waterman. Obtaining substantial speedup is hence contingent on accelerating all of these three kernels at once. The paper shows an architecture containing two hardware-accelerated kernels and one kernel optimized in software. The two hardware kernels of suffix array lookup and local Smith-Waterman are able to reach speedups of 2.8x and 5.7x, respectively. The software optimization of the SMEM generation kernel is able to achieve a speedup of 1.7x. This enables a total application acceleration of 2.6x compared to the original software version.
  • GPU-accelerated BWA-MEM genomic mapping algorithm using adaptive load balancing
    Ernst Joachim Houtgast, Vlad-Mihai Sima, Koen Bertels, Zaid Al-Ars
    Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2016
    Genomic sequencing is rapidly becoming a premier generator of Big Data, posing great computational challenges. Hence, acceleration of the algorithms used is of utmost importance. This paper presents a GPU-accelerated implementation of BWA-MEM, a widely used algorithm to map genomic sequences onto a reference genome. BWA-MEM contains three main computational functions: Seed Generation, Seed Extension and Output Generation. This paper discusses acceleration of the Seed Extension function on a GPU accelerator. The GPU-based Extend kernel achieves three times higher performance and, by offloading the kernel onto an accelerator and overlapping its execution with the other functions, this results in an overall improvement to application-level execution time of up to 1.6x. To ensure that using an accelerator always results in an overall performance improvement, especially when considering slower GPUs, an adaptive load balancing solution is introduced, which intelligently distributes work between host and GPU. This provides, compared to not using load balancing, up to +46% more performance.
  • Power-efficiency analysis of accelerated BWA-MEM implementations on heterogeneous computing platforms
    Ernst Joachim Houtgast, Vlad-Mihai Sima, Giacomo Marchiori, Koen Bertels, Zaid Al-Ars
    2016 International Conference on Reconfigurable Computing and Fpgas Reconfig 2016, 2016
    Next Generation Sequencing techniques have dramatically reduced the cost of sequencing genetic material, resulting in huge amounts of data being sequenced. The processing of this data poses huge challenges, both from a performance perspective, as well as from a power-efficiency perspective. Heterogeneous computing can help on both fronts, by enabling more performant and more power-efficient solutions. In this paper, power-efficiency of the BWA-MEM algorithm, a popular tool for genomic data mapping, is studied on two heterogeneous architectures. The performance and power-efficiency of an FPGA-based implementation using a single Xilinx Virtex-7 FPGA on the Alpha Data add-in card is compared to a GPU-based implementation using an NVIDIA GeForce GTX 970 and against the software-only baseline system. By offloading the Seed Extension phase on an accelerator, both implementations are able to achieve a two-fold speedup in overall application-level performance over the software-only implementation. Moreover, the highly customizable nature of the FPGA results in much higher power-efficiency, as the FPGA power consumption is less than one fourth of that of the GPU. To facilitate platform and tool-agnostic comparisons, the base pairs per Joule unit is introduced as a measure of power-efficiency. The FPGA design is able to map up to 44 thousand base pairs per Joule, a 2.1x gain in power-efficiency as compared to the software-only baseline.
  • An FPGA-based systolic array to accelerate the BWA-MEM genomic mapping algorithm
    Ernst Joachim Houtgast, Vlad-Mihai Sima, Koen Bertels, Zaid Al-Ars
    Proceedings 2015 International Conference on Embedded Computer Systems Architectures Modeling and Simulation Samos 2015, 2015
    We present the first accelerated implementation of BWA-MEM, a popular genome sequence alignment algorithm widely used in next generation sequencing genomics pipelines. The Smith-Waterman-like sequence alignment kernel requires a significant portion of overall execution time. We propose and evaluate a number of FPGA-based systolic array architectures, presenting optimizations generally applicable to variable length Smith-Waterman execution. Our kernel implementation is up to 3× faster, compared to software-only execution. This translates into an overall application speedup of up to 45%, which is 96% of the theoretically maximum achievable speedup when accelerating only this kernel.
  • FPGA acceleration of the pair-HMMs forward algorithm for DNA sequence analysis
    Shanshan Ren, Vlad-Mihai Sima, Zaid Al-Ars
    Proceedings 2015 IEEE International Conference on Bioinformatics and Biomedicine Bibm 2015, 2015
    Many DNA sequence analysis tools have been developed to turn the massive raw DNA sequencing data generated by NGS (Next Generation Sequencing) platforms into biologically meaningful information. The pair-HMMs forward algorithm is widely used to calculate the overall alignment probability needed by a number of DNA analysis tools. In this paper, we propose a novel systolic array design to accelerate the pair-HMMs forward algorithm on FPGAs. A number of architectural features have been implemented to improve the performance of the design, such as early exit points to increase the utilization of the array for small sequence sizes, as well as on-chip buffering to enable the processing of long sequences effectively. We present an implementation of the design on the Convey supercomputing platform. Experimental results show that the FPGA implementation of the pair-HMMs forward algorithm is up to 67x faster, compared to software-only execution.
  • High-Level Synthesis in the Delft Workbench Hardware/Software Co-design Tool-Chain
    Razvan Nane, Vlad Mihai Sima, Cuong Pham Quoc, Fernando Goncalves, Koen Bertels
    Proceedings 2014 International Conference on Embedded and Ubiquitous Computing Euc 2014, 2014
  • DRuiD: Designing reconfigurable architectures with decision-making support
    Giovanni Mariani, Gianluca Palermo, Roel Meeuws, Vlad-Mihai Sima, Cristina Silvano, et al.
    Proceedings of the Asia and South Pacific Design Automation Conference ASP DAC, 2014
  • FPGA-accelerated Monte-Carlo integration using stratified sampling and Brownian bridges
    Mark de Jong, Vlad-Mihai Sima, Koen Bertels, David Thomas
    Proceedings of the 2014 International Conference on Field Programmable Technology Fpt 2014, 2014
  • Hardware/software compilation
    Ricardo Nobre, João M. P. Cardoso, Bryan Olivier, Razvan Nane, Liam Fitzpatrick, et al.
    Compilation and Synthesis for Embedded Reconfigurable Systems an Aspect Oriented Approach, 2013
  • LARA experiments
    Fernando Gonçalves, Zlatko Petrov, José Gabriel de F. Coutinho, Razvan Nane, Vlad-Mihai Sima, et al.
    Compilation and Synthesis for Embedded Reconfigurable Systems an Aspect Oriented Approach, 2013
  • The REFLECT design-flow
    João M. P. Cardoso, José Gabriel de F. Coutinho, Razvan Nane, Vlad-Mihai Sima, Bryan Olivier, et al.
    Compilation and Synthesis for Embedded Reconfigurable Systems an Aspect Oriented Approach, 2013
  • Quipu: A statistical model for predicting hardware resources
    Roel Meeuws, S. Arash Ostadzadeh, Carlo Galuzzi, Vlad Mihai Sima, Razvan Nane, et al.
    ACM Transactions on Reconfigurable Technology and Systems, 2013
  • Run-time optimization of a dynamically reconfigurable embedded system through performance prediction
    Giovanni Mariani, Vlad-Mihai Sima, Gianluca Palermo, Vittorio Zaccaria, Giacomo Marchiori, et al.
    2013 23rd International Conference on Field Programmable Logic and Applications Fpl 2013 Proceedings, 2013
  • DWARV 2.0: A CoSy-based C-to-VHDL hardware compiler
    Razvan Nane, Vlad-Mihai Sima, Bryan Olivier, Roel Meeuws, Yana Yankova, et al.
    Proceedings 22nd International Conference on Field Programmable Logic and Applications Fpl 2012, 2012
  • A lightweight speculative and predicative scheme for hardware execution
    Razvan Nane, Vlad-Mihai Sima, Koen Bertels
    2012 International Conference on Reconfigurable Computing and Fpgas Reconfig 2012, 2012
  • Area constraint propagation in high level synthesis
    R. Nane, V.M. Sima, K. Bertels
    Fpt 2012 2012 International Conference on Field Programmable Technology, 2012
  • Using multi-objective design space exploration to enable run-time resource management for reconfigurable architectures
    G. Mariani, V. Sima, G. Palermo, V. Zaccaria, C. Silvano, et al.
    Proceedings Design Automation and Test in Europe Date, 2012
  • Extensions of the hArtes tool chain
    Ferruccio Bettarelli, Emanuele Ciavattini, Ariano Lattanzi, Giovanni Beltrame, Fabrizio Ferrandi, et al.
    Hardware Software Co Design for Heterogeneous Multi Core Platforms the Hartes Toolchain, 2012
  • The hArtes tool chain
    Koen Bertels, Ariano Lattanzi, Emanuele Ciavattini, Ferruccio Bettarelli, Maria Teresa Chiaradia, et al.
    Hardware Software Co Design for Heterogeneous Multi Core Platforms the Hartes Toolchain, 2012
  • The hArtes CarLab: A new approach to advanced algorithms development for automotive audio
    AES Journal of the Audio Engineering Society, 2011

RECENT SCHOLAR PUBLICATIONS

  • Comparative analysis of system-level acceleration techniques in bioinformatics: A case study of accelerating the Smith-Waterman algorithm for BWA-MEM
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    2018 IEEE 18th International Conference on Bioinformatics and Bioengineering … , 2018
    2018
    Citations: 10
  • Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    Computational biology and chemistry 75, 54-64 , 2018
    2018
    Citations: 178
  • High performance streaming Smith-Waterman implementation with implicit synchronization on Intel FPGA using OpenCL
    E Houtgast, VM Sima, Z Al-Ars
    2017 IEEE 17th International Conference on Bioinformatics and Bioengineering … , 2017
    2017
    Citations: 26
  • An efficient GPU-accelerated implementation of genomic short read mapping with BWA-MEM
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    ACM SIGARCH Computer Architecture News 44 (4), 38-43 , 2017
    2017
    Citations: 25
  • Power-efficiency analysis of accelerated BWA-MEM implementations on heterogeneous computing platforms
    EJ Houtgast, VM Sima, G Marchiori, K Bertels, Z Al-Ars
    2016 international conference on reconfigurable computing and fpgas … , 2016
    2016
    Citations: 16
  • Power-efficient accelerated genomic short read mapping on heterogeneous computing platforms
    EJ Houtgast, VM Sima, G Marchiori, K Bertels, Z Al-Ars
    2016 IEEE 24th Annual International Symposium on Field-Programmable Custom … , 2016
    2016
    Citations: 5
  • GPU-accelerated BWA-MEM genomic mapping algorithm using adaptive load balancing
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    International conference on architecture of computing systems, 130-142 , 2016
    2016
    Citations: 34
  • Computational Challenges of Next Generation Sequencing Pipelines Using Heterogeneous Systems
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    12th International Summer School on Advanced Computer Architecture and … , 2016
    2016
    Citations: 2
  • A survey and evaluation of FPGA high-level synthesis tools
    R Nane, VM Sima, C Pilato, J Choi, B Fort, A Canis, YT Chen, H Hsiao, ...
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and … , 2015
    2015
    Citations: 896
  • FPGA acceleration of the pair-HMMs forward algorithm for DNA sequence analysis
    S Ren, VM Sima, Z Al-Ars
    2015 IEEE international conference on bioinformatics and biomedicine (BIBM … , 2015
    2015
    Citations: 53
  • Heterogeneous hardware/software acceleration of the BWA-MEM DNA alignment algorithm
    N Ahmed, VM Sima, E Houtgast, K Bertels, Z Al-Ars
    2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 240-246 , 2015
    2015
    Citations: 69
  • An FPGA-based systolic array to accelerate the BWA-MEM genomic mapping algorithm
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    2015 international conference on embedded computer systems: Architectures … , 2015
    2015
    Citations: 78
  • Intra-application data-communication characterization
    I Ashraf, VM Sima, K Bertels
    Proc. 1st Int. Workshop Commun. Archit. Extreme Scale, 1-11 , 2015
    2015
    Citations: 7
  • FPGA-accelerated Monte-Carlo integration using stratified sampling and Brownian bridges
    M De Jong, VM Sima, K Bertels, D Thomas
    2014 International Conference on Field-Programmable Technology (FPT), 68-75 , 2014
    2014
    Citations: 6
  • High-level synthesis in the Delft Workbench hardware/software co-design tool-chain
    R Nane, VM Sima, CP Quoc, F Goncalves, K Bertels
    2014 12th IEEE International Conference on Embedded and Ubiquitous Computing … , 2014
    2014
    Citations: 20
  • DRuiD: Designing reconfigurable architectures with decision-making support
    G Mariani, G Palermo, R Meeuws, VM Sima, C Silvano, K Bertels
    2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC), 213-218 , 2014
    2014
    Citations: 6
  • Run-time optimization of a dynamically reconfigurable embedded system through performance prediction
    G Mariani, VM Sima, G Palermo, V Zaccaria, G Marchiori, C Silvano, ...
    2013 23rd International Conference on Field programmable Logic and … , 2013
    2013
    Citations: 2
  • LARA experiments
    F Gonçalves, Z Petrov, JG de F. Coutinho, R Nane, VM Sima, ...
    Compilation and Synthesis for Embedded Reconfigurable Systems: An Aspect … , 2013
    2013
  • The REFLECT design-flow
    JMP Cardoso, JG de F. Coutinho, R Nane, VM Sima, B Olivier, T Carvalho, ...
    Compilation and Synthesis for Embedded Reconfigurable Systems: An Aspect … , 2013
    2013
    Citations: 2
  • Hardware/Software Compilation
    R Nobre, JMP Cardoso, B Olivier, R Nane, L Fitzpatrick, ...
    Compilation and Synthesis for Embedded Reconfigurable Systems: An Aspect … , 2013
    2013
    Citations: 3

MOST CITED SCHOLAR PUBLICATIONS

  • A survey and evaluation of FPGA high-level synthesis tools
    R Nane, VM Sima, C Pilato, J Choi, B Fort, A Canis, YT Chen, H Hsiao, ...
    IEEE Transactions on Computer-Aided Design of Integrated Circuits and … , 2015
    2015
    Citations: 896
  • Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    Computational biology and chemistry 75, 54-64 , 2018
    2018
    Citations: 178
  • DWARV 2.0: A CoSy-based C-to-VHDL hardware compiler
    R Nane, VM Sima, B Olivier, R Meeuws, Y Yankova, K Bertels
    22nd international conference on field programmable logic and applications … , 2012
    2012
    Citations: 133
  • An FPGA-based systolic array to accelerate the BWA-MEM genomic mapping algorithm
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    2015 international conference on embedded computer systems: Architectures … , 2015
    2015
    Citations: 78
  • Heterogeneous hardware/software acceleration of the BWA-MEM DNA alignment algorithm
    N Ahmed, VM Sima, E Houtgast, K Bertels, Z Al-Ars
    2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 240-246 , 2015
    2015
    Citations: 69
  • FPGA acceleration of the pair-HMMs forward algorithm for DNA sequence analysis
    S Ren, VM Sima, Z Al-Ars
    2015 IEEE international conference on bioinformatics and biomedicine (BIBM … , 2015
    2015
    Citations: 53
  • Hartes: Hardware-software codesign for heterogeneous multicore platforms
    K Bertels, VM Sima, Y Yankova, G Kuzmanov, W Luk, G Coutinho, ...
    IEEE micro 30 (5), 88-97 , 2010
    2010
    Citations: 38
  • Using multi-objective design space exploration to enable run-time resource management for reconfigurable architectures
    G Mariani, VM Sima, G Palermo, V Zaccaria, C Silvano, K Bertels
    2012 Design, Automation & Test in Europe Conference & Exhibition (DATE … , 2012
    2012
    Citations: 36
  • GPU-accelerated BWA-MEM genomic mapping algorithm using adaptive load balancing
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    International conference on architecture of computing systems, 130-142 , 2016
    2016
    Citations: 34
  • Runtime decision of hardware or software execution on a heterogeneous reconfigurable platform
    VM Sima, K Bertels
    2009 IEEE International Symposium on Parallel & Distributed Processing, 1-6 , 2009
    2009
    Citations: 27
  • High performance streaming Smith-Waterman implementation with implicit synchronization on Intel FPGA using OpenCL
    E Houtgast, VM Sima, Z Al-Ars
    2017 IEEE 17th International Conference on Bioinformatics and Bioengineering … , 2017
    2017
    Citations: 26
  • An efficient GPU-accelerated implementation of genomic short read mapping with BWA-MEM
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    ACM SIGARCH Computer Architecture News 44 (4), 38-43 , 2017
    2017
    Citations: 25
  • Compiler assisted runtime task scheduling on a reconfigurable computer
    M Sabeghi, VM Sima, K Bertels
    2009 International Conference on Field Programmable Logic and Applications … , 2009
    2009
    Citations: 22
  • High-level synthesis in the Delft Workbench hardware/software co-design tool-chain
    R Nane, VM Sima, CP Quoc, F Goncalves, K Bertels
    2014 12th IEEE International Conference on Embedded and Ubiquitous Computing … , 2014
    2014
    Citations: 20
  • REFLECT: rendering FPGAs to multi-core embedded computing
    JMP Cardoso, PC Diniz, Z Petrov, K Bertels, M Hübner, H van Someren, ...
    Reconfigurable Computing: From FPGAs to Hardware/Software Codesign, 261-289 , 2011
    2011
    Citations: 18
  • Power-efficiency analysis of accelerated BWA-MEM implementations on heterogeneous computing platforms
    EJ Houtgast, VM Sima, G Marchiori, K Bertels, Z Al-Ars
    2016 international conference on reconfigurable computing and fpgas … , 2016
    2016
    Citations: 16
  • Quipu: A statistical model for predicting hardware resources
    R Meeuws, SA Ostadzadeh, C Galuzzi, VM Sima, R Nane, K Bertels
    ACM Transactions on Reconfigurable Technology and Systems (TRETS) 6 (1), 1-25 , 2013
    2013
    Citations: 14
  • Hartes toolchain early evaluation: Profiling, Compilation and HDL generation
    K Bertels, G Kuzmanov, EM Panainte, G Gaydadjiev, Y Yankova, ...
    2007 International Conference on Field Programmable Logic and Applications … , 2007
    2007
    Citations: 12
  • Comparative analysis of system-level acceleration techniques in bioinformatics: A case study of accelerating the Smith-Waterman algorithm for BWA-MEM
    EJ Houtgast, VM Sima, K Bertels, Z Al-Ars
    2018 IEEE 18th International Conference on Bioinformatics and Bioengineering … , 2018
    2018
    Citations: 10
  • Area constraint propagation in high level synthesis
    R Nane, VM Sima, K Bertels
    2012 International Conference on Field-Programmable Technology, 247-252 , 2012
    2012
    Citations: 10