@rwth-aachen.de
Postdoctoral researcher, Electrical Engineering and Information Technology
RWTH Aachen University
Electrical and Electronic Engineering, Hardware and Architecture, Computer Networks and Communications, Artificial Intelligence
Mujtaba Hassan, Arish Sateesan, Jo Vliegen, Stjepan Picek, and Nele Mentens
Elsevier BV
Arish Sateesan, Jo Vliegen, Simon Scherrer, Hsu-Chun Hsiao, Adrian Perrig, and Nele Mentens
Association for Computing Machinery (ACM)
Network flow measurement is an integral part of modern high-speed applications for network security and data-stream processing. However, processing at line rate while keeping the required data structure within the on-chip memory of the hardware platform is a challenging task for measurement algorithms, especially when accuracy is of primary importance, such as in network security applications. Most existing measurement algorithms face these issues when deployed in high-speed networking environments and are not tailored for efficient hardware implementation. Sketch-based measurement algorithms minimize the memory requirement and are suitable for high-speed networks, but they offer a poor memory-accuracy trade-off and lack the versatility of individual flow mapping. To address these challenges, we present a hardware-friendly data structure named Sketch-based Pseudo-associative array Architecture (SPArch). SPArch is highly accurate and extremely memory-efficient, making it suitable for network flow measurement and security applications. The parallelism in SPArch ensures minimal and constant memory access cycles. Unlike other sketch architectures, SPArch provides individual flow mapping similar to associative arrays, and the optimized version of SPArch allows counters to be organized in multiple buckets based on flow size. This article presents an in-depth analysis of SPArch and an implementation on the Alveo data center accelerator card, demonstrating its suitability for high-speed networks.
Arish Sateesan, Jelle Biesmans, Thomas Claesen, Jo Vliegen, and Nele Mentens
Elsevier BV
Simon Scherrer, Jo Vliegen, Arish Sateesan, Hsu-Chun Hsiao, Nele Mentens, and Adrian Perrig
IEEE
Modern DDoS defense systems rely on probabilistic monitoring algorithms to identify flows that exceed a volume threshold and should thus be penalized. Commonly, classic sketch algorithms are considered sufficiently accurate for usage in DDoS defense. However, as we show in this paper, these algorithms achieve poor detection accuracy under burst-flood attacks, i.e., volumetric DDoS attacks composed of a swarm of medium-rate sub-second traffic bursts. Under this challenging attack pattern, traditional sketch algorithms can only detect a high share of the attack bursts by incurring a large number of false positives. In this paper, we present ALBUS, a probabilistic monitoring algorithm that overcomes the inherent limitations of previous schemes: ALBUS is highly effective at detecting large bursts while reporting no legitimate flows, and therefore improves on prior work regarding both recall and precision. Besides improving accuracy, ALBUS scales to high traffic rates, which we demonstrate with an FPGA implementation, and is suitable for programmable switches, which we showcase with a P4 implementation.
Mujtaba Hassan, Arish Sateesan, Jo Vliegen, Stjepan Picek, and Nele Mentens
Springer Nature Switzerland
Arish Sateesan, Jo Vliegen, Joan Daemen, and Nele Mentens
Elsevier BV
Arish Sateesan, Jo Vliegen, and Nele Mentens
Springer Nature Switzerland
Laurens Le Jeune, Arish Sateesan, Md Masoom Rabbani, Toon Goedemé, Jo Vliegen, and Nele Mentens
Springer International Publishing
Arish Sateesan, Sharad Sinha, Smitha K. G., and A. P. Vinod
Springer Science and Business Media LLC
Thomas Claesen, Arish Sateesan, Jo Vliegen, and Nele Mentens
IEEE
This paper proposes the design and FPGA implementation of five novel non-cryptographic hash functions that are suitable for networking and security applications requiring fast lookup and/or counting architectures. Our approach is inspired by the design of the existing non-cryptographic hash function Xoodoo-NC, which is constructed through the concatenation of several Xoodoo permutations. We similarly construct non-cryptographic hash functions based on the concatenation of several rounds of symmetric-key ciphers. The goal is to achieve high performance in combination with good avalanche properties, meaning that a limited change in the input value produces a significant change in the output value. We simulate how many rounds are needed to achieve satisfactory avalanche scores, and we implement the corresponding non-cryptographic hash functions on an FPGA to evaluate the occupied resources and the performance. One of the proposed non-cryptographic hash functions, namely GIFT-NC, outperforms all previously proposed non-cryptographic hash functions in terms of throughput and latency, in exchange for an acceptable increase in FPGA resources.
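The avalanche property mentioned in this abstract can be illustrated with a small measurement routine. The following is a minimal Python sketch, assuming one common definition of an avalanche score (the mean fraction of output bits that flip when a single input bit is flipped, ideally close to 0.5); the paper's exact metric may differ. The names `avalanche_score` and `stand_in_hash` are hypothetical, and the BLAKE2b-based hash is only a software placeholder, not GIFT-NC, Xoodoo-NC, or any other design from the paper.

```python
import hashlib
import random
from typing import Callable


def avalanche_score(hash_fn: Callable[[int], int], in_bits: int, out_bits: int,
                    samples: int = 64, seed: int = 0) -> float:
    """Mean fraction of output bits that flip when one input bit is flipped."""
    rng = random.Random(seed)  # fixed seed keeps the measurement repeatable
    flipped = 0
    for _ in range(samples):
        x = rng.getrandbits(in_bits)
        base = hash_fn(x)
        for bit in range(in_bits):
            # Hamming distance between the two outputs.
            diff = base ^ hash_fn(x ^ (1 << bit))
            flipped += bin(diff).count("1")
    return flipped / (samples * in_bits * out_bits)


def stand_in_hash(x: int) -> int:
    """32-bit placeholder hash built from BLAKE2b (assumption, not from the paper)."""
    digest = hashlib.blake2b(x.to_bytes(4, "little"), digest_size=4).digest()
    return int.from_bytes(digest, "little")


if __name__ == "__main__":
    print("identity:", avalanche_score(lambda x: x, 32, 32))
    print("stand-in hash:", avalanche_score(stand_in_hash, 32, 32))
```

The identity function is included as a worst case: flipping one input bit changes exactly one output bit, so its score is 1/32 for a 32-bit word, whereas a well-mixing function should land near 0.5.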
Arish Sateesan, Jo Vliegen, Simon Scherrer, Hsu-Chun Hsiao, Adrian Perrig, and Nele Mentens
IEEE
Network traffic measurement keeps track of the amount of traffic sent by each flow in the network. It is a core functionality in applications such as traffic engineering and network intrusion detection. In high-speed networks, it is impossible to keep an exact count of the flow traffic, due to limitations with respect to memory and computational speed. Therefore, probabilistic data structures, such as sketches, are used. This paper proposes the Approximate Count-Min sketch, or A-CM sketch, a novel variant of the Count-Min sketch algorithm that uses less memory and has a higher throughput compared to other FPGA-based sketch implementations. A-CM sketch relies on optimizations at two levels: (1) it uses approximate counters and the newly proposed Hardware-oriented Simple Active Counter algorithm to efficiently implement these counters; (2) it uses a distribution of the embedded memory, optimized towards maximum operating frequency. To the best of our knowledge, A-CM sketch outperforms all other FPGA-based sketch implementations.
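For readers unfamiliar with the baseline that this abstract builds on, a textbook Count-Min sketch can be outlined in a few lines of Python. This is a minimal sketch of the classic algorithm with exact counters, not of A-CM sketch: it includes neither the approximate counters nor the memory distribution described above. The class name `CountMinSketch`, its parameters, and the use of salted BLAKE2b as the per-row hash are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Textbook Count-Min sketch with exact counters (illustrative only)."""

    def __init__(self, depth: int = 4, width: int = 1024) -> None:
        self.depth = depth  # number of rows, one hash function per row
        self.width = width  # counters per row
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, key: bytes) -> int:
        # Salting BLAKE2b with the row number yields one hash per row;
        # this is a software stand-in for dedicated hardware hash units.
        digest = hashlib.blake2b(
            key, digest_size=8, salt=row.to_bytes(2, "little")
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key: bytes, count: int = 1) -> None:
        """Add `count` to one counter in every row."""
        for r in range(self.depth):
            self.rows[r][self._index(r, key)] += count

    def estimate(self, key: bytes) -> int:
        """Collisions only add to counters, so the minimum is the best estimate."""
        return min(self.rows[r][self._index(r, key)] for r in range(self.depth))


if __name__ == "__main__":
    sketch = CountMinSketch()
    for _ in range(5):
        sketch.update(b"flow-a")
    sketch.update(b"flow-b", 2)
    print(sketch.estimate(b"flow-a"), sketch.estimate(b"flow-b"))
```

The estimate never undercounts a flow; it can only overcount when other flows collide with it in every row, which is why memory (width and depth) trades directly against accuracy.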
Simon Scherrer, Che-Yu Wu, Yu-Hsi Chiang, Benjamin Rothenberger, Daniele E. Asoni, Arish Sateesan, Jo Vliegen, Nele Mentens, Hsu-Chun Hsiao, and Adrian Perrig
IEEE
Current probabilistic flow-size monitoring can only detect heavy hitters (e.g., flows utilizing 10 times their permitted bandwidth), but cannot detect smaller overuse (e.g., flows utilizing 50-100% more than their permitted bandwidth). Thus, these systems lack accuracy in the challenging environment of high-throughput packet processing, where fast-memory resources are scarce. Nevertheless, many applications rely on accurate flow-size estimation, e.g., for network monitoring, anomaly detection, and Quality of Service. We design, analyze, implement, and evaluate LOFT, a new approach for efficiently detecting overuse flows that achieves dramatically better properties than prior work. LOFT can detect 1.50x overuse flows in one second, whereas prior approaches can only reliably detect flows that overuse their allocation by at least 3x. We demonstrate LOFT's suitability for high-speed packet processing with implementations in the DPDK framework and on an FPGA.
Arish Sateesan, Sharad Sinha, and Smitha K G
IEEE
Deploying complex convolutional neural network (CNN) algorithms on Field Programmable Gate Arrays (FPGAs) is a non-trivial task, and it becomes even more challenging on resource-constrained devices such as smaller FPGAs with fewer computational resources. To make hardware implementation as easy as software-based implementation, a high-level framework with well-optimized, hardware-relevant libraries is needed to quickly map a CNN model to such devices without much effort on the part of the CNN designer or hardware designer. Only a few such frameworks targeting FPGAs are available, and they do not take resource constraints into account. In this work, we present a Python-based open source design automation tool called “DASH” for hardware generation for convolutional neural networks. Based on user-given CNN parameters and hardware resource constraints, the tool trains the network model and generates the hardware code to map the CNN model to FPGAs under the specified hardware resource constraints with the help of a library of optimized design templates. The evaluation results of the generated hardware design show better performance, in terms of resources and throughput, than other hardware implementations of CNNs.
Arish Sateesan, Jo Vliegen, Joan Daemen, and Nele Mentens
IEEE
This paper proposes novel Bloom filter algorithms and FPGA architectures for high-speed searching applications. A Bloom filter is a memory structure that is used to test whether input search data are present in a table of stored data. Bloom filters are extensively used in network security solutions that apply traffic flow monitoring or deep packet inspection. Improving the speed of Bloom filters can therefore have a significant impact on the speed of many network applications. The most important components determining the speed of Bloom filters are hash functions. While hash functions in Bloom filters do not require strong cryptographic properties, they do need a minimized computational delay. We take on the challenge of developing ultra-high-speed Bloom filters on FPGAs by proposing a new non-cryptographic hash function, called Xoodoo-NC, derived from the cryptographic permutation Xoodoo. Xoodoo-NC is a reduced-round, reduced-state version of Xoodoo, inheriting Xoodoo’s desired avalanche properties and low logical depth, resulting in an ultra-low-latency non-cryptographic hash function. We evaluate the performance of Bloom filter architectures based on Xoodoo-NC on a Xilinx UltraScale+ FPGA and we compare the performance and resource occupation to existing Bloom filter implementations. We additionally compare our results to memories that use the built-in CAM cores in Xilinx UltraScale+ FPGAs. Our proposed algorithmic and architectural advances lead to Bloom filters that, to the best of our knowledge, outperform all other FPGA-based solutions.
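The membership test that a Bloom filter performs, as described in this abstract, can be outlined in software. The following is a minimal Python sketch of a standard Bloom filter, not of the FPGA architectures proposed in the paper. The class name `BloomFilter`, its parameters, and the salted BLAKE2b hash are assumptions for illustration; in particular, BLAKE2b stands in for Xoodoo-NC, which is not reproduced here.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Standard Bloom filter over a bit array (illustrative only)."""

    def __init__(self, num_bits: int = 4096, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes) -> Iterator[int]:
        # One bit position per hash function; salting BLAKE2b with the hash
        # index is a software placeholder for a fast non-cryptographic hash.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                item, digest_size=8, salt=i.to_bytes(2, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, item: bytes) -> None:
        """Set every bit selected by the hash functions."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


if __name__ == "__main__":
    bf = BloomFilter()
    bf.add(b"10.0.0.1")
    print(bf.might_contain(b"10.0.0.1"), bf.might_contain(b"10.0.0.2"))
```

Every lookup evaluates all hash functions before touching memory, which is why the hash latency dominates the speed of the filter, as the abstract notes.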