ANIRBAN CHAKRABORTY

Scopus Publications

Deep Neural Network Based Multi-Object Detection for Real-time Aerial Surveillance
Rebanta Dey, Binit Kumar Pandit, Anirban Ganguly, Anirban Chakraborty, and Ayan Banerjee
IEEE
Aerial surveillance is one of the most widely used modern surveillance methodologies, finding applications in many important fields, both military and civilian. This article presents a comprehensive study of Deep Neural Network (DNN) based solutions for real-time object tracking from an Unmanned Aerial Vehicle (UAV) using a modified version of the state-of-the-art object detection model YOLOv5. The modified YOLOv5 architecture is obtained by changing the activation function to the Rectified Linear Unit (ReLU) and fine-tuning the network’s hyperparameters. A comparative analysis was then carried out on a subset of the AU-AIR dataset, comparing YOLOv5 models of different network depths to determine the improvements in training speed and accuracy. The modified network was also compared with the original paper in terms of mean average precision (mAP); a performance gain of almost 2.9 times was achieved in the best-case scenario.

Resource-efficient VLSI Architecture of Softmax Activation Function for Real-time Inference in Deep Learning Applications
Akash Ther, Binit Kumar Pandit, Anirban Ganguly, Anirban Chakraborty, and Ayan Banerjee
IEEE
The Softmax activation function layer is the output layer in various Deep Learning (DL) models for multi-class classification applications. It computes the probability values for each input entry to the layer and represents the degree of confidence. The nonlinearity of the softmax function poses a challenge of increased hardware complexity. Therefore, the proposed VLSI architecture of the softmax activation function aims to reduce hardware resources with minimal loss in accuracy. The proposed architecture utilizes an optimized design of a ROM-based exponential unit for exponentiation and a Newton-Raphson iteration-based division-by-reciprocation unit for the division operation. The proposed architecture is evaluated on Xilinx's Kintex-7 KC705 FPGA Evaluation Platform, using 1.44× to 11.61× fewer LUTs than other state-of-the-art architectures on the MNIST dataset.
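The division-by-reciprocation idea mentioned in this abstract can be sketched in software. The following is a minimal floating-point illustration, not the authors' hardware design: the function names, the linear seed, and the iteration count are assumptions made for the example, and `math.exp` merely stands in for the ROM-based exponential unit.

```python
import math


def reciprocal_newton_raphson(d, iterations=3):
    """Approximate 1/d for d > 0 with the Newton-Raphson update x <- x * (2 - d * x)."""
    if d <= 0:
        raise ValueError("d must be positive")
    # Normalise d = mantissa * 2**exponent with mantissa in [0.5, 1),
    # so that one fixed linear seed is accurate enough for every input.
    mantissa, exponent = math.frexp(d)
    x = 48.0 / 17.0 - (32.0 / 17.0) * mantissa  # common linear initial estimate
    for _ in range(iterations):
        x = x * (2.0 - mantissa * x)  # each step roughly squares the relative error
    # Undo the normalisation: 1/d = (1/mantissa) * 2**(-exponent).
    return math.ldexp(x, -exponent)


def softmax_nr(logits, iterations=3):
    """Softmax that divides by multiplying with a Newton-Raphson reciprocal."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]  # stand-in for a ROM lookup
    inv_sum = reciprocal_newton_raphson(sum(exps), iterations)
    return [e * inv_sum for e in exps]
```

With this seed, three iterations already bring the reciprocal close to double-precision accuracy; a fixed-point hardware unit would trade iteration count against word length.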

Low Power and High Precision Analog VLSI Design of 1-D DCT for Real-time Application
Deepak Kumar, Anirban Ganguly, Puja Chakraborty, Anirban Chakraborty, Binit Kumar Pandit, and Ayan Banerjee
IEEE
The need for low power in many wireless and biomedical signal processing applications leaves great scope for analog discrete-time VLSI architectures over their digital counterparts. To investigate the performance of a discrete cosine transform (DCT) for low-energy compression applications, a subthreshold regime was selected using a modified class AB current-mirror-only circuit. A radix-2 algorithm was selected instead of complex matrix multiplication in order to increase the efficiency of computation. The accuracy of the proposed circuit for an 8-point 1-D DCT computation was observed to be around ±1%, with a power consumption of 1.8 mW using a 0.5 µm CMOS process.

CORDIC-Based High-Speed VLSI Architecture of Transform Model Estimation for Real-Time Imaging
Anirban Chakraborty and Ayan Banerjee
Institute of Electrical and Electronics Engineers (IEEE)

A memory and area-efficient distributed arithmetic based modular VLSI architecture of 1D/2D reconfigurable 9/7 and 5/3 DWT filters for real-time image decomposition
Anirban Chakraborty and Ayan Banerjee
Springer Science and Business Media LLC
In this article, we have proposed the internal architecture of dedicated hardware for 1D/2D convolution-based 9/7 and 5/3 DWT filters, exploiting bit-parallel ‘distributed arithmetic’ (DA) to reduce the computation time of our proposed DWT design while keeping the area at a level comparable to other recent designs. Despite using memory-intensive bit-parallel DA, we have achieved a 90% reduction in memory size compared with other notable architectures. Through our proposed architecture, both the 9/7 and 5/3 DWT filters can be realized with a selection input, mode. With the introduction of DA, we have incorporated pipelining and parallelism into our proposed convolution-based 1D/2D DWT architectures. We have reduced the area by 38.3% and the memory requirement by 90% compared with the latest remarkable designs. The critical-path delay of our design is almost 50% of that of the other latest designs. We have successfully applied our prototype 2D design to real-time image decomposition. The quality of the architecture for real-time image decomposition is measured by ‘peak signal-to-noise ratio’ and ‘computation time’, where our proposed design outperforms other similar software- and hardware-based implementations.

A Memory Efficient, Multiplierless & Modular VLSI Architecture of 1D/2D Re-Configurable 9/7 & 5/3 DWT Filters Using Distributed Arithmetic
Anirban Chakraborty and Ayan Banerjee
World Scientific Pub Co Pte Lt
Dedicated hardware for “Discrete Wavelet Transform” (DWT) is in high demand for real-time imaging operations in standalone electronic devices, as DWT is extensively utilized in most transform-domain imagery applications. Various DWT algorithms exist in the literature facilitating software implementations, which are generally unsuitable for real-time imaging in standalone devices due to their power intensiveness and long computation time. In this paper, a convolutional DWT-based pipelined and tunable VLSI architecture of Daubechies 9/7 and 5/3 DWT filters is presented. Our proposed architecture, which combines the advantages of convolutional and lifting DWT while discarding their notable disadvantages, is made area and memory efficient by exploiting “Distributed Arithmetic” (DA) in our own way. Almost 90% reduction in memory size compared with other notable architectures is reported. In our proposed architecture, both the 9/7 and 5/3 DWT filters can be realized with a selection input, “mode”. With the introduction of DA, pipelining and parallelism are easily incorporated into our proposed 1D/2D DWT architectures. The area requirement and critical path delay are reduced to almost 38.3% and 50%, respectively, relative to the latest remarkable designs. The proposed VLSI architecture also performs well in real-time applications.

A Novel VLSI Architecture of CORDIC Based Image Registration
Anirban Chakraborty and Ayan Banerjee
IEEE
Nowadays, real-time image registration (IR) has emerged as a promising field of research by virtue of its vast applicability. Despite such high demand, most real-time IR based applications still rely on software-based tools for performing the image registration task. Those software-based tools need to be highly sophisticated to meet the performance requirements of a real-time system. Naturally, such a real-time IR system is costly and power consuming. The only cost-effective solution is to develop dedicated hardware for real-time IR. In this article, a novel VLSI architecture for real-time IR is presented. With a focus on making the proposed architecture cost-economic, the transform model estimation (TME) step of the IR method is realized using the co-ordinate rotation digital computer (CORDIC) technique, which is popularly regarded as hardware efficient. The proposed CORDIC-based TME architecture is utilized to realize a VLSI architecture of real-time IR in dedicated hardware, which has been implemented on a Zynq UltraScale+ MPSoC series FPGA. The performance of the proposed, FPGA-implemented VLSI architecture of IR has been compared with its software-based counterpart in MATLAB. The proposed VLSI architecture of IR outperforms the software-based realizations in terms of area, power consumption and speed of operation, thereby emerging as a cost-economic solution for real-time IR.

Area and memory efficient tunable VLSI implementation of DWT filters for image decomposition using distributed arithmetic
Anirban Chakraborty and Ayan Banerjee
Informa UK Limited
Dedicated hardware realisation of ‘Discrete Wavelet Transform’ (DWT) is in high demand for real-time multimedia operations in handheld gadgets, as DWT is a popular transform-domain mathematical tool widely employed in many applications. Several DWT techniques exist in the literature to assist software-based DWT realisations, which are inappropriate for real-time operations. Several hardware-based solutions for DWT also exist, but most VLSI architectures of DWT fail to offer balanced performance, i.e. such architectures perform well with respect to certain metrics while sacrificing others. In this article, a convolutional DWT-based pipelined and tunable VLSI architecture of Daubechies 9/7 and 5/3 DWT filters is presented. The proposed architecture is designed with a focus on maintaining balanced performance. The presented design, which blends the merits of convolutional and lifting DWT while discarding their demerits, is made area and memory efficient by employing ‘Distributed Arithmetic’ (DA) in our own way. Comparative experimental results justify the superiority of the proposed design in terms of 30% area reduction, memory saving, remarkable clock frequency attainment and acceptable computation time. The proposed 2D DWT architecture is applied to real-time image decomposition. The outcomes of such real-time on-board testing prove the practicability of the presented work.

An Adaptive and Automated Image Fusion Algorithm Based on DWT for Real Time Applications
Anirban Chakraborty and Ayan Banerjee
IEEE
In this article, a novel adaptive ‘Image Fusion’ (IF) algorithm is presented. The proposed IF algorithm exploits the capability of ‘Discrete Wavelet Transform’ (DWT) to efficiently fuse useful image information. Normalized mutual information, normalized entropy and average gradient are utilized in our own way to formulate the proposed automated fusion algorithm. The proposed fusion algorithm is devoid of any hand-tuned parameters, thereby achieving automation. Our proposed adaptive fusion rules help us keep the time complexity of the proposed algorithm reasonably low while retaining accuracy on par with other existing algorithms. The proposed algorithm is realized in Verilog to address real-time applications. The quality of the proposed algorithm, along with its VLSI architecture, has been rigorously evaluated both quantitatively and qualitatively for medical and aerial images. The results of the evaluations point towards the superiority of both the proposed algorithm and its VLSI implementation.

A unified block-based sparse domain solution for quasi-periodic de-noising from different genres of images with iterative filtering
D. Chakraborty, A. Chakraborty, A. Banerjee, and S. R. Bhadra Chaudhuri
Springer Science and Business Media LLC
Images corresponding to various crucial imagery applications often suffer severe degradation from different modalities of periodic/quasi-periodic noise. Though a few periodic denoising algorithms work well for some specific applications, most of them fail to address the problem as a whole. In this article, a unified solution is presented which performs well for most of the vital non-natural imagery applications having dissimilar modalities. Initially, we divide the corrupted image into several blocks and then average those to get an averaged spatial image block. This block is windowed with the Kaiser window to avoid unnecessary artifacts, followed by the spectral domain transformation. Our proposed algorithm relies on the steadily decreasing characteristic of any uncorrupted natural image’s power spectrum to estimate a model by grossly reducing the induced noise. An image-feature-based adaptive threshold is then applied to the error spectrum to precisely identify unexpectedly high spectral amplitudes as outliers. The result is then interpolated to the actual size of the corrupted image, containing the noisy spectrum on which a proposed recursively adaptive notch-reject filter is applied. An extensive and detailed performance comparison with other state-of-the-art algorithms demonstrates the superiority of our proposed strategy.

Low latency semi-iterative CORDIC algorithm using normalized angle recoding and its VLSI implementation
Anirban Chakraborty and Ayan Banerjee
IEEE
In this article, we have proposed a semi-iterative 2D Co-ordinate Rotation Digital Computer (CORDIC) algorithm and presented its VLSI architecture. Our proposed CORDIC algorithm is based on our own Normalized Angle Recoding based Hybrid (NARH) method. The proposed algorithm and its corresponding architecture are innovative and unique, because our proposed CORDIC is not fully iterative, unlike all other CORDIC algorithms. The latency of our proposed CORDIC is the lowest in comparison to most of the other significant and recent CORDIC algorithms. We have also judiciously parallelized the computation of the scale factor. The proposed high-speed CORDIC algorithm has been realized on a Field Programmable Gate Array (FPGA). Though the VLSI architecture of the proposed algorithm is prototyped using a 16-bit fixed-point format, it can easily be extended to higher bit lengths. The performance of the proposed VLSI architecture has been extensively assessed based on various performance metrics such as latency, hardware complexity, maximum operating frequency and power requirement. Our prototype architecture of the NARH-based CORDIC algorithm outperforms most of the notable CORDIC architectures with respect to the above-mentioned performance parameters.
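For context, the conventional fully iterative rotation-mode CORDIC that this abstract contrasts itself with can be sketched as follows. This is the textbook algorithm in floating point, not the NARH method of the paper; the function name and the default of 16 iterations are assumptions chosen for illustration.

```python
import math


def cordic_rotate(angle, iterations=16):
    """Return (cos(angle), sin(angle)) for |angle| <= pi/2 using rotation-mode CORDIC."""
    # Elementary rotation angles atan(2**-i); a hardware design stores these in a small ROM.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2**(-2i));
    # starting from x = K compensates for the accumulated gain.
    k = 1.0
    for i in range(iterations):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = k, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate towards a zero residual angle
        # The multiplications by 2**-i are plain right shifts in fixed-point hardware.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return x, y
```

Every iteration depends on the sign of the residual angle from the previous one, which is the source of the latency that angle-recoding schemes aim to reduce.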

Modular and parallel VLSI architecture of multi-dimensional quad-core GA co-processor for real time image/video processing
Anirban Chakraborty and Ayan Banerjee
Elsevier BV
Multi-dimensional signal processing, including hyper-spectral image/video processing, has recently become a popular research area. It involves computationally complex algebraic equations, which makes conventional algebra inadequate. To alleviate this problem, Geometric Algebra (GA) is uniquely suited due to its compact and less complex equations for operating in higher-dimensional frameworks, in contrast to conventional algebra. Hardware realization is indispensable for any real-time signal processing algorithm. As any GA-based algorithm usually results in a significantly complex, large and slow digital circuit, it lags behind conventional algebra in hardware implementation. Hence, in this article, a parallel-pipelined VLSI architecture of a GA co-processor is proposed to mitigate this difficulty, making it capable of computing a variety of GA operations efficiently. Four-channel parallelism and 8-stage pipelining are the main attractions of our proposed design, which does not require any wait state even under concurrent memory access. The superiority of our design has been established over other state-of-the-art GA architectures in terms of number of processing cycles, latency and throughput. It has also been demonstrated in exemplary real-time applications.

A novel VLSI design of radix-4 DFT in current mode
A. Ganguly, A. Chakraborty, and A. Banerjee
Informa UK Limited
The discrete Fourier transform (DFT) is viewed as an important tool in discrete-time signal processing. Applications in wireless communication such as OFDM use DFT/IDFT in their receivers and transmitters. For small battery-powered wireless devices, a discrete-time analogue DFT can be very useful as a low-energy front-end. The quest to reduce the effect of transistor mismatch leads to higher-radix structures. It becomes very challenging for the designer to build an analogue circuit implementing the DFT with radix sizes 4, 8, and so on, mainly because hand calculation of circuit-level equations from the butterfly algorithm becomes a long process. Thus, a design methodology becomes necessary. Here an algorithm is proposed for generating the circuit-level equations, leading to a signal routing table for the circuit of a basic radix-4 FFT. Following that algorithm, a current-mode all-analogue circuit with cascode current mirrors is proposed. Simulations are carried out in SPICE using a BSIM4 65 nm CMOS process. A mismatch noise model is also built to show the reduction in error with a higher-radix structure. The non-ideal effects due to mismatch in Vth are analysed through Monte-Carlo simulation.

A Multiplierless VLSI architecture of QR decomposition based 2D wiener filter for 1D/2D signal processing with high accuracy
Anirban Chakraborty and Ayan Banerjee
IEEE
Nowadays, real-time signal processing attracts growing interest from researchers all over the world because it helps solve various problems that frequently occur in significant signal processing applications. Specifically, digital images suffer from noise contamination and blurring, which make it difficult to extract useful information from those images. This necessitates removing noise from those digital images as well as de-blurring them in real time. In this article, we have proposed a low-area and highly accurate VLSI architecture of a 2D Wiener filter which can be applied efficiently to any 1D/2D real-time signal. The applicability of the inherently highly accurate Wiener filter is limited by its computational complexity. Our focus in this article is to overcome this barrier by reducing the computational complexity using Toeplitz matrix formation and its QR decomposition. We have proposed an area-efficient multiplier-less VLSI architecture for realizing the 2D Wiener filter. We have exploited Givens rotation based QR decomposition and utilized the CORDIC algorithm to achieve high performance in our design. We have also applied our proposed hardware to real-time audio signal and image denoising. The superiority of our proposed design is demonstrated by the analysis of pictorial and numerical results.

A Memory Efficient, High Throughput and Fastest 1D/3D VLSI Architecture for Reconfigurable 9/7 5/3 DWT Filters
Anirban Chakraborty, Debolina Chakraborty, and Ayan Banerjee
IEEE
In this paper, we have proposed a low-area, high-throughput parallel VLSI architecture of the 3D Discrete Wavelet Transform (DWT). First of all, we have proposed a multiplier-less conventional 1D DWT architecture which has a high throughput of 4 outputs per clock cycle. We have greatly reduced the number of Multiply and Accumulation (MAC) units needed for our 1D DWT design in comparison to other conventional DWT architectures. Inside the MAC units, all the multipliers are replaced by a Shift and Add module to reduce the area. All the adders used in our design are non-conventional speculative adders, whose special feature is very high processing speed. Thus, using such adders, the propagation delay has been greatly reduced. Our prototype 1D DWT architecture is reconfigurable, as it can be used to realize the output of both 9/7 and 5/3 DWT filters. By using this 1D DWT architecture as both row and column processor, we have designed a 3D DWT architecture in which we have reduced the internal buffer size to zero by reusing the off-chip input memory to store the intermediate data as well. By carefully controlling the sequence of operations and maintaining proper timing, we have been able to achieve a high processing speed of $N^{2}/4$ clock cycles per 2 frames for the overall 3D DWT architecture, where the frame size is (N × N). The throughput of our proposed 3D DWT architecture is also 4 outputs per clock cycle, exactly the same as that of our proposed 1D DWT architecture.

An Efficient Spectral Domain Approach of Periodic Noise Suppression in Digital Images using Gaussian Filtering Profile
Debolina Chakraborty, Anirban Chakraborty, Milan K. Tarafder, Ayan Banerjee, and Sekhar R. B. Chaudhuri
IEEE
Periodic noise removal from digital images stands out in the recent research field of image processing due to its enormous appeal amongst researchers over the last few decades. Images are often contaminated by periodic noise (unintended and spatially dependent repetitive patterns), which considerably degrades image quality. Separating high-amplitude noisy spectra using an appropriate thresholding method becomes easier because periodic noisy spectral components concentrate in the image spectrum and are clearly distinguishable from the remaining uncorrupted spectral components. Here, we have proposed a simple yet graceful, fully adaptive periodic noise reduction algorithm. At first, a double thresholding method is used to obtain the noisy bitmap, using the concept of a spectral histogram after a spectral smoothing operation. After that, the approximate shape of each noisy region is detected. Lastly, to filter out those noisy components, an Adaptive Gaussian Restoration Filter (AGRF) with no prefixed coefficients is applied in the filtration stage. The performance of our proposed algorithm has been compared with other existing algorithms in the literature, which shows that our proposition achieves more effective restoration with considerably lower computational time.

Low Area Memory Efficient VLSI Architecture of 1D/2D DWT for Real Time Image Decomposition
Anirban Chakraborty and Ayan Banerjee
IEEE
There is a dearth of high-quality reconfigurable hardware for DWT which can be extensively used in various real-time signal and image processing applications. In this article, we have focused on proposing dedicated customizable hardware for DWT applicable to 1D/2D signal processing. We have proposed bit-serial Distributed Arithmetic (DA) based VLSI architectures for 1D/2D DWT. Exploiting DA enables us to make our designs multiplierless, thereby consuming less area. The bit-serial configuration of DA is also exploited to introduce modularity and pipelining in the proposed convolution DWT based 1D/2D architectures. Though the speed of the proposed designs may not be suitable for certain applications, a number of parallel channels can be introduced for those. The provision of mode selection can efficiently be used to realize both 9/7 and 5/3 DWT filters in the same proposed architecture. In the proposed 1D/2D architecture, we have achieved about 50% memory size reduction, 30% reduction in area consumption and comparable speed in comparison to the other latest notable DWT architectures. We have verified the viability of our proposed 2D architecture by utilizing it in real-time image decomposition.
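The bit-serial distributed arithmetic principle referred to in this abstract can be illustrated with a short software model. This is a generic sketch of DA for a fixed-coefficient inner product, not the authors' architecture; the function names, the 8-bit sample width, and the integer coefficients are assumptions made for the example.

```python
def build_da_lut(coeffs):
    """Entry `a` holds the sum of coeffs[k] over every bit k that is set in address `a`."""
    n = len(coeffs)
    return [sum(c for k, c in enumerate(coeffs) if (a >> k) & 1)
            for a in range(1 << n)]


def da_inner_product(coeffs, samples, bits=8):
    """Compute sum(coeffs[k] * samples[k]) for signed `bits`-wide integer samples
    using only table lookups, shifts and additions (no multiplier)."""
    lut = build_da_lut(coeffs)
    mask = (1 << bits) - 1
    words = [s & mask for s in samples]  # two's-complement bit patterns
    acc = 0
    for b in range(bits):  # one bit plane per step, least significant first
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k  # gather bit b of every sample
        term = lut[addr] << b
        # In two's complement the most significant bit has weight -2**(bits-1).
        acc += -term if b == bits - 1 else term
    return acc
```

For instance, `da_inner_product([3, -5, 7, 2], [10, -20, 127, -128])` returns 763, the same value as the direct multiply-accumulate. The table grows as 2 to the power of the number of taps, which is why memory size is a central concern in DA-based filter designs.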

A Highly Accurate Current Mode Analog Implementation of Radix-2 FFT/IFFT Processor
Anirban Ganguly, Anirban Chakraborty, and Ayan Banerjee
IEEE
The discrete Fourier transform (DFT) is a very powerful tool of digital signal processing used for spectral domain analysis of signals. The radix-2 fast Fourier transform (FFT) is the most popular algorithm for calculating the DFT because of its simplicity. Applications where both DFT and IDFT operations are needed should include a single computational block which can calculate both DFT and IDFT as needed. Analog implementation of the DFT has low power consumption and takes less area on silicon. The proposed technique discusses the analog hardware implementation of an eight-point FFT/IFFT using the radix-2 algorithm. This approach saves a lot of area and memory. The FFT hardware is designed using cascode current mirrors for higher accuracy. Mode selection between FFT and IFFT is done with a switch arrangement. The circuit is tested rigorously with real and complex inputs using SPICE simulation with a BSIM4 65 nm CMOS process.
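The idea of one computational block serving both FFT and IFFT through a mode input can be sketched in software. This is a plain recursive decimation-in-time radix-2 transform, not a model of the analog circuit; the function names and the `inverse` flag are assumptions introduced for the example.

```python
import cmath


def _radix2(values, sign):
    """Recursive decimation-in-time butterflies; `sign` selects the twiddle direction."""
    n = len(values)
    if n == 1:
        return list(values)
    even = _radix2(values[0::2], sign)
    odd = _radix2(values[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled  # upper butterfly output
        out[k + n // 2] = even[k] - twiddled  # lower butterfly output
    return out


def fft_radix2(values, inverse=False):
    """Radix-2 FFT; `inverse=True` flips the twiddle sign and scales by 1/N."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    result = _radix2([complex(v) for v in values], 1.0 if inverse else -1.0)
    return [v / n for v in result] if inverse else result
```

The forward and inverse paths share every butterfly; only the twiddle sign and the final scaling differ, which is what makes a single switchable block attractive.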

Automated spectral domain approach of quasi-periodic denoising in natural images using notch filtration with exact noise profile
Debolina Chakraborty, Anirban Chakraborty, Ayan Banerjee, and Sekhar R. Bhadra Chaudhuri
Institution of Engineering and Technology (IET)
The domain of noise removal from digital images, by virtue of its enormous appeal amongst researchers, has stood out in the research field of image processing over the last few decades. Periodic noises are unintended spurious signals which often corrupt an image during acquisition/transmission, thereby resulting in repetitive patterns having spatial dependency and extensively degrading the visual quality of the image. However, high-amplitude noisy spectral components are clearly distinguishable from the remaining uncorrupted ones in the corresponding Fourier-transformed corrupted image spectrum. Hence, it is easier to distinguish and minimise those noisy components using an appropriate thresholding and filtration technique. Therefore, to start with, a simple yet elegant model of the noise-free natural image has been developed from the corrupted one, followed by a proper thresholding method to get the noisy bitmap. Finally, an adaptive sinc restoration filter based on extracting the exact shape of the noise spectrum profile has been applied in the filtration phase. The performance of the proposed algorithm has been assessed both visually and statistically against other state-of-the-art algorithms in the literature in terms of various performance measurement attributes, providing evidence of more effective restoration with considerably lower computational time.

A multiplier less VLSI architecture of modified lifting based 1D/2D DWT using speculative adder
A. Chakraborty, D. Chakraborty, and A. Banerjee
IEEE
This paper presents a fast, cost-effective, area-efficient, multiplier-less and pipelined VLSI architecture of the 1D and 2D Discrete Wavelet Transform (DWT). We have proposed a 1D DWT architecture based on an existing modified lifting-based algorithm. Our proposition outperforms other available lifting-based architectures in the literature in terms of hardware requirement and operating frequency, while maintaining comparable latency and throughput. The area of our design is reduced by introducing a pipelined shift-and-add unit instead of a multiplier. Speed is also enhanced by reducing the critical path delay to less than one adder delay (< Ta) using a recently invented, exceptionally fast, non-conventional speculative adder. Our design is capable of operating at a frequency at which other existing designs cannot operate satisfactorily. We have extended our proposition to the design of a 2D DWT architecture using our 1D DWT pipelined architecture. In 2D DWT, we have used an innovative block-based Z-type memory scanning method in our own way to reduce total processing time. Two-channel parallelism is incorporated in a cost-effective way into the 2D DWT architecture to achieve fewer processing cycles along with double the throughput in comparison with other existing designs.
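Replacing a multiplier by a shift-and-add unit, as this abstract describes, amounts to multiplying by a fixed coefficient through its binary expansion. The following is a minimal sketch under assumptions: the function name, the 8 fractional bits, and the restriction to non-negative coefficients are illustrative choices, not details taken from the paper.

```python
def shift_add_multiply(x, coeff, frac_bits=8):
    """Multiply integer `x` by a non-negative constant `coeff` using shifts and adds only."""
    if coeff < 0:
        raise ValueError("coeff must be non-negative in this sketch")
    # Quantise the coefficient to fixed point once, at design time.
    q = round(coeff * (1 << frac_bits))
    acc = 0
    bit = 0
    while q:
        if q & 1:
            acc += x << bit  # add a shifted copy of x for every set coefficient bit
        q >>= 1
        bit += 1
    return acc >> frac_bits  # drop the fractional bits (floor)
```

For example, `shift_add_multiply(100, 1.5)` returns 150. The number of adders equals the number of set bits in the quantised coefficient, so coefficients with sparse binary expansions are the cheapest to realize.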

A proficient method for periodic and quasi-periodic noise fading using spectral histogram thresholding with sinc restoration filter
Debolina Chakraborty, Milan Kumar Tarafder, Anirban Chakraborty, and Ayan Banerjee
Elsevier BV
Image denoising is a challenging task in the recent research field of image processing. During image acquisition, periodic and quasi-periodic noise perturbs the image signal. Hence, the removal of periodic noise from images has been of immense significance to researchers over the last few decades. Periodic noises are unintended signals and result in repetitive, spatially dependent patterns. These degrade visual quality significantly. The presence of some high-amplitude spiky peaks in the spectral domain makes them clearly distinguishable from the non-noisy coefficients. Hence, they become easy to segregate using an appropriate thresholding technique. To date, many algorithms have been proposed to alleviate the effect of noise while preserving the authentic image information, thereby improving image quality in the frequency domain, which has proved to be a much better solution than spatial domain operations. Here, we have proposed a simple yet elegant and fully adaptive reconstruction algorithm using the concept of a spectral domain histogram for thresholding. Then a novel sinc restoration filter is applied during the noisy frequency cancellation phase. The performance of the proposed algorithm is compared with other algorithms discussed in the literature in terms of various metrics, which demonstrates the novelty and superiority of our proposition both qualitatively and quantitatively.
