He received his B.E. in Electronics and Communication Engineering from R.G.P.V., Bhopal, in 2004, his M.Tech. in VLSI Design from ABV-IIITM, Gwalior, in 2006, and his Ph.D. in the area of VLSI implementation of DSP algorithms from JUET, Guna, in 2017. In 2006 he was selected for the SMDP project, sponsored by MHRD, Government of India, and joined the Department of Electronics and Communication Engineering, MANIT, Bhopal, as a Research Engineer. In 2007 he joined Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, as an Assistant Professor. He currently serves as a reviewer for various IEEE Transactions and for the Journal of Circuit, System and Signal Processing (Springer). He is a member of professional bodies including IEEE and IETE. His research interests include VLSI architecture design, ASIC and FPGA design, and image processing. He has published nearly 6 technical papers.
EDUCATION
B.E., M.Tech., Ph.D.
RESEARCH, TEACHING, or OTHER INTERESTS
Electrical and Electronic Engineering, Signal Processing, Hardware and Architecture, Artificial Intelligence
13 Scopus Publications
Scopus Publications
An efficient method of modulo adder design for Digital Signal Processing applications
Subodh Kumar Singhal, Sumit Kumar, Sujit Kumar Patel, K. Anjali Rao, Gaurav Saxena
MethodsX, 2025
The modulo adder is a widely used arithmetic component in many digital signal processing (DSP) applications, such as finite impulse response (FIR) and infinite impulse response (IIR) filters, digital signal processors, image processing modules, the discrete cosine transform, and cryptography. Therefore, in this paper, the critical path delay and area of the modulo adder are analyzed. An optimized diminished-one modulo 2^n + 1 adder is proposed based on the analysis results.
• Theoretical comparison shows that the suggested modulo adder involves 23.41% less area (transistor count) and 31.64% less delay than the best existing design for an average bit-width.
• Synthesis results reveal that the proposed modulo adder involves 13.71% less area and 14.5% less delay than the best existing modulo adder structure in the literature for an average bit-width.
• To observe the overall efficacy of the suggested modulo adder design, the area-delay product (ADP) and power-delay product (PDP) of the proposed and existing modulo adder designs are computed using synthesis data. The values obtained reveal that the proposed design achieves a 26.2% reduction in ADP and a 32.8% improvement in PDP compared to the best available modulo adder structure.
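As a point of reference for the arithmetic involved, here is a minimal behavioral sketch of diminished-one addition modulo 2^n + 1. It models only the textbook operation (add the two diminished-one operands and add back the inverted carry-out), not the optimized circuit proposed in the paper. The function name `diminished_one_add` and the separate zero flag are assumptions made for illustration.

```python
def diminished_one_add(a_star: int, b_star: int, n: int) -> tuple[int, bool]:
    """Add two nonzero operands modulo 2**n + 1 in diminished-one form.

    A value A in 1..2**n is stored as A* = A - 1 (n bits). Zero has no
    n-bit code, so it is reported through a separate flag (an assumption
    of this sketch; hardware designs typically use a zero-indication bit).
    Returns (s_star, result_is_zero).
    """
    mask = (1 << n) - 1
    t = a_star + b_star            # plain (n+1)-bit binary sum
    carry_out = t >> n             # 1 if the n-bit addition overflowed
    if t == mask:                  # A + B == 2**n + 1, i.e. congruent to 0
        return 0, True
    # Textbook identity: S* = (A* + B* + NOT carry_out) mod 2**n
    s_star = (t + (1 - carry_out)) & mask
    return s_star, False


# Example: 16 + 16 modulo 17 with n = 4 (operands stored as 15 and 15).
s_star, is_zero = diminished_one_add(15, 15, 4)
print(s_star + 1, is_zero)
```

The zero case is split out because the all-ones intermediate sum corresponds to a result of 0, which the n-bit diminished-one code cannot represent.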
Performance analysis of single image fog expulsion techniques
Gaurav Saxena, Sarita Singh Bhadauria, Subodh Kumar Singhal
Proceedings of the 2021 IEEE 10th International Conference on Communication Systems and Network Technologies (CSNT 2021), 2021
Haze removal techniques are widely used in various computer vision applications such as object detection, tracking, target recognition, and video surveillance. Therefore, in this paper, a classification of different fog removal techniques is presented. Further, recent dehazing algorithms in each category are analyzed for the restoration of atmospherically degraded images. The performance of the different algorithms is evaluated using the most commonly used image quality assessment parameters, and the comparison parameters used for this evaluation are also discussed. Finally, a qualitative and quantitative comparison of various state-of-the-art defogging algorithms is given, and the research scope for further improvement is discussed.
Area-delay efficient Radix-4 8×8 Booth multiplier for DSP applications
Subodh K. SINGHAL, Sujit K. PATEL, Anurag MAHAJAN, Gaurav SAXENA
Turkish Journal of Electrical Engineering and Computer Sciences, 2021
The Booth multiplier is a key component in portable very large-scale integration (VLSI) systems enabled with signal and image processing applications. Area, delay, and energy are the major constraints in these systems. Therefore, in this paper, a detailed analysis of the state-of-the-art Booth multiplier architecture and its various internal units is presented to find the scope for optimization. Based on the findings of the analysis, optimized new binary to 2's complement (B2C), Booth encoder-cum-selector type-1 and type-2, and partial product addition units are proposed. Furthermore, using these optimized units, an efficient parallel radix-4 8×8 Booth multiplier architecture is proposed. Simulation is carried out to verify the functionality of the proposed design. The synthesis results show that the proposed structure offers a saving of 13.56% in delay and 34.87% in area compared to the recent similar Booth multiplier design. Comparison results also reveal that the proposed Booth multiplier design involves 43.7% less area-delay product and 11.24% less energy than the recent Booth multiplier design. Therefore, the proposed Booth multiplier design could be helpful for efficient realization of digital signal processing systems.
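For readers unfamiliar with radix-4 Booth recoding, the following is a minimal behavioral sketch of an 8×8 multiplication. It assumes signed two's complement operands and shows only the standard recoding into digits from {-2, -1, 0, 1, 2}. It does not model the B2C, encoder-cum-selector, or partial product addition units proposed in the paper, and the function names are hypothetical.

```python
def booth_radix4_digits(y: int, n: int = 8) -> list[int]:
    """Recode an n-bit two's complement multiplier into n/2 radix-4 digits."""
    padded = (y & ((1 << n) - 1)) << 1      # append the implicit y[-1] = 0
    digits = []
    for i in range(0, n, 2):
        triplet = (padded >> i) & 0b111     # bits y[i+1], y[i], y[i-1]
        b_hi = (triplet >> 2) & 1
        b_mid = (triplet >> 1) & 1
        b_lo = triplet & 1
        digits.append(-2 * b_hi + b_mid + b_lo)
    return digits


def booth_multiply(x: int, y: int, n: int = 8) -> int:
    """Multiply signed n-bit x and y by summing n/2 weighted partial products."""
    # Each digit selects 0, +/-x, or +/-2x; digit k carries weight 4**k.
    return sum(d * x * (4 ** k) for k, d in enumerate(booth_radix4_digits(y, n)))


print(booth_radix4_digits(13))   # [1, -1, 1, 0]
print(booth_multiply(-7, 13))    # -91
```

Recoding halves the number of partial products (four instead of eight for an 8-bit multiplier), which is why the encoder, selector, and partial product adder are the natural targets for optimization.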
Efficient diminished-1 modulo (2^n + 1) adder using parallel prefix adder
Subodh Kumar Singhal, B. K. Mohanty, Sujit Kumar Patel, Gaurav Saxena
Journal of Circuits, Systems and Computers, 2020
The parallel prefix adder (PPA) is the core component of the diminished-1 modulo (2^n + 1) adder structure. In this paper, a PPA design based on group-carry selection logic is proposed. It is free from the redundant logic operations that are otherwise present in the existing PPA design based on group-sum selection logic. Further, the logic expression of the pre-processing unit of the PPA is presented in a simplified form to save some logic resources. For a bit-width of 32, the proposed PPA design involves 26.1% less area, consumes 28.4% less power, and has a marginally higher critical-path delay than the existing PPA design. An efficient diminished-1 modulo (2^n + 1) adder structure is presented using the proposed PPA design and a modified carry computation algorithm of the existing design. For a bit-width of 32, the proposed diminished-1 modulo (2^n + 1) adder structure offers a saving of 25.5% in area-delay product (ADP) and 24.1% in energy-delay product (EDP) over the best of the existing modulo adder structures.
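To illustrate what a parallel prefix adder computes, here is a minimal behavioral sketch using a generic Kogge-Stone style prefix network. This is a swapped-in textbook scheme chosen for brevity, not the group-carry selection logic proposed in the paper. The function name `kogge_stone_add` and its parameters are assumptions.

```python
def kogge_stone_add(a: int, b: int, n: int = 32, carry_in: int = 0) -> tuple[int, int]:
    """Add two n-bit unsigned values; returns (n-bit sum, carry-out)."""
    # Pre-processing: per-bit generate and propagate signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]

    # Prefix network: after the last level, G[i] and P[i] span bits 0..i.
    G, P = g[:], p[:]
    distance = 1
    while distance < n:
        new_G, new_P = G[:], P[:]
        for i in range(distance, n):
            new_G[i] = G[i] | (P[i] & G[i - distance])
            new_P[i] = P[i] & P[i - distance]
        G, P = new_G, new_P
        distance *= 2

    # Post-processing: carries[i] is the carry into bit position i.
    carries = [carry_in] + [G[i] | (P[i] & carry_in) for i in range(n)]
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n]


print(kogge_stone_add(0xFFFFFFFF, 1))   # (0, 1)
```

The three stages (pre-processing, prefix network, post-processing) mirror the usual PPA decomposition. The prefix network needs only about log2(n) levels, at the cost of extra logic that designs like the one in the abstract aim to trim.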
Area-delay and energy efficient multi-operand binary tree adder
Sujit Kumar Patel, Subodh Kumar Singhal
IET Circuits, Devices and Systems, 2020
Here, the critical path of ripple carry adder (RCA)-based binary tree adder (BTA) is analysed to find the possibilities for delay minimisation. Based on the findings of the analysis, the new logic formulation and the corresponding design of RCA are proposed for the BTA. The comparison result shows that the proposed RCA design offers better efficiency in terms of area, delay and energy than the existing RCA. Using this RCA design, the BTA structure is proposed. The synthesis result reveals that the proposed 32-operand BTA provides the saving of 22.5% in area–delay product and 28.7% in energy–delay product over the recent Wallace tree adder which is the best among available multi-operand adders. The authors have also applied the proposed BTA in the recent multiplier designs to evaluate its performance. The synthesis result shows that the performance of multiplier designs improved significantly due to the use of proposed BTA. Therefore, the proposed BTA design can be a better choice to develop the area, delay and energy efficient digital systems for signal and image processing applications.
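The structure of an RCA-based binary tree adder can be sketched behaviorally as follows. This is a minimal model assuming unsigned operands and a conventional full-adder ripple chain. It does not reproduce the new RCA logic formulation from the paper, and the function names are hypothetical.

```python
def ripple_carry_add(a: int, b: int, width: int) -> int:
    """Add two unsigned width-bit values bit by bit; result has width + 1 bits."""
    carry = 0
    total = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total |= (x ^ y ^ carry) << i            # full-adder sum bit
        carry = (x & y) | (carry & (x ^ y))      # full-adder carry bit
    return total | (carry << width)


def binary_tree_add(operands: list[int], width: int) -> int:
    """Sum a non-empty list of unsigned operands in pairwise tree levels."""
    level = list(operands)
    while len(level) > 1:
        nxt = [ripple_carry_add(level[i], level[i + 1], width)
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:                       # odd operand passes through
            nxt.append(level[-1])
        level = nxt
        width += 1                               # each level grows by one bit
    return level[0]


operands = [(37 * k + 11) % 256 for k in range(32)]   # 32 arbitrary 8-bit values
print(binary_tree_add(operands, 8) == sum(operands))
```

With 32 operands the tree has five levels, and the adder width grows by one bit per level, which is where the carry chains of successive RCAs overlap and where the delay analysis in the abstract applies.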
Efficient Parallel Architecture for Fixed-Coefficient and Variable-Coefficient FIR Filters Using Distributed Arithmetic
Subodh Kumar Singhal, Basant Kumar Mohanty
Journal of Circuits, Systems and Computers, 2016
In this paper, we perform a complexity analysis of fixed-coefficient and variable-coefficient distributed arithmetic (DA)-based finite impulse response (FIR) filter structures to observe the effect of LUT decomposition on the area complexity of the DA structure. The complexity analysis reveals that the area complexity of the different units of the DA FIR filter structure does not increase proportionately with the level of parallelism. An appropriate selection of the LUT decomposition factor, together with a higher level of parallelism in the computation, could improve the area-delay efficiency of both fixed-coefficient and variable-coefficient DA-based FIR structures. Based on these findings, we propose bit-parallel block-based DA structures for fixed-coefficient and variable-coefficient FIR filters. The proposed structures process one block of input samples and produce one block of outputs in every clock cycle. Theoretical estimates show that the proposed fixed-coefficient structure, for block-size 8 and filter-length 32, involves eight times more ROM-LUT words, eight times more adders, two fewer registers, and offers eight times higher throughput than the existing similar structure. For the same block-size and filter-length, the proposed variable-coefficient structure involves 7.2 times more adders, the same number of registers, eight times more MUXes, and offers eight times higher throughput than the best available similar structure. Synthesis results show that the proposed fixed-coefficient structure for block-size 8 and filter-length 32 involves 47% less area-delay product (ADP) and 42% less energy per sample (EPS) than the existing structure and offers nearly eight times higher throughput than others. For the same block-size and filter-length, the proposed structure for variable-coefficient FIR involves 71% less ADP and 65% less EPS than the similar existing structures.
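The role of the LUT and of LUT decomposition in distributed arithmetic can be illustrated with a minimal sketch of a fixed-coefficient inner product. It assumes signed two's complement samples and processes one bit-plane at a time. It is not the bit-parallel block-based structure proposed in the paper, and the names `build_luts`, `da_inner_product`, and `group_size` are hypothetical.

```python
def build_luts(coeffs: list[int], group_size: int) -> list[list[int]]:
    """Split the taps into groups; each group gets a LUT of 2**group_size words."""
    luts = []
    for start in range(0, len(coeffs), group_size):
        group = coeffs[start:start + group_size]
        # Word 'addr' holds the sum of the coefficients whose address bit is 1.
        luts.append([sum(h for k, h in enumerate(group) if (addr >> k) & 1)
                     for addr in range(1 << len(group))])
    return luts


def da_inner_product(samples: list[int], luts: list[list[int]],
                     group_size: int, width: int = 8) -> int:
    """Compute sum(h[k] * x[k]) with LUT reads and shift-accumulate only."""
    acc = 0
    for bit in range(width):
        partial = 0
        for g, lut in enumerate(luts):
            group = samples[g * group_size:(g + 1) * group_size]
            addr = 0
            for k, x in enumerate(group):
                addr |= ((x >> bit) & 1) << k    # gather one bit from each sample
            partial += lut[addr]
        # The sign bit of two's complement carries negative weight.
        weight = -(1 << bit) if bit == width - 1 else (1 << bit)
        acc += weight * partial
    return acc


coeffs = [3, -1, 4, 1, -5, 9, 2, -6]
samples = [-128, 127, 5, -7, 0, 33, -1, 64]
luts = build_luts(coeffs, group_size=4)          # two 16-word LUTs, not one of 256
print(da_inner_product(samples, luts, group_size=4))
```

A smaller `group_size` shrinks the total LUT storage exponentially but adds one adder per extra group, which is the area trade-off that the decomposition factor controls.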