@unom.ac.in
Associate Professor
University of Madras
1. B.Sc. 1990 Madurai Kamaraj University Computer Science Class-I
2. M.C.A. 1993 Manonmaniam Sundaranar University Computer Science Class-I
3. M.Phil. 2003 Manonmaniam Sundaranar University Computer Science Class-I
4. SLET 2000 Bharathidasan University Computer Science
5. Ph.D. 2012 Mother Teresa Women's University Computer Science
• Data Mining
• Machine and Deep Learning
• Big Data Analytics
Scopus Publications
A. S. Karthik Kannan, S. Appavu alias Balamurugan, and S. Sasikala
Institute of Electrical and Electronics Engineers (IEEE)
Software packages that meet the requirements of an organization should be appropriately investigated and evaluated. Picking the wrong software package may adversely affect the business processes and working functions of an organization, and inappropriate software selection can be a costly, time-consuming decision-making process. This paper aims to provide a basis for selecting open source software packages using the analytic hierarchy process (AHP) and the technique for order preference by similarity to ideal solution (TOPSIS). In addition, the priority weights are generated and optimized using a teaching–learning based optimization approach. A well-organized algorithmic procedure is given in detail, and a numerical example is examined to illustrate the validity and practicability of the proposed methodologies.
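The TOPSIS step of the selection procedure can be illustrated with a minimal sketch. The decision matrix, weights and criteria below are invented for illustration; in the paper the weights would come from AHP and be refined by teaching–learning based optimization:

```python
import numpy as np

def topsis_rank(matrix, weights, benefit):
    """Rank alternatives by closeness to the ideal solution.

    matrix  : (alternatives x criteria) decision matrix
    weights : criterion weights summing to 1 (e.g. from AHP)
    benefit : True for benefit criteria, False for cost criteria
    """
    m = np.asarray(matrix, dtype=float)
    # vector-normalise each criterion column, then apply the weights
    v = m / np.linalg.norm(m, axis=0) * np.asarray(weights)
    ideal = np.where(benefit, v.max(axis=0), v.min(axis=0))
    anti = np.where(benefit, v.min(axis=0), v.max(axis=0))
    d_pos = np.linalg.norm(v - ideal, axis=1)
    d_neg = np.linalg.norm(v - anti, axis=1)
    return d_neg / (d_pos + d_neg)   # higher = closer to the ideal

# three hypothetical packages scored on cost (lower is better) and features
scores = topsis_rank([[250, 7], [200, 6], [300, 9]],
                     weights=[0.4, 0.6],
                     benefit=[False, True])
best = int(np.argmax(scores))
```

Here the third package wins: its feature advantage outweighs its higher cost under the assumed 0.4/0.6 weighting.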
E. S. Vinoth Kumar, S. Appavu alias Balamurugan, and S. Sasikala
Atlantis Press
In the present decade, many educational institutions use classification techniques and data mining concepts for evaluating student records. Student evaluation and classification are very important for improving the pass percentage; hence, educational data mining models for analysing academic performance have become an interesting research domain. With that note, this paper develops a model called the Multi-Tier Student Performance Evaluation Model (MTSPEM) using single and ensemble classifiers. Student data from higher educational institutions are obtained and evaluated in this model based on the significant factors that have the greatest impact on students' performance and results. Further, data preprocessing is carried out to remove duplicate and redundant data, thereby enhancing result accuracy. The multi-tier model contains two classification phases, namely primary classification and secondary classification. The first-tier phase uses Naive Bayes classification, whereas the second-tier phase comprises ensemble classifiers such as Boosting, Stacking and Random Forest (RF). A performance analysis of the proposed work computes the classification accuracy, and comparative evaluations evidence the efficiency of the proposed model.
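As a rough illustration of the two-tier idea (not the paper's MTSPEM implementation), the sketch below uses a tiny Gaussian Naive Bayes as the first tier and a bagged majority vote as a stand-in for the second-tier ensemble; the data are synthetic:

```python
import numpy as np

class GaussianNB:
    """Tiny Gaussian Naive Bayes used as the first-tier classifier."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])
        self.var = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes])
        self.prior = np.array([np.mean(y == c) for c in self.classes])
        return self
    def predict(self, X):
        # log-likelihood of each sample under each class-conditional Gaussian
        ll = -0.5 * (((X[:, None, :] - self.mu) ** 2) / self.var
                     + np.log(2 * np.pi * self.var)).sum(axis=2)
        return self.classes[np.argmax(ll + np.log(self.prior), axis=1)]

def two_tier_predict(models, X):
    """Second tier: majority vote over an ensemble of first-tier models."""
    votes = np.array([m.predict(X) for m in models])
    return np.array([np.bincount(col).argmax() for col in votes.T])

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
# bagging: each member sees a bootstrap sample (a simple stand-in
# for the Boosting/Stacking/RF ensembles named in the abstract)
ensemble = [GaussianNB().fit(X[idx], y[idx])
            for idx in (rng.integers(0, 100, 100) for _ in range(5))]
pred = two_tier_predict(ensemble, X)
accuracy = float((pred == y).mean())
```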
A. S. Karthik Kannan, S. Appavu Alias Balamurugan, and S. Sasikala
Institute of Electrical and Electronics Engineers (IEEE)
This paper proposes a metaheuristic Group Decision Making (GDM) model for integrating heterogeneous information. Instead of converting heterogeneous information into a single form, the proposed approach incorporates it using a Weighted Power Average (WPA) operator to prevent information loss. The consensus degree between each individual and the group (decision matrix) is then determined on the basis of the deviation degree. In addition, the iterative algorithm's feedback mechanism is used to adjust any individual decision matrix that does not achieve consensus. The consensus GDM is used by the Analytic Hierarchy Process (AHP), an imperative technique for generating a weight for every criterion. These weights are optimized using Jaya, a metaheuristic algorithm. In addition, in order to choose the best alternative, a heterogeneous Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) is used. A supplier selection problem is chosen to validate the proposed model and compare it with other similar GDM models. The results show that the proposed approach not only prevents the loss of information, but also effectively integrates heterogeneous information in the heterogeneous GDM environment.
N. Malarvizhi, J. Aswini, S. Sasikala, M. Hemanth Chakravarthy, and E. A. Neeba
Springer Science and Business Media LLC
S. Appavu Alias Balamurugan, K. R. Saranya, S. Sasikala, and G. Chinthana
Atlantis Press
Associate Professor, Department of Computer Science, Central University of Tamil Nadu, Thiruvarur, Tamil Nadu, India;
Research Scholar, Department of Computer Science and Engineering, Velammal College of Engineering and Technology, Madurai, Tamil Nadu, India;
Associate Professor, Department of Computer Science and Engineering, Velammal College of Engineering and Technology, Madurai, Tamil Nadu, India;
Assistant Professor, Department of Pharmacology, Thanjavur Medical College, Thanjavur, Tamil Nadu, India
A. Aafreen Nawresh and S. Sasikala
Springer Singapore
Feature extraction is a difficult task in medical imaging and analysis, since essential features may be subtle yet crucial for precise identification and diagnosis. Our main motive is to initiate an independent detection and classification process that improves and accelerates the physician's decision-making during the emergency phase in cases of brain haemorrhage or trauma. To extract the haemorrhagic area, the other parts in and around the brain CT scan, such as the skull, brain ventricles and edema tissues, have to be eliminated, for which the image has to undergo processing. Obtaining the region of interest and its features involves the following steps: (a) histogram of image intensities, (b) Otsu thresholding, (c) skull removal, (d) Gray Level Co-occurrence Matrix for feature extraction, and (e) classification using K-Nearest Neighbour and Multilayer Perceptron algorithms to identify the type of brain haemorrhage. The identification and classification phases validate the output obtained by the methods of both phases. In this work, K-Nearest Neighbour and Multilayer Perceptron are compared: the results obtained in classifying brain haemorrhages gave an accuracy of about 82% using K-Nearest Neighbour and 95.5% using the Multilayer Perceptron.
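Step (b) of the pipeline, Otsu thresholding, can be sketched in a few lines. The "scan" below is synthetic and the function is a generic textbook implementation, not the paper's code:

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Otsu's method: pick the threshold that maximises between-class variance."""
    hist, edges = np.histogram(image.ravel(), bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                      # weight of the background class
    w1 = 1.0 - w0                          # weight of the foreground class
    mu0 = np.cumsum(p * centers) / np.where(w0 > 0, w0, 1)
    mu_t = (p * centers).sum()
    mu1 = (mu_t - np.cumsum(p * centers)) / np.where(w1 > 0, w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2   # between-class variance per cut
    return centers[np.argmax(between)]

# synthetic "scan": dark background at ~30, one bright region at ~200
rng = np.random.default_rng(1)
img = rng.normal(30, 5, (64, 64))
img[20:40, 20:40] = rng.normal(200, 5, (20, 20))
t = otsu_threshold(img)
mask = img > t   # rough region of interest, as in step (b) above
```

In the real pipeline this mask would feed skull removal and GLCM feature extraction.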
S. Shyni Carmel Mary and S. Sasikala
Springer Singapore
Automated segmentation of abnormal medical images using computing algorithms is a challenging task. Among segmentation and clustering algorithms, Fuzzy C-Means (FCM) is beneficial for producing accurate results. In this paper, the empirical work concentrates on identifying tumours from spinal cord MRI by determining the accuracy of the affected region in the FCM cluster result under different filtering techniques. At first, a linear Support Vector Machine (SVM) is used to classify each image as normal or abnormal. Once an anomaly is confirmed, the MRI images are pre-processed with different filters, namely arithmetic, Gaussian, median, Wiener and anisotropic diffusion, for enhancement without changing the details of the image. Each filter has unique characteristics over the dataset. All the pre-processed data are clustered using FCM to identify the tumour region. The filtering technique best suited to the clustering is selected based on the accuracy and the processing time taken over various numbers of clusters. The proposed algorithm, anisotropic diffusion with FCM, gave efficient results on the performance measures.
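A minimal Fuzzy C-Means sketch, assuming a plain two-cluster setting with synthetic intensities (in the paper's pipeline the MRI slice would first be filtered):

```python
import numpy as np

def fcm(X, c=2, m=2.0, iters=100, seed=0):
    """Plain Fuzzy C-Means: returns cluster centres and membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)      # memberships sum to 1 per sample
    for _ in range(iters):
        W = U ** m
        centres = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))     # standard fuzzy membership update
        U /= U.sum(axis=1, keepdims=True)
    return centres, U

# two well-separated blobs standing in for "tumour" vs "healthy" intensities
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.3, (40, 2)), rng.normal(3, 0.3, (40, 2))])
centres, U = fcm(X)
labels = U.argmax(axis=1)
```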
R. Shalini and S. Sasikala
Springer Singapore
In the recent era, the increasing incidence of diabetes results in complex retinal diseases. Early diagnosis can avoid the consequences of diabetes-related complications, so there is an urge to develop a computerized model for identifying the retinopathy features caused by diabetes while evading false positives. The aim of this paper is to identify and segment retinal features such as blood vessels and the optic disc in diabetic retinopathy (DR) fundus images, in order to remove the non-diabetic-retinopathy features and thereby make the detection of DR features (lesions) easier. This is done using a novel hybrid segmentation algorithm called BINI, which combines binary thresholding and Niblack's thresholding. Initially, the input image is standardized by resizing it using the bi-cubic interpolation area method. Then, the fundus image quality is enhanced using preprocessing techniques such as green channel extraction, intensity channel extraction, median filtering, contrast limited adaptive histogram equalization and morphological operations. The preprocessed images are segmented to produce retinal features using the BINI algorithm. The performance of the segmentation methods is evaluated using validation measures such as the Rand index (Ri), Jaccard index (Ji), Precision (Pr), Recall (Rc) and F-Measure (Fm). The proposed BINI thresholding method for segmenting the retinal features achieved an accuracy of about 96.48%, leading to clear segmentation of the lesions in diabetic retinopathy images.
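Niblack's local thresholding, one half of the BINI hybrid, can be sketched as follows; the window size, k value and test image are illustrative assumptions:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def niblack_threshold(image, window=15, k=-0.2):
    """Niblack's local threshold: T(x, y) = local mean + k * local std."""
    pad = window // 2
    padded = np.pad(image.astype(float), pad, mode="reflect")
    win = sliding_window_view(padded, (window, window))
    mean = win.mean(axis=(2, 3))
    std = win.std(axis=(2, 3))
    return mean + k * std

rng = np.random.default_rng(3)
img = rng.normal(100, 3, (32, 32))
img[8:16, 8:16] = 160                      # bright "vessel-like" patch
T = niblack_threshold(img)
binary = img > T                           # a BINI-style hybrid would combine
                                           # this with a global binary threshold
```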
D. Renuka Devi and S. Sasikala
Springer International Publishing
Feature selection (FS) plays an imperative role in Machine Learning (ML), but it is really demanding when applied to voluminous data. Conventional FS methods are not competent at handling big datasets, which leads to the need for a technology that processes the data in parallel. MapReduce is a programming framework for processing massive data using the "divide and conquer" approach. In this paper, a novel parallel BAT algorithm is proposed for feature selection on big datasets, and classification is finally applied with a set of known classifiers. The proposed parallel FS technique is highly scalable for big datasets. The experimental results show the improved efficacy of the proposed algorithm in terms of accuracy, with comparatively lower execution time as the number of parallel nodes is increased.
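A serial, single-node sketch of a binary bat-style feature search (the paper runs this in parallel under MapReduce; the fitness function here is a simple correlation-based stand-in, not the paper's):

```python
import numpy as np

def fitness(mask, X, y):
    """Toy filter fitness: summed absolute feature-target correlation,
    with a small penalty per selected feature."""
    if mask.sum() == 0:
        return -1.0
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in np.flatnonzero(mask)]
    return float(np.sum(corr)) - 0.05 * mask.sum()

def bat_feature_select(X, y, n_bats=10, iters=30, seed=4):
    """Binary bat-style search: bats pull bits towards the global best mask
    and flip random bits for exploration, keeping improving moves."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    bats = rng.random((n_bats, d)) < 0.5
    best = bats[np.argmax([fitness(b, X, y) for b in bats])].copy()
    for _ in range(iters):
        for i in range(n_bats):
            trial = bats[i].copy()
            pull = rng.random(d) < 0.3     # attraction towards the global best
            trial[pull] = best[pull]
            flip = rng.random(d) < 0.1     # random exploration ("loudness")
            trial[flip] = ~trial[flip]
            if fitness(trial, X, y) > fitness(bats[i], X, y):
                bats[i] = trial
                if fitness(trial, X, y) > fitness(best, X, y):
                    best = trial.copy()
    return best

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 8))
y = X[:, 0] + X[:, 1] + 0.1 * rng.normal(size=200)   # only features 0, 1 matter
mask = bat_feature_select(X, y)
```

Under MapReduce, the fitness evaluations of the bat population would be distributed across mappers and the best mask aggregated in the reducer.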
D. Renuka Devi and S. Sasikala
Springer Science and Business Media LLC
Abstract Feature selection is mainly used to lessen the processing load of data mining models. To reduce the time for processing voluminous data, parallel processing is carried out with the MapReduce (MR) technique. However, with the existing algorithms, the performance of the classifiers needs substantial improvement. The MR method recommended in this research work performs feature selection in parallel, which improves performance. To enhance the efficacy of the classifier, this work proposes an innovative Online Feature Selection (OFS)–Accelerated Bat Algorithm (ABA) and a framework for applications that stream features in advance with indefinite knowledge of the feature space. The concrete OFS-ABA method is suggested to select significant and non-superfluous features within the MapReduce (MR) framework. Finally, an Ensemble Incremental Deep Multiple Layer Perceptron (EIDMLP) classifier is applied to classify the dataset samples; the outputs of homogeneous IDMLP classifiers are combined in the EIDMLP classifier. The proposed feature selection method along with the classifier is evaluated expansively on three datasets of high dimensionality. The MR-OFS-ABA method shows enhanced performance over existing feature selection methods, namely PSO, APSO and ASAMO (Accelerated Simulated Annealing and Mutation Operator). The result of the EIDMLP classifier is compared with other existing classifiers such as Naïve Bayes (NB), Hoeffding tree (HT), and Fuzzy Minimal Consistent Class Subset Coverage (FMCCSC)-KNN (K-Nearest Neighbour). The methodology is applied to three datasets and the results are compared with four classifiers and three state-of-the-art feature selection algorithms, showing enhanced accuracy and less processing time.
S. Koteeswaran, N. Malarvizhi, E. Kannan, S. Sasikala, and S. Geetha
Springer Science and Business Media LLC
The aviation safety management system is a vital component of the aviation industry. Aviation safety inspectors apply broad knowledge about the aviation industry, aviation safety, and the central laws, regulations and strategies affecting aviation. In addition, they apply rigorous technical knowledge and skill in the operation and maintenance of aircraft. Data mining methods have also been successfully applied in aviation safety management. The aviation industry accumulates large amounts of knowledge and data. This paper proposes a method that applies data mining techniques to the accident reports of the Federal Aviation Administration (FAA) accident/incident data system database, which contains accident records for all categories of civil aviation between 1919 and 2014. In this study, we investigate the application of several data mining methods to the accident reports, to arrive at new inferences that could help the aviation management system. Moreover, correlation-based feature selection (CFS) with an Oscillating Search technique is used to select the prominent attributes that are potential factors causing the maximum number of aircraft accidents. The principle of this work is to find the effective attributes in order to reduce the number of accidents in the aviation industry. The proposed novel idea, named "improved oscillated correlation feature selection (IOCFS)", is evaluated against conventional classifiers such as Naïve Bayes, support vector machine (SVM), artificial neural network (ANN), k-nearest neighbour (k-NN), Multiclass classifier and decision tree (J48). The selected features are tested in terms of accuracy, running time and reliability, measured by true positive rate, false positive rate, precision, recall, F-measure and ROC. The results are best for the k-NN classifier with k = 5, compared with the other conventional classifiers.
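The CFS criterion underlying the proposed IOCFS can be sketched via Hall's merit formula; the data and subsets below are synthetic, and the oscillating search itself (which alternately adds and removes features to raise this merit) is omitted:

```python
import numpy as np

def cfs_merit(X, y, subset):
    """CFS 'merit' of a feature subset: high feature-class correlation,
    low feature-feature redundancy (Hall's formula)."""
    k = len(subset)
    r_cf = np.mean([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in subset])
    if k == 1:
        return float(r_cf)
    r_ff = np.mean([abs(np.corrcoef(X[:, a], X[:, b])[0, 1])
                    for i, a in enumerate(subset) for b in subset[i + 1:]])
    return float(k * r_cf / np.sqrt(k + k * (k - 1) * r_ff))

rng = np.random.default_rng(6)
X = rng.normal(size=(300, 3))
y = X[:, 0] + X[:, 1] + 0.2 * rng.normal(size=300)   # feature 2 is pure noise
m_both = cfs_merit(X, y, [0, 1])    # two complementary predictors
m_one = cfs_merit(X, y, [0])        # one predictor alone
m_noisy = cfs_merit(X, y, [0, 2])   # predictor plus an irrelevant feature
```

The merit rewards adding a complementary feature and punishes adding an irrelevant one, which is what an oscillating search exploits.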
Optimal feature subset selection over very high dimensional data is a vital issue. Even when the optimal features are selected, classifying with those selected features remains a complicated task. In order to handle these problems, a novel Accelerated Simulated Annealing and Mutation Operator (ASAMO) feature selection algorithm is suggested in this work. For solving the classification problem, the Fuzzy Minimal Consistent Class Subset Coverage (FMCCSC) problem is introduced. In FMCCSC, a consistent subset is combined with the K-Nearest Neighbour (KNN) classifier, known as the FMCCSC-KNN classifier. Two datasets, Dorothea and Madelon from the UCI Machine Learning Repository, are experimented on for optimal feature selection and classification. The experimental results substantiate the efficiency of the proposed ASAMO with the FMCCSC-KNN classifier compared to the Particle Swarm Optimization (PSO) and Accelerated PSO feature selection algorithms.
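A minimal sketch of simulated-annealing feature selection with a bit-flip mutation operator, in the spirit of ASAMO (the acceleration details and the paper's actual fitness are not reproduced; the score here is a simple correlation-based stand-in):

```python
import numpy as np

def sa_feature_select(X, y, iters=300, t0=1.0, cool=0.99, seed=7):
    """Simulated annealing over feature masks: a random bit-flip 'mutation'
    is accepted if it improves the score, or with a temperature-driven
    probability otherwise; the temperature decays each iteration."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]

    def score(mask):
        if mask.sum() == 0:
            return -1.0
        corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in np.flatnonzero(mask)]
        return float(np.sum(corr)) - 0.05 * mask.sum()

    mask = rng.random(d) < 0.5
    cur_s = score(mask)
    best, best_s, t = mask.copy(), cur_s, t0
    for _ in range(iters):
        trial = mask.copy()
        j = rng.integers(d)                # mutation operator: flip one feature
        trial[j] = ~trial[j]
        s = score(trial)
        if s > cur_s or rng.random() < np.exp((s - cur_s) / t):
            mask, cur_s = trial, s
            if s > best_s:
                best, best_s = trial.copy(), s
        t *= cool                          # cooling schedule
    return best

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 10))
y = X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=200)   # features 0 and 1 matter
mask = sa_feature_select(X, y)
```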
R. Shalini and S. Sasikala
IEEE
Visual perception is vital to human life. Although several medical conditions can cause retinal disease, the most common cause is diabetes. Diabetic Retinopathy (DR) can be identified using retinal fundus images. Detecting and classifying the deformations of diabetic retinopathy is a challenging task since the disease is symptomless in its early stages. Several algorithms for identifying these abnormalities are analyzed. The models reviewed include preprocessing techniques to standardize the image; post-processing techniques for morphological adjustments; segmentation algorithms for extracting the Lesions of Interest (LOI), namely white lesions and red lesions; feature extraction methods that extract features such as microaneurysms, haemorrhages, exudates and cotton wool spots; and, finally, classification methods that conclude the presence or absence of DR symptoms, grading the severity by the count of features extracted from the given retinal image. This survey aims towards a novel algorithm that identifies and detects the above-mentioned abnormalities and determines their severity, with accuracy approaching 100%.
S Appavu Alias Balamurugan, J Felicia Lilian, and S Sasikala
IEEE
The integration of various embedded technological devices makes our environment IoT-enabled, and the Internet of Things plays a major role in bringing smartness into everyday life. The smart transportation system is one such example: it focuses on providing safe, efficient and reliable transportation for the public, along with a sustainable environment. Providing an Intelligent Transportation System (ITS) faces many resource-supply problems, such as controlling air pollution, reducing traffic congestion and coping with continuous population growth. Therefore, a deep analysis of trajectory data is required; these data have to be analyzed and predicted using Big Data analytics techniques. To relieve traffic congestion, a proper data mining methodology must be adopted that paves the way for prioritized vehicles or frees congested routes. These technologies will give the public sector proper guidance in enabling a healthier environment. GPS sensor data are analyzed as trajectory big data, which enables a smarter environment. This paper discusses the various technological implementations that have previously been carried out to develop smart intelligent transportation, and points out the different mechanisms for executing such a system effectively.
S. Sasikala and D. Renuka Devi
IEEE
With recent developments in hardware and software technologies, streaming data is used everywhere in today's environment, and it is very difficult to store, process, investigate and visualize such huge volumes of data. One of the most important and challenging issues in the data stream domain is the classification of big datasets, as conventional classification methods, when run in a streaming environment, demand large amounts of memory and long execution times. Three further major issues in data stream classification are the huge length of streams, concept drift and Feature Selection (FS). In this review paper, we consider the difficult problem of FS algorithms for streaming data, in which the size of the feature set is unknown in advance, leading to inflexible demands on computation, and not every feature is available to the classifier model. To address this difficulty, Swarm Intelligence (SI) algorithms are applied to high-dimensional streaming big dataset samples, resulting in increased classification accuracy, lower memory consumption and shorter running times when compared to the existing streaming FS algorithms on various datasets. The SI-based FS algorithms overcome the limitations of the traditional FS algorithms.
P. Punithavathi, S. Geetha, and S. Sasikala
ACM
A cancelable biometric system is a transformation technique for securing biometric templates. This work proposes the application of a bi-level template-securing technique at feature level, generating revocable templates. The bi-level transformation applies Discrete Fourier Transform and partial Hadamard based transformations to iris features, using a user-specific key. Applied at feature level, the proposed bi-level transformation provides better robustness and security against correlation attacks. A comprehensive analysis of the proposed approach studies its non-invertibility, diversity, revocability and matching performance on iris samples. The experimental results show that the proposed approach is promising and delivers good performance.
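The bi-level idea can be sketched as a DFT followed by a key-selected partial-Hadamard projection. This is a toy reconstruction from the abstract, with an invented feature vector and key handling, not the paper's transform:

```python
import numpy as np

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def cancelable_template(features, user_key, keep=8):
    """Bi-level sketch: DFT of the feature vector, then projection onto a
    key-selected subset of Hadamard rows. Re-issuing with a new key
    revokes the old template without re-enrolling the biometric."""
    spectrum = np.abs(np.fft.fft(features))            # level 1: DFT magnitudes
    rng = np.random.default_rng(user_key)              # user-specific key seeds
    rows = rng.choice(len(features), keep, replace=False)  # the partial Hadamard
    H = hadamard(len(features))
    return H[rows] @ spectrum                          # level 2: keyed projection

iris_features = np.sin(np.arange(16) * 0.7)   # stand-in iris feature vector
t1 = cancelable_template(iris_features, user_key=42)
t2 = cancelable_template(iris_features, user_key=43)   # revoked: new key, new template
same = cancelable_template(iris_features, user_key=42)
```

The same biometric with the same key reproduces the template; changing the key yields an unrelated one, which is the revocability property.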
S. Sasikala, S. Appavu alias Balamurugan, and S. Geetha
Elsevier BV
Optimal feature selection is an imperative area of research in medical data mining systems. Feature selection is an important factor that boosts classification accuracy. In this paper we propose an adaptive feature selector based on game theory and an optimization approach, for an investigation into improving detection accuracy and optimal feature subset selection. In particular, the embedded Shapley Value includes two memetic operators, namely "include" and "remove" features (or genes), to realize the genetic algorithm (GA) solution. The use of GA for feature selection facilitates quick improvement of the solution through a fine-tuned search. An extensive experimental comparison of the proposed method with conventional learning methods such as Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbor (KNN), J48 (C4.5) and Artificial Neural Network (ANN), on 22 benchmark datasets (both synthetic and microarray) from the UCI Machine Learning repository and the Kent Ridge repository, confirms that the proposed SVEGA strategy is effective and efficient in removing irrelevant and redundant features. We compare against representative wrapper, filter and conventional GA methods, and show that this novel memetic algorithm, SVEGA, yields overall promising results over conventional feature selection methods in terms of the evaluation criteria, namely classification accuracy, number of selected genes, running time and other metrics.
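A toy sketch of a GA with "include"/"remove" local operators. Here a simple correlation score stands in for the Shapley-value guidance, so this is only the shape of the algorithm, not SVEGA itself:

```python
import numpy as np

def ga_with_memetic_ops(X, y, pop=12, gens=20, seed=9):
    """GA over feature masks with two local 'memetic' operators applied to
    each offspring: include the best unused feature, and try removing the
    weakest used one (a stand-in for Shapley-guided include/remove)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    rel = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(d)])

    def fit(m):
        return float(rel[m].sum()) - 0.05 * m.sum() if m.any() else -1.0

    P = rng.random((pop, d)) < 0.5
    for _ in range(gens):
        order = np.argsort([fit(m) for m in P])[::-1]
        parents = P[order[: pop // 2]]                 # elitist selection
        children = []
        for _ in range(pop - len(parents)):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, d)
            child = np.concatenate([a[:cut], b[cut:]])        # crossover
            child[rng.random(d) < 0.05] ^= True               # mutation
            child[np.argmax(np.where(child, -np.inf, rel))] = True   # include
            trial = child.copy()
            trial[np.argmin(np.where(child, rel, np.inf))] = False   # remove
            children.append(trial if fit(trial) >= fit(child) else child)
        P = np.vstack([parents, children])
    return P[np.argmax([fit(m) for m in P])]

rng = np.random.default_rng(10)
X = rng.normal(size=(150, 12))
y = 2 * X[:, 3] + 0.1 * rng.normal(size=150)   # only feature 3 is informative
mask = ga_with_memetic_ops(X, y)
```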
S. Sasikala, S. Appavu alias Balamurugan, and S. Geetha
Emerald
Abstract Selection of optimal features is an important area of research in medical data mining systems. In this paper we introduce an efficient four-stage procedure, comprising feature extraction, feature subset selection, feature ranking and classification, called Multi-Filtration Feature Selection (MFFS), for an investigation into improving detection accuracy and optimal feature subset selection. The proposed method adjusts a parameter named "variance coverage" and builds the model with the value at which maximum classification accuracy is obtained. This facilitates the selection of a compact set of superior features, remarkably at a very low cost. An extensive experimental comparison of the proposed method and other methods, using four different classifiers (Naive Bayes (NB), Support Vector Machine (SVM), multilayer perceptron (MLP) and the J48 decision tree) and 22 different medical datasets, confirms that the proposed MFFS strategy yields promising results on feature selection and classification accuracy for the medical data mining field of research.
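The "variance coverage" parameter can be read as the fraction of total variance that a reduced representation must explain. A PCA-based sketch of that reading (an assumption for illustration; the paper's exact procedure may differ):

```python
import numpy as np

def components_for_coverage(X, coverage=0.95):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches the requested 'variance coverage'."""
    Xc = X - X.mean(axis=0)
    var = np.linalg.svd(Xc, compute_uv=False) ** 2   # variance per component
    ratio = np.cumsum(var) / var.sum()
    return int(np.searchsorted(ratio, coverage) + 1)

# synthetic data: three strong directions (stds 3, 2, 1) plus 7 noise columns
rng = np.random.default_rng(11)
X = np.hstack([s * rng.normal(size=(200, 1)) for s in (3.0, 2.0, 1.0)]
              + [0.05 * rng.normal(size=(200, 7))])
k = components_for_coverage(X, coverage=0.95)
```

Raising the coverage keeps more components; the model is then rebuilt at the coverage value giving the best downstream accuracy.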
R. Dhinesh Kumar, A. Balaji Ganesh, and S. Sasikala
Indian Society for Education and Environment
Background: Automatic Speaker Identification (SID) systems have been a major breakthrough, crucial in many real-world applications. Methods: This work addresses the SID task based on GMM-SVM in a three-stage process. Firstly, the Gammatone Frequency Cepstral Coefficients (GFCC) and Mean Hilbert Envelope Coefficients (MHEC) of the speakers are extracted. Secondly, these features are modeled using a Gaussian Mixture Model (GMM); by mean-adapting the extracted acoustic features, the corresponding supervectors are found, and these vectors are trained using a Support Vector Machine (SVM). Finally, the actual recognition is done by feeding the supervectors of the masked noisy test utterance, obtained by the Ideal Binary Mask (IBM), into the SVM model, and recognition accuracy is compared for GFCC, MHEC and RASTA-MFCC under different noisy conditions. Findings: Evaluation results show that SID performance with MHEC is considerably better than with the other two features. Applications: Major areas that implement automatic SID are forensics, surveillance and audio biometrics.
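The GMM mean-supervector construction can be sketched as follows. For brevity, k-means stands in for GMM/UBM training and a nearest-supervector rule stands in for the SVM; the features are synthetic stand-ins for GFCC/MHEC frames:

```python
import numpy as np

def kmeans(X, k, iters=20, seed=12):
    """Tiny k-means standing in for UBM training (diagonal GMM means)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C) ** 2).sum(axis=2), axis=1)
        C = np.array([X[lab == j].mean(axis=0) if (lab == j).any() else C[j]
                      for j in range(k)])
    return C

def supervector(frames, ubm, r=10.0):
    """MAP-adapt the UBM means towards an utterance's frames, then stack them."""
    lab = np.argmin(((frames[:, None] - ubm) ** 2).sum(axis=2), axis=1)
    sv = ubm.copy()
    for j in range(len(ubm)):
        n = (lab == j).sum()
        if n:
            alpha = n / (n + r)            # relevance-factor MAP weight
            sv[j] = alpha * frames[lab == j].mean(axis=0) + (1 - alpha) * ubm[j]
    return sv.ravel()

rng = np.random.default_rng(13)
spk_a = rng.normal(0.0, 1.0, (100, 4))     # stand-in acoustic frames, speaker A
spk_b = rng.normal(1.5, 1.0, (100, 4))     # stand-in acoustic frames, speaker B
ubm = kmeans(np.vstack([spk_a, spk_b]), k=4)
sv_a, sv_b = supervector(spk_a, ubm), supervector(spk_b, ubm)
probe = supervector(rng.normal(0.0, 1.0, (50, 4)), ubm)   # new utterance, speaker A
pred = "A" if np.linalg.norm(probe - sv_a) < np.linalg.norm(probe - sv_b) else "B"
```

In the actual system, these supervectors are the inputs on which the SVM is trained and scored.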
S. Sasikala, S. Appavu alias Balamurugan, and S. Geetha
Czech Technical University in Prague - Central Library
Abstract: This work is motivated by the interest in feature selection that greatly affects the detection accuracy of a classifier. The goals of this paper are (i) identifying optimal feature subset using a novel wrapper based feature selection algorithm called Shapley Value Embedded Genetic Algorithm (SVEGA), (ii) showing the improvement in the detection accuracy of the Artificial Neural Network (ANN) classifier with the optimal features selected, (iii) evaluating the performance of proposed SVEGA-ANN model on the medical datasets. The medical diagnosis system has been built using a wrapper based feature selection algorithm that attempts to maximize the specificity and sensitivity (in turn the accuracy) as well as by employing an ANN for classification. Two memetic operators namely “include” and “remove” features (or genes) are introduced to realize the genetic algorithm (GA) solution. The use of GA for feature selection facilitates quick improvement in the solution through a fine tune search. An extensive experimental evaluation of the proposed SVEGA-ANN method on 26 benchmark datasets from UCI Machine Learning repository and Kent ridge repository, with three conventional classifiers, outperforms state-of-the-art systems in terms of classification accuracy, number of selected features and running time.
S. Sasikala, S. Appavu alias Balamurugan, and S. Geetha
Elsevier BV
Abstract In this paper we propose a novel Shapley Value Embedded Genetic Algorithm, called SVEGA, that improves breast cancer diagnosis accuracy by selecting the gene subset from high dimensional gene data. In particular, the embedded Shapley Value includes two memetic operators, namely "include" and "remove" features (or genes), to realize the genetic algorithm (GA) solution. The method ranks the genes according to their capability to differentiate the classes and selects the genes that maximize the capability to discriminate between different classes. Thus, the dimensionality of the data features is reduced and the classification accuracy rate is improved. Four classifiers, Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbor (KNN) and J48, are used on the breast cancer dataset from the Kent Ridge biomedical repository to classify between normal and abnormal tissues and to diagnose tumours as benign or malignant. The obtained classification accuracy demonstrates that the proposed method contributes to superior diagnosis of breast cancer compared with existing methods.
S. Sasikala, S. Appavu alias Balamurugan, and S. Geetha
Springer India
Dimensionality reduction is an essential problem in data analysis that has received a significant amount of attention from several disciplines. It includes two types of methods, i.e., feature extraction and feature selection. In this paper, we introduce a simple method of supervised feature selection for data classification tasks. The proposed hybrid feature selection (HFS) mechanism, RF-SEA (ReliefF-Shapley Ensemble Analysis), combines filter and wrapper models for dimension reduction. In the first stage, we use the filter model to rank the features by ReliefF (RF) between classes and then choose the features most relevant to the classes with the help of a threshold. In the second stage, we use the Shapley ensemble algorithm to evaluate the contribution of the features in the ranked subset to the classification task; principal component analysis (PCA) is carried out as a preprocessing step before both stages. Experiments with several medical datasets prove that our proposed approach is capable of detecting completely irrelevant features and removing redundant features without significantly hurting the performance of the classification algorithm. The experimental results also clearly show that the RF-SEA method can obtain better classification performance than methods based solely on Shapley values or on the ReliefF (RF) algorithm.
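The first-stage ReliefF ranking can be sketched with the basic binary-class Relief update (a simplification of ReliefF, which averages over k neighbours and handles multi-class):

```python
import numpy as np

def relief_weights(X, y, n_samples=100, seed=14):
    """Basic binary-class Relief: reward features that differ at the nearest
    'miss' (other class) and agree at the nearest 'hit' (same class)."""
    rng = np.random.default_rng(seed)
    X = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)   # scale to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.integers(0, len(X), n_samples):
        d = np.abs(X - X[i]).sum(axis=1)
        d[i] = np.inf                      # exclude the sample itself
        same, diff = (y == y[i]), (y != y[i])
        hit = np.argmin(np.where(same, d, np.inf))
        miss = np.argmin(np.where(diff, d, np.inf))
        w += np.abs(X[miss] - X[i]) - np.abs(X[hit] - X[i])
    return w / n_samples

rng = np.random.default_rng(15)
y = rng.integers(0, 2, 200)
X = np.column_stack([y + 0.2 * rng.normal(size=200),   # informative feature
                     rng.normal(size=200)])            # irrelevant feature
w = relief_weights(X, y)
```

Features scoring above a threshold on these weights would pass to the second-stage Shapley ensemble analysis.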