Shina Sheen

@psgtech.edu

Associate Professor, Department of Applied Mathematics and Computational Sciences
PSG College of Technology

https://researchid.co/shinasheen

RESEARCH INTERESTS

Malware Detection, Network Security, Data Mining

17 Scopus Publications

Scopus Publications

  • Focused Crawler Based on Reinforcement Learning and Decaying Epsilon-Greedy Exploration Policy
    Parisa Begum Kaleel and Shina Sheen

    Zarqa University
    In order to serve a diversified user base with a range of purposes, general search engines offer search results for a wide variety of topics and material categories on the internet. While Focused Crawlers (FC) deliver more specialized and targeted results inside particular domains or verticals, general search engines give a wider coverage of the web. For a vertical search engine, the performance of a focused crawler is extremely important, and several ways of improvement are applied. We propose an intelligent, focused crawler which uses Reinforcement Learning (RL) to prioritize the hyperlinks for long-term profit. Our implementation differs from other RL-based works by encouraging learning at an early stage using a decaying ϵ-greedy policy to select the next link, and hence enables the crawler to use the experience gained to improve its performance with more relevant pages. With infertility rates increasing all over the world, many people need information about the condition and about the artificial reproduction treatments available. Hence, we have considered the infertility domain as a case study and collected web pages from scratch. We compare the performance of crawling tasks following ϵ-greedy and decaying ϵ-greedy policies. Experimental results show that crawlers following a decaying ϵ-greedy policy demonstrate better performance.
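
    The difference between a fixed and a decaying ϵ-greedy policy mentioned in this abstract can be illustrated with a minimal Python sketch. The exponential decay schedule, the parameter values, and the names `decayed_epsilon` and `select_link` are assumptions made for illustration; they are not taken from the paper.

```python
import math
import random


def decayed_epsilon(step, eps_start=1.0, eps_min=0.05, decay_rate=0.01):
    """Decay epsilon exponentially from eps_start toward eps_min (assumed schedule)."""
    return eps_min + (eps_start - eps_min) * math.exp(-decay_rate * step)


def select_link(q_values, epsilon, rng=random):
    """Choose the next link: explore with probability epsilon, otherwise exploit.

    q_values maps each candidate link to its estimated long-term value.
    """
    links = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(links)          # explore: pick any link at random
    return max(links, key=q_values.get)   # exploit: pick the best-valued link


if __name__ == "__main__":
    frontier = {"page_a": 0.2, "page_b": 0.9, "page_c": 0.5}  # hypothetical values
    for step in (0, 100, 500):
        eps = decayed_epsilon(step)
        print(step, round(eps, 3), select_link(frontier, eps))
```

    With a fixed ϵ the crawler explores at the same rate throughout the crawl; with the decaying schedule it explores heavily at first and relies increasingly on learned link values later.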

  • R-Sentry: Deception based ransomware detection using file access patterns
    Shina Sheen, K A Asmitha, and Sridhar Venkatesan

    Elsevier BV



  • Darknet Traffic Analysis and Classification Using Numerical AGM and Mean Shift Clustering Algorithm
    R. Niranjana, V. Anil Kumar, and Shina Sheen

    Springer Science and Business Media LLC
    Cyberspace continues to grow more complex than ever anticipated, and the same is true of its security dynamics. As our dependence on cyberspace increases day by day, regular and systematic monitoring of cyberspace security has become essential. A darknet is one such monitoring framework for deducing malicious activities and attack patterns in cyberspace. Darknet traffic is the spurious traffic observed in the empty address space, i.e., a set of globally valid Internet Protocol (IP) addresses which are not assigned to any hosts or devices. In an ideal secure network system, no traffic is expected to arrive on such a darknet IP space. However, in reality, a noticeable amount of traffic is observed in this space, primarily due to Internet-wide malicious activities and attacks, and sometimes due to network-level misconfigurations. Analyzing such traffic and finding the distinct attack patterns present in it can be a potential mechanism to infer the attack trends in the real network. In this paper, the existing Basic and Extended AGgregate and Mode (AGM) data formats for darknet traffic analysis are studied, and an efficient 29-tuple Numerical AGM data format suitable for analyzing source-IP-address-validated TCP connections (three-way handshake) is proposed to find attack patterns in this traffic using the Mean Shift clustering algorithm. Analyzing the patterns detected from the clusters provides traces of various attacks such as Mirai bot, SQL attack, and brute force. Analyzing source-IP-validated TCP darknet traffic is a potential technique in cyber security to find the attack trends in the network.

  • Ransomware detection by mining API call usage
    Shina Sheen and Ashwitha Yadav

    IEEE
    In the recent past, one of the most harmful forms of malware seen is ransomware. The year 2016 saw a huge rise in ransomware attacks. According to a study by Tripwire, ransomware did the most damage to organizations in 2017, followed by DDoS, malicious insiders, phishing, and known/unknown vulnerabilities. In this work, Application Programming Interface (API) calls are extracted from the executables, and the most discriminating API calls are used to train a classifier to detect unknown ransomware. We have tested our method on various classifiers such as decision trees, KNN, and random forest. Class imbalance due to the difference in the number of samples available in the two classes, ransomware and benign, is also considered. Random forest with SMOTE for class imbalance gave a detection rate of over 98%. A large number of ransomware samples have been analyzed and the discriminating API calls have been identified.

  • Preface


  • Computational intelligence, cyber security and computational models: Proceedings of ICC3 2015


  • Multilevel analysis to detect covert social botnet in multimedia social networks
    V. Natarajan, S. Sheen, and R. Anitha

    Oxford University Press (OUP)
    In recent years, social botnets have become a major security threat to both online social networking websites and their users. Social bots communicate over probabilistically unobservable communication channels and steal sensitive information from their victims. Stegobot is a social botnet which uses image steganography to hide the presence of communication. Since these botnets exhibit unique propagation methods, existing botnet detection techniques cannot identify these bots. In this paper, we propose an effective method to detect Stegobot hosts within a monitored social network. Based on our observations, Stegobot often has a differentiable communication pattern because of its unique design and implementation. Hence, by investigating each host's profile activity, it is possible to determine whether the profile is a Stegobot or normal. Our experiments show that Stegobot and normal traffic patterns can be classified efficiently using multilevel social network profile analysis. In addition to the ability to detect bot traffic, a classification model is constructed using profile-level and content-level analysis to improve the detection ability. The experimental results show that the proposed method can detect Stegobot profiles with more than 97% accuracy and a false-positive rate lower than 3%.

  • Malware detection in android files based on multiple levels of learning and diverse data sources
    Shina Sheen and Anitha Ramalingam

    ACM Press
    Smart mobile device usage has expanded at a very high rate all over the world. Mobile devices have experienced a rapid shift from pure telecommunication devices to small ubiquitous computing platforms. They run sophisticated operating systems that need to confront the same risks as desktop computers, with Android as the most targeted platform for malware. Processing power is one of the factors that differentiate PCs and mobile phones. Mobile phones are more compact and therefore limited in memory, and they depend on limited battery power for their energy needs. Hence, apps developed to run on these devices should take the above-mentioned factors into consideration. To improve the speed of detection, a multilevel detection mechanism using diverse data sources is designed for detecting malware, balancing detection accuracy against the use of less compute-intensive operations. In this work, we have analyzed Android-based malware and designed a multilevel detection mechanism using diverse data sources. We have evaluated our work on a collection of Android-based malware comprising different malware families, and our results show that the proposed method is faster with good performance.

  • Android based malware detection using a multifeature collaborative decision fusion approach
    Shina Sheen, R. Anitha, and V. Natarajan

    Elsevier BV
    Smart mobile device usage has expanded at a very high rate all over the world. Since mobile devices are nowadays used for a wide variety of application areas such as personal communication, data storage, and entertainment, security threats emerge that are comparable to those to which a conventional PC is exposed. Mobile malware has been growing in scale and complexity as smartphone usage continues to rise. Android has surpassed other mobile platforms as the most popular, whilst also witnessing a dramatic increase in malware targeting the platform. In this work, we have considered Android-based malware for analysis, and a scalable detection mechanism is designed using multifeature collaborative decision fusion (MCDF). Different features of a malicious file, such as the permission-based features and the API-call-based features, are considered in order to provide better detection by training an ensemble of classifiers and combining their decisions using a collaborative approach based on probability theory. The performance of the proposed model is evaluated on a collection of Android-based malware comprising different malware families, and the results show that our approach gives better performance than the state-of-the-art ensemble schemes available.

  • Comparative study of two-and multi-class-classification-based detection of malicious executables using soft computing techniques on exhaustive feature set
    Shina Sheen, R. Karthik, and R. Anitha

    Springer India
    Detection of malware using soft computing methods has been explored extensively by many malware researchers to enable fast and infallible detection of newly released malware. In this work, we conducted a comparative study of two- and multi-class-classification-based detection of malicious executables using soft computing techniques on an exhaustive feature set. During this comparative study, a rigorous analysis of static features, extracted from benign and malicious files, was conducted. For the analysis, a generic framework was devised and is presented in this paper. The Reference Data Set (RDS) from the National Software Reference Library (NSRL) was explored in this study as a means of filtering out benign files during analysis. Finally, through well-corroborated experiments, it is shown that AdaBoost, when combined with algorithms such as C4.5 and random forest with two-class classification, outperforms many other soft-computing-based techniques.

  • Malware detection by pruning of parallel ensembles using harmony search
    Shina Sheen, R. Anitha, and P. Sirisha

    Elsevier BV
    Detection of malware using data mining techniques has been explored extensively. Techniques used for detecting malware based on structural features rely on being able to identify anomalies in the structure of executable files. The structural attributes of an executable that can be extracted include byte n-grams, Portable Executable (PE) features, API call sequences, and strings. After a thorough analysis, we have extracted various features from executable files and applied them to an ensemble of classifiers to efficiently detect malware. Ensemble methods combine several individual pattern classifiers in order to achieve better classification. The challenge is to choose the minimal number of classifiers that achieves the best performance. An ensemble that contains too many members might incur large storage requirements and even reduce the classification performance. Hence the goal of ensemble pruning is to identify a subset of ensemble members that performs at least as well as the original ensemble and to discard the other members. In this paper, we propose a novel idea of pruning an ensemble using Harmony search, which is a music-inspired algorithm. The pruned ensemble is then used for malware detection. Multiple heterogeneous classifiers arranged in parallel are used for constructing the ensemble, and harmony search is used to choose the best set of classifiers from the ensemble to get the pruned set. From the experimental results, it is evident that our algorithm achieves high detection accuracy and outperforms the existing ensemble algorithms.

  • Detection of StegoBot: A covert social network botnet
    V. Natarajan, Shina Sheen, and R. Anitha

    ACM Press
    StegoBot is a recently discovered social network security threat that allows probabilistically unobservable communication through social networks. The main aim of a StegoBot is to spread social malware and steal information from targeted machines. StegoBot takes advantage of image steganography to hide the presence of communication within the image-sharing behavior of user interaction. In this paper, we present a scheme to detect StegoBot. We analyzed different entropies of images to show that image files are generally very sensitive to embedding. Ensemble classification is employed here as a powerful tool that allows fast detection of StegoBot. The power of the proposed framework is demonstrated on different social networks with different evaluation metrics.
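
    As a rough illustration of the kind of entropy feature this abstract refers to, the sketch below computes first-order Shannon entropy over a sequence of pixel values. The abstract does not say which entropy measures the authors used, so the choice of measure and the name `shannon_entropy` are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of a sequence of symbols (e.g. 8-bit pixel values)."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    pixels = [12, 13, 12, 200, 201, 12, 13, 13]    # hypothetical grayscale samples
    print(shannon_entropy(pixels))                  # entropy of the pixel values
    print(shannon_entropy([p & 1 for p in pixels])) # entropy of the least significant bits
```

    Comparing such values for an image before and after embedding is one simple way to see how sensitive the statistics are to hidden data; a real detector would feed several features of this kind to a classifier.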

  • A novel node splitting criteria for decision trees based on Theil index
    Shina Sheen and R. Anitha

    Springer Berlin Heidelberg
    The performance of detectors using decision trees can be improved by reducing the average height of the tree for faster detection. We propose a new attribute splitting criterion for decision tree construction using the concept of the Theil index. The Theil index is a statistic used to measure economic inequality. Results show a decrease in average height compared to frequently used trees such as ID3 and C4.5, which use impurity measures as the splitting criterion. Detection of malware using data mining techniques has been explored extensively. Techniques used for detecting malware based on structural features rely on being able to identify anomalies in the structure of executable files. These features might indicate that the file was created or infected to perform malicious activity. They are applied to a decision tree using the Theil index as the splitting criterion for classification as malware or benign files.
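
    For reference, the Theil T index of non-negative values x_1..x_N with mean μ is (1/N) Σ (x_i/μ) ln(x_i/μ). The sketch below computes only this statistic; how the paper converts it into a split score for an attribute is not stated in the abstract, and the function name `theil_index` is an assumption.

```python
import math


def theil_index(values):
    """Theil T index of non-negative values; 0 means all values are equal."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    total = 0.0
    for x in values:
        if x > 0:                       # the term (x/mean)*ln(x/mean) tends to 0 as x -> 0
            ratio = x / mean
            total += ratio * math.log(ratio)
    return total / n


if __name__ == "__main__":
    print(theil_index([5, 5, 5, 5]))    # evenly spread counts
    print(theil_index([0, 0, 0, 8]))    # everything concentrated in one bin
```

    The index is 0 when the values are perfectly even and reaches its maximum, ln N, when everything is concentrated in a single value.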

  • Ensemble pruning using Harmony search
    Shina Sheen, S. V. Aishwarya, R. Anitha, S. V. Raghavan, and S. M. Bhaskar

    Springer Berlin Heidelberg
    In recent years, a number of works proposing the combination of multiple classifiers to produce a single classification have been reported. The resulting classifier, referred to as an ensemble classifier, is generally found to be more accurate than any of the individual classifiers making up the ensemble. In an ensemble of classifiers, it is hoped that each individual classifier will focus on different aspects of the data and err under different circumstances. By combining a set of so-called base classifiers, the deficiencies of each classifier may be compensated by the efficiency of the others. Ensemble pruning deals with the reduction of an ensemble of predictive models in order to improve its efficiency and performance. Ensemble pruning can be considered as an optimization problem. In our work, we propose the use of Harmony search, a music-inspired algorithm, to prune and select the best combination of classifiers. The work is compared with AdaBoost and Bagging, among other popular ensemble methods, and our method is shown to perform better than the other methods. We have also compared our work with an ensemble pruning technique based on a genetic algorithm, and our model has shown better accuracy.

  • Network intrusion detection using feature selection and decision tree classifier
    Shina Sheen and R. Rajesh

    IEEE
    The security of computers and the networks that connect them is becoming increasingly significant. Machine learning techniques such as decision trees have been applied to the field of intrusion detection. Machine learning techniques can learn normal and anomalous patterns from training data and generate classifiers that are used to detect attacks on computer systems. In general, the input to classifiers is in a high-dimensional feature space, but not all features are relevant to the classes to be classified. Feature selection is a very important step in classification, since the inclusion of irrelevant and redundant features often degrades the performance of classification algorithms in both speed and accuracy. In this paper, we have considered three different approaches for feature selection, Chi-square, Information Gain, and ReliefF, all of which are based on the filter approach. A comparative study of the three approaches is done using a decision tree as the classifier. The KDDcup 99 data set is used to train and test the decision tree classifiers.
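
    Of the three filter methods named in this abstract, Information Gain is the simplest to sketch. The Python below ranks discrete features by the reduction in label entropy they give; the helper names and the toy records are hypothetical and do not reproduce the paper's experimental setup.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after partitioning on a discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels):
    """Return feature names sorted by information gain, highest first.

    rows is a list of dicts mapping feature name -> discrete value.
    """
    names = list(rows[0].keys())
    gains = {n: information_gain([r[n] for r in rows], labels) for n in names}
    return sorted(gains, key=gains.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy records, not real KDDcup 99 rows.
    rows = [
        {"flag": "S0", "protocol_type": "tcp"},
        {"flag": "S0", "protocol_type": "udp"},
        {"flag": "SF", "protocol_type": "tcp"},
        {"flag": "SF", "protocol_type": "udp"},
    ]
    labels = ["attack", "attack", "normal", "normal"]
    print(rank_features(rows, labels))
```

    In a filter approach like this, features are scored independently of the classifier, and only the top-ranked ones are passed on to train the decision tree.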

RECENT SCHOLAR PUBLICATIONS