An insight into tree based machine learning techniques for big data analytics using Apache Spark Ananthi Sheshasaayee, J V N Lakshmi 2017 International Conference on Intelligent Computing Instrumentation and Control Technologies ICICICT 2017, 2017 Data analysis, classification and regression on Big Data using machine learning algorithms is a novel approach used in various science and medical streams. MapReduce is a widely used framework for parallelizing machine learning algorithms. These algorithms are tuned to attain the best outcomes. This technique takes a lot of time, because the MapReduce model is run multiple times while the parameters are tuned as required. To shorten the time spent tuning the jobs, an Apache Spark based model is proposed for optimizing the assignments. The model predicts temperature from existing data by training with tree-based machine learning techniques. It replaces MapReduce with Spark to obtain the best prediction result. The prediction outcomes of the tree-structured ML methods are compared with respect to time and space utilization.
Machine learning approaches on map reduce for Big Data analytics J V N Lakshmi, Ananthi Sheshasaayee Proceedings of the 2015 International Conference on Green Computing and Internet of Things ICGCIoT 2015, 2016 Analyzing enormous datasets and performing the necessary processing on massive data structures, together with the associated algorithms and systems, calls for a novel trend, which is framed by Big Data. The architecture of Big Data ranges across multiple machines and clusters with special-purpose subsystems. Data produced by several sources must be analyzed and organized in very little time. To potentially speed up processing, a unified machine learning approach is applied on the MapReduce framework. MapReduce, a broadly applicable programming model, is applied to different learning algorithms from the machine learning family to support business decisions. Using ML algorithms with Hadoop for better storage distribution improves time and processing speed. This paper presents parallel implementations of various machine learning algorithms on top of the MapReduce model for time and processing efficiency.
A theoretical model for big data analytics using machine learning algorithms Ananthi Sheshasaayee, J. V. N. Lakshmi ACM International Conference Proceeding Series, 2015 Big Data processing is becoming increasingly important in the modern era due to the continuous growth in the amount of data generated in various fields. The architecture for Big Data usually ranges across multiple machines and clusters consisting of various subsystems. To potentially speed up processing, a unified machine learning approach is applied on the MapReduce framework. MapReduce, a broadly applicable programming model, is applied to different learning algorithms from the machine learning family to support business decisions. This paper presents parallel implementations of various machine learning algorithms, including K-Means and Logistic Regression, on top of the MapReduce model.
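
As a rough illustration of the kind of parallel formulation these abstracts describe, one K-Means iteration can be written as a map step (assign each point to its nearest centroid) and a reduce step (average the points assigned to each centroid). The sketch below is plain single-process Python, not Hadoop or Spark code, and it is not taken from the papers; the function names (`map_phase`, `reduce_phase`, `kmeans`) and the sample points are assumptions made for illustration only.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Return the index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(points, centroids):
    """Map: emit (centroid_index, (point, 1)) for every input point."""
    for point in points:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs, centroids):
    """Reduce: sum points and counts per key, then average into new centroids."""
    sums = {}
    counts = defaultdict(int)
    for key, (point, n) in pairs:
        if key in sums:
            sums[key] = tuple(s + p for s, p in zip(sums[key], point))
        else:
            sums[key] = point
        counts[key] += n
    # A centroid that received no points keeps its previous position.
    return [
        tuple(s / counts[i] for s in sums[i]) if i in sums else centroids[i]
        for i in range(len(centroids))
    ]


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of map/reduce rounds (hypothetical driver loop)."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
result = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
# result == [(1.0, 1.5), (9.5, 9.0)]
```

In a real MapReduce or Spark job the map step would run on separate partitions of the data and the per-key sums would be combined across workers; here both phases run in one process so that only the structure of the computation is visible.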