R: Mining spatial, text, web, and social media data
by Bater Makhabel, Pradeepta Mishra, Nathan Danneman, and Richard Heimann
Publisher - Packt Publishing
Category - Engineering & IT
Key Features
Develop a strong strategy to solve predictive modeling problems using the most popular data mining algorithms
Work through real-world case studies that take you from novice to intermediate in applying data mining techniques
Deploy cutting-edge sentiment analysis techniques on real-world social media data using R

Book Description
Data mining is the first step to understanding data and making sense of heaps of data. Properly mined data forms the basis of all data analysis and computing performed on it. This learning path will take you from the very basics of data mining to advanced data mining techniques, and will end with a specialized branch of data mining: social media mining.

You will learn how to manipulate data with R using code snippets and how to mine frequent patterns, associations, and correlations while working with R programs. You will discover how to write code for various prediction models, stream data, and time-series data. You will also be introduced to solutions written in R based on R Hadoop projects.

Once you are comfortable with data mining in R, you will move on to applying your knowledge in end-to-end data mining projects. You will learn how to apply different mining concepts to various statistical and data applications in a wide range of fields. At this stage, you will be able to complete complex data mining cases and handle any issues you might encounter during projects.

After this, you will gain hands-on experience of generating insights from social media data. You will get detailed instructions on how to obtain, process, and analyze a variety of socially generated data, along with the theoretical background needed to interpret your findings accurately. You will be shown R code and examples of data that can be used as a springboard as you get the chance to undertake your own analyses of business, social, or political data.

This Learning Path combines some of the best that Packt has to offer in one complete, curated package.
It includes content from the following Packt products:
Learning Data Mining with R by Bater Makhabel
R Data Mining Blueprints by Pradeepta Mishra
Social Media Mining with R by Nathan Danneman and Richard Heimann

What you will learn
Discover how to manipulate data in R
Get to know top classification algorithms written in R
Explore solutions written in R based on R Hadoop projects
Apply data management skills in handling large data sets
Acquire knowledge about neural network concepts and their applications in data mining
Create predictive models for classification, prediction, and recommendation
Use various libraries on R CRAN for data mining
Discover more about data potential, the pitfalls, and inferential gotchas
Gain an insight into the concepts of supervised and unsupervised learning
Delve into exploratory data analysis
Understand the minute details of sentiment analysis

About the Authors
Bater Makhabel (LinkedIn: BATERMJ and GitHub: BATERMJ) is a system architect who lives across Beijing, Shanghai, and Urumqi in China. He received his master's and bachelor's degrees in computer science and technology from Tsinghua University between the years 1995 and 2002. He has extensive experience in machine learning, data mining, natural language processing (NLP), distributed systems, embedded systems, the web, mobile, algorithms, and applied mathematics and statistics. He has worked for clients such as CA Technologies, META4ALL, and EDA (a subcompany of DFR). He also has experience in setting up start-ups in China.

Bater has been balancing a life of creativity between the edge of computer sciences and human cultures. For the past 12 years, he has gained experience in various culture creations by applying various cutting-edge computer technologies, one being a human-machine interface that is used to communicate with computer systems in the Kazakh language.
He has previously collaborated with other writers in his fields too, but Learning Data Mining with R is his first official effort.

Pradeepta Mishra is a data scientist, predictive modeling expert, deep learning and machine learning practitioner, and an econometrician. He is currently leading the data science and machine learning practice for Ma Foi Analytics, Bangalore, India. Ma Foi Analytics is an advanced analytics provider for Tomorrows Cognitive Insights Ecology, using a combination of cutting-edge artificial intelligence, a proprietary big data platform, and data science expertise.

He holds a patent for enhancing planogram design for the retail industry. Pradeepta has published and presented research papers at IIM Ahmedabad, India. He is a visiting faculty member at various leading B-schools and regularly gives talks on data science and machine learning.

Pradeepta has spent more than 10 years in his domain and has worked on various projects relating to classification, regression, pattern recognition, time series forecasting, and unstructured data analysis using text mining procedures, spanning domains such as healthcare, insurance, retail and e-commerce, and manufacturing.

If you have any questions, don't hesitate to look me up on Twitter via @mishra1_PK; I will be more than glad to help a fellow web professional wherever, whenever.

Nathan Danneman holds a PhD degree from Emory University, where he studied International Conflict. Recently, his technical areas of research have included the analysis of textual and geospatial data and the study of multivariate outlier detection.

Nathan is currently a data scientist at Data Tactics, and supports programs at DARPA and the Department of Homeland Security.

Richard Heimann leads the Data Science Team at Data Tactics Corporation and is an EMC Certified Data Scientist specializing in spatial statistics, data mining, Big Data, and pattern discovery and recognition.
Since 2005, Data Tactics has been a premier Big Data and analytics service provider based in Washington D.C., serving customers globally.

Richard is an adjunct faculty member at the University of Maryland, Baltimore County, where he teaches spatial analysis and statistical reasoning. Additionally, he is an instructor at George Mason University, teaching human terrain analysis, and is also a selection committee member for the 2014-2015 AAAS Big Data and Analytics Fellowship Program.

In addition to co-authoring Social Media Mining with R, Richard has also recently reviewed Making Big Data Work for Your Business for Packt Publishing, and also writes frequently on related topics for the Big Data Republic (http://www.bigdatarepublic.com/bloggers.asp#Rich_Heimann). He has recently assisted DARPA, DHS, the US Army, and the Pentagon with analytical support.

Table of Contents
Warming Up
Mining Frequent Patterns, Associations, and Correlations
Classification
Advanced Classification
Cluster Analysis
Advanced Cluster Analysis
Outlier Detection
Mining Stream, Time-series, and Sequence Data
Graph Mining and Network Analysis
Mining Text and Web Data
Algorithms and Data Structures
Data Manipulation Using In-built R Data
Exploratory Data Analysis with Automobile Data
Visualize Diamond Dataset
Regression with Automobile Data
Market Basket Analysis with Groceries Data
Clustering with E-commerce Data
Building a Retail Recommendation Engine
Dimensionality Reduction
Applying Neural Network to Healthcare Data
Going Viral
Getting Started with R
Mining Twitter with R
Potentials and Pitfalls of Social Media Data
Social Media Mining – Fundamentals
Social Media Mining – Case Studies
Conclusions and Next Steps
Bibliography