10 Feb 2016 Text Analytics Part 2 Rupayan Banerjee

February 15, 2016

Data Preprocessing
Data preprocessing covers removing stopwords, stripping special characters, lemmatization, stemming, and similar cleanup steps. The input data is the output of the previous experiment. The Reader module fetches the stopword list from Azure Blob storage, and text.preprocessing is a zipped library that bundles general-purpose R functions and procedures. The first Execute R Script module does the specific preprocessing work by calling functions from that parent zip. Finally, the last Execute R Script module builds a word cloud from the text in the input blob.
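
To make the steps concrete, here is a minimal sketch of the same kind of pipeline. It is written in Python purely for illustration and is not the R code inside text.preprocessing. The stopword list, the `naive_stem` helper, and the function names are all assumptions made for this example; the real experiment loads its stopwords from Blob storage and would use a proper stemmer or lemmatizer.

```python
import re
from collections import Counter

# Hypothetical stopword list; the experiment reads its own list from Blob storage.
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in"}


def naive_stem(word):
    """Crude suffix stripping, a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text, stopwords=STOPWORDS):
    """Lowercase, drop special characters, remove stopwords, then stem."""
    text = text.lower()
    # Replace anything that is not a lowercase letter or whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [t for t in text.split() if t not in stopwords]
    return [naive_stem(t) for t in tokens]


def word_frequencies(tokens):
    """Count tokens; a word cloud sizes each word by a count like this."""
    return Counter(tokens)


if __name__ == "__main__":
    tokens = preprocess("The cats are running, and the dog jumped!")
    print(tokens)
    print(word_frequencies(tokens).most_common(3))
```

The frequency table is the part a word cloud needs: each word is drawn at a size proportional to its count, so the quality of the cloud depends directly on how well the stopword removal and stemming steps have cleaned the text.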