Daily tweets about feelings, collected using Azure Stream Analytics and analyzed for various sentiments using multiple R packages.
Below is an outline of the process of extracting tweets, cleaning them, evaluating a sentiment score for each tweet, and calculating 10 emotional attributes from the Syuzhet package, using various Azure components.

- Data
The Twitter dataset was collected daily using an Azure Stream Analytics (ASA) job with Event Hub as the source and Blob storage as the sink. The collected dataset was stored in Blob storage in CSV format and downloaded using Azure Storage Explorer. Various Twitter handles were used to collect sentiments for each day over a period of 1 week. ASA jobs were run daily over a period of 1 hour to collect the data. For example, for Monday, handles such as Monday, MondayMorning, mondaythoughts, MondayMood, and MondayMotivation were collected for analysis.

- Retrieve Data
Over 25k tweets were collected. The following Azure Stream Analytics Query Language (ASAQL) query was used for the Twitter stream:
*SELECT [CreatedAt], [Topic], [SentimentScore], [Author], [Text] FROM EventHubInputName*
Additional features were added to make it easier to select data in the R scripts.

- Prepare Data
No preparation of the data was done in Azure Machine Learning (AML).

- Preprocess Data
None of the AML modules for feature selection or preprocessing were used.

- Algorithm
To study the various sentiments expressed, the following R packages were used:
*SentimentAnalysis: uses analyzeSentiment*
*Syuzhet: uses get_nrc_sentiment with the NRC emotion lexicon*
*tm: various functions were used to remove stopwords, URLs, and whitespace, and to restrict the text to a particular encoding for this analysis*

- Results
Two Execute R Script modules were used to calculate and plot the results using the above packages. Each plot compares the sentiments across the days of the week.
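
The cleaning steps listed under Algorithm (restricting the encoding, removing URLs, removing stopwords, and collapsing whitespace) can be illustrated with a minimal sketch. The original work used R's tm package inside Execute R Script modules; the version below re-expresses the same steps in Python using only the standard library, so it is not the author's code. The function name `clean_tweet`, the URL pattern, and the tiny stopword list are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only;
# the stopword list used by tm in the original R scripts is much larger.
STOPWORDS = {"a", "an", "the", "is", "it", "to", "of", "and", "for", "on", "in"}

# Assumed URL pattern: anything starting with http://, https://, or www.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")


def clean_tweet(text: str) -> str:
    """Apply the cleaning steps described above to a single tweet."""
    # Restrict to ASCII, silently dropping characters such as emoji.
    text = text.encode("ascii", errors="ignore").decode("ascii")
    # Remove URLs.
    text = URL_PATTERN.sub(" ", text)
    # Lowercase, split on whitespace, and drop stopwords.
    words = [w for w in text.lower().split() if w not in STOPWORDS]
    # Joining on single spaces also collapses repeated whitespace.
    return " ".join(words)


if __name__ == "__main__":
    print(clean_tweet("It is   Monday and the coffee is great https://t.co/abc123"))
```

The cleaned text would then be passed to the sentiment functions named above. Cleaning before scoring matters because URLs and stopwords carry no sentiment of their own but can still add noise to word-level lexicon lookups.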