How to Fight Social Media Anxiety with Machine Learning

The problem

According to Wikipedia

The solution

What if we could train a machine to go through hundreds (or thousands) of tweets, extract the relevant data, classify people’s tweets by opinion, and tell us what we actually HAVE to know?
That sounds neat, doesn’t it?

Translating that to code

If you made it this far, you are ready to go ahead and clone the project on GitHub.

The Project in Action

Let us take, for example, a term that everyone on the planet has seen or heard at least a hundred times, but that the majority (Britons included) don’t quite understand. You guessed it: “Brexit”.

Crunching Twitter data for the term “Brexit”

Wait... how can we achieve that?

Good question. Here is a typical way to classify and make sense of data in a machine learning project, and this project is no exception.

  • We clean the data
  • We extract the relevant words
  • We create a dictionary of words used and their frequency
  • We train our model to recognise the pattern of words using a dataset of already classified comments/feedback.
  • We use our model to classify each tweet based on the pattern of most used words.
  • We display the cloud of most used words
  • We display the stats of opinions
  • We try to create a comprehensive overall description from the dictionary of most frequently used words!
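The first three steps above (cleaning, extracting the relevant words, and building the frequency dictionary) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the tiny STOP_WORDS set, and the sample tweets are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real project would
# more likely use a fuller list (for example the one shipped with NLTK).
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "of", "to",
              "in", "on", "for", "rt", "this", "that"}

def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # drop links
    text = re.sub(r"@\w+", " ", text)           # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters only (also drops '#')
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

def relevant_words(text):
    """Return the cleaned words of a tweet that are not stop words."""
    return [w for w in clean_tweet(text).split() if w not in STOP_WORDS]

def word_frequencies(tweets):
    """Build a word -> count dictionary over a collection of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(relevant_words(tweet))
    return counts

# Made-up sample tweets, for illustration only.
sample_tweets = [
    "RT @bbc: Brexit talks stall again https://example.com/news",
    "Is the #Brexit deal good for trade?",
    "Brexit, Brexit, Brexit... make it stop!",
]
frequencies = word_frequencies(sample_tweets)
print(frequencies.most_common(3))
```

The resulting counter is the kind of dictionary that could feed both the word cloud and the later description step, since `most_common(n)` returns the n most frequent words.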

Let’s get technical

Tweets Sentiment Classification

To guess what people actually think about a given topic (positive, negative, or neutral), we are going to use the Naive Bayes classifier.

How to train our NB model to do that?

The project lets you either train your own model on a dataset of your choosing or use the pre-trained model. In our example, we used the model included in the NLTK corpora basic models.
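To make the training idea concrete, here is a from-scratch sketch of a multinomial Naive Bayes classifier over bags of words. It is not the project's implementation (which relies on NLTK); the class name TinyNaiveBayes, the add-one smoothing choice, and the toy training sentences are assumptions chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Multinomial Naive Bayes over bags of words, with add-one smoothing."""

    def __init__(self):
        self.label_counts = Counter()             # training texts seen per label
        self.word_counts = defaultdict(Counter)   # label -> word -> count
        self.vocabulary = set()                   # every word seen in training

    def train(self, labeled_texts):
        """labeled_texts: iterable of (list_of_words, label) pairs."""
        for words, label in labeled_texts:
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def classify(self, words):
        """Return the label with the highest log-posterior for these words."""
        total_texts = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label, text_count in self.label_counts.items():
            score = math.log(text_count / total_texts)   # log prior
            label_total = sum(self.word_counts[label].values())
            for word in words:
                if word not in self.vocabulary:
                    continue  # ignore words never seen during training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy, hand-labelled examples (made up for this sketch).
training_data = [
    ("great deal for everyone".split(), "positive"),
    ("good news and great progress".split(), "positive"),
    ("terrible deal bad outcome".split(), "negative"),
    ("bad news awful mess".split(), "negative"),
    ("vote scheduled for tuesday".split(), "neutral"),
    ("parliament meets on tuesday".split(), "neutral"),
]
model = TinyNaiveBayes()
model.train(training_data)
print(model.classify("great progress".split()))
print(model.classify("awful bad mess".split()))
```

Training on your own dataset then amounts to swapping `training_data` for a larger set of already classified comments; six sentences are far too few for meaningful results.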

Generating a Brief Description

The script takes the 5–6 most used words and tries ordering them in different ways, testing the coherence of each resulting paragraph. We use the language_check library to achieve that. Really simple, right?
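The ordering idea can be sketched as a brute-force search over permutations, keeping whichever ordering a checker complains about least. This is only an illustration under stated assumptions: best_ordering, toy_issue_count, and PLAUSIBLE_PAIRS are hypothetical names, and the toy scorer below is a stand-in, not a real grammar check and not the project's code.

```python
from itertools import permutations

def best_ordering(words, count_issues):
    """Try every ordering of `words` and keep the one with the fewest issues.

    `count_issues` is a hypothetical callable that takes a sentence and
    returns a number (lower is better). A grammar checker could play this
    role; here it is left pluggable.
    """
    best_sentence, best_issues = None, None
    for ordering in permutations(words):
        sentence = " ".join(ordering).capitalize() + "."
        issues = count_issues(sentence)
        if best_issues is None or issues < best_issues:
            best_sentence, best_issues = sentence, issues
    return best_sentence

# Toy stand-in scorer (an assumption for this sketch): count adjacent
# word pairs that are NOT in a hand-written set of plausible pairs.
PLAUSIBLE_PAIRS = {("brexit", "deal"), ("deal", "delays"), ("delays", "trade")}

def toy_issue_count(sentence):
    words = sentence.lower().rstrip(".").split()
    return sum(1 for pair in zip(words, words[1:]) if pair not in PLAUSIBLE_PAIRS)

print(best_ordering(["trade", "deal", "brexit", "delays"], toy_issue_count))
```

With 5–6 words there are at most 720 orderings, so an exhaustive search stays cheap. With language_check, the scorer would presumably count the issues the checker reports for each candidate sentence instead of using the toy pair set.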

Getting Started

I have described the installation, prerequisites, and getting-started steps in the README file. Please go ahead and check it out.

Important Note

This is the initial version of this project, and I intend to improve it as I progress. Please do not hesitate to contribute or share your suggestions.

The machine thinks. hishri.com