Twitter Sentiment Analysis

How to run the code

Run ZooKeeper and the Kafka server first:
zookeeper-server-start.sh /usr/local/kafka_2.11-2.0.0/config/zookeeper.properties
kafka-server-start.sh /usr/local/kafka_2.11-2.0.0/config/server.properties

Run the Web Server in the folder /src/ruby:
ruby web-server.rb

In the browser, navigate to http://localhost:4567, enter the keywords to track, and start the stream processing.

Then run the Spark Streaming job in the folder /src/scala:
sbt run

To track new keywords, stop the stream and then start it again.
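
The steps above depend on three local services being reachable before the Spark Streaming job starts. A minimal pre-flight check might look like the sketch below. It assumes the default ports: 2181 for ZooKeeper and 9092 for Kafka, as in the stock Kafka config files, plus 4567 from the web server URL above. The `port_open?` helper and the `SERVICES` table are illustrative names, not part of this project.

```ruby
require "socket"

# Assumed default ports. 2181 and 9092 come from the stock Kafka config
# files; 4567 is the web server port used in the URL above.
SERVICES = {
  "ZooKeeper"  => 2181,
  "Kafka"      => 9092,
  "Web server" => 4567
}.freeze

# Returns true if a TCP connection to host:port succeeds within
# `timeout` seconds, false if it is refused, unreachable, or times out.
def port_open?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

if __FILE__ == $PROGRAM_NAME
  SERVICES.each do |name, port|
    status = port_open?("localhost", port) ? "up" : "DOWN"
    puts format("%-10s localhost:%-5d %s", name, port, status)
  end
end
```

If any service is reported as down, start it with the corresponding command above before launching the Spark Streaming job.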

This project is a standalone application that performs real-time sentiment analysis of tweets matching keywords defined by the application user. The system uses two frameworks for large-scale distributed data processing, Kafka and Spark Streaming, to process the stream of tweets produced by the Twitter APIs.

Tools Used

Programming Languages

  • Ruby 2.3.4

  • Scala 2.11.12

Web Framework

  • Sinatra 2.0.4

Distributed Large-Scale Streaming Processing Frameworks

  • Kafka 2.0.0 (Scala 2.11 build)

  • Spark 2.3.1

NLP Library

  • Stanford CoreNLP 3.5.2

System Components

The system consists of:

  • A Kafka producer that submits the input from the Twitter Streaming APIs to the Kafka brokers. It is written in Ruby (file: /src/ruby/kafka-producer.rb)

  • A Kafka consumer that creates a DirectStream from the Kafka distributed log. It is written in Scala
    (file: /src/scala/spark-sentiment-analysis.scala)

  • A Spark Streaming job that counts the processed tweets and analyzes their sentiment with respect to a specific topic. It is written in Scala
    (file: /src/scala/spark-sentiment-analysis.scala)

  • A sentiment analyzer object that performs the sentiment analysis of tweets. It is written in Scala using the Stanford CoreNLP library
    (file: /src/scala/sentiment-analyzer.scala)

  • A web server that provides a dynamic web interface to visualize the analysis and manage the stream of data. It is written in Ruby using the Sinatra framework
    (file: /src/ruby/web-server.rb)
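
To make the first stage of this pipeline concrete, here is a minimal Ruby sketch of what the producer side could look like: it keeps only tweets that mention a tracked keyword and serializes each one as JSON for a Kafka topic. It is not the code in /src/ruby/kafka-producer.rb. The topic name `tweets`, the JSON field names, and the helpers `matches_keywords?`, `build_message`, and `publish_tweets` are all assumptions made for illustration. Delivery is injected as a callable so the sketch runs without a broker or a Twitter connection.

```ruby
require "json"

# Hypothetical topic name; the real one is defined by the project.
TOPIC = "tweets"

# True if the tweet text mentions at least one tracked keyword
# (case-insensitive substring match).
def matches_keywords?(text, keywords)
  lowered = text.downcase
  keywords.any? { |kw| lowered.include?(kw.downcase) }
end

# Builds the JSON payload for one tweet. The field names are assumptions.
def build_message(text, keywords)
  lowered = text.downcase
  matched = keywords.select { |kw| lowered.include?(kw.downcase) }
  JSON.generate("text" => text, "keywords" => matched)
end

# Sends every matching tweet through `deliver`, any callable that takes
# (payload, topic). In a real setup this would wrap a Kafka client.
# Returns the number of tweets sent.
def publish_tweets(tweets, keywords, deliver)
  sent = 0
  tweets.each do |text|
    next unless matches_keywords?(text, keywords)
    deliver.call(build_message(text, keywords), TOPIC)
    sent += 1
  end
  sent
end

if __FILE__ == $PROGRAM_NAME
  # Stand-in for a Kafka client: print instead of sending to a broker.
  log = ->(payload, topic) { puts "#{topic}: #{payload}" }
  publish_tweets(["I love Scala", "Nothing relevant here"], ["scala"], log)
end
```

Injecting `deliver` keeps the filtering and serialization logic testable on its own; the Spark Streaming side would then read these JSON messages from the same topic.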