diff --git a/NLP with AWS.ipynb b/NLP with AWS.ipynb
deleted file mode 100644
index 9cf694e..0000000
--- a/NLP with AWS.ipynb
+++ /dev/null
@@ -1,251 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "toc": true
- },
- "source": [
- "Table of Contents\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Setting up"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In your AWS account:\n",
- "- create a [new IAM user](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_cliwpsapi)\n",
- "- note down 1) **YOUR_ACCESS_KEY** and 2) **YOUR_SECRET_KEY** once the IAM user has been created\n",
- "\n",
- "In a terminal:\n",
- "- pip install awscli\n",
- "- aws configure\n",
- " - aws_access_key_id = **YOUR_ACCESS_KEY**\n",
- " - aws_secret_access_key = **YOUR_SECRET_KEY**\n",
- " - Default region name = **us-east-1**\n",
- " - Default output format = **json**"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "import pandas as pd\n",
- "import boto3\n",
- "import json"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## NLP Sentiment Analysis with Amazon Comprehend"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### On one South Park review record"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "collapsed": true
- },
- "source": [
- "[Image: Garrison]"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [],
- "source": [
- "comprehend = boto3.client(service_name='comprehend', region_name='us-east-1')"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Calling DetectSentiment\n",
- "{\n",
- " \"ResponseMetadata\": {\n",
- " \"HTTPHeaders\": {\n",
- " \"connection\": \"keep-alive\",\n",
- " \"content-length\": \"160\",\n",
- " \"content-type\": \"application/x-amz-json-1.1\",\n",
- " \"date\": \"Tue, 13 Mar 2018 01:59:41 GMT\",\n",
- " \"x-amzn-requestid\": \"315ed945-2662-11e8-84f7-3bd0bdbf23ac\"\n",
- " },\n",
- " \"HTTPStatusCode\": 200,\n",
- " \"RequestId\": \"315ed945-2662-11e8-84f7-3bd0bdbf23ac\",\n",
- " \"RetryAttempts\": 0\n",
- " },\n",
- " \"Sentiment\": \"NEUTRAL\",\n",
- " \"SentimentScore\": {\n",
- " \"Mixed\": 0.03897104784846306,\n",
- " \"Negative\": 0.3537358343601227,\n",
- " \"Neutral\": 0.5894510746002197,\n",
- " \"Positive\": 0.01784200221300125\n",
- " }\n",
- "}\n",
- "End of DetectSentiment\n",
- "\n"
- ]
- }
- ],
- "source": [
- "text = \"GarrisonTrump’s solution is to violently rape all three, as he nurses a black eye and defends Trump’s actions, is painfully accurate. The parallel between Heidi and Cartman’s poisonous, yet stubborn relationship is extended to Trump’s supporters, who watch Trump’s latest gaffe on tv, and look away, unwilling to criticize him for fear of the left’s rabid gloating. This is unfortunately, exactly what’s happening. We’re in a bizarre situation where the anti-Trump crowd is obsessively critical, ready to erupt in outrage at the slightest provocation, real or imagined, causing Trump’s supporters to aggressively defend his every action, even those they might not actually agree with.\"\n",
- "\n",
- "print('Calling DetectSentiment')\n",
- "print(json.dumps(comprehend.detect_sentiment(Text=text, LanguageCode='en'), sort_keys=True, indent=4))\n",
- "print('End of DetectSentiment\\n')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### On one movie review document"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 42,
- "metadata": {},
- "outputs": [],
- "source": [
- "path = \"/Users/Jessica/Desktop/BAX452/Natural_Language_Processing/DaVinciCode/training_DaVinciCodeExcerpt.txt\"\n",
- "with open(path, \"r\") as doc1:\n",
- "    output = doc1.readlines()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 43,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "{\n",
- " \"ResponseMetadata\": {\n",
- " \"HTTPHeaders\": {\n",
- " \"connection\": \"keep-alive\",\n",
- " \"content-length\": \"161\",\n",
- " \"content-type\": \"application/x-amz-json-1.1\",\n",
- " \"date\": \"Tue, 13 Mar 2018 02:17:29 GMT\",\n",
- " \"x-amzn-requestid\": \"adf8d1b4-2664-11e8-9bab-4dcb52f23022\"\n",
- " },\n",
- " \"HTTPStatusCode\": 200,\n",
- " \"RequestId\": \"adf8d1b4-2664-11e8-9bab-4dcb52f23022\",\n",
- " \"RetryAttempts\": 0\n",
- " },\n",
- " \"Sentiment\": \"NEGATIVE\",\n",
- " \"SentimentScore\": {\n",
- " \"Mixed\": 0.2702062726020813,\n",
- " \"Negative\": 0.4358953535556793,\n",
- " \"Neutral\": 0.17409518361091614,\n",
- " \"Positive\": 0.11980316042900085\n",
- " }\n",
- "}\n"
- ]
- }
- ],
- "source": [
- "whole_doc = ', '.join(map(str, output))\n",
- "print(json.dumps(comprehend.detect_sentiment(Text=whole_doc, LanguageCode='en'), sort_keys=True, indent=4))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note: Amazon Comprehend's DetectSentiment accepts at most 5,000 bytes of UTF-8 text per request, so I had to trim the text file manually to fit."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## References\n",
- "\n",
- "- ['South Park' Review: 'Doubling Down' Is The Most Insightful Episode In Years](https://www.forbes.com/sites/danidiplacido/2017/11/09/south-park-review-doubling-down-is-the-most-insightful-episode-in-years/#171864e17684)\n",
- "- [UMICH SI650 - Sentiment Classification](https://www.kaggle.com/c/si650winter11/data)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": []
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": []
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.4"
- },
- "toc": {
- "nav_menu": {},
- "number_sections": true,
- "sideBar": true,
- "skip_h1_title": false,
- "title_cell": "Table of Contents",
- "title_sidebar": "Contents",
- "toc_cell": true,
- "toc_position": {},
- "toc_section_display": true,
- "toc_window_display": true
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}
diff --git a/NLP with GCP.ipynb b/NLP with GCP.ipynb
deleted file mode 100644
index 332e005..0000000
--- a/NLP with GCP.ipynb
+++ /dev/null
@@ -1,297 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "[Image]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Natural Language Processing with Google Cloud Platform\n",
- "-- Applied to Rick & Morty"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "toc": true
- },
- "source": [
- "Table of Contents\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "[Image]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Setting Up"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Use a service account to [set up authentication](https://cloud.google.com/docs/authentication/getting-started).\n",
- "\n",
- "### [Create a service account](https://console.cloud.google.com/apis/credentials/serviceaccountkey?_ga=2.11759798.-323970793.1520889464)\n",
- "- Select **New service account**.\n",
- "- From the **Role** dropdown, select **Project > Owner**.\n",
- "- Click **Create**, then download the JSON key file to your local computer.\n",
- "\n",
- "### Open a terminal and run the following commands\n",
- "- **export GOOGLE_APPLICATION_CREDENTIALS=\"[PATH]\"**\n",
- "- **jupyter notebook**\n",
- " - [Image: setup]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### [Verifying authentication](https://cloud.google.com/docs/authentication/getting-started#auth-cloud-implicit-python)\n",
- "\n",
- "I use Python in a Jupyter notebook:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "def implicit():\n",
- " from google.cloud import storage\n",
- "\n",
- " # If you don't specify credentials when constructing the client, the\n",
- " # client library will look for credentials in the environment.\n",
- " storage_client = storage.Client()\n",
- "\n",
- " # Make an authenticated API request\n",
- " buckets = list(storage_client.list_buckets())\n",
- " print(buckets)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "#!pip3 install google.cloud"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Importing packages"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "import os"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "# Imports the Google Cloud client library\n",
- "import google.cloud\n",
- "from google.cloud import language\n",
- "from google.cloud.language import enums\n",
- "from google.cloud.language import types\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Sentiment Analysis"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {
- "collapsed": true
- },
- "outputs": [],
- "source": [
- "text = \"Rick is an eccentric and alcoholic mad scientist, who eschews many ordinary conventions such as school, marriage, love, and family.\"\n",
- "client = language.LanguageServiceClient()\n",
- "document = types.Document(\n",
- " content=text,\n",
- " type=enums.Document.Type.PLAIN_TEXT)\n",
- "\n",
- "sentiment = client.analyze_sentiment(document).document_sentiment"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Score: 0.8999999761581421\n",
- "Magnitude: 0.8999999761581421\n"
- ]
- }
- ],
- "source": [
- "print('Score: {}'.format(sentiment.score))\n",
- "print('Magnitude: {}'.format(sentiment.magnitude))"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[name: \"Rick\"\n",
- "type: PERSON\n",
- "salience: 0.758594810962677\n",
- "mentions {\n",
- " text {\n",
- " content: \"Rick\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: PROPER\n",
- "}\n",
- "mentions {\n",
- " text {\n",
- " content: \"mad scientist\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: COMMON\n",
- "}\n",
- ", name: \"conventions\"\n",
- "type: OTHER\n",
- "salience: 0.08138427138328552\n",
- "mentions {\n",
- " text {\n",
- " content: \"conventions\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: COMMON\n",
- "}\n",
- ", name: \"love\"\n",
- "type: OTHER\n",
- "salience: 0.061606135219335556\n",
- "mentions {\n",
- " text {\n",
- " content: \"love\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: COMMON\n",
- "}\n",
- ", name: \"family\"\n",
- "type: PERSON\n",
- "salience: 0.04316957667469978\n",
- "mentions {\n",
- " text {\n",
- " content: \"family\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: COMMON\n",
- "}\n",
- ", name: \"school\"\n",
- "type: ORGANIZATION\n",
- "salience: 0.027622591704130173\n",
- "mentions {\n",
- " text {\n",
- " content: \"school\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: COMMON\n",
- "}\n",
- ", name: \"marriage\"\n",
- "type: OTHER\n",
- "salience: 0.027622591704130173\n",
- "mentions {\n",
- " text {\n",
- " content: \"marriage\"\n",
- " begin_offset: -1\n",
- " }\n",
- " type: COMMON\n",
- "}\n",
- "]"
- ]
- },
- "execution_count": 7,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "entities = client.analyze_entities(document).entities\n",
- "entities"
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.4"
- },
- "toc": {
- "nav_menu": {},
- "number_sections": true,
- "sideBar": true,
- "skip_h1_title": true,
- "title_cell": "Table of Contents",
- "title_sidebar": "Contents",
- "toc_cell": true,
- "toc_position": {},
- "toc_section_display": true,
- "toc_window_display": true
- }
- },
- "nbformat": 4,
- "nbformat_minor": 2
-}