This repository was archived by the owner on Sep 7, 2022. It is now read-only.

Commit

first
tambien committed Nov 10, 2016
0 parents commit 9d44eaa
Showing 113 changed files with 4,498 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
server/data/key.json
node_modules/
.DS_Store
*.pyc
static/build/*
static/images/font/
27 changes: 27 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,27 @@
Want to contribute? Great! First, read this page (including the small print at the end).

### Before you contribute
Before we can use your code, you must sign the
[Google Individual Contributor License Agreement](https://cla.developers.google.com/about/google-individual)
(CLA), which you can do online. The CLA is necessary mainly because you own the
copyright to your changes, even after your contribution becomes part of our
codebase, so we need your permission to use and distribute your code. We also
need to be sure of various other things—for instance that you'll tell us if you
know that your code infringes on other people's patents. You don't have to sign
the CLA until after you've submitted your code for review and a member has
approved it, but you must do it before we can put your code into our codebase.
Before you start working on a larger contribution, you should get in touch with
us first through the issue tracker with your idea so that we can help out and
possibly guide you. Coordinating up front makes it much easier to avoid
frustration later on.

### Code reviews
All submissions, including submissions by project members, require review. We
use GitHub pull requests for this purpose.

### The small print
Contributions made by corporations are covered by a different agreement than
the one above, the
[Software Grant and Corporate Contributor License Agreement](https://cla.developers.google.com/about/google-corporate).
58 changes: 58 additions & 0 deletions README.md
@@ -0,0 +1,58 @@
Giorgio Cam lets you use your camera to make music. It relies on a combination of the [Google Cloud Vision API](https://cloud.google.com/vision/) and [MaryTTS](https://github.com/marytts/marytts).

This is not an official Google product.

## OVERVIEW

The client-side JavaScript application captures images using WebRTC. When the user hits the shutter button, an image is sent to the server, which uses Cloud Vision to return an array of labels and confidence scores for that image. These labels are dropped into a rhyming template to create the next phrase that the computer will speak. To get the audio of that phrase, the client makes another request to the MaryTTS (text-to-speech) server, which returns a WAV file of the audio. That audio is then synced to the music using Tone.js.
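
Concretely, the client talks to two server endpoints routed in `app.yaml`: `/see` for image labeling and `/speak/` for speech audio. Below is a minimal sketch of those two round trips made from outside the browser; it assumes the third-party `requests` library and a dev server at `localhost:8080` (neither is part of this repo), and the phrase is only illustrative, not the app's actual rhyming template.

```python
# Sketch of the two client/server round trips (assumes the third-party
# `requests` library and a dev server at localhost:8080).
import base64
import requests

BASE_URL = 'http://localhost:8080'  # assumption: local dev server

# 1. POST a base64-encoded frame to /see; the server forwards it to
#    Cloud Vision and returns [{'label': ..., 'score': ...}, ...] as JSON.
with open('frame.jpg', 'rb') as f:
    encoded = base64.b64encode(f.read())
labels = requests.post(BASE_URL + '/see', data={'image': encoded}).json()

# 2. GET /speak/ with a phrase; the server proxies MaryTTS and
#    returns a WAV file of the spoken audio.
phrase = 'I see a {0}, yes I do'.format(labels[0]['label'])  # illustrative only
audio = requests.get(BASE_URL + '/speak/',
                     params={'text': phrase, 'rate': '1', 'pitch': '0'})
with open('phrase.wav', 'wb') as out:
    out.write(audio.content)
```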

## FRONT-END

To build the client-side JavaScript, first install [node](https://nodejs.org) and [webpack](https://webpack.github.io/). Then install all of the project's dependencies by typing the following in the terminal:

```bash
cd static
npm install
```

Then build all of the files:

```bash
webpack -p
```

## BACK-END

The back-end uses [Google App Engine](https://cloud.google.com/appengine/) to serve static content and mediate between the two other back-end services: Google Cloud Vision and MaryTTS.

For the dependencies to be available within Google App Engine, they need to be installed into a local folder (documented [here](https://cloud.google.com/appengine/docs/python/tools/using-libraries-python-27)):

```bash
cd server
mkdir lib
pip install -t lib -r requirements.txt
```
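
The server code then makes this folder importable at runtime using App Engine's `vendor` helper; this is the snippet at the top of `server/vision.py`:

```python
# From server/vision.py: put the locally installed packages on the import path.
from google.appengine.ext import vendor
vendor.add('server/lib')
```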

### Google Cloud Vision API

You will first need to enable the API and [generate credentials](https://cloud.google.com/vision/docs/common/auth). Under "Key type", choose "JSON" and download the key.json file.

Add the path to your JSON key to the `env_variables` section of your `app.yaml` file like so:

```yaml
env_variables:
GOOGLE_APPLICATION_CREDENTIALS: PATH/TO/CLOUD_VISION_KEY.json
```
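
At runtime this path is picked up by the application-default credentials flow, which is how `server/vision.py` authenticates its Cloud Vision client:

```python
# From server/vision.py: GOOGLE_APPLICATION_CREDENTIALS is read by the
# application-default credentials lookup.
from oauth2client.client import GoogleCredentials
credentials = GoogleCredentials.get_application_default()
```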

### MaryTTS

Download and install [MaryTTS](https://github.com/marytts/marytts), then run the MaryTTS server. Add the IP address and port number that MaryTTS is running on to `server/mary.py`. The default location is `http://localhost:59125`.

```python
MARY_TTS_URL = 'http://127.0.0.1'
MARY_TTS_PORT = '59125'
```
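
To check that MaryTTS is reachable before wiring it in, you can issue the same kind of `process` request that `server/mary.py` builds. A minimal sketch, assuming the default install on `localhost:59125` and the `cmu-bdl-hsmm` voice that `mary.py` requests:

```python
# Smoke test against a local MaryTTS server (sketch; assumes the default
# install at localhost:59125).
import urllib2

url = ('http://localhost:59125/process'
       '?INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO&AUDIO=WAVE_FILE'
       '&LOCALE=en_US&VOICE=cmu-bdl-hsmm'
       '&INPUT_TEXT=' + urllib2.quote('hello world'))
with open('test.wav', 'wb') as out:
    out.write(urllib2.urlopen(url).read())
```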

### App Engine

Enable App Engine, create a project, and set your project ID under `application` in `app.yaml`. You can then launch the back-end server, which will communicate with the front-end application, Cloud Vision, and the MaryTTS server.
60 changes: 60 additions & 0 deletions app.yaml
@@ -0,0 +1,60 @@
application: APP_ID_HERE
version: dev
runtime: python27
api_version: 1
threadsafe: true


handlers:

- url: /
script: server.main.app
secure: always

- url: /speak/.*
script: server.mary.app

- url: /see
script: server.vision.app

- url: /(.*\.js)
mime_type: text/javascript
static_files: static/\1
upload: static/(.*\.js)

- url: /(.*\.html)
mime_type: text/html
static_files: static/\1
upload: static/(.*\.html)

- url: /(.*\.(bmp|gif|ico|jpeg|jpg|png|svg|ttf))
static_files: static/\1
upload: static/(.*\.(bmp|gif|ico|jpeg|jpg|png|svg|ttf))

# index files
- url: /(.+)/
static_files: static/\1/index.html
upload: static/(.+)/index.html

# audio files
- url: /(.*\.mp3)
mime_type: audio/mpeg
static_files: static/\1
upload: static/audio/(.*\.mp3)


skip_files:
- ^(.*/)?.*\.pyc
- ^(.*/)?.*\.wav
- ^.git/.*
- ^(.*/)?node_modules/.*
- ^static/src/.*
- ^static/style/.*
- ^package\.json

libraries:
- name: webapp2
version: latest

env_variables:
GOOGLE_APPLICATION_CREDENTIALS: PATH/TO/CLOUD_VISION_KEY.json
Empty file added server/__init__.py
Empty file.
11 changes: 11 additions & 0 deletions server/index.html
@@ -0,0 +1,11 @@
<!-- <!DOCTYPE html> -->
<html>
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">

<script type="text/javascript" src="build/Main.js"></script>
</head>
<body>
</body>
</html>
32 changes: 32 additions & 0 deletions server/main.py
@@ -0,0 +1,32 @@
#!/usr/bin/env python
#
# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import os
import webapp2
from google.appengine.ext.webapp import template


class Main(webapp2.RequestHandler):

def get(self):
main = os.path.join(os.path.dirname(__file__), 'index.html')
self.response.out.write(template.render(main, {}))


app = webapp2.WSGIApplication([
('/', Main)
], debug=True)
60 changes: 60 additions & 0 deletions server/mary.py
@@ -0,0 +1,60 @@
#!/usr/bin/env python
#
# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

import webapp2
import urllib2

from google.appengine.api import memcache

MARY_TTS_URL = 'http://127.0.0.1'
MARY_TTS_PORT = '59125'


class SpeakHandler(webapp2.RequestHandler):
def maryRequestUrl(self, text, rate, pitch):
f0_scale = '0.8' if float(pitch) == 0 else '1.5'
tract_scaler = '0.9'
f0_add = str(float(pitch) * 10)
text = urllib2.quote(text.encode("utf-8"))
urlString = '{0}:{1}/process?INPUT_TYPE=TEXT&OUTPUT_TYPE=AUDIO&INPUT_TEXT={2}'.format(MARY_TTS_URL, MARY_TTS_PORT, text)
urlString += '&VOICE_SELECTIONS=cmu-bdl-hsmm%20en_US%20male%20hmm&AUDIO_OUT=WAVE_FILE&LOCALE=en_US&VOICE=cmu-bdl-hsmm&AUDIO=WAVE_FILE'
# some voice parameters
urlString += '&effect_F0Scale_selected=on&effect_F0Scale_parameters=f0Scale%3A{0}%3B'.format(f0_scale)
urlString += '&effect_TractScaler_selected=on&effect_TractScaler_parameters=amount%3A{0}%3B'.format(tract_scaler)
urlString += '&effect_F0Add_selected=on&effect_F0Add_parameters=f0Add%3A{0}%3B'.format(f0_add)
urlString += '&effect_Rate_selected=on&effect_Rate_parameters=durScale%3A{0}%3B'.format(str(1 / float(rate)))
return urlString
def get(self):
text = self.request.get('text', default_value='this is a test')
rate = self.request.get('rate', default_value='1')
pitch = self.request.get('pitch', default_value='0')

key = 'text={0} rate={1} pitch={2}'.format(text, rate, pitch)
print self.maryRequestUrl(text, rate, pitch)
audio = memcache.get(key)
if audio is None:
url = self.maryRequestUrl(text, rate, pitch)
audio = urllib2.urlopen(url).read()
four_hours = 60 * 60 * 4
memcache.add(key, audio, four_hours)
self.response.headers['Content-Type'] = 'audio/wav'
self.response.out.write(audio)


app = webapp2.WSGIApplication([
('/.*', SpeakHandler)
], debug=True)
1 change: 1 addition & 0 deletions server/requirements.txt
@@ -0,0 +1 @@
google-api-python-client==1.5.1
84 changes: 84 additions & 0 deletions server/vision.py
@@ -0,0 +1,84 @@
#!/usr/bin/env python
#
# Copyright 2016 Google Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


import webapp2
import urllib2
import json

import base64

from google.appengine.ext import vendor
vendor.add('server/lib')

import re
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials

# Discovery document URL template for the Cloud Vision API
DISCOVERY_URL='https://{api}.googleapis.com/$discovery/rest?version={apiVersion}'

class VisionApi(webapp2.RequestHandler):
def __init__(self, request, response):
# Set self.request, self.response and self.app.
self.initialize(request, response)
self.vision = self._create_client()

def _create_client(self):
credentials = GoogleCredentials.get_application_default()
return discovery.build(
'vision', 'v1', credentials=credentials,
discoveryServiceUrl=DISCOVERY_URL)

def _label(self, image, max_results=10, num_retries=2):
"""
Uses the Vision API to detect labels in the given image.
"""

label_request = {
'image': {
'content': image
},
'features': [{
'type': 'LABEL_DETECTION',
'maxResults': max_results,
}]
}

request = self.vision.images().annotate(body={'requests': [label_request]})
response = request.execute(num_retries=num_retries)['responses'][0]
labels_raw = response.get('labelAnnotations', [])
print labels_raw
error = response.get('error', [])
if len(labels_raw):
filtered = [r for r in labels_raw if float(r['score']) > 0.1]
labels = [{'label' : str(x['description']), 'score' : float(x['score'])} for x in filtered]
return labels
elif len(error):
return {'error' : error}
else:
return []


def post(self):
image = self.request.POST.multi['image']
labels = self._label(image)
self.response.out.write(json.dumps(labels))


app = webapp2.WSGIApplication([
('.*', VisionApi)
], debug=True)
Binary file added static/audio/applause.mp3
Binary file not shown.
Binary file added static/audio/crash.mp3
Binary file not shown.
Binary file added static/audio/end.mp3
Binary file not shown.
Binary file added static/audio/fill0.mp3
Binary file not shown.
Binary file added static/audio/fill1.mp3
Binary file not shown.
Binary file added static/audio/fill2.mp3
Binary file not shown.
Binary file added static/audio/racer.mp3
Binary file not shown.
Binary file added static/audio/reverseCrash.mp3
Binary file not shown.
Binary file added static/audio/shutter.mp3
Binary file not shown.
Binary file added static/audio/siren.mp3
Binary file not shown.
Binary file added static/audio/voice/alright.mp3
Binary file not shown.
Binary file added static/audio/voice/herewego.mp3
Binary file not shown.
Binary file added static/audio/voice/herewegogo.mp3
Binary file not shown.
Binary file added static/audio/voice/readytoexplore.mp3
Binary file not shown.
Binary file added static/audio/voice/takeapicture.mp3
Binary file not shown.
Binary file added static/audio/voice/yeah.mp3
Binary file not shown.
Binary file added static/audio/voice/yeahup.mp3
Binary file not shown.
