diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index b7c0412..f520828 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -6,8 +6,21 @@ documentation, we greatly value feedback and contributions from our community.
 Please read through this document before submitting any issues or pull requests to ensure we have all the necessary information to effectively respond to your bug report or contribution.
-
-## Reporting Bugs/Feature Requests
+## Index
+
+* [Introduction](#introduction)
+  * [Reporting Bugs/Feature Requests](#reporting-bugsfeature-requests)
+  * [Contributing via Pull Requests](#contributing-via-pull-requests)
+  * [Ways to contribute](#ways-to-contribute)
+  * [Code of Conduct](#code-of-conduct)
+  * [Security issue notifications](#security-issue-notifications)
+  * [Licensing](#licensing)
+* [Prerequisites](#prerequisites)
+* [Working with CloudFormation](#working-with-cloudformation)
+* [Working with the Transcriber Java back-end](#working-with-the-transcriber-java-back-end)
+* [Working with the Web UI](#working-with-the-web-ui)
+
+### Reporting Bugs/Feature Requests
 We welcome you to use the GitHub issue tracker to report bugs or suggest features.
@@ -20,7 +33,7 @@ reported the issue. Please try to include as much information as you can. Detail
 * Anything unusual about your environment or deployment
-## Contributing via Pull Requests
+### Contributing via Pull Requests
 Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
 1. You are working against the latest source on the *master* branch.
@@ -40,22 +53,66 @@ GitHub provides additional document on [forking a repository](https://help.githu
 [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
-## Finding contributions to work on
-Looking at the existing issues is a great way to find something to contribute on. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws-samples/amazon-transcribe-news-media-analysis/labels/help%20wanted) issues is a great place to start.
+### Ways to contribute
+Looking at existing issues is a great way to find areas to contribute. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws-samples/amazon-transcribe-news-media-analysis/labels/help%20wanted) issues is a great place to start.
-## Code of Conduct
+### Code of Conduct
 This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact opensource-codeofconduct@amazon.com with any additional questions or comments.
-## Security issue notifications
+### Security issue notifications
 If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue.
-## Licensing
+### Licensing
 See the [LICENSE](https://github.com/aws-samples/amazon-transcribe-news-media-analysis/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
+
+## Prerequisites
+
+The following applications are required to build and test changes:
+
+* Node.js >=v8
+* AWS CLI
+* Docker
+* Java
+* Maven
+
+To install the required Node.js libraries, run `npm install`. To start a local build, run `npm run build`.
+
+## Working with CloudFormation
+
+The CloudFormation template is located inside the `src/cfn` directory. The template uses a custom resource to populate the S3 bucket with the Web UI static resources and to trigger the back-end build. The lambda function source code is located inside the `src/backend/functions/setup` directory.
+
+## Working with the Transcriber Java back-end
+
+To run the Transcriber as a standalone Docker application run the following shell commands:
+
+```bash
+cd src/backend/transcriber
+
+docker build -t transcriber .
+
+docker run \
+--env AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY}" \
+--env AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID}" \
+--env TRANSCRIPTS_DYNAMO_DB_TABLE=MediaAnalysisTranscript \
+--env LOG_LEVEL=INFO \
+--env AWS_REGION="${AWS_REGION}" \
+--env TASKS_DYNAMO_DB_TABLE=MediaAnalysisTasks \
+--env MEDIA_URL="${MEDIA_URL}" \
+transcriber java -jar -Dlog4j.configurationFile=log4j2.xml transcriber.jar
+```
+
+## Working with the Web UI
+
+To develop a local version of the web UI:
+1. Deploy the CloudFormation template.
+2. Once the CloudFormation stack is deployed, a `url` output will be available from CloudFormation in the format of `https:///index.html`. Download the file `https:///settings.js` to the `src/frontend/public/` folder. In this way, it will be possible to develop locally using the API Gateway and Cognito Pool Id that CloudFormation just created in AWS. Note that the `settings.js` is "*gitignored*".
+3. Run `npm start`. The browser will automatically open the UI with hot reloading enabled.
+To make changes, edit the files in the `src/frontend` folder.
diff --git a/README.md b/README.md
index c9c3c5e..a087b4a 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,48 @@
 ## Amazon Transcribe News Media Analysis
-Transcribe news audio in realtime.
+Transcribe news audio in realtime
 > Warning: This project is currently being developed and the code shouldn't be used in production.
 [![Build Status](https://travis-ci.org/aws-samples/amazon-transcribe-news-media-analysis.svg?branch=master)](https://travis-ci.org/aws-samples/amazon-transcribe-news-media-analysis)
-### Deployment
+This solution allows you to create transcriptions of live streaming video using AWS Transcribe. The application
+consists of a Web UI where the user may submit URLs of videos for processing, which in turn creates an ECS task per URL
+running in Fargate to begin the transcription. A user can then view the video and follow the text in real time by
+clicking on the link provided by the UI.
+
+### Index
+
+* [Architecture](#architecture)
+* [Usage](#usage)
+  * [Prerequisites](#prerequisites)
+  * [Deployment](#deployment)
+  * [Accessing the application](#accessing-the-application)
+* [Remove the application](#remove-the-application)
+* [Contributing](#contributing)
+
+### Architecture
+
+The Transcribe News Media Analysis uses:
+* [Amazon Transcribe](https://aws.amazon.com/transcribe) for transcribing audio to text
+* [AWS Lambda](https://aws.amazon.com/lambda) and [Amazon ECS](https://aws.amazon.com/ecs) for computing
+* [Amazon DynamoDB](https://aws.amazon.com/dynamodb) for storage
+* [Amazon API Gateway](https://aws.amazon.com/api-gateway) and [Amazon Cognito](https://aws.amazon.com/cognito) for the API
+* [Amazon S3](https://aws.amazon.com/s3), [AWS Amplify](https://aws.amazon.com/amplify), and [React](https://reactjs.org) for the front-end layer
+
+An overview of the architecture is below:
+
+![Architecture](docs/arch_diagram.png)
+
+### Usage
+
+#### Prerequisites
+
+To deploy the application you will require an AWS account. If you don’t already have an AWS account, create one at by following the on-screen instructions. Your access to the AWS account must have IAM permissions to launch AWS CloudFormation templates that create IAM roles.
+
+To use the application you will require a browser.
+
+#### Deployment
 The application is deployed as an [AWS CloudFormation](https://aws.amazon.com/cloudformation) template.
@@ -14,7 +50,7 @@ The application is deployed as an [AWS CloudFormation](https://aws.amazon.com/cl
 You are responsible for the cost of the AWS services used while running this sample deployment. There is no additional cost for using this sample. For full details, see the pricing pages for each AWS service you will be using in this sample. Prices are subject to change.
 > **Note**
-This template will deploy a Front-end layer that will contain some public S3 objects. The deployment will fail if the Public Objects are blocked on an account level.
+This template will deploy a Front-end layer that will contain some public S3 objects. _The deployment will fail if the Public Objects are blocked at an account level._
 1. Deploy the latest CloudFormation template by following the link below for your preferred AWS region:
@@ -62,28 +98,19 @@ The application is accessed using a web browser. The address is the *url* output
 ### Remove the application
-To remove the application open the AWS CloudFormation Console, click the MediaAnalysis project, right-click and select "*Delete Stack*". Your stack will take some time to be deleted. You can track its progress in the "Events" tab. When it is done, the status will change from DELETE_IN_PROGRESS" to "DELETE_COMPLETE". It will then disappear from the list.
-
-### Transcriber
-
-To run the Transciber as a standalone application run the following shell commands:
+To remove the application:
-```bash
-cd /src/backend/transcriber
+1. Open the AWS CloudFormation Console
+1. Click the MediaAnalysis project, right-click and select "*Delete Stack*"
+1. Your stack will take some time to be deleted. You can track its progress in the "Events" tab.
+1. When it is done, the status will change from DELETE_IN_PROGRESS" to "DELETE_COMPLETE". It will then disappear from the list.
+When it is done, the status will change from DELETE_IN_PROGRESS" to "DELETE_COMPLETE". It will then disappear from
+the list.
-docker build -t transcriber .
+## Contributing
-docker run
---env AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
---env AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
---env TRANSCRIPTS_DYNAMO_DB_TABLE=MediaAnalysisTranscript
---env LOG_LEVEL=INFO
---env AWS_REGION=${AWS_REGION}
---env TASKS_DYNAMO_DB_TABLE=MediaAnalysisTasks
---env MEDIA_URL=${MEDIA_URL}
-transcriber java -jar -Dlog4j.configurationFile=log4j2.xml /transcriber.jar
-```
+Contributions are more than welcome. Please read the [code of conduct](CODE_OF_CONDUCT.md) and the [contributing guidelines](CONTRIBUTING.md).
 ## License
-This library is licensed under the MIT-0 License.
+This sample code is made available under a modified MIT license. See the LICENSE file.
diff --git a/docs/arch_diagram.png b/docs/arch_diagram.png
new file mode 100644
index 0000000..3e3a3ad
Binary files /dev/null and b/docs/arch_diagram.png differ
diff --git a/docs/arch_diagram.xml b/docs/arch_diagram.xml
new file mode 100644
index 0000000..1e80872
--- /dev/null
+++ b/docs/arch_diagram.xml
@@ -0,0 +1,2 @@
+
+
\ No newline at end of file
diff --git a/src/backend/transcriber/transcriber.iml b/src/backend/transcriber/transcriber.iml
index 6c41b0e..642c084 100644
--- a/src/backend/transcriber/transcriber.iml
+++ b/src/backend/transcriber/transcriber.iml
@@ -12,7 +12,8 @@
-
+
+
diff --git a/src/cfn/template.yaml b/src/cfn/template.yaml
index 5fdbdb5..db09b62 100644
--- a/src/cfn/template.yaml
+++ b/src/cfn/template.yaml
@@ -12,7 +12,7 @@ Globals:
     Runtime: nodejs10.x
     Environment:
       Variables:
-        VERSION: '0.11'
+        VERSION: '0.12'
 Parameters:
@@ -643,6 +643,74 @@ Resources:
               - Name: BuildOutput
             RunOrder: 1
+  EcsTaskExecutionRole:
+    Type: AWS::IAM::Role
+    Properties:
+      AssumeRolePolicyDocument:
+        Statement:
+          - Effect: Allow
+            Principal:
+              Service:
+                - ecs-tasks.amazonaws.com
+            Action:
+              - sts:AssumeRole
+      Path: /
+      ManagedPolicyArns:
+        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
+
+  TranscriberLogGroup:
+    Type: AWS::Logs::LogGroup
+    Properties:
+      LogGroupName: !Sub /ecs/${TaskName}
+      RetentionInDays: 7
+
+  TranscriberTaskIAMRole:
+    Type: AWS::IAM::Role
+    Properties:
+      AssumeRolePolicyDocument:
+        Statement:
+          - Effect: Allow
+            Principal:
+              Service: ecs-tasks.amazonaws.com
+            Action: sts:AssumeRole
+      Path: /
+      ManagedPolicyArns:
+        - arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess
+        - arn:aws:iam::aws:policy/AmazonTranscribeFullAccess
+
+  TranscriberTaskDefinition:
+    Type: AWS::ECS::TaskDefinition
+    Properties:
+      TaskRoleArn: !Ref TranscriberTaskIAMRole
+      ExecutionRoleArn: !GetAtt EcsTaskExecutionRole.Arn
+      NetworkMode: awsvpc
+      Memory: '4096'
+      Cpu: '2048'
+      Family: !Ref TaskName
+      RequiresCompatibilities:
+        - FARGATE
+      ContainerDefinitions:
+        - Name: !Ref TaskName
+          Essential: true
+          Image: !Sub ${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/${ECRRepository}:latest
+          LogConfiguration:
+            LogDriver: awslogs
+            Options:
+              awslogs-group: !Ref TranscriberLogGroup
+              awslogs-region: !Ref 'AWS::Region'
+              awslogs-stream-prefix: !Ref 'AWS::StackName'
+          Environment:
+            - Name: TRANSCRIPTS_DYNAMO_DB_TABLE
+              Value: !Ref TranscriptDynamoTable
+            - Name: RETRY_THRESHOLD
+              Value: !Ref RetryThreshold
+            - Name: TASKS_DYNAMO_DB_TABLE
+              Value: !Ref TasksDynamoTable
+            - Name: CLUSTER
+              Value: !Ref ECSCluster
+            - Name: AWS_REGION
+              Value: !Ref 'AWS::Region'
+
   # Data
   DbReadRole:
@@ -1078,74 +1146,6 @@ Resources:
     DependsOn:
       - Account
-  EcsTaskExecutionRole:
-    Type: AWS::IAM::Role
-    Properties:
-      AssumeRolePolicyDocument:
-        Statement:
-          - Effect: Allow
-            Principal:
-              Service:
-                - ecs-tasks.amazonaws.com
-            Action:
-              - sts:AssumeRole
-      Path: /
-      ManagedPolicyArns:
-        - arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy
-
-  TransciberLogGroup:
-    Type: AWS::Logs::LogGroup
-    Properties:
-      LogGroupName: !Sub /ecs/${TaskName}
-      RetentionInDays: 7
-
-  TransciberTaskIAMRole:
-    Type: AWS::IAM::Role
-    Properties:
-      AssumeRolePolicyDocument:
-        Statement:
-          - Effect: Allow
-            Principal:
-              Service: ecs-tasks.amazonaws.com
-            Action: sts:AssumeRole
-      Path: /
-      ManagedPolicyArns:
-        - arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess
-        - arn:aws:iam::aws:policy/AmazonTranscribeFullAccess
-
-  TransciberTaskDefinition:
-    Type: AWS::ECS::TaskDefinition
-    Properties:
-      TaskRoleArn: !Ref TransciberTaskIAMRole
-      ExecutionRoleArn: !GetAtt EcsTaskExecutionRole.Arn
-      NetworkMode: awsvpc
-      Memory: '4096'
-      Cpu: '2048'
-      Family: !Ref TaskName
-      RequiresCompatibilities:
-        - FARGATE
-      ContainerDefinitions:
-        - Name: !Ref TaskName
-          Essential: true
-          Image: !Sub ${AWS::AccountId}.dkr.ecr.${AWS::Region}.amazonaws.com/${ECRRepository}:latest
-          LogConfiguration:
-            LogDriver: awslogs
-            Options:
-              awslogs-group: !Ref TransciberLogGroup
-              awslogs-region: !Ref 'AWS::Region'
-              awslogs-stream-prefix: !Ref 'AWS::StackName'
-          Environment:
-            - Name: TRANSCRIPTS_DYNAMO_DB_TABLE
-              Value: !Ref TranscriptDynamoTable
-            - Name: RETRY_THRESHOLD
-              Value: !Ref RetryThreshold
-            - Name: TASKS_DYNAMO_DB_TABLE
-              Value: !Ref TasksDynamoTable
-            - Name: CLUSTER
-              Value: !Ref ECSCluster
-            - Name: AWS_REGION
-              Value: !Ref 'AWS::Region'
-
   WebUIBucket:
     Type: AWS::S3::Bucket
     Properties: