Merge pull request #10 from apls777/dev
availability zone, retain deletion policy, auto-resize
Showing 9 changed files with 299 additions and 302 deletions.
@@ -1,208 +1,71 @@
# Spotty

Spotty helps you to train deep learning models on [AWS Spot Instances](https://aws.amazon.com/ec2/spot/).

You don't need to spend time on:
- manually starting Spot Instances
- installation of NVIDIA drivers
- managing snapshots and AMIs
- detaching remote processes from your SSH sessions

Just start an instance using the following command:
```bash
$ spotty start
```
It will run a Spot Instance, restore snapshots if any, synchronize the project with the instance
and start a Docker container with the environment.

Then train your model:
```bash
$ spotty run train
```
It runs your custom training command inside the Docker container. The remote connection uses
[tmux](https://github.com/tmux/tmux/wiki), so you can close the connection and come back to the running process any time later.

Connect to the container if necessary:
```bash
$ spotty ssh
```
It uses a [tmux](https://github.com/tmux/tmux/wiki) session, so you can always detach the session using
the `Ctrl`+`b`, then `d` key combination and attach to that session later using the `$ spotty ssh` command again.
Spotty simplifies training of Deep Learning models on AWS:

## Installation
- it makes training on AWS GPU instances as simple as training on your local computer
- it automatically manages all necessary AWS resources including AMIs, volumes and snapshots
- it makes your model trainable on AWS by everyone with a couple of commands
- it detaches remote processes from SSH sessions
- it saves you up to 70% of the costs by using Spot Instances

## Documentation

To install Spotty, use the [pip](http://www.pip-installer.org/en/latest/) package manager:
- See the [wiki section](https://github.com/apls777/spotty/wiki) for the documentation.
- Read [this](https://medium.com/@apls/how-to-train-deep-learning-models-on-aws-spot-instances-using-spotty-8d9e0543d365)
  article on Medium for a real-world example.

    $ pip install --upgrade spotty
## Installation

Requirements:
* Python 3
* AWS CLI (see [Installing the AWS Command Line Interface](http://docs.aws.amazon.com/cli/latest/userguide/installing.html))

## Configuration

By default, Spotty looks for a `spotty.yaml` file in the root directory of the project.
Here is a basic example of such a file:

```yaml
project:
  name: MyProjectName
  remoteDir: /workspace/project
instance:
  region: us-east-2
  instanceType: p2.xlarge
  volumes:
    - snapshotName: MySnapshotName
      directory: /workspace
      size: 10
  docker:
    image: tensorflow/tensorflow:latest-gpu-py3
```
### Available Parameters
__`project`__ section:
- __`name`__ - the name of your project. It will be used to create an S3 bucket and a CloudFormation stack to run
an instance.
- __`remoteDir`__ - directory where your project will be stored on the instance. It's usually a directory
on the attached volume (see the "instance" section).
- __`syncFilters`__ _(optional)_ - filters to skip some directories or files during synchronization. By default, all project files
will be synced with the instance. Example:

  ```yaml
  syncFilters:
    - exclude:
        - .idea/*
        - .git/*
        - data/*
    - include:
        - data/test/*
    - exclude:
        - data/test/config
  ```

  It will skip the ".idea/", ".git/" and "data/" directories, except for the "data/test/" directory. All files from the
  "data/test/" directory will be synced with the instance except the "data/test/config" file.

  You can read more about filters
  here: [Use of Exclude and Include Filters](https://docs.aws.amazon.com/cli/latest/reference/s3/index.html#use-of-exclude-and-include-filters).

__`instance`__ section (a combined example of these parameters follows the list below):
- __`region`__ - the region where you are going to run the instance (you can use the `spotty spot-prices` command to find the
cheapest region).
- __`instanceType`__ - the type of the instance to run. You can find more information about
types of GPU instances here:
[Recommended GPU Instances](https://docs.aws.amazon.com/dlami/latest/devguide/gpu.html).
- __`amiName`__ _(optional)_ - name of the AMI with NVIDIA Docker (the default value is "SpottyAMI"). Use the
`spotty create-ami` command to create it. This AMI will be used to run your application inside the Docker container.
- __`maxPrice`__ _(optional)_ - the maximum price per hour that you are willing to pay for a Spot Instance. By default, it's the
On-Demand price for the chosen instance type. Read more here:
[Spot Instances](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html).
- __`rootVolumeSize`__ _(optional)_ - size of the root volume in GB. The root volume will be destroyed once
the instance is terminated. Use attached volumes to store the data you need to keep (see the "volumes" parameter below).
- __`volumes`__ _(optional)_ - the list of volumes to attach to the instance:
    - __`snapshotName`__ _(optional)_ - name of the snapshot to restore. If a snapshot with this name doesn't exist,
    it will be created from the volume once the instance is terminated.
    - __`directory`__ - directory where the volume will be mounted.
    - __`size`__ _(optional)_ - size of the volume in GB. The size of the volume cannot be less than the size of an existing snapshot, but
    can always be increased.
    - __`deletionPolicy`__ _(optional)_ - possible values are "__update_snapshot__" _(the default)_,
    "__create_snapshot__" and "__delete__". With "__update_snapshot__", a new snapshot with the
    same name will be created and the original snapshot will be deleted. With "__create_snapshot__", a new snapshot
    will be created and the original snapshot will be renamed. With "__delete__", the volume will be deleted without
    creating a snapshot.
- __`docker`__ - Docker configuration:
    - __`image`__ _(optional)_ - the name of the Docker image that contains the environment for your project. For example,
    you could use the [TensorFlow image for GPU](https://hub.docker.com/r/tensorflow/tensorflow/)
    (`tensorflow/tensorflow:latest-gpu-py3`). It already contains NumPy, SciPy, scikit-learn, pandas, Jupyter Notebook and
    TensorFlow itself. If you need to use your own image, you can specify the path to your Dockerfile in the
    __`file`__ parameter (see below), or push your image to [Docker Hub](https://hub.docker.com/) and use its name.
    - __`file`__ _(optional)_ - relative path to your custom Dockerfile. For example, you could take the TensorFlow image as a
    base one and add the [AWS CLI](https://github.com/aws/aws-cli) there to be able to download your datasets from S3:

      ```dockerfile
      FROM tensorflow/tensorflow:latest-gpu-py3
      RUN pip install --upgrade \
          pip \
          awscli
      ```
    - __`workingDir`__ _(optional)_ - working directory for your custom scripts (see the "scripts" section below).
    - __`dataRoot`__ _(optional)_ - directory where Docker will store all downloaded and built images. You could cache
    images on your attached volume to avoid downloading them from the internet or building your custom image from scratch
    every time you start an instance.
    - __`commands`__ _(optional)_ - commands that should be performed once your container is started. For example, you
    could download your datasets from an S3 bucket to the project directory (see the "project" section):

      ```yaml
      commands: |
        aws s3 sync s3://my-bucket/datasets/my-dataset /workspace/project/data
      ```
    - __`ports`__ _(optional)_ - list of ports to open. For example:

      ```yaml
      ports: [6006, 8888]
      ```
      It will open port 6006 for TensorBoard and port 8888 for Jupyter Notebook.
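
For illustration, here is a sketch of an `instance` section that combines the optional parameters described above. The parameter names come from this section; the concrete values, bucket and snapshot names are placeholders, not defaults:

```yaml
instance:
  region: us-east-2
  instanceType: p2.xlarge
  amiName: SpottyAMI                    # AMI created with "spotty create-ami"
  maxPrice: 0.9                         # assumed numeric format: max price per hour, in USD
  rootVolumeSize: 20
  volumes:
    - snapshotName: MySnapshotName
      directory: /workspace
      size: 50
      deletionPolicy: create_snapshot   # create a new snapshot, keep (rename) the original one
  docker:
    image: tensorflow/tensorflow:latest-gpu-py3
    workingDir: /workspace/project
    dataRoot: /workspace/docker         # cache Docker images on the attached volume
    commands: |
      aws s3 sync s3://my-bucket/datasets/my-dataset /workspace/project/data
    ports: [6006, 8888]
```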

__`scripts`__ section _(optional)_:
- This section contains custom scripts that can be run using the `spotty run <SCRIPT_NAME>`
command. The following example defines the scripts `train`, `jupyter` and `tensorboard`:

```yaml
project:
  ...
instance:
  ...
scripts:
  train: |
    PYTHONPATH=/workspace/project
    python /workspace/project/model/train.py --num-layers 3
  jupyter: |
    /run_jupyter.sh --allow-root
  tensorboard: |
    tensorboard --logdir /workspace/outputs
```
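
With a configuration like the one above, a script is launched by its name. For instance, the `jupyter` script would be started with:

```bash
$ spotty run jupyter
```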
Use [pip](http://www.pip-installer.org/en/latest/) to install or upgrade Spotty:

## Available Commands
    $ pip install -U spotty

- `$ spotty start`

Runs a Spot Instance, synchronizes the project with that instance and starts a Docker container.
## Get Started

- `$ spotty stop`
1. Prepare a `spotty.yaml` file for your project.

Terminates the running instance and creates snapshots of the attached volumes.
- See the file specification [here](https://github.com/apls777/spotty/wiki/Configuration-File).
- Read [this](https://medium.com/@apls/how-to-train-deep-learning-models-on-aws-spot-instances-using-spotty-8d9e0543d365)
  article for a real-world example.

- `$ spotty run <SCRIPT_NAME> [--session-name <SESSION_NAME>]`
2. Create an AMI with NVIDIA Docker. Run the following command from the root directory of your project
(where the `spotty.yaml` file is located):

Runs a custom script inside the Docker container (see the "scripts" section in [Available Parameters](#Available-Parameters)).

Use the `Ctrl`+`b`, then `d` key combination to detach from the SSH session. The script will keep running.
Call `$ spotty run <SCRIPT_NAME>` again to reattach to the running script.
Read more about tmux here: [tmux Wiki](https://github.com/tmux/tmux/wiki).

If you need to run the same script several times in parallel, use the `--session-name` parameter to
specify different names for the tmux sessions.
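
For example, two training runs could be kept in separate tmux sessions; the session names here are just placeholders:

```bash
$ spotty run train --session-name train-1
$ spotty run train --session-name train-2
```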
```bash
$ spotty create-ami
```

- `$ spotty ssh [--host-os]`
In several minutes you will have an AMI that can be used for all your projects within the AWS region.

Connects to the running Docker container or to the instance itself. Use the `--host-os` parameter to connect to the
host OS instead of the Docker container.
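
For example, to get a shell on the host OS rather than inside the container:

```bash
$ spotty ssh --host-os
```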
3. Start an instance:

- `$ spotty sync`
```bash
$ spotty start
```

It will run a Spot Instance, restore snapshots if any, synchronize the project with the running instance
and start the Docker container with the environment.

4. Train a model or run notebooks.

Synchronizes the project with the running instance. The first time it happens automatically when you start an instance,
but you can always use this command to update the project if an instance is already running.
You can run custom scripts inside the Docker container using the `spotty run <SCRIPT_NAME>` command. Read more
about custom scripts in the documentation:
[Configuration File: "scripts" section](https://github.com/apls777/spotty/wiki/Configuration-File#scripts-section-optional).

To connect to the running container via SSH, use the following command:

```bash
$ spotty ssh
```

- `$ spotty create-ami`

Creates an AMI with NVIDIA Docker. You need to call this command only once when you start using Spotty; afterwards you
can reuse the created AMI for all your projects.

- `$ spotty delete-ami`

Deletes an AMI that was created using the command above.

- `$ spotty spot-prices [--instance-type <INSTANCE_TYPE>]`
It runs a [tmux](https://github.com/tmux/tmux/wiki) session, so you can always detach the session using the
__`Ctrl + b`__, then __`d`__ key combination. To reattach to that session later, just use the
`spotty ssh` command again.

Returns Spot Instance prices for a particular instance type across all AWS regions. The results are sorted by price.
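
For example, to compare Spot prices for the `p2.xlarge` instance type across regions:

```bash
$ spotty spot-prices --instance-type p2.xlarge
```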
## License | ||
|
||
All the commands have parameter `--config` that can be used to specify a path to configuration file. By default it's | ||
looking for a file `spotty.yaml` in the current working directory. | ||
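
For example, a command could be pointed at a configuration file in a different location; the path here is hypothetical:

```bash
$ spotty start --config configs/spotty-dev.yaml
```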
[MIT License](LICENSE)
@@ -1 +1 @@
__version__ = '1.0.8'
__version__ = '1.1.0'