v1 Design Document
@mattBrzezinski on GitHub and the Julia Slack
- Make using AWS easy for the average Julia user
- Use automation and code generation as much as possible
- Create a simple, straightforward system design
Using AWS Services in Julia is currently more difficult than it needs to be.
Users are currently limited to either the low-level API wrappers, which require knowing the service, request type, and URI as outlined by Amazon, or a high-level wrapper package, which may or may not be available for the service they want to use.
Updating these API packages is a manual and undocumented process.
This document proposes a system which can automatically update low and high level API wrappers for AWS Services.
It also uses one of Julia's key features, multiple dispatch, to dispatch on the request type rather than having an individual function for each AWS Service.
These changes will allow JuliaCloud to always have an up-to-date package with the latest Amazon Service APIs.
There are two categories of packages currently supporting AWS usage in JuliaCloud: low-level API wrappers and high-level wrapper packages.
AWSCore.jl is the most popular low-level package.
The package consists of five major files:
- AWSAPI.jl: Generates the Services.jl file, which contains the low-level API wrappers for each AWS Service
- AWSCore.jl: Processes json, query, rest-xml, and rest-json request protocols
- AWSCredentials.jl: Handles retrieving AWS Credentials from locations such as environment variables, credential / configuration files, etc.
- Services.jl: Contains a function for every AWS Service
- signaturev4.jl: Creates the AWS4AuthLayer to be inserted into the HTTP stack and signs the requests with AWS authentication
Inside of Services.jl each AWS Service has its own respective function, which is used to call it:
    function s3(aws::AWSConfig, verb, resource, args=[])
        AWSCore.service_rest_xml(
            aws;
            service = get(aws, :service_name, "s3"),
            version = "2006-03-01",
            verb = verb,
            resource = resource,
            args = args)
    end
AWSCore.jl works by running a Node.js server which parses the AWS SDK JS to create definitions for each AWS Service.
To use the package it is then up to the end user to know how to call the appropriate operation, which can be done by referencing the AWS Documentation.
e.g. ListBuckets operation on AWS S3
    using AWSCore
    using AWSCore.Services
    Services.s3(aws_config(), "GET", "/")
Having functions defined for each service in this form does not take advantage of multiple dispatch.
In its current state there are no documented steps to update the Services.jl file.
If Amazon releases a new service, or updates the API for an existing service, the process of updating Services.jl needs to be done manually.
High-level wrapper packages are much simpler to use, as the end user only needs to know the operation they wish to perform. However, these packages are currently handwritten, limited to certain AWS Services, and prone to errors and/or limited in functionality.
To use a package such as AWSS3.jl, the end user only needs to know how to call the operation.
e.g. ListBuckets operation on AWS S3
    using AWSS3
    s3_list_buckets()
AWSSDK.jl is a package which contains high-level API definitions for all Amazon Services.
However, because it contains every service as its own module, loading this package is quite cumbersome.
I propose that we tag the current version of AWSCore.jl, and begin working on v1.
The v1 release would consist of:
- Taking advantage of Julia's multiple dispatch for making AWS service requests
- Automating the creation and updating of service definitions using GitHub actions
- Low and High Level API wrappers
After the release of v1, the archival and deprecation of other low and high level wrapper packages can occur.
The proposed architecture for a system would look like:
AWSCore.jl would hold the structs for each type of request we will dispatch on.
It will also contain its current functionality of making the requests themselves (JSON, REST-XML, etc.).
These request functions can be used as an entry point, however they are not the recommended route.
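As a rough illustration of dispatching on the request type, here is a minimal sketch. Only the type names RestXMLService and JSONService come from this document; the Service supertype, the struct fields, and the build_request function are hypothetical, and the sketch builds a description of the request instead of sending it.

```julia
# Hypothetical request types. The fields are assumptions made for this sketch.
abstract type Service end

struct RestXMLService <: Service
    name::String
    api_version::String
end

struct JSONService <: Service
    name::String
    api_version::String
    json_version::String
    target::String
end

# One generic function with one method per request type. Julia selects the
# method from the type of `service`, so no per-service function is needed.
function build_request(service::RestXMLService, verb, resource)
    return (protocol = "rest-xml", service = service.name,
            version = service.api_version, verb = verb, resource = resource)
end

function build_request(service::JSONService, operation, args)
    headers = Dict(
        "X-Amz-Target" => "$(service.target).$operation",
        "Content-Type" => "application/x-amz-json-$(service.json_version)",
    )
    return (protocol = "json", service = service.name,
            version = service.api_version, headers = headers, body = args)
end

# Making the structs callable gives the `s3("GET", "/")` call style.
(service::RestXMLService)(verb, resource) = build_request(service, verb, resource)
(service::JSONService)(operation, args) = build_request(service, operation, args)

s3 = RestXMLService("s3", "2006-03-01")
sagemaker = JSONService("api.sagemaker", "2017-07-24", "1.1", "SageMaker")

println(s3("GET", "/"))
println(sagemaker("ListEndpoints", Dict{String,Any}()))
```

Supporting a new protocol would then mean adding one struct and one build_request method, without touching the existing services.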
This file will be auto-generated by AWSMetadata.jl.
It will contain the low-level API wrapper objects for each service.
This will be the entry point for the low-level API wrapper.
i.e.
    module AWSCoreServices
    # ...
    const sagemaker_runtime = AWSCorePrototype.RestXMLService("runtime.sagemaker", "2017-05-13")
    const s3 = AWSCorePrototype.RestXMLService("s3", "2006-03-01")
    const s3_control = AWSCorePrototype.RestXMLService("s3-control", "2018-08-20")
    const sagemaker = AWSCorePrototype.JSONService("api.sagemaker", "2017-07-24", "1.1", "SageMaker")
    # ...
    end
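A minimal sketch of how one such definition line could be generated, assuming the metadata block of a service definition has already been parsed into a dictionary. The keys protocol, endpointPrefix, apiVersion, jsonVersion, and targetPrefix follow the AWS SDK JS definitions; the function name, the protocol-to-type mapping, and the RestJSONService and QueryService names are hypothetical.

```julia
# Assumed mapping from AWS protocol name to a (hypothetical) service type name.
const PROTOCOL_TYPES = Dict(
    "rest-xml"  => "RestXMLService",
    "rest-json" => "RestJSONService",
    "json"      => "JSONService",
    "query"     => "QueryService",
)

# Build the source line for one low-level service object.
# `metadata` is the already-parsed "metadata" block of a service definition.
function service_definition(name::AbstractString, metadata::AbstractDict)
    protocol = metadata["protocol"]
    # An unknown protocol means the generator itself needs to be extended.
    haskey(PROTOCOL_TYPES, protocol) ||
        error("unsupported protocol \"$protocol\" for service \"$name\"")

    args = [metadata["endpointPrefix"], metadata["apiVersion"]]
    if protocol == "json"
        push!(args, metadata["jsonVersion"], metadata["targetPrefix"])
    end

    quoted = join(repr.(args), ", ")
    return "const $name = AWSCorePrototype.$(PROTOCOL_TYPES[protocol])($quoted)"
end

s3_metadata = Dict(
    "protocol" => "rest-xml",
    "endpointPrefix" => "s3",
    "apiVersion" => "2006-03-01",
)
println(service_definition("s3", s3_metadata))
```

Failing loudly on an unrecognised protocol is one possible way to surface the case where Amazon introduces a protocol the generator does not yet handle.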
These files will be auto-generated by AWSMetadata.jl.
Each file will be a sub-module for an AWS Service and contain high-level wrappers for each operation of that service.
These will be the entry points for the high-level API wrappers.
Since these files contain a large number of functions, including all of them in the AWSCore module would take a substantial amount of time.
Instead, each file will be loaded on demand by calling a macro that generates the module and includes the service file when needed.
i.e.
    module S3
    # ...
    ListObjects(Bucket) = s3("GET", "/$Bucket")
    ListObjectVersions(Bucket) = s3("GET", "/$Bucket?versions")
    # The "+" in the SDK's {Key+} label only marks a greedy path parameter, so it is not part of the URI
    HeadObject(Bucket, Key) = s3("HEAD", "/$Bucket/$Key")
    PutBucketAcl(Bucket) = s3("PUT", "/$Bucket?acl")
    # ...
    end
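The on-demand loading could be sketched roughly as follows. The macro name @service matches the usage in this document, but the directory layout, the file naming, and the stand-in service file written to a temporary directory are assumptions made so the example is self-contained.

```julia
# Hypothetical directory holding the generated high-level service files.
const SERVICES_DIR = mktempdir()

# Stand-in for a generated file. Real wrappers would call the low-level
# service object instead of returning a tuple.
write(joinpath(SERVICES_DIR, "s3.jl"), """
ListBuckets() = ("GET", "/")
ListObjects(Bucket) = ("GET", "/\$Bucket")
""")

# Generate a module named `name` and include the matching service file in it.
# Module definitions are only valid at top level, hence the :toplevel wrapper.
macro service(name::Symbol)
    path = joinpath(SERVICES_DIR, lowercase(string(name)) * ".jl")
    body = quote
        include($path)
    end
    return esc(Expr(:toplevel, Expr(:module, true, name, body)))
end

@service S3

println(S3.ListObjects("my-bucket"))
```

Only the services a user asks for are parsed and compiled, which is the point of keeping them out of the main module.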
AWSMetadata.jl contains all the functions for updating both the low and high level API wrappers.
metadata.json is used in tandem to hold the SHA hashes for each version, as well as their API Versions.
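A minimal sketch of how stored hashes could drive the update decision, assuming the hash is a SHA-256 of the raw service definition text. The document does not say which SHA is used, and the ServiceMeta struct and function names here are hypothetical.

```julia
using SHA  # Julia standard library

# Hypothetical shape of one metadata.json entry after parsing.
struct ServiceMeta
    api_version::String
    sha::String
end

# Assumption: the stored hash is SHA-256 over the definition file's contents.
content_hash(definition::AbstractString) = bytes2hex(sha256(definition))

# A service needs regenerating when it is new or its definition has changed.
function needs_update(metadata::AbstractDict, service::AbstractString,
                      definition::AbstractString)
    entry = get(metadata, service, nothing)
    return entry === nothing || entry.sha != content_hash(definition)
end

stored = Dict("s3" => ServiceMeta("2006-03-01", content_hash("{}")))

println(needs_update(stored, "s3", "{}"))              # unchanged definition
println(needs_update(stored, "s3", "{\"changed\":1}")) # modified definition
println(needs_update(stored, "sqs", "{}"))             # service not seen before
```

Comparing hashes keeps the daily job cheap: only services whose definitions actually changed would have their wrappers regenerated.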
We can use GitHub Actions to automatically create or update AWS Service APIs on a daily basis. We can also use GitHub Actions to trigger alarms and gather metrics.
Low-level wrapper usage would look similar to the current AWSCore.jl:
    using AWSCorePrototype.Services: s3
    buckets = s3("GET", "/")
    println(buckets)
High-level wrapper usage would load a service module on demand through the macro:
    using .AWSCorePrototype: @service
    @service S3
    using .S3
    buckets = S3.ListBuckets()
    println(buckets)
- Code-generate low and high level wrappers
- Use multiple dispatch on a request type
- Archive single AWS Service high-level wrappers, and other low-level wrapper packages
- Increase code coverage of unit tests for each AWS Service
- Handling other cloud service providers such as Azure, or Google Cloud Platform
- Decrease the size of the code base
- Increase the performance of making requests to AWS
- Get code coverage for unit tests to 100%
To automate the creation of high and low level wrappers in Julia we must pull AWS Service definitions from an external source.
The JavaScript SDK is the simplest to parse, as all service definitions are defined as JSON files, while other SDKs define them on a per language basis.
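Since each definition file name carries the service name and API version, the latest version per service could be picked out roughly as below. This is a sketch that assumes the `<service>-<YYYY-MM-DD>.normal.json` naming used in the SDK's apis directory; the function name is hypothetical.

```julia
# Assumed file-name layout: "<service>-<YYYY-MM-DD>.normal.json".
const API_FILE = r"^(?<service>.+)-(?<version>\d{4}-\d{2}-\d{2})\.normal\.json$"

# Map each service to its most recent API version.
function latest_versions(files)
    latest = Dict{String,String}()
    for file in files
        m = match(API_FILE, file)
        m === nothing && continue  # skip paginators, minified copies, etc.
        service, version = String(m[:service]), String(m[:version])
        # ISO dates sort chronologically when compared as plain strings.
        if version > get(latest, service, "")
            latest[service] = version
        end
    end
    return latest
end

files = [
    "s3-2006-03-01.normal.json",
    "s3-2006-03-01.min.json",
    "cloudfront-2018-11-05.normal.json",
    "cloudfront-2019-03-26.normal.json",
]
println(latest_versions(files))
```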
We also need some service to run the code which will automate the creation or updating of a service, such as GitHub Actions.
Certain actions are already created and can simplify this process.
We will need to depend on other Julia packages. A short list of them would be:
- DataStructures.jl
- EzXML.jl
- HTTP.jl
- IniFile.jl or some other well maintained alternative?
- JSON3.jl or some other well maintained alternative?
- MbedTLS.jl
- Mocking.jl
- Retry.jl or some other well maintained alternative?
- Sockets.jl
- Users should be able to easily use the package, and only load the necessary modules for their code.
- The package should have a well defined API and design.
- Making calls to AWS should be performant.
- Decrease the time between Amazon launching a service, and the API being available in Julia.
- The only manual steps in updating API wrapper definitions should be creating new unit tests and reviewing the generated changes.
- The system should be well documented, such that anyone in the Julia community can have a good understanding of the system components and is able to contribute to the repository.
- The system should be extendable so that new AWS Services can automatically be created.
- How often are services being updated / created?
- Is checking daily a good time frame?
- How do we handle metrics, GitHub Actions?
- If a new protocol (not REST-XML, JSON, REST-JSON, or Query) is being used for a Service we should trigger an alarm. Amazon is now making a new service and the auto-generation code needs to accommodate the new protocol.
- Should we attempt to automate the creation of unit tests as well?
- I would argue no, and that these should be created by hand.
- It would be tedious, however it will give us the backing that the generated code is correct.
- It would also be complex to know pre-requisites for certain operations
- i.e. To add a SecurityGroup to an EC2 Instance you'd first need an Instance in a VPC
- Do we go all in on GitHub Actions?
- We can replace TravisCI (what is currently used) to run tests
- How do we deal with bors?
- Quick overview I found between the two, link
- Which Julia JSON package should we be using?
- Prototype reading JSON for an AWS Service API and compare the performance
- What are the pros/cons of each of them?
- List of Julia JSON packages
- Should we move away from using XMLDict.jl?
- It's very very convenient to use
- How should optional parameters be passed in?
- As a Dictionary? Raw XML for REST-XML calls? Both?
- LittleDict?
- api_files.txt is a list of all files in the AWS SDK JS APIs
    import re
    # Collect unique service names by cutting each file name at the "-<digit>" that starts its version
    with open("api_files.txt") as f:
        services = set()
        for line in f:
            services.add(re.split(r"-\d", line)[0])
    print(len(services))  # 220 (as of 2019-12-17)