Description
Describe the feature
To improve performance, this library needs to implement at least three techniques:
- Use of HTTP/2
- io_uring
- RDMA and zero-copy buffers
Use Case
Any use case that requires high performance. The current implementation achieves at most 50% of what other S3 libraries achieve on identical hardware, which points to significant performance issues in this software stack.
Proposed Solution
AWS C S3 Library Enhancement Analysis: HTTP/2, io_uring, and RDMA Support
Current HTTP Implementation
The library currently uses HTTP/1.1 through the `aws-c-http` dependency. The code shows:
- HTTP connections are managed through `aws_http_connection_manager`
- The library uses traditional socket-based networking with TCP keepalive options
- Connection pooling and multiplexing are handled at the application layer
- The mock server tests confirm HTTP/1.1 usage (using Python's h11 library)
Implementation Strategy for Your Requirements
1. HTTP/2 Support
To enable HTTP/2, you'll need to:
Modify the HTTP layer dependency:
- Update `aws-c-http` to support HTTP/2 or integrate a different HTTP/2 library
- The connection management code in `source/s3_client.c` would need updates to handle HTTP/2 multiplexing
- Modify the request creation logic in the request message builders to use HTTP/2 headers
Key files to modify:
- `source/s3_client.c` (connection management)
- `include/aws/s3/private/s3_client_impl.h` (connection configuration)
- Request message creation functions
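To make the multiplexing point concrete: HTTP/2 lets several requests share one connection because every frame starts with a 9-byte header that carries a stream identifier. The following is a minimal sketch of that header layout as defined by the HTTP/2 specification. The `h2_frame_header` type and the two helper functions are hypothetical names for this illustration and are not part of `aws-c-http`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for this sketch; not an aws-c-http structure. */
struct h2_frame_header {
    uint32_t length;    /* payload length, 24 bits on the wire */
    uint8_t type;       /* frame type, e.g. 0x0 = DATA */
    uint8_t flags;      /* frame flags, e.g. 0x1 = END_STREAM on DATA */
    uint32_t stream_id; /* 31 bits; identifies which request the frame belongs to */
};

/* Serialize a header into the 9-byte wire format (big-endian fields). */
void h2_encode_header(const struct h2_frame_header *h, uint8_t out[9]) {
    assert(h->length <= 0xFFFFFFu);
    assert(h->stream_id <= 0x7FFFFFFFu);
    out[0] = (uint8_t)(h->length >> 16);
    out[1] = (uint8_t)(h->length >> 8);
    out[2] = (uint8_t)(h->length);
    out[3] = h->type;
    out[4] = h->flags;
    out[5] = (uint8_t)(h->stream_id >> 24); /* top bit is reserved and stays 0 */
    out[6] = (uint8_t)(h->stream_id >> 16);
    out[7] = (uint8_t)(h->stream_id >> 8);
    out[8] = (uint8_t)(h->stream_id);
}

/* Parse the 9-byte wire format, ignoring the reserved bit. */
struct h2_frame_header h2_decode_header(const uint8_t in[9]) {
    struct h2_frame_header h;
    h.length = ((uint32_t)in[0] << 16) | ((uint32_t)in[1] << 8) | (uint32_t)in[2];
    h.type = in[3];
    h.flags = in[4];
    h.stream_id = (((uint32_t)in[5] & 0x7Fu) << 24) | ((uint32_t)in[6] << 16) |
                  ((uint32_t)in[7] << 8) | (uint32_t)in[8];
    return h;
}
```

Two part uploads sharing a connection would simply emit DATA frames with different `stream_id` values, and the receiver demultiplexes on that field.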
2. io_uring Integration
For io_uring support to improve performance:
I/O layer modifications:
- Replace the current async I/O implementation with io_uring-based operations
- The library currently uses `aws-c-io` for I/O operations; this would need io_uring backend support
- Modify the buffer management and memory pooling system shown in `docs/memory_aware_request_execution.md`
Implementation areas:
- File I/O operations for uploads/downloads
- Network I/O for HTTP requests
- Buffer management for zero-copy operations
3. RDMA and Zero-Copy Support
For RDMA integration to eliminate memory copies:
Memory management overhaul:
- The current buffer pooling system in the library would need modification to support RDMA-registered memory
- Implement RDMA-aware buffer allocation that can be directly accessed by the network hardware
- Modify the part-based upload/download mechanism to use RDMA transfers
Key implementation points:
- Replace the current memory pool (`source/s3_client.c` shows 8MB default part sizes)
- Implement RDMA connection management alongside HTTP connections
- Create zero-copy paths for data transfer between user space and S3
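As a rough illustration of what "RDMA-aware buffer allocation" could look like, here is a minimal sketch of a fixed-size pool that carves page-aligned buffers out of one contiguous region. A single contiguous, aligned region is convenient because it could be registered once, for example with `ibv_reg_mr` from libibverbs or `io_uring_register_buffers` from liburing. The registration itself is not performed here. The `fixed_pool` type, its functions, and the 4096-byte page size are assumptions for this sketch and do not reflect the library's existing buffer pool.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define POOL_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Hypothetical pool type; unrelated to the library's own buffer pool. */
struct fixed_pool {
    void *base;         /* one contiguous, page-aligned region */
    size_t buffer_size; /* size of each buffer, a multiple of the page size */
    size_t capacity;    /* total number of buffers */
    size_t free_count;  /* number of entries currently in free_list */
    size_t *free_list;  /* stack of free buffer indices */
};

/* Returns 0 on success, -1 on allocation failure. */
int fixed_pool_init(struct fixed_pool *pool, size_t buffer_size, size_t capacity) {
    assert(buffer_size > 0 && capacity > 0);
    assert(buffer_size % POOL_PAGE_SIZE == 0);
    if (posix_memalign(&pool->base, POOL_PAGE_SIZE, buffer_size * capacity) != 0) {
        return -1;
    }
    pool->free_list = malloc(capacity * sizeof(size_t));
    if (pool->free_list == NULL) {
        free(pool->base);
        return -1;
    }
    for (size_t i = 0; i < capacity; ++i) {
        pool->free_list[i] = i;
    }
    pool->buffer_size = buffer_size;
    pool->capacity = capacity;
    pool->free_count = capacity;
    /* A real integration would register [base, base + buffer_size * capacity) here. */
    return 0;
}

/* Returns a buffer, or NULL when the pool is exhausted. */
void *fixed_pool_acquire(struct fixed_pool *pool) {
    if (pool->free_count == 0) {
        return NULL;
    }
    size_t index = pool->free_list[--pool->free_count];
    return (char *)pool->base + index * pool->buffer_size;
}

/* Returns a buffer previously handed out by fixed_pool_acquire. */
void fixed_pool_release(struct fixed_pool *pool, void *buffer) {
    size_t offset = (size_t)((char *)buffer - (char *)pool->base);
    assert(offset % pool->buffer_size == 0);
    size_t index = offset / pool->buffer_size;
    assert(index < pool->capacity && pool->free_count < pool->capacity);
    pool->free_list[pool->free_count++] = index;
}

void fixed_pool_clean_up(struct fixed_pool *pool) {
    free(pool->free_list);
    free(pool->base);
}
```

Returning `NULL` on exhaustion instead of growing the pool keeps the registered region fixed, which matters because re-registering memory with the NIC is typically expensive.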
Implementation Approach
- Start with HTTP/2: This is the most straightforward enhancement and will provide immediate performance benefits through connection multiplexing.
- Add io_uring support: This can be done incrementally, starting with file I/O operations and then extending to network operations.
- Implement RDMA: This is the most complex change and would require significant architectural modifications to support zero-copy operations.
Current Library Architecture Analysis
The library's current architecture with its connection pooling, part-based transfers, and async I/O foundation provides a good starting point for these enhancements. Key characteristics:
- Default part size: 8MB (configurable)
- Connection management: Uses connection pooling with automatic scaling
- Memory management: Custom buffer pooling system for efficient memory usage
- Async operations: Thread pool-based async I/O system
- Request splitting: Automatic parallel chunk processing
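The part-based splitting described above comes down to simple range arithmetic. The sketch below shows one way to compute how many parts an object needs and which inclusive byte range each part covers, as would be used in an HTTP `Range: bytes=first-last` header for a ranged download. The function and type names are hypothetical and the 8 MiB constant mirrors the default mentioned above. The library's actual splitting logic may differ.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DEFAULT_PART_SIZE (8ULL * 1024ULL * 1024ULL) /* 8 MiB */

/* Inclusive byte range covered by one part. Hypothetical type for this sketch. */
struct part_range {
    uint64_t first_byte;
    uint64_t last_byte;
};

/* Number of parts needed to cover object_size bytes (0 for an empty object). */
uint64_t part_count(uint64_t object_size, uint64_t part_size) {
    assert(part_size > 0);
    return object_size / part_size + (object_size % part_size != 0 ? 1 : 0);
}

/* Byte range of the part at zero-based part_index; the last part may be short. */
struct part_range part_range_for(uint64_t object_size, uint64_t part_size, uint64_t part_index) {
    assert(part_index < part_count(object_size, part_size));
    struct part_range range;
    range.first_byte = part_index * part_size;
    uint64_t end_exclusive = range.first_byte + part_size;
    if (end_exclusive > object_size) {
        end_exclusive = object_size;
    }
    range.last_byte = end_exclusive - 1;
    return range;
}
```

For a 20 MiB object with the default part size this yields three parts, the last one covering only the final 4 MiB. Each range can then be fetched or uploaded in parallel on its own connection or stream.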
Technical Implementation Details
HTTP/2 Migration Path
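// Simplified sketch of the connection manager setup with HTTP/2 options added.
// Field names follow aws-c-http's connection_manager.h; verify them against the version in use.
// http2_settings and num_http2_settings are placeholders for an aws_http2_setting array and its length.
struct aws_http_connection_manager_options http_options = {
    .initial_window_size = initial_window_size,
    .socket_options = &socket_options,
    // For TLS connections, HTTP/2 is selected by offering "h2" via ALPN in the TLS options
    .tls_connection_options = tls_options,
    // HTTP/2-specific initial settings
    .initial_settings_array = http2_settings,
    .num_initial_settings = num_http2_settings,
};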
io_uring Integration Points
- File I/O operations in upload/download paths
- Network socket operations for HTTP requests
- Buffer management for zero-copy file-to-network transfers
RDMA Implementation Strategy
- RDMA connection management parallel to HTTP connections
- Memory registration for S3 transfer buffers
- Protocol negotiation to determine RDMA capability
- Fallback mechanisms for non-RDMA endpoints
Benefits Expected
- HTTP/2: Reduced connection overhead, better multiplexing, header compression
- io_uring: Lower CPU usage, reduced system call overhead, better I/O efficiency
- RDMA: Zero-copy transfers, reduced memory bandwidth usage, lower latency
Dependencies and Prerequisites
- HTTP/2: Update or replace `aws-c-http` dependency
- io_uring: Linux kernel 5.1+ support, `liburing` integration
- RDMA: InfiniBand or RoCE hardware, `libibverbs` integration
The modular design with separate HTTP, I/O, and client layers makes it possible to implement these features incrementally while maintaining backward compatibility.
Other Information
No response
Acknowledgements
- I may be able to implement this feature request
- This feature might incur a breaking change