Summary
The purpose of these updates is to allow Portico to operate in extremely large federations that mix high- and low-capability devices (in both a computational and a network sense).
The ultimate goal is to put in place a structure that can support:
- 1000+ federates
- A blend of server-class and low-power (IoT) devices
- Sub-clusters of high-intensity federates, supported in a way that doesn't degrade the QoS for the rest of the federation
Background
As part of an ongoing process working with the US National Institute of Standards and Technology (NIST), we have been looking at ways to enable Portico to serve both large federations (>100 federates) and those that simultaneously contain federates running on high-powered infrastructure alongside those that may be running on low-powered or bandwidth-constrained devices (such as IoT appliances).
The particular simulations that NIST envisages supporting as part of its UCEF initiative (Universal CPS Environment for Federation, where CPS is Cyber-Physical Systems) bring requirements that are challenging to meet and which the current communications architecture certainly cannot stretch to accommodate. These include:
- extremely large numbers of federates (1000+)
- federates spread geographically
- spread across multiple control domains
- federations containing small sub-clusters of "high-intensity" federates
- devices that exchange considerable data among themselves, but of which only a limited amount is useful outside the sub-cluster
- a mix of high-power (server grade) components and low-power (IoT) devices
Portico requires changes to support these sorts of federations. The updates necessary to support the extreme end of these environments must also be done in a manner that doesn't impact the ease-of-use of the current fully-distributed, serverless model.
While wonderfully simple in many ways, this serverless model has caused a number of problems that we will ultimately be able to address at the same time as this work:
- Federation join process can be unreliable when a number of federates attempt to start simultaneously
- As we are fully decentralized, each federate must track the activities of every other federate
- This creates some memory footprint issues
- This can cause unnecessary CPU consumption just to keep track of the accounting required to participate in a federation
- No easy way to see the federates involved in a federation without joining it
- Multicast-related network configuration issues commonly result in federates not being able to see one another
High-Level Design
The high-level design for this structure can be seen in the following diagram:
Key points to note here are:
- There will be a central RTI server process (with the option to transparently auto-start it inside the first federate)
- There will be separate 'Control' and 'Data' channels for information exchange
- Control will be for Federate<>RTI exchanges. This covers all services except attribute reflections and interactions
- Data will be for "group" data communications. This covers attribute reflections and interactions (see the first sketch after this list)
- The administration and service provision of a number of HLA services will move from decentralized back to a central RTI process ('control' channel)
- The exchange of attribute updates and interactions (the vast majority of traffic in a federation) will remain decentralized ('data' channel)
- Federates will be able to connect directly to an RTI, or through a "Forwarder" (much like the current WAN forwarder)
- The Forwarder will act as both a data router and a firewall (see the second sketch after this list)
- All 'control' messages will be routed back to the RTI
- Only a subset of 'data' messages will be allowed to pass (defined in configuration)
- Federates will still filter messages on the receiver side
- When used with clusters behind forwarders, this may be a reduced set of data
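As a rough illustration of the control/data split described above, the sketch below shows how a connection might decide which channel an outgoing message travels on. The class and type names (ChannelRouter, MessageType) are hypothetical and are not part of Portico's actual API; only the routing rule itself (attribute reflections and interactions on 'data', everything else on 'control') comes from the design points above.

```java
// Hypothetical sketch only -- names do not reflect Portico's real message types.
public class ChannelRouter
{
    public enum MessageType
    {
        CreateFederation,       // control traffic: federation management, sync points, ...
        JoinFederation,
        RegisterSyncPoint,
        UpdateAttributeValues,  // data traffic: attribute reflections
        SendInteraction         // data traffic: interactions
    }

    public enum Channel { CONTROL, DATA }

    // Attribute reflections and interactions stay on the decentralized 'data'
    // channel; every other service call is routed to the central RTI over the
    // 'control' channel.
    public static Channel channelFor( MessageType type )
    {
        switch( type )
        {
            case UpdateAttributeValues:
            case SendInteraction:
                return Channel.DATA;
            default:
                return Channel.CONTROL;
        }
    }

    public static void main( String[] args )
    {
        System.out.println( channelFor(MessageType.JoinFederation) );        // CONTROL
        System.out.println( channelFor(MessageType.UpdateAttributeValues) ); // DATA
    }
}
```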
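Similarly, here is a minimal sketch of the Forwarder's firewall role, assuming the pass-through configuration is expressed as an allow-list of object/interaction class-name prefixes. The class name ForwarderFilter and the allow-list format are illustrative assumptions; the actual configuration mechanism has not yet been defined.

```java
import java.util.List;

// Hypothetical sketch of the Forwarder's filtering behaviour.
public class ForwarderFilter
{
    private final List<String> allowedDataClasses;

    public ForwarderFilter( List<String> allowedDataClasses )
    {
        this.allowedDataClasses = allowedDataClasses;
    }

    // Control messages always pass, because they must reach the central RTI.
    // Data messages pass only if the object/interaction class they relate to
    // matches a prefix on the configured allow-list.
    public boolean shouldForward( boolean isControlMessage, String className )
    {
        if( isControlMessage )
            return true;

        return allowedDataClasses.stream().anyMatch( className::startsWith );
    }

    public static void main( String[] args )
    {
        ForwarderFilter filter = new ForwarderFilter( List.of("HLAobjectRoot.PowerGrid") );
        System.out.println( filter.shouldForward(false, "HLAobjectRoot.PowerGrid.Substation") ); // true
        System.out.println( filter.shouldForward(false, "HLAobjectRoot.LocalSensor") );          // false
    }
}
```

Receiver-side filtering in the federates remains in place; behind a forwarder it simply operates on the already-reduced set of data messages that the forwarder lets through.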
Task List
This work is broken down into three phases:
Phase 1: Create the Infrastructure for Central RTI
- 1a: (Infrastructure to support centralized RTI executive process #219) Create the RTI Server, Message Loading and Communications Infrastructure (control/data)
- 1b: (Infrastructure to support centralized RTI executive process #219) Create updated LRC, Message Loading and Communications Infrastructure (control/data). Exchange PING with Server.
Phase 2: Port HLA Services to Central RTI
- 2a: (Federation Management methods (Big 4) support for centralized RTI #220) Port the Big-4 Federation Management Services
- 2b: (Synchronization point management methods support for centralized RTI #221) Synchronization Points
- 2c: (Pub/Sub Method Support for centralized RTI Exec #222) Publication and Subscription
- 2d: (Attribute Reflection/Interaction methods support for new centralized RTI framework #223) Updates & Interactions
- 2e: (Time management services updates for centralized RTI framework #224) Time Management
- 2f: (Port Ownership Management services to new centralized RTI executive #225) Ownership
- 2g: (Port save/restore services to new centralized RTI executive #226) Save/Restore
- 2h: (Port misc calls to new centralized RTI executive framework #227) Misc Services
Phase 3: Cluster / Forwarder Infrastructure
- 3a: (Add support for Forwarder (multi-network/site router and firewall) #228) Create Forwarder Infrastructure (upstream/downstream connection, local discovery, ...)
- 3b: (Add support for Forwarder (multi-network/site router and firewall) #228) Extend RTI Server to accept connections from Forwarders (TCP)
- 3c: (Add support for Forwarder (multi-network/site router and firewall) #228) Extend RTI Server to accept connections from Forwarders (Multicast)
- 3d: (Add support for Forwarder (multi-network/site router and firewall) #228) Big-4 Testing
- 3e: (Add support for Forwarder (multi-network/site router and firewall) #228) Message Exchange
- 3f: Benchmarking
