Support for binary protocols in Orchestra - Technical Proposal Draft. #197
We removed the introduction of the offset attribute from the proposal. Our understanding is that field offsets in binary protocols are calculated from the sizes of the preceding fields and are not specified as absolute values by the protocol. Where empty space is needed between fields, existing exchange binary protocols define "empty" fields for this purpose. This approach has the advantage that protocol specifications do not need to calculate an offset for each field (based on all preceding fields), which is error-prone. Although we appreciate that SBE v1.0 makes offsets optionally available in some cases, we suggest that this may not be optimal and that it could instead adopt the "empty fields" approach, removing the need for offsets altogether. We also rejected the idea of providing an offset attribute "just in case someone wants to use it", because we aim to keep the standard minimal and simple.
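As a rough illustration of the point about derived offsets, here is a minimal Python sketch. The field names and sizes are hypothetical and not taken from any real protocol; the "Filler" entry stands in for an "empty" field.

```python
def field_offsets(fields):
    """Derive each field's offset from the sizes of the fields before it.

    `fields` is a list of (name, size_in_bytes) tuples in wire order.
    """
    offsets = {}
    position = 0
    for name, size in fields:
        offsets[name] = position
        position += size
    return offsets


# Hypothetical layout: a 2-byte "empty" field provides the gap that an
# explicit offset attribute would otherwise have to describe.
layout = [("MsgType", 2), ("Filler", 2), ("OrderQty", 4)]
print(field_offsets(layout))
```

With this layout, no offset has to be written down anywhere: inserting or resizing a field changes the derived offsets of everything after it automatically.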
I have updated the main proposal to a new version with further refinements.
We would like to propose an enhancement to Orchestra to support binary protocols. Specifically, we want to support encoding (i.e., serialization) in binary formats, which corresponds to Layer 6 (the Presentation layer) of the OSI seven-layer model.
Orchestra addresses Layer 7, i.e. the Application layer, by representing the structure and semantics of the API (Messages, Components, Groups, Fields, etc.). Because of its FIX heritage, it implicitly supports encoding in FIX tag/value format (i.e., it provides enough information for such encoding).
However, when encoding in other formats, like binary formats implemented by exchanges, more information is needed. For example, which character to use to pad fields, fields' offsets, padding justification, support for field-maps, etc.
Below is a draft proposal to improve Orchestra's model elements with new attributes to support binary encoding.
Note: the proposal is not finished and we publish it here to receive feedback and contributions.
Binary Protocols Support Proposals
Byte Order
Issue
Binary protocols may choose either big-endian or little-endian byte order for specific reasons. It's important for a message protocol to clearly state its byte order in its Orchestra file. In FIX SBE, for instance, the byte order is specified globally in the message schema.
Proposal
Since datatypes are closely related to the encoding protocol, we recommend adding an optional `byteOrder` attribute to the `mappedDatatype` element. It's common for binary protocols to have the same byte order for all datatypes, so the attribute should also be added to the `datatypes` element as a global setting.
Valid values should be `bigEndian` and `littleEndian`. The absence of the attribute would mean the byte order is unspecified (e.g. for character types).
Example
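To show what the attribute would mean for an encoder, here is a minimal Python sketch. It is not Orchestra syntax. It assumes that a value set on `mappedDatatype` takes precedence over the global one on `datatypes` (the proposal does not spell out the precedence), and the function names are hypothetical.

```python
import struct

# Map the proposed attribute values to struct byte-order prefixes.
STRUCT_PREFIX = {"bigEndian": ">", "littleEndian": "<"}


def resolve_byte_order(datatype_byte_order, global_byte_order):
    """Assumed rule: the datatype-level value wins, else the global one."""
    return datatype_byte_order or global_byte_order


def encode_uint32(value, byte_order):
    """Encode an unsigned 32-bit integer using the given byteOrder value."""
    return struct.pack(STRUCT_PREFIX[byte_order] + "I", value)


order = resolve_byte_order(None, "littleEndian")
print(encode_uint32(1, order).hex())        # 01000000
print(encode_uint32(1, "bigEndian").hex())  # 00000001
```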
Null Value Indication
Issue
Many binary protocols are fixed-length, meaning every field must be included in the transmission, even if it's optional and not used in that specific message. Those protocols define a special value for each datatype, usually called the null value, which indicates that an optional field is not being used. When the encoder or decoder encounters this null value, it treats the field as not set.
The Orchestra standard doesn't provide a way to map a datatype to its null value.
Proposal
We recommend adding a `nullValue` attribute to the `<mappedDatatype>` element.
Example
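The following Python sketch illustrates how an encoder and decoder might use such a null value. It is not Orchestra syntax. The choice of `0xFFFFFFFF` as the null value for an unsigned 32-bit integer and the little-endian byte order are assumptions made for the illustration, and the function names are hypothetical.

```python
import struct

# Hypothetical nullValue for an unsigned 32-bit datatype.
UINT32_NULL = 0xFFFFFFFF


def encode_optional_uint32(value, null_value=UINT32_NULL):
    """Always emit 4 bytes; an unset field is sent as the null value."""
    return struct.pack("<I", null_value if value is None else value)


def decode_optional_uint32(data, null_value=UINT32_NULL):
    """Return None when the wire value equals the null value."""
    (raw,) = struct.unpack("<I", data)
    return None if raw == null_value else raw


print(decode_optional_uint32(encode_optional_uint32(42)))    # 42
print(decode_optional_uint32(encode_optional_uint32(None)))  # None
```

Note that the null value itself can no longer be sent as a regular value, which is why it needs to be declared per datatype.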
String Termination and Padding Characters
Issue
Most binary encodings use fixed-length fields for alphanumeric data, i.e. for string values. If the actual value doesn't occupy the field length entirely, it could either be padded with a particular character at the beginning or the end (padded string), or terminated with a NUL character (null-terminated string). In fact, a null-terminated string can be considered a special case of padding with `NUL` control characters.
Protocols have different policies on including the null-terminator in the string length. This means that in some cases (e.g. in SBE, NYSE Pillar), if the actual length of the value fills the entire length of the field, it's acceptable for the value not to include a null-terminator character. In other cases (e.g. in HKEX OCG) a null-terminator is required to be present in the field's value, leaving one byte less for the actual string.
Hence, there is a requirement to define a padding side and character, and a null-termination requirement flag for fixed-length string datatypes.
Proposal
We recommend adding the following optional attributes to the `mappedDatatype` element:
- `paddingSide` attribute, which could be set to `left` or `right`.
- `paddingChar` attribute, which could be set to a single character or to the special value `NUL`, which represents the null control character (since this character cannot be used in XML directly).
- `nullTerminated` attribute, which could be set to `true` or `false`. The following rules apply when the attribute is set to `true`:
  - If `paddingSide="left"`, then the null-terminator character must be present at the left side of the string.
Example
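The following Python sketch shows one way an encoder might interpret the three attributes together. It is not Orchestra syntax. It assumes ASCII data, and it assumes that the null-terminator sits directly next to the value on the padding side; the function name and defaults are hypothetical.

```python
def encode_fixed_string(value, length, padding_side="right",
                        padding_char=" ", null_terminated=False):
    """Encode `value` into exactly `length` bytes."""
    raw = value.encode("ascii")
    # "NUL" is the special attribute value standing for the 0x00 byte.
    pad = b"\x00" if padding_char == "NUL" else padding_char.encode("ascii")
    if len(pad) != 1:
        raise ValueError("paddingChar must be a single character or 'NUL'")
    # A required terminator takes one byte away from the usable length.
    terminator = b"\x00" if null_terminated else b""
    if len(raw) + len(terminator) > length:
        raise ValueError("value does not fit into the field")
    fill = pad * (length - len(raw) - len(terminator))
    if padding_side == "right":
        return raw + terminator + fill
    if padding_side == "left":
        return fill + terminator + raw
    raise ValueError("paddingSide must be 'left' or 'right'")


print(encode_fixed_string("ABC", 6))                      # b'ABC   '
print(encode_fixed_string("ABC", 6, "left", "0"))         # b'000ABC'
print(encode_fixed_string("ABC", 6, padding_char="NUL"))  # b'ABC\x00\x00\x00'
```

With `null_terminated=False`, a six-character value fits a six-byte field exactly; with `null_terminated=True`, the same call raises `ValueError`, which mirrors the difference in policy between the protocols mentioned above.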
Field Presence Map
Issue
Some protocols, such as HKEX OCG, have a feature called Field Presence Maps, which allows the transmission of unset optional fields to be skipped.
A presence map is a special type of field that contains a bitmap describing the presence of subsequent fields in the actual message instance. It provides an efficient way to check which optional fields are present in a received message. Bits corresponding to required fields are always set to 1.
Those types may be represented as special datatypes mapped to bitsets. In HKEX OCG, for example, a field of such a datatype is included in the message header, and it also precedes every repeating group instance.
There should be a standard approach to associate the bitmap field with the components and groups it relates to, in the same way a data field may be associated with its length field using the `lengthId` attribute.
Proposal
We recommend adding an optional `presenceMapId` attribute to container elements such as `message/structure`, `component`, and `group`. This attribute must reference a field that contains a presence flag for every direct member of the container element. This approach is similar to the `lengthId` and `discriminatorId` attributes of a field definition, which are already part of the standard.
Example