Marshal bitfield #1378

z16 · 2015-04-29T18:25:00Z

Bitfield marshalling class

As with previous pull requests that adjusted the packet base class and introduced a marshallable string wrapper, this one also aims to improve server/client communication. SE uses bit-packed values in multiple places, and retrieving/setting them is generally a pain. It involves calculating with hard-coded offsets, that all have to be adjusted in multiple places every time the packet structure changes. The calculations themselves are bit shifts of varying size that also repeats code unnecessarily in multiple places.

This class is supposed to facilitate the usage of bit-packed values.

Will it break DSP?

No. I figured I'll put this at the top, since that is what most people will be interested in at first. This pull request only provides the class, no DSP code actually uses it, as such it cannot break anything that already exists on its own.

As for actual usage of this code, I did perform a reasonable amount of testing on it, and I managed to find a few error cases, which I have since ironed out. That said, there's no guarantee that everything in it works flawlessly. This can have a wide range of applications, as such many things could potentially go wrong. However, I'm fairly confident that most problem cases have been found and eliminated through either special handling (rarely) or better design (usually).

On bit packing

Bit-packed values are used to conserve space, something that SE seems to be superficially careful about (although surprisingly wasteful in other places). The idea is to limit the space requirements to between-byte values, i.e. bit values. So if a value only goes from 0 to 100 it doesn't need a full byte (8 bits for SE), instead it will fit just fine into 7 bits. So 1 bit could be conserved and used for something else.

As an example, let's examine the byte sequence containing crafting skill levels. SE sends one unit of such a sequence as a two byte long structure. The 16 bits are used as follows:

xxxxx xxxxxxxxxx x
  |       |      |
rank    level  capped

The first five bits signal the rank of the user. There are 11 distinct ranks, which fits into a 4 bit value. Since 5 bits are currently used a maximum of 32 ranks could be supported with the current structure.

The second bit sequence consists of 10 bits and represents the user level. The maximum level is currently 110, which fits into a 7 bit value. 10 bits allow for 1024 distinct values, so that is the (theoretical) maximum level with the current structure.

The last bit stands alone and is interpreted as a boolean value and represents whether or not the skill is capped for the user.

To allow the same range of values without bit packing would require one byte for the rank, two bytes for the level and one byte for the cap indicator, resulting in four bits total. So SE cut the space requirements in half by using bit packing here.

Problems with bit field implementations

There are two standard ways to deal with bit fields, both of which are not appropriate for our needs. The first is a regular bit field, which is a feature directly supported by the language. We could (naively) define it as follows:

struct craft_skill
{
    uint8_t rank : 5;
    uint16_t level : 10;
    bool capped : 1;
};

The problem here is that alignment is not guaranteed by the standard. And unlike many things not guaranteed by the standard this is actually handled very differently across compilers. One thing that certain compilers do differently is to align values of different size to the next byte boundary. For example, the uint8_t and uint16_t are of different sizes, so some compilers would not align them within the same byte (which is required for the bit packing to work). Instead it would would shift the level value to the next byte boundary.

For example, the struct as defined above has a size of 2 bytes on GCC and 6 bytes on MSVC. Neither of them violate the standard (in this respect). Hence uniformity is not possible under such conditions.

The second standard option for a bit-manipulating struct is std::bitset. The problem with that is that it does not allow to create multi-bit spanning values. Its purpose is to manipulate single bits inside a field. It also does not guarantee underlying size, which is important for packet-communication.

There are very few custom implementations, and none satisfy the requirements I was hoping for for such a class, which are threefold:

It needs to be explicit and consistent in memory size across compilers
It should support implicit type casting
It should be fast

A few other minor requirements were also of import (easy to use, for example, which almost all such implementations violate, or no repetition in code) but these three are the most distinguished ones.

Class implementation

The class I'm proposing with this pull request fulfills all three requirements:

It will always take up the specified size in bytes in memory, across all three major compilers
It supports implicit casting of underlying values to the provided types
With the exception of only a certain number of bit shifts (which are done during run-time) it is almost entirely calculated at compile-time

Explicit in size

This is achieved by having only one member, an unsigned char array of fixed size (specified at compile time). All other needed values (the information for each bit-packed value) are provided as template parameters and accessed through compile-time functions, which end up hard-coded into the resulting assembly.

This results in the class being a bit ugly (to look at, not to use). But it's necessary to achieve the desired effect of not taking up more memory than it's allowed to.

Implicit typing

The type information is stored as metadata in the form of the provided template parameters. This means that when accessing a value, the return type is known and it can be cast to that before the function returns. This avoids having to cast the value to its intended type at the call site and saves the developer time and effort.

Execution speed

At first look the class seems to do a lot of calculations, but due to the template-based implementation as mentioned before almost all those calculations are done at compile time. This means, that compilation will take longer, but in return it will execute as fast as manual calculations could possibly allow.

Class usage

Note: I am implying using namespace marshal; for the following sections. All relevant classes and functions are declared inside that namespace.

Definition

The following is the general structure of a definition of this class:

bitfield<size,
    bitsection<bits, type>,
    bitsection<bits, type>,
    ...
> foo;

This will declare a variable foo, which is a bit field containing the specified number of sections and has underlying size size in memory. For example, the crafting skill type we defined above would look like this:

bitfield<2,
    bitsection<5, uint8_t>,
    bitsection<10, uint16_t>,
    bitsection<1, bool>
> craft_skill;

This will guarantee that the value craft_skill will take up exactly 2 bytes in memory. Note that we don't even have to make those values of type uint8_t and uint16_t. We normally do that to more accurately represent their maximum size, but it's not necessary at all and can, in fact, be more harmful than useful. If, for example, they increase the size of the first value to go above the uint8_t boundary, we would have to manually adjust that. Since it doesn't matter what numeric type it is internally, it could as well be defined as an int, which happens to be the default type for a bitsection. So the above could be simplified to this:

bitfield<2,
    bitsection<5>,
    bitsection<10>,
    bitsection<1, bool>
> craft_skill;

You can also use known enums as types. For example, we know all possible guild ranks for crafting purposes. We can define the enum for it:

namespace Ranks
{
    enum Rank
    {
        Amateur,
        Recruit,
        Novice,
        Initiate,
        Apprentice,
        Journeyman,
        Craftsman,
        Artisan,
        Adept,
        Veteran,
        Expert,
    };
}

Now we can specify that as the type of the bit section, so it will be interpreted as one of these values:

bitfield<2,
    bitsection<5, Ranks::Rank>,
    bitsection<10>,
    bitsection<1, bool>
> craft_skill;

Getters/Setters

This is one part of the class implementation that I'm not entirely satisfied with. Unfortunately I don't think that there is currently a better solution to this problem. There might be in the future, if either the constexpr or template mechanisms are expanded the correct way, but until then this will have to do.

I'm providing two ways to access sections defined inside the bitfield. The first is a more object-oriented approach and provides mirror functions member functions get and set that do the retrieval/assignment. To retrieve a value you need to call the get member function with the respective index as a template parameter. The set member function works the same way, but takes another function argument, the value to set it to:

// Getters
Ranks::Rank rank = craft_skill.get<0>();
int level = craft_skill.get<1>();
bool capped = craft_skill.get<2>();

// Setters
craft_skill.set<0>(Ranks::Amateur);
craft_skill.set<1>(5);
craft_skill.set<2>(false);

The second way mirrors how other standard C++ types are doing it (e.g. std::array and std::tuple). This is a style that I personally do not like much, but I've included it as an option to allow for more generic function uses:

Ranks::Rank rank = get<0>(craft_skill);
int level = get<1>(craft_skill);
bool capped = get<2>(craft_skill);

set<0>(craft_skill, Ranks::Amateur);
set<1>(craft_skill, 5);
set<2>(craft_skill, false);

The drawback of such getters and setters (whether members or free functions) is that they require an integral template argument that defines the index of the section. This can be made prettier by defining enums that contain appropriate values which stand for the respective bit field index:

namespace Fields
{
    enum Field
    {
        Rank,
        Level,
        Capped,
    };
}

Then you can access them by the corresponding enum value:

// Getters
Ranks::Rank rank = craft_skill.get<Fields::Rank>();
int level = craft_skill.get<Fields::Level>();
bool capped = craft_skill.get<Fields::Capped>();

// Setters
craft_skill.set<Fields::Rank>(Ranks::Amateur);
craft_skill.set<Fields::Level>(5);
craft_skill.set<Fields::Capped>(false);

Sequential definition

The main advantage that this class wanted to achieve is to make bit fields usable within struct definitions and that it will align as expected. For example, let's examine the server-to-client packet 0x062, which transmits skill information to the client. The structure is as follows:

  4 bytes: packet header (id, size, sequence)
124 bytes: junk (used to be used for job ability recast timers)
 96 bytes: combat skills, 2 bytes each (array of 48 skills)
 20 bytes: craft skills, 2 bytes each (array of 10 skills)
 12 bytes: junk (filled with 0xFF)

We can model that with the following definition (omitting the header, since that is the same for every packet):

struct packet0x062
{
    unsigned char junk1[0x7C];
    bitfield<2,
        bitsection<15>,
        bitsection<1, bool>
    > combat_skills[0x30];
    bitfield<2,
        bitsection<5, Ranks::Rank>,
        bitsection<10>,
        bitsection<1, bool>
    > craft_skills[0x0A];
    unsigned char junk2[0x0C];
} payload;

Now we can access individual values as previously described, and in addition to that we can easily populate the entire struct by using the packet base class's ref member. Assuming we add the above struct to the 0x062 packet's header, then we can do the following:

    ref<packet0x062>(data, 0x04) = payload;

This will populate the packet fully and correctly based on the provided definition. Currently this is done by manually masking and shifting the necessary bits wherever it's necessary. Below are a few examples in the current code, along with an explanation, so you can judge for yourself how unintuitive some of these operations are.

// Marking a skill as "capped"
                PChar->WorkingSkills.skill[SkillID] |= 0x8000;
// Increasing a skill by one
                    PChar->WorkingSkills.skill[skillID] += 0x20;
// Marking a skill as "capped", with a chance to overflow
                PChar->WorkingSkills.skill[i] += 0x8000;
// Writing the current skill value (full bit field)
            PChar->WorkingSkills.skill[i] = (PChar->RealSkills.skill[i] / 10) * 0x20 + PChar->RealSkills.rank[i]
// Retrieves the current combat skill (does not work for craft skills)
        return WorkingSkills.skill[SkillID] & 0x7FFF;

Performance

Many computations in this class are done at compile time, i.e. when the bit field is defined. All offsets and sizes (for both bits and bytes) are computed at compile-time, as are the ranges of each bit section. The only things that still need to be computed at run time are the bit shifts and masks necessary to retrieve or assign the actual value, and there is no way around that (this also has to be done without this class).

There are a few performance areas that can still be improved, which would involve overloading the getter/setter to work faster on certain types. I may do this at some point, but it's optional and will only save the copy of at most five bytes for each operation.

The recursions in the class are only in the template, as such they are all evaluated at compile-time as well and do not incur any kind of computation overhead during run-time.

Example

The code for a full example showcasing most of its features can be viewed and executed with a sample packet here.

teschnei · 2015-05-05T03:49:29Z

it looks fine, but a) VC14 is just around the corner so I might wait for that (i'm already using VS2015 RC) and b) i'll probably merge it next time I'm modifying something that could be changed to use it

teschnei · 2015-05-27T21:26:28Z

lets talk about this again soon (as I'll be refactoring the action packet sometime soon and need to figure out if this can help with that or not)

nesstea · 2015-07-05T05:04:15Z

your commits are so sexy. hot damn.

bendangelo · 2016-03-05T19:56:11Z

This ever going in? Can someone just pull his branch and get it up to speed with master.

takhlaq · 2016-03-05T19:56:52Z

when someone feels like redoing the packets yes

teschnei · 2016-03-05T20:05:16Z

it doesn't need any merging, it's just adding a utility in a new file. I
don't want to merge it until I need it

On Sat, Mar 5, 2016 at 12:56 PM, Tahir Akhlaq [email protected]
wrote:

when someone feels like redoing the packets yes

—
Reply to this email directly or view it on GitHub
#1378 (comment)
.

takhlaq

i approve

z16 added 3 commits April 28, 2015 00:10

Added bitfield marshalling type

73adfd1

This time with 100% more marshalling

a5a3d40

Removed redundancies, added safety checks, provided free getter/setter

4018c1a

whasf assigned teschnei May 1, 2015

takhlaq added the rebase pl0x label Jun 14, 2017

takhlaq approved these changes Feb 6, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Marshal bitfield #1378

Marshal bitfield #1378

z16 commented Apr 29, 2015

teschnei commented May 5, 2015

teschnei commented May 27, 2015

nesstea commented Jul 5, 2015

bendangelo commented Mar 5, 2016

takhlaq commented Mar 5, 2016

teschnei commented Mar 5, 2016

takhlaq left a comment

Marshal bitfield #1378

Are you sure you want to change the base?

Marshal bitfield #1378

Conversation

z16 commented Apr 29, 2015

Bitfield marshalling class

Will it break DSP?

On bit packing

Problems with bit field implementations

Class implementation

Explicit in size

Implicit typing

Execution speed

Class usage

Definition

Getters/Setters

Sequential definition

Performance

Example

teschnei commented May 5, 2015

teschnei commented May 27, 2015

nesstea commented Jul 5, 2015

bendangelo commented Mar 5, 2016

takhlaq commented Mar 5, 2016

teschnei commented Mar 5, 2016

takhlaq left a comment

Choose a reason for hiding this comment