Commit 40e8592

author
Daniel Lemire
committed
ok
1 parent 5439256 commit 40e8592

11 files changed: +144 -180 lines changed

README.md

Lines changed: 24 additions & 25 deletions
@@ -1,57 +1,56 @@
11
# SimdBase64
2-
## Fast WHATWG forgiving-base64 in C#
2+
## Fast WHATWG forgiving base64 decoding in C#
33

4-
The C# standard library has fast (SIMD-based) base64 encoding functions, but it lacks
5-
really fast base64 decoding function. The initial work that lead to the fast functions in the runtime
6-
was carried out by [gfoidl](https://github.com/gfoidl/Base64).
4+
Base64 is a standard approach to represent any binary data as ASCII. It is part of the email
5+
standard (MIME) and is commonly used to embed data in XML, HTML or JSON files. For example,
6+
images can be encoded as text using base64. Base64 is also used to represent cryptographic keys.
77

8-
- There are accelerated base64 functions for UTF-8 inputs in the .NET runtime, but they are not optimal:
9-
we can make them 50% faster or more.
10-
- There is no accelerated base64 functions for UTF-16 inputs (e.g., `string` types). We can be several times faster.
11-
12-
The goal of this project is to provide the fast WHATWG forgiving-base64 algorithm already
13-
used in major JavaScript runtimes (Node.js and Bun) to C#.
14-
15-
Importantly, we only focus on base64 decoding. It is a more challenging problem than base64 encoding because
8+
Our processors have SIMD instructions that are ideally suited to encode and decode base64.
9+
Encoding is somewhat easier than decoding. Decoding is a more challenging problem than base64 encoding because
1610
of the presence of allowable white space characters and the need to validate the input. Indeed, all
1711
inputs are valid for encoding, but only some inputs are valid for decoding. Having to skip white space
18-
characters makes accelerated decoding somewhat difficult.
12+
characters makes accelerated decoding somewhat difficult. We refer to this decoding as WHATWG forgiving-base64 decoding.
1913

14+
The C# standard library has fast (SIMD-based) base64 encoding functions. It also has fast decoding
15+
functions. Yet these accelerated base64 decoding functions for UTF-8 inputs in the .NET runtime are not optimal:
16+
we beat them by 1.7 x to 1.9 x on inputs of a few kilobytes or more by using a novel algorithm.
17+
This fast WHATWG forgiving-base64 algorithm is already used in major JavaScript runtimes (Node.js and Bun).
2018

21-
## Results (SimdBase64 vs. fast .NET functions)
2219

2320

21+
## Results (SimdBase64 vs. fast .NET functions)
22+
2423
We use the enron base64 data for benchmarking, see benchmark/data/email.
2524
We process the data as UTF-8 (ASCII) using the .NET accelerated functions
2625
as a reference (`System.Buffers.Text.Base64.DecodeFromUtf8`).
2726

2827

29-
| processor | SimdBase64(GB/s) | .NET speed (GB/s) | speed up |
28+
| processor | SimdBase64 (GB/s) | .NET speed (GB/s) | speed up |
3029
|:----------------|:------------------------|:-------------------|:-------------------|
3130
| Apple M2 processor (ARM) | 6.5 | 3.8 | 1.7 x |
32-
| Intel Ice Lake (AVX2) | 6.6 | 3.4 | 1.9 x |
33-
| Intel Ice Lake (SSSE3) | 4.9 | 3.4 | 1.4 x |
31+
| Intel Ice Lake | 6.5 | 3.4 | 1.9 x |
3432

35-
Our results are more impressive when comparing against the standard base64 string decoding
36-
function (`Convert.FromBase64String(mystring)`), but it is explained in part by the fact
37-
that the .NET team did not accelerated them using SIMD instructions. Thus we omit them, only
38-
comparing with the SIMD-accelerated .NET functions.
3933

34+
As an aside, there are no accelerated base64 functions for UTF-16 inputs (e.g., `string` types).
35+
We can multiply the decoding speed compared to the .NET standard library (`Convert.FromBase64String(mystring)`),
36+
but we omit the numbers for simplicity.
4037

4138
## AVX-512
4239

4340
As of .NET 9, the support for AVX-512 remains incomplete in C#. In particular, important
44-
VBMI2 instructions are missing. Hence, we are not using AVX-512 under x64 systems.
41+
VBMI2 instructions are missing. Hence, we are not using AVX-512 under x64 systems at this time.
42+
However, as soon as .NET offers the necessary support, we will update our results.
4543

4644
## Requirements
4745

4846
We require .NET 9 or better: https://dotnet.microsoft.com/en-us/download/dotnet/9.0
4947

50-
5148
## Usage
5249

5350
The library only provides Base64 decoding functions, because the .NET library already has
54-
fast Base64 encoding functions.
51+
fast Base64 encoding functions. We support both `Span<byte>` (ASCII or UTF-8) and
52+
`Span<char>` (UTF-16) as input. If you have a C# string, you can get a `ReadOnlySpan<char>` over it with
53+
the `AsSpan()` method.
5554

5655
```c#
5756
string base64 = "SGVsbG8sIFdvcmxkIQ=="; // could be span<byte> in UTF-8 as well
@@ -64,7 +63,6 @@ fast Base64 encoding functions.
6463
// Encoding.UTF8.GetString(answer) == "Hello, World!"
6564
```
6665

67-
6866
## Running tests
6967

7068
```
@@ -161,6 +159,7 @@ You can convert an integer to a hex string like so: `$"0x{MyVariable:X}"`.
161159
- [gfoidl.Base64](https://github.com/gfoidl/Base64): original code that led to the SIMD-based code in the runtime
162160
- [simdutf's base64 decode](https://github.com/simdutf/simdutf/blob/74126531454de9b06388cb2de78b18edbfcfbe3d/src/westmere/sse_base64.cpp#L337)
163161
- [WHATWG forgiving-base64 decode](https://infra.spec.whatwg.org/#forgiving-base64-decode)
162+
- The initial work that led to the fast functions in the runtime was carried out by [gfoidl](https://github.com/gfoidl/Base64).
164163

165164
## More reading
166165
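
For context on the README hunk above: the two .NET baselines it names (`System.Buffers.Text.Base64.DecodeFromUtf8` for UTF-8 input and `Convert.FromBase64String` for `string` input) can be called roughly as follows. This is a minimal sketch using only the standard library. The class and method names (`BaselineBase64`, `DecodeUtf8`, `DecodeUtf16`) are hypothetical and are not part of this commit or of SimdBase64.

```c#
using System;
using System.Buffers;
using System.Buffers.Text;
using System.Text;

// Hypothetical helper (not from the repository): wraps the two .NET
// standard-library decoders that the README uses as reference points.
public static class BaselineBase64
{
    // Decodes base64 stored as UTF-8/ASCII bytes with the runtime decoder.
    public static byte[] DecodeUtf8(ReadOnlySpan<byte> base64)
    {
        // Upper bound on the decoded size for an input of this length.
        byte[] buffer = new byte[Base64.GetMaxDecodedFromUtf8Length(base64.Length)];
        OperationStatus status = Base64.DecodeFromUtf8(
            base64, buffer, out int bytesConsumed, out int bytesWritten);
        if (status != OperationStatus.Done)
        {
            throw new FormatException(
                $"Base64 decoding stopped after {bytesConsumed} bytes: {status}");
        }
        // Trim the buffer to the number of bytes actually produced.
        return buffer.AsSpan(0, bytesWritten).ToArray();
    }

    // Decodes base64 stored in a UTF-16 string with the standard converter.
    public static byte[] DecodeUtf16(string base64) => Convert.FromBase64String(base64);
}
```

With the sample string from the README's usage section, `Encoding.UTF8.GetString(BaselineBase64.DecodeUtf16("SGVsbG8sIFdvcmxkIQ=="))` should give `"Hello, World!"`, and the UTF-8 path should agree once the string is converted with `Encoding.ASCII.GetBytes`.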

benchmark/Benchmark.cs

Lines changed: 67 additions & 40 deletions
@@ -66,7 +66,63 @@ public string GetValue(Summary summary, BenchmarkCase benchmarkCase)
6666
public string Legend { get; } = "The speed in gigabytes per second";
6767
}
6868

69-
[SimpleJob(launchCount: 1, warmupCount: 10, iterationCount: 10)]
69+
#pragma warning disable CA1515
70+
public class DataVolume : IColumn
71+
{
72+
static double GetDirectorySize(string folderPath)
73+
{
74+
double totalSize = 0;
75+
DirectoryInfo di = new DirectoryInfo(folderPath);
76+
long fileCount = di.EnumerateFiles("*.*", SearchOption.AllDirectories).Count();
77+
78+
foreach (FileInfo fi in di.EnumerateFiles("*.*", SearchOption.AllDirectories))
79+
{
80+
totalSize += fi.Length;
81+
}
82+
83+
return totalSize / fileCount;
84+
}
85+
public string GetValue(Summary summary, BenchmarkCase benchmarkCase)
86+
{
87+
#pragma warning disable CA1062
88+
var ourReport = summary.Reports.First(x => x.BenchmarkCase.Equals(benchmarkCase));
89+
var fileName = (string)benchmarkCase.Parameters["FileName"];
90+
double length = 0;
91+
if (File.Exists(fileName))
92+
{
93+
FileInfo fi = new FileInfo(fileName);
94+
length = fi.Length;
95+
length /= File.ReadAllLines(fi.FullName).Length;
96+
}
97+
else if (Directory.Exists(fileName))
98+
{
99+
length = GetDirectorySize(fileName);
100+
}
101+
if (ourReport.ResultStatistics is null)
102+
{
103+
return "N/A";
104+
}
105+
return $"{length / 1000:#####.00}";
106+
}
107+
108+
public string GetValue(Summary summary, BenchmarkCase benchmarkCase, SummaryStyle style) => GetValue(summary, benchmarkCase);
109+
public bool IsDefault(Summary summary, BenchmarkCase benchmarkCase) => false;
110+
public bool IsAvailable(Summary summary) => true;
111+
112+
public string Id { get; } = nameof(DataVolume);
113+
public string ColumnName { get; } = "kiB/input";
114+
public bool AlwaysShow { get; } = true;
115+
public ColumnCategory Category { get; } = ColumnCategory.Custom;
116+
#pragma warning disable CA1805
117+
public int PriorityInCategory { get; } = 0;
118+
#pragma warning disable CA1805
119+
public bool IsNumeric { get; } = false;
120+
public UnitType UnitType { get; } = UnitType.Dimensionless;
121+
public string Legend { get; } = "The average data volume in kilobytes per input";
122+
}
123+
124+
125+
[SimpleJob(launchCount: 1, warmupCount: 10, iterationCount: 20)]
70126
[Config(typeof(Config))]
71127
#pragma warning disable CA1515
72128
public class RealDataBenchmark
@@ -77,9 +133,8 @@ private sealed class Config : ManualConfig
77133
static bool warned;
78134
public Config()
79135
{
136+
AddColumn(new DataVolume());
80137
AddColumn(new Speed());
81-
82-
83138
if (RuntimeInformation.ProcessArchitecture == Architecture.Arm64)
84139
{
85140
if (!warned)
@@ -88,7 +143,7 @@ public Config()
88143
Console.WriteLine("ARM64 system detected.");
89144
warned = true;
90145
}
91-
AddFilter(new AnyCategoriesFilter(["arm64", "runtime", "gfoidl"]));
146+
AddFilter(new AnyCategoriesFilter(["ARM64", "runtime"]));
92147

93148
}
94149
else if (RuntimeInformation.ProcessArchitecture == Architecture.X64)
@@ -101,7 +156,7 @@ public Config()
101156
Console.WriteLine("X64 system detected (Intel, AMD,...) with AVX-512 support.");
102157
warned = true;
103158
}
104-
AddFilter(new AnyCategoriesFilter(["avx512", "avx", "sse", "runtime", "gfoidl"]));
159+
AddFilter(new AnyCategoriesFilter(["avx512", "AVX", "SSE", "runtime"]));
105160
}
106161
else if (Avx2.IsSupported)
107162
{
@@ -111,7 +166,7 @@ public Config()
111166
Console.WriteLine("X64 system detected (Intel, AMD,...) with AVX2 support.");
112167
warned = true;
113168
}
114-
AddFilter(new AnyCategoriesFilter(["avx", "sse", "runtime", "gfoidl"]));
169+
AddFilter(new AnyCategoriesFilter(["AVX", "SSE", "runtime"]));
115170
}
116171
else if (Ssse3.IsSupported && Popcnt.IsSupported)
117172
{
@@ -121,7 +176,7 @@ public Config()
121176
Console.WriteLine("X64 system detected (Intel, AMD,...) with Ssse3 support.");
122177
warned = true;
123178
}
124-
AddFilter(new AnyCategoriesFilter(["sse", "runtime", "gfoidl"]));
179+
AddFilter(new AnyCategoriesFilter(["SSE", "runtime"]));
125180
}
126181
else if (Sse3.IsSupported && Popcnt.IsSupported)
127182
{
@@ -131,7 +186,7 @@ public Config()
131186
Console.WriteLine("X64 system detected (Intel, AMD,...) with Sse3 support.");
132187
warned = true;
133188
}
134-
AddFilter(new AnyCategoriesFilter(["sse", "runtime", "gfoidl"]));
189+
AddFilter(new AnyCategoriesFilter(["SSE", "runtime"]));
135190
}
136191
else
137192
{
@@ -141,12 +196,12 @@ public Config()
141196
Console.WriteLine("X64 system detected (Intel, AMD,...) without relevant SIMD support.");
142197
warned = true;
143198
}
144-
AddFilter(new AnyCategoriesFilter(["scalar", "runtime", "gfoidl"]));
199+
AddFilter(new AnyCategoriesFilter(["scalar", "runtime"]));
145200
}
146201
}
147202
else
148203
{
149-
AddFilter(new AnyCategoriesFilter(["scalar", "runtime", "gfoidl"]));
204+
AddFilter(new AnyCategoriesFilter(["scalar", "runtime"]));
150205
}
151206

152207
}
@@ -346,7 +401,6 @@ public unsafe void RunAVX2DecodingBenchmarkUTF8(string[] data, int[] lengths)
346401
{
347402
for (int i = 0; i < FileContent.Length; i++)
348403
{
349-
//string s = FileContent[i];
350404
byte[] base64 = input[i];
351405
byte[] dataoutput = output[i];
352406
int bytesConsumed = 0;
@@ -366,7 +420,6 @@ public unsafe void RunOurDecodingBenchmarkUTF8(string[] data, int[] lengths)
366420
{
367421
for (int i = 0; i < FileContent.Length; i++)
368422
{
369-
//string s = FileContent[i];
370423
byte[] base64 = input[i];
371424
byte[] dataoutput = output[i];
372425
int bytesConsumed = 0;
@@ -620,36 +673,26 @@ public unsafe void DotnetRuntimeSIMDBase64RealDataUTF8()
620673
RunRuntimeSIMDDecodingBenchmarkUTF8(FileContent, DecodedLengths);
621674
}
622675

623-
//[Benchmark]
624-
//[BenchmarkCategory("default", "runtime")]
625676
public unsafe void DotnetRuntimeSIMDBase64RealDataWithAllocUTF8()
626677
{
627678
RunRuntimeSIMDDecodingBenchmarkWithAllocUTF8(FileContent, DecodedLengths);
628679
}
629680

630-
//[Benchmark]
631-
//[BenchmarkCategory("default", "runtime")]
632681
public unsafe void DotnetRuntimeBase64RealDataUTF16()
633682
{
634683
RunRuntimeDecodingBenchmarkUTF16(FileContent, DecodedLengths);
635684
}
636685

637-
//[Benchmark]
638-
//[BenchmarkCategory("SSE")]
639686
public unsafe void SSEDecodingRealDataUTF8()
640687
{
641688
RunSSEDecodingBenchmarkUTF8(FileContent, DecodedLengths);
642689
}
643690

644-
//[Benchmark]
645-
//[BenchmarkCategory("SSE")]
646691
public unsafe void SSEDecodingRealDataWithAllocUTF8()
647692
{
648693
RunSSEDecodingBenchmarkWithAllocUTF8(FileContent, DecodedLengths);
649694
}
650695

651-
//[Benchmark]
652-
//[BenchmarkCategory("AVX")]
653696
public unsafe void AVX2DecodingRealDataUTF8()
654697
{
655698
RunAVX2DecodingBenchmarkUTF8(FileContent, DecodedLengths);
@@ -662,59 +705,43 @@ public unsafe void SimdBase64DecodingRealDataUTF8()
662705
RunOurDecodingBenchmarkUTF8(FileContent, DecodedLengths);
663706
}
664707

665-
666-
//[Benchmark]
667-
//[BenchmarkCategory("AVX")]
668708
public unsafe void AVX2DecodingRealDataWithAllocUTF8()
669709
{
670710
RunAVX2DecodingBenchmarkWithAllocUTF8(FileContent, DecodedLengths);
671711
}
672712

673-
674-
[Benchmark]
675-
[BenchmarkCategory("arm64")]
676713
public unsafe void ARMDecodingRealDataUTF8()
677714
{
678715
RunARMDecodingBenchmarkUTF8(FileContent, DecodedLengths);
679716
}
680717

681-
//[Benchmark]
682-
//[BenchmarkCategory("arm64")]
683718
public unsafe void ARMDecodingRealDataWithAllocUTF8()
684719
{
685720
RunARMDecodingBenchmarkWithAllocUTF8(FileContent, DecodedLengths);
686721
}
687722

688-
//[Benchmark]
689-
//[BenchmarkCategory("arm64")]
723+
690724
public unsafe void ARMDecodingRealDataUTF16()
691725
{
692726
RunARMDecodingBenchmarkUTF16(FileContent, DecodedLengths);
693727
}
694728

695-
//[Benchmark]
696-
//[BenchmarkCategory("SSE")]
729+
697730
public unsafe void SSEDecodingRealDataUTF16()
698731
{
699732
RunSSEDecodingBenchmarkUTF16(FileContent, DecodedLengths);
700733
}
701734

702-
//[Benchmark]
703-
//[BenchmarkCategory("SSE")]
704735
public unsafe void SSEDecodingRealDataWithAllocUTF16()
705736
{
706737
RunSSEDecodingBenchmarkWithAllocUTF16(FileContent, DecodedLengths);
707738
}
708739

709-
//[Benchmark]
710-
//[BenchmarkCategory("AVX")]
711740
public unsafe void AVX2DecodingRealDataUTF16()
712741
{
713742
RunAVX2DecodingBenchmarkUTF16(FileContent, DecodedLengths);
714743
}
715744

716-
//[Benchmark]
717-
//[BenchmarkCategory("AVX")]
718745
public unsafe void AVX2DecodingRealDataWithAllocUTF16()
719746
{
720747
RunAVX2DecodingBenchmarkWithAllocUTF16(FileContent, DecodedLengths);
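
The `DataVolume` column added in this diff reports the average input size in kilobytes (bytes divided by 1000). As a rough illustration of the directory case, the sketch below computes the same average in a single enumeration and returns 0 for an empty directory. It is a variant under stated assumptions, not the commit's code: the names `InputVolume`, `AverageFileSizeInBytes` and `ToKilobytes` are hypothetical, and the `"F2"` format is a substitution for the diff's `#####.00` pattern.

```c#
using System;
using System.Globalization;
using System.IO;
using System.Linq;

// Hypothetical helper (not from the repository): average file size of a
// directory tree, in the spirit of DataVolume.GetDirectorySize.
public static class InputVolume
{
    // Returns the mean file size in bytes, or 0 when there are no files.
    public static double AverageFileSizeInBytes(string folderPath)
    {
        // Enumerate once and keep only the sizes.
        long[] sizes = new DirectoryInfo(folderPath)
            .EnumerateFiles("*", SearchOption.AllDirectories)
            .Select(file => file.Length)
            .ToArray();
        return sizes.Length == 0 ? 0.0 : sizes.Average();
    }

    // Formats a byte count as kilobytes (1 kB = 1000 bytes), two decimals.
    public static string ToKilobytes(double bytes) =>
        (bytes / 1000.0).ToString("F2", CultureInfo.InvariantCulture);
}
```

For a directory holding one 1000-byte file and one 3000-byte file, `AverageFileSizeInBytes` yields 2000 and `ToKilobytes` formats that as `2.00`.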
