Commit 2e20044

Add v2 using generics
1 parent 79084ac commit 2e20044

11 files changed, +634 -9 lines changed

.github/workflows/ci.yml

+68-6
@@ -6,8 +6,8 @@ on:
   pull_request:

 jobs:
-  build:
-    name: CI
+  build_v2:
+    name: Build for v2
     runs-on: ubuntu-latest

     steps:
@@ -22,14 +22,13 @@ jobs:
           echo github.event.changes.title.from=$CI_PR_PREV_TITLE

       - name: Set up Go
-        uses: actions/setup-go@v2
+        uses: actions/setup-go@v3
         with:
-          go-version: '~1.17.9'
+          go-version: '~1.18'
         id: go

       - name: Install utilities
         run: |
-          go install golang.org/x/lint/golint@latest
           go install golang.org/x/tools/cmd/goimports@latest
           go install honnef.co/go/tools/cmd/staticcheck@latest
           # display Go environment for reference
@@ -47,21 +46,84 @@ jobs:

       - name: Get dependencies
         run: |
+          cd v2
           go mod tidy
           /usr/bin/git diff --exit-code

       - name: Build
         run: |
+          cd v2
           go build -v ./...

       - name: Check
         run: |
+          cd v2
           go vet ./...
-          golint ./...
           staticcheck ./...
           goimports -w .
           /usr/bin/git diff --exit-code

+      - name: Test
+        run: |
+          cd v2
+          go test -v ./...
+
+  build_v1:
+    name: Build for v1
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Log
+        env:
+          CI_EVENT_ACTION: ${{ github.event.action }}
+          CI_PR_TITLE: ${{ github.event.pull_request.title }}
+          CI_PR_PREV_TITLE: ${{ github.event.changes.title.from }}
+        run: |
+          echo github.event.action=$CI_EVENT_ACTION
+          echo github.event.pull_request.title=$CI_PR_TITLE
+          echo github.event.changes.title.from=$CI_PR_PREV_TITLE
+
+      - name: Set up Go
+        uses: actions/setup-go@v3
+        with:
+          go-version: '~1.17'
+        id: go
+
+      - name: Install utilities
+        run: |
+          go install golang.org/x/lint/golint@latest
+          go install golang.org/x/tools/cmd/goimports@latest
+          go install honnef.co/go/tools/cmd/staticcheck@latest
+          # display Go environment for reference
+          go env
+
+      - name: Check out code
+        uses: actions/checkout@v2
+
+      - uses: actions/cache@v2
+        with:
+          path: ~/go/pkg/mod
+          key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
+          restore-keys: |
+            ${{ runner.os }}-go-
+
+      - name: Get dependencies
+        run: |
+          go mod tidy
+          /usr/bin/git diff --exit-code
+
+      - name: Build
+        run: |
+          go build -v ./...
+
+      - name: Check
+        run: |
+          go vet ./*.go
+          golint ./*.go
+          staticcheck ./*.go
+          goimports -w ./*.go
+          /usr/bin/git diff --exit-code
+
       - name: Test
         run: |
           go test -v ./...

LICENSE

+1-1
@@ -1,4 +1,4 @@
-Copyright 2021 The Sensible Code Company Ltd
+Copyright 2022 The Sensible Code Company Ltd

 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
 associated documentation files (the "Software"), to deal in the Software without restriction,

README.md

+5
@@ -1,5 +1,10 @@
 # faststringmap

+## v2 : Latest for Go 1.18 onwards
+**v2** is the latest which uses generics and runs on Go1.18. See [v2/README.md](v2/README.md) for details.
+
+## v1 : for Go 1.17 and earlier
+
 `faststringmap` is a fast read-only string keyed map for Go (golang).
 For our use case it is approximately 5 times faster than using Go's
 built-in map type with a string key. It also has the following advantages:

uint32_store.go

+1-1
@@ -1,4 +1,4 @@
-// Copyright 2021 The Sensible Code Company Ltd
+// Copyright 2022 The Sensible Code Company Ltd
 // Author: Duncan Harris

 package faststringmap

uint32_store_example_test.go

+3
@@ -1,3 +1,6 @@
+// Copyright 2022 The Sensible Code Company Ltd
+// Author: Duncan Harris
+
 package faststringmap_test

 import (

uint32_store_test.go

+1-1
@@ -1,4 +1,4 @@
-// Copyright 2021 The Sensible Code Company Ltd
+// Copyright 2022 The Sensible Code Company Ltd
 // Author: Duncan Harris

 package faststringmap_test

v2/README.md

+65
@@ -0,0 +1,65 @@
+# faststringmap
+
+`faststringmap` is a fast read-only string keyed map for Go (golang).
+For our use case it is approximately 5 times faster than using Go's
+built-in map type with a string key. It also has the following advantages:
+
+* look up strings and byte slices without use of the `unsafe` package
+* minimal impact on GC due to lack of pointers in the data structure
+* data structure can be trivially serialized to disk or network
+
+faststringmap v2 is built using Go generics for Go 1.18 onwards.
+
+`faststringmap` is a variant of a data structure called a
+[Trie](https://en.wikipedia.org/wiki/Trie).
+At each level we use a slice to hold the next possible byte values.
+This slice is of length one plus the difference between the lowest and highest
+possible next bytes of strings in the map. Not all the entries in the slice are
+valid next bytes. `faststringmap` is thus more space efficient for keys using a
+small set of nearby runes, for example those using a lot of digits.
+
+There are two variants provided:
+
+* `Map` is a version using a single slice and indexes which can be directly
+  serialized (e.g. to a file). It contains no embedded pointers so has minimal
+  impact on GC.
+
+* `MapFaster` has improved performance by using a slice for the `next` fields.
+  This avoids a bounds check when looking up the entry for a byte. However, it
+  comes at the cost of easy serialization and introduces a lot of pointers which
+  will have impact on GC. It is not possible to directly construct the slice version
+  in the same way so that the whole store is one block of memory. So this code provides
+  a function to create it from `Map`. An alternative construction might create distinct
+  slice objects at each level.
+
+## Example
+
+Example usage can be found in the tests and also
+[`fast_string_map_example_test.go`](fast_string_map_example_test.go)
+which shows a populated data structure to aid understanding.
+
+## Motivation
+
+I created `faststringmap` in order to improve the speed of parsing CSV
+where the fields were category codes from survey data. The majority of these
+were numeric (`"1"`, `"2"`, `"3"`...) plus a distinct code for "not applicable".
+I was struck that in the simplest possible cases (e.g. `"1"` ... `"5"`) the map
+should be a single slice lookup.
+
+Our fast CSV parser provides fields as byte slices into the read buffer to
+avoid creating string objects. So I also wanted to facilitate key lookup from a
+`[]byte` rather than a string. This is not possible using a built-in Go map without
+use of the `unsafe` package.
+
+## Benchmarks
+
+Below are example benchmarks from my laptop which are for looking up every element
+in a map of size 1000. So approximate times are 25ns per lookup for the Go native map
+and 5ns per lookup for the ``faststringmap``.
+```
+cpu: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
+BenchmarkUint32Store
+BenchmarkUint32Store-8           218463        4959 ns/op
+BenchmarkGoStringToUint32
+BenchmarkGoStringToUint32-8       49279       24483 ns/op
+```
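
Note on the data structure described in v2/README.md: the following is a minimal sketch, not the code from this commit, of a trie held in one pointer-free slice where each node's children cover the byte range from the lowest to the highest next byte. All names here (`node`, `sketchMap`, `build`, `lookup`) are hypothetical and are not the package's API. It assumes Go 1.18 or later and uses a type parameter so that the same lookup accepts either a `string` or a `[]byte` key.

```go
package main

import (
	"fmt"
	"sort"
)

// node is one trie entry (hypothetical layout). Its children occupy
// nodes[first : first+count] and cover the bytes lo .. lo+count-1.
type node[T any] struct {
	value T
	has   bool   // true if a key ends at this node
	lo    byte   // lowest next byte among the children
	count uint16 // number of child slots; 0 means no children
	first uint32 // index of the first child slot in the flat slice
}

// sketchMap keeps the whole trie in a single slice with the root at index 0.
type sketchMap[T any] struct {
	nodes []node[T]
}

// build constructs the trie from an ordinary Go map.
func build[T any](src map[string]T) sketchMap[T] {
	keys := make([]string, 0, len(src))
	for k := range src {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	m := sketchMap[T]{nodes: make([]node[T], 1)}
	m.fill(0, keys, 0, src)
	return m
}

// fill populates nodes[idx] from keys, which must be sorted and share
// a common prefix of length depth.
func (m *sketchMap[T]) fill(idx int, keys []string, depth int, src map[string]T) {
	if len(keys) > 0 && len(keys[0]) == depth {
		// The key equal to the shared prefix sorts first and ends here.
		m.nodes[idx].value, m.nodes[idx].has = src[keys[0]], true
		keys = keys[1:]
	}
	if len(keys) == 0 {
		return
	}
	lo, hi := keys[0][depth], keys[len(keys)-1][depth]
	first := len(m.nodes)
	count := int(hi-lo) + 1 // one plus the spread between lowest and highest byte
	m.nodes = append(m.nodes, make([]node[T], count)...)
	m.nodes[idx].lo, m.nodes[idx].count, m.nodes[idx].first = lo, uint16(count), uint32(first)
	for len(keys) > 0 {
		b := keys[0][depth]
		// n is the number of leading keys whose byte at depth equals b.
		n := sort.Search(len(keys), func(i int) bool { return keys[i][depth] > b })
		m.fill(first+int(b-lo), keys[:n], depth+1, src)
		keys = keys[n:]
	}
}

// lookup walks one slice index per key byte; it accepts string or []byte keys.
func lookup[T any, K ~string | ~[]byte](m sketchMap[T], key K) (T, bool) {
	n := &m.nodes[0]
	for i := 0; i < len(key); i++ {
		off := int(key[i]) - int(n.lo)
		if off < 0 || off >= int(n.count) {
			var zero T
			return zero, false
		}
		n = &m.nodes[int(n.first)+off]
	}
	return n.value, n.has
}

func main() {
	m := build(map[string]uint32{"1": 10, "12": 20, "5": 50})
	fmt.Println(lookup(m, "12"))        // 20 true
	fmt.Println(lookup(m, []byte("5"))) // 50 true
	fmt.Println(lookup(m, "3"))         // 0 false
}
```

In this sketch the root covers bytes `'1'` through `'5'`, so it holds five child slots, three of which (`'2'`, `'3'`, `'4'`) are unused gaps. This is the space trade-off the README describes for keys that are not drawn from a small set of nearby bytes.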
