Refactor `contrib/raftexample` using interfaces to better separate concerns #15471

mhagger · 2023-03-14T01:05:14Z

I come to etcd by way of evaluating go.etcd.io/raft for use in a project. When I saw that there was a sample application raftexample showing how to use the module, I started reading it, hoping that this would make it obvious what components I would have to implement to use raft in my own application.

But unfortunately, it wasn't easy to figure that out from the source code of raftexample. True, it provides a nice end-to-end example. Also, raftNode is a fairly nice abstraction, showing how to deal with the channels going into and coming out of the raft library. But I didn't find the interface between raftNode and the "application code" super transparent. For example,

It is not obvious what features the Snapshotter must provide.
It's hard to see the boundary between the raft code and the finite state machine being driven by it. kvstore is the FSM, but it also does things like read straight out of raft-managed channels and interact directly with the snapshotter.
httpKVAPI contains a pointer to the whole kvstore, even though it really only interacts with its "public" interface, Propose() and Lookup().
The design forces an awkward initialization step, where main() creates a getSnaphost function that closes over an initialized variable (kvs) then passes that function into newRaftNode(). That function returns a shapshotterReady <-chan *snap.Snapshotter channel from which main() reads a Snapshotter. It was unclear how much of this was essential complexity and how much could be avoided. (It turns out that most of this complexity can be avoided.)
newRaftNode() returns commitC, which is handed on to newKVStore() and read from a goroutine in that type. It seems like it would be simpler for raftNode to read from commitC and invoke kvstore methods to trigger commits.
newRaftNode() returns errorC, to which it writes an error if there is a problem. But multiple pieces of code try to read from that channel, and the semantics of Go channels make it indeterminate which one will discover that there was an error. (In general, error reporting is somewhat inconsistent.)

This PR consists of a series of refactorings that attempt to fix a bunch of the problems listed above:

It introduces a few interfaces and uses them to better separate concerns:
- SnapshotStorage represents permanent storage for snapshots. It only needs to know how to read and write them:
```
type SnapshotStorage interface {
    SaveSnap(snapshot raftpb.Snapshot) error
    Load() (*raftpb.Snapshot, error)
    LoadNewestAvailable(walSnaps []walpb.Snapshot) (*raftpb.Snapshot, error)
}
```
- FSM represents the finite state machine that raftNode is able to drive. It only has to know how to take and restore snapshots and to apply commits:
```
type FSM interface {
    TakeSnapshot() ([]byte, error)
    RestoreSnapshot(snapshot []byte) error
    ApplyCommits(commit *commit) error
}
```
- KVStore represents the interface that httpKVAPI uses to interact with the key-value store. Note that this interface only exposes application-level concepts and doesn't really expose anything about raft, but internally it has to know how to serialize its calls into log entries that can be managed by raft:
```
type KVStore interface {
    Propose(k, v string)
    Lookup(k string) (string, bool)
}
```
kvstore now presents a KVStore face to httpKVAPI and a FSM face to raftNode, but itself is now a passive type (no internal goroutines).
Now raftNode implements more of the raft data flows internally; in particular, its ProcessCommits() method reads commits from commitC and applies them to the kvstore via the FSM interface.
Now main() creates the snapshotter and passes it to raftNode. This eliminates the complexity of snapshotterReady.
raftNode now has Done() and Err() methods (replacing the errorC channel) that the caller can use to find out when it is done and whether it experienced any errors.

I think these changes make the code easier to read, and more importantly, make it easier for the reader to figure out would they would have to change to use the raft module within their own application, even if that application doesn't happen to be a key-value store.

This is a long patch series, but each commit is a self-contained step, meant to be readable/reviewable independently, and the tests pass after each commit. I'd be happy to break it up into multiple smaller PRs if that is preferable.

Along the way I fixed a couple of other minor problems, such as the ambiguity about error reporting mentioned above, and in the strange TestProposeOnCommit() test. See the commit messages for more information.

I only changed code within contrib/raftexample. But within that package, I have changed some internal interfaces. I assume that since this is just demonstration code, there are no backwards-compatibility promises made about this package. For the same reason, I have introduced some interfaces where there were none before, which will probably decrease performance very slightly and maybe force a few more allocations. Again, my goal was to make the code easier to understand, and I assume that preserving uncompromising performance is a lower priority than clarity.

I think that some further refactoring would be nice, to better delineate the interfaces between raftNode and the write-ahead log, and perhaps between raftNode and the transport layer. But I'll wait to see how this PR is received before proceeding.

I hope this is helpful!

This interface defines what `kvstore` needs for saving and loading snapshots. Signed-off-by: Michael Haggerty <[email protected]>

This interface defines what `httpKVAPI` needs as the backing key-value store. Signed-off-by: Michael Haggerty <[email protected]>

Initialize the snapshot storage in `newRaftNode()` rather than in `startRaft()`. This is a step towards getting rid of the `snapshotStorageReady` channel. Signed-off-by: Michael Haggerty <[email protected]>

This is another step towards getting rid of `snapshotStorageReady`. Signed-off-by: Michael Haggerty <[email protected]>

Rename `newRaftNode()` to `startRaftNode()`, and change it to replay the WAL synchronously, before returning. This doesn't change anything, because the callers were all waiting for `snapshotStorageReady` before proceeding anyway. Signed-off-by: Michael Haggerty <[email protected]>

Use a local variable (in `startRaftNode()`) instead. Signed-off-by: Michael Haggerty <[email protected]>

This is more consistent with the types used elsewhere for IDs. Signed-off-by: Michael Haggerty <[email protected]>

This is the sort of thing that the caller should be able to inject, and it's not part of the raft algorithm itself. This change make it clearer what a user of the raft algorithm must supply to it. Signed-off-by: Michael Haggerty <[email protected]>

Extract two new methods, for clarity and to reduce code duplication. Signed-off-by: Michael Haggerty <[email protected]>

Change `newKVStore()` to set up the initial state, but not start the `readCommits()` goroutine anymore. Instead, change the callers to call `processCommits()` (a renamed version of `readCommits()`) themselves (in a goroutine). The main benefit of this change is that the `kvstore` can be instantiated before calling `startRaftNode()`, meaning that `startRaftNode()` can be passed `kvs.getSnapshot` as an argument. Previously, it had to be passed an anonymous function that refers to a variable (`kvs`) that has not yet been initialized. This asynchrony was confusing and unnecessary. Signed-off-by: Michael Haggerty <[email protected]>

It wasn't doing anything useful. Signed-off-by: Michael Haggerty <[email protected]>

Add a new interface, `FSM`, which represents a finite state machine that can be driven by raft Also add `kvfsm`, which is an `FSM` view of a `kvstore`. Treating `kvstore` and `kvfsm` as separate types means that the public interface of `kvstore` is not contaminated with internal methods that are only there to support raft and should not be invoked by the user. Signed-off-by: Michael Haggerty <[email protected]>

Signed-off-by: Michael Haggerty <[email protected]>

Make `cluster` hold an array of `peer`s, rather than one array for each peer attribute. Continue storing the names separately, since this `[]string` has to be passed to `startRaftNode()`. Signed-off-by: Michael Haggerty <[email protected]>

The tests can be made cleverer if each `peer` in a `cluster` can be instantiated with an arbitrary finite state machine. We'll use this to make some tests able to operate more like normal clients of `raftNode`s. Signed-off-by: Michael Haggerty <[email protected]>

Signed-off-by: Michael Haggerty <[email protected]>

Handle such errors at the caller, instead. Signed-off-by: Michael Haggerty <[email protected]>

This is a step towards moving `ProcessCommits()` out of this interface and into `raftNode`. Signed-off-by: Michael Haggerty <[email protected]>

Instead, make `LoadAndApplySnapshot()` part of the `FSM` interface, and have the callers of `newKVStore()` invoke that method after calling `newKVStore()`. This is a step towards moving this functionality entirely out of `FSM`. Signed-off-by: Michael Haggerty <[email protected]>

Now that the `FSM` interface is more capable, there is no reason that this method, which has more to do with raft anyway, can't be implemented in `raftNode`. This makes it easier to implement new FSMs without having to reimplement this method. This means that `newRaftNode()` has to return a pointer to the `raftNode`, so make that change, too. This change will make other future changes simpler, too. Signed-off-by: Michael Haggerty <[email protected]>

Make this a standard part of starting up a node. Signed-off-by: Michael Haggerty <[email protected]>

This is the last use of `kvstore.snapshotStorage`, so remove that field as well. Signed-off-by: Michael Haggerty <[email protected]>

The old method, monitoring the `errorC` channel, is not great because the error only pops out of that channel once. Which of the pieces of code that are waiting on that channel reads the error? Nobody knows! Instead, provide a `Done()` method and an `Err()` method that interested parties can use to determine when the node finishes and what error it returned. The callers haven't been changed yet. Signed-off-by: Michael Haggerty <[email protected]>

Use the `done` channel instead of the `errorC` channel for monitoring for the completion of the raft node. If the channel is closed, don't log the error, since the caller of `ProcessCommits()` is already taking care of that. Since this means that we're not killing the server by aborting the program, we now have to call `srv.Close()` explicitly. Signed-off-by: Michael Haggerty <[email protected]>

Signed-off-by: Michael Haggerty <[email protected]>

Instead of driving the channels within the test, user a cleverer FSM and call `ProcessCommits()` to manage the channels. The old version of the test only looked at the first data item in each `commit`, even though multiple data items can be sent in a single `commit`. It also only looked at the first 100 commits, even though there is a multiplication factor because each node initiates 100 proposals. The new version checks all data items, and checks that the total number of commits is as expected (namely, 100 for each node, or 300 total). Signed-off-by: Michael Haggerty <[email protected]>

Keep `commitC` and `errorC` internal to `raftNode`: don't return them from `newRaftNode()`, and don't require them as arguments to `(*raftNode).ProcessCommits()`. (Some tests still need them, but they can access the members directly on the `raftNode` instance.) Signed-off-by: Michael Haggerty <[email protected]>

ahrtr · 2023-03-14T02:26:26Z

Thanks @mhagger for the PR.

Actually we are planning to remove contrib/raftexample from etcd. Please see the comments below,

There is an issue etcd-io/raft#2 to implement a self-contained raft example in the raft repo. Probably you can discuss with @Elbehery on how to integrate your PR with his (not ready yet).

Elbehery · 2023-03-14T09:21:57Z

Thanks @ahrtr 👍🏽

@mhagger Hello ✋🏽 .. I am also based in Berlin 😄

I have dropped you a msg on k8s Slack

Looking forward

mhagger · 2023-03-14T11:16:10Z

@ahrtr: thanks for the info and the link.

I'm actually happy that raftexample is being moved out of the etcd repo. I was kindof wondering why it was there in the first place, aside from the convenience of "borrowing" dependencies from etcd. But that heavy use of etcd dependencies, IMHO, makes raftexample less useful as a pedagogical demonstration of how to use the raft library.

I think it would be attractive for raftexample to have its own barebones implementations of a WAL and a snapshotter, maybe even just using naive and non-performant (but correct!) filesystem calls for example. It might also be nice for it to offer industrial-strength implementations of those concepts, too, for people who don't want to customize them for their own application. But let these components be hidden behind a Go interface to make them easier to swap out.

stale · 2023-06-18T02:05:41Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

mhagger · 2023-07-19T23:38:16Z

Closing. I implemented the same refactoring in Elbehery/raft#1 against the raft repo. That PR supersedes this one.

mhagger added 30 commits March 12, 2023 15:11

Define SnapshotStorage interface

692742c

This interface defines what `kvstore` needs for saving and loading snapshots. Signed-off-by: Michael Haggerty <[email protected]>

Define KVStore interface

b30063f

This interface defines what `httpKVAPI` needs as the backing key-value store. Signed-off-by: Michael Haggerty <[email protected]>

raftNode: initialize snapshotStorage in newRaftNode()

233ece9

Initialize the snapshot storage in `newRaftNode()` rather than in `startRaft()`. This is a step towards getting rid of the `snapshotStorageReady` channel. Signed-off-by: Michael Haggerty <[email protected]>

newRaftNode(): inline part of the goroutine's work

824a79a

This is another step towards getting rid of `snapshotStorageReady`. Signed-off-by: Michael Haggerty <[email protected]>

raftNode.snapdir: remove member

eb1aea2

Use a local variable (in `startRaftNode()`) instead. Signed-off-by: Michael Haggerty <[email protected]>

raftNode.id: convert type to uint64

abcdd02

This is more consistent with the types used elsewhere for IDs. Signed-off-by: Michael Haggerty <[email protected]>

kvstore.loadAndApplySnapshot(), applyCommits(): extract methods

373f623

Extract two new methods, for clarity and to reduce code duplication. Signed-off-by: Michael Haggerty <[email protected]>

kvstore.loadSnapshot(): inline method

5b9f7e1

It wasn't doing anything useful. Signed-off-by: Michael Haggerty <[email protected]>

Move more functionality from kvstore to kvfsm

81cea27

Signed-off-by: Michael Haggerty <[email protected]>

TestProposeOnCommit(): add some clarifying comments

28c4524

Signed-off-by: Michael Haggerty <[email protected]>

raftexample_test: introduce peer type

e837a99

Make `cluster` hold an array of `peer`s, rather than one array for each peer attribute. Continue storing the names separately, since this `[]string` has to be passed to `startRaftNode()`. Signed-off-by: Michael Haggerty <[email protected]>

kvfsm.applyCommits(): return an error

ba6f5ba

Signed-off-by: Michael Haggerty <[email protected]>

FSM.ProcessCommits(): return an error rather than calling log.Fatal()

be93074

Handle such errors at the caller, instead. Signed-off-by: Michael Haggerty <[email protected]>

FSM.ApplyCommits(): new method

69be6dc

This is a step towards moving `ProcessCommits()` out of this interface and into `raftNode`. Signed-off-by: Michael Haggerty <[email protected]>

newRaftNode(): call LoadAndApplySnapshot() here

a2793ee

Make this a standard part of starting up a node. Signed-off-by: Michael Haggerty <[email protected]>

LoadAndApplySnapshot(): move method to raftNode and make it private

bf32472

This is the last use of `kvstore.snapshotStorage`, so remove that field as well. Signed-off-by: Michael Haggerty <[email protected]>

cluster.Close(): read any error from the node directly

03508d2

Signed-off-by: Michael Haggerty <[email protected]>

TestProposeOnCommit(): read any error from the node directly

78ece8c

Signed-off-by: Michael Haggerty <[email protected]>

cluster.Cleanup(): new method, extracted from Close()

f599f20

Signed-off-by: Michael Haggerty <[email protected]>

ahrtr mentioned this pull request Mar 14, 2023

raftexample: apply conf change entries before send message #15450

Closed

Elbehery mentioned this pull request Apr 4, 2023

WIP:standalone raftexample etcd-io/raft#41

Closed

stale bot added the stale label Jun 18, 2023

ahrtr mentioned this pull request Jul 10, 2023

Implement a self-contained raft example etcd-io/raft#2

Open

Elbehery mentioned this pull request Jul 10, 2023

add selfcontained raftexample etcd-io/raft#67

Closed

mhagger mentioned this pull request Jul 19, 2023

Refactor raftexample using interfaces to better separate concerns Elbehery/raft#1

Open

mhagger closed this Jul 19, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor `contrib/raftexample` using interfaces to better separate concerns #15471

Refactor `contrib/raftexample` using interfaces to better separate concerns #15471

mhagger commented Mar 14, 2023

ahrtr commented Mar 14, 2023

Elbehery commented Mar 14, 2023

mhagger commented Mar 14, 2023

stale bot commented Jun 18, 2023

mhagger commented Jul 19, 2023

Refactor contrib/raftexample using interfaces to better separate concerns #15471

Refactor contrib/raftexample using interfaces to better separate concerns #15471

Conversation

mhagger commented Mar 14, 2023

ahrtr commented Mar 14, 2023

Elbehery commented Mar 14, 2023

mhagger commented Mar 14, 2023

stale bot commented Jun 18, 2023

mhagger commented Jul 19, 2023

Refactor `contrib/raftexample` using interfaces to better separate concerns #15471

Refactor `contrib/raftexample` using interfaces to better separate concerns #15471