Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor contrib/raftexample using interfaces to better separate concerns #15471

Closed
wants to merge 30 commits into from

Conversation

mhagger
Copy link

@mhagger mhagger commented Mar 14, 2023

I come to etcd by way of evaluating go.etcd.io/raft for use in a project. When I saw that there was a sample application raftexample showing how to use the module, I started reading it, hoping that this would make it obvious what components I would have to implement to use raft in my own application.

But unfortunately, it wasn't easy to figure that out from the source code of raftexample. True, it provides a nice end-to-end example. Also, raftNode is a fairly nice abstraction, showing how to deal with the channels going into and coming out of the raft library. But I didn't find the interface between raftNode and the "application code" super transparent. For example,

  • It is not obvious what features the Snapshotter must provide.
  • It's hard to see the boundary between the raft code and the finite state machine being driven by it. kvstore is the FSM, but it also does things like read straight out of raft-managed channels and interact directly with the snapshotter.
  • httpKVAPI contains a pointer to the whole kvstore, even though it really only interacts with its "public" interface, Propose() and Lookup().
  • The design forces an awkward initialization step, where main() creates a getSnaphost function that closes over an initialized variable (kvs) then passes that function into newRaftNode(). That function returns a shapshotterReady <-chan *snap.Snapshotter channel from which main() reads a Snapshotter. It was unclear how much of this was essential complexity and how much could be avoided. (It turns out that most of this complexity can be avoided.)
  • newRaftNode() returns commitC, which is handed on to newKVStore() and read from a goroutine in that type. It seems like it would be simpler for raftNode to read from commitC and invoke kvstore methods to trigger commits.
  • newRaftNode() returns errorC, to which it writes an error if there is a problem. But multiple pieces of code try to read from that channel, and the semantics of Go channels make it indeterminate which one will discover that there was an error. (In general, error reporting is somewhat inconsistent.)

This PR consists of a series of refactorings that attempt to fix a bunch of the problems listed above:

  • It introduces a few interfaces and uses them to better separate concerns:

    • SnapshotStorage represents permanent storage for snapshots. It only needs to know how to read and write them:
      type SnapshotStorage interface {
          SaveSnap(snapshot raftpb.Snapshot) error
          Load() (*raftpb.Snapshot, error)
          LoadNewestAvailable(walSnaps []walpb.Snapshot) (*raftpb.Snapshot, error)
      }
      
    • FSM represents the finite state machine that raftNode is able to drive. It only has to know how to take and restore snapshots and to apply commits:
      type FSM interface {
          TakeSnapshot() ([]byte, error)
          RestoreSnapshot(snapshot []byte) error
          ApplyCommits(commit *commit) error
      }
      
    • KVStore represents the interface that httpKVAPI uses to interact with the key-value store. Note that this interface only exposes application-level concepts and doesn't really expose anything about raft, but internally it has to know how to serialize its calls into log entries that can be managed by raft:
      type KVStore interface {
          Propose(k, v string)
          Lookup(k string) (string, bool)
      }
      
  • kvstore now presents a KVStore face to httpKVAPI and a FSM face to raftNode, but itself is now a passive type (no internal goroutines).

  • Now raftNode implements more of the raft data flows internally; in particular, its ProcessCommits() method reads commits from commitC and applies them to the kvstore via the FSM interface.

  • Now main() creates the snapshotter and passes it to raftNode. This eliminates the complexity of snapshotterReady.

  • raftNode now has Done() and Err() methods (replacing the errorC channel) that the caller can use to find out when it is done and whether it experienced any errors.

I think these changes make the code easier to read, and more importantly, make it easier for the reader to figure out would they would have to change to use the raft module within their own application, even if that application doesn't happen to be a key-value store.

This is a long patch series, but each commit is a self-contained step, meant to be readable/reviewable independently, and the tests pass after each commit. I'd be happy to break it up into multiple smaller PRs if that is preferable.

Along the way I fixed a couple of other minor problems, such as the ambiguity about error reporting mentioned above, and in the strange TestProposeOnCommit() test. See the commit messages for more information.

I only changed code within contrib/raftexample. But within that package, I have changed some internal interfaces. I assume that since this is just demonstration code, there are no backwards-compatibility promises made about this package. For the same reason, I have introduced some interfaces where there were none before, which will probably decrease performance very slightly and maybe force a few more allocations. Again, my goal was to make the code easier to understand, and I assume that preserving uncompromising performance is a lower priority than clarity.

I think that some further refactoring would be nice, to better delineate the interfaces between raftNode and the write-ahead log, and perhaps between raftNode and the transport layer. But I'll wait to see how this PR is received before proceeding.

I hope this is helpful!

mhagger added 30 commits March 12, 2023 15:11
This interface defines what `kvstore` needs for saving and loading
snapshots.

Signed-off-by: Michael Haggerty <[email protected]>
This interface defines what `httpKVAPI` needs as the backing key-value
store.

Signed-off-by: Michael Haggerty <[email protected]>
Initialize the snapshot storage in `newRaftNode()` rather than in
`startRaft()`. This is a step towards getting rid of the
`snapshotStorageReady` channel.

Signed-off-by: Michael Haggerty <[email protected]>
This is another step towards getting rid of `snapshotStorageReady`.

Signed-off-by: Michael Haggerty <[email protected]>
Rename `newRaftNode()` to `startRaftNode()`, and change it to replay
the WAL synchronously, before returning. This doesn't change anything,
because the callers were all waiting for `snapshotStorageReady` before
proceeding anyway.

Signed-off-by: Michael Haggerty <[email protected]>
Use a local variable (in `startRaftNode()`) instead.

Signed-off-by: Michael Haggerty <[email protected]>
This is more consistent with the types used elsewhere for IDs.

Signed-off-by: Michael Haggerty <[email protected]>
This is the sort of thing that the caller should be able to inject,
and it's not part of the raft algorithm itself. This change make it
clearer what a user of the raft algorithm must supply to it.

Signed-off-by: Michael Haggerty <[email protected]>
Extract two new methods, for clarity and to reduce code duplication.

Signed-off-by: Michael Haggerty <[email protected]>
Change `newKVStore()` to set up the initial state, but not start the
`readCommits()` goroutine anymore. Instead, change the callers to call
`processCommits()` (a renamed version of `readCommits()`) themselves
(in a goroutine).

The main benefit of this change is that the `kvstore` can be
instantiated before calling `startRaftNode()`, meaning that
`startRaftNode()` can be passed `kvs.getSnapshot` as an argument.
Previously, it had to be passed an anonymous function that refers to a
variable (`kvs`) that has not yet been initialized. This asynchrony
was confusing and unnecessary.

Signed-off-by: Michael Haggerty <[email protected]>
It wasn't doing anything useful.

Signed-off-by: Michael Haggerty <[email protected]>
Add a new interface, `FSM`, which represents a finite state machine
that can be driven by raft

Also add `kvfsm`, which is an `FSM` view of a `kvstore`. Treating
`kvstore` and `kvfsm` as separate types means that the public
interface of `kvstore` is not contaminated with internal methods that
are only there to support raft and should not be invoked by the user.

Signed-off-by: Michael Haggerty <[email protected]>
Make `cluster` hold an array of `peer`s, rather than one array for
each peer attribute.

Continue storing the names separately, since this `[]string` has to be
passed to `startRaftNode()`.

Signed-off-by: Michael Haggerty <[email protected]>
The tests can be made cleverer if each `peer` in a `cluster` can be
instantiated with an arbitrary finite state machine. We'll use this to
make some tests able to operate more like normal clients of
`raftNode`s.

Signed-off-by: Michael Haggerty <[email protected]>
Handle such errors at the caller, instead.

Signed-off-by: Michael Haggerty <[email protected]>
This is a step towards moving `ProcessCommits()` out of this interface
and into `raftNode`.

Signed-off-by: Michael Haggerty <[email protected]>
Instead, make `LoadAndApplySnapshot()` part of the `FSM` interface,
and have the callers of `newKVStore()` invoke that method after
calling `newKVStore()`. This is a step towards moving this
functionality entirely out of `FSM`.

Signed-off-by: Michael Haggerty <[email protected]>
Now that the `FSM` interface is more capable, there is no reason that
this method, which has more to do with raft anyway, can't be
implemented in `raftNode`. This makes it easier to implement new FSMs
without having to reimplement this method.

This means that `newRaftNode()` has to return a pointer to the
`raftNode`, so make that change, too. This change will make other
future changes simpler, too.

Signed-off-by: Michael Haggerty <[email protected]>
Make this a standard part of starting up a node.

Signed-off-by: Michael Haggerty <[email protected]>
This is the last use of `kvstore.snapshotStorage`, so remove that
field as well.

Signed-off-by: Michael Haggerty <[email protected]>
The old method, monitoring the `errorC` channel, is not great because
the error only pops out of that channel once. Which of the pieces of
code that are waiting on that channel reads the error? Nobody knows!

Instead, provide a `Done()` method and an `Err()` method that
interested parties can use to determine when the node finishes and
what error it returned.

The callers haven't been changed yet.

Signed-off-by: Michael Haggerty <[email protected]>
Use the `done` channel instead of the `errorC` channel for monitoring
for the completion of the raft node. If the channel is closed, don't
log the error, since the caller of `ProcessCommits()` is already
taking care of that.

Since this means that we're not killing the server by aborting the
program, we now have to call `srv.Close()` explicitly.

Signed-off-by: Michael Haggerty <[email protected]>
Instead of driving the channels within the test, user a cleverer FSM
and call `ProcessCommits()` to manage the channels.

The old version of the test only looked at the first data item in each
`commit`, even though multiple data items can be sent in a single
`commit`. It also only looked at the first 100 commits, even though
there is a multiplication factor because each node initiates 100
proposals. The new version checks all data items, and checks that the
total number of commits is as expected (namely, 100 for each node, or
300 total).

Signed-off-by: Michael Haggerty <[email protected]>
Keep `commitC` and `errorC` internal to `raftNode`: don't return them
from `newRaftNode()`, and don't require them as arguments to
`(*raftNode).ProcessCommits()`.

(Some tests still need them, but they can access the members directly
on the `raftNode` instance.)

Signed-off-by: Michael Haggerty <[email protected]>
@ahrtr
Copy link
Member

ahrtr commented Mar 14, 2023

Thanks @mhagger for the PR.

Actually we are planning to remove contrib/raftexample from etcd. Please see the comments below,

There is an issue etcd-io/raft#2 to implement a self-contained raft example in the raft repo. Probably you can discuss with @Elbehery on how to integrate your PR with his (not ready yet).

@Elbehery
Copy link
Member

Thanks @ahrtr 👍🏽

@mhagger Hello ✋🏽 .. I am also based in Berlin 😄

I have dropped you a msg on k8s Slack

Looking forward

@mhagger
Copy link
Author

mhagger commented Mar 14, 2023

@ahrtr: thanks for the info and the link.

I'm actually happy that raftexample is being moved out of the etcd repo. I was kindof wondering why it was there in the first place, aside from the convenience of "borrowing" dependencies from etcd. But that heavy use of etcd dependencies, IMHO, makes raftexample less useful as a pedagogical demonstration of how to use the raft library.

I think it would be attractive for raftexample to have its own barebones implementations of a WAL and a snapshotter, maybe even just using naive and non-performant (but correct!) filesystem calls for example. It might also be nice for it to offer industrial-strength implementations of those concepts, too, for people who don't want to customize them for their own application. But let these components be hidden behind a Go interface to make them easier to swap out.

@stale
Copy link

stale bot commented Jun 18, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@mhagger
Copy link
Author

mhagger commented Jul 19, 2023

Closing. I implemented the same refactoring in Elbehery/raft#1 against the raft repo. That PR supersedes this one.

@mhagger mhagger closed this Jul 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

3 participants