From 1bf2e953483bec8deae689868ef93a80f4d9b0d1 Mon Sep 17 00:00:00 2001 From: shahar Date: Thu, 15 Jun 2023 10:02:41 +0300 Subject: [PATCH 1/2] Document how to benchmark Dragonfly Fixes #60 --- docs/development/benchmarking.md | 85 ++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) create mode 100644 docs/development/benchmarking.md diff --git a/docs/development/benchmarking.md b/docs/development/benchmarking.md new file mode 100644 index 00000000..b87fe72b --- /dev/null +++ b/docs/development/benchmarking.md @@ -0,0 +1,85 @@ +--- +sidebar_position: 3 +--- + +# Benchmarking + +Do you have an existing Redis environment and would like to see if Dragonfly could be a better +replacement?
+Are you developing a service and would like to determine which cloud instance type to +allocate for Dragonfly?
+Do you wonder how many replicas you need to support your workload? + +If so, read on, because this page is for you! + +## Choosing an Environment + +A benchmark is done to assess the performance aspects of a system. In the case of Dragonfly, a +benchmark is commonly used to assess the CPU and memory performance & utilization. + +Depending on the goals of your benchmark, you should choose the machine size accordingly. For a +production mimicking benchmark, you should use a machine size and traffic load similar to that of +your busiest production timing, or even higher to allow for some cushion. + +If you do not use a cloud instance, it might be a good idea to configure your CPU's governance to +performance by issuing: + +```shell +sudo apt install linux-tools-common linux-tools-generic +sudo cpupower frequency-set --governor performance +``` + +Then, when you're done with the benchmark you could reboot your machine or run the following: + +```shell +sudo cpupower frequency-set --governor powersave +``` + +## Setting Up Dragonfly + +Dragonfly can run in [Docker](/getting-started/docker) or directly installed as a +[binary](/getting-started/binary) on your machine. See the [Getting Started](/getting-started) page +for other options and the latest documentation. + +## Reducing Noise + +Ideally, a benchmark should be run in as similar as possible environment as the production setup. + +In busy production deployments, it is common to run Dragonfly in its own machine (virtual or +dedicated). If you plan to do so in your production setup as well (which we highly recommend), +consider running the benchmark in a similar way. + +In practice, it means that any other systems in your setup (like other services & databases) should +run in other machines. Importantly, also the software that sends the traffic should run in another +machine. + +## Sending Traffic + +If your service already has existing benchmarking tools, or ways to record and replay production +traffic, you should definitely use them. That would be the closest estimation to what a real +production deployment with a backing Dragonfly would look like. + +If, like many others, you do not (yet) have such a tool, you could either write your own tool to +simulate production traffic or use an existing tool like `memtier_benchmark`. + +When writing your own tool, try to recreate the production traffic as closely as possible. Use the +same commands (like `SET`, `GET`, `SADD`, etc), with the expected ratio between them, and the +expected key and value sizes. + +If you choose to use an existing benchmarking tool, a popular and mature one is +[`memtier_benchmark`](https://github.com/RedisLabs/memtier_benchmark). It's an Open Source tool for +generic load generation and benchmarking with many features. Check out their documentation page for +more details, but as a quick reference you could use: + +```shell +memtier_benchmark \ + --server= \ + --threads= \ + --clients= \ + --requests= +``` + +## Having Troubles? Anything Unclear? + +Improving our documentation and helping the community is always of the higher priority for us, so +please feel free to reach out! From e233ff5a3b2f5eb4d7be848c4d788d40404dde0f Mon Sep 17 00:00:00 2001 From: shahar Date: Thu, 15 Jun 2023 12:16:27 +0300 Subject: [PATCH 2/2] Feedback --- docs/development/benchmarking.md | 43 ++++++++++++++++++++++---------- 1 file changed, 30 insertions(+), 13 deletions(-) diff --git a/docs/development/benchmarking.md b/docs/development/benchmarking.md index b87fe72b..631c1fec 100644 --- a/docs/development/benchmarking.md +++ b/docs/development/benchmarking.md @@ -12,7 +12,7 @@ Do you wonder how many replicas you need to support your workload? If so, read on, because this page is for you! -## Choosing an Environment +## Squeezing the Best Performance A benchmark is done to assess the performance aspects of a system. In the case of Dragonfly, a benchmark is commonly used to assess the CPU and memory performance & utilization. @@ -21,20 +21,35 @@ Depending on the goals of your benchmark, you should choose the machine size acc production mimicking benchmark, you should use a machine size and traffic load similar to that of your busiest production timing, or even higher to allow for some cushion. -If you do not use a cloud instance, it might be a good idea to configure your CPU's governance to -performance by issuing: +### `io_uring` -```shell -sudo apt install linux-tools-common linux-tools-generic -sudo cpupower frequency-set --governor performance -``` +Dragonfly supports both `epoll` and [`io_uring`](https://en.wikipedia.org/wiki/Io_uring) Linux APIs. +`io_uring` is a newer API, which is faster. Dragonfly runs best with `io_uring`, but it is only +available with Linux kernels >= 5.1. -Then, when you're done with the benchmark you could reboot your machine or run the following: +`io_uring` is available in Debian versions Bullseye (11) or later, Ubuntu 21.04 or later, Red Hat +Enterprise Linux 9.3 or later, Fedora 37 or later. + +To find if your machine has `io_uring` support you could run the following: ```shell -sudo cpupower frequency-set --governor powersave +grep io_uring_setup /proc/kallsyms ``` +### Choosing Instance Type + +Cloud providers, such as Amazon AWS, provide different types and sizes of virtual machines. When in +doubt, you could always opt in for a bigger instance (for both Dragonfly and the client to send the +benchmarking traffic) so that you'll know what the upper limit is. + +### Choosing Thread Count + +By default, Dragonfly will create a thread for each available CPU on the machine. You can modify +this behavior with the `--proactor_threads` flag. Generally you should not use this flag for a +machine dedicated to running Dragonfly. You can specify a lower number if you only want Dragonfly to +utilize some of the machine, but don't specify a higher number (i.e. more than CPUs) as it would +degrade performance. + ## Setting Up Dragonfly Dragonfly can run in [Docker](/getting-started/docker) or directly installed as a @@ -50,8 +65,8 @@ dedicated). If you plan to do so in your production setup as well (which we high consider running the benchmark in a similar way. In practice, it means that any other systems in your setup (like other services & databases) should -run in other machines. Importantly, also the software that sends the traffic should run in another -machine. +run in other machines. Importantly, also **the software that sends the traffic should run in another +machine.** ## Sending Traffic @@ -68,8 +83,10 @@ expected key and value sizes. If you choose to use an existing benchmarking tool, a popular and mature one is [`memtier_benchmark`](https://github.com/RedisLabs/memtier_benchmark). It's an Open Source tool for -generic load generation and benchmarking with many features. Check out their documentation page for -more details, but as a quick reference you could use: +generic load generation and benchmarking with many features. We use it for benchmarking constantly. +Check out their [documentation +page](https://redis.com/blog/memtier_benchmark-a-high-throughput-benchmarking-tool-for-redis-memcached/) +for more details, but as a quick reference you could use: ```shell memtier_benchmark \