huge write IO without reason #7456

Closed
skliarie opened this issue Jun 10, 2020 · 13 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@skliarie commented Jun 10, 2020

Version information:

go-ipfs version: 0.5.1-8431e2e
Repo version: 9
System version: amd64/linux
Golang version: go1.13.10
runs as docker container, taken from ipfs/go-ipfs:v0.5.1

Description:

The repo is 3 TB and holds many different pins. No download is in progress, but the repo still grows at 20-30 KB/s (BTRFS, noatime,compress=zlib:3,space_cache). At the same time, the average write IO is 20-30 megabytes (!) per second.
The IO is fairly random, which slows magnetic media down tremendously, with 2-15 ipfs threads constantly sitting in the I/O-blocked state:

# vmstat -n 60
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
...
 0  2 422624 194356  21900 19943532    0    1   215 13104 7642 17343  3  2 52 43  0
 0 15 422880 203296  21896 19935368    1    1   217 13420 7754 18294  1  2 51 45  0
 0 12 423136 211772  21768 19929644    1    5   271 13350 7774 18120  3  2 47 48  0
@skliarie added the kind/bug and need/triage labels Jun 10, 2020
@RubenKelevra (Contributor) commented Jun 13, 2020

Hey @skliarie,

has the situation changed in the last 3 days?

We need some additional information about your configuration:

  • do you use flatfs or badger as the datastore?
  • do you take part in a cluster, like a collab cluster?
  • which experimental flags do you use (config file)?
  • which flags do you use on the daemon?

Can you provide debug information from the daemon while it is running the unexpected IO workload?
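
For reference, a minimal way to capture that while the IO is happening is the pprof endpoints the go-ipfs API exposes (assuming the default API address 127.0.0.1:5001; adjust for the docker port mapping):

curl -o ipfs.stacks 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2'
curl -o ipfs.heap 'http://127.0.0.1:5001/debug/pprof/heap'
curl -o ipfs.cpuprof 'http://127.0.0.1:5001/debug/pprof/profile?seconds=30'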

@skliarie (Author) commented Jun 13, 2020

I use flatfs.
No cluster; it's a standalone machine.
ipfs is started with "ipfs daemon --migrate=true"; see my attached config file.

cat /tmp/ipfs_repo_stat

NumObjects: 27482408
RepoSize: 2712552599390
StorageMax: 4000000000000
RepoPath: /data/ipfs
Version: fs-repo@9

@skliarie (Author)

ipfs_config.txt

@skliarie (Author)

ipfs-profile-bframe-2020-06-13T18:39:42+0000.tar.gz

@RubenKelevra (Contributor)

Thanks!

Do you run any operations on your node via the API, or access any data via the HTTP gateway?

Do you have any service running, like a system monitoring agent, that supports IPFS and accesses the API to collect metrics?

Does this issue persist when you restart your daemon? If so, does it start right away or after some delay?

@skliarie (Author)

I have no HTTP gateway, only the API.
No system monitoring like Prometheus (if that's what you mean), just basic checks.
Yes, the issue starts right away after a daemon restart.

@Stebalien (Member)

The repo will grow as the DHT stores data (and as you send out provider records for data), but 20-30 KB/s is way too fast. How are you calculating disk usage? Are you seeing the disk usage reported by ipfs repo stat go up by 20 KB/s?

In terms of the disk reads, this is probably bitswap. You're connected to a lot of peers that are probably looking for data. You may also want to set Datastore.BloomFilterSize (in the config) to 67108864 (64 MiB). This will construct a 64 MiB bloom filter on start and on every GC (which is expensive), so you don't have to check your disk every time a peer asks whether you have something.
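
For reference, that setting can be applied with the config command and takes effect on the next daemon start (the value is the one suggested above):

ipfs config --json Datastore.BloomFilterSize 67108864
# restart the daemon afterwards so the bloom filter is (re)built on start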

@skliarie (Author)

"ipfs repo stat" takes too long time to execute, I do df for that.

I have BloomFilterSize set to 0; I guess that means unlimited, no? If so, this would explain the huge size of the IPFS process:

# ps awux | head
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user    3223 37.0 13.6 17767376 3360848 pts/0 Sl+ Jun15 1964:04 ipfs daemon --migrate=true

And I do see (using iotop) that it is the ipfs process causing the writes, not BTRFS:

Total DISK READ :     218.76 K/s | Total DISK WRITE :      40.88 M/s
Actual DISK READ:     204.18 K/s | Actual DISK WRITE:      35.32 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                          
32037 be/4 user     29.17 K/s   40.88 M/s  0.00 % 99.99 % ipfs daemon --migrate=true

The writes don't have a constant speed; they fluctuate between 0 and 40 MB/s. I tried to find a correlation between the writes (vmstat -n 1) and the bitswap operation counters (watch ipfs stats bitswap), to no avail.
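
For reference, a minimal sketch of that kind of side-by-side logging (interval and file name are arbitrary):

while true; do
  date +%T
  ipfs stats bitswap | grep -E 'blocks (received|sent)'
  vmstat -n 1 2 | tail -1   # one 1-second sample
  sleep 9
done >> io-vs-bitswap.log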

One thing I noticed is that the writes come in bursts. See the attached output of vmstat -n 1 over 10 minutes.

It looks like there are some buffers that are flushed once in a while. Keep in mind that the node has a huge number of peers (8,755 at the moment) and 3 TB worth of various popular pinsets.

vmstat-n1.txt.gz

@skliarie (Author)

Attached are the imon.sh script I wrote to monitor the ipfs node and several hours of the output it produces.

imon.sh.gz
ipfs_repo_stats.txt.gz

@Stebalien (Member)

> I have BloomFilterSize set to 0; I guess that means unlimited, no? If so, this would explain the huge size of the IPFS process:

0 means 0 (disabled).

IPFS memory usage is usually correlated with the number of peers you have.


Looking at the profile, I see what's going on. The problem is the DHT. Your node is backed up trying to write incoming provider records to disk: libp2p/go-libp2p-kad-dht#675. It could just be that your disk is too slow, but I'm wondering if your node just happens to be close to some really popular content.

For now, I'd try running your daemon with "dhtclient".
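
For reference, that can be passed as a daemon flag or, as an alternative I'd assume also works here, set persistently in the config:

ipfs daemon --routing=dhtclient
# or persist it in the config:
ipfs config Routing.Type dhtclient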

cc @aschmahmann

@skliarie (Author)

Upgraded to ipfs 0.6.0 and switched to dhtclient; the args are now "--routing=dhtclient --migrate=true --enable-pubsub-experiment", and BloomFilterSize is 536870912.
The write speed dropped to an average of 5 MB per minute, fluctuating between 1 and 15 MB/s much more smoothly than before.
The number of partners dropped from 8k to about 700, no idea why.
The node feels faster, though.

@Stebalien (Member)

It dropped because peers are no longer trying to connect to you to advertise/find content in the DHT. Now, all of your peers are:

  1. Peers you've connected to (e.g., via the DHT).
  2. Peers trying to download content from you.

@Stebalien removed the need/triage label Mar 11, 2021
@Stebalien (Member)

Closing as there's nothing actionable here, unfortunately.
