huge write IO without reason #7456

Closed
skliarie opened this issue Jun 10, 2020 · 13 comments
Labels
kind/bug A bug in existing code (including security flaws)

Comments

@skliarie commented Jun 10, 2020

Version information:

go-ipfs version: 0.5.1-8431e2e
Repo version: 9
System version: amd64/linux
Golang version: go1.13.10
runs as docker container, taken from ipfs/go-ipfs:v0.5.1

Description:

The repo is 3 TB and holds many different pins. No download is in progress, but the repo still grows at 20-30 KB/s (BTRFS, noatime,compress=zlib:3,space_cache). At the same time, the average write IO is 20-30 megabytes (!) per second.
The IO is fairly random, which slows magnetic media down tremendously, with 2-15 ipfs threads constantly sitting in the I/O-blocked state:

# vmstat -n 60
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
...
 0  2 422624 194356  21900 19943532    0    1   215 13104 7642 17343  3  2 52 43  0
 0 15 422880 203296  21896 19935368    1    1   217 13420 7754 18294  1  2 51 45  0
 0 12 423136 211772  21768 19929644    1    5   271 13350 7774 18120  3  2 47 48  0
@skliarie added the kind/bug and need/triage labels Jun 10, 2020
@RubenKelevra (Contributor) commented Jun 13, 2020

Hey @skliarie,

has the situation changed in the last 3 days?

We need some additional information about your configuration:

  • do you use flatfs or badger as the datastore?
  • do you take part in a cluster, like a collab cluster?
  • which experimental flags do you use (config file)?
  • which flags do you use on the daemon?

Can you provide debug information from the daemon while it is running the unexpected IO workload?
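
For reference, a minimal way to capture that while the IO is happening is the pprof endpoints the go-ipfs API exposes (assuming the default API address 127.0.0.1:5001; adjust for the docker port mapping):

curl -o ipfs.stacks 'http://127.0.0.1:5001/debug/pprof/goroutine?debug=2'
curl -o ipfs.heap 'http://127.0.0.1:5001/debug/pprof/heap'
curl -o ipfs.cpuprof 'http://127.0.0.1:5001/debug/pprof/profile?seconds=30'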

@skliarie (Author) commented Jun 13, 2020

I use flatfs.
No cluster; it's a standalone machine.
ipfs is started with "ipfs daemon --migrate=true"; see my attached config file.

cat /tmp/ipfs_repo_stat

NumObjects: 27482408
RepoSize: 2712552599390
StorageMax: 4000000000000
RepoPath: /data/ipfs
Version: fs-repo@9

@skliarie (Author)

ipfs_config.txt

@skliarie (Author)

ipfs-profile-bframe-2020-06-13T18:39:42+0000.tar.gz

@RubenKelevra (Contributor)

Thanks!

Do you run any operations on your node via the API, or access any data via the HTTP gateway?

Do you have any service running, like a system monitoring agent, that supports IPFS and accesses the API to collect metrics?

Does this issue persist when you restart your daemon? If so, does it start right away or after some delay?

@skliarie (Author)

I have no HTTP gateway, only the API.
No system monitoring like Prometheus (if that's what you mean), just basic checks.
Yes, the issue starts right away after a daemon restart.

@Stebalien (Member)

The repo will grow as the DHT stores data (and as you send out provider records for data), but 20-30 KB/s is way too fast. How are you calculating disk usage? Are you seeing the disk usage reported by ipfs repo stat go up by 20 KB/s?

In terms of the disk reads, this is probably bitswap. You're connected to a lot of peers that are probably looking for data. You may also want to set Datastore.BloomFilterSize (in the config) to 67108864 (64 MiB). This will construct a 64 MiB bloom filter on start and on every GC (which is expensive), so you don't have to check your disk every time a peer asks whether you have something.
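
For reference, that setting can be applied with the config command and takes effect on the next daemon start (the value is the one suggested above):

ipfs config --json Datastore.BloomFilterSize 67108864
# restart the daemon afterwards so the bloom filter is (re)built on start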

@skliarie (Author)

"ipfs repo stat" takes too long time to execute, I do df for that.

I have BloomFilterSize set to 0; I guess that means unlimited, no? If so, this would explain the huge size of the IPFS process:

# ps awux | head
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
user    3223 37.0 13.6 17767376 3360848 pts/0 Sl+ Jun15 1964:04 ipfs daemon --migrate=true

And I do see (using iotop) that it is the ipfs process causing the writes, not BTRFS:

Total DISK READ :     218.76 K/s | Total DISK WRITE :      40.88 M/s
Actual DISK READ:     204.18 K/s | Actual DISK WRITE:      35.32 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                          
32037 be/4 user     29.17 K/s   40.88 M/s  0.00 % 99.99 % ipfs daemon --migrate=true

The writes don't have a constant speed; they fluctuate between 0 and 40 MB/s. I tried to find a correlation between the writes (vmstat -n 1) and the bitswap operation counters (watch ipfs stats bitswap), to no avail.
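
For reference, a minimal sketch of that kind of side-by-side logging (interval and file name are arbitrary):

while true; do
  date +%T
  ipfs stats bitswap | grep -E 'blocks (received|sent)'
  vmstat -n 1 2 | tail -1   # one 1-second sample
  sleep 9
done >> io-vs-bitswap.log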

One thing I noticed is that the writes come in bursts. See the attached output of vmstat -n 1 over 10 minutes.

It looks like there are some buffers that are flushed once in a while. Keep in mind that the node has a huge number of peers (8,755 at the moment) and 3 TB worth of various popular pinsets.

vmstat-n1.txt.gz

@skliarie (Author)

Attached are the imon.sh script I wrote to monitor the ipfs node and several hours of the output it produces.

imon.sh.gz
ipfs_repo_stats.txt.gz

@Stebalien (Member)

> I have BloomFilterSize set to 0; I guess that means unlimited, no? If so, this would explain the huge size of the IPFS process:

0 means 0 (disabled).

IPFS memory usage is usually correlated with the number of peers you have.


Looking at the profile, I see what's going on. The problem is the DHT. Your node is backed up trying to write incoming provider records to disk: libp2p/go-libp2p-kad-dht#675. It could just be that your disk is too slow, but I'm wondering if your node just happens to be close to some really popular content.

For now, I'd try running your daemon with "dhtclient".
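
For reference, that can be passed as a daemon flag or, as an alternative I'd assume also works here, set persistently in the config:

ipfs daemon --routing=dhtclient
# or persist it in the config:
ipfs config Routing.Type dhtclient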

cc @aschmahmann

@skliarie (Author)

Upgraded to ipfs 0.6.0 and switched to dhtclient; the args are now "--routing=dhtclient --migrate=true --enable-pubsub-experiment", and BloomFilterSize is 536870912.
The write speed dropped to an average of 5 MB per minute, fluctuating between 1 and 15 MB/s much more smoothly than before.
The number of partners dropped from 8k to about 700, no idea why.
The node feels faster, though.

@Stebalien (Member)

It dropped because peers are no longer trying to connect to you to advertise/find content in the DHT. Now, all of your peers are:

  1. Peers you've connected to (e.g., via the DHT).
  2. Peers trying to download content from you.

@Stebalien removed the need/triage label Mar 11, 2021
@Stebalien (Member)

Closing as there's nothing actionable here, unfortunately.
