Description
Hi! Thank you for the great tool ❤️
I started using it recently after migrating from another pastebin. As far as I can see, with duplicate_files = false rustypaste hashes every file in the upload folder to check whether the file being uploaded already exists.
Here is info about my existing uploads:
# ls -lha data/files/ | wc -l
727
# du -hs data/files/
3.9G .data/files/
Uploaded file:
$ ls -lha favicon-152.png
-rw-r--r-- 1 vrein vrein 7.7K Dec 20 2020 favicon-152.png
Uploading time with duplicate_files = true:
$ time curl http://127.0.0.1:8880/ -F "file=@favicon-152.png"
http://127.0.0.1:8880/favicon-152.AqlQ6JKAp9.png
real 0m0.010s
user 0m0.006s
sys 0m0.003s
Uploading time with duplicate_files = false:
$ time curl http://127.0.0.1:8880/ -F "file=@favicon-152.png"
http://127.0.0.1:8880/favicon-152.AqlQ6JKAp9.png
real 0m10.411s
user 0m0.007s
sys 0m0.003s
I then added some large random files with dd if=/dev/urandom of=largefile bs=1M count=... and summarized the results in a table:
| Total file count | Total file size | Upload time |
|---|---|---|
| 727 | 3.9G | 10.411s |
| 728 (+1x1G) | 4.9G | 13.137s |
| 729 (+1x2G +1x1G) | 6.9G | 19.254s |
| 3730 (+3k x 1M) | 6.9G | 18.403s |
Upload time depends mostly on the total size of the stored files. The file count should not have a drastic impact unless it reaches a few million.
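As a rough check of that claim, dividing each upload time by the total size from the table gives a nearly constant cost per G. A minimal Rust sketch of the arithmetic follows; the constant and function names are made up for illustration, and the numbers are copied from the table.

```rust
/// (total size in G as printed by `du -hs`, upload time in seconds),
/// copied from the table rows in order.
const MEASUREMENTS: [(f64, f64); 4] = [
    (3.9, 10.411),
    (4.9, 13.137),
    (6.9, 19.254),
    (6.9, 18.403),
];

/// Seconds of upload time per G of already stored data, one value per row.
fn seconds_per_g() -> Vec<f64> {
    MEASUREMENTS
        .iter()
        .map(|&(size_g, secs)| secs / size_g)
        .collect()
}
```

Every row comes out at roughly 2.7 to 2.8 seconds per G. The last two rows have the same total size but very different file counts (729 vs 3730) and nearly the same upload time, which points to total size as the dominant factor.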
I think this is a really great feature, but with the current implementation upload time grows as the number and size of stored files increase. It may be worth adding a simple cache mechanism, such as storing file hashes in memory or in a file.
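To illustrate the in-memory variant of that idea, here is a minimal sketch, not rustypaste's actual code. `HashCache`, `digest`, `build`, `find_duplicate` and `record` are hypothetical names, and `DefaultHasher` from the standard library is only a dependency-free stand-in for a cryptographic hash such as SHA-256. The directory is hashed once (for example at startup), after which each upload only hashes its own bytes and does one map lookup.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::fs;
use std::hash::Hasher;
use std::io;
use std::path::{Path, PathBuf};

/// Stand-in digest (assumption): a real implementation would use a
/// cryptographic hash such as SHA-256 instead of DefaultHasher.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Hypothetical in-memory index from content digest to stored file path.
#[derive(Default)]
struct HashCache {
    by_digest: HashMap<u64, PathBuf>,
}

impl HashCache {
    /// Hash every regular file in `dir` once, e.g. at server startup.
    /// Files are read fully into memory here to keep the sketch short.
    fn build(dir: &Path) -> io::Result<Self> {
        let mut cache = Self::default();
        for entry in fs::read_dir(dir)? {
            let path = entry?.path();
            if path.is_file() {
                let bytes = fs::read(&path)?;
                cache.by_digest.insert(digest(&bytes), path);
            }
        }
        Ok(cache)
    }

    /// Return the path of an already stored file with identical content, if any.
    fn find_duplicate(&self, bytes: &[u8]) -> Option<&PathBuf> {
        self.by_digest.get(&digest(bytes))
    }

    /// Record a newly stored upload so that later lookups can find it.
    fn record(&mut self, bytes: &[u8], path: PathBuf) {
        self.by_digest.insert(digest(bytes), path);
    }
}
```

With something like this, the cost of an upload would no longer depend on the total size of the upload folder. A real implementation would also need to drop entries when files expire or are deleted, and would probably stream large files instead of reading them fully into memory.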
rustypaste version: built from source bf6dd31
os: arch linux
kernel: 6.10.10-arch1-1
processor: Intel(R) Xeon(R) CPU E3-1230 v6 @ 3.50GHz
Unfortunately, I have no experience with Rust, so I can only help with testing and debugging :)