NixOS, service fails immediately result 'exit-code' #123

Closed
evils opened this issue Aug 21, 2019 · 24 comments

evils commented Aug 21, 2019

Hi, I hope this is the right place for this issue.
The nixpkgs repo seems quite busy, and @charles-dyfis-net (the nix package maintainer?) seems active here too.

I'm running on NixOS 19.03 with the latest kernel (5.2.9) and bees-service 0.6.1

My config is:

  boot.kernelPackages = pkgs.linuxPackages_latest;
  
  services.beesd.filesystems = {
    bulk = { 
      spec = "LABEL=bulk";
      hashTableSizeMB = 2048;
      verbosity = 7;
    };
  };

systemctl status shows:

beesd@bulk.service - Block-level BTRFS deduplication for bulk
   Loaded: loaded (/nix/store/cpd8f8y88dkv13q127sbjva0ffq06mnn-unit-beesd-bulk.service/beesd@bulk.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Wed 2019-08-21 20:06:32 CEST; 42min ago
  Process: 21979 ExecStopPost=/nix/store/lqbb19zgqnm3m2miyixk3m8l0zw4v31y-bees-service-0.6.1/bin/bees-service-wrapper cleanup LABEL=bulk verbosity=7 idxSizeMB=2048 workDir=.beeshome (code=exited, status=1/FAILURE)
  Process: 21976 ExecStart=/nix/store/lqbb19zgqnm3m2miyixk3m8l0zw4v31y-bees-service-0.6.1/bin/bees-service-wrapper run LABEL=bulk verbosity=7 idxSizeMB=2048 workDir=.beeshome -- --no-timestamps (code=exited, status=1/FAILURE)
 Main PID: 21976 (code=exited, status=1/FAILURE)
      CPU: 11ms

Aug 21 20:06:32 valix systemd[1]: Started Block-level BTRFS deduplication for bulk.
Aug 21 20:06:32 valix systemd[1]: beesd@bulk.service: Main process exited, code=exited, status=1/FAILURE
Aug 21 20:06:32 valix systemd[1]: beesd@bulk.service: Control process exited, code=exited status=1
Aug 21 20:06:32 valix systemd[1]: beesd@bulk.service: Failed with result 'exit-code'.
Aug 21 20:06:32 valix systemd[1]: beesd@bulk.service: Consumed 11ms CPU time

Changing the verbosity doesn't change this. Is there another location I should look for logs?

Nix only seems to provide the beesd command and not bees; attempting to run sudo beesd -v 8 <UUID> fails with:

/nix/store/wnjv27b3j6jfdl0968xpcymlc7chpqil-gnugrep-3.3/bin/grep: /var/run/bees/configs/bees//*.conf: No such file or directory
ERROR: No config for <UUID>

Thanks for your consideration, and let me know if there is anything I can do to help solve this.

@charles-dyfis-net

Hmm -- I should probably get a nixpkgs PR into place for bees 0.7alpha, which is what I'm actually using in production these days. That said, the service wrapper is unmodified between the releases, so I don't expect there'll be any impact on your issue, whatever it may be.

What you can do immediately is run the service wrapper by hand, with bees_debug=1 set in the environment, to get a trace-level log of startup. Thus:

nix run nixpkgs.bees -c env bees_debug=1 bees-service-wrapper run LABEL=bulk verbosity=7 idxSizeMB=2048 workDir=.beeshome -- --no-timestamps

That should provide a trace-level log to allow further analysis.
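For context on what that trace looks like: the `:bees-service-wrapper:NN+command` lines it produces are bash xtrace output. A minimal sketch of the assumed mechanism (hypothetical function name; not the wrapper's verbatim source):

```shell
# Hypothetical sketch: how a bash wrapper can emit ":script:line+command"
# trace output when bees_debug is set in the environment.
maybe_enable_trace() {
  if [[ ${bees_debug-} ]]; then
    # Prefix every traced command with the script name and line number;
    # the leading ':' repeats at deeper nesting levels (e.g. '::...').
    PS4=':${BASH_SOURCE##*/}:$LINENO+'
    set -x
  fi
}
```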

evils commented Aug 21, 2019

Thanks for such a quick response. Here's the result of that command:

$ nix run nixpkgs.bees -c env bees_debug=1 bees-service-wrapper run LABEL=bulk verbosity=7 idxSizeMB=2048 workDir=.beeshome -- --no-timestamps
[8 copied (27.1 MiB), 4.4 MiB DL]
:bees-service-wrapper:45+allConfigNames=(blockdev fsSpec home idxSize idxSizeMB mntDir runDir status verbosity workDir)
:bees-service-wrapper:48+altConfigNames=([BEESHOME]=home [BEESSTATUS]=status [MNT_DIR]=mntDir [UUID]=uuid [WORK_DIR]=runDir [DB_SIZE]=idxSize)
:bees-service-wrapper:48+declare -A altConfigNames
:bees-service-wrapper:88+uuid_re='^[[:xdigit:]]{8}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{12}$'
:bees-service-wrapper:202+((  7 >= 2  ))
:bees-service-wrapper:203+declare -f do_run
:bees-service-wrapper:204+mode=run
:bees-service-wrapper:204+shift
:bees-service-wrapper:206+args=("$1")
:bees-service-wrapper:206+declare -a args
:bees-service-wrapper:206+shift
:bees-service-wrapper:210+((  5  ))
:bees-service-wrapper:211+[[ verbosity=7 = *=* ]]
:bees-service-wrapper:212+set_option verbosity=7
:bees-service-wrapper:82+local k v
:bees-service-wrapper:83+k=verbosity
:bees-service-wrapper:83+v=7
:bees-service-wrapper:84+[[ -n '' ]]
:bees-service-wrapper:85+printf -v bees_verbosity %s 7
:bees-service-wrapper:220+shift
:bees-service-wrapper:210+((  4  ))
:bees-service-wrapper:211+[[ idxSizeMB=2048 = *=* ]]
:bees-service-wrapper:212+set_option idxSizeMB=2048
:bees-service-wrapper:82+local k v
:bees-service-wrapper:83+k=idxSizeMB
:bees-service-wrapper:83+v=2048
:bees-service-wrapper:84+[[ -n '' ]]
:bees-service-wrapper:85+printf -v bees_idxSizeMB %s 2048
:bees-service-wrapper:220+shift
:bees-service-wrapper:210+((  3  ))
:bees-service-wrapper:211+[[ workDir=.beeshome = *=* ]]
:bees-service-wrapper:212+set_option workDir=.beeshome
:bees-service-wrapper:82+local k v
:bees-service-wrapper:83+k=workDir
:bees-service-wrapper:83+v=.beeshome
:bees-service-wrapper:84+[[ -n '' ]]
:bees-service-wrapper:85+printf -v bees_workDir %s .beeshome
:bees-service-wrapper:220+shift
:bees-service-wrapper:210+((  2  ))
:bees-service-wrapper:211+[[ -- = *=* ]]
:bees-service-wrapper:213+[[ -- = -- ]]
:bees-service-wrapper:214+shift
:bees-service-wrapper:215+args+=("$@")
:bees-service-wrapper:216+break
:bees-service-wrapper:223+do_run LABEL=bulk --no-timestamps
:bees-service-wrapper:165+local db old_db_size
:bees-service-wrapper:167+_setup LABEL=bulk
:bees-service-wrapper:96+declare fstype
:bees-service-wrapper:97+bees_fsSpec=LABEL=bulk
:bees-service-wrapper:97+shift
:bees-service-wrapper:100+bees_config_dir=/etc/bees
:bees-service-wrapper:101+[[ LABEL=bulk =~ ^[[:xdigit:]]{8}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{4}-[[:xdigit:]]{12}$ ]]
:bees-service-wrapper:138+[[ LABEL=bulk = */* ]]
:bees-service-wrapper:138+readConfigFileIfExists /etc/bees/LABEL=bulk.conf
:bees-service-wrapper:68+local line
:bees-service-wrapper:69+[[ -s /etc/bees/LABEL=bulk.conf ]]
:bees-service-wrapper:69+return 1
:bees-service-wrapper:141+[[ -n '' ]]
:bees-service-wrapper:143+read -r bees_uuid fstype
::bees-service-wrapper:143+findmnt -n -o uuid,fstype LABEL=bulk
:bees-service-wrapper:143+exit

charles-dyfis-net commented Aug 21, 2019

Ahh! So the findmnt command can't find a filesystem with a label of bulk.

If you want to look at the identifiers for your various filesystems, you might look at the output of blkid (or, if you don't have util-linux in your current environment, nix run nixpkgs.utillinux -c blkid) to update your configuration file with a selector that'll correctly match the filesystem in question.


evils commented Aug 21, 2019

That's what I thought too, but running findmnt -n -o uuid bulk finds the UUID of the btrfs filesystem called bulk.
I just tried the same Nix configuration with spec = "UUID=<UUID>";, and that seems to have the same failure mode.

@charles-dyfis-net

Hmmm. So, the failing line is:

read -r bees_uuid fstype < <(findmnt -n -o uuid,fstype "$bees_fsSpec") && [[ $fstype ]] || exit

If you run findmnt -n -o uuid,fstype LABEL=bulk; echo "status=$?", what's the full/precise output?
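That failure pattern can be reproduced in isolation: when the command in the process substitution produces no second field, the `&& [[ $fstype ]]` check fails and the `|| exit` branch fires silently. A standalone sketch (hypothetical helper name, feeding the reader fixed input instead of findmnt):

```shell
# Standalone reproduction of the wrapper's pattern (hypothetical helper):
# read uuid/fstype from supplied output; bail out if there is no second
# field, mirroring `read -r bees_uuid fstype < <(findmnt ...) && [[ $fstype ]] || exit`.
check() {
  local uuid fstype
  read -r uuid fstype < <(printf '%s\n' "$1") && [[ $fstype ]] || return 1
  echo "uuid=$uuid fstype=$fstype"
}
```

check '6808-90AA vfat' prints both fields; check '' (findmnt finding nothing) returns non-zero, which in the wrapper becomes the abrupt exit seen at the end of the trace above.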


evils commented Aug 21, 2019

status=1

@charles-dyfis-net

Hmmmm. Any chance adding the -e flag to findmnt (--evaluate) changes that? I don't need it on my system, but I wonder if it might be related to what you're seeing on yours.


evils commented Aug 21, 2019

no change


evils commented Aug 21, 2019

I'm not clear on how findmnt is supposed to use the argument LABEL=bulk (this is findmnt from util-linux 2.33.1).
Maybe something like this is intended? findmnt -no uuid,fstype,label | awk '/bulk/ {print $1}'
Though just findmnt -no uuid bulk is equivalent.


evils commented Aug 21, 2019

This seems to work:
sudo nix run nixpkgs.bees -c env bees_debug=1 bees-service-wrapper run bulk verbosity=7 idxSizeMB=2048 workDir=.beeshome -- --no-timestamps


evils commented Aug 21, 2019

I suspect the LABEL= part is supposed to be stripped at some point.
Just removing it from the Nix config doesn't seem to work.


charles-dyfis-net commented Aug 21, 2019

As the author of the script in question -- I very much intended LABEL= to be passed to findmnt, so one could use other selectors as appropriate. For example:

$ findmnt -n -o uuid,fstype LABEL=boot
6808-90AA                            vfat

...or...

$ findmnt -n -o uuid,fstype PARTLABEL=primary
c080615f-0c0f-4b65-9cd9-ef5cd4ccd3e5 btrfs

@charles-dyfis-net

...that said, if we change the -n to -ne, does it work with the LABEL= still in place for you?


evils commented Aug 21, 2019

no


evils commented Aug 21, 2019

Adding --fstab makes it output the UUID followed by the fstype, though.

@charles-dyfis-net

Ahh! Is the filesystem mounted at the time of testing?


evils commented Aug 21, 2019

yes

@charles-dyfis-net

Interesting. If you can figure out how to reproduce it, I'd be curious to run that down. (You might use the automated tests at https://github.com/NixOS/nixpkgs/blob/master/nixos/tests/bees.nix as a starting point.)

@charles-dyfis-net

...anyhow, if our immediate fix is to add an option in the wrapper script and Nix module to pass --fstab through, I don't have a particular problem with doing so -- with the caveat that it feels a little hacky, like we're working around a symptom without having found a root cause for the problem.


evils commented Aug 21, 2019

I'm a hobbyist, quite new to NixOS, and unsure of how to do either of those things...

@charles-dyfis-net

I'll try to get a PR into nixpkgs with the extra wrapper option this weekend at the latest, and can help you run/test it locally when we get there.

Speaking of which -- there are arguably two parts to the fix. The Nix module part will become a nixpkgs PR; the part that applies to this repo is an addition to PR #104 and can arguably be folded into that ticket, so I don't think there's call to keep this issue open here.

@charles-dyfis-net

Thinking about it a bit more --

Rather than passing through an extra configuration knob, it would be better to add an automatic fallback from kernel to fstab lookup; that way we aren't adding another thing that needs to be tuned.
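The fallback described above could be sketched roughly like this (hypothetical function name; illustrative only, not the code that actually landed in the wrapper):

```shell
# Rough sketch of a kernel -> mtab -> fstab lookup fallback
# (hypothetical function name; not the wrapper's actual code).
lookup_fs() {
  local spec=$1 out extra
  # Try kernel introspection first, then the mtab and fstab views.
  # $extra is intentionally unquoted so the empty first entry expands away.
  for extra in "" "--mtab" "--fstab"; do
    if out=$(findmnt -n -o uuid,fstype $extra "$spec") && [[ $out ]]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  return 1
}
```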


evils commented Aug 21, 2019

That does seem like a cleaner way to go about it.
I didn't expect this would get pushed to nixpkgs so quickly.
Would you still want me to try to replicate the issue? It sounds like your PR would be ready before I figure out how to run the unmodified test...

evils closed this as completed Aug 21, 2019
@charles-dyfis-net

One important caveat: I expect to have a PR filed against nixpkgs over the weekend; that's not to say it'll be merged. The review queue there is long and slow-moving; being flagged as a package's maintainer gets one a little bit of leeway, but not always very much.

Either way, I'll @-notify you on that PR when it's filed, and we can pick back up there.

FRidh pushed a commit to NixOS/nixpkgs that referenced this issue May 9, 2020
As reported by @evils-devils in Zygo/bees#123, the bees service wrapper
can fail on account of `findmnt` not being able to identify a mounted
filesystem using the default (kernel-introspection) mechanism.

Fall back to mtab and fstab-based inspection in turn should this fail.