Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Goroutine leak in tests #79

Open
awnumar opened this issue Jul 4, 2019 · 10 comments
Open

Goroutine leak in tests #79

awnumar opened this issue Jul 4, 2019 · 10 comments

Comments

@awnumar
Copy link
Contributor

awnumar commented Jul 4, 2019

Version: master

Steps to Reproduce

  1. Checkout master.
  2. Run $ go test -race -count=100 ./...
  3. Observe goroutine leak

screen

... scrolls up for a loong time. Interestingly removing the go qualifier from https://github.com/davrodpin/mole/blob/master/tunnel/tunnel.go#L254-L255 just causes the test to hang indefinitely.

@awnumar
Copy link
Contributor Author

awnumar commented Jul 4, 2019

What is the purpose of these two lines? The connection objects are being simultaneously written to each other. How is this safe?

mole/tunnel/tunnel.go

Lines 254 to 255 in a7edeac

go copyConn(channel.conn, remoteConn)
go copyConn(remoteConn, channel.conn)

mole/tunnel/tunnel.go

Lines 334 to 339 in a7edeac

func copyConn(writer, reader net.Conn) {
_, err := io.Copy(writer, reader)
if err != nil {
log.Errorf("%v", err)
}
}

@davrodpin
Copy link
Owner

@awnumar, what is the Go version you're using?

I ran the test with -race- and -count 100 and got no errors:

$ for x in $(seq 1 10);do  go test -race -count=100 ./... || break; done
ok      github.com/davrodpin/mole/cli   1.257s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.363s
ok      github.com/davrodpin/mole/tunnel        11.749s
ok      github.com/davrodpin/mole/cli   1.270s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.296s
ok      github.com/davrodpin/mole/tunnel        12.088s
ok      github.com/davrodpin/mole/cli   1.255s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.327s
ok      github.com/davrodpin/mole/tunnel        12.302s
ok      github.com/davrodpin/mole/cli   1.248s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.257s
ok      github.com/davrodpin/mole/tunnel        12.243s
ok      github.com/davrodpin/mole/cli   1.245s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.260s
ok      github.com/davrodpin/mole/tunnel        12.256s
ok      github.com/davrodpin/mole/cli   1.225s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       1.976s
ok      github.com/davrodpin/mole/tunnel        11.905s
ok      github.com/davrodpin/mole/cli   1.251s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.372s
ok      github.com/davrodpin/mole/tunnel        12.161s
ok      github.com/davrodpin/mole/cli   1.250s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.301s
ok      github.com/davrodpin/mole/tunnel        12.091s
ok      github.com/davrodpin/mole/cli   1.447s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.777s
ok      github.com/davrodpin/mole/tunnel        13.391s
ok      github.com/davrodpin/mole/cli   1.321s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.697s
ok      github.com/davrodpin/mole/tunnel        12.364s

@davrodpin
Copy link
Owner

What is the purpose of these two lines? The connection objects are being simultaneously written to each other. How is this safe?

The purpose is to create a bidirectional channel between the client and the remote endpoint by keep sending the data from the client to the ssh server to the remote endpoint and vice-versa.

Since io.Copy will only stop copying if either the source or the destination EOF, I can keep two goroutines, one for each direction.

client <-- channel.conn --> ssh server <-- remoteConn --> endpoint

I don't see why it wouldn't be safe but please let me know your thoughts.

@awnumar
Copy link
Contributor Author

awnumar commented Jul 5, 2019

:: go version go1.12.6 linux/amd64

Could you run $ ulimit -l on your system and see what you get?

Since io.Copy will only stop copying if either the source or the destination EOF, I can keep two goroutines, one for each direction.

Ah, I understand. It's safe because what you write to X is not the same data you read from X. My local tests are still playing up but I'm still investigating so I'll let you know.

@davrodpin
Copy link
Owner

Could you run $ ulimit -l on your system and see what you get?

Same result with go 1.12.6 and ulimit -l.

I will try on a linux machine later today.

$ go version
go version go1.12.6 darwin/amd64
$ ulimit -l
unlimited
$ go test -race -count=100 ./...
ok      github.com/davrodpin/mole/cli   1.304s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.545s
ok      github.com/davrodpin/mole/tunnel        11.374s
$

@davrodpin
Copy link
Owner

@awnumar, I couldn't reproduce the issue above on linux (see below).

But I find issues with two tests, TestHandlePassword and TestUpdatePassphrase, from tunnel/key_test.go when the -count=100 flag is passed (see bellow)

All tests but TestHandlePassword and TestUpdatePassphrase

$ uname -a
Linux xyz 4.4.0-151-generic #178-Ubuntu SMP Tue Jun 11 08:30:22 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
$ ulimit -l
64
$ go version
go version go1.12.6 linux/amd64
$ go test -race -count=100 ./...
ok      github.com/davrodpin/mole/cli   1.424s
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       2.326s
ok      github.com/davrodpin/mole/tunnel        21.818s

TestHandlePassword (similar result from TestUpdatePassphrase execution)

$ go test -race -count=100 ./... -run TestHandlePassword
ok      github.com/davrodpin/mole/cli   1.053s [no tests to run]
?       github.com/davrodpin/mole/cmd/mole      [no test files]
ok      github.com/davrodpin/mole/storage       1.061s [no tests to run]
--- FAIL: TestHandlePassword (0.00s)
panic: <memguard::memcall::Lock> could not acquire lock on 0x7fbd9ab30000, limit reached? [Err: cannot allocate memory] [recovered]
        panic: <memguard::memcall::Lock> could not acquire lock on 0x7fbd9ab30000, limit reached? [Err: cannot allocate memory]

goroutine 39 [running]:
testing.tRunner.func1(0xc0002d4500)
        /users/mole/go/src/testing/testing.go:830 +0x69d
panic(0x951ea0, 0xc0002be540)
        /users/mole/go/src/runtime/panic.go:522 +0x1b5
github.com/awnumar/memguard/core.Panic(0x951ea0, 0xc0002be540)
        /users/mole/go/pkg/mod/github.com/awnumar/[email protected]/core/exit.go:73 +0x28f
github.com/awnumar/memguard/core.NewBuffer(0x4, 0x8, 0xc0002c02a8, 0xc0001aac98)
        /users/mole/go/pkg/mod/github.com/awnumar/[email protected]/core/buffer.go:75 +0x776
github.com/awnumar/memguard.NewBuffer(0x4, 0xc0002c02a8)
        /users/mole/go/pkg/mod/github.com/awnumar/[email protected]/buffer.go:44 +0x3d
github.com/awnumar/memguard.NewBufferFromBytes(0xc0002c02a8, 0x4, 0x8, 0x4)
        /users/mole/go/pkg/mod/github.com/awnumar/[email protected]/buffer.go:60 +0x3d
github.com/davrodpin/mole/tunnel.(*PemKey).updatePassphrase(0xc000303eb0, 0xc0002c02a8, 0x4, 0x8)
        /users/mole/mole/tunnel/key.go:116 +0xc1
github.com/davrodpin/mole/tunnel.(*PemKey).HandlePassphrase(0xc0001aaeb0, 0xc0001aae80, 0xc0002f3800, 0x9f2)
        /users/mole/mole/tunnel/key.go:101 +0x327
github.com/davrodpin/mole/tunnel.TestHandlePassword(0xc0002d4500)
        /users/mole/mole/tunnel/key_test.go:75 +0x1a9
testing.tRunner(0xc0002d4500, 0x9e12b8)
        /users/mole/go/src/testing/testing.go:865 +0x164
created by testing.(*T).Run
        /users/mole/go/src/testing/testing.go:916 +0x65b
FAIL    github.com/davrodpin/mole/tunnel        0.079s

@awnumar
Copy link
Contributor Author

awnumar commented Jul 8, 2019

@davrodpin I also caught this issue. I'm in the process of designing a patch for it.

LockedBuffer objects lock pages into memory, preventing the kernel from swapping them out to disk. On some systems there is a per-process limit. On yours it seems to be 64 kilobytes.

My idea is to load the pem key from file and decrypt it with the passphrase into the decrypted form immediately, and store the raw decrypted key in a LockedBuffer object which can then be sealed into an Enclave object. It's possible to store as many Enclave objects as you like.

The other solution is to destroy PemKey objects that are no longer needed but this is in my opinion less clean and could result in leaks still.

This is probably unrelated to the Goroutine leak issue however so perhaps open a new issue to track it?

@awnumar
Copy link
Contributor Author

awnumar commented Jul 8, 2019

Are these PemKey objects used only to generate ssh.Signers?

@davrodpin
Copy link
Owner

My idea is to load the pem key from file and decrypt it with the passphrase into the decrypted form immediately, and store the raw decrypted key in a LockedBuffer object which can then be sealed into an Enclave object. It's possible to store as many Enclave objects as you like.

Not sure how effective would be to keep the key safe since it will be copied to the signer object and we won't be able to protect that part of the memory.
To keep all secure, we might need to keep the passphrase in a way it could be reused (I am currently working on a feature that will allow tunnel reconnection) then resolve and load the pemkey and create the signer object in one shot.
Then we could keep the signer object sealed until it is used and then destroy it.

@awnumar
Copy link
Contributor Author

awnumar commented Jul 16, 2019

@davrodpin Yeah thought that might be the case. I'm looking into trying to initialise the Signer object inside a guarded memory region but for now all I've done is store the decrypted key in an Enclave object.

It's not yet finished and is broken with a segmentation fault somewhere.

Tree: https://github.com/awnumar/mole/tree/enclave
Commit: awnumar@aac96dc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants