Segfault when repeatedly calling endpoint and compiled in release mode #790

Description

@timvisee

I seem to be getting a segfault when repeatedly calling a Rocket endpoint, when running a build in release mode. I'm calling this endpoint from a browser using an axios GET request. I'm not quite sure whether Rocket itself is causing this segfault, as I'm not too familiar with debugging and solving these kinds of issues. Because of that I'm mostly guessing in the wild and attempting various things, which I'll try to report here. Maybe it's caused by a different crate or by the latest Rust nightly, in which case I will escalate the issue.

Simplify endpoint/route
I attempted to simplify the Rocket endpoint as much as possible, by just returning a Json<bool> and not doing any special logic inside the endpoint/route itself. The segfault still kept happening.
The endpoint that was still causing segfaults when called can be seen here.
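For reference, the simplified route boils down to roughly the following. This is a sketch against Rocket 0.4-dev; the route path below is a placeholder, the actual code is linked above.

#![feature(proc_macro_hygiene, decl_macro)]

#[macro_use]
extern crate rocket;
extern crate rocket_contrib;

use rocket_contrib::json::Json;

// Simplified endpoint that still triggers the segfault when polled repeatedly.
// The path is a placeholder, not the exact route from the project.
#[get("/api/v1/visualizer/points")]
fn visualizer_points() -> Json<bool> {
    Json(true)
}

fn main() {
    rocket::ignite()
        .mount("/", routes![visualizer_points])
        .launch();
}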

System allocator
The project uses the default allocator. Switching to the system allocator using the following code does prevent the segfault from happening. This shouldn't be the solution, however:

#![feature(alloc_system, allocator_api)]
extern crate alloc_system;
use alloc_system::System;

// Replace the default allocator (jemalloc) with the system allocator.
#[global_allocator]
static A: System = System;
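For what it's worth, the same switch can also be made without the alloc_system crate through std::alloc::System, which should work on a stable toolchain as well. A minimal sketch:

use std::alloc::System;

// Route all heap allocations through the system allocator (malloc/free)
// instead of the jemalloc default.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    // Allocations made from here on use the system allocator.
    let buffer: Vec<u8> = Vec::with_capacity(1024);
    println!("capacity: {}", buffer.capacity());
}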

Valgrind
I ran the project through Valgrind with the default options and got several different traces. I don't know whether these are useful, as I'm not too familiar with these segfaults. Here is one of those traces:

==25874== Memcheck, a memory error detector
==25874== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==25874== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==25874== Command: ./target/release/cant-touch-this
==25874==
Initializing core...
Not loading templates, no file exists
🔧  Configured for production.
    => address: 0.0.0.0
    => port: 8000
    => log: critical
    => workers: 2
    => secret key: provided
    => limits: forms = 32KiB
    => keep-alive: 5s
    => tls: disabled
    => [extra] template_dir: "./res/templates"
🚀  Rocket has launched from http://0.0.0.0:8000
==25874==
==25874== Process terminating with default action of signal 11 (SIGSEGV)
==25874==  Bad permissions for mapped region at address 0x6600000
==25874==    at 0x3DFF98: je_tcache_bin_flush_small (tcache.c:0)
==25874==    by 0x3C1C6A: je_tcache_dalloc_small (tcache.h:419)
==25874==    by 0x3C1C6A: je_arena_sdalloc (arena.h:1499)
==25874==    by 0x3C1C6A: je_isdalloct (jemalloc_internal.h:1195)
==25874==    by 0x3C1C6A: je_isqalloc (jemalloc_internal.h:1205)
==25874==    by 0x3C1C6A: isfree (jemalloc.c:1921)
==25874==    by 0x14D8F1: core::ptr::drop_in_place (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x1536C5: cant_touch_this::web::server::rocket_route_fn_visualizer_points (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x287AB9: <F as rocket::handler::Handler>::handle (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x2A250A: rocket::rocket::Rocket::route_and_process (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x29FF15: <rocket::rocket::Rocket as hyper::server::Handler>::handle (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x28EA78: <hyper::server::Worker<H>>::handle_connection (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x2A6E07: hyper::server::listener::spawn_with::{{closure}} (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x2A4646: std::sys_common::backtrace::__rust_begin_short_backtrace (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x2A4C16: _ZN3std9panicking3try7do_call17h53241b36e4781d0eE.llvm.1788207313042385272 (in /home/timvisee/projects/cant-touch-this/target/release/cant-touch-this)
==25874==    by 0x3B9C09: __rust_maybe_catch_panic (lib.rs:102)
==25874==
==25874== HEAP SUMMARY:
==25874==     in use at exit: 310,727 bytes in 1,269 blocks
==25874==   total heap usage: 4,587 allocs, 3,318 frees, 710,817 bytes allocated
==25874==
==25874== LEAK SUMMARY:
==25874==    definitely lost: 0 bytes in 0 blocks
==25874==    indirectly lost: 0 bytes in 0 blocks
==25874==      possibly lost: 4,256 bytes in 14 blocks
==25874==    still reachable: 306,471 bytes in 1,255 blocks
==25874==                       of which reachable via heuristic:
==25874==                         stdstring          : 10,938 bytes in 243 blocks
==25874==         suppressed: 0 bytes in 0 blocks
==25874== Rerun with --leak-check=full to see details of leaked memory
==25874==
==25874== For counts of detected and suppressed errors, rerun with: -v
==25874== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
fish: “and valgrind ./target/release/c…” terminated by signal SIGSEGV (Address boundary error)

And two other traces (Gist).

Question
So, the question of course is: what is causing these segfaults, and how can we solve them? Is this caused by an issue in Rocket, or in another crate? Or is it my own wrongdoing?

I sadly have no idea at all what the cause could be. I saw quite a few serde-related entries for (de)serializing JSON before, and thus thought that serde was the problem. However, the issue kept occurring when I started returning a plain String.
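For completeness, the plain-String variant I tried looks roughly like this (same placeholder path as in the sketch above):

// Hypothetical sketch of the String-returning route that still segfaulted.
#[get("/api/v1/visualizer/points")]
fn visualizer_points() -> String {
    "".into()
}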

Version and system information:

Type        Version
Rocket      v0.4-dev @ f857f81 (from git)
Host        Ubuntu 18.04 (Linux 4.15.0-33-generic x86_64)
rustc       1.31.0-nightly (423d81098 2018-10-08)
cargo       1.31.0-nightly (ad6e5c003 2018-09-28)
My project  timvisee/cant-touch-this#89af935

Additional notes

  • This only occurs in --release mode.
  • It seems to have been happening for about two weeks (while I was using 46da03c).
  • I'm calling the endpoint every 50 ms, so about 20 times per second (see the reproduction sketch after this list).
  • It looks like roughly 39 requests succeed, after which the ~40th fails.
  • Returning a String (with "".into()) instead of a Json value doesn't prevent the segfault.
  • Normal (successful) requests aren't shown in the Valgrind logs, as I've set logging to critical.
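To reproduce this without a browser, a simple polling client along these lines can be used. This is a sketch assuming the reqwest crate and the placeholder route path from above, not the project's actual client:

extern crate reqwest;

use std::thread;
use std::time::Duration;

fn main() {
    // Hit the endpoint every 50 ms, mirroring the axios GET loop in the browser.
    for i in 1u64.. {
        match reqwest::get("http://localhost:8000/api/v1/visualizer/points") {
            Ok(response) => println!("request {}: {}", i, response.status()),
            Err(err) => println!("request {} failed: {}", i, err),
        }
        thread::sleep(Duration::from_millis(50));
    }
}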

Metadata

Labels: no bug (the reported bug was confirmed nonexistent)