Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow for big types in mpsc channels on separate thread #102246

Closed
janie177 opened this issue Sep 24, 2022 · 8 comments · Fixed by #132738
Closed

Stack overflow for big types in mpsc channels on separate thread #102246

janie177 opened this issue Sep 24, 2022 · 8 comments · Fixed by #132738
Labels
C-bug Category: This is a bug. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@janie177
Copy link

I tried this code:

use std::sync::mpsc::*;

struct BigStruct {
    _data: [u32; 32000],
}

fn main() {

let thread = std::thread::spawn(move || {
    let (_sender, receiver) = channel::<BigStruct>();
    
    for _data in receiver.try_iter() {

    }
    
});

    thread.join().unwrap();
    println!("Done :)");
}

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=7ebdbceac5477ebb48a65f5ecacb5c62

I expected to see this happen:
The program shouldn't do much beside printing a friendly message.

Instead, this happened:
When running in Debug mode, the program crashes due to a stack overflow. In release mode it behaves as expected. Looping over the data in the receiver in combination with being on a separate thread triggers it.
I tried running on nightly too, which produces the same results.
I've now switched to crossbeam as a workaround, which does not suffer from the same issue fortunately.

rustc --version --verbose:

rustc 1.64.0 (a55dd71d5 2022-09-19)
binary: rustc
commit-hash: a55dd71d5fb0ec5a6a3a9e8c27b2127ba491ce52
commit-date: 2022-09-19
host: x86_64-pc-windows-msvc
release: 1.64.0
LLVM version: 14.0.6
Backtrace

thread '<unknown>' has overflowed its stack

@janie177 janie177 added the C-bug Category: This is a bug. label Sep 24, 2022
@lapointexavier
Copy link

lapointexavier commented Aug 1, 2023

I've run into this when upgrading from 1.70.0 to 1.71.0. Compiling in release mode prevented the segfault when using 1.71.0, though the issue was not happening in 1.70.0.

@ChrisDenton
Copy link
Member

ChrisDenton commented Aug 11, 2023

I've not been able to reproduce this using the program in the OP. I've tried on both 1.71.1 and 1.73.0-nightly. In both cases it prints "Done :)".

rustc 1.73.0-nightly (439d066bc 2023-08-10)
binary: rustc
commit-hash: 439d066bcf9496b1b8c8dde8bef3bce607a621bb
commit-date: 2023-08-10
host: x86_64-pc-windows-msvc
release: 1.73.0-nightly
LLVM version: 17.0.0

@lapointexavier
Copy link

@ChrisDenton OP's didn't trigger it for me either, but after playing with it I was able to reproduce:

Repro

use std::sync::mpsc::*;

struct BigStruct {
    _data: [u8; 32768],
}

fn main() {
    let (sender, receiver) = channel::<BigStruct>();

    let thread1 = std::thread::spawn(move || {
        let data = BigStruct { _data: [0u8; 32768] };
        sender.send(data).unwrap();
    });

    thread1.join().unwrap();
    for _data in receiver.try_iter() {

    }
    println!("Done :)");
}

Output

~/Dev/hello_world$ cargo run --release
   Compiling hello_world v0.1.0 (/.../hello_world)
    Finished release [optimized] target(s) in 0.56s
     Running `target/release/hello_world`
Done :)

~/Dev/hello_world$ cargo run
   Compiling hello_world v0.1.0 (/.../hello_world)
    Finished dev [unoptimized + debuginfo] target(s) in 0.31s
     Running `target/debug/hello_world`

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
Aborted (core dumped)
rustc --version --verbose
rustc 1.71.1 (eb26296b5 2023-08-03)
binary: rustc
commit-hash: eb26296b556cef10fb713a38f3d16b9886080f26
commit-date: 2023-08-03
host: x86_64-unknown-linux-gnu
release: 1.71.1
LLVM version: 16.0.5

@lapointexavier
Copy link

lapointexavier commented Aug 11, 2023

Also this may be system dependent, but I was playing with slice sizes and the trigger seems to be 8120 + 1 for the overflow. I don't know why release mode doesn't overflow, maybe something is being optimized.

@ChrisDenton ChrisDenton added T-libs Relevant to the library team, which will review and decide on the PR/issue. and removed needs-triage-legacy labels Aug 12, 2023
@ChrisDenton
Copy link
Member

Ok, I've not yet investigated but my guess would be something about how it's not being optimized in start_send is multiplying the needed stack space by a fair amount.

@ibraheemdev might have some ideas. Note though this only affects opt-level=0 builds.

@ChrisDenton ChrisDenton added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Aug 12, 2023
@Enselic
Copy link
Member

Enselic commented Nov 5, 2024

The reproduer can be simplifed to:

// We must keep receiver alive
#[allow(unused_variables)]

fn main() {
    let (sender, receiver) = std::sync::mpsc::channel::<[u8; 32768]>();

    let thread = std::thread::spawn(move || {
        sender.send([0u8; 32768]).unwrap();
    });

    thread.join().unwrap();

    println!("Done :)");
}

Regressed versions

The code runs fine in debug builds with 1.66 but not with 1.67. So this is a regression-from-stable-to-stable, but arguably not very severe since this seemingly has gone mostly unnoticed for a considerable amount of time.

With cargo bisect-rustc we find that this runs fine with nightly-2022-11-13 but not nightly-2022-11-14 which leaves us with

git log --oneline --no-merges \
    ^6284998a2677d7e3e8420db783f3aa4fd80d7423 \
    e631891f7ad40eac3ef58ec3c2b57ecd81e40615 

of which #93563 seems most likely.

To demonstrate:

for toolchain in 1.66 \
                 1.67 \
                 nightly-2022-11-13 \
                 nightly-2022-11-14 ; do
    rm -rf target
    echo "\nUsing $toolchain"
    cargo +$toolchain run --quiet
done
Using 1.66
Done :)

Using 1.67
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow

Using nightly-2022-11-13
Done :)

Using nightly-2022-11-14
thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow

Disassembly

The disassembly of std::sync::mpmc::list::Channel<[u8; 32768]>::start_send<[u8; 32768]> where we SIGSEGV in gdb looks like this:

||   0x55555555df70 <_ZN3std4sync4mpmc4list16Channel$LT$T$GT$10start_send17h8dac1358fea503eaE>       mov    %rsp,%r11        
││   0x55555555df73 <_ZN3std4sync4mpmc4list16Channel$LT$T$GT$10start_send17h8dac1358fea503eaE+3>     sub    $0x1f0000,%r11   
││   0x55555555df7a <_ZN3std4sync4mpmc4list16Channel$LT$T$GT$10start_send17h8dac1358fea503eaE+10>    sub    $0x1000,%rsp     
││  >0x55555555df81 <_ZN3std4sync4mpmc4list16Channel$LT$T$GT$10start_send17h8dac1358fea503eaE+17>    movq   $0x0,(%rsp)      

So we SIGSEGV when 0 is being written to the top of the stack (probably due to stack probing) after growing it ~0x1f0000 bytes which is ~2MB.

In the LLVM IR we can find:

  %_34 = alloca [1016064 x i8], align 8
  %_31 = alloca [1016064 x i8], align 8

I looked at the MIR too but couldn't find anything suspicious. Maybe someone used to debuggging monomorphization/codegen can easily find out why the stack grows by 2MB.

My system

$ cargo +1.67 rustc -- -vV
rustc 1.67.1 (d5a82bbd2 2023-02-07)
binary: rustc
commit-hash: d5a82bbd26e1ad8b7401f6a718a9c57c96905483
commit-date: 2023-02-07
host: x86_64-unknown-linux-gnu
release: 1.67.1
LLVM version: 15.0.6
$ uname -a
Linux martins-dator 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

@Enselic Enselic added the regression-from-stable-to-stable Performance or correctness regression from one stable version to another. label Nov 5, 2024
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Nov 5, 2024
@Amanieu Amanieu added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Nov 6, 2024
@Amanieu
Copy link
Member

Amanieu commented Nov 6, 2024

The cause seems to be due to the definition of Block which contains 31 copies of T. start_send calls Box::new(Block::<T>::new()) which is where the 1MB allocation comes from. This could be solved if Block was initialized in-place rather than using Box::new.

cuviper added a commit to cuviper/rust that referenced this issue Nov 7, 2024
The channel's `Block::new` was causing a stack overflow because it held
32 item slots, instantiated on the stack before moving to `Box::new`.
The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

Fixes rust-lang#102246
@cuviper
Copy link
Member

cuviper commented Nov 7, 2024

This could be solved if Block was initialized in-place rather than using Box::new.

See #132738

When running in Debug mode, the program crashes due to a stack overflow. In release mode it behaves as expected.

I'll bet in release mode the optimizer was figuring out that direct initialization, since it knows about heap allocations.

There will remain some unstated limitation on item size due to copies in the call stack, especially in debug mode when it's not inlined, but at least we won't have that multiplier from the Block slots.

cuviper added a commit to cuviper/crossbeam that referenced this issue Nov 7, 2024
The list channel's `Block::new` was causing a stack overflow because it
held 32 item slots, instantiated on the stack before moving to
`Box::new`. The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

References from the `std` channel implementation:
* rust-lang/rust#102246
* rust-lang/rust#132738
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this issue Nov 7, 2024
…eemdev

Initialize channel `Block`s directly on the heap

The channel's `Block::new` was causing a stack overflow because it held
32 item slots, instantiated on the stack before moving to `Box::new`.
The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

Fixes rust-lang#102246
cuviper added a commit to cuviper/crossbeam that referenced this issue Nov 7, 2024
The list channel's `Block::new` was causing a stack overflow because it
held 32 item slots, instantiated on the stack before moving to
`Box::new`. The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

References from the `std` channel implementation:
* rust-lang/rust#102246
* rust-lang/rust#132738
bors added a commit to rust-lang-ci/rust that referenced this issue Nov 8, 2024
Initialize channel `Block`s directly on the heap

The channel's `Block::new` was causing a stack overflow because it held
32 item slots, instantiated on the stack before moving to `Box::new`.
The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

Fixes rust-lang#102246

try-job: test-various
taiki-e pushed a commit to crossbeam-rs/crossbeam that referenced this issue Nov 8, 2024
The list channel's `Block::new` was causing a stack overflow because it
held 32 item slots, instantiated on the stack before moving to
`Box::new`. The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

References from the `std` channel implementation:
* rust-lang/rust#102246
* rust-lang/rust#132738
Zalathar added a commit to Zalathar/rust that referenced this issue Nov 8, 2024
…eemdev

Initialize channel `Block`s directly on the heap

The channel's `Block::new` was causing a stack overflow because it held
32 item slots, instantiated on the stack before moving to `Box::new`.
The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

Fixes rust-lang#102246

try-job: test-various
@bors bors closed this as completed in 03383ad Nov 8, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Nov 8, 2024
Rollup merge of rust-lang#132738 - cuviper:channel-heap-init, r=ibraheemdev

Initialize channel `Block`s directly on the heap

The channel's `Block::new` was causing a stack overflow because it held
32 item slots, instantiated on the stack before moving to `Box::new`.
The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

Fixes rust-lang#102246

try-job: test-various
mati865 pushed a commit to mati865/rust that referenced this issue Nov 12, 2024
…eemdev

Initialize channel `Block`s directly on the heap

The channel's `Block::new` was causing a stack overflow because it held
32 item slots, instantiated on the stack before moving to `Box::new`.
The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

Fixes rust-lang#102246

try-job: test-various
taiki-e pushed a commit to crossbeam-rs/crossbeam that referenced this issue Dec 15, 2024
The list channel's `Block::new` was causing a stack overflow because it
held 32 item slots, instantiated on the stack before moving to
`Box::new`. The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

References from the `std` channel implementation:
* rust-lang/rust#102246
* rust-lang/rust#132738
taiki-e pushed a commit to crossbeam-rs/crossbeam that referenced this issue Dec 15, 2024
The list channel's `Block::new` was causing a stack overflow because it
held 32 item slots, instantiated on the stack before moving to
`Box::new`. The 32x multiplier made modestly-large item sizes untenable.

That block is now initialized directly on the heap.

References from the `std` channel implementation:
* rust-lang/rust#102246
* rust-lang/rust#132738
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. P-medium Medium priority regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants