
rust : Add RROS_SPINLOCK flag and adjust spinlock initialization for new constructor #66

Draft · wants to merge 25 commits into base: v6.6.y-rros-rebase

Conversation

Caspian443

  • Add another config RROS_SPINLOCK
  • Modify some CONFIG_RROS to CONFIG_RROS_SPINLOCK
  • Adjust the spinlock initialization in rust/kernel/memory_rros.rs to accommodate the new constructor (see the sketch below)
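A hedged sketch of what the flag split can look like in Rust kernel code; the gated items below are made up for illustration, and only the CONFIG_RROS_SPINLOCK name comes from this PR:

```rust
// Code formerly gated on the umbrella CONFIG_RROS flag is now gated on the
// narrower CONFIG_RROS_SPINLOCK flag (hypothetical items for illustration).
#[cfg(CONFIG_RROS_SPINLOCK)]
use kernel::sync::SpinLock;

#[cfg(CONFIG_RROS_SPINLOCK)]
fn init_rros_spinlocks() {
    // spinlock-specific setup lives behind the new, more specific flag
}
```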

@Caspian443 Caspian443 changed the title V6.6.y rros rebase rust : Add RROS_SPINLOCK flag and adjust spinlock initialization for new constructor Nov 8, 2024
@Caspian443 Caspian443 marked this pull request as draft November 8, 2024 02:57
@shannmu (Contributor) left a comment


BTW, there is a CI failure. Please make each commit pass compilation.

let mut spinlock_init = unsafe { SpinLock::new(1, lock_name, lock_key) };
// let pinned = unsafe { Pin::new_unchecked(&mut spinlock) };
// spinlock_init!(pinned, "spinlock");
let mut spinlock = Box::pin_init(spinlock_init).unwrap();
Contributor

There is a macro named new_spinlock. I think you could try it.
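For context, a minimal sketch of what the suggestion might look like, assuming the v6.6 Rust-for-Linux API where new_spinlock! supplies the lock name and static lock class implicitly and Box::pin_init returns a Result:

```rust
use kernel::new_spinlock;

// new_spinlock! wraps SpinLock::new(), generating the lock name and static
// lock class internally, so the manual lock_name/lock_key plumbing and the
// unsafe block go away. Inside a function returning kernel::error::Result:
let spinlock = Box::pin_init(new_spinlock!(1, "spinlock"))?;
*spinlock.lock() += 1; // the returned guard derefs mutably to the data
```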

Author

I'm so careless, I'll try it.

@Caspian443 (Author)

> BTW, there is a CI failure. Please make each commit pass compilation.
Thank you for the feedback. I’ll ensure that each commit passes compilation and address the CI failure.

@@ -94,7 +94,7 @@ fn __rros_set_idle_schedparam(
     // let mut thread_lock = thread_unwrap.lock();
     let p_unwrap = p.unwrap();
     thread_unwrap.lock().state &= !T_WEAK;
-    let prio = unsafe { (*p_unwrap.locked_data().get()).idle.prio };
+    let prio = unsafe { (*p_unwrap.lock().deref()).idle.prio };

Contributor

I don't know why there is a locked_data rather than a lock. I think the lock is better here.
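For comparison, a sketch of the difference (reusing the names from the diff above; locked_data is assumed to come from the older pre-rebase API): locked_data() exposes the raw cell and forces an unsafe dereference, while lock() returns a guard that derefs safely:

```rust
// Old accessor: returns the UnsafeCell behind the lock, so the caller must
// uphold the locking invariants manually.
let prio = unsafe { (*p_unwrap.locked_data().get()).idle.prio };

// Guard-based accessor: lock() returns a Guard that auto-derefs to the
// protected data, so no unsafe block is needed.
let prio = p_unwrap.lock().idle.prio;
```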

Author

Indeed, the current version has only lock.

        // spinlock_init!(
        //     unsafe { Pin::new_unchecked(&mut self.watchpoints) },
        //     "RrosPollHead"
        // );
    }
}

Contributor

Looks good to me.

Author

ok

// init_static_sync! {
//     static ACTIVE_PORT_LIST: SpinLock<ListThreadSafeWrapper> = ListThreadSafeWrapper(List::new());
// }
static ACTIVE_PORT_LIST: Pin<Box<SpinLock<ListThreadSafeWrapper>>> = Box::pin_init(new_spinlock!(ListThreadSafeWrapper(List::new()))).unwrap();


Contributor

It looks good to me for now. Box::pin_init returns a Result, and using unwrap here is a temporary solution. I believe we need to address this problem later.
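A hedged sketch of one way to drop the unwrap() later (the init helper is an assumption for illustration, not the PR's code): build the lock in a fallible init path so the allocation error is propagated instead of panicking:

```rust
// Hypothetical init helper: in v6.6, Box::pin_init already returns
// kernel::error::Result, so the allocation failure reaches the caller
// instead of being unwrapped.
fn active_port_list_init() -> Result<Pin<Box<SpinLock<ListThreadSafeWrapper>>>> {
    Box::pin_init(new_spinlock!(ListThreadSafeWrapper(List::new())))
}
```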

Author

ok

@Caspian443 (Author)

I have aligned with the new class_create function, and class_new can successfully return. However, the program still stalls at this position (even after waiting for a long time).
[screenshot omitted]
The error is like this:
[screenshot omitted]

@Caspian443 (Author)

I would like to show you the GDB debugging process in this video:
https://github.com/user-attachments/assets/af4d3cd0-4a10-4660-983a-a86f5ece5ee1

@shannmu (Contributor)

shannmu commented Nov 25, 2024

> However, the program still stalls at this position (even after waiting for a long time).

I think this ret_from_fork means it is running in a kernel thread's stack, and do_populate_rootfs loads the built-in initramfs. So the RROS init is fine; there may instead be an error in your filesystem.

@@ -1218,7 +1218,7 @@ fn init_clock(clock: *mut RrosClock, master: *mut RrosClock) -> Result<usize> {
     clocklist_lock_init();
     CLOCKLIST_LOCK.get().unwrap().lock();
     CLOCK_LIST.add_head(clock);
-    CLOCKLIST_LOCK.get().unwrap().unlock();
+    CLOCKLIST_LOCK.get().unwrap().unlock(&());
Author

The reason I added this parameter is that GuardState is an empty tuple (), indicating that no additional state needs to be maintained between locking and unlocking.
[screenshot omitted]
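For reference, the relevant shape of the lock Backend trait in the v6.6 kernel crate (init omitted for brevity); since the spinlock backend uses GuardState = (), unlock() ends up taking a reference to a unit value:

```rust
pub unsafe trait Backend {
    /// The state required by the lock (e.g. bindings::spinlock_t).
    type State;
    /// State kept between lock and unlock; the unit type () for spinlocks.
    type GuardState;

    unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState;
    unsafe fn unlock(ptr: *mut Self::State, guard_state: &Self::GuardState);
}
```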

unsafe {
    rust_helper_hard_spin_unlock(self.lock().lock as *const Lock<_, _> as *mut bindings::raw_spinlock)
}
Author

I eliminated lock() by using the unlock of the Backend.
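A hedged sketch of that pattern, mirroring what the kernel's Guard does on drop in v6.6 (field names assumed from Lock<T, B> with state: Opaque<B::State>): the backend's unlock is called directly with the stored guard state, which is just &() for spinlocks, so no lock() round-trip is needed:

```rust
// Inside the guard's Drop implementation for Guard<'_, T, B>:
unsafe { B::unlock(self.lock.state.get(), &self.state) };
```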

shannmu pushed a commit to shannmu/RROS that referenced this pull request Dec 15, 2024
…b folio

A kernel crash was observed when migrating hugetlb folio:

BUG: kernel NULL pointer dereference, address: 0000000000000008
PGD 0 P4D 0
Oops: Oops: 0002 [BUPT-OS#1] PREEMPT SMP NOPTI
CPU: 0 PID: 3435 Comm: bash Not tainted 6.10.0-rc6-00450-g8578ca01f21f BUPT-OS#66
RIP: 0010:__folio_undo_large_rmappable+0x70/0xb0
RSP: 0018:ffffb165c98a7b38 EFLAGS: 00000097
RAX: fffffbbc44528090 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffffa30e000a2800 RSI: 0000000000000246 RDI: ffffa3153ffffcc0
RBP: fffffbbc44528000 R08: 0000000000002371 R09: ffffffffbe4e5868
R10: 0000000000000001 R11: 0000000000000001 R12: ffffa3153ffffcc0
R13: fffffbbc44468000 R14: 0000000000000001 R15: 0000000000000001
FS:  00007f5b3a716740(0000) GS:ffffa3151fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000010959a000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 __folio_migrate_mapping+0x59e/0x950
 __migrate_folio.constprop.0+0x5f/0x120
 move_to_new_folio+0xfd/0x250
 migrate_pages+0x383/0xd70
 soft_offline_page+0x2ab/0x7f0
 soft_offline_page_store+0x52/0x90
 kernfs_fop_write_iter+0x12c/0x1d0
 vfs_write+0x380/0x540
 ksys_write+0x64/0xe0
 do_syscall_64+0xb9/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5b3a514887
RSP: 002b:00007ffe138fce68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f5b3a514887
RDX: 000000000000000c RSI: 0000556ab809ee10 RDI: 0000000000000001
RBP: 0000556ab809ee10 R08: 00007f5b3a5d1460 R09: 000000007fffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c
R13: 00007f5b3a61b780 R14: 00007f5b3a617600 R15: 00007f5b3a616a00

It's because hugetlb folio is passed to __folio_undo_large_rmappable()
unexpectedly.  large_rmappable flag is imperceptibly set to hugetlb folio
since commit f6a8dd9 ("hugetlb: convert alloc_buddy_hugetlb_folio to
use a folio").  Then commit be9581e ("mm: fix crashes from deferred
split racing folio migration") makes folio_migrate_mapping() call
folio_undo_large_rmappable() triggering the bug.  Fix this issue by
clearing large_rmappable flag for hugetlb folios.  They don't need that
flag set anyway.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: f6a8dd9 ("hugetlb: convert alloc_buddy_hugetlb_folio to use a folio")
Fixes: be9581e ("mm: fix crashes from deferred split racing folio migration")
Signed-off-by: Miaohe Lin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
shannmu pushed a commit to shannmu/RROS that referenced this pull request Jan 2, 2025
Fix an interrupts on/off mismatch when locking the recycling queue. While at it, annotate some routines with respect to their legitimate calling context.

[   39.474194] ================================
[   39.474195] WARNING: inconsistent lock state
[   39.474197] 6.1.54+ BUPT-OS#66 Not tainted
[   39.474200] --------------------------------
[   39.474201] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
[   39.474203] eth0.rx/370 [HC1[0]:SC0[0]:HE0:SE1] takes:
[   39.474207] ffffffff8276b8d8 (recycling_lock){?.+.}-{0:0}, at: free_skb_inband+0x5/0xa0
[   39.474222] {HARDIRQ-ON-W} state was registered at:
[   39.474223]   __lock_acquire+0x363/0x9a0
[   39.474229]   lock_acquire+0xbe/0x2a0
[   39.474233]   free_skb_inband+0x2a/0xa0
[   39.474236]   evl_net_free_skb+0x11/0x90
[   39.474239]   evl_net_do_rx+0x1b7/0x280
[   39.474242]   kthread_trampoline+0x1c7/0x2d0
[   39.474246]   kthread+0xf5/0x120
[   39.474251]   ret_from_fork+0x22/0x30
[   39.474255] irq event stamp: 24
[   39.474256] hardirqs last  enabled at (23): [<ffffffff81b06125>] _raw_spin_unlock_irqrestore+0x65/0x80
[   39.474262] hardirqs last disabled at (24): [<ffffffff81afb971>] __schedule+0x3a1/0x770
[   39.474266] softirqs last  enabled at (0): [<ffffffff810c1ad6>] copy_process+0x796/0x18c0
[   39.474271] softirqs last disabled at (0): [<0000000000000000>] 0x0
[   39.474274]
[   39.474274] other info that might help us debug this:
[   39.474275]  Possible unsafe locking scenario:
[   39.474275]
[   39.474276]        CPU0
[   39.474276]        ----
[   39.474277]   lock(recycling_lock);
[   39.474278]   <Interrupt>
[   39.474279]     lock(recycling_lock);
[   39.474280]
[   39.474280]  *** DEADLOCK ***

Signed-off-by: Philippe Gerum <[email protected]>