Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wayland Support #11257

Closed
wants to merge 8 commits into from
Closed

Wayland Support #11257

wants to merge 8 commits into from

Conversation

LaserEyess
Copy link

Rebase of #7452

This is not ready yet, I'm only putting it up in case someone wants to help. The rebase builds, but wayland does not work, it still fails to initialize the video backend.

@LaserEyess LaserEyess force-pushed the wayland branch 3 times, most recently from c9517bc to 49cbaea Compare November 6, 2022 22:00
@LaserEyess
Copy link
Author

LaserEyess commented Nov 6, 2022

Currently there's a deadlock introduced in 052354d that causes dolphin to lock up and not display a window. Something is implemented incorrectly in ControllerInterface/Wayland/ I'm guessing.

It appears to have something to do with UpdateInput() Any hints would be appreciated.

For my own reference later

The deadlock always happens in the main thread on ControllerEmu::EmulatedController::UpdateReferences. It calls std::scoped_lock lk(s_get_state_mutex, devi.GetDevicesMutex()), but that mutex (m_devices_mutex) is always heald by thread 12, specifically the thread running HotkeyScheduler::Run().

Inside that thread, a loop is calling ControllerInterface::UpdateInput(), which calls down into ciface::Wayland::KeyboardMouse::UpdateInput(). Which is implemented by this function. It is doing something with m_event_queue which is created on initialization from some wayland magic, but there's an interesting comment there:

  // As UpdateInput() can be called from the CPU, we don't want to clash with the main (UI) thread.
  // Therefore, we create a second event queue for retrieving keyboard/mouse events.
  m_display_proxy = static_cast<wl_display*>(wl_proxy_create_wrapper(m_display));
  if (!m_display_proxy)
    return false;

That's probably a good hint. And also going back up into ControllerInterface::UpdateInput():

  // TODO: if we are an emulation input channel, we should probably always lock
  // Prefer outdated values over blocking UI or CPU thread (avoids short but noticeable frame drop)
  if (!m_devices_mutex.try_lock())
    return;

  std::lock_guard lk(m_devices_mutex, std::adopt_lock);

Yet another hint

@LaserEyess LaserEyess marked this pull request as draft November 7, 2022 22:48
@r3claimer
Copy link

r3claimer commented Nov 7, 2022

@LaserEyess Not sure if it would help but I rebased Stenzek's branch to get wayland and vulkan on the no gui version for the Jelos operating system. Works both on arm and x86_64 with latest dev commits. I didnt bother adding any of the QT stuff.

https://github.com/JustEnoughLinuxOS/distribution/blob/dev/packages/games/emulators/dolphinsa/patches/wayland/000-add-wayland.patch

@LaserEyess
Copy link
Author

The QT stuff would be important here, as that's where the deadlocks are happening. I would definitely appreciate and welcome any help I can get. If something looks wrong with my rebase let me know.

@Tilka
Copy link
Member

Tilka commented Nov 8, 2022

Does this help?

Thread Sanitizer output
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=57503)
  Cycle in lock order graph: M577962520504885248 (0x000000000000) => M3412 (0x55d9a8fe3758) => M577962520504885248

  Mutex M3412 acquired here while holding mutex M577962520504885248 in main thread:
    #0 pthread_mutex_lock /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4324 (libtsan.so.2+0x5da1f)
    #1 ControllerInterface::PlatformPopulateDevices(std::function<void ()>) <null> (dolphin-emu+0x1059f8b)
    #2 WiimoteReal::HandleWiimoteSourceChange(unsigned int) <null> (dolphin-emu+0x67855a)
    #3 (anonymous namespace)::RefreshConfig() <null> (dolphin-emu+0x657ef0)
    #4 Wiimote::Initialize(Wiimote::InitializeMode) <null> (dolphin-emu+0x658b1f)
    #5 UICommon::InitControllers(WindowSystemInfo const&) <null> (dolphin-emu+0xa9d55b)
    #6 MainWindow::InitControllers() <null> (dolphin-emu+0x3def68)
    #7 MainWindow::MainWindow(std::unique_ptr<BootParameters, std::default_delete<BootParameters> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (dolphin-emu+0x3df723)
    #8 main <null> (dolphin-emu+0x1a882a)

  Mutex M577962520504885248 previously acquired by the same thread here:
    #0 pthread_mutex_lock /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4324 (libtsan.so.2+0x5da1f)
    #1 WiimoteReal::HandleWiimoteSourceChange(unsigned int) <null> (dolphin-emu+0x678484)
    #2 (anonymous namespace)::RefreshConfig() <null> (dolphin-emu+0x657ef0)
    #3 Wiimote::Initialize(Wiimote::InitializeMode) <null> (dolphin-emu+0x658b1f)
    #4 UICommon::InitControllers(WindowSystemInfo const&) <null> (dolphin-emu+0xa9d55b)
    #5 MainWindow::InitControllers() <null> (dolphin-emu+0x3def68)
    #6 MainWindow::MainWindow(std::unique_ptr<BootParameters, std::default_delete<BootParameters> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (dolphin-emu+0x3df723)
    #7 main <null> (dolphin-emu+0x1a882a)

  Mutex M577962520504885248 acquired here while holding mutex M3412 in thread T15:
    #0 pthread_mutex_lock /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4324 (libtsan.so.2+0x5da1f)
    #1 WiimoteReal::ProcessWiimotePool() <null> (dolphin-emu+0x679306)
    #2 std::_Function_handler<void (), WiimoteReal::WiimoteScanner::ThreadFunc()::{lambda()#2}>::_M_invoke(std::_Any_data const&) <null> (dolphin-emu+0x679732)
    #3 ControllerInterface::PlatformPopulateDevices(std::function<void ()>) <null> (dolphin-emu+0x1059fc7)
    #4 WiimoteReal::WiimoteScanner::ThreadFunc() <null> (dolphin-emu+0x67ca95)
    #5 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (WiimoteReal::WiimoteScanner::*)(), WiimoteReal::WiimoteScanner*> > >::_M_run() <null> (dolphin-emu+0x67dce1)
    #6 execute_native_thread_routine /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82 (libstdc++.so.6+0xd62f2)

  Mutex M3412 previously acquired by the same thread here:
    #0 pthread_mutex_lock /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4324 (libtsan.so.2+0x5da1f)
    #1 ControllerInterface::PlatformPopulateDevices(std::function<void ()>) <null> (dolphin-emu+0x1059f8b)
    #2 WiimoteReal::WiimoteScanner::ThreadFunc() <null> (dolphin-emu+0x67ca95)
    #3 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (WiimoteReal::WiimoteScanner::*)(), WiimoteReal::WiimoteScanner*> > >::_M_run() <null> (dolphin-emu+0x67dce1)
    #4 execute_native_thread_routine /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82 (libstdc++.so.6+0xd62f2)

  Thread T15 'Wiimote Scannin' (tid=57533, running) created by main thread at:
    #0 pthread_create /usr/src/debug/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x670c9)
    #1 __gthread_create /usr/src/debug/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:663 (libstdc++.so.6+0xd63d9)
    #2 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:147 (libstdc++.so.6+0xd63d9)
    #3 WiimoteReal::Initialize(Wiimote::InitializeMode) <null> (dolphin-emu+0x6792b3)
    #4 Wiimote::Initialize(Wiimote::InitializeMode) <null> (dolphin-emu+0x658b27)
    #5 UICommon::InitControllers(WindowSystemInfo const&) <null> (dolphin-emu+0xa9d55b)
    #6 MainWindow::InitControllers() <null> (dolphin-emu+0x3def68)
    #7 MainWindow::MainWindow(std::unique_ptr<BootParameters, std::default_delete<BootParameters> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) <null> (dolphin-emu+0x3df723)
    #8 main <null> (dolphin-emu+0x1a882a)

SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) (/home/flacs/src/dolphin/build/Binaries/dolphin-emu+0x1059f8b) in ControllerInterface::PlatformPopulateDevices(std::function<void ()>)

@LaserEyess
Copy link
Author

Thanks for the thread sanitizer trace. The core of the issue seems to be that multiple threads are calling wl_display_prepare_read() but they're not all calling either wl_display_read_events() or wl_display_read_cancel(). According to the docs this guarantees a deadlock:

After a thread successfully called wl_display_prepare_read() it must either call wl_display_read_events() or wl_display_cancel_read(). If the threads do not follow this rule it will lead to deadlock.

The 2 (3?) threads that are calling this function are the Qt wayland threads for the UI, and whatever thread is running HotkeyScheduler::Run(). So, what is happening is that KeyboardMouse::UpdateInput() calls into wl_display_read_events() but it can never return because something in the Qt thread(s) isn't calling that, or wl_display_read_cancel(). And while in UpdateInput(), that thread has the device mutex, so of course the deadlock occurs. That can be seen in your trace as well.

What I'm most confused about is why KeyboardMouse needs to call UpdateInput() like that. AFAICT, this is a device for the actual game, not UI input for Qt, so it shouldn't really be necessary until a game starts? I'm too unfamiliar with dolphin's architecture to understand the start up process but it seems to me that that particular UpdateInput() call should be completely delayed until a game starts. Even better, the game should use a different wl_display so it isn't even worrying about what Qt is doing.

What I do know is that the last time someone rebased this, it was off of 3da6487, so somehow input handling or Qt has changed since then. There is a lot to check though

@AdmiralCurtiss
Copy link
Contributor

AdmiralCurtiss commented Nov 12, 2022

Input is checked even if no game is running so hotkeys work. This is by design.

I don't really understand why Wayland requires this -- this means that every single thread that ever calls an input function has to keep calling it in perpetuity, and all threads block eachother until they all call the input check, yeah? Is there a way to... not require this? It seems really strange and not useful at all.

I suppose you could work around it by making a worker thread (eg with a Common::WorkQueueThread) in the KeyboardMouse class and routing all function calls through that?

@LaserEyess
Copy link
Author

It is required because there are some synchronization guarantees that wayland makes, I do not fully understand the specifics. However, it doesn't need to call it perpetually, it can also just call wl_display_read_cancel() and unblock other threads. I admit I am not familiar enough with this code to really see where that could be used here, or if perhaps refactoring the code to use a separate wl_display is feasible.

@AdmiralCurtiss
Copy link
Contributor

But... why is this deadlocking then? How is any thread here possibly calling wl_display_prepare_read_queue() without calling wl_display_read_events() immediately afterwards, they only show up in one function, there's no early returns, and we don't use exceptions...?

https://wayland.freedesktop.org/docs/html/apb.html has a slightly different code snippet here (see wl_display_prepare_read_queue), does it fix anything if you use that instead...? (though I admittedly can't quite figure out what the fds and nfds parameters to poll() map to in our code here)

@AdmiralCurtiss
Copy link
Contributor

AdmiralCurtiss commented Nov 12, 2022

Also: What happens if you wrap the entire UpdateInput() and ProcessEvents() functions in a (the same) mutex?

(https://chromium.googlesource.com/external/wayland/wayland/+/refs/heads/master/src/wayland-client.c suggests that wl_display_dispatch() and wl_display_read_events() cannot be called at the same time by different threads)

@LaserEyess
Copy link
Author

But... why is this deadlocking then? How is any thread here possibly calling wl_display_prepare_read_queue() without calling wl_display_read_events() immediately afterwards, they only show up in one function, there's no early returns, and we don't use exceptions...?

Nothing in Dolphin is doing this, correct. But a wl_display is a shared resource with the entire process. The thing calling it is Qt's wayland event code.

@ghost
Copy link

ghost commented Nov 20, 2022

Any news about Wayland support on Dolphin ?

@Zopolis4
Copy link
Contributor

Zopolis4 commented Jan 5, 2023

Native wayland support (whether via this PR or another) may be the key to chromeos support for dolphin, although that would also require support from the graphics backends to do so.

@CCF100
Copy link

CCF100 commented Feb 5, 2023

Are you aware of #8727 ?

@phire
Copy link
Member

phire commented Feb 5, 2023

This is going to have merge conflicts with #11522 (sorry)

I really don't like how window_width/render_surface_width manages to thread though so much code.

It feels like there needs to be a better abstraction around surface size.

I wonder if the WindowSystemInfo API needs to be converted to something that's more pull focused than the current push focused API. Either a virtual interface that various WSI providers subclass. Or just add a bunch of std::function members to it, and providers can add lambdas to it (I've done that previously in a branch).

@LaserEyess
Copy link
Author

Are you aware of #8727 ?

Yes, I am aware, but you should read what that PR says:

This commit adds support for running on top of Wayland, which
some compositors require in order to access full resolution on HiDPI
displays (such as sway). This uses the Vulkan ICD to get a Vulkan
surface, though does not support using wayland-egl to get an EGL
surface or software rendering to a native Wayland surface.

Emphasis mine. This PR supports those.

@phire I will be honest, though I can definitely rebase this PR and make sure it compiles, I know it won't work due to the discussion above about deadlocking. My main problem is that I don't know enough about dolphin's code base nor wayland's APIs to do real improvements to this PR. If significant refactoring is required to get this merged then it is beyond me. I can keep it rebased in case someone else wants to pick it up, but I don't think I can finish it.

@LaserEyess
Copy link
Author

I don't know enough about Dolphin's internals nor wayland to finish this. If you want this feature implemented, it will have to be worked on by a core developer or someone much better with graphics/wayland than me.

This is the open feature request for wayland support on Dolphin's issue tracker: https://bugs.dolphin-emu.org/issues/11807. This is the best way to tell devs this feature is important to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

8 participants