Add optional service mode to see and handle UAC prompts (Streamline and empower autonomous agent workflows on Windows) #236

@bensheed

Feature Request: Optional Service Mode for UAC / Secure Desktop Support

Summary

Add an optional installation/run mode where Windows-MCP is hosted by a privileged Windows service running under LocalSystem, with a small user-mode broker, so the MCP can see and interact with the Secure Desktop (UAC consent prompts) and other elevated UI surfaces. Today, because the server runs as a regular user-mode process, the moment a UAC dialog fires the screenshot tools capture the dimmed wallpaper and input goes nowhere — an LLM agent doing any real Windows automation hits this constantly (installs, driver changes, scheduled tasks, registry edits, service control, etc.).

This is exactly why remote-access tools like Splashtop, TeamViewer, AnyDesk, and Microsoft's own RDP work across UAC: they all ship a LocalSystem-context component. A purely user-mode MCP cannot reach parity without one.

Motivation: Unblocking Autonomous Agents

The core value of an MCP like this is letting an LLM-driven agent operate a Windows machine without a human in the loop. UAC is currently the single biggest blocker to that. Any non-trivial automation hits an elevation prompt within minutes, the agent goes blind, and the human has to stop whatever they were doing to click Yes — at which point you've lost most of the benefit of running an agent in the first place.

Concrete workflows this would unblock:

  • Coding agents setting up a dev environment from scratch: installing Visual Studio / Build Tools, the Windows SDK, Node, Python, CUDA, Docker Desktop, WSL, Git, drivers — virtually every one of these triggers UAC at least once. Today, an agent doing "set up this repo on a fresh box" stalls a dozen times in a row.
  • Testing agents that need to install, uninstall, and reinstall the software under test, toggle Windows features, edit HKLM, register COM components, or start/stop services between test runs. Reproducing customer environments overnight is currently a manual chore because every reset cycle requires a human click.
  • Build / release agents running locally rather than in CI — signing binaries, registering URL protocol handlers, installing scheduled tasks, configuring firewall rules.
  • IT / admin agents doing routine maintenance: applying updates, rotating certificates, reconfiguring services, cleaning up after failed installers. The exact tasks where "run this overnight" is the whole point.
  • Long-horizon autonomous runs generally. An agent that can work for eight hours unattended is qualitatively different from one that needs a human present for the next click. UAC is what currently forces the latter on Windows.

The parallel on other platforms is instructive: on macOS and Linux, agents handle elevation through sudo with cached credentials or a configured sudoers entry — annoying but tractable, and crucially something the agent itself can navigate. Windows is the outlier where agents simply cannot see the prompt, let alone respond to it. Service mode closes that gap.

The block / allow_with_match / allow_all policy described below means users keep full control over how autonomous they want the agent to be — you can run with block and just get visibility (the agent reads the prompt and asks you in chat), or with allow_with_match and let the agent proceed for known-good publishers, or allow_all for fully unattended runs inside a sandbox or test VM. The current state — total blindness — is the worst of all worlds because the agent can't even tell the user what it's being asked to approve.

Problem

A normal user-mode process — including Windows-MCP launched via uv — is attached to the interactive desktop (Default) of the user's session. UAC prompts render on a separate desktop object (Winlogon / the Secure Desktop) that's isolated by Windows for anti-spoofing reasons. The implications inside the current codebase:

  • _DxcamBackend, _MssBackend, and _PillowBackend all capture the calling thread's desktop, so screenshots during UAC return the dim "frozen" frame.
  • SendInput, mouse_event, and keybd_event deliver input to the calling thread's desktop, so clicks / keystrokes during UAC are dropped silently.
  • The IUIAutomation tree cannot enumerate windows on the Secure Desktop from a user-mode process.
  • Auto-starting the process via a scheduled task (a workaround some users adopt) does not change this — it changes when the process starts, not which desktop it's attached to.
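
To make the desktop boundary concrete, here is a minimal probe, offered as a sketch rather than anything taken from the Windows-MCP codebase. It asks Windows which desktop currently receives input, using OpenInputDesktop and GetUserObjectInformationW through ctypes. The helper names input_desktop_name and is_secure_desktop are made up for illustration, and the assumption that the Secure Desktop reports the name "Winlogon" follows the description above. From an ordinary user-mode process I would expect the open call to fail (None here) while a UAC prompt is up, which is itself a usable signal.

```python
import ctypes
import sys
from typing import Optional

UOI_NAME = 2                  # GetUserObjectInformationW index: object name
DESKTOP_READOBJECTS = 0x0001  # minimal access right requested on the desktop


def input_desktop_name() -> Optional[str]:
    """Return the name of the desktop currently receiving input, or None.

    None means "not on Windows" or "the desktop could not be opened or read".
    """
    if sys.platform != "win32":
        return None
    user32 = ctypes.WinDLL("user32", use_last_error=True)
    user32.OpenInputDesktop.restype = ctypes.c_void_p
    user32.OpenInputDesktop.argtypes = [ctypes.c_ulong, ctypes.c_int, ctypes.c_ulong]
    user32.GetUserObjectInformationW.restype = ctypes.c_int
    user32.GetUserObjectInformationW.argtypes = [
        ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p,
        ctypes.c_ulong, ctypes.POINTER(ctypes.c_ulong),
    ]
    user32.CloseDesktop.restype = ctypes.c_int
    user32.CloseDesktop.argtypes = [ctypes.c_void_p]

    hdesk = user32.OpenInputDesktop(0, False, DESKTOP_READOBJECTS)
    if not hdesk:
        return None  # assumption: this is the path taken when access is denied
    try:
        buf = ctypes.create_unicode_buffer(256)
        needed = ctypes.c_ulong(0)
        ok = user32.GetUserObjectInformationW(
            hdesk, UOI_NAME, buf, ctypes.sizeof(buf), ctypes.byref(needed)
        )
        return buf.value if ok else None
    finally:
        user32.CloseDesktop(hdesk)


def is_secure_desktop(name: Optional[str]) -> bool:
    """Assumption: the Secure Desktop is the desktop named 'Winlogon'."""
    return name is not None and name.lower() == "winlogon"


if __name__ == "__main__":
    name = input_desktop_name()
    print("input desktop:", name, "| secure:", is_secure_desktop(name))
```

Run from a normal user session, this should print the Default desktop during ordinary use. The point of service mode is that the same probe, run from the SYSTEM service, would be able to open and name the Secure Desktop as well.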

Proposed Solution

A two-process design behind an opt-in installer flag (e.g. windows-mcp install --service or a separate windows-mcp-service package):

  1. windows-mcp-host (LocalSystem service)

    • Installed via pywin32's win32serviceutil.ServiceFramework, or via an NSSM wrapper for users who prefer no extra Python dependencies.
    • Runs in Session 0 under LocalSystem.
    • Responsible only for privileged primitives: capturing the currently-active desktop (including Secure Desktop), injecting input there, and proxying UIA queries.
    • Uses WTSGetActiveConsoleSessionId → WTSQueryUserToken → CreateProcessAsUser to spawn the broker into the active console session as the logged-in user.
    • Uses OpenInputDesktop + SetThreadDesktop to follow the active input desktop (this is what lets it pivot to the Secure Desktop on UAC).
  2. windows-mcp (user-session broker, existing process)

    • Continues to expose MCP over stdio / SSE / HTTP exactly as today.
    • For desktop-bound operations, calls a named pipe (or local TCP with mutual auth) to the SYSTEM service instead of calling Win32 directly.
    • Falls back to the current in-process implementation if the service isn't installed, so existing users see no change.
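
The broker side of this split can be sketched without any Win32 calls. The example below is one possible shape, not a proposed wire format: it assumes length-prefixed JSON frames over the pipe, a reply envelope with ok / result / error keys, and a missing pipe (FileNotFoundError) as the signal that the service is not installed. The names encode_frame, decode_frame, and DesktopRouter are hypothetical.

```python
import json
import struct
from typing import Callable, Optional

Transport = Callable[[bytes], bytes]     # sends one frame, returns the reply frame
LocalImpl = Callable[[str, dict], dict]  # stand-in for today's in-process code path


def encode_frame(payload: dict) -> bytes:
    """Serialize as a 4-byte little-endian length followed by UTF-8 JSON."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("<I", len(body)) + body


def decode_frame(frame: bytes) -> dict:
    """Inverse of encode_frame; rejects short or truncated frames."""
    if len(frame) < 4:
        raise ValueError("frame shorter than its length prefix")
    (length,) = struct.unpack_from("<I", frame)
    body = frame[4:4 + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    return json.loads(body.decode("utf-8"))


class DesktopRouter:
    """Route desktop-bound operations to the service, else run them locally."""

    def __init__(self, transport: Optional[Transport], local: LocalImpl) -> None:
        self._transport = transport
        self._local = local

    def call(self, op: str, params: dict) -> dict:
        if self._transport is not None:
            request = encode_frame({"op": op, "params": params})
            try:
                reply = decode_frame(self._transport(request))
            except FileNotFoundError:
                # Pipe does not exist: treat the service as not installed
                # and stop trying it for the rest of this process.
                self._transport = None
            else:
                if not reply.get("ok"):
                    raise RuntimeError(reply.get("error", "service call failed"))
                return reply["result"]
        return self._local(op, params)
```

With transport=None the router behaves exactly like the current in-process implementation, which is the "existing users see no change" property. Errors reported by a running service are raised rather than silently retried locally, since a local retry during a UAC prompt would just reproduce the blind behavior.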

Behavior changes

  • Snapshot, Click, Type, Drag, etc. transparently route through the service when it's present.
  • New tool: WaitForUACPrompt(timeout_ms) that returns when the Secure Desktop becomes the input desktop, exposing the consent dialog as a normal UIA tree so an agent can read the binary's name / publisher and decide whether to allow.
  • New config flag: WINDOWS_MCP_SECURE_DESKTOP_POLICY with values block (default — refuse to auto-click UAC), allow_with_match (only click if the publisher/hash matches an allowlist), allow_all (off-by-default, requires explicit env var).
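
As a rough illustration of how the wait tool and the policy flag could fit together, here is a sketch with the Windows-specific parts injected as callables so the logic can be read and tested on its own. Everything beyond the three policy names is an assumption: the ConsentPrompt fields, the "Winlogon" desktop name check, the polling approach, and the allow_all_confirmed parameter (standing in for the unspecified "explicit env var").

```python
import time
from dataclasses import dataclass
from typing import Callable, Collection, Optional

POLICIES = ("block", "allow_with_match", "allow_all")


@dataclass(frozen=True)
class ConsentPrompt:
    publisher: str  # publisher text read from the consent dialog
    sha256: str     # hash of the requesting binary, however the service obtains it


def wait_for_uac_prompt(
    get_input_desktop: Callable[[], Optional[str]],
    timeout_ms: int,
    poll_ms: int = 100,
) -> bool:
    """Poll until the input desktop looks like the Secure Desktop.

    Returns True if it appeared before the timeout, False otherwise.
    """
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        name = get_input_desktop()
        if name is not None and name.lower() == "winlogon":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_ms / 1000)


def may_auto_approve(
    policy: str,
    prompt: ConsentPrompt,
    allowed_publishers: Collection[str] = (),
    allowed_hashes: Collection[str] = (),
    allow_all_confirmed: bool = False,
) -> bool:
    """Decide whether the service may click through the prompt on its own."""
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    if policy == "block":
        return False  # visibility only; the agent reports the prompt in chat
    if policy == "allow_with_match":
        hashes = {h.lower() for h in allowed_hashes}
        return prompt.publisher in allowed_publishers or prompt.sha256.lower() in hashes
    # allow_all: still refuse unless the separate explicit opt-in was given.
    return allow_all_confirmed
```

An unknown policy value raises instead of falling through, so a typo in WINDOWS_MCP_SECURE_DESKTOP_POLICY cannot accidentally behave like allow_all.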

Why opt-in

Running as LocalSystem is a serious privilege escalation surface and should never be silent:

  • The installer should require admin elevation and show a Windows-style consent dialog explaining what LocalSystem means.
  • The default for UAC handling should be block — the service makes UAC visible to the agent, not clickable, until the user opts in.
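  • The named pipe between the user broker and the SYSTEM service should only accept clients whose SID matches the interactive console user, so a process running under another account can't hijack it. A SID match alone does not distinguish the broker from other processes running as the same user, so an additional check (for example on the client's integrity level or image path) is worth considering.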
  • README should call out the threat model explicitly: anyone who can write to the service binary path now has SYSTEM, so the install location must be ACL'd correctly (the installer can do this automatically — write to %ProgramFiles%, not %LOCALAPPDATA%).
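
The install-location rule lends itself to a cheap preflight check in the installer. The sketch below only compares paths; it does not inspect ACLs, so it complements rather than replaces setting permissions correctly. The function name install_dir_problem is hypothetical, and the caller is assumed to pass in the expanded values of %ProgramFiles% and %LOCALAPPDATA%.

```python
from pathlib import PureWindowsPath
from typing import Optional


def _is_under(path: PureWindowsPath, root: PureWindowsPath) -> bool:
    """True if path equals root or sits below it (case-insensitive on Windows paths)."""
    return path == root or root in path.parents


def install_dir_problem(
    install_dir: str, program_files: str, local_app_data: str
) -> Optional[str]:
    """Return a human-readable reason to refuse the location, or None if it looks fine."""
    target = PureWindowsPath(install_dir)
    if not target.is_absolute():
        return "install path must be absolute"
    if ".." in target.parts:
        # Pure paths do not collapse "..", so refuse rather than guess.
        return "install path must not contain '..'"
    if _is_under(target, PureWindowsPath(local_app_data)):
        return "install path is user-writable (%LOCALAPPDATA%)"
    if not _is_under(target, PureWindowsPath(program_files)):
        return "install path is outside %ProgramFiles%"
    return None
```

For example, install_dir_problem(r"C:\Program Files\windows-mcp", r"C:\Program Files", r"C:\Users\me\AppData\Local") returns None, while a path under AppData\Local returns a reason string the installer could show before aborting.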

Alternatives Considered

  • Scheduled task with "Run with highest privileges" / "LocalSystem" — doesn't work; Session 0 isolation means a SYSTEM task started this way still can't reach the interactive desktop without the CreateProcessAsUser dance, and "highest privileges" under a user account still can't touch the Secure Desktop.
  • Sandbox / VM with auto-accept UAC policy — already recommended in the docs, but it neutralizes UAC instead of cooperating with it, and isn't viable for users automating their actual workstation.
  • Disable UAC entirely — bad security posture, breaks anything that detects UAC level (Edge, Defender, MDM tools).
  • Use existing remote-desktop infra (Splashtop/RDP) underneath — heavy, licensed, and pulls in a whole second product per host.

Additional Context

Reference implementations worth looking at for the SYSTEM-helper pattern:

  • ShareX's Secure Desktop screenshot helper
  • AutoHotkey v2's Run *RunAs and its UIA companion library's notes on elevated targets
  • The way Sysinternals' PsExec -s -i punches a process into the interactive session from SYSTEM

Happy to prototype the service shim + named pipe protocol if there's interest in accepting a PR.
