Implementation roots:
- Core detection entrypoint: ../crates/localpaste_core/src/detection/mod.rs
- GUI highlight pipeline entrypoints:
localpaste_core keeps magika as opt-in (default = []). localpaste_gui and localpaste_server enable magika by default. localpaste_cli uses localpaste_core defaults, so it is heuristic-only unless explicitly feature-enabled in downstream builds.
This keeps GUI/server detection broad by default while preserving portability for core/CLI users.
For auto-detected language (language_is_manual == false):
- If the magika feature is enabled:
  - run Magika detection,
  - reject non-text results,
  - reject generic labels (txt, randomtxt, unknown, empty, undefined),
  - normalize and return if non-empty and not text.
- Otherwise (or if Magika is unavailable/fails/generic), run heuristic fallback.
- Normalize heuristic label and return unless empty/text.
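The order of checks above can be sketched as follows. This is a minimal illustration, not the code in detection/mod.rs: ModelGuess, detect_language, and the two callback parameters are hypothetical names, and normalize here only trims and lowercases, whereas the project's normalizer also maps aliases.

```rust
/// Hypothetical stand-in for a model result; the real Magika output type differs.
#[derive(Debug, Clone)]
pub struct ModelGuess {
    pub label: String,
    pub is_text: bool,
}

/// Labels treated as "no useful answer", taken from the list above.
const GENERIC_LABELS: [&str; 5] = ["txt", "randomtxt", "unknown", "empty", "undefined"];

/// Placeholder normalizer (assumption): trims and lowercases only.
fn normalize(label: &str) -> String {
    label.trim().to_lowercase()
}

/// Keeps a normalized label unless it is empty or the plain "text" label.
fn accept(normalized: String) -> Option<String> {
    if normalized.is_empty() || normalized == "text" {
        None
    } else {
        Some(normalized)
    }
}

/// Model first (when one is supplied), then the heuristic fallback.
/// Passing `None` for `model` mimics a build without the magika feature.
pub fn detect_language(
    model: Option<&dyn Fn(&str) -> Option<ModelGuess>>,
    heuristic: &dyn Fn(&str) -> Option<String>,
    content: &str,
) -> Option<String> {
    if let Some(run_model) = model {
        // A failed model call (None) falls through to the heuristic.
        if let Some(guess) = run_model(content) {
            let normalized = normalize(&guess.label);
            let generic = GENERIC_LABELS.iter().any(|g| *g == normalized.as_str());
            if guess.is_text && !generic {
                if let Some(label) = accept(normalized) {
                    return Some(label);
                }
            }
        }
    }
    heuristic(content).map(|label| normalize(&label)).and_then(accept)
}
```

A generic or non-text model answer is discarded rather than returned, so the heuristic still gets a chance to produce a concrete label.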
Auto mode is intentionally "pending detection":
- switching to auto clears the resolved language label,
- the next content edit re-runs detection,
- if detection resolves a concrete language, that value is locked (language_is_manual = true) until explicitly switched back to auto.
- API create requests that omit language_is_manual detect immediately and lock when detection resolves; pass language_is_manual: false to start unresolved auto mode and defer detection until a later edit.
For manual language (language_is_manual == true), content edits do not re-run auto detection.
Important: Manual language selection disables automatic re-detection on edits until you switch back to auto mode.
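The auto/manual transitions described above behave like a small state machine. The sketch below is illustrative only: LanguageState and its methods are hypothetical names, and the detector is passed in as a callback rather than wired to the real detection module.

```rust
/// Hypothetical container for the two fields discussed above.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct LanguageState {
    pub language: Option<String>,
    pub language_is_manual: bool,
}

impl LanguageState {
    /// Switching to auto clears the resolved label and waits for the next edit.
    pub fn switch_to_auto(&mut self) {
        self.language = None;
        self.language_is_manual = false;
    }

    /// Explicit user choice: store the label and stop re-detecting.
    pub fn set_manual(&mut self, label: &str) {
        self.language = Some(label.to_string());
        self.language_is_manual = true;
    }

    /// Called on content edits; `detect` is a hypothetical detector callback.
    pub fn on_content_edit(&mut self, content: &str, detect: impl Fn(&str) -> Option<String>) {
        if self.language_is_manual {
            return; // manual or locked: edits never re-run detection
        }
        if let Some(label) = detect(content) {
            // A concrete result locks the value until auto is selected again.
            self.language = Some(label);
            self.language_is_manual = true;
        }
    }
}
```

An unresolved detection leaves the state in auto mode, so the next edit tries again.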
Magika session lifecycle:
- lazy singleton (OnceLock<Result<Mutex<magika::Session>, String>>),
- guarded with Mutex because Magika identify calls require &mut self,
- prewarm() is called in GUI/server startup paths to avoid first-save load latency.
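The lifecycle above follows a common std pattern. In the sketch below, Session is a hypothetical stand-in, not the magika crate's type, and its load and identify bodies are placeholders; only the OnceLock/Mutex shape is meant to match the description.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the model session; not the magika crate API.
pub struct Session {
    pub calls: u32,
}

impl Session {
    /// Placeholder for an expensive model load that may fail.
    fn load() -> Result<Self, String> {
        Ok(Session { calls: 0 })
    }

    /// Takes `&mut self`, mirroring the constraint that motivates the Mutex.
    pub fn identify(&mut self, content: &str) -> String {
        self.calls += 1;
        if content.trim_start().starts_with('{') {
            "json".to_string()
        } else {
            "txt".to_string()
        }
    }
}

/// Initialized at most once; a load failure is cached as Err and never retried.
static SESSION: OnceLock<Result<Mutex<Session>, String>> = OnceLock::new();

fn session() -> Result<&'static Mutex<Session>, String> {
    SESSION
        .get_or_init(|| Session::load().map(Mutex::new))
        .as_ref()
        .map_err(|e| e.clone())
}

/// Forces initialization early so the first real call does not pay the load cost.
pub fn prewarm() -> bool {
    session().is_ok()
}

pub fn identify(content: &str) -> Result<String, String> {
    let mutex = session()?;
    let mut guard = mutex
        .lock()
        .map_err(|_| "session mutex poisoned".to_string())?;
    Ok(guard.identify(content))
}
```

Storing the Result inside the OnceLock means a failed load is remembered, so callers fall back to the heuristic path instead of retrying the load on every save.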
Normalization maps legacy aliases and user-entered variants to stable labels (examples):
- csharp, c# -> cs
- c++ -> cpp
- bash, sh, zsh -> shell
- pwsh, ps1 -> powershell
- yml -> yaml
- js -> javascript
- ts -> typescript
- md -> markdown
- plaintext, plain text, plain, txt -> text
Unknown values pass through in lowercase.
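A minimal sketch of this mapping, covering only the example aliases listed above. The function name normalize_language and the whitespace trimming are assumptions; the project's table may contain more entries.

```rust
/// Maps aliases to stable labels; unknown values pass through in lowercase.
/// Trimming surrounding whitespace is an assumption of this sketch.
pub fn normalize_language(raw: &str) -> String {
    let lowered = raw.trim().to_lowercase();
    let canonical = match lowered.as_str() {
        "csharp" | "c#" => "cs",
        "c++" => "cpp",
        "bash" | "sh" | "zsh" => "shell",
        "pwsh" | "ps1" => "powershell",
        "yml" => "yaml",
        "js" => "javascript",
        "ts" => "typescript",
        "md" => "markdown",
        "plaintext" | "plain text" | "plain" | "txt" => "text",
        other => other, // not a known alias: keep the lowercased input
    };
    canonical.to_string()
}
```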
Manual language picker values are defined centrally in MANUAL_LANGUAGE_OPTIONS and stored as normalized values.
Language filter matching normalizes both:
- stored language metadata,
- incoming filter value.
This preserves interoperability across legacy and current labels (for example, csharp and cs).
Search ranking also checks normalized language values to avoid losing metadata relevance as stored labels evolve.
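The filter matching above amounts to comparing canonical forms on both sides. The sketch below uses a deliberately reduced alias table, and matches_language_filter is a hypothetical name.

```rust
/// Reduced alias table for illustration only (assumption); the real table is larger.
fn canonical(label: &str) -> String {
    let lowered = label.trim().to_lowercase();
    match lowered.as_str() {
        "csharp" | "c#" => "cs".to_string(),
        "yml" => "yaml".to_string(),
        other => other.to_string(),
    }
}

/// Normalizes both the stored metadata and the incoming filter before comparing.
pub fn matches_language_filter(stored: Option<&str>, filter: &str) -> bool {
    match stored {
        Some(label) => canonical(label) == canonical(filter),
        None => false, // no language metadata: never matches a language filter
    }
}
```

With this shape, a paste stored under the legacy label csharp still matches a filter of cs, and the reverse also holds.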
GUI highlight resolution uses a multi-step strategy instead of a fixed name table:
- exact syntax name
- exact extension
- case-insensitive name
- normalized-name match (alphanumeric only)
- case-insensitive extension scan
- explicit fallback candidates for known mismatches/high-priority labels
- plain text
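The lookup order above can be sketched against a simple list of syntax definitions. SyntaxDef, resolve_syntax, and the fallbacks parameter are hypothetical; the GUI resolves against its real syntax set, and the actual fallback table lives in syntax_fallback_candidates.

```rust
/// Hypothetical syntax entry: a display name plus file extensions.
pub struct SyntaxDef {
    pub name: &'static str,
    pub extensions: &'static [&'static str],
}

/// Lowercased, alphanumeric-only key used for the normalized-name step.
fn alnum_key(s: &str) -> String {
    s.chars()
        .filter(|c| c.is_alphanumeric())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

/// Steps 1 to 5: direct lookups, tried in order.
fn find_direct<'a>(syntaxes: &'a [SyntaxDef], hint: &str) -> Option<&'a SyntaxDef> {
    let key = alnum_key(hint);
    syntaxes
        .iter()
        .find(|s| s.name == hint) // 1. exact syntax name
        .or_else(|| syntaxes.iter().find(|s| s.extensions.iter().any(|e| *e == hint))) // 2. exact extension
        .or_else(|| syntaxes.iter().find(|s| s.name.eq_ignore_ascii_case(hint))) // 3. case-insensitive name
        .or_else(|| syntaxes.iter().find(|s| alnum_key(s.name) == key)) // 4. normalized name
        .or_else(|| {
            // 5. case-insensitive extension scan
            syntaxes
                .iter()
                .find(|s| s.extensions.iter().any(|e| e.eq_ignore_ascii_case(hint)))
        })
}

/// Step 6 tries explicit fallback candidates; `None` means step 7 (plain text).
pub fn resolve_syntax<'a>(
    syntaxes: &'a [SyntaxDef],
    fallbacks: &[(&str, &[&str])],
    hint: &str,
) -> Option<&'a SyntaxDef> {
    if let Some(found) = find_direct(syntaxes, hint) {
        return Some(found);
    }
    let key = alnum_key(hint);
    fallbacks
        .iter()
        .filter(|(label, _)| alnum_key(label) == key)
        .flat_map(|(_, candidates)| candidates.iter())
        .find_map(|candidate| find_direct(syntaxes, candidate))
}
```

Returning None instead of a plain-text syntax keeps the caller free to render plain text while leaving the stored language label untouched.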
Policy:
- Keep explicit fallback mapping narrow and intentional.
- Preserve unsupported-language visibility by keeping their metadata labels even when rendering falls back to plain text.
Fallback candidate mapping lives in:
../crates/localpaste_gui/src/app/highlight/syntax.rs (syntax_fallback_candidates)
Fallback coverage tests live in:
../crates/localpaste_gui/src/app/highlight/worker.rs (resolver_tests)
Examples covered by the resolver tests include:
- fallback-to-grammar labels (for example: typescript, powershell, sass)
- metadata-only/plain-render labels (for example: zig, kotlin, dart)
Virtual-editor highlight behavior is async and staged to avoid mid-burst visual churn while typing.
Flow:
- UI sends a highlight request keyed by paste/context (paste_id, revision, text_len, language_hint, theme_key).
- Worker coalesces queued requests and computes either:
  - full render (HighlightRender), or
  - changed-range patch (HighlightPatch) when the UI base snapshot matches the worker cache base.
- UI merges matching patches into staged/current highlight state.
- Staged highlight applies:
  - immediately only when there is no current render,
  - otherwise only after idle threshold.
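The staged-apply rule at the end of this flow can be sketched as below. Render and HighlightState are hypothetical simplifications of the real highlight types, and the 200 ms value is taken from the policy constants in this document.

```rust
use std::time::{Duration, Instant};

/// Idle apply threshold, per the policy constants in this document.
pub const IDLE_APPLY_THRESHOLD: Duration = Duration::from_millis(200);

/// Hypothetical simplified render; the real HighlightRender carries layout data.
#[derive(Debug, Clone, PartialEq)]
pub struct Render {
    pub revision: u64,
    pub lines: Vec<String>,
}

#[derive(Default)]
pub struct HighlightState {
    pub current: Option<Render>,
    pub staged: Option<Render>,
}

impl HighlightState {
    /// A newer worker result replaces whatever was staged before.
    pub fn stage(&mut self, render: Render) {
        self.staged = Some(render);
    }

    /// Promotes the staged render immediately when nothing is shown yet,
    /// otherwise only once the editor has been idle long enough.
    /// Returns true when a promotion happened.
    pub fn maybe_apply(&mut self, last_edit: Instant, now: Instant) -> bool {
        let idle = now.saturating_duration_since(last_edit) >= IDLE_APPLY_THRESHOLD;
        if self.staged.is_some() && (self.current.is_none() || idle) {
            self.current = self.staged.take();
            true
        } else {
            false
        }
    }
}
```

Holding the staged render back while the user is still typing is what avoids the mid-burst churn: the visible highlight changes once, after the burst ends.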
Current policy constants (virtual editor):
- idle apply threshold: 200ms
- adaptive debounce windows:
  - tiny edits (<=4 changed chars, <=2 touched lines): 15ms
  - medium edits: 35ms
  - larger supported buffers (>=64KB): 50ms
  - async highlighting disabled: 0ms (synchronous/no debounce path)
- plain rendering guardrail: >=256KB content
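These constants map naturally onto a small selection function. The sketch below makes two assumptions the text does not settle: KB is treated as 1024 bytes, and the buffer-size rule is checked before the tiny-edit rule. The function names are hypothetical.

```rust
use std::time::Duration;

/// Assumption: "KB" in the policy above means 1024 bytes.
const KIB: usize = 1024;

/// Picks a debounce window from the edit size and buffer size.
/// Assumption: the large-buffer rule takes precedence over the tiny-edit rule.
pub fn debounce_window(
    async_enabled: bool,
    buffer_bytes: usize,
    changed_chars: usize,
    touched_lines: usize,
) -> Duration {
    let ms = if !async_enabled {
        0 // synchronous path: no debounce
    } else if buffer_bytes >= 64 * KIB {
        50
    } else if changed_chars <= 4 && touched_lines <= 2 {
        15
    } else {
        35
    };
    Duration::from_millis(ms)
}

/// Plain rendering guardrail: content at or above 256KB skips highlighting.
pub fn should_render_plain(buffer_bytes: usize) -> bool {
    buffer_bytes >= 256 * KIB
}
```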
Primary implementation:
- request/stage/apply lifecycle: ../crates/localpaste_gui/src/app/highlight_flow.rs
- virtual edit hint capture: ../crates/localpaste_gui/src/app/virtual_ops.rs
- editor dispatch and debounce usage: ../crates/localpaste_gui/src/app/ui/editor_panel.rs
When Magika is enabled, runtime defaults to CPU execution provider:
- env var: MAGIKA_FORCE_CPU
- default: true
- falsey values (0, false, no, off) allow runtime/provider defaults
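Parsing this flag might look like the sketch below. The function names are hypothetical, and two behaviors are assumptions: matching is case-insensitive after trimming, and any value outside the falsey list (including an empty string) keeps CPU forced.

```rust
/// Interprets the raw variable value; `None` means the variable is unset.
/// Assumptions: case-insensitive match after trimming, and every value
/// outside the falsey list keeps the default (force CPU).
pub fn force_cpu_from(value: Option<&str>) -> bool {
    match value {
        Some(raw) => !matches!(
            raw.trim().to_ascii_lowercase().as_str(),
            "0" | "false" | "no" | "off"
        ),
        None => true, // default: true
    }
}

/// Reads MAGIKA_FORCE_CPU from the process environment.
pub fn force_cpu() -> bool {
    force_cpu_from(std::env::var("MAGIKA_FORCE_CPU").ok().as_deref())
}
```

Splitting the parsing from the environment read keeps the rule testable without mutating process-wide state.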
Reference: ../.env.example
YAML auto-detection keeps single-line mappings valid (for example, key: value) while still rejecting obvious non-YAML noise when refining model output.
Primary implementation:
When touching detection/highlight behavior, validate:
- core detection tests (localpaste_core::detection::tests),
- GUI resolver/worker tests (localpaste_gui::app::highlight::worker::resolver_tests),
- GUI manual checks in dev/gui-notes.md,
- GUI perf checks in dev/gui-perf-protocol.md.