fix(ui/channels): avoid UTF-8 char-boundary panics in text truncation #1
Closed
NiuBlibing wants to merge 439 commits into
…es (zeroclaw-labs#6534)
- e3103c7 fix(sop): call reload() after SopEngine construction at both call sites
- 032a26d test(sop): add reload contract regression tests
- 157afdd docs(sop): clarify that sops_dir is required for runtime SOP loading
Native tool-capable providers already receive the tool catalog through provider-native tool specs, so the system prompt should not duplicate that catalog in prose. Thread the native-spec decision through PromptContext and keep XML/delegate prompt paths on the existing textual tool section. Tests cover both the modular Agent prompt path and the legacy prompt builder. Related zeroclaw-labs#6074 Co-authored-by: smallwhite <12741016+whtiehack@users.noreply.github.com>
…claw-labs#6533) default_config_dir() now checks ZEROCLAW_CONFIG_DIR before falling back to ~/.zeroclaw, so all seven path-field defaults (knowledge.db_path, workspace.workspaces_dir, plugins.dir, project_intel.report_dir, security_ops.estop_state_file, playbooks_dir, report_output_dir) point into the active profile when a custom config dir is set.
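The precedence described in this commit can be sketched as below. This is a minimal illustration, not the project's code: `resolve_config_dir` and `default_plugins_dir` are hypothetical names, the override and home directory are passed in as arguments rather than read from the environment, treating an empty override as unset is an assumption, and the `plugins` subdirectory name is only a placeholder.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: choose the config directory from an optional
/// override (e.g. the value of ZEROCLAW_CONFIG_DIR) and an optional home dir.
fn resolve_config_dir(override_dir: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    // Assumption: an empty override is treated the same as an unset one.
    if let Some(dir) = override_dir.filter(|d| !d.is_empty()) {
        return Some(PathBuf::from(dir));
    }
    // Otherwise fall back to <home>/.zeroclaw.
    home.map(|h| Path::new(h).join(".zeroclaw"))
}

/// Hypothetical path-field default: derived from the active config dir,
/// so it follows a custom profile automatically.
fn default_plugins_dir(config_dir: &Path) -> PathBuf {
    config_dir.join("plugins")
}
```

Deriving every path default from the one resolved directory is what keeps the defaults inside the active profile; none of them needs to know about the environment variable itself.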
…s#6539) Route ACP and web dashboard direct agents through a back-channel approval mode so bare shell calls cannot bypass runtime approval by setting approved=true in tool arguments. Keep runtime-owned approved arguments aligned with approval policy for shell and cron-style command tools, including prior Always decisions and auto-approved tools.
…law-labs#6546) Treat empty effective tool sets as a no-tools turn across prompt assembly, provider request shape, and parser execution. Preserve reasoning-tag stripping while avoiding execution of tool-like output when no tools are available. Add focused regressions for native request shape, XML text preservation, prompt scaffolding, and channel protocol prompt behavior.
…abs#6114)
- 60d1562 fix(provider): strip media markers in auxiliary chat_with_system calls
- 5ab377b fix(provider): also strip [PHOTO:] markers in auxiliary calls
- 0dd59f4 fix(runtime/context): reconcile strip-markers helper with zeroclaw-labs#6189 vision contract
Wrap OpenRouter structured chat/history system messages in a single text content block with `cache_control: {"type": "ephemeral"}` so cache-aware upstream models can receive prompt-cache breakpoints through OpenRouter.
Map non-streaming OpenRouter `usage.prompt_tokens_details.cached_tokens` into `TokenUsage.cached_input_tokens` when present, while preserving absent, empty, and zero detail handling for providers that do not report cached-token usage.
Keep user, assistant, tool, and multimodal user message shapes unchanged outside the new system-message cache marker. The live PR discussion verified the new system-message array form against `openai/gpt-4o` through OpenRouter with matching prompt-token billing to the plain-string control.
Notes:
- One-shot `chat_with_system` still uses the older system-message shape.
- Streaming sends the cached request shape but does not surface cached-token usage through `StreamEvent`.
Related zeroclaw-labs#3977
Related zeroclaw-labs#5440
…eps (zeroclaw-labs#6570)
- Update all image references from Docker Hub (zeroclawlabs/zeroclaw) to GitHub Container Registry (ghcr.io/zeroclaw-labs/zeroclaw).
- Add the missing `zeroclaw onboard` step to the Compose section.
- Add a new "Re-authenticating after logout" section explaining how to regenerate a paircode with `zeroclaw gateway get-paircode --new`.
Closes zeroclaw-labs#6393
…w-labs#6567) The v0.7.x workspace split moved most module implementations from src/** to crates/zeroclaw-*/src/**, but labeler.yml still only globs src/**. PRs that only touch crate files receive no area label. Add corresponding crates/zeroclaw-*/src/** globs alongside every existing src/** glob. Legacy src/** patterns are preserved so any remaining shim code still matches. Closes zeroclaw-labs#6359
…eroclaw-labs#6568) Two `build_channel_by_id` telegram tests run unconditionally but the corresponding dispatch arm is `#[cfg(feature = "channel-telegram")]`-gated. Since `channel-telegram` is not in the default feature set, these tests always hit the "Unknown channel" path and fail. Add `#[cfg(feature = "channel-telegram")]` to both tests, matching the existing pattern used by the voice-call tests in the same module. Closes zeroclaw-labs#6347
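The mismatch described in this commit, and the matching gate that fixes it, can be sketched as follows. This is only an illustration: `build_channel` is a hypothetical stand-in, since the real `build_channel_by_id` signature is not shown here.

```rust
/// Hypothetical stand-in for a channel dispatch function.
fn build_channel(id: &str) -> Result<&'static str, String> {
    match id {
        // This arm only exists when the crate is built with the feature.
        #[cfg(feature = "channel-telegram")]
        "telegram" => Ok("telegram"),
        other => Err(format!("Unknown channel: {other}")),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    // Gated the same way as the arm, so it is compiled out (not failed)
    // when the feature is disabled.
    #[cfg(feature = "channel-telegram")]
    #[test]
    fn telegram_channel_builds() {
        assert!(build_channel("telegram").is_ok());
    }

    // Runs under every feature set: unknown ids are always rejected.
    #[test]
    fn unknown_channel_is_rejected() {
        assert!(build_channel("nope").is_err());
    }
}
```

Without the gate on the first test, a default-feature build would fall through to the catch-all arm and the test would fail, which is the failure the commit describes.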
…rgs (zeroclaw-labs#6569)
rust-analyzer's clippy check already passes `--all-targets` by default. Including it in `extraArgs` causes the argument to be duplicated, which makes `cargo clippy` fail with:

error: the argument '--all-targets' cannot be used multiple times

Remove `--all-targets` from `extraArgs` so only `-- -D warnings` remains. Closes zeroclaw-labs#5687
Recover the updater asset-selection behavior from zeroclaw-labs#4337 against current master. Exact-match installable release archives for the supported target, skip unusable download URLs, and fail closed for unsupported targets. Co-authored-by: rareba <rareba@users.noreply.github.com>
Recover zeroclaw-labs#4573 by preserving Gemini usageMetadata through Provider::chat(). Gemini already parsed usageMetadata in send_generate_content(), but the structured chat path used the trait default and returned usage: None. Route Gemini chat through a usage-preserving helper, keep prompt-guided tool instructions, and preserve wrapped OAuth usage metadata.
Supersedes:
- zeroclaw-labs#4573 by @SpectreMercury
Integrated scope:
- Gemini provider: structured chat returns parsed token usage from zeroclaw-labs#4573
Co-authored-by: ERROR404 <11926244+SpectreMercury@users.noreply.github.com>
Detect DuckDuckGo 403 responses and verification /wr.do? flows before result parsing so automated block pages surface actionable provider guidance instead of generic failures or empty results. Add request-path coverage for blocked statuses, verification redirects, verification form HTML, and normal empty-result handling.
zeroclaw-labs#6183)
- f4d4b08 fix(multimodal): normalize image markers across agent and tool history
- 2d887db review: address PR zeroclaw-labs#6183 feedback (truncation marker preservation, ch…
- 4569b2f fix(providers/multimodal): preserve native tool-result JSON during im…
…_mode=partial (zeroclaw-labs#6588)
- ca1ea64 Merge branch 'master' into fix/6415-tts-stream-mode-partial
- b0cdae7 fix(channels): extract TTS voice reply into shared helper for stream_mode=partial
)
- 5f2714e fix(channels): close Discord media send/receive gaps
- c77c302 fix(channels): do not cache thread-lookup failures
- fba1929 fix(channels): surface dropped Discord markers; require absolute paths
- 690cef2 fix(channels): upload file in Discord MultiMessage when paragraph collapses to marker-only
- 9a9da08 fix(channels): admit attachment-only Discord messages and bound thread lookup
…fix + docs fallback, Jordan trapdoor for features (zeroclaw-labs#6554)
- c00e5d6 docs(skills/pr-review): refine Phase 3.5 milestone alignment with bre…
- 1ae30df docs(skills/pr-review): fix scope-compare step wording; add Other typ…
…eroclaw-labs#6562)
- f4270da feat(nix): add multi-instance NixOS module + test
- 171e1fb feat(nix): tighten systemd hardening (DeviceAllow, MemoryDenyWriteExecute, RemoveIPC)
- 49fc6c9 feat(nix): add PrivateUsers=true to harden the unit's user namespace
- 39e0556 fix(nix): drop ExecStartPre chown - unit already owns its dataDir
- 5d68436 refactor(nix): drop extraServiceConfig escape hatch
- 9f190c7 style(nix): apply nixfmt-rfc-style
- 9945432 fix(nix): drop unused `config` arg from instanceModule
- 502190f style(nix): drop unused `config` arg from test machine signature
- c58a8a6 docs(nix): drop stale extraServiceConfig reference from README
- c8d60f2 fix(nix): expand $VAR placeholders + accept arbitrary dataDir paths
- a89e8d9 docs(nix): add missing `config` arg to README quick-start lambda
- f82c3a6 docs(nix): clean up stale StateDirectory prose
…evice) + peripheral wiring support (zeroclaw-labs#7045)
…st (zeroclaw-labs#7046)
- 4e65163 feat(hardware): add dev-sim feature with /tmp/zc-sim-* serial allowlist
- 4c82b36 address review suggestion: clarify dev-sim usage with hardware featur…
…eroclaw-labs#7023)
- 4a0a3ed feat(docs): implement versioned documentation deployment and version selector
- 7364dd2 feat(docs): enhance version sorting and validation in deployment workflow
- 9615c2d fix(docs): replace hardcoded "master" with DEFAULT_TAG in build process
- a7edd5a fix(docs): added a second checkout step
- bae35ce feat(docs): implement versioned documentation deployment and shared chrome extraction
- 8f2a846 refactor(docs): format scripts
- eeed5be refactor(build): simplify conditional checks in extract_shared_chrome function
- df2242a feat(docs): migrate documentation scripts to Rust and update deployment workflow
`dashboard::truncate` sliced `&first_line[..max]` at a raw byte index, panicking when `max` landed inside a multi-byte character (e.g. CJK session summaries). Switch to char-based truncation.

Two other sites had the same unguarded byte-slice bug:
- linkedin: `&text[..200]` when building the image-generation prompt
- bluesky: `&message.content[..297]`, which also compared byte `.len()` against what the comment documents as a 300-character (grapheme) limit

All three now count and take `chars()`, which is char-boundary safe and matches the intended character-based semantics.
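The difference between the two approaches can be sketched as below. This is a minimal illustration under assumptions, not the code from the PR: `truncate_chars` and `checked_prefix` are hypothetical names, and any suffix the real `dashboard::truncate` may append is left out.

```rust
/// Hypothetical char-based truncation: keeps at most `max` Unicode scalar
/// values. Unlike `&s[..max]`, this cannot land inside a multi-byte character.
fn truncate_chars(s: &str, max: usize) -> String {
    if s.chars().count() <= max {
        s.to_string()
    } else {
        s.chars().take(max).collect()
    }
}

/// Non-panicking view of what the old byte slice attempted: `str::get`
/// returns `None` where `&s[..max_bytes]` would panic.
fn checked_prefix(s: &str, max_bytes: usize) -> Option<&str> {
    s.get(..max_bytes)
}
```

Each character of a string such as `"日本語"` occupies three bytes, so a byte index of 4 falls inside the second character: `checked_prefix("日本語", 4)` is `None`, while `truncate_chars("日本語", 2)` yields `"日本"`.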
Problem

`dashboard::truncate` sliced `&first_line[..max]` at a raw byte index. When `max` landed inside a multi-byte character, it panicked.

A repo-wide sweep for the same unguarded byte-slice pattern found two more sites with the identical bug (all other slice sites are already guarded via `is_char_boundary` / `char_indices().nth()` / `find()` offsets):
- linkedin: `&text[..200]` when building the image-generation prompt
- bluesky: `&message.content[..297]`, which additionally compared byte `.len()` against what the comment documents as a 300-character (grapheme) limit, so multi-byte content both mis-triggered and panicked

Fix

All three now use `chars().count()` / `chars().take(n)`, which is char-boundary safe and matches the intended character-based semantics.

Notes / scope
- linkedin: `200` is not a platform limit; it only trims text inside an internal image prompt. Character-based is the natural reading.
- bluesky: the 300 limit is counted in graphemes per the `app.bsky.feed.post` lexicon. `chars()` (codepoints) is panic-safe and exact for ASCII/CJK, but slightly conservative for ZWJ/skin-tone emoji. Fully exact grapheme truncation would require adding `unicode-segmentation` (not currently a dependency), so it is left out of this panic fix.

Testing

`cargo check -p zeroclaw-tools -p zeroclaw-channels -p zerocode` passes.
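To illustrate why codepoint counting is described as slightly conservative, here is a small sketch using only the standard library. The names `cap_post` and `FAMILY` are hypothetical, and appending `...` after the kept prefix is an assumption inferred from the 297/300 split rather than something the PR states.

```rust
/// One user-perceived character (a ZWJ emoji sequence) made of five
/// codepoints: man, ZWJ, woman, ZWJ, girl.
const FAMILY: &str = "\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}";

/// Hypothetical cap: if `content` exceeds `limit` codepoints, keep
/// `limit - 3` of them and append "..." (assumed suffix).
fn cap_post(content: &str, limit: usize) -> String {
    if content.chars().count() <= limit {
        return content.to_string();
    }
    let kept: String = content.chars().take(limit.saturating_sub(3)).collect();
    format!("{kept}...")
}
```

`FAMILY` counts as 5 codepoints and 18 bytes but a single grapheme, so a codepoint-based check can trigger before the grapheme limit is actually reached. It errs on the side of truncating early and never panics, although it may cut through a ZWJ sequence.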