This file documents important bugs, mistakes, and lessons learned during development to prevent similar issues in the future.
Auto-lock functionality was completely broken due to an integer overflow bug in the activity counter.
Location: src/gui/auth_manager.rs, start_activity_counter() method
Problematic Code:
```rust
if s.is_unlocked && s.last_activity_secs < u64::MAX as i64 {
    s.last_activity_secs += 1;
}
```
The Issue:
- `u64::MAX` is `18446744073709551615`
- When cast to `i64`, it overflows and wraps to `-1`
- The condition `0 < -1` is always `false`
- Counter never incremented, auto-lock never triggered
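The wraparound is easy to reproduce in isolation. The sketch below uses two hypothetical helper functions (they are not part of `auth_manager.rs`) that pull the comparison out of the buggy and the fixed guard so each can be checked on its own:

```rust
/// Guard as originally written: the bound is built with an `as` cast.
/// `u64::MAX as i64` reinterprets the all-ones bit pattern, which is -1.
fn buggy_can_increment(last_activity_secs: i64) -> bool {
    last_activity_secs < u64::MAX as i64
}

/// Guard after the fix: the bound has the same type as the counter.
fn fixed_can_increment(last_activity_secs: i64) -> bool {
    last_activity_secs < i64::MAX
}
```

With a fresh counter, `buggy_can_increment(0)` evaluates `0 < -1` and returns `false`, while `fixed_can_increment(0)` returns `true`.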
The Fix:
```rust
if s.is_unlocked && s.last_activity_secs < i64::MAX {
    s.last_activity_secs += 1;
}
```
Root Cause:
- Type Confusion: Mixed use of `i64` for counter but checked against `u64::MAX`
- Insufficient Testing: No integration tests for auto-lock timing
- Silent Failure: Condition silently failed without error or warning
- Missing Runtime Validation: No assertions or debug checks during development
Lessons Learned:
- Always use matching types: If a variable is `i64`, compare against `i64::MAX`, not `u64::MAX`
- Test time-based features: Features involving timers and counters need specific tests with accelerated time
- Add debug logging during development: Would have caught this immediately
- Use type-safe constants: Consider using const generics or associated constants with proper types
- Integer casting is dangerous: Be extremely careful with `as` casts, especially with MAX/MIN values
Prevention:
- ✅ Add comprehensive unit tests for counter increment logic
- ✅ Add integration tests for auto-lock with short timeouts
- ✅ Use clippy lints to catch suspicious casts
- ✅ Document type choices and constraints in code comments
- ✅ Add assertions in debug builds for critical invariants
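A minimal sketch of how several of these prevention items could fit together. `ActivityState`, `tick`, and `should_auto_lock` are hypothetical names chosen for illustration, not the project's actual types. The `clippy::cast_possible_wrap` lint is part of Clippy's pedantic group and has to be enabled explicitly, as done here.

```rust
/// Hypothetical stand-in for the state guarded by the auth manager.
#[derive(Debug, Default)]
struct ActivityState {
    is_unlocked: bool,
    last_activity_secs: i64,
}

/// Advances the idle counter by one second while the vault is unlocked.
// Pedantic Clippy lint that flags wrapping `as` casts such as u64 -> i64.
#[warn(clippy::cast_possible_wrap)]
fn tick(state: &mut ActivityState) {
    if !state.is_unlocked {
        return;
    }
    let before = state.last_activity_secs;
    // saturating_add stops at i64::MAX, so no MAX comparison is needed.
    state.last_activity_secs = before.saturating_add(1);
    // Debug-build invariant: the counter advances unless it is saturated.
    debug_assert!(
        state.last_activity_secs > before || before == i64::MAX,
        "activity counter failed to advance"
    );
}

/// True once the idle time has reached the configured timeout.
fn should_auto_lock(state: &ActivityState, timeout_secs: i64) -> bool {
    state.is_unlocked && state.last_activity_secs >= timeout_secs
}
```

Because `tick` takes the state directly instead of sleeping, a unit test can simulate a timeout by calling it in a loop, which is one way to get the "accelerated time" mentioned above.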
```rust
// ❌ BAD - Cross-type MAX comparisons
if value < u64::MAX as i64 { }  // Cast wraps to -1!
if value < u32::MAX as i16 { }  // Cast wraps to -1!
// ✅ GOOD - Use matching types
if value < i64::MAX { }
if value < i16::MAX { }
// ✅ GOOD - Use TryFrom for safe conversion
// (u64::MAX does not fit in i64, so this yields Err and the branch is
// skipped instead of silently wrapping)
if let Ok(max) = i64::try_from(u64::MAX) {
    if value < max { }
}
```
Tests Added:
- Unit test: `test_activity_counter_increment`
- Unit test: `test_auto_lock_timeout_exact`
- Integration test: `test_auto_lock_with_frontend_event`
- Integration test: `test_activity_reset_prevents_lock`
During Windows Hello implementation, clippy warnings were introduced but not caught immediately because quality gates weren't consistently applied during ad-hoc question responses.
Context: While responding to random user questions (not in structured OpenSpec workflow), code changes were made without running quality checks.
The Issue:
- Clippy errors were introduced in earlier changes
- Errors only discovered several questions later by user
- Quality gates exist (`cargo clippy --features "cli gui" -- -D warnings`) but were not consistently run
- Ad-hoc questions bypassed structured quality control
User Observation:
"I found out that cargo clippy reports error and the fix happens several questions ago. I fixed the error already, but as we already have clippy as our quality gate, I suspect the failure of applying this is because of the random questions."
Root Cause:
- Process Gap: No formal workflow for handling ad-hoc user questions
- Inconsistent Quality Checks: Quality gates only applied in structured workflows (OpenSpec)
- Random Questions Bypass: Ad-hoc questions treated differently from planned features
- No Type Classification: Didn't distinguish between information requests vs code modifications
Lessons Learned:
- All code changes require quality gates: Whether in OpenSpec workflow or ad-hoc questions
- Classify questions first: Identify if answer requires code modification
- Quality gates are not optional: Must run after EVERY code change, no exceptions
- Early detection saves time: Running clippy immediately catches issues before they compound
- Process consistency: Same standards apply regardless of how the change was requested
Prevention:
- ✅ Add "Ad-Hoc Question Workflow" section to AGENTS.md
- ✅ Define question type classification (Information, Code Modification, Investigation)
- ✅ Document mandatory quality gate checklist by change type
- ✅ Establish "Constitution" rule: Quality gates are MANDATORY for code changes
- ✅ Include examples of good vs bad workflow execution
Must run after ANY Rust code change:
```bash
cargo clippy --features "cli gui" -- -D warnings  # Linting
cargo fmt --check                                 # Formatting
cargo check --features "cli gui"                  # Type check
cargo test <relevant_module>                      # Related tests
```
Must run after ANY frontend change:
```bash
cd ui-sveltekit && npm run build  # Build check
```

| Question Type | Quality Gates Required? |
|---|---|
| Information request | No |
| Code modification | ✅ YES - MANDATORY |
| Investigation (with changes) | ✅ YES - MANDATORY |
| Documentation only | No (verify markdown syntax) |
Constitution: For any code modification, running quality gates is MANDATORY, not optional. Skipping quality checks violates project standards and can introduce bugs.
The Bug: One-line description
Location: File and function
Problematic Code:
```rust
// bad code
```
The Fix:
```rust
// good code
```
Root Cause: Why it happened
Lessons Learned: What we learned
Prevention: How to prevent it
Tests Added: List of new tests
Windows Hello biometric feature showed "Checking availability..." indefinitely instead of displaying actual status. Multiple debugging rounds were needed due to cascading type mismatches between Rust backend and TypeScript frontend.
Location: Multiple files across backend and frontend
Problem: Frontend called checkBiometricAvailable() expecting raw BiometricAvailability but backend returned CommandResponse<BiometricAvailability>.
Fix: Updated ui-sveltekit/src/lib/services/tauri.ts:
```typescript
// Before
return await invoke<BiometricAvailability>('check_biometric_available');
// After
const response = await invoke<CommandResponse<BiometricAvailability>>('check_biometric_available');
if (!response.success || !response.data) {
  throw new Error(response.error || 'Failed to check biometric availability');
}
return response.data;
```
Problem: Rust enum serialized as PascalCase ("Available", "NotConfigured") but TypeScript expected snake_case ('available', 'not_configured').
Root Cause:
- By default, Serde serializes enum variants under their Rust names, which are PascalCase by convention
- Frontend switch statement used snake_case literals
- Mismatched values fell through to the `default` case → "Checking availability..."
Fix: Added serde attribute in src/core/types.rs:
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[cfg_attr(feature = "gui", derive(serde::Serialize, serde::Deserialize))]
#[cfg_attr(feature = "gui", serde(rename_all = "snake_case"))]
pub enum BiometricAvailability {
    Available,
    NotConfigured,
    DeviceNotPresent,
    NotSupported,
}
```
Problem: windows-biometric feature wasn't included in default gui feature, so it wasn't enabled during cargo tauri dev.
Fix: Updated Cargo.toml:
```toml
gui = ["dep:tauri", "dep:tauri-plugin-shell", "dep:tauri-build", "custom-protocol", "windows-biometric"]
```
Root Cause:
- No Type Safety Across IPC Boundary: TypeScript types are manually maintained copies of Rust types
- Silent Failures: Type mismatches don't cause compilation errors, just runtime behavior bugs
- Missing Verification: No systematic way to verify Rust and TypeScript types match
- Inconsistent Patterns: Some commands use `CommandResponse`, some return raw data
- Default Serde Behavior: Serde's default enum serialization doesn't match typical TypeScript conventions
Lessons Learned:
- Always use `CommandResponse<T>` wrapper: Provides consistent error handling across all Tauri commands
- Always add `#[serde(rename_all = "snake_case")]` to shared enums: Matches TypeScript naming conventions
- Keep TypeScript types synchronized: Update `ui-sveltekit/src/lib/types/api.ts` immediately after Rust changes
- Test at the API boundary: Run `npm run build` after backend changes to catch type errors early
- Document the pattern in AGENTS.md: Added comprehensive section on Backend-Frontend API Consistency
- Feature flags need careful management: GUI-required features should be part of the `gui` feature
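To make the wrapper rule concrete, here is a minimal sketch of what a `CommandResponse<T>` envelope could look like on the Rust side. The field names `success`, `data`, and `error` are taken from the TypeScript fix above. The constructors, the `into_result` helper, and the "Unknown error" fallback text are assumptions for illustration, and the project's real struct (which would also derive Serde traits) may differ.

```rust
/// Hypothetical Rust-side envelope mirroring the `success` / `data` / `error`
/// fields that the TypeScript fix reads.
#[derive(Debug, Clone, PartialEq)]
struct CommandResponse<T> {
    success: bool,
    data: Option<T>,
    error: Option<String>,
}

impl<T> CommandResponse<T> {
    /// Successful response carrying a payload.
    fn ok(data: T) -> Self {
        Self { success: true, data: Some(data), error: None }
    }

    /// Failed response carrying only an error message.
    fn err(message: impl Into<String>) -> Self {
        Self { success: false, data: None, error: Some(message.into()) }
    }

    /// Same unwrapping rule as the frontend: a payload is returned only when
    /// `success` is set and `data` is present.
    fn into_result(self) -> Result<T, String> {
        match (self.success, self.data) {
            (true, Some(data)) => Ok(data),
            _ => Err(self.error.unwrap_or_else(|| "Unknown error".to_string())),
        }
    }
}
```

Routing every command through one envelope means the frontend needs a single unwrapping helper instead of guessing per command whether the payload is raw or wrapped.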
Prevention:
- ✅ Added "Backend-Frontend API Consistency" section to AGENTS.md with mandatory rules
- ✅ Added verification checklist for new Tauri commands
- ✅ Created unit tests: `tests/biometric_availability_test.rs` to verify feature flag behavior
- ⏳ TODO: Consider using `ts-rs` crate to auto-generate TypeScript types from Rust
- ⏳ TODO: Add CI check that compares TypeScript types against Rust types
- ⏳ TODO: Create a script to validate all `CommandResponse` usage across codebase
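Until type generation or a CI comparison exists, one lightweight option is a test that pins the wire names the frontend depends on. The sketch below is std-only, so `wire_name` is a hand-written stand-in for the Serde output. In the real project the string would more likely come from serializing the value (for example with `serde_json::to_string`). `FRONTEND_LITERALS` is a hypothetical copy of the strings the TypeScript switch uses. Only `'available'` and `'not_configured'` appear in this document, and the other two are assumed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BiometricAvailability {
    Available,
    NotConfigured,
    DeviceNotPresent,
    NotSupported,
}

impl BiometricAvailability {
    const ALL: [Self; 4] = [
        Self::Available,
        Self::NotConfigured,
        Self::DeviceNotPresent,
        Self::NotSupported,
    ];

    /// Stand-in for what `rename_all = "snake_case"` is expected to emit.
    fn wire_name(self) -> &'static str {
        match self {
            Self::Available => "available",
            Self::NotConfigured => "not_configured",
            Self::DeviceNotPresent => "device_not_present",
            Self::NotSupported => "not_supported",
        }
    }
}

/// Hypothetical copy of the string literals the frontend switches on.
const FRONTEND_LITERALS: [&str; 4] = [
    "available",
    "not_configured",
    "device_not_present",
    "not_supported",
];

/// Variants whose wire name the frontend would not recognise.
fn unmatched_variants() -> Vec<BiometricAvailability> {
    BiometricAvailability::ALL
        .iter()
        .copied()
        .filter(|v| !FRONTEND_LITERALS.contains(&v.wire_name()))
        .collect()
}
```

A test asserting that `unmatched_variants()` is empty would fail as soon as a variant is added or renamed without updating the frontend list, instead of surfacing as a silent fall-through at runtime.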
Tests Added:
- `tests/biometric_availability_test.rs`: 4 tests verifying provider initialization and feature flag behavior
Impact:
- Debugging Time: Multiple hours across several debugging rounds
- User Impact: Feature appeared completely broken ("Checking availability..." forever)
- Code Quality: Exposed systemic type safety gap in the project
- Documentation: Led to major improvements in AGENTS.md conventions
The Windows Hello biometric authentication feature was implemented without a credential storage component, resulting in a fundamentally broken authentication flow.
Location: src/services/auth_service.rs, unlock_with_biometric() method
Original Problematic Implementation:
```rust
pub async fn unlock_with_biometric(&self, message: &str) -> Result<()> {
    // Verify vault is unlocked (this ensures user was recently authenticated)
    if !self.is_unlocked_async().await {
        return Err(VaultError::Locked);
    }
    let provider = self.biometric_provider.as_ref().ok_or(...)?;
    provider.verify(message).await?;
    Ok(())
}
```
The Issue:
- Biometric unlock required the vault to already be UNLOCKED before using biometric
- Only verified biometric identity, but didn't actually unlock anything
- Complete logical inversion: biometric was for verification after unlock, not for unlocking
- Missing the entire credential retrieval mechanism
Root Cause:
- Incomplete Architecture Pattern: The biometric authentication pattern has three required components:
  - ✅ Identity verification (Windows Hello API call)
  - ❌ Credential storage (MISSING - no place to store PIN)
  - ❌ Credential retrieval (MISSING - no way to get PIN after verification)
- Misunderstanding the Use Case: Confused "biometric verification of already-authenticated user" with "biometric unlocking"
  - Verification pattern: User → PIN unlock → Later verify still same user with biometric
  - Unlock pattern: User → Biometric → Retrieve stored PIN → Unlock vault
- Missing Design Review: Should have recognized that biometric unlock REQUIRES credential storage:
  - Where is the PIN stored between sessions?
  - How is it encrypted at rest?
  - How is it retrieved after biometric verification?

  These questions should have been asked during initial implementation.
Created complete biometric unlock architecture:
- Added `CredentialStore` module (src/biometric/credential_store.rs):
  - Uses Windows DPAPI (Data Protection API) for encryption
  - Stores PIN encrypted with user's Windows credentials
  - Per-vault storage using database path hash
  - Only accessible to the Windows user who stored it
- Refactored `unlock_with_biometric()`:
```rust
pub async fn unlock_with_biometric(&self, message: &str) -> Result<()> {
    // 1. Verify biometric identity
    let provider = self.biometric_provider.as_ref().ok_or(...)?;
    let verified = provider.verify(message).await?;
    if !verified { return Err(VaultError::BiometricFailed); }
    // 2. Retrieve stored PIN (only after successful biometric)
    let credential_store = self.credential_store.as_ref().ok_or(...)?;
    let pin = credential_store.retrieve_pin()?;
    // 3. Unlock vault with retrieved PIN
    self.unlock(&pin).await?;
    Ok(())
}
```
- Added credential management:
  - `enable_biometric_storage(pin)` - Stores PIN after unlock
  - `disable_biometric_storage()` - Removes stored PIN
  - `is_biometric_storage_enabled()` - Checks if PIN is stored
  - Automatic PIN storage after successful unlock (if biometric enabled)
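The verify → retrieve → unlock ordering can be exercised without Windows by putting the platform pieces behind traits. The following is a synchronous, std-only sketch under several assumptions: the trait shapes, `Vault`, `FixedProvider`, `MemoryStore`, and the `CredentialNotFound` / `InvalidPin` variants are all hypothetical, and the in-memory store and plain-text PIN comparison are test doubles for illustration, not a substitute for DPAPI-backed storage.

```rust
#[derive(Debug, PartialEq)]
enum VaultError {
    BiometricFailed,
    CredentialNotFound, // hypothetical variant
    InvalidPin,         // hypothetical variant
}

/// Hypothetical abstraction over the Windows Hello prompt.
trait BiometricProvider {
    fn verify(&self, message: &str) -> Result<bool, VaultError>;
}

/// Hypothetical abstraction over the encrypted PIN storage.
trait CredentialStore {
    fn retrieve_pin(&self) -> Result<String, VaultError>;
}

struct Vault {
    pin: String,
    is_unlocked: bool,
}

impl Vault {
    fn unlock(&mut self, pin: &str) -> Result<(), VaultError> {
        if pin != self.pin {
            return Err(VaultError::InvalidPin);
        }
        self.is_unlocked = true;
        Ok(())
    }

    fn unlock_with_biometric(
        &mut self,
        provider: &dyn BiometricProvider,
        store: &dyn CredentialStore,
        message: &str,
    ) -> Result<(), VaultError> {
        // 1. Identity check comes first and fails closed.
        if !provider.verify(message)? {
            return Err(VaultError::BiometricFailed);
        }
        // 2. The stored PIN is only touched after a successful check.
        let pin = store.retrieve_pin()?;
        // 3. The normal PIN path performs the actual unlock.
        self.unlock(&pin)
    }
}

/// Test double: always reports the configured verification outcome.
struct FixedProvider(bool);
impl BiometricProvider for FixedProvider {
    fn verify(&self, _message: &str) -> Result<bool, VaultError> {
        Ok(self.0)
    }
}

/// Test double: keeps the PIN in memory (no encryption).
struct MemoryStore(Option<String>);
impl CredentialStore for MemoryStore {
    fn retrieve_pin(&self) -> Result<String, VaultError> {
        self.0.clone().ok_or(VaultError::CredentialNotFound)
    }
}
```

Starting from a locked `Vault`, a call with `FixedProvider(true)` and a populated `MemoryStore` should leave `is_unlocked` set, while a failed verification or an empty store should leave it locked. The original implementation could not pass the first case because it required the vault to be unlocked already.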
Lessons Learned:
- Recognize Complete Architecture Patterns: Biometric authentication is not just verification - it's a complete pattern:
  - Identity verification (biometric)
  - Credential storage (encrypted at rest)
  - Credential retrieval (after verification)
  - Authentication (using retrieved credential)

  Missing ANY component makes the feature unusable.
- Question the Flow: Ask "How does this actually work end-to-end?"
  - User locks vault → reboot → unlock with biometric
  - Where does the PIN come from if vault is locked?
  - If it's stored, where? If it's not stored, how does biometric help?
- Platform Patterns Have Standard Solutions:
  - Windows biometric → Windows DPAPI for credential storage
  - macOS biometric → macOS Keychain for credential storage
  - Linux biometric → PAM/polkit patterns

  These are well-established patterns that should be researched and followed.
- Security-Critical Features Need Complete Design:
  - Authentication mechanisms can't be incrementally implemented
  - Must design complete flow before coding: enrollment → storage → retrieval → usage
  - Each component must be secure individually and in combination
- Test the User Flow: Walk through the actual user experience:
  - Setup vault → Enable biometric → Lock → Unlock with biometric
  - If this flow doesn't work without additional steps, the feature is broken
Prevention:
- ✅ Document biometric authentication pattern in AGENTS.md
- ✅ Add comprehensive integration tests for complete biometric flow
- ✅ Created per-vault credential storage to support multiple vaults
- ⏳ TODO: Create architecture decision records (ADRs) for authentication patterns
- ⏳ TODO: Add security design review checklist for authentication features
- ⏳ TODO: Document all platform-specific credential storage patterns
When implementing authentication or security features:
- Architecture First: Design complete flow before writing code
  - Draw sequence diagrams
  - Identify all required components
  - Document data storage requirements
  - Plan error handling and edge cases
- Research Platform Patterns: Don't invent authentication patterns
  - How do native apps do this on this platform?
  - What are the OS-provided secure storage mechanisms?
  - What are the standard libraries/crates for this?
- Security Review: Authentication features need:
  - Threat model
  - Storage encryption strategy
  - Key management plan
  - Attack surface analysis
- Complete Testing: Test the full user journey:
  - Setup flow
  - Normal operation
  - Error cases
  - Recovery scenarios
  - Security properties
Tests Added:
- `tests/biometric_integration_test.rs`: 7 tests for complete biometric unlock flow
- Unit tests for `CredentialStore`: PIN storage, retrieval, encryption roundtrip
Impact:
- Development Time: Feature was partially implemented, then required complete redesign
- User Impact: Feature appeared to work but was completely non-functional for the intended use case
- Code Quality: Missing a fundamental architectural component
- Security: Original design would have required storing unencrypted PINs or keeping vault always unlocked
- Documentation: Exposed gap in documenting authentication patterns
Windows Hello biometric prompt was appearing behind the Vult application window instead of on top, making it impossible for users to interact with it.
Desktop apps MUST use IUserConsentVerifierInterop with HWND parameter, not the UWP API.
The windows-rs crate's UserConsentVerifier::RequestVerificationAsync() is the UWP API, which doesn't support window parenting. Desktop applications need the IUserConsentVerifierInterop::RequestVerificationForWindowAsync(HWND, message) interface to parent the modal to the application window.
```rust
window.set_always_on_top(true)?;
// Call Windows Hello...
window.set_always_on_top(false).ok();
```
Result: Modal completely blocked behind window, completely inaccessible.
```rust
window.set_focus().ok();
tokio::time::sleep(tokio::time::Duration::from_millis(100)).await;
// Call Windows Hello...
```
Result: Modal still appeared under the window.
```rust
window.set_always_on_top(false).ok();
window.request_user_attention(Some(tauri::UserAttentionType::Informational)).ok();
window.set_focus().ok();
tokio::time::sleep(tokio::time::Duration::from_millis(200)).await;
auth_manager.vault().auth().unlock_with_biometric(&message).await?;
window.set_focus().ok();
```
Result: User confirmed this doesn't work. Window management APIs don't control Windows Hello modal z-order.
Microsoft documentation clearly states: Desktop applications MUST use IUserConsentVerifierInterop::RequestVerificationForWindowAsync(HWND, message) instead of the UWP RequestVerificationAsync(message).
Implementation:
1. Add Dependencies (`Cargo.toml`):
```toml
raw-window-handle = { version = "0.6", optional = true }
windows = { version = "0.58", features = [
    "Win32_System_WinRT",  # For IUserConsentVerifierInterop
    # ... other features
]}
```
2. Extract HWND from Tauri Window (`src/gui/commands.rs`):
```rust
use raw_window_handle::{HasWindowHandle, RawWindowHandle};

// Pull the raw Win32 handle out of the Tauri window; None on other platforms.
let hwnd = window
    .window_handle()
    .ok()
    .and_then(|handle| match handle.as_raw() {
        RawWindowHandle::Win32(win32_handle) => Some(win32_handle.hwnd.get() as isize),
        _ => None,
    });
if let Some(hwnd_value) = hwnd {
    auth_manager.vault().auth()
        .unlock_with_biometric_with_window(&message, hwnd_value).await?;
}
```
3. Use Desktop Interop API (`src/biometric/windows_hello.rs`):
```rust
use windows::Foundation::IAsyncOperation;
use windows::Security::Credentials::UI::{UserConsentVerificationResult, UserConsentVerifier};
use windows::Win32::Foundation::HWND;
use windows::Win32::System::WinRT::IUserConsentVerifierInterop;

async fn verify(&self, message: &str) -> Result<bool> {
    let message_hstring = windows::core::HSTRING::from(message);
    if let Some(hwnd_value) = self.window_handle {
        let hwnd = HWND(hwnd_value as *mut _);
        // Get the desktop-specific interface
        let factory: IUserConsentVerifierInterop =
            windows::core::factory::<UserConsentVerifier, IUserConsentVerifierInterop>()
                .map_err(|_| VaultError::BiometricFailed)?;
        // Use the desktop API with HWND parameter
        let async_op: IAsyncOperation<UserConsentVerificationResult> = unsafe {
            factory.RequestVerificationForWindowAsync(hwnd, &message_hstring)
                .map_err(|_| VaultError::BiometricFailed)?
        };
        // get() blocks the calling thread until the prompt completes
        let result = async_op.get().map_err(|_| VaultError::BiometricFailed)?;
        Ok(map_verification_result(result))
    } else {
        // Fallback to UWP API
        let async_op = UserConsentVerifier::RequestVerificationAsync(&message_hstring)
            .map_err(|_| VaultError::BiometricFailed)?;
        let result = async_op.get().map_err(|_| VaultError::BiometricFailed)?;
        Ok(map_verification_result(result))
    }
}
```
- Proper Desktop API: Uses Windows Desktop-specific interface instead of UWP
- Window Parenting: HWND parameter establishes parent-child relationship
- Z-Order Management: Windows automatically handles modal layering when given HWND
- No Window Choreography Needed: No delays, no focus tricks, just proper API usage
- Check for Desktop vs UWP APIs: Many Windows APIs have separate desktop variants
- Window Management APIs Don't Control System Modals: set_always_on_top, request_user_attention, etc. don't affect Windows Hello
- HWND is Critical: Desktop apps need to pass window handles to system APIs for proper parenting
- Read Microsoft Documentation: The docs clearly specify which API to use for desktop apps
- Don't Rely on Workarounds: Proper API usage is simpler and more reliable than window management hacks
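The choice between the parented desktop call and the unparented fallback is a small, pure decision, so it can be isolated and unit-tested away from the Windows APIs. A minimal sketch, assuming a hypothetical `PromptPath` enum and `select_prompt_path` helper that do not exist in the project:

```rust
/// Which Windows Hello entry point to call. Names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PromptPath {
    /// Desktop interop call, parented to the given raw HWND value.
    ParentedToWindow(isize),
    /// UWP-style call with no owner window (the prompt may open behind the app).
    Unparented,
}

/// Picks the prompt path from an optional raw window handle.
fn select_prompt_path(window_handle: Option<isize>) -> PromptPath {
    match window_handle {
        // A null HWND cannot act as a parent, so treat 0 like a missing handle.
        Some(hwnd) if hwnd != 0 => PromptPath::ParentedToWindow(hwnd),
        _ => PromptPath::Unparented,
    }
}
```

Logging whenever `Unparented` is selected would make a missing window handle visible during development, rather than leaving it to show up as a prompt hidden behind the window.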
- Microsoft UserConsentVerifier Docs: https://learn.microsoft.com/en-us/uwp/api/windows.security.credentials.ui.userconsentverifier
- IUserConsentVerifierInterop (Desktop API): Mentioned in "Desktop apps should use RequestVerificationForWindowAsync"
- raw-window-handle crate: https://docs.rs/raw-window-handle/
- Tauri Window Handle: https://docs.rs/tauri/latest/tauri/struct.Window.html#method.window_handle