feat: performance optimizations and test coverage improvements #2
Merged
Apply betteralign to reorder struct fields for optimal memory alignment. This reduces memory usage by ~728 bytes across 26 structs in:
- checker (4 structs)
- parser (2 structs)
- fixer (3 structs)
- output (14 structs)
- ui (3 structs)
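To illustrate why field order matters, here is a minimal sketch with two hypothetical structs (`padded` and `compact` are not types from this repository). They hold the same fields, but the second orders them from largest to smallest alignment so the compiler inserts less padding.

```go
package main

import (
	"fmt"
	"unsafe"
)

// padded is a hypothetical struct whose field order forces the compiler
// to insert padding around the int64 on 64-bit platforms.
type padded struct {
	ok      bool  // 1 byte, followed by padding up to the int64 alignment
	retries int64 // 8 bytes
	done    bool  // 1 byte, followed by trailing padding
}

// compact holds the same fields, ordered from largest to smallest alignment.
type compact struct {
	retries int64 // 8 bytes
	ok      bool  // 1 byte
	done    bool  // 1 byte, followed by trailing padding
}

// sizes reports the in-memory size of both layouts.
func sizes() (uintptr, uintptr) {
	return unsafe.Sizeof(padded{}), unsafe.Sizeof(compact{})
}

func main() {
	p, c := sizes()
	fmt.Printf("padded: %d bytes, compact: %d bytes\n", p, c)
}
```

On common 64-bit platforms the first layout occupies 24 bytes and the second 16. Savings of this kind, summed across many structs, are what a tool like betteralign reports.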
Split cmd/check.go (642 lines) into focused files:
- cmd/check.go (260 lines) - command definition and main flow
- cmd/check_output.go (171 lines) - output formatting functions
- cmd/check_print.go (162 lines) - text printing functions

Extract shared utilities:
- cmd/helpers.go (147 lines) - shared CLI helpers (CreateFilter, CountUniqueURLs, etc.)
- internal/helpers/helpers.go (37 lines) - generic utilities (TruncateText, TruncateURL)

Remove code duplication:
- Consolidated createFilter/createFixFilter/createInteractiveFilter into CreateFilter
- Consolidated countUniqueURLs/countFixUniqueURLs into CountUniqueURLs
- Added FilterParserLinks and ConvertParserLinks helpers

Add documentation:
- Package-level docs for ui package
- Function docs for state handlers and renderers in ui/app.go
- Enhanced docs for checker and parser packages
- Update default settings for faster checking:
  - Concurrency: 10 → 50
  - Timeout: 10s → 5s
  - MaxRetries: 2 → 1
  - MaxRedirects: 10 → 5
- Optimize HTTP transport:
  - Increase connection pool sizes (500 idle, 50 per host)
  - Enable HTTP/2 with ForceAttemptHTTP2
  - Reduce idle timeouts for faster connection cycling
- Add --stats flag for performance statistics:
  - Timing breakdown (scan/parse/check phases)
  - Throughput metrics (URLs/sec, avg response time)
  - Memory usage (heap, allocations, GC cycles)
  - Included in JSON/YAML output when requested
- Memory optimizations:
  - Pre-defined strings for status methods (avoid allocations)
  - Pre-allocated slice capacities in filter functions
  - Parallel file parsing with worker pool (bounded by NumCPU)
- Performance improvement: 84s → 14s for 755 URLs (~6x faster)
- Add tests for internal/helpers package (100% coverage):
  - TruncateText: empty, whitespace, length edge cases, unicode
  - TruncateURL: empty, shorter/equal/longer than max
  - CountUniqueStrings: nil, empty, unique, duplicates
- Add tests for internal/stats package (100% coverage):
  - Phase tracking (scan, parse, check)
  - Duration calculations
  - Throughput metrics (URLs/sec, avg response time)
  - FormatDuration and FormatBytes formatting
  - String output and JSON serialization
- Add guard clause to TruncateText for maxLen < 4
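A guard clause for small `maxLen` values is needed because an ellipsis takes three characters, so `maxLen-3` would otherwise produce an empty or negative slice bound. The sketch below shows one possible shape. The lowercase `truncateText` is a hypothetical stand-in, and its exact behavior (trimming whitespace, counting runes, returning a bare prefix when `maxLen < 4`) is assumed rather than taken from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// truncateText shortens s to at most maxLen runes, appending "..." when
// there is room for it. Behavior for maxLen < 4 is an assumption.
func truncateText(s string, maxLen int) string {
	s = strings.TrimSpace(s)
	runes := []rune(s) // count runes, not bytes, so multi-byte text is safe
	if len(runes) <= maxLen {
		return s
	}
	// Guard clause: with fewer than 4 runes there is no room for a
	// character plus the ellipsis, so return a plain prefix.
	if maxLen < 4 {
		if maxLen < 0 {
			maxLen = 0
		}
		return string(runes[:maxLen])
	}
	return string(runes[:maxLen-3]) + "..."
}

func main() {
	fmt.Println(truncateText("hello world", 8))
	fmt.Println(truncateText("hello", 3))
}
```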
- Add testdata files for parallel processing tests (5 files)
- Add tests for parallel link extraction with 3+ files
- Add comprehensive HTML link edge case tests
- Add position tracking tests for various link types
- Add code block handling tests
- Add empty content and edge case tests

Coverage improvements:
- extractLinksParallel: 0% → 100%
- Overall parser: 79.9% → 95.1%
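For context on what the parallel extraction tests exercise, the following sketch shows a worker pool bounded by `runtime.NumCPU()`. It is an assumption-laden illustration: `extractAll`, `extractLinks`, and the simplified `linkRe` pattern are hypothetical and stand in for the real parser.

```go
package main

import (
	"fmt"
	"regexp"
	"runtime"
	"sync"
)

// linkRe is a deliberately simplified stand-in for a real Markdown parser.
var linkRe = regexp.MustCompile(`\]\((https?://[^)\s]+)\)`)

// extractLinks returns the inline link targets found in content.
func extractLinks(content string) []string {
	var links []string
	for _, m := range linkRe.FindAllStringSubmatch(content, -1) {
		links = append(links, m[1])
	}
	return links
}

// extractAll parses docs with at most runtime.NumCPU() workers and
// returns the results in input order.
func extractAll(docs []string) [][]string {
	results := make([][]string, len(docs))
	workers := runtime.NumCPU()
	if workers > len(docs) {
		workers = len(docs)
	}
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one goroutine, so no lock is needed.
				results[i] = extractLinks(docs[i])
			}
		}()
	}
	for i := range docs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	docs := []string{"[a](https://a.example)", "no links here"}
	fmt.Println(extractAll(docs))
}
```

Writing each result into its own slice index keeps the output order deterministic, which makes a function like this straightforward to test with several input files.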
- Remove '(coming soon)' from Homebrew section - it's now available
- Update default values to match optimized settings:
  - concurrency: 10 → 50
  - timeout: 10s → 5s
  - retries: 2 → 1
- Add --stats flag documentation to flags tables
- Fix Go version in CI example (1.24)
Performance optimizations with benchmark validation:
checker.go:
- Replace crypto/rand with math/rand/v2 for backoff jitter (22x faster)
- Pre-allocate maps/slices in Check() deduplication
- Pre-allocate redirect chain slice (typical 1-4 hops)
- Reduce response body drain limit 1MB → 64KB
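Backoff jitter only needs to spread retries out, not to be unpredictable to an attacker, which is why a non-cryptographic generator is sufficient. A minimal sketch of such a delay function using `math/rand/v2` follows. The name `backoffDelay`, the doubling schedule, and the 50% jitter ratio are assumptions for illustration.

```go
package main

import (
	"fmt"
	"math/rand/v2"
	"time"
)

// backoffDelay returns an exponential delay for the given attempt, capped at
// maxDelay, plus up to 50% random jitter. The schedule is an assumption.
func backoffDelay(attempt int, base, maxDelay time.Duration) time.Duration {
	if base <= 0 {
		return 0
	}
	d := base
	for i := 0; i < attempt && d < maxDelay; i++ {
		d *= 2
	}
	if d > maxDelay {
		d = maxDelay
	}
	// rand.Int64N uses the package-level generator: no syscall and no
	// allocation, unlike reading bytes from crypto/rand.
	jitter := time.Duration(rand.Int64N(int64(d)/2 + 1))
	return d + jitter
}

func main() {
	for attempt := 0; attempt < 4; attempt++ {
		fmt.Println(attempt, backoffDelay(attempt, 100*time.Millisecond, 5*time.Second))
	}
}
```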
result.go:
- Combine double iteration in Summarize() into single pass
- Use struct{} for seen map to reduce memory
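As an illustration of the two result.go changes, the sketch below counts statuses and tracks unique URLs in one loop, using `map[string]struct{}` as a set so the values occupy no memory. The `Result` and `Summary` types and the status classification are hypothetical and much simpler than the real ones.

```go
package main

import "fmt"

// Result is a hypothetical, minimal check result.
type Result struct {
	URL    string
	Status int
}

// Summary is a hypothetical aggregate over a slice of results.
type Summary struct {
	Total, Alive, Dead, UniqueURLs int
}

// summarize computes all counters in a single pass over results.
func summarize(results []Result) Summary {
	// struct{} values are zero-sized, so the map stores only the keys.
	seen := make(map[string]struct{}, len(results))
	s := Summary{Total: len(results)}
	for _, r := range results {
		if r.Status >= 200 && r.Status < 400 {
			s.Alive++
		} else {
			s.Dead++
		}
		seen[r.URL] = struct{}{}
	}
	s.UniqueURLs = len(seen)
	return s
}

func main() {
	fmt.Printf("%+v\n", summarize([]Result{
		{URL: "https://a.example", Status: 200},
		{URL: "https://b.example", Status: 404},
	}))
}
```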
parser.go:
- Move refDefRegex to package level (avoid recompile per call)
- Replace bytes.Split with bytes.IndexByte (avoid allocating line slice)
- Pre-allocate links slice with estimated capacity
- Pre-allocate line index with estimated lines
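The first two parser.go items can be sketched together: a regex compiled once at package level, and a line scanner built on `bytes.IndexByte` that yields sub-slices of the input instead of allocating a `[][]byte`. The pattern below is a simplified assumption, not the repository's actual `refDefRegex`, and `forEachLine` and `extractRefDefs` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"regexp"
)

// refDefRegex is compiled once at package initialization rather than on
// every call. The pattern is a simplified assumption.
var refDefRegex = regexp.MustCompile(`^\s{0,3}\[([^\]]+)\]:\s*(\S+)`)

// forEachLine calls fn for every line in data. Each line is a sub-slice of
// data, so no per-line slice header array is allocated.
func forEachLine(data []byte, fn func(lineNo int, line []byte)) {
	lineNo := 1
	for len(data) > 0 {
		var line []byte
		if i := bytes.IndexByte(data, '\n'); i < 0 {
			line, data = data, nil
		} else {
			line, data = data[:i], data[i+1:]
		}
		fn(lineNo, line)
		lineNo++
	}
}

// extractRefDefs collects reference definitions of the form "[label]: url".
func extractRefDefs(data []byte) map[string]string {
	defs := make(map[string]string)
	forEachLine(data, func(_ int, line []byte) {
		if m := refDefRegex.FindSubmatch(line); m != nil {
			defs[string(m[1])] = string(m[2])
		}
	})
	return defs
}

func main() {
	fmt.Println(extractRefDefs([]byte("[docs]: https://example.com/docs\nplain text\n")))
}
```

Unlike `bytes.Split`, this scanner does not report an empty final line after a trailing newline, so callers that rely on that extra element would need to account for the difference.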
cmd/helpers.go:
- Pre-allocate filter result slices with estimated ratios
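Pre-allocating with an estimated ratio might look like the sketch below. The `Result` type, the `filterDead` helper, and the one-quarter estimate are all hypothetical; the PR does not state which ratios it uses.

```go
package main

import "fmt"

// Result is a hypothetical, minimal check result.
type Result struct {
	URL    string
	Status int
}

// filterDead keeps results with a status of 400 or above. The capacity is
// sized for roughly a quarter of the input, an assumed ratio, so that
// append rarely has to grow the backing array.
func filterDead(results []Result) []Result {
	out := make([]Result, 0, len(results)/4+1)
	for _, r := range results {
		if r.Status >= 400 {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	fmt.Println(filterDead([]Result{{"https://a.example", 200}, {"https://b.example", 404}}))
}
```

If the estimate is too low, `append` still grows the slice as usual, so a wrong ratio costs performance rather than correctness.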
output/markdown.go:
- Pre-grow string builder based on result count
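Pre-growing a `strings.Builder` reserves the buffer once instead of reallocating as rows are appended. A minimal sketch, assuming a hypothetical `renderMarkdown` function, table layout, and a guessed average of 64 bytes per row:

```go
package main

import (
	"fmt"
	"strings"
)

const header = "| URL |\n| --- |\n"

// renderMarkdown writes one table row per URL. The per-row size estimate is
// an assumption; Grow only sets a minimum capacity, so a low guess is safe.
func renderMarkdown(urls []string) string {
	const estimatedBytesPerRow = 64
	var b strings.Builder
	b.Grow(len(header) + len(urls)*estimatedBytesPerRow)
	b.WriteString(header)
	for _, u := range urls {
		b.WriteString("| ")
		b.WriteString(u)
		b.WriteString(" |\n")
	}
	return b.String()
}

func main() {
	fmt.Print(renderMarkdown([]string{"https://a.example", "https://b.example"}))
}
```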
Benchmark improvements:
- BenchmarkBackoffDelay: 206.9ns → 9.2ns (22x faster, 0 allocs vs 3)
- BenchmarkExtractRefDefs: 92μs → 83μs (10% faster, 55% fewer bytes)
- BenchmarkSummarize_Small: 6.8μs → 3.9μs (43% faster)
- BenchmarkBuildLineIndex: 74μs → 70μs (5% faster, 90% fewer allocs)
- BenchmarkFilterResultsWarnings: 24μs → 12μs (50% faster, 55% less memory)
Summary
This PR introduces significant performance optimizations, improved test coverage, and documentation updates for the Gone dead link checker.
Changes
Performance Optimizations
- Replace crypto/rand with math/rand/v2 for jitter calculation
- Replace bytes.Split with bytes.IndexByte

Test Coverage
- stats and helpers packages (100% coverage)

Documentation
- --stats flag documentation

Benchmark Results
- BackoffDelay
- ExtractRefDefs
- Summarize_Small
- BuildLineIndex
- FilterResultsWarnings

Commits
- perf: optimize allocations and reduce hot path overhead
- docs: update README with correct defaults and Homebrew availability
- test: improve parser package coverage from 79.9% to 95.1%
- test: add comprehensive tests for stats and helpers packages
- chore: add gone-test to gitignore
- perf: optimize performance with new defaults and stats tracking
- refactor: reorganize code for better maintainability
- perf: optimize struct field alignment for better memory usage

Testing