Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Store image provider cache in database to speed up application restarts #426

Merged
merged 9 commits into from
Feb 2, 2025

Conversation

tphakala
Copy link
Owner

@tphakala tphakala commented Feb 2, 2025

Persistent Caching for Thumbnail Image Metadata

This PR implements a persistent caching mechanism for thumbnail image metadata in the database, improving application performance and reducing external API calls.

Key Changes

  • Implements database-backed caching for image metadata (URL, license, author info)
  • Cache entries are automatically refreshed after 14 days
  • Graceful fallback to live fetching if cache fails
  • Memory-efficient storage with size estimation
  • Thread-safe implementation with proper error handling

Benefits

  • Reduces application startup time by avoiding redundant API calls
  • Minimizes external API usage and network traffic
  • Improves reliability by maintaining service during API outages
  • Ensures metadata freshness through automatic refresh mechanism

Technical Details

  • Uses GORM for database operations
  • Implements automatic background refresh routine
  • Includes comprehensive test suite with mock providers
  • Handles edge cases (database failures, nil store scenarios)
  • Cross-platform compatible implementation

- Introduced ImageCache model to store image metadata
- Added methods to get, save, and retrieve image cache entries
- Updated database auto-migration to include ImageCache model
- Implemented methods in datastore interface for image caching operations
…or handling

- Implemented robust image caching with database and memory support
- Added background cache refresh mechanism with configurable TTL
- Improved error handling in image fetching and caching
- Simplified image provider interface and cache management
- Added debug logging and metrics for image retrieval process
- Added mock implementations for datastore and error handling
- Introduced tests for database failure scenarios
- Implemented tests for cache refresh and nil store handling
- Enhanced test coverage for image caching functionality
- Added test cases for memory usage and size estimation
- Avoid unnecessary struct copying in image cache initialization and refresh
- Use pointer references to improve memory efficiency
- Simplified stale entry tracking in cache refresh mechanism
- Updated method signatures for GetClipsQualifyingForRemoval and UpdateNoteComment
- Reordered and clarified parameter order for better readability
- Maintained consistent method signature style in mock store implementation
… and test coverage

- Added detailed debug logging for cache refresh and mock store operations
- Improved test case for cache refresh with more comprehensive checks
- Shortened refresh intervals and delays for more responsive testing
- Added logging and error handling in cache initialization and refresh routines
- Ensured robust testing of image cache refresh mechanism
Copy link
Contributor

coderabbitai bot commented Feb 2, 2025

Walkthrough

The pull request updates the bird image caching workflow. In internal/analysis/realtime.go, the cache initialization now passes a datastore parameter, introduces a semaphore to restrict concurrent fetches, and adds improved logging. The datastore layer is expanded with new methods and a new ImageCache model, along with corresponding auto-migration support. The image provider module is reorganized to consolidate provider management, add cache refresh capabilities with TTL-based refresh logic, and enhance error handling. Test cases are updated with new mock stores, and error messages in the Wikipedia integration are made more generic.

Changes

File(s) Summary of Changes
internal/analysis/realtime.go
internal/imageprovider/imageprovider.go
Updated function signatures to accept a datastore parameter; added concurrency control (semaphore) and cache refresh logic; merged provider fields; enhanced error handling and logging in cache initialization and image retrieval.
internal/datastore/interfaces.go
internal/datastore/manage.go
internal/datastore/model.go
Expanded datastore functionality by adding new methods (GetImageCache, SaveImageCache, GetAllImageCaches), a new ImageCache struct, and including ImageCache in auto-migration.
internal/imageprovider/imageprovider_test.go Enhanced the test suite with new mock store types (mockStore, mockFailingStore) and tests covering normal and failure scenarios for image caching operations.
internal/imageprovider/wikipedia.go Modified error handling in Fetch, queryThumbnail, and queryAuthorInfo methods to return more generic, user-friendly error messages without exposing internal details.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Cache as BirdImageCache
    participant Store as Datastore
    participant Provider as Image Provider

    Client->>Cache: Request image for species
    alt Image in Memory
        Cache-->>Client: Return cached image
    else Not in Memory
        Cache->>Store: GetImageCache(species)
        alt Cache Found
            Store-->>Cache: Return cached entry
            Cache-->>Client: Return image
        else Cache Not Found
            Cache->>Provider: Fetch image for species
            Provider-->>Cache: Return fetched image
            Cache->>Store: SaveImageCache(new image)
            Cache-->>Client: Return image
        end
    end
Loading
sequenceDiagram
    participant Cache as BirdImageCache
    participant Store as Datastore
    participant Provider as Image Provider

    Cache->>Cache: startCacheRefresh routine
    Cache->>Store: GetAllImageCaches()
    alt Stale Entry Detected
        Cache->>Provider: refreshEntry(stale image)
        Provider-->>Cache: Return updated image
        Cache->>Store: SaveImageCache(updated image)
    else No Stale Entries
        Cache-->>Cache: Continue without refresh
    end
Loading

Possibly related PRs

  • Publish bird images in mqtt events #219: The changes in the main PR are related to the modifications in the initBirdImageCache function in the retrieved PR, as both involve updates to the initialization and management of the BirdImageCache.
  • refactor: improve thumbnails code, add cache size metrics for telemetry #220: The changes in the main PR are related to the modifications in the initBirdImageCache function in the retrieved PR, as both involve updates to the function signature and its handling of parameters, specifically the addition of the datastore.Interface and metrics *telemetry.Metrics.

Poem

Oh, I hop with glee in lines of code,
Where caches bloom down every node.
With semaphores keeping fetches light,
And stale entries renewed just right.
In fields of data, I cheer and play—
A rabbit’s delight in changes today! 🐇✨

✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (11)
internal/imageprovider/imageprovider.go (5)

35-43: Potential confusion with a single provider field.
Both SetNonBirdImageProvider and SetImageProvider overwrite the same provider field. If the intent is to handle separate providers (bird vs. non-bird), aligning the field(s) and setter methods more clearly would help prevent accidental overrides.

Also applies to: 53-60


62-67: Refresh interval constants may need environment-based configurations.
A hard-coded 1-second refresh interval and 100ms delay could be excessive in production or too slow if there are a large number of entries. Consider providing these values as configuration settings based on environment (dev vs. production).


69-94: Check for potential long-running refresh cycles.
Processing entries every second might scale poorly if the number of stale entries grows large. You could implement concurrency or a queue-based approach. Alternatively, ensure the batch size and interval remain appropriate under production loads.


183-189: Ensure Close() is idempotent.
Calling Close() multiple times can lead to a panic when closing an already-closed channel. You may want to guard it with a sync.Once or a conditional check to avoid repeated closes.


414-444: Duplicate logic between InitCache and CreateDefaultCache.
Both functions contain similar steps (instantiating BirdImageCache, loading images, starting refresh). To avoid code drift, consider refactoring one function to call the other or extract common logic into a helper function.

internal/imageprovider/imageprovider_test.go (4)

29-39: Time-based unique URL in tests can cause intermittent issues.
If tests run quickly or system time changes, the URL might not change as expected. Typically this is stable, but in some CI scenarios, injecting a short random string or increment might avoid edge cases more reliably than relying on UnixNano().


188-239: Excellent coverage of cache hits vs. misses.
Verifying fetchCounter ensures the in-memory cache is working as intended. Optionally, you could add concurrency tests to ensure correct behavior under parallel requests.


241-259: Check for more detailed error assertions.
Currently, the test only checks whether an error is returned. For more robust validation, consider verifying the specific error message or type.


412-480: Refresh routine test may be slow.
Sleeping for 5 seconds ensures you see a refresh, but it can slow down the test suite. Consider using a time mocking library or reducing the intervals for test builds to speed up automated testing.

internal/datastore/manage.go (1)

27-27: Include proper GORM tags or indexes for ImageCache.
AutoMigrate with &ImageCache{} is correct, but ensure the ImageCache model includes any needed indexes on ScientificName, especially if queries will filter on that field frequently.

internal/analysis/realtime.go (1)

327-329: Consider enhancing error handling in the goroutine.

While errors are logged, they might be important enough to track or report through metrics.

Consider tracking fetch failures through metrics:

 if _, err := birdImageCache.Get(name); err != nil {
     log.Printf("Failed to fetch image for %s: %v", name, err)
+    if metrics != nil {
+        metrics.ImageFetchFailures.Inc()
+    }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f88c4b and e31c5ba.

📒 Files selected for processing (7)
  • internal/analysis/realtime.go (1 hunks)
  • internal/datastore/interfaces.go (2 hunks)
  • internal/datastore/manage.go (1 hunks)
  • internal/datastore/model.go (1 hunks)
  • internal/imageprovider/imageprovider.go (5 hunks)
  • internal/imageprovider/imageprovider_test.go (8 hunks)
  • internal/imageprovider/wikipedia.go (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (14)
internal/imageprovider/imageprovider.go (1)

22-31: New BirdImage struct looks good for caching.
The newly added CachedAt field is particularly useful for detecting stale entries. Consider adding validations or an ID field if you need to query or index these entries more flexibly in the future.

internal/imageprovider/imageprovider_test.go (3)

41-122: mockStore implementation is comprehensive.
This well-structured mock provides crucial debug info. However, if test output becomes noisy, consider using a logger that can be silenced. Overall, this approach effectively emulates the datastore for tests.


151-180: mockFailingStore elegantly tests error handling.
Simulating failures in Get, Save, and GetAll helps ensure the cache fallback logic is robust. Great approach to increasing coverage on failure paths.


294-323: Memory usage test is insightful.
Measuring approximate usage is valuable for performance analysis. If available, running this test under load or with more entries might provide further assurance.

internal/datastore/model.go (1)

113-123: LGTM! Well-structured model definition.

The ImageCache model is well-designed with:

  • Appropriate field types for storing image metadata
  • Proper use of GORM tags for constraints (uniqueIndex on ScientificName)
  • Clear field naming that reflects the data purpose
internal/imageprovider/wikipedia.go (3)

151-151: LGTM! Improved error propagation.

Good change to pass through the user-friendly error from queryThumbnail instead of wrapping it.


164-165: LGTM! Better error abstraction.

Good improvement to use a generic error message instead of exposing internal error details to the user.


203-203: LGTM! User-friendly error messages.

The error messages are now more user-friendly and provide clear information about what went wrong:

  • No Wikipedia page found
  • No free-license image available
  • Image metadata not available

Also applies to: 211-211, 219-219

internal/analysis/realtime.go (2)

304-304: LGTM! Good concurrency control.

Using a semaphore to limit concurrent fetches to 5 is a good practice to prevent resource exhaustion.


312-314: LGTM! Efficient cache checking.

Good optimization to skip already cached images before launching goroutines.

internal/datastore/interfaces.go (4)

56-58: LGTM! Well-defined interface methods.

Good addition of methods for managing image cache entries with clear signatures.


874-887: LGTM! Robust implementation of GetImageCache.

Good implementation with:

  • Silent logging to reduce noise
  • Proper handling of "not found" case
  • Clear error wrapping

889-904: LGTM! Solid SaveImageCache implementation.

Good implementation with:

  • Input validation
  • Silent logging
  • Proper upsert logic using FirstOrCreate

906-914: LGTM! Clean GetAllImageCaches implementation.

Good implementation with:

  • Silent logging
  • Clear error handling

internal/imageprovider/imageprovider.go Outdated Show resolved Hide resolved
- Added comprehensive tests for concurrent image cache initialization
- Implemented robust handling of concurrent requests and initialization timeouts
- Enhanced mock image provider to simulate network latency and race conditions
- Added test cases for initialization failure and retry mechanisms
- Improved cache initialization logic with better synchronization and error handling
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
internal/imageprovider/imageprovider.go (2)

63-68: Consider increasing the refresh delay for production.

The refresh delay of 100ms between refreshing individual entries seems too aggressive for production use and might cause unnecessary load on external APIs.

Apply this diff to increase the delay:

-	refreshDelay     = 100 * time.Millisecond // Delay between refreshing individual entries (shortened for testing)
+	refreshDelay     = 1 * time.Second // Delay between refreshing individual entries

344-434: Extract maxAttempts as a constant.

For better maintainability and consistency with other constants, consider extracting the maximum number of initialization attempts as a constant.

Apply this diff to add the constant:

 const (
 	defaultCacheTTL  = 14 * 24 * time.Hour    // 14 days
 	refreshInterval  = 1 * time.Second        // How often to check for stale entries
 	refreshBatchSize = 10                     // Number of entries to refresh in one batch
 	refreshDelay     = 100 * time.Millisecond // Delay between refreshing individual entries
+	maxInitAttempts  = 3                      // Maximum number of initialization attempts
 )

 // In Get method:
-	maxAttempts := 3 // Maximum number of initialization attempts
+	maxAttempts := maxInitAttempts
internal/imageprovider/imageprovider_test.go (2)

119-169: Extract unused methods to a separate interface.

The mockStore implements many unused methods from the datastore.Interface. Consider extracting these methods to a separate interface to follow the Interface Segregation Principle.

Create a new interface in internal/datastore/interfaces.go:

// ImageCacheStore defines the interface for image cache operations
type ImageCacheStore interface {
    GetImageCache(scientificName string) (*ImageCache, error)
    SaveImageCache(cache *ImageCache) error
    GetAllImageCaches() ([]ImageCache, error)
}

// Update the BirdImageCache to use ImageCacheStore instead of Interface
type BirdImageCache struct {
    provider     ImageProvider
    dataMap      sync.Map
    metrics      *metrics.ImageProviderMetrics
    debug        bool
    store        ImageCacheStore
    logger       *log.Logger
    quit         chan struct{}
    Initializing sync.Map
}

500-661: Consider using table-driven tests for concurrency scenarios.

The concurrency tests could be more maintainable using table-driven tests, similar to how other tests in the file are structured.

Example structure:

func TestConcurrencyScenarios(t *testing.T) {
    tests := []struct {
        name           string
        fetchDelay    time.Duration
        shouldFail    bool
        numRequests   int
        expectedFetch int
        wantErr      bool
    }{
        {
            name:          "Concurrent initialization",
            fetchDelay:    200 * time.Millisecond,
            numRequests:   10,
            expectedFetch: 1,
            wantErr:      false,
        },
        {
            name:          "Initialization timeout",
            fetchDelay:    2 * time.Second,
            numRequests:   2,
            expectedFetch: 3,
            wantErr:      false,
        },
        {
            name:          "Initialization failure",
            shouldFail:    true,
            numRequests:   2,
            expectedFetch: 2,
            wantErr:      true,
        },
    }
    // ... test implementation
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e31c5ba and 817e8d5.

📒 Files selected for processing (2)
  • internal/imageprovider/imageprovider.go (5 hunks)
  • internal/imageprovider/imageprovider_test.go (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (7)
internal/imageprovider/imageprovider.go (5)

70-156: Well-implemented background refresh mechanism!

The implementation includes:

  • Proper goroutine management with quit channel
  • Batch processing of stale entries
  • Robust error handling
  • Debug logging

192-309: Robust initialization and loading mechanism!

The implementation includes:

  • Proper error handling
  • Graceful fallback to empty cache on errors
  • Debug logging

311-342: Well-implemented thread-safe initialization!

The implementation includes:

  • Proper concurrency control using sync.Map
  • Robust error handling
  • Nil provider check

436-465: Well-implemented fetch and store mechanism!

The implementation includes:

  • Proper error handling
  • Debug logging
  • Metrics tracking

497-527: Well-implemented default cache creation!

The implementation includes:

  • Proper error handling
  • Debug logging
  • Background refresh initialization
internal/imageprovider/imageprovider_test.go (2)

207-342: Well-implemented basic test suite!

The tests cover:

  • Basic cache functionality
  • Error handling
  • Memory usage calculation

344-498: Comprehensive database failure test suite!

The tests cover:

  • Database failure scenarios
  • Nil store handling
  • Cache refresh functionality

internal/imageprovider/imageprovider.go Outdated Show resolved Hide resolved
- Prevent potential nil pointer dereference in cache refresh method
- Add debug logging for cases where no image provider is available
- Improve robustness of image cache refresh mechanism
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
internal/imageprovider/imageprovider.go (7)

23-32: Add field documentation for BirdImage struct.

Consider adding documentation for each field to improve code maintainability.

 type BirdImage struct {
+    // URL is the direct link to the bird image
     URL            string
+    // ScientificName is the taxonomic name of the bird species
     ScientificName string
+    // LicenseName is the name of the license under which the image is available
     LicenseName    string
+    // LicenseURL is the URL to the full license text
     LicenseURL     string
+    // AuthorName is the name of the image author/photographer
     AuthorName     string
+    // AuthorURL is the URL to the author's profile or website
     AuthorURL      string
+    // CachedAt is the timestamp when this image was last cached
     CachedAt       time.Time
 }

63-68: Make cache configuration values configurable.

Consider moving these constants to the configuration system to allow runtime adjustment.

-const (
-    defaultCacheTTL  = 14 * 24 * time.Hour    // 14 days
-    refreshInterval  = 1 * time.Second        // How often to check for stale entries
-    refreshBatchSize = 10                     // Number of entries to refresh in one batch
-    refreshDelay     = 100 * time.Millisecond // Delay between refreshing individual entries
-)
+// CacheConfig holds the configuration for the image cache
+type CacheConfig struct {
+    TTL           time.Duration
+    RefreshInterval time.Duration
+    BatchSize     int
+    RefreshDelay  time.Duration
+}

97-156: Add exponential backoff for refresh retries.

The current implementation uses a fixed delay between refresh attempts. Consider implementing exponential backoff to handle transient failures more gracefully.

 func (c *BirdImageCache) refreshStaleEntries() {
+    retryCount := 0
+    maxRetries := 3
+    baseDelay := refreshDelay
     for i := 0; i < len(staleEntries); i += refreshBatchSize {
         // ... existing code ...
         for _, scientificName := range batch {
             select {
             case <-c.quit:
                 return
-            case <-time.After(refreshDelay):
+            case <-time.After(baseDelay * time.Duration(1 << uint(retryCount))):
                 if err := c.refreshEntry(scientificName); err != nil {
+                    if retryCount < maxRetries {
+                        retryCount++
+                        i-- // Retry this entry
+                        continue
+                    }
+                }
+                retryCount = 0 // Reset for next entry
                 // ... rest of the code ...

192-198: Improve shutdown handling.

Consider adding a timeout for graceful shutdown and waiting for ongoing operations to complete.

-func (c *BirdImageCache) Close() error {
+func (c *BirdImageCache) Close(timeout time.Duration) error {
+    done := make(chan struct{})
+    go func() {
+        // Wait for ongoing operations
+        c.dataMap.Range(func(key, value interface{}) bool {
+            if _, initializing := c.Initializing.Load(key.(string)); initializing {
+                time.Sleep(100 * time.Millisecond)
+            }
+            return true
+        })
         if c.quit != nil {
             close(c.quit)
         }
+        close(done)
+    }()
+
+    select {
+    case <-done:
+        return nil
+    case <-time.After(timeout):
+        return fmt.Errorf("shutdown timed out after %v", timeout)
+    }
 }

259-284: Add retry logic for database operations.

Consider implementing retry logic with exponential backoff for transient database errors.

+func (c *BirdImageCache) saveToDBWithRetry(image *BirdImage, maxRetries int) error {
+    var err error
+    for i := 0; i < maxRetries; i++ {
+        err = c.saveToDB(image)
+        if err == nil {
+            return nil
+        }
+        if !isTransientError(err) {
+            return err
+        }
+        time.Sleep(time.Duration(1<<uint(i)) * 100 * time.Millisecond)
+    }
+    return fmt.Errorf("failed to save to database after %d retries: %w", maxRetries, err)
+}
+
+func isTransientError(err error) bool {
+    // Add logic to identify transient database errors
+    return false
+}

352-442: Improve context propagation.

Consider accepting a context parameter in the Get method to allow better cancellation control from the caller.

-func (c *BirdImageCache) Get(scientificName string) (BirdImage, error) {
+func (c *BirdImageCache) Get(ctx context.Context, scientificName string) (BirdImage, error) {
     // ... existing code ...
     
     select {
-    case <-ctx.Done():
+    case <-ctx.Done():
+        return BirdImage{}, ctx.Err()
     case <-done:
         return result, fetchErr
     }
 }

475-493: Add memory limits to prevent OOM.

Consider implementing a maximum memory limit and an eviction policy for the cache.

+const maxMemoryUsage = 1 * 1024 * 1024 * 1024 // 1GB

 func (c *BirdImageCache) MemoryUsage() int {
     totalSize := 0
     c.dataMap.Range(func(key, value interface{}) bool {
         if img, ok := value.(*BirdImage); ok {
             totalSize += img.EstimateSize()
         }
         return true
     })
+    // Evict entries if memory usage exceeds limit
+    if totalSize > maxMemoryUsage {
+        c.evictOldest(totalSize - maxMemoryUsage)
+    }
     return totalSize
 }

+func (c *BirdImageCache) evictOldest(bytesToFree int) {
+    type entry struct {
+        key      string
+        cachedAt time.Time
+        size     int
+    }
+    var entries []entry
+    
+    // Collect entries
+    c.dataMap.Range(func(key, value interface{}) bool {
+        if img, ok := value.(*BirdImage); ok {
+            entries = append(entries, entry{
+                key:      key.(string),
+                cachedAt: img.CachedAt,
+                size:     img.EstimateSize(),
+            })
+        }
+        return true
+    })
+    
+    // Sort by age
+    sort.Slice(entries, func(i, j int) bool {
+        return entries[i].cachedAt.Before(entries[j].cachedAt)
+    })
+    
+    // Evict oldest entries until we free enough space
+    freed := 0
+    for _, e := range entries {
+        if freed >= bytesToFree {
+            break
+        }
+        c.dataMap.Delete(e.key)
+        freed += e.size
+    }
+}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 817e8d5 and 79f541b.

📒 Files selected for processing (1)
  • internal/imageprovider/imageprovider.go (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)

@tphakala tphakala merged commit 42b345e into main Feb 2, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant