✨ Feature: Add Caching System for Images and Pages #27

@dannycab

Feature Request

Implement caching to avoid re-downloading the same images and pages across runs.

Current Behavior

  • Re-downloads images even if already present
  • Re-fetches wiki pages even if unchanged
  • Wastes bandwidth and time

Proposed Solution

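from diskcache import Cache

class CachedHTTPClient:
    def __init__(self, cache_dir='.wikiaccess_cache', ttl=3600):
        self.cache = Cache(cache_dir)  # persistent on-disk cache
        self.ttl = ttl  # seconds before a cached entry expires

    def get_page_raw(self, page_id: str):
        cache_key = f'page:{page_id}'
        # A single get() avoids the window in which an entry could expire
        # between a membership test and the read.
        content = self.cache.get(cache_key)
        if content is not None:
            return content

        # Cache miss: fetch from the wiki (existing fetch logic, not shown)
        content = self._fetch_from_wiki(page_id)
        self.cache.set(cache_key, content, expire=self.ttl)
        return content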
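
The same idea could be applied to images. The following is only a sketch using the standard library: `cached_image_path`, `get_image`, the `images` subdirectory, and the injected `download` callable are hypothetical names, not existing wikiaccess code. It stores each image under a hash of its URL and skips the download when the file is already present.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_image_path(url: str, cache_dir: str) -> Path:
    """Map an image URL to a stable file path inside the cache directory."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    # Keep the original extension (query string ignored) so viewers still work.
    suffix = Path(url.split("?", 1)[0]).suffix
    return Path(cache_dir) / "images" / f"{digest}{suffix}"


def get_image(url: str, cache_dir: str, download: Callable[[str], bytes]) -> Path:
    """Return the local path for `url`, downloading only on a cache miss.

    `download` is an assumed callable that takes a URL and returns the bytes.
    """
    path = cached_image_path(url, cache_dir)
    if path.exists():
        return path  # already cached: no network access

    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so an interrupted download
    # never leaves a truncated file that looks like a valid entry.
    tmp = path.with_name(path.name + ".part")
    tmp.write_bytes(download(url))
    tmp.replace(path)
    return path
```

Hashing the URL avoids filename collisions between images that share a basename on different pages. This sketch never expires images; a TTL or revalidation step would still be needed for images that change upstream.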

Features

  • Cache with TTL (configurable expiration)
  • Cache invalidation command
  • Conditional requests (If-Modified-Since headers)
  • Cache statistics
  • Respect Cache-Control headers
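
As a rough illustration of how TTL, conditional requests, and statistics might fit together, here is a minimal in-memory sketch. Everything in it is an assumption for discussion: `CacheEntry`, `ConditionalFetcher`, and the `transport` callable (which takes a URL and request headers and returns status, response headers, and body) are hypothetical names. It also assumes response header names arrive with canonical capitalization, and it does not parse `Cache-Control`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Assumed transport shape: (url, request_headers) -> (status, response_headers, body)
Transport = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


@dataclass
class CacheEntry:
    body: bytes
    last_modified: Optional[str]  # Last-Modified value from the response
    etag: Optional[str]           # ETag value from the response
    stored_at: float              # time the entry was stored or revalidated


class ConditionalFetcher:
    def __init__(self, transport: Transport, ttl: float = 3600.0):
        self.transport = transport
        self.ttl = ttl
        self.entries: Dict[str, CacheEntry] = {}
        # Simple counters that a stats command could report.
        self.hits = 0
        self.revalidated = 0
        self.misses = 0

    def get(self, url: str, now: Optional[float] = None) -> bytes:
        now = time.time() if now is None else now
        entry = self.entries.get(url)

        # Fresh entry: serve from cache without contacting the server.
        if entry is not None and now - entry.stored_at < self.ttl:
            self.hits += 1
            return entry.body

        # Stale entry: ask the server whether the content changed.
        headers: Dict[str, str] = {}
        if entry is not None:
            if entry.last_modified:
                headers["If-Modified-Since"] = entry.last_modified
            if entry.etag:
                headers["If-None-Match"] = entry.etag

        status, resp_headers, body = self.transport(url, headers)

        # 304 Not Modified: reuse the cached body and restart the TTL.
        if status == 304 and entry is not None:
            entry.stored_at = now
            self.revalidated += 1
            return entry.body

        # New or changed content: replace the entry.
        self.misses += 1
        self.entries[url] = CacheEntry(
            body=body,
            last_modified=resp_headers.get("Last-Modified"),
            etag=resp_headers.get("ETag"),
            stored_at=now,
        )
        return body
```

A 304 response carries no body, so revalidating an unchanged page costs one small round trip instead of a full download. If `requests-cache` is chosen as the dependency, much of this may already be handled by the library, so a hand-written version like this might be unnecessary.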

Tasks

  • Add diskcache or requests-cache dependency
  • Implement page caching
  • Implement image caching
  • Add cache configuration options
  • Add wikiaccess cache clear command
  • Add wikiaccess cache stats command
  • Add --no-cache flag for fresh fetches
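
One possible shape for the command-line side of these tasks, sketched with `argparse` from the standard library. The `--cache-dir` and `--cache-ttl` options, their defaults, and the helper names `build_parser`, `cache_stats`, and `cache_clear` are assumptions for illustration; the real CLI may be structured differently. The helpers work at the filesystem level and do not depend on the caching library that is eventually chosen.

```python
import argparse
import shutil
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="wikiaccess")
    # Global options (names other than --no-cache are hypothetical).
    parser.add_argument("--no-cache", action="store_true",
                        help="bypass the cache and fetch everything fresh")
    parser.add_argument("--cache-dir", default=".wikiaccess_cache",
                        help="directory where cached pages and images live")
    parser.add_argument("--cache-ttl", type=int, default=3600,
                        help="seconds before a cached entry expires")

    sub = parser.add_subparsers(dest="command")
    cache = sub.add_parser("cache", help="inspect or clear the cache")
    cache_sub = cache.add_subparsers(dest="cache_command", required=True)
    cache_sub.add_parser("clear", help="delete all cached data")
    cache_sub.add_parser("stats", help="show cache size and entry count")
    return parser


def cache_stats(cache_dir: str) -> dict:
    """Count files and total bytes under the cache directory."""
    root = Path(cache_dir)
    files = [p for p in root.rglob("*") if p.is_file()] if root.exists() else []
    return {"files": len(files), "bytes": sum(p.stat().st_size for p in files)}


def cache_clear(cache_dir: str) -> None:
    """Remove the cache directory; a missing directory is not an error."""
    shutil.rmtree(cache_dir, ignore_errors=True)


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    if args.command == "cache":
        if args.cache_command == "clear":
            cache_clear(args.cache_dir)
        elif args.cache_command == "stats":
            stats = cache_stats(args.cache_dir)
            print(f"{stats['files']} files, {stats['bytes']} bytes")
    return 0
```

With this layout, `wikiaccess cache stats` and `wikiaccess cache clear` map to the two subcommands, and `--no-cache` is a global flag placed before any subcommand. Hit and miss counts would have to come from the cache implementation itself, since they cannot be derived from file sizes.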

Metadata

Labels

priority/medium: Medium priority - Important but not urgent
type/feature: New feature request