Cache infer_schema() results per store path

## Summary

Querying a `ZarrTable` re-reads Zarr metadata files (`.zarray`, `.zattrs`, `zarr.json`, `.zmetadata`) and re-discovers array structure on each execution, even though `infer_schema()` has already run at table registration. For remote stores (S3/GCS), this adds multiple HTTP round-trips per query. The results should be cached, since Zarr store structure rarely changes during a session.

## Current Behavior

In `src/datasource/zarr.rs`, `ZarrTable::try_new()` calls `infer_schema()` once during table registration. However:

1. **Remote store metadata is fetched twice** — once during `try_new()` for schema, and again in `ZarrExec::execute()` via `discover_arrays()` for array structure (shapes, chunk sizes, coordinates)
2. **`discover_arrays()` is called per-execution** — `zarr_exec.rs:336` calls it every time a query runs against VirtualiZarr stores
3. **No cross-table caching** — If the same Zarr store is registered under different names or queried via different `ZarrTable` instances, schema inference runs independently

### Cost per `infer_schema()` call:

| Store Type | Operations | Estimated Latency |
|------------|-----------|-------------------|
| Local (v2) | Read `.zarray` + `.zattrs` per array, list directories | ~5-50ms |
| Local (v3) | Read `zarr.json` per array | ~5-50ms |
| Remote (S3/GCS) | LIST + GET per array (2-4 HTTP calls per array) | ~200-800ms |
| VirtualiZarr | Read + parse `.zmetadata` JSON | ~10-100ms |

For a store with 10 arrays, remote schema inference can take 2-8 seconds.
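
The 2-8 second figure follows from multiplying the per-array latency by the array count. A minimal sketch of that arithmetic, assuming the table's remote estimate is a per-array figure and the requests are issued one after another (`estimate_inference_ms` is a hypothetical helper for illustration, not part of the codebase):

```rust
/// Rough lower and upper bounds, in milliseconds, for sequential schema
/// inference. Hypothetical helper: it only multiplies the per-array
/// latency range by the number of arrays.
fn estimate_inference_ms(num_arrays: u64, per_array_ms: (u64, u64)) -> (u64, u64) {
    (num_arrays * per_array_ms.0, num_arrays * per_array_ms.1)
}
```

Calling `estimate_inference_ms(10, (200, 800))` gives `(2000, 8000)`, i.e. 2-8 seconds.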

## Proposed Change

Add a metadata cache keyed by store path:

```rust
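use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::time::Instant;

// Assumed import paths; adjust to match the crate's existing imports.
use datafusion::arrow::datatypes::SchemaRef;
use datafusion::common::Statistics;
use datafusion::error::Result;

// `ArrayInfo`, `CoordValues`, and `infer_and_discover()` are assumed to be
// provided elsewhere in this crate.

/// Cached Zarr store metadata
pub struct ZarrMetadataCache {
    // Entries sit behind `Arc` so callers get a cheap clone instead of a
    // reference tied to the lock guard.
    entries: RwLock<HashMap<String, Arc<CachedMetadata>>>,
}

pub struct CachedMetadata {
    schema: SchemaRef,
    array_info: Vec<ArrayInfo>,  // shapes, chunk sizes, dtypes
    coord_values: Vec<(String, CoordValues)>,  // pre-loaded coordinate values
    statistics: Statistics,
    cached_at: Instant,  // insertion time, used by the optional TTL
}

impl ZarrMetadataCache {
    /// Get or compute metadata for a store path
    pub async fn get_or_infer(&self, path: &str) -> Result<Arc<CachedMetadata>> {
        // Check cache first; the read guard is released at the end of this statement
        let hit = self.entries.read().unwrap().get(path).cloned();
        if let Some(cached) = hit {
            return Ok(cached);
        }
        // Infer outside the lock, then cache. If another task inserted the
        // same path in the meantime, keep and return its entry.
        let meta = Arc::new(infer_and_discover(path).await?);
        let mut entries = self.entries.write().unwrap();
        Ok(entries.entry(path.to_string()).or_insert(meta).clone())
    }

    /// Invalidate cache for a path (e.g., after data update)
    pub fn invalidate(&self, path: &str) {
        self.entries.write().unwrap().remove(path);
    }
}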
```
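
To make the hit/miss behavior concrete, here is a simplified, synchronous sketch that uses only the standard library. `MiniCache`, `fake_infer`, and the `String` payload are stand-ins invented for illustration; the real cache would hold `CachedMetadata` and go through the async inference path.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Stand-in for the expensive metadata read; counts how often it runs.
fn fake_infer(path: &str, calls: &AtomicUsize) -> String {
    calls.fetch_add(1, Ordering::SeqCst);
    format!("schema for {path}")
}

/// Hypothetical, simplified cache: store path -> shared metadata.
#[derive(Default)]
pub struct MiniCache {
    entries: RwLock<HashMap<String, Arc<String>>>,
    pub infer_calls: AtomicUsize,
}

impl MiniCache {
    pub fn get_or_infer(&self, path: &str) -> Arc<String> {
        // Fast path: clone the Arc out so the read lock is released right away.
        let hit = self.entries.read().unwrap().get(path).cloned();
        if let Some(meta) = hit {
            return meta;
        }
        // Slow path: infer outside the lock, then insert unless another
        // thread already stored an entry for this path.
        let meta = Arc::new(fake_infer(path, &self.infer_calls));
        let mut entries = self.entries.write().unwrap();
        entries.entry(path.to_string()).or_insert(meta).clone()
    }

    /// Drop the entry so the next lookup re-runs inference.
    pub fn invalidate(&self, path: &str) {
        self.entries.write().unwrap().remove(path);
    }
}
```

In this sketch, two consecutive lookups for the same path run `fake_infer` once and return the same `Arc`; after `invalidate`, the next lookup runs it again. Two threads that miss at the same time may both infer, which trades a little duplicate work for not holding a lock during I/O.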

### Integration points:

1. **`ZarrTable`** holds an `Arc<ZarrMetadataCache>` shared across tables
2. **`ZarrExec`** receives pre-cached metadata instead of calling `discover_arrays()` per execution
3. **CLI session** creates one cache instance for the session lifetime
4. **Optional TTL** — cache entries expire after configurable duration (default: no expiry within session)
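
The sharing and TTL points above could be sketched as follows. This is a standard-library-only illustration under stated assumptions: `TtlCache`, `Entry`, and `TableHandle` are hypothetical names, the payload is a plain `String` rather than real metadata, and `ttl == None` models the proposed "no expiry within session" default.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::time::{Duration, Instant};

/// Hypothetical cache entry: a payload plus the time it was stored.
struct Entry {
    schema: Arc<String>,
    cached_at: Instant,
}

/// Hypothetical TTL-aware cache. `ttl == None` means entries never expire.
pub struct TtlCache {
    ttl: Option<Duration>,
    entries: RwLock<HashMap<String, Entry>>,
}

impl TtlCache {
    pub fn new(ttl: Option<Duration>) -> Self {
        Self { ttl, entries: RwLock::new(HashMap::new()) }
    }

    pub fn insert(&self, path: &str, schema: String) {
        let entry = Entry { schema: Arc::new(schema), cached_at: Instant::now() };
        self.entries.write().unwrap().insert(path.to_string(), entry);
    }

    /// Returns the cached value only if it is still fresh.
    pub fn get(&self, path: &str) -> Option<Arc<String>> {
        let entries = self.entries.read().unwrap();
        let entry = entries.get(path)?;
        match self.ttl {
            // Expired: report a miss so the caller re-infers.
            Some(ttl) if entry.cached_at.elapsed() >= ttl => None,
            _ => Some(Arc::clone(&entry.schema)),
        }
    }
}

/// Stand-in for `ZarrTable`: each table keeps a handle to the shared cache.
pub struct TableHandle {
    pub store_path: String,
    pub cache: Arc<TtlCache>,
}
```

Two `TableHandle` values built from clones of the same `Arc<TtlCache>` see each other's entries, which is the cross-table behavior the first integration point asks for. Expired entries are reported as misses rather than removed here; a real implementation might also evict them.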

## Impact

- **Remote stores**: Eliminates the repeated metadata HTTP calls on every query after the first (an estimated 2-8 seconds for a 10-array store)
- **Interactive CLI**: Schema discovery happens once on `CREATE TABLE`; subsequent queries skip metadata discovery entirely
- **Multiple queries**: `SELECT MIN(time) FROM era5; SELECT MAX(temp) FROM era5;` — second query skips all metadata I/O
- **VirtualiZarr**: `.zmetadata` JSON parsed once instead of per-query

## Files to Modify

- `src/reader/schema_inference.rs` — Add `ZarrMetadataCache` struct, `CachedMetadata` type
- `src/datasource/zarr.rs` — `ZarrTable` holds `Arc<ZarrMetadataCache>`, passes to `ZarrExec`
- `src/physical_plan/zarr_exec.rs` — Accept cached metadata, skip `discover_arrays()` when cache hit
- `src/bin/zarr_cli/main.rs` — Create shared cache for CLI session

## Motivation

Inspired by Vortex's `CachedVortexMetadata`, which avoids re-reading file footers across queries. The pattern is straightforward: metadata is read-heavy and write-rare, making it an ideal caching target. zarr-datafusion already partially caches remote store connections (`cached_remote` in `ZarrExec`), but it does not cache the more expensive metadata discovery step.
