Skip to content

feat: add R1C1 formula format support to get_sheet_formulas#67

Open
jfenal wants to merge 6 commits into
xing5:mainfrom
jfenal:pr/r1c1-formula-format
Open

feat: add R1C1 formula format support to get_sheet_formulas#67
jfenal wants to merge 6 commits into
xing5:mainfrom
jfenal:pr/r1c1-formula-format

Conversation

@jfenal
Copy link
Copy Markdown

@jfenal jfenal commented Feb 23, 2026

Summary

Adds R1C1 formula notation support to get_sheet_formulas tool, enabling formula pattern analysis across sheets.

Features

  • A1 to R1C1 Conversion: Efficient conversion function with pre-built column lookup table (covers 1000 columns)
  • Format Parameter: New optional format parameter accepts 'A1' (default) or 'R1C1'
  • Pattern Analysis: R1C1 format normalizes relative references, making it easy to identify duplicate formula patterns across different cell positions

Use Cases

  • Formula Auditing: Identify unique formula patterns across multiple sheets
  • Duplicate Detection: Find cells using the same formula logic in different positions
  • Formula Migration: Easier formula copying/analysis when position-independent

Implementation

  • Optimized with pre-compiled regex pattern and column lookup table
  • Comprehensive unit tests with edge cases
  • Integration tests with real Google Sheets API
  • Complete documentation with examples

Example

# Get formulas in R1C1 format for pattern analysis
data = get_sheet_formulas('spreadsheet-id', ['Sheet1!B:B', 'Sheet2!C:C'], format='R1C1')

# All cells with =SUM(A1:A3) will normalize to =SUM(RC[-1]:R[2]C[-1])
# regardless of their actual position, making pattern detection trivial

Test Plan

  • Unit tests for A1 to R1C1 conversion (absolute/relative references, sheet names)
  • Integration tests with real Google Sheets API
  • Performance benchmarks
  • Backward compatibility (default format='A1' preserves existing behavior)

🤖 Created with assistance from Claude Code

jfenal and others added 6 commits February 23, 2026 18:40
Add comprehensive design document for R1C1 formula notation support:
- Problem statement and use case analysis
- API design with format parameter
- Conversion logic and performance benchmarks
- Testing strategy and success criteria
- Performance: 3.50μs per conversion (22% faster than basic approach)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add detailed step-by-step implementation plan:
- 7 tasks with TDD approach (test-first)
- Complete code examples for each step
- Unit and integration test specifications
- Documentation updates
- Success criteria and testing commands

Estimated implementation time: ~2 hours

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implement conversion function with pre-compiled regex for performance.
Handles relative, absolute, mixed references, ranges, and sheet qualifiers.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add optional format parameter ('A1' or 'R1C1') to get_sheet_formulas.
When format='R1C1', converts formulas to R1C1 notation for pattern analysis.
Includes input validation and error handling.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add integration tests using real Google Sheets API to verify:
- R1C1 conversion accuracy
- Non-formula cells preserved
- End-to-end behavior

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add documentation for new format parameter:
- Parameter description and valid values
- Example usage for both A1 and R1C1 formats
- Use case explanation for formula pattern analysis

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the get_sheet_formulas MCP tool to optionally return formulas normalized to R1C1 notation for cross-cell pattern analysis, adding supporting conversion utilities, tests, and documentation. It also introduces an additional authentication path via USER_ACCESS_TOKEN (“Token Relay Mode”).

Changes:

  • Add A1→R1C1 conversion utilities and an optional format parameter to get_sheet_formulas.
  • Add unit tests for conversion logic and mocked-format behavior tests for get_sheet_formulas.
  • Add docs/plan artifacts and a real-API integration test (conditionally skipped).

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/mcp_google_sheets/server.py Adds A1→R1C1 conversion + format param in get_sheet_formulas; also adds USER_ACCESS_TOKEN auth mode.
tests/test_formula_conversion.py Unit tests for column conversion and A1→R1C1 conversion behavior.
tests/test_get_sheet_formulas_format.py Tests for default A1 behavior, R1C1 conversion path, and invalid format validation.
tests/test_r1c1_integration.py Conditional real Google Sheets API integration tests for R1C1 output.
README.md Documents the new format parameter and adds usage examples.
docs/plans/2026-02-23-r1c1-formula-format.md Large implementation plan document for the feature.
docs/plans/2026-02-23-r1c1-formula-format-design.md Design document describing API change, approach, and testing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

COLUMN_LUT = _build_column_lut(1000)

# Pre-compile regex pattern for cell references (performance optimization)
CELL_REF_PATTERN = re.compile(r"(?:([^!]+!))?(\$)?([A-Z]+)(\$)?(\d+)")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CELL_REF_PATTERN is broad enough to match non-cell tokens like function names that end with digits (e.g., LOG10), which would corrupt formulas during _a1_to_r1c1 conversion. Tighten the regex to require proper boundaries around A1 references (e.g., a negative lookbehind for [A-Z0-9_] before the column and/or a word boundary) so only real cell references are replaced.

Suggested change
CELL_REF_PATTERN = re.compile(r"(?:([^!]+!))?(\$)?([A-Z]+)(\$)?(\d+)")
CELL_REF_PATTERN = re.compile(
r"(?<![A-Z0-9_])(?:([^!]+!))?(\$)?([A-Z]+)(\$)?(\d+)\b(?!\()"
)

Copilot uses AI. Check for mistakes.
Comment on lines +433 to +445
# Parse the range to determine starting row/column
# Extract starting position from full_range
# Format: "Sheet1!B1:B10" or "Sheet1!B1" or "Sheet1"
range_match = re.search(r'!([A-Z]+)(\d+)', full_range)
if range_match:
start_col_letter = range_match.group(1)
start_row = int(range_match.group(2))
start_col = COLUMN_LUT.get(start_col_letter, _column_letter_to_number(start_col_letter))
else:
# If no range specified, assume starting at A1
start_row = 1
start_col = 1

Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start position parsing for R1C1 conversion only handles ranges that begin with a cell address containing both column letters and a row number (e.g., B2). For valid A1 ranges like B:B, A:C, B2:B, or named ranges, range_match fails and the code assumes A1, which produces incorrect R1C1 offsets. Consider implementing a small A1 parser for the start of the range argument (defaulting missing row to 1 and missing column to A) and using that instead of re.search(r'!([A-Z]+)(\d+)').

Suggested change
# Parse the range to determine starting row/column
# Extract starting position from full_range
# Format: "Sheet1!B1:B10" or "Sheet1!B1" or "Sheet1"
range_match = re.search(r'!([A-Z]+)(\d+)', full_range)
if range_match:
start_col_letter = range_match.group(1)
start_row = int(range_match.group(2))
start_col = COLUMN_LUT.get(start_col_letter, _column_letter_to_number(start_col_letter))
else:
# If no range specified, assume starting at A1
start_row = 1
start_col = 1
# Parse the range to determine starting row/column.
# Use a small A1 parser that can handle ranges like "B:B", "A:C", "B2:B", etc.
# Strip sheet name if present (e.g. "Sheet1!B2:B10" -> "B2:B10").
if '!' in full_range:
_, a1_part = full_range.split('!', 1)
else:
a1_part = full_range
a1_part = a1_part.strip()
# Default to A1 if no range is specified or parsing fails.
start_row = 1
start_col = 1
if a1_part:
# Consider only the first reference segment before ":" or ",".
# Examples:
# "B2:B10" -> "B2"
# "A:C" -> "A"
# "B:B" -> "B"
first_ref = re.split(r'[:,]', a1_part, 1)[0]
ref_match = re.match(r'([A-Z]+)(\d*)', first_ref, re.IGNORECASE)
if ref_match:
start_col_letters = ref_match.group(1).upper()
row_str = ref_match.group(2)
start_col = COLUMN_LUT.get(
start_col_letters,
_column_letter_to_number(start_col_letters),
)
if row_str:
start_row = int(row_str)

Copilot uses AI. Check for mistakes.
Comment on lines +181 to +188
# Check for external OAuth token (Token Relay Mode) - highest priority
# This allows end-user authentication in containerized/OpenShift environments
if USER_ACCESS_TOKEN:
try:
print("Using external OAuth token (Token Relay Mode)")
# Create credentials from the provided access token
creds = Credentials(token=USER_ACCESS_TOKEN)
# Note: Token validation happens on first API call
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR introduces USER_ACCESS_TOKEN / “Token Relay Mode” authentication behavior in the server lifespan, which is unrelated to the PR title/description about R1C1 formula support. Please either remove/split this change into a separate PR, or update the PR description and documentation to cover the new authentication mode and its intended usage/security considerations.

Copilot uses AI. Check for mistakes.
Comment thread README.md
Comment on lines 204 to +206
* `sheet` (string): Name of the sheet/tab (e.g., "Sheet1").
* `range` (optional string): A1 notation (e.g., `'A1:C10'`, `'Sheet1!B2:D'`). If omitted, reads all formulas in the sheet/tab specified by `sheet`.
* `format` (optional string, default `'A1'`): Formula notation format.
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range parameter docs include an example with a sheet-qualified A1 range ('Sheet1!B2:D') even though get_sheet_formulas also takes a separate sheet argument and unconditionally prefixes full_range = f"{sheet}!{range}". Passing a sheet-qualified range would produce an invalid Sheet1!Sheet1!... range. Update the README example(s) to reflect supported inputs (e.g., B2:D) or clarify/document a supported way to pass fully-qualified ranges.

Copilot uses AI. Check for mistakes.
Comment on lines +14 to +17
# Skip tests if no credentials available
pytestmark = pytest.mark.skipif(
not os.environ.get('SERVICE_ACCOUNT_PATH'),
reason="SERVICE_ACCOUNT_PATH not set"
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will run (and make real external API calls) whenever SERVICE_ACCOUNT_PATH is set, which can make local/unit test runs unexpectedly slow/flaky and dependent on a mutable public spreadsheet. Consider requiring an additional explicit opt-in (e.g., RUN_INTEGRATION_TESTS=1) and/or marking it with a dedicated pytest marker so it only runs when intentionally requested.

Suggested change
# Skip tests if no credentials available
pytestmark = pytest.mark.skipif(
not os.environ.get('SERVICE_ACCOUNT_PATH'),
reason="SERVICE_ACCOUNT_PATH not set"
# Skip tests unless credentials and integration opt-in are available
pytestmark = pytest.mark.skipif(
not os.environ.get('SERVICE_ACCOUNT_PATH') or not os.environ.get('RUN_INTEGRATION_TESTS'),
reason="SERVICE_ACCOUNT_PATH or RUN_INTEGRATION_TESTS not set (integration tests disabled)"

Copilot uses AI. Check for mistakes.
Comment on lines +3 to +4
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This plan doc includes instructions addressed to a specific external tool ("For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans...") which is not actionable for repo users and is likely to become noise/outdated in documentation. Consider removing tool-specific meta-instructions and keeping the plan focused on implementation details for humans maintaining this repo.

Suggested change
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Copilot uses AI. Check for mistakes.
current_col: 1-based column number of the cell containing this formula

Returns:
Formula in R1C1 notation (e.g., "=SUM(R[-4]C[-1]:RC[0])")
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _a1_to_r1c1 docstring example shows RC[0] for a zero column offset, but the implementation emits C (and R) with no [0] when the offset is zero (e.g., RC, R[1]C). Update the example text to match actual output so callers/tests aren’t misled.

Suggested change
Formula in R1C1 notation (e.g., "=SUM(R[-4]C[-1]:RC[0])")
Formula in R1C1 notation (e.g., "=SUM(R[-4]C[-1]:RC)")

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,124 @@
"""Tests for A1 to R1C1 formula conversion."""

import pytest
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pytest is imported but not used in this test module. If this repo runs linting (flake8/ruff), this will fail; consider removing the unused import.

Suggested change
import pytest

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants