Merged
54 changes: 53 additions & 1 deletion CLAUDE.md
@@ -6,6 +6,9 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

GoSQLX is a **production-ready**, **race-free**, high-performance SQL parsing SDK for Go that provides lexing, parsing, and AST generation with zero-copy optimizations. The library is designed for enterprise use with comprehensive object pooling for memory efficiency.

**Requirements**: Go 1.24+


### **Production Status**: ✅ **VALIDATED FOR PRODUCTION DEPLOYMENT** (v1.6.0+)
- **Thread Safety**: Confirmed race-free through comprehensive concurrent testing
- **Performance**: 1.38M+ operations/second sustained, up to 1.5M peak with memory-efficient object pooling
@@ -27,6 +30,7 @@ GoSQLX is a **production-ready**, **race-free**, high-performance SQL parsing SD
- **Errors** (`pkg/errors/`): Structured error handling system with error codes and position tracking
- **Metrics** (`pkg/metrics/`): Production performance monitoring and observability
- **Security** (`pkg/sql/security/`): SQL injection detection with pattern scanning and severity classification
- **Linter** (`pkg/linter/`): SQL linting engine with 10 built-in rules (L001-L010) for style enforcement
- **CLI** (`cmd/gosqlx/`): Production-ready command-line tool for SQL validation, formatting, and analysis
- **LSP** (`pkg/lsp/`): Language Server Protocol server for IDE integration (diagnostics, hover, completion, formatting)

@@ -42,7 +46,7 @@ The codebase uses extensive object pooling for performance optimization:
### Token Processing Flow

1. **Input**: Raw SQL bytes → `tokenizer.Tokenize()` → `[]models.TokenWithSpan`
2. **Conversion**: Token conversion → `parser.convertTokens()` → `[]token.Token`
2. **Conversion**: Token conversion → `parser.ConvertTokensForParser()` → `[]token.Token`
3. **Parsing**: Parser consumption → `parser.Parse()` → `*ast.AST`
4. **Cleanup**: Release pooled objects back to pools when done
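
The four steps above can be sketched end to end. The following is a minimal, self-contained illustration of the tokenize, convert, parse, release pattern built on `sync.Pool`. It does not import GoSQLX, and every name in it (`spanToken`, `parserToken`, `tree`, `tokenize`, `convertTokens`, `parse`, `releaseAST`) is a hypothetical stand-in rather than the library's actual API.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// spanToken is a hypothetical stand-in for a tokenizer token with position info.
type spanToken struct {
	Text   string
	Offset int // byte offset of the token in the input
}

// parserToken is a hypothetical stand-in for the parser's token type.
type parserToken struct {
	Kind string // "KEYWORD" or "IDENT"
	Text string
}

// tree is a hypothetical stand-in for an AST; it only records statement kinds.
type tree struct {
	Statements []string
}

// treePool recycles tree values so repeated parses can reuse allocations.
var treePool = sync.Pool{New: func() any { return &tree{} }}

var keywords = map[string]bool{"SELECT": true, "FROM": true, "WHERE": true}

// tokenize splits raw SQL bytes on whitespace (step 1).
func tokenize(sql []byte) []spanToken {
	var out []spanToken
	start := -1
	for i := 0; i <= len(sql); i++ {
		if i < len(sql) && sql[i] != ' ' && sql[i] != '\t' && sql[i] != '\n' {
			if start < 0 {
				start = i
			}
			continue
		}
		if start >= 0 {
			out = append(out, spanToken{Text: string(sql[start:i]), Offset: start})
			start = -1
		}
	}
	return out
}

// convertTokens maps tokenizer tokens to parser tokens (step 2).
func convertTokens(in []spanToken) []parserToken {
	out := make([]parserToken, 0, len(in))
	for _, t := range in {
		upper := strings.ToUpper(t.Text)
		if keywords[upper] {
			out = append(out, parserToken{Kind: "KEYWORD", Text: upper})
		} else {
			out = append(out, parserToken{Kind: "IDENT", Text: t.Text})
		}
	}
	return out
}

// parse builds a pooled tree from parser tokens (step 3).
func parse(tokens []parserToken) (*tree, error) {
	if len(tokens) == 0 || tokens[0].Kind != "KEYWORD" {
		return nil, fmt.Errorf("expected a statement keyword at token 0")
	}
	t := treePool.Get().(*tree)
	t.Statements = append(t.Statements, tokens[0].Text)
	return t, nil
}

// releaseAST resets the tree and returns it to the pool (step 4).
func releaseAST(t *tree) {
	t.Statements = t.Statements[:0]
	treePool.Put(t)
}

func main() {
	toks := tokenize([]byte("SELECT id FROM users"))
	ast, err := parse(convertTokens(toks))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	defer releaseAST(ast) // cleanup runs even on early return
	fmt.Println(len(toks), ast.Statements) // prints: 4 [SELECT]
}
```

Deferring the release immediately after a successful parse keeps the cleanup step from being skipped on early returns, which is the main way pooled objects leak in this kind of pipeline.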

@@ -129,6 +133,14 @@ task check
task test:race
```

### Pre-commit Hooks
The repository has pre-commit hooks that automatically run on every commit:
1. `go fmt` - Code formatting check
2. `go vet` - Static analysis
3. `go test -short` - Short test suite

If a commit fails pre-commit checks, fix the issues and retry the commit.

### Security
```bash
# Run security vulnerability scan
@@ -181,6 +193,14 @@ go run ./examples/cmd/example.go
go install github.com/ajitpratap0/GoSQLX/cmd/gosqlx@latest
```

### Additional Documentation
- `docs/GETTING_STARTED.md` - Quick start guide for new users
- `docs/USAGE_GUIDE.md` - Comprehensive usage guide
- `docs/LSP_GUIDE.md` - Complete LSP server documentation and IDE integration
- `docs/LINTING_RULES.md` - All 10 linting rules (L001-L010) reference
- `docs/CONFIGURATION.md` - Configuration file (.gosqlx.yml) guide
- `docs/SQL_COMPATIBILITY.md` - SQL dialect compatibility matrix

## Key Implementation Details

### Memory Management (CRITICAL FOR PERFORMANCE)
@@ -294,6 +314,12 @@ Tests are organized with comprehensive coverage (30+ test files, 6 benchmark fil

### Component-Specific Testing
```bash
# Run a single test by name
go test -v -run TestSpecificTestName ./pkg/sql/parser/

# Run tests matching a pattern
go test -v -run "TestParser_Window.*" ./pkg/sql/parser/

# Core library testing with race detection
go test -race ./pkg/sql/tokenizer/ -v
go test -race ./pkg/sql/parser/ -v
@@ -602,6 +628,32 @@ JOIN posts p USING (user_id)
WHERE p.published = true;
```

### PostgreSQL Extensions (v1.6.0) - Complete ✅
```sql
-- LATERAL JOIN - correlated subqueries in FROM clause
SELECT u.name, r.order_date FROM users u,
LATERAL (SELECT * FROM orders WHERE user_id = u.id ORDER BY order_date DESC LIMIT 3) r;

-- JSON/JSONB Operators (->/->>/#>/#>>/@>/<@/?/?|/?&/#-)
SELECT data->>'name' AS name, data->'address'->>'city' AS city FROM users;
SELECT * FROM products WHERE attributes @> '{"color": "red"}';
SELECT * FROM users WHERE profile ? 'email';

-- DISTINCT ON - PostgreSQL-specific row selection
SELECT DISTINCT ON (dept_id) dept_id, name, salary
FROM employees ORDER BY dept_id, salary DESC;

-- FILTER Clause - conditional aggregation (SQL:2003)
SELECT COUNT(*) FILTER (WHERE status = 'active') AS active_count,
SUM(amount) FILTER (WHERE type = 'credit') AS total_credits
FROM transactions;

-- RETURNING Clause - return modified rows
INSERT INTO users (name, email) VALUES ('John', '[email protected]') RETURNING id, created_at;
UPDATE products SET price = price * 1.1 WHERE category = 'Electronics' RETURNING id, price;
DELETE FROM sessions WHERE expired_at < NOW() RETURNING user_id;
```

### DDL and DML Operations - Complete ✅
```sql
-- Table operations
95 changes: 74 additions & 21 deletions README.md
@@ -913,36 +913,89 @@ We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guid

## Roadmap

### Phase 1: Core SQL Enhancements (Q1 2025) - v1.1.0 ✅
<div align="center">

| Phase | Version | Status | Highlights |
|-------|---------|--------|------------|
| **Phase 1** | v1.1.0 | ✅ Complete | JOIN Support |
| **Phase 2** | v1.2.0 | ✅ Complete | CTEs & Set Operations |
| **Phase 2.5** | v1.3.0-v1.4.0 | ✅ Complete | Window Functions, MERGE, Grouping Sets |
| **Phase 3** | v1.5.0-v1.6.0 | ✅ Complete | PostgreSQL Extensions, LSP, Linter |
| **Phase 4** | v1.7.0 | 🚧 In Progress | MySQL & SQL Server Dialects |
| **Phase 5** | v2.0.0 | 📋 Planned | Query Intelligence & Optimization |
| **Phase 6** | v2.1.0 | 📋 Planned | Schema Awareness & Validation |

</div>

### Phase 1: Core SQL Enhancements - v1.1.0 ✅
- ✅ **Complete JOIN support** (INNER/LEFT/RIGHT/FULL OUTER/CROSS/NATURAL)
- ✅ **Proper join tree logic** with left-associative relationships
- ✅ **USING clause parsing** (single-column, multi-column planned for Phase 2)
- ✅ **Proper join tree logic** with left-associative relationships
- ✅ **USING clause parsing** for single and multi-column joins
- ✅ **Enhanced error handling** with contextual JOIN error messages
- ✅ **Comprehensive test coverage** (15+ JOIN scenarios including error cases)
- 🏗️ **CTE foundation laid** (AST structures, tokens, parser integration points)
- ✅ **Comprehensive test coverage** (15+ JOIN scenarios)

### Phase 2: CTE & Advanced Features (Q1 2025) - v1.2.0 ✅
### Phase 2: CTE & Set Operations - v1.2.0 ✅
- ✅ **Common Table Expressions (CTEs)** with RECURSIVE support
- ✅ **Set operations** (UNION/EXCEPT/INTERSECT with ALL modifier)
- ✅ **Left-associative set operation parsing**
- ✅ **CTE column specifications** and multiple CTE definitions
- ✅ **Integration of CTEs with set operations**
- ✅ **Enhanced error handling** with contextual messages
- ✅ **~70% SQL-92 compliance** achieved

### Phase 3: Dialect Specialization (Q1 2025) - v2.0.0
- 📋 PostgreSQL arrays, JSONB, custom types
- 📋 MySQL-specific syntax and functions
- 📋 SQL Server T-SQL extensions
- 📋 Multi-dialect parser with auto-detection

### Phase 4: Intelligence Layer (Q2 2025) - v2.1.0
- 📋 Query optimization suggestions
- 📋 Security vulnerability detection
- 📋 Performance analysis and hints
- 📋 Schema validation

See [ARCHITECTURE.md](docs/ARCHITECTURE.md) for detailed system design
### Phase 2.5: Window Functions & Advanced SQL - v1.3.0-v1.4.0 ✅
- ✅ **Window Functions** - Complete SQL-99 support (ROW_NUMBER, RANK, DENSE_RANK, NTILE, LAG, LEAD, FIRST_VALUE, LAST_VALUE)
- ✅ **Window Frames** - ROWS/RANGE with PRECEDING/FOLLOWING/CURRENT ROW
- ✅ **MERGE Statements** - SQL:2003 F312 with WHEN MATCHED/NOT MATCHED clauses
- ✅ **GROUPING SETS, ROLLUP, CUBE** - SQL-99 T431 advanced grouping
- ✅ **Materialized Views** - CREATE, REFRESH, DROP support
- ✅ **Expression Operators** - BETWEEN, IN, LIKE, IS NULL, NULLS FIRST/LAST
- ✅ **~75% SQL-99 compliance** achieved

### Phase 3: PostgreSQL Extensions & Developer Tools - v1.5.0-v1.6.0 ✅
- ✅ **LATERAL JOIN** - Correlated subqueries in FROM clause
- ✅ **JSON/JSONB Operators** - All 10 operators (`->`, `->>`, `#>`, `#>>`, `@>`, `<@`, `?`, `?|`, `?&`, `#-`)
- ✅ **DISTINCT ON** - PostgreSQL-specific row selection
- ✅ **FILTER Clause** - Conditional aggregation (SQL:2003 T612)
- ✅ **Aggregate ORDER BY** - ORDER BY inside STRING_AGG, ARRAY_AGG, etc.
- ✅ **RETURNING Clause** - Return modified rows from INSERT/UPDATE/DELETE
- ✅ **LSP Server** - Full Language Server Protocol with diagnostics, completion, hover, formatting
- ✅ **Linter Engine** - 10 built-in rules (L001-L010) with auto-fix
- ✅ **Security Scanner** - SQL injection detection with severity classification
- ✅ **Structured Errors** - Error codes E1001-E3004 with position tracking
- ✅ **CLI Enhancements** - Pipeline support, stdin detection, cross-platform
- ✅ **~80-85% SQL-99 compliance** achieved

### Phase 4: MySQL & SQL Server Dialects - v1.7.0 🚧
- 🚧 **MySQL Extensions** - AUTO_INCREMENT, REPLACE INTO, ON DUPLICATE KEY
- 📋 **MySQL Functions** - DATE_FORMAT, IFNULL, GROUP_CONCAT specifics
- 📋 **SQL Server T-SQL** - TOP, OFFSET-FETCH, OUTPUT clause
- 📋 **SQL Server Functions** - ISNULL, CONVERT, DATEPART specifics
- 📋 **Dialect Auto-Detection** - Automatic syntax detection from queries
- 📋 **Cross-Dialect Translation** - Convert between dialect syntaxes

### Phase 5: Query Intelligence & Optimization - v2.0.0 📋
- 📋 **Query Cost Estimation** - Complexity analysis and scoring
- 📋 **Index Recommendations** - Suggest indexes based on query patterns
- 📋 **Join Order Optimization** - Recommend optimal join sequences
- 📋 **Subquery Optimization** - Detect and suggest subquery improvements
- 📋 **N+1 Query Detection** - Identify inefficient query patterns
- 📋 **Performance Hints** - Actionable optimization suggestions

### Phase 6: Schema Awareness & Validation - v2.1.0 📋
- 📋 **Schema Definition Parsing** - Full DDL understanding
- 📋 **Type Checking** - Validate column types in expressions
- 📋 **Foreign Key Validation** - Verify relationship integrity
- 📋 **Constraint Checking** - NOT NULL, UNIQUE, CHECK validation
- 📋 **Schema Diff** - Compare and generate migration scripts
- 📋 **Entity-Relationship Extraction** - Generate ER diagrams from DDL

### Future Considerations 🔮
- 📋 **Stored Procedures** - CREATE PROCEDURE/FUNCTION parsing
- 📋 **Triggers** - CREATE TRIGGER support
- 📋 **PL/pgSQL** - PostgreSQL procedural language
- 📋 **Query Rewriting** - Automatic query transformation
- 📋 **WASM Support** - Browser-based SQL parsing

See [ARCHITECTURE.md](docs/ARCHITECTURE.md) for detailed system design and [CHANGELOG.md](CHANGELOG.md) for version history.

## Community & Support
