yossiovadia
Collaborator

Add comprehensive OpenShift demo infrastructure including interactive Python demo, curl examples, live logging scripts, cache management, and reasoning showcase (CoT vs standard routing).

yossiovadia and others added 3 commits October 14, 2025 12:27
Add comprehensive demo toolkit for semantic router capabilities:

- Interactive demo script (demo-semantic-router.py) with menu options:
  - Single classification (cache demo with fixed prompt)
  - All classifications (10 golden prompts)
  - PII detection test
  - Jailbreak detection test
  - Run all tests

- Live log viewers:
  - live-semantic-router-logs.sh: Envoy traffic with routing decisions
  - live-classifier-logs.sh: Classification API activity

- Demo utilities:
  - curl-examples.sh: Quick classification examples
  - cache-management.sh: Cache status and clearing

- Documentation:
  - DEMO-README.md: Complete demo guide with setup instructions
  - CATEGORY-MODEL-MAPPING.md: Category to model routing reference

All scripts use dynamic URL discovery from OpenShift routes (requires oc login).
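
The dynamic URL discovery described above could look roughly like the following sketch. It assumes the scripts shell out to `oc get route` and read the route's `.spec.host`; the function names, the `http://` scheme, and the injectable `run` parameter (there only so the lookup can be exercised without a cluster) are illustrative assumptions, not taken from the PR.

```python
import subprocess
from typing import Callable, List


def _run_oc(cmd: List[str]) -> str:
    """Run a command and return its stdout (raises if the command fails)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def discover_route_url(
    route: str,
    namespace: str,
    run: Callable[[List[str]], str] = _run_oc,
) -> str:
    """Return http://<host> for an OpenShift route, looked up via `oc`.

    The scheme is an assumption; a TLS-terminated route would need https.
    """
    cmd = ["oc", "get", "route", route, "-n", namespace, "-o", "jsonpath={.spec.host}"]
    host = run(cmd).strip()
    if not host:
        # An empty result usually means the route does not exist or has no host.
        raise RuntimeError(f"no host found for route {route!r} in namespace {namespace!r}")
    return f"http://{host}"
```

Called as `discover_route_url("<route-name>", "<namespace>")` after `oc login`, this would return the base URL that the demo scripts then append their API paths to.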

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
Replace static cluster routing with ORIGINAL_DST cluster type to fix
routing bug where all requests were going to model_b_cluster.

**Problem:**
- Envoy evaluates routes BEFORE ExtProc filter runs
- Header-based routing never matched because header wasn't set yet
- All requests fell through to default route (model_b_cluster)
- Router selected Model-A but Envoy routed to Model-B
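
The ordering problem listed above can be illustrated with a toy model. This is only a sketch of the behaviour the commit message describes, not Envoy's actual implementation; the header name `x-selected-model` and the route table are hypothetical.

```python
from typing import Dict

# Hypothetical route table: header value -> cluster name.
ROUTES = {"Model-A": "model_a_cluster", "Model-B": "model_b_cluster"}
DEFAULT_CLUSTER = "model_b_cluster"


def pick_cluster(headers: Dict[str, str]) -> str:
    """Header-based route match with a fallback to the default cluster."""
    return ROUTES.get(headers.get("x-selected-model", ""), DEFAULT_CLUSTER)


def handle_request(headers: Dict[str, str], selected_model: str) -> str:
    # Step 1: the route is chosen from the headers as they arrived.
    cluster = pick_cluster(headers)
    # Step 2: the external processor sets its header afterwards, too late
    # to influence the cluster already chosen in step 1.
    headers["x-selected-model"] = selected_model
    return cluster


# The router selects Model-A, but the request still lands on the default cluster.
print(handle_request({}, "Model-A"))
```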

**Solution:**
- Use ORIGINAL_DST cluster with use_http_header: true
- Cluster reads x-gateway-destination-endpoint header set by ExtProc
- Routes to correct endpoint (127.0.0.1:8000 or 8001) dynamically
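
A minimal sketch of the cluster definition this solution implies, written as a Python dict and dumped as JSON (which Envoy accepts alongside YAML). The cluster name and timeout are hypothetical, and the field names reflect my understanding of Envoy's v3 cluster API rather than the contents of `envoy-openshift.yaml`; verify them against the Envoy version in use.

```python
import json

# Hypothetical cluster entry; only the header name comes from the commit message.
cluster = {
    "name": "vllm_dynamic_cluster",      # illustrative name
    "type": "ORIGINAL_DST",
    "lb_policy": "CLUSTER_PROVIDED",     # required for ORIGINAL_DST clusters
    "connect_timeout": "5s",             # illustrative value
    "original_dst_lb_config": {
        # Take the upstream address from a request header instead of the
        # connection's original destination.
        "use_http_header": True,
        "http_header_name": "x-gateway-destination-endpoint",
    },
}

print(json.dumps({"static_resources": {"clusters": [cluster]}}, indent=2))
```

With a cluster like this, a single catch-all route can point at it, and the header value (for example `127.0.0.1:8000`) decides the upstream per request.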

**Testing:**
Verified with Envoy logs showing:
- selected_model: Model-A, upstream_host: 127.0.0.1:8001 (WRONG - before fix)
- After fix: destination determined by header value

This aligns OpenShift config with local config/envoy.yaml approach.

Signed-off-by: Yossi Ovadia <[email protected]>
Add interactive test showcasing Chain-of-Thought (CoT) reasoning vs
standard routing:

- 2 reasoning-enabled examples (math, chemistry with use_reasoning: true)
- 1 reasoning-disabled example (history with use_reasoning: false)
- Summary statistics showing success rates for each mode
- Clear visual distinction between CoT and standard routing

This helps demonstrate how the semantic router intelligently routes
prompts that require multi-step reasoning vs factual queries.
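
The per-mode summary statistics mentioned above might be computed along these lines. This is a sketch under assumptions: the `(use_reasoning, succeeded)` result shape and the mode labels are hypothetical, and the sample outcomes are made up purely to show the calculation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def success_rates(results: Iterable[Tuple[bool, bool]]) -> Dict[str, float]:
    """Compute the success rate per mode from (use_reasoning, succeeded) pairs."""
    totals: Dict[str, int] = defaultdict(int)
    passed: Dict[str, int] = defaultdict(int)
    for use_reasoning, succeeded in results:
        mode = "reasoning" if use_reasoning else "standard"
        totals[mode] += 1
        passed[mode] += int(succeeded)
    return {mode: passed[mode] / totals[mode] for mode in totals}


# Illustrative outcomes only: two reasoning-enabled prompts, one standard prompt.
sample = [(True, True), (True, False), (False, True)]
for mode, rate in success_rates(sample).items():
    print(f"{mode}: {rate:.0%}")
```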

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>

netlify bot commented Oct 15, 2025

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 16eed3c |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68f0219acc76f80008b41055 |
| 😎 Deploy Preview | https://deploy-preview-446--vllm-semantic-router.netlify.app |


github-actions bot commented Oct 15, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 deploy

Owners: @rootfs, @Xunzhuo
Files changed:

  • deploy/openshift/demo/CATEGORY-MODEL-MAPPING.md
  • deploy/openshift/demo/DEMO-README.md
  • deploy/openshift/demo/cache-management.sh
  • deploy/openshift/demo/curl-examples.sh
  • deploy/openshift/demo/demo-classification-results.json
  • deploy/openshift/demo/demo-semantic-router.py
  • deploy/openshift/demo/live-classifier-logs.sh
  • deploy/openshift/demo/live-semantic-router-logs.sh
  • deploy/openshift/envoy-openshift.yaml


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@yossiovadia
Collaborator Author

5-minute demo here:

rootfs requested a review from Copilot October 15, 2025 18:07
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds comprehensive OpenShift demo infrastructure to showcase semantic router capabilities in live environments. It provides interactive demonstrations of classification, routing, security features, and cache management.

  • Simplified Envoy configuration using ORIGINAL_DST with header-based routing
  • Interactive Python demo tool for testing classification and routing features
  • Real-time log viewers with syntax highlighting and event filtering
  • Cache management utilities and demo workflows
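
The filtering-and-highlighting idea behind the log viewers can be sketched in a few lines. The PR's viewers are shell scripts, so this Python version is only an illustration of the technique; the keyword list and colour choices are hypothetical.

```python
import re
from typing import Dict, Iterable, Iterator

# Hypothetical keywords (lowercase) mapped to ANSI colour codes.
COLORS: Dict[str, str] = {
    "error": "\033[31m",           # red
    "cache": "\033[33m",           # yellow
    "selected_model": "\033[32m",  # green
}
RESET = "\033[0m"


def highlight(lines: Iterable[str], colors: Dict[str, str] = COLORS) -> Iterator[str]:
    """Yield only lines containing a keyword, with each keyword colourised."""
    pattern = re.compile("|".join(re.escape(k) for k in colors), re.IGNORECASE)
    for line in lines:
        if not pattern.search(line):
            continue  # drop lines without any event of interest
        yield pattern.sub(
            lambda m: f"{colors[m.group(0).lower()]}{m.group(0)}{RESET}", line
        )
```

Feeding it `sys.stdin` while piping in the output of `oc logs -f` for the relevant pod would give a live, filtered view.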

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| deploy/openshift/envoy-openshift.yaml | Simplified from static clusters to ORIGINAL_DST routing using x-gateway-destination-endpoint header |
| deploy/openshift/demo/live-semantic-router-logs.sh | Real-time log viewer for semantic router events with colored output and filtering |
| deploy/openshift/demo/live-classifier-logs.sh | Classification API focused log viewer for direct API testing |
| deploy/openshift/demo/demo-semantic-router.py | Interactive Python demo tool with menu-driven tests and auto-discovery |
| deploy/openshift/demo/demo-classification-results.json | Pre-recorded test results showing classification accuracy |
| deploy/openshift/demo/curl-examples.sh | Command-line classification examples with dynamic URL discovery |
| deploy/openshift/demo/cache-management.sh | Cache management utilities for demo purposes |
| deploy/openshift/demo/DEMO-README.md | Comprehensive demo guide with setup instructions and troubleshooting |
| deploy/openshift/demo/CATEGORY-MODEL-MAPPING.md | Documentation of category-to-model routing configuration |



2. Run classification test (first time - no cache):
```bash
python3 deploy/openshift/demo/demo-semantic-router.py
```

Copilot AI Oct 15, 2025

These references to running the demo script multiple times are correct, but the workflow description in lines 93-97 of cache-management.sh incorrectly references 'demo-classification-test.py' instead of 'demo-semantic-router.py'.


3. Run classification test again (second time - with cache):
```bash
python3 deploy/openshift/demo/demo-semantic-router.py
```

Copilot AI Oct 15, 2025

These references to running the demo script multiple times are correct, but the workflow description in lines 93-97 of cache-management.sh incorrectly references 'demo-classification-test.py' instead of 'demo-semantic-router.py'.

rootfs merged commit 9ff50de into vllm-project:main Oct 16, 2025
9 checks passed

3 participants