feat: add OpenShift demo scripts and documentation #446
Add comprehensive demo toolkit for semantic router capabilities:

- Interactive demo script (demo-semantic-router.py) with menu options:
  - Single classification (cache demo with fixed prompt)
  - All classifications (10 golden prompts)
  - PII detection test
  - Jailbreak detection test
  - Run all tests
- Live log viewers:
  - live-semantic-router-logs.sh: Envoy traffic with routing decisions
  - live-classifier-logs.sh: Classification API activity
- Demo utilities:
  - curl-examples.sh: Quick classification examples
  - cache-management.sh: Cache status and clearing
- Documentation:
  - DEMO-README.md: Complete demo guide with setup instructions
  - CATEGORY-MODEL-MAPPING.md: Category to model routing reference

All scripts use dynamic URL discovery from OpenShift routes (requires oc login).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
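For readers who want a feel for the dynamic URL discovery mentioned in this commit, here is a minimal sketch of one way it could work. It is not taken from the PR's scripts: the helper names, the route name used in the usage note, and the default `https` scheme are all assumptions for illustration. It relies only on `oc get route <name> -o jsonpath={.spec.host}`, which prints the hostname of a route.

```python
import subprocess


def route_host_command(route_name, namespace=None):
    """Build the `oc` argv that prints a route's hostname (hypothetical helper)."""
    cmd = ["oc", "get", "route", route_name, "-o", "jsonpath={.spec.host}"]
    if namespace:
        cmd += ["-n", namespace]
    return cmd


def base_url(host, scheme="https"):
    """Turn the hostname printed by `oc` into a base URL.

    The scheme is an assumption; a route without TLS would need "http".
    """
    return f"{scheme}://{host.strip()}"


def discover_url(route_name, namespace=None, scheme="https"):
    """Ask the cluster for the route host. Requires a prior `oc login`."""
    result = subprocess.run(
        route_host_command(route_name, namespace),
        capture_output=True,
        text=True,
        check=True,  # raise if `oc` fails, e.g. when not logged in
    )
    return base_url(result.stdout, scheme)
```

A call such as `discover_url("my-router-route")` (route name hypothetical) would return the base URL to send demo requests to, or raise `subprocess.CalledProcessError` if `oc` cannot reach the cluster.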
Replace static cluster routing with ORIGINAL_DST cluster type to fix routing bug where all requests were going to model_b_cluster.

**Problem:**
- Envoy evaluates routes BEFORE ExtProc filter runs
- Header-based routing never matched because header wasn't set yet
- All requests fell through to default route (model_b_cluster)
- Router selected Model-A but Envoy routed to Model-B

**Solution:**
- Use ORIGINAL_DST cluster with use_http_header: true
- Cluster reads x-gateway-destination-endpoint header set by ExtProc
- Routes to correct endpoint (127.0.0.1:8000 or 8001) dynamically

**Testing:**
Verified with Envoy logs showing:
- selected_model: Model-A, upstream_host: 127.0.0.1:8001 (WRONG - before fix)
- After fix: destination determined by header value

This aligns OpenShift config with local config/envoy.yaml approach.

Signed-off-by: Yossi Ovadia <[email protected]>
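The ordering problem described in this commit can be illustrated with a toy model. This is plain Python, not Envoy code or configuration; the function names are hypothetical, and the assumption that Model-A listens on 127.0.0.1:8000 is inferred from the log line above, which shows the wrong upstream as 127.0.0.1:8001.

```python
HEADER = "x-gateway-destination-endpoint"
# Stand-in for the old default route (model_b_cluster); port inferred, see lead-in.
DEFAULT_ENDPOINT = ("127.0.0.1", 8001)


def parse_endpoint(value):
    """Split "host:port" into a (host, port) tuple."""
    host, _, port = value.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"not a host:port endpoint: {value!r}")
    return host, int(port)


def static_routing(headers_at_route_match):
    """Before the fix: the route is chosen from the headers present at
    route-match time, which is before ExtProc has set the header."""
    value = headers_at_route_match.get(HEADER)
    return parse_endpoint(value) if value else DEFAULT_ENDPOINT


def header_destination(headers_after_extproc):
    """After the fix: the destination is read from the header once
    ExtProc has populated it."""
    return parse_endpoint(headers_after_extproc[HEADER])


incoming = {}  # client request: header not set yet
after_extproc = {HEADER: "127.0.0.1:8000"}  # router selected Model-A

print(static_routing(incoming))           # ('127.0.0.1', 8001), the wrong backend
print(header_destination(after_extproc))  # ('127.0.0.1', 8000)
```

The point of the sketch is only the timing: a decision made before the header exists can never see it, so every request lands on the default.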
Add interactive test showcasing Chain-of-Thought (CoT) reasoning vs standard routing:

- 2 reasoning-enabled examples (math, chemistry with use_reasoning: true)
- 1 reasoning-disabled example (history with use_reasoning: false)
- Summary statistics showing success rates for each mode
- Clear visual distinction between CoT and standard routing

This helps demonstrate how the semantic router intelligently routes prompts that require multi-step reasoning vs factual queries.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Signed-off-by: Yossi Ovadia <[email protected]>
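As a rough idea of how the per-mode summary statistics could be computed, here is a small sketch. The record shape (`category`, `use_reasoning`, `ok`) and the function name are assumptions, and the `ok` values below are placeholders chosen to exercise the code, not results from the demo.

```python
from collections import defaultdict


def success_rates(results):
    """Return the fraction of successful requests per routing mode."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for record in results:
        mode = "reasoning" if record["use_reasoning"] else "standard"
        totals[mode] += 1
        if record["ok"]:
            passed[mode] += 1
    return {mode: passed[mode] / totals[mode] for mode in totals}


# Placeholder outcomes for the three example categories (not real data).
sample = [
    {"category": "math", "use_reasoning": True, "ok": True},
    {"category": "chemistry", "use_reasoning": True, "ok": False},
    {"category": "history", "use_reasoning": False, "ok": True},
]

for mode, rate in success_rates(sample).items():
    print(f"{mode}: {rate:.0%}")
```

Grouping by the `use_reasoning` flag rather than by category keeps the summary aligned with the CoT vs standard comparison the test is meant to show.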
5 min demo here:
Pull Request Overview
This PR adds comprehensive OpenShift demo infrastructure to showcase semantic router capabilities in live environments. It provides interactive demonstrations of classification, routing, security features, and cache management.
- Simplified Envoy configuration using ORIGINAL_DST with header-based routing
- Interactive Python demo tool for testing classification and routing features
- Real-time log viewers with syntax highlighting and event filtering
- Cache management utilities and demo workflows
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| deploy/openshift/envoy-openshift.yaml | Simplified from static clusters to ORIGINAL_DST routing using x-gateway-destination-endpoint header |
| deploy/openshift/demo/live-semantic-router-logs.sh | Real-time log viewer for semantic router events with colored output and filtering |
| deploy/openshift/demo/live-classifier-logs.sh | Classification API focused log viewer for direct API testing |
| deploy/openshift/demo/demo-semantic-router.py | Interactive Python demo tool with menu-driven tests and auto-discovery |
| deploy/openshift/demo/demo-classification-results.json | Pre-recorded test results showing classification accuracy |
| deploy/openshift/demo/curl-examples.sh | Command-line classification examples with dynamic URL discovery |
| deploy/openshift/demo/cache-management.sh | Cache management utilities for demo purposes |
| deploy/openshift/demo/DEMO-README.md | Comprehensive demo guide with setup instructions and troubleshooting |
| deploy/openshift/demo/CATEGORY-MODEL-MAPPING.md | Documentation of category-to-model routing configuration |
2. Run classification test (first time - no cache):
```bash
python3 deploy/openshift/demo/demo-semantic-router.py
```
Copilot
AI
Oct 15, 2025
These references to running the demo script multiple times are correct, but the workflow description on lines 93-97 of cache-management.sh incorrectly references 'demo-classification-test.py' instead of 'demo-semantic-router.py'.
3. Run classification test again (second time - with cache):
```bash
python3 deploy/openshift/demo/demo-semantic-router.py
```
Copilot
AI
Oct 15, 2025
These references to running the demo script multiple times are correct, but the workflow description on lines 93-97 of cache-management.sh incorrectly references 'demo-classification-test.py' instead of 'demo-semantic-router.py'.
Signed-off-by: Yossi Ovadia <[email protected]>
dd1618d to af87950
Add comprehensive OpenShift demo infrastructure including interactive Python demo, curl examples, live logging scripts, cache management, and reasoning showcase (CoT vs standard routing).