
🚀 Dynamic text classification with continuous learning, strategic defense, and zero-downtime adaptation
- 📚 HuggingFace Organization - Pre-trained models and datasets
- 📖 Articles & Tutorials:
Adaptive Classifier is a PyTorch-based machine learning library that revolutionizes text classification with continuous learning, dynamic class addition, and strategic defense against adversarial inputs. Built on HuggingFace transformers, it enables zero-downtime model updates and enterprise-grade robustness.
- 🚀 Universal Compatibility - Works with any HuggingFace transformer model
- 📈 Continuous Learning - Add new examples without catastrophic forgetting
- 🔄 Dynamic Classes - Add new classes at runtime without retraining
- ⚡ Zero Downtime - Update models in production without service interruption
- 🎮 Strategic Classification - Game-theoretic defense against adversarial manipulation
- 🔒 Anti-Gaming Protection - Robust predictions under strategic behavior
- ⚖️ Multiple Prediction Modes - Regular, strategic, and robust inference options
- 💾 Prototype Memory - FAISS-powered efficient similarity search
- 🔬 Adaptive Neural Layer - Trainable classification head with EWC protection
- 🎯 Hybrid Predictions - Combines prototype similarity and neural network outputs
- 📊 HuggingFace Integration - Push/pull models directly from the Hub
Tested on adversarial examples from AI-Secure/adv_glue dataset:
| Metric | Regular Classifier | Strategic Classifier | Improvement |
|---|---|---|---|
| Clean Data Accuracy | 80.00% | 82.22% | +2.22% |
| Adversarial Data Accuracy | 60.00% | 82.22% | +22.22% |
| Robustness (vs. attack) | -20.00% drop | 0.00% drop | Perfect |
Evaluated on RAGTruth benchmark across multiple task types:
| Task Type | Precision | Recall | F1 Score |
|---|---|---|---|
| QA | 35.50% | 45.11% | 39.74% |
| Summarization | 22.18% | 96.91% | 36.09% |
| Data-to-Text | 65.00% | 100.00% | 78.79% |
| Overall | 40.89% | 80.68% | 51.54% |
Tested on arena-hard-auto-v0.1 dataset (500 queries):
| Metric | Without Adaptation | With Adaptation | Improvement |
|---|---|---|---|
| Cost Savings | 25.60% | 32.40% | +6.80% |
| Efficiency Ratio | 1.00x | 1.27x | +27% |
| Resource Utilization | Standard | Optimized | Better |
Key Insight: Adaptive classification maintains quality while significantly improving cost efficiency and robustness across all tested scenarios.
```bash
pip install adaptive-classifier
```

Or install from source for development:

```bash
# Clone the repository
git clone https://github.com/codelion/adaptive-classifier.git
cd adaptive-classifier

# Install in development mode
pip install -e .

# Install test dependencies (optional)
pip install pytest pytest-cov pytest-randomly
```
Get started with adaptive classification in under 30 seconds:
```python
from adaptive_classifier import AdaptiveClassifier

# 🎯 Step 1: Initialize with any HuggingFace model
classifier = AdaptiveClassifier("bert-base-uncased")

# 📝 Step 2: Add training examples
texts = ["The product works great!", "Terrible experience", "Neutral about this purchase"]
labels = ["positive", "negative", "neutral"]
classifier.add_examples(texts, labels)

# 🔮 Step 3: Make predictions
predictions = classifier.predict("This is amazing!")
print(predictions)
# Output: [('positive', 0.85), ('neutral', 0.12), ('negative', 0.03)]

# Save locally
classifier.save("./my_classifier")
loaded_classifier = AdaptiveClassifier.load("./my_classifier")

# 🤗 HuggingFace Hub Integration
classifier.push_to_hub("adaptive-classifier/my-model")
hub_classifier = AdaptiveClassifier.from_pretrained("adaptive-classifier/my-model")
```
```python
# Enable strategic classification for adversarial robustness
config = {'enable_strategic_mode': True}
strategic_classifier = AdaptiveClassifier("bert-base-uncased", config=config)

# Robust predictions against manipulation
predictions = strategic_classifier.predict("This product has amazing quality features!")
# Returns predictions that consider potential gaming attempts
```
```python
# Add a completely new class
new_texts = [
    "Error code 404 appeared",
    "System crashed after update"
]
new_labels = ["technical"] * 2
classifier.add_examples(new_texts, new_labels)

# Add more examples to existing classes
more_examples = [
    "Best purchase ever!",
    "Highly recommend this"
]
more_labels = ["positive"] * 2
classifier.add_examples(more_examples, more_labels)
```
```python
# Enable strategic mode to defend against adversarial inputs
config = {
    'enable_strategic_mode': True,
    'cost_function_type': 'linear',
    'cost_coefficients': {
        'sentiment_words': 0.5,    # Cost to change sentiment-bearing words
        'length_change': 0.1,      # Cost to modify text length
        'word_substitution': 0.3   # Cost to substitute words
    },
    'strategic_blend_regular_weight': 0.6,   # Weight for regular predictions
    'strategic_blend_strategic_weight': 0.4  # Weight for strategic predictions
}

classifier = AdaptiveClassifier("bert-base-uncased", config=config)
classifier.add_examples(texts, labels)

# Robust predictions that consider potential manipulation
text = "This product has amazing quality features!"

# Dual prediction (automatic blend of regular + strategic)
predictions = classifier.predict(text)

# Pure strategic prediction (simulates adversarial manipulation)
strategic_preds = classifier.predict_strategic(text)

# Robust prediction (assumes input may already be manipulated)
robust_preds = classifier.predict_robust(text)

print(f"Dual: {predictions}")
print(f"Strategic: {strategic_preds}")
print(f"Robust: {robust_preds}")
```
Detect when LLMs generate information not supported by provided context (51.54% F1, 80.68% recall):
```python
detector = AdaptiveClassifier.from_pretrained("adaptive-classifier/llm-hallucination-detector")

context = "France is in Western Europe. Capital: Paris. Population: ~67 million."
response = "Paris is the capital. Population is 70 million."  # Contains hallucination

prediction = detector.predict(f"Context: {context}\nAnswer: {response}")
# Returns: [('HALLUCINATED', 0.72), ('NOT_HALLUCINATED', 0.28)]
```
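For example, you can gate an LLM answer on the detector's confidence. Continuing from the snippet above (the 0.6 threshold is an illustrative choice, not a library default):

```python
# Gate the response on the detector's top prediction.
label, score = prediction[0]
if label == "HALLUCINATED" and score > 0.6:  # example threshold; tune per application
    print("Flag for human review or regenerate the answer")
else:
    print("Answer accepted")
```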
Optimize costs by routing queries to appropriate model tiers (32.40% cost savings):
```python
router = AdaptiveClassifier.from_pretrained("adaptive-classifier/llm-router")

query = "Write a function to calculate Fibonacci sequence"
predictions = router.predict(query)
# Returns: [('HIGH', 0.92), ('LOW', 0.08)]
# Route to GPT-4 for complex tasks, GPT-3.5 for simple ones
```
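A typical integration maps the router's labels to concrete model tiers. Continuing from the snippet above (the routing table below is a placeholder, not part of the library):

```python
# Hypothetical routing table; substitute the model tiers you actually deploy.
MODEL_TIERS = {"HIGH": "gpt-4", "LOW": "gpt-3.5-turbo"}

top_label, confidence = predictions[0]
chosen_model = MODEL_TIERS[top_label]
print(f"Routing to {chosen_model} (confidence: {confidence:.2f})")
```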
Automatically predict optimal LLM settings (temperature, top_p) for different query types:
```python
config_optimizer = AdaptiveClassifier.from_pretrained("adaptive-classifier/llm-config-optimizer")

query = "Explain quantum physics concepts"
predictions = config_optimizer.predict(query)
# Returns: [('BALANCED', 0.85), ('CREATIVE', 0.10), ...]
# Automatically suggests temperature range: 0.6-1.0 for balanced responses
```
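One way to consume these labels is a lookup table of sampling settings. In the sketch below, only the BALANCED range (0.6-1.0) comes from the description above; the other labels and values are illustrative:

```python
# Hypothetical mapping from predicted label to sampling parameters.
CONFIG_PRESETS = {
    "BALANCED": {"temperature": 0.8, "top_p": 0.95},  # mid of the suggested 0.6-1.0 range
    "CREATIVE": {"temperature": 1.1, "top_p": 0.98},  # example values
    "PRECISE": {"temperature": 0.2, "top_p": 0.90},   # example values
}

top_label = predictions[0][0]
settings = CONFIG_PRESETS.get(top_label, {"temperature": 0.7, "top_p": 0.95})
print(f"Suggested settings for '{top_label}': {settings}")
```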
Deploy enterprise-ready classifiers for various moderation tasks:
```python
# Available pre-trained enterprise classifiers:
classifiers = [
    "adaptive-classifier/content-moderation",        # Content safety
    "adaptive-classifier/business-sentiment",        # Business communications
    "adaptive-classifier/pii-detection",             # Privacy protection
    "adaptive-classifier/fraud-detection",           # Financial security
    "adaptive-classifier/email-priority",            # Email routing
    "adaptive-classifier/compliance-classification"  # Regulatory compliance
]

# Easy deployment
moderator = AdaptiveClassifier.from_pretrained("adaptive-classifier/content-moderation")
result = moderator.predict("User generated content here...")
```
💡 Pro Tip: All enterprise models support continuous adaptation; add your domain-specific examples to improve performance over time.
The Adaptive Classifier combines four key components in a unified architecture:
1. **Transformer Embeddings**: Uses state-of-the-art language models for text representation
2. **Prototype Memory**: Maintains class prototypes for quick adaptation to new examples
3. **Adaptive Neural Layer**: Learns refined decision boundaries through continuous training
4. **Strategic Classification**: Defends against adversarial manipulation using game-theoretic principles. When strategic mode is enabled, the system:
   - Models potential strategic behavior of users trying to game the classifier
   - Uses cost functions to represent the difficulty of manipulating different features
   - Combines regular predictions with strategic-aware predictions for robustness
   - Provides multiple prediction modes: dual (blended), strategic (simulates manipulation), and robust (anti-manipulation)
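Conceptually, the dual prediction mode is a weighted blend of the two heads. Here is a minimal sketch of that blending, assuming per-class score dicts and the default 0.6/0.4 weights from the configuration example above (an illustration, not the library's internal code):

```python
# Illustrative sketch of dual-mode blending (not the library's actual implementation).
def blend_predictions(regular, strategic, w_regular=0.6, w_strategic=0.4):
    """Blend per-class scores from the regular and strategic heads."""
    labels = set(regular) | set(strategic)
    blended = {
        label: w_regular * regular.get(label, 0.0) + w_strategic * strategic.get(label, 0.0)
        for label in labels
    }
    return sorted(blended.items(), key=lambda kv: kv[1], reverse=True)

# Example: the strategic head discounts an easily gamed "positive" signal.
print(blend_predictions({"positive": 0.9, "negative": 0.1},
                        {"positive": 0.6, "negative": 0.4}))
# [('positive', 0.78), ('negative', 0.22)]
```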
Traditional classification approaches face significant limitations when dealing with evolving requirements and adversarial environments. The Adaptive Classifier overcomes these limitations through:
- Dynamic class addition without full retraining
- Strategic robustness against adversarial manipulation
- Memory-efficient prototypes with FAISS optimization
- Zero downtime updates for production systems
- Game-theoretic defense mechanisms
The system evolves through distinct phases, each building upon previous knowledge without catastrophic forgetting:
The learning process includes:
- Initial Training: Bootstrap with basic classes
- Dynamic Addition: Seamlessly add new classes as they emerge
- Continuous Learning: Refine decision boundaries with EWC protection
- Strategic Enhancement: Develop robustness against manipulation
- Production Deployment: Full capability with ongoing adaptation
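In code, the phases above reduce to repeated `add_examples` calls against the same classifier instance. A compressed sketch (texts and labels are illustrative):

```python
from adaptive_classifier import AdaptiveClassifier

classifier = AdaptiveClassifier("bert-base-uncased")

# Phase 1 - Initial training: bootstrap with basic classes
classifier.add_examples(["Love it!", "Hated it."], ["positive", "negative"])

# Phase 2 - Dynamic addition: a new class appears after deployment
classifier.add_examples(["Error code 404 appeared"], ["technical"])

# Phase 3 - Continuous learning: refine existing classes; EWC protects prior knowledge
classifier.add_examples(["Best purchase ever!"], ["positive"])

print(classifier.predict("System crashed after update"))
```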
When using the adaptive classifier for true online learning (adding examples incrementally), be aware that the order in which examples are added can affect predictions. This is inherent to incremental neural network training.
```python
# These two scenarios may produce slightly different models:

# Scenario 1
classifier.add_examples(["fish example"], ["aquatic"])
classifier.add_examples(["bird example"], ["aerial"])

# Scenario 2
classifier.add_examples(["bird example"], ["aerial"])
classifier.add_examples(["fish example"], ["aquatic"])
```
While we've implemented sorted label ID assignment to minimize this effect, the neural network component still learns incrementally, which can lead to order-dependent behavior.
For applications requiring strict order independence, you can configure the classifier to rely solely on prototype-based predictions:
```python
# Configure to use only prototypes (order-independent)
config = {
    'prototype_weight': 1.0,  # Use only prototypes
    'neural_weight': 0.0      # Disable neural network contribution
}

classifier = AdaptiveClassifier("bert-base-uncased", config=config)
```
With this configuration:
- Predictions are based solely on similarity to class prototypes (mean embeddings)
- Results are completely order-independent
- Trade-off: accuracy may be slightly lower than with the hybrid approach

Recommendations:
- For maximum consistency: use the prototype-only configuration
- For maximum accuracy: accept some order dependency with the default hybrid approach
- For production systems: consider batching updates and retraining periodically if strict consistency is required (see the sketch below)
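A minimal batching pattern for that last recommendation, assuming a `classifier` instance from earlier and that you buffer incoming examples yourself:

```python
# Sketch: buffer streaming examples and apply them in one batched update,
# reducing the impact of arrival order on the neural head.
buffer_texts, buffer_labels = [], []

def queue_example(text, label, batch_size=32):
    buffer_texts.append(text)
    buffer_labels.append(label)
    if len(buffer_texts) >= batch_size:
        classifier.add_examples(buffer_texts, buffer_labels)  # one batched update
        buffer_texts.clear()
        buffer_labels.clear()
```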
- Model selection matters: Some models (e.g., `google-bert/bert-large-cased`) may produce poor embeddings for single words. For better results with short inputs, consider (see the example below):
  - `bert-base-uncased`
  - `sentence-transformers/all-MiniLM-L6-v2`
  - Or any model specifically trained for semantic similarity
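Swapping the backbone is just a different model id at construction time; for example, using a sentence-similarity model:

```python
from adaptive_classifier import AdaptiveClassifier

# Sentence-transformers models tend to embed short inputs more reliably.
classifier = AdaptiveClassifier("sentence-transformers/all-MiniLM-L6-v2")
classifier.add_examples(["great", "awful"], ["positive", "negative"])
print(classifier.predict("fantastic"))
```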
- OpenEvolve - Open-source evolutionary coding agent for algorithm discovery
- OptiLLM - Optimizing inference proxy with 20+ techniques for 2-10x accuracy improvements
- 🐛 Issues & Bug Reports: GitHub Issues
- 💬 Discussions: GitHub Discussions
- 📖 Documentation: API Reference
- 🛠️ Contributing: CONTRIBUTING.md
- Strategic Classification
- RouteLLM: Learning to Route LLMs with Preference Data
- Transformer^2: Self-adaptive LLMs
- Lamini Classifier Agent Toolkit
- Protoformer: Embedding Prototypes for Transformers
- Overcoming catastrophic forgetting in neural networks
- RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models
- LettuceDetect: A Hallucination Detection Framework for RAG Applications
If you use this library in your research, please cite:
```bibtex
@software{adaptive-classifier,
  title = {Adaptive Classifier: Dynamic Text Classification with Continuous Learning},
  author = {Asankhaya Sharma},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/codelion/adaptive-classifier}
}
```
Made with ❤️ by the Adaptive Classifier Team