Replies: 3 comments
-
Do you mean parsing speed in terms of tokens per second?
-
Great question on benchmarking! Here are some pointers from our experience at RevolutionAI (https://revolutionai.io).

Key metrics to track:
Tools we have used:
Performance tip: if you are seeing slow TTFT (time to first token), look into caching strategies. A recent paper called RAGCache shows a 4x TTFT improvement by caching intermediate embedding states; the key insight is storing retrieved knowledge in a tree structure across GPU and host memory.

For document size impact, we typically see:
Happy to share more details if helpful!
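The TTFT tip above can be checked with a minimal harness. This is a sketch, assuming a streaming generate call that yields tokens as they arrive; `stream_fn` is a hypothetical interface, not a RAGFlow API:

```python
import time

def measure_ttft(stream_fn, prompt):
    """Measure time-to-first-token for a streaming generate call.

    stream_fn is assumed to take a prompt and return an iterator of
    tokens (a hypothetical interface for illustration).
    """
    start = time.time()
    for _token in stream_fn(prompt):
        # Return as soon as the first token arrives.
        return time.time() - start
    return None  # the stream produced no tokens
```

Running this before and after enabling a cache gives a direct before/after TTFT comparison.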
-
Performance benchmarks are essential! At RevolutionAI (https://revolutionai.io) we benchmark RAG systems extensively.

Key metrics:
Benchmark setup:

```python
import time

import numpy as np

def benchmark_query(ragflow, queries, k=10):
    # Collect per-query search latencies.
    latencies = []
    for q in queries:
        start = time.time()
        results = ragflow.search(q, top_k=k)
        latencies.append(time.time() - start)
    return {
        "p50": np.percentile(latencies, 50),
        "p99": np.percentile(latencies, 99),
        "qps": len(queries) / sum(latencies),
    }
```

Variables to test:
Would love to see official benchmarks!
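To tie this back to the original tokens/sec question, a throughput variant of the same harness could look like the sketch below. It assumes `search_fn` returns a list of text chunks for each query, and uses whitespace splitting as a crude stand-in for a real tokenizer (both assumptions, not RAGFlow APIs):

```python
import time

def benchmark_throughput(search_fn, queries):
    """Rough tokens/sec estimate: whitespace tokens returned per second.

    search_fn is assumed to take a query string and return a list of
    text chunks (hypothetical interface); swap in a real tokenizer for
    accurate token counts.
    """
    tokens, elapsed = 0, 0.0
    for q in queries:
        start = time.time()
        chunks = search_fn(q)
        elapsed += time.time() - start
        # Count tokens in all returned chunks via whitespace split.
        tokens += sum(len(c.split()) for c in chunks)
    return tokens / elapsed if elapsed > 0 else 0.0
```

Running this against corpora built from documents of different sizes would give the tokens/sec-per-document-size numbers the question asks about.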
-
Hi! I was wondering how I can benchmark the performance of RAGFlow for a given deployment in terms of tokens/sec for given document sizes. Has anyone done something like this? Any pointers or help will be much appreciated.