
Merge pull request #27 from ittia-research/dev
Update docs and infra
etwk authored Sep 10, 2024
2 parents dcac942 + b1279d7 commit 80e6c3c
Showing 4 changed files with 144 additions and 96 deletions.
110 changes: 26 additions & 84 deletions README.md
@@ -2,20 +2,36 @@
True, false, or just opinions? Maybe not binary, but a percentage.

Fact-checking tools to combat disinformation.

## How It Works
Extract a list of statements from the given text.
For each statement, search via a search engine and read the top URLs.
Treat each hostname as one source, and extract the most related info from the content read.
For each source, generate one verdict and citation from the extracted content.
Combine all verdicts for one statement into a final verdict.
Return a list of statements with verdicts, citations, and other related info.
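The steps above can be sketched as a Python pipeline. The stage functions passed in (`extract_statements`, `search_urls`, `read_url`, `extract_related`, `judge`) are placeholders for illustration, not the project's actual API:

```python
# A sketch of the fact-checking pipeline described above. All stage
# functions are hypothetical placeholders, not the project's real names.

def check(text, extract_statements, search_urls, read_url, extract_related, judge):
    """Return one result per statement, each with one verdict per source (hostname)."""
    results = []
    for statement in extract_statements(text):
        # Group the top search results by hostname: one hostname = one source.
        by_host = {}
        for url in search_urls(statement):
            host = url.split("/")[2]
            by_host.setdefault(host, []).append(url)
        verdicts = []
        for host, urls in by_host.items():
            content = " ".join(read_url(u) for u in urls)
            related = extract_related(statement, content)  # keep only related info
            verdicts.append(judge(statement, related))     # one verdict per source
        results.append({"statement": statement, "verdicts": verdicts})
    return results
```

Each stage can then be swapped for a real implementation (LLM extraction, a search API, a reader) without changing the overall flow.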

## Get Started
Online demo: `https://check.ittia.net`
Online demo: https://check.ittia.net

Using existing API: https://github.com/ittia-research/check/tree/main/packages/ittia_check

Use the pip package `ittia-check` to connect to the API: https://github.com/ittia-research/check/tree/main/packages/ittia_check
### Self-hosting API Server
Main components:
- Check server: see docker-compose.yml
- LLM: any OpenAI compatible API, self-hosting via vllm or Ollama
- Embedding: self-hosting via Ollama or Infinity
- Rerank: self-hosting via Infinity
- Search: https://search.ittia.net

API docs: `https://check.ittia.net/docs`
### Other Tools
- Start a wiki_dpr retrieval server (ColBERTv2) for development: https://github.com/ittia-research/check/tree/main/datasets/wiki_dpr

### Search backend
- Using `search.ittia.net` for better optimization.
- API doc: `https://search.ittia.net/docs`
- Features:
- Customizable source count.
- Supports search sessions: streaming, resuming.
- Utilizes state-of-the-art search engine (currently Google).
- Using `search.ittia.net` for better optimization.
- Features:
- Customizable source count.
- Supports search sessions: streaming, resuming.
- Utilizes state-of-the-art search engine (currently Google).

## Design
Input something.
@@ -40,88 +56,14 @@
Input types:
Verdicts:
- false
- true
- tie: the counts of false and true verdicts are equal and above zero
- irrelevant: the processed context is irrelevant to the statement
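A minimal sketch of how these rules might combine per-source verdicts into a final one; the order in which the rules are checked is an assumption based on the list above, not the project's actual code:

```python
from collections import Counter

def final_verdict(source_verdicts):
    """Combine per-source verdicts ("true"/"false") into a final verdict (sketch)."""
    counts = Counter(source_verdicts)
    true_n, false_n = counts["true"], counts["false"]
    if true_n == 0 and false_n == 0:
        return "irrelevant"  # no source was relevant to the statement
    if true_n == false_n:
        return "tie"         # equal, non-zero counts of true and false
    return "true" if true_n > false_n else "false"
```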

## Todo
### Frontend
- [ ] API: input a string or URL, output analysis
- [ ] Optional more detailed output: correction, explanation, references

### Backend
- [ ] Get a list of facts from the input; improve performance
- [ ] Get search results for each fact and check whether it is true or false
- [ ] Get weight of facts and opinions
- [ ] Compare different search engines.
- [ ] Add support for URL input
- [ ] Performance benchmark.

LLM
- [ ] Better way to handle LLM output formatting: list, JSON.

Embedding:
- [ ] Optimize chunk size

Contexts
- [ ] Filter out unrelated contexts before sending for verdict

Retrieval
- [ ] Retrieve the latest info when facts might change

### Pipeline
DSPy:
- [ ] choose the right LLM temperature
- [ ] better training datasets

### Retrieval
- [ ] Better retrieval solution: high performance, concurrency, multiple index, index editable.
- [ ] Getting more sources when needed.

### Verdict
- [ ] Set final verdict standards.

### Toolchain
- [ ] Evaluate MLOps pipeline
- https://kitops.ml
- [ ] Evaluate data quality of searching and URL fetching. Better error handling.
- [ ] Use multiple sources for fact-check.

### Stability
- [ ] Stress test.

### Extend
- [ ] To other types of media: image, audio, video, etc.
- [ ] Shall we try to answer questions if provided?
- [ ] Multi-language support.
- [ ] Add logging and long-term memory.
- [ ] Integrate with other fact-check services.

### Calculate
- [ ] Shall we calculate the percentage of true and false in the input? Any better calculation than item counts?

### Logging
- [ ] Full logging on chain of events for re-producing and debugging.

### Checkout
- [ ] Chroma #retrieve

## Issues
- [ ] Uses many different types of models, which makes performance optimization and maintenance difficult.
- [ ] LLM verdicts sometimes wrongly contradict the provided context.

## References
### Reports
- [ ] AI-generated misinformation

### Fact-check
- https://www.snopes.com
- https://www.bmi.bund.de/SharedDocs/schwerpunkte/EN/disinformation/examples-of-russian-disinformation-and-the-facts.html

### Resources
Inference:
- https://console.groq.com/docs/ (free tier)

Search and fetch:
- https://jina.ai/read

## Acknowledgements
- TPU Research Cloud team at Google
- Google Search
44 changes: 35 additions & 9 deletions docker-compose.yml
@@ -8,15 +8,7 @@
services:
- ./infra/env.d/check
ports:
- 8000:8000
restart: always

ollama:
image: ollama/ollama
container_name: ollama
ports:
- "11434:11434"
volumes:
- /data/volumes/ollama:/root/.ollama
# Remove the GPU section below if not running inference locally
deploy:
resources:
reservations:
@@ -26,6 +18,21 @@
capabilities: [gpu]
restart: always

# Using vllm for LLM inference on Google TPU
# Tested on Google v4-8
vllm:
image: ittia/vllm:0.6.0-tpu
privileged: true
ports:
- "8010:8010"
shm_size: 128G
volumes:
- /mnt/cache:/root/.cache
env_file:
- ./env.d/huggingface
command: vllm serve mistralai/Mistral-Nemo-Instruct-2407 --tensor-parallel-size 4 --port 8010 --trust-remote-code --max-model-len 12288
restart: always

# Infinity supports embedding and rerank models; the v2 version supports serving multiple models
infinity:
image: michaelf34/infinity:latest
@@ -46,3 +53,22 @@
count: all
capabilities: [gpu]
restart: always

# Services below are not actively in use.
# Kept here for reference.

ollama:
image: ollama/ollama
container_name: ollama
ports:
- "11434:11434"
volumes:
- /data/volumes/ollama:/root/.ollama
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
restart: always
78 changes: 78 additions & 0 deletions docs/to-do.md
@@ -0,0 +1,78 @@
## Roadmap
- [ ] Check one line containing a single statement.
- [ ] Check long paragraphs or a content source (URL, file, etc.)
- [ ] What are the ultimate goals?
- [ ] Long-term memory and caching.
- [ ] Fact-check standards and database.

## Work
### Frontend
- [ ] API: input a string or URL, output analysis
- [ ] Optional more detailed output: correction, explanation, references

### Backend
- [ ] Get a list of facts from the input; improve performance
- [ ] Get search results for each fact and check whether it is true or false
- [ ] Get weight of facts and opinions
- [ ] Compare different search engines.
- [ ] Add support for URL input
- [ ] Performance benchmark.

LLM
- [ ] Better way to handle LLM output formatting: list, JSON.

Embedding:
- [ ] Optimize chunk size
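For reference, a common baseline against which chunk size could be tuned is fixed-size chunking with overlap; the sizes below are illustrative defaults, not project settings:

```python
def chunk(text, size=512, overlap=64):
    """Split text into fixed-size chunks with overlap (sketch; tune size/overlap)."""
    assert 0 <= overlap < size
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks
```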

Contexts
- [ ] Filter out unrelated contexts before sending for verdict

Retrieval
- [ ] Retrieve the latest info when facts might change

### Pipeline
DSPy
- [ ] choose the right LLM temperature
- [ ] better training datasets

### Retrieval
- [ ] Better retrieval solution: high performance, concurrency, multiple index, index editable.
- [ ] Getting more sources when needed.

### Verdict
- [ ] Set final verdict standards.

### Toolchain
- [ ] Evaluate MLOps pipeline: https://kitops.ml
- [ ] Evaluate data quality of searching and URL fetching. Better error handling.
- [ ] Use multiple sources for fact-check.

### Infra
- [ ] Stress test
- [ ] Meaningful health endpoint
- [ ] Monitoring service health

### Calculate
- [ ] Shall we calculate the percentage of true and false in the input? Any better calculation than item counts?
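One possible calculation for the question above: count only true/false items, with optional per-statement weights (e.g. by source count). This is an illustration of the open question, not a settled design:

```python
def truth_percentage(verdicts, weights=None):
    """Percentage of 'true' among true/false verdicts; ties and irrelevant
    items are excluded. Weights are an assumed alternative to raw counts."""
    weights = weights or [1.0] * len(verdicts)
    true_w = sum(w for v, w in zip(verdicts, weights) if v == "true")
    false_w = sum(w for v, w in zip(verdicts, weights) if v == "false")
    total = true_w + false_w
    return None if total == 0 else 100.0 * true_w / total
```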

### Logging
- [ ] Full logging on chain of events for re-producing and debugging.

### Issues
- [ ] Uses many different types of models, which makes performance optimization and maintenance difficult.
- [ ] LLM verdicts sometimes wrongly contradict the provided context.

### Data
- [ ] A standard for saving fact-check-related data.
- [ ] Store fact-check data according to standards.

### Research
- [ ] Chroma #retrieve
- [ ] AI-generated misinformation

### Extend
- [ ] To other types of media: image, audio, video, etc.
- [ ] Shall we try to answer questions if provided?
- [ ] Multi-language support.
- [ ] Add logging and long-term memory.
- [ ] Integrate with other fact-check services.
8 changes: 5 additions & 3 deletions infra/env.d/infinity
@@ -1,4 +1,6 @@
INFINITY_API_KEY=<CHANGE_ME>
INFINITY_LOG_LEVEL=trace
INFINITY_MODEL_ID=jinaai/jina-reranker-v2-base-multilingual

INFINITY_LOG_LEVEL=debug
INFINITY_MODEL_ID="jinaai/jina-embeddings-v2-base-en;jinaai/jina-reranker-v2-base-multilingual;"
INFINITY_MODEL_WARMUP=false
# batch size: small to save VRAM, big to improve performance
INFINITY_BATCH_SIZE=8
