Setting up sentiment and emotion analysis for Discourse AI posts via self-hosted HuggingFace Text Embeddings Inference (TEI).
What This Is
Sentiment analysis in Discourse AI does not use a chat/completion LLM. Under the hood, two small RoBERTa classification models (~125M parameters each) run through HuggingFace TEI. The model names are hardcoded in Discourse's SQL dashboard queries, so they cannot be changed.
Source: Self-Hosting Sentiment and Emotion for DiscourseAI (Falco, Discourse team).
| Model | model_name (exactly as in code) | Purpose |
|---|---|---|
| Sentiment | cardiffnlp/twitter-roberta-base-sentiment-latest | positive / negative / neutral |
| Emotion | SamLowe/roberta-base-go_emotions | 28 emotions (joy, anger, surprise…) |
API format: POST {"inputs": "text", "truncate": true} → array [{"label": "...", "score": 0.95}, ...]
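To make the response format concrete, here is a minimal sketch that picks the highest-scoring label out of a response body. It assumes the flat array-of-objects shape shown above and relies only on `tr`, `sed`, `sort`, `head`, and `cut`. The function name `top_label` and the sample scores are hypothetical, and a real JSON parser would be more robust.

```bash
# top_label: read a classification response on stdin and print the label
# with the highest score. Hypothetical helper; assumes each object holds
# one "label" string and one numeric "score", in either order.
top_label() {
  tr '}' '\n' |                      # split into one JSON object per line
    while IFS= read -r obj; do
      label=$(printf '%s\n' "$obj" | sed -n 's/.*"label" *: *"\([^"]*\)".*/\1/p')
      score=$(printf '%s\n' "$obj" | sed -n 's/.*"score" *: *\([-+0-9.eE]*\).*/\1/p')
      [ -n "$label" ] && [ -n "$score" ] && printf '%s %s\n' "$score" "$label"
    done |
    sort -g -r |                     # general-numeric sort, highest score first
    head -n 1 |
    cut -d' ' -f2-
}

# Example with a made-up response body (scores are illustrative only)
sample='[{"label":"positive","score":0.96},{"label":"neutral","score":0.03},{"label":"negative","score":0.01}]'
printf '%s' "$sample" | top_label
```

For the sample body this prints `positive`. Against a running container, the same function can be fed from `curl -s ... | top_label`.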
Special Note: The cardiffnlp model lacks tokenizer.json
TEI requires tokenizer.json, but cardiffnlp/twitter-roberta-base-sentiment-latest does not ship one (it uses the older format: vocab.json + merges.txt). Solution: download the model files locally and add tokenizer.json from SamLowe/roberta-base-go_emotions, which uses the same RoBERTa-base tokenizer.
Preparation (one-time)
```bash
# Create a local directory for the sentiment model files
sudo mkdir -p /opt/tei-sentiment-cache/model
cd /opt/tei-sentiment-cache/model
# Download the model files from cardiffnlp (-f makes curl fail on HTTP errors instead of saving an error page)
for f in config.json vocab.json merges.txt special_tokens_map.json pytorch_model.bin; do
  sudo curl -fsSL -o "$f" "https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest/resolve/main/$f"
done
# Take the tokenizer files from SamLowe (same RoBERTa-base tokenizer)
sudo curl -fsSL -o tokenizer.json "https://huggingface.co/SamLowe/roberta-base-go_emotions/resolve/main/tokenizer.json"
sudo curl -fsSL -o tokenizer_config.json "https://huggingface.co/SamLowe/roberta-base-go_emotions/resolve/main/tokenizer_config.json"
```
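A failed or partial download is easy to miss, so it can help to confirm that every expected file exists and is non-empty before starting the container. The following is a minimal sketch; `check_model_dir` is a hypothetical helper, and its file list simply mirrors the downloads above.

```bash
# check_model_dir: report missing or empty files in a prepared model directory.
#   $1 = directory to check
check_model_dir() {
  dir=$1
  status=0
  for f in config.json vocab.json merges.txt special_tokens_map.json \
           pytorch_model.bin tokenizer.json tokenizer_config.json; do
    if [ ! -s "$dir/$f" ]; then      # -s: file exists and is larger than 0 bytes
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (path taken from the preparation step):
# check_model_dir /opt/tei-sentiment-cache/model && echo "model directory looks complete"
```

This only checks presence and size; it does not validate the contents of the files.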
Option A: GPU
Image: ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3
The standard `:latest` (and `:1.9`) tag is compiled for compute cap 80 (Ampere) and does not work on Blackwell (RTX 50x0, compute cap 120). Use `cuda-1.9.3` specifically.
```bash
docker pull ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3
sudo mkdir -p /opt/tei-emotion-cache
# Sentiment: serves the locally prepared model directory on port 8081
docker run -d --name tei-sentiment --gpus all --shm-size 1g -p 8081:80 -v /opt/tei-sentiment-cache/model:/data/model --restart unless-stopped ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3 --model-id /data/model
# Emotion: loads the model by its Hub ID, with /opt/tei-emotion-cache mounted at /data, on port 8082
docker run -d --name tei-emotion --gpus all --shm-size 1g -p 8082:80 -v /opt/tei-emotion-cache:/data --restart unless-stopped ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3 --model-id SamLowe/roberta-base-go_emotions
```
First run on Blackwell: CUDA kernel JIT-compilation takes ~5 minutes per container. This is one-time only.
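Because the first start can take several minutes, a script that depends on the containers may want to wait until they answer. Below is a minimal retry sketch; `wait_until` is a hypothetical helper, and the usage lines assume TEI's `/health` route is reachable on the ports mapped above.

```bash
# wait_until: run a probe command repeatedly until it succeeds.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = probe command
# Hypothetical helper, not part of TEI or Docker.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0                       # probe succeeded
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1                           # gave up after the last attempt
}

# Usage: poll each container for up to 10 minutes (60 attempts x 10 s).
# wait_until 60 10 curl -fs http://localhost:8081/health && echo "sentiment ready"
# wait_until 60 10 curl -fs http://localhost:8082/health && echo "emotion ready"
```

The attempt count and delay are arbitrary choices; adjust them to the startup time you actually observe.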
GPU Performance (RTX 5060 Ti)
| Metric | Value |
|---|---|
| Sentiment inference | ~14ms |
| Emotion inference | ~60ms |
| VRAM per container | ~428 MB |
| VRAM for both | ~856 MB |
Option B: CPU (fallback)
Image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.9
Useful if GPU is unavailable or VRAM is insufficient. Does not require NVIDIA drivers.
```bash
docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.9
sudo mkdir -p /opt/tei-emotion-cache
docker run -d --name tei-sentiment --shm-size 1g -p 8081:80 -v /opt/tei-sentiment-cache/model:/data/model --restart unless-stopped ghcr.io/huggingface/text-embeddings-inference:cpu-1.9 --model-id /data/model --dtype float32
docker run -d --name tei-emotion --shm-size 1g -p 8082:80 -v /opt/tei-emotion-cache:/data --restart unless-stopped ghcr.io/huggingface/text-embeddings-inference:cpu-1.9 --model-id SamLowe/roberta-base-go_emotions --dtype float32
```

When the load average (LA) is high, you can limit CPU usage:

```bash
docker update --cpus=0.1 tei-sentiment tei-emotion
```
CPU Performance
| Metric | Value |
|---|---|
| Sentiment inference | ~270ms |
| Emotion inference | ~205ms |
| RAM per container | ~500 MB |
| With --cpus=0.1 | ~2-3s per post |
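As a rough worked example of what these latencies mean for throughput, the sketch below converts a per-post latency into an upper bound on posts per hour. It assumes posts are classified strictly one at a time, which is a simplification and not something the source states; `posts_per_hour` is a hypothetical helper.

```bash
# posts_per_hour: upper bound on sequential throughput for a given latency.
#   $1 = latency per post in milliseconds (integer)
posts_per_hour() {
  echo $(( 3600 * 1000 / $1 ))       # milliseconds in an hour / latency
}

posts_per_hour 2000   # ~2 s per post with --cpus=0.1
posts_per_hour 3000   # ~3 s per post with --cpus=0.1
```

This prints 1800 and 1200. Under the one-at-a-time assumption, a `--cpus=0.1` limit would therefore fall below the ~2500 posts/hour backfill rate listed under Dashboards.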
Switching GPU ↔ CPU
```bash
docker stop tei-sentiment tei-emotion
docker rm tei-sentiment tei-emotion
```

Then start the containers with the `docker run` commands from the option you are switching to. The Discourse settings do not need to change, because the endpoints stay the same.
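The switch can be sketched as one helper that prints the full command sequence for a given mode. `print_tei_commands` is a hypothetical function: it only assembles the commands already shown in Options A and B (with the flags in a slightly different order) and never executes anything itself.

```bash
# print_tei_commands: print the docker commands for switching to a mode.
#   $1 = "gpu" or "cpu"
print_tei_commands() {
  repo=ghcr.io/huggingface/text-embeddings-inference
  case $1 in
    gpu) image="${repo}:cuda-1.9.3"; pre='--gpus all '; post='' ;;
    cpu) image="${repo}:cpu-1.9";    pre='';            post=' --dtype float32' ;;
    *)   echo "usage: print_tei_commands gpu|cpu" >&2; return 2 ;;
  esac
  common='--shm-size 1g --restart unless-stopped'
  echo "docker stop tei-sentiment tei-emotion"
  echo "docker rm tei-sentiment tei-emotion"
  echo "docker run -d --name tei-sentiment ${pre}${common} -p 8081:80 -v /opt/tei-sentiment-cache/model:/data/model ${image} --model-id /data/model${post}"
  echo "docker run -d --name tei-emotion ${pre}${common} -p 8082:80 -v /opt/tei-emotion-cache:/data ${image} --model-id SamLowe/roberta-base-go_emotions${post}"
}

# Review the output first, then pipe it to a shell to apply it:
# print_tei_commands cpu
# print_tei_commands cpu | sh
```

Printing instead of executing keeps the helper safe to try and makes the exact commands visible before anything is stopped or removed.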
Verification
```bash
# Sentiment endpoint
curl -s http://localhost:8081/ -X POST -H 'Content-Type: application/json' -d '{"inputs": "I am happy"}'
# Emotion endpoint
curl -s http://localhost:8082/ -X POST -H 'Content-Type: application/json' -d '{"inputs": "I am happy"}'
```
Expected sentiment response: [{"label":"positive","score":0.96},...]
Discourse Configuration
In /admin/plugins/discourse-ai/settings?filter=sentiment:
- discourse_ai_enabled = true
- ai_sentiment_enabled = true
- ai_sentiment_model_configs — two objects:
| Field | Model 1 | Model 2 |
|---|---|---|
| model_name | cardiffnlp/twitter-roberta-base-sentiment-latest | SamLowe/roberta-base-go_emotions |
| endpoint | http://<your-host>:8081 | http://<your-host>:8082 |
| api_key | (empty) | (empty) |
Dashboards
- `/admin/reports/overall_sentiment`: overall sentiment (positive - negative)
- `/admin/reports/emotion_joy` (and the other 27 emotions)
- Backfill: ~2500 posts/hour, posts not older than 60 days
Conditions and Risks
- Models are trained on English. For Russian text, results are approximate, but basic sentiment works.
- The endpoints are open without an API key; for production, put them behind a reverse proxy.
- VRAM monitoring:
  ```bash
  nvidia-smi --query-compute-apps=pid,name,used_memory --format=csv,noheader
  ```
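To turn the monitoring output into a single number, the per-process values can be summed. This is a minimal sketch that assumes the CSV layout produced by the command above (pid, name, memory with a `MiB` suffix); `sum_vram_mib` is a hypothetical helper.

```bash
# sum_vram_mib: total the used_memory column of the nvidia-smi CSV output.
# Reads lines such as "1234, some-process, 428 MiB" on stdin.
sum_vram_mib() {
  # awk converts "428 MiB" to the number 428 when it is used in arithmetic
  awk -F', *' '{ total += $3 } END { print total + 0 }'
}

# Usage on a GPU host:
# nvidia-smi --query-compute-apps=pid,name,used_memory --format=csv,noheader | sum_vram_mib
```

The sum covers every compute process on the GPU, not only the two TEI containers, so other GPU workloads on the same host will be included.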