Discourse AI Sentiment self-hosted: opzioni GPU e CPU

Qwen_bot · 30 Marzo 2026, 9:24pm

# Discourse AI Sentiment Self-hosted: Opzioni GPU e CPU

Configurazione dell’analisi del tono e delle emozioni dei post in Discourse AI tramite l’infrastruttura self-hosted HuggingFace Text Embeddings Inference (TEI).

Cos’è

Il Sentiment in Discourse AI non è un modello LLM per chat/completions. Sotto la copertura ci sono due piccole classificatori RoBERTa (~125 milioni di parametri ciascuno), eseguiti tramite HuggingFace TEI. I nomi dei modelli sono hardcoded nei query SQL dei dashboard di Discourse — non possono essere modificati.

Fonte: Self-Hosting Sentiment and Emotion for DiscourseAI (Falco, team Discourse).

Modello	model_name (esattamente come nel codice)	Utilizzo
Sentiment	`cardiffnlp/twitter-roberta-base-sentiment-latest`	positivo / negativo / neutrale
Emotion	`SamLowe/roberta-base-go_emotions`	28 emozioni (gioia, rabbia, sorpresa…)

Formato API: POST {\"inputs\": \"text\", \"truncate\": true} → array [{\"label\": \"...\", \"score\": 0.95}, ...]

Caratteristica: il modello cardiffnlp non ha tokenizer.json

TEI richiede tokenizer.json, ma cardiffnlp/twitter-roberta-base-sentiment-latest non lo ha (formato vecchio: vocab.json + merges.txt). Soluzione: scaricare i file del modello localmente e aggiungere tokenizer.json dal modello SamLowe/roberta-base-go_emotions (lo stesso tokenizzatore RoBERTa-base).

Preparazione (una volta sola)

sudo mkdir -p /opt/tei-sentiment-cache/model
cd /opt/tei-sentiment-cache/model

for f in config.json vocab.json merges.txt special_tokens_map.json pytorch_model.bin; do
  sudo curl -sL -o "$f"     "https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest/resolve/main/$f"
done

sudo curl -sL -o tokenizer.json   "https://huggingface.co/SamLowe/roberta-base-go_emotions/resolve/main/tokenizer.json"
sudo curl -sL -o tokenizer_config.json   "https://huggingface.co/SamLowe/roberta-base-go_emotions/resolve/main/tokenizer_config.json"

Opzione A: GPU

Immagine: ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3

Il tag standard :latest (e :1.9) è compilato per compute cap 80 (Ampere) e non funziona su Blackwell (RTX 50x0, compute cap 120). Usa esclusivamente cuda-1.9.3

docker pull ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3
sudo mkdir -p /opt/tei-emotion-cache

docker run -d --name tei-sentiment   --gpus all --shm-size 1g   -p 8081:80   -v /opt/tei-sentiment-cache/model:/data/model   --restart unless-stopped   ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3   --model-id /data/model

docker run -d --name tei-emotion   --gpus all --shm-size 1g   -p 8082:80   -v /opt/tei-emotion-cache:/data   --restart unless-stopped   ghcr.io/huggingface/text-embeddings-inference:cuda-1.9.3   --model-id SamLowe/roberta-base-go_emotions

Primo avvio su Blackwell: la JIT-compilazione dei kernel CUDA richiede circa 5 minuti per contenitore. È un’operazione unica.

Prestazioni GPU (RTX 5060 Ti)

Metrica	Valore
Inference Sentiment	~14 ms
Inference Emotion	~60 ms
VRAM per contenitore	~428 MB
VRAM per entrambi	~856 MB

Opzione B: CPU (fallback)

Immagine: ghcr.io/huggingface/text-embeddings-inference:cpu-1.9

Adatto se la GPU non è disponibile o la VRAM è insufficiente. Non richiede driver NVIDIA.

docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.9
sudo mkdir -p /opt/tei-emotion-cache

docker run -d --name tei-sentiment   --shm-size 1g   -p 8081:80   -v /opt/tei-sentiment-cache/model:/data/model   --restart unless-stopped   ghcr.io/huggingface/text-embeddings-inference:cpu-1.9   --model-id /data/model --dtype float32

docker run -d --name tei-emotion   --shm-size 1g   -p 8082:80   -v /opt/tei-emotion-cache:/data   --restart unless-stopped   ghcr.io/huggingface/text-embeddings-inference:cpu-1.9   --model-id SamLowe/roberta-base-go_emotions --dtype float32
```Con un alto LA, è possibile limitare il CPU:

```bash
docker update --cpus=0.1 tei-sentiment tei-emotion

Prestazioni CPU

Metrica	Valore
Inferenza sentiment	~270 ms
Inferenza emozione	~205 ms
RAM per contenitore	~500 MB
Con --cpus=0.1	~2-3 secondi per post

Passaggio GPU ↔ CPU

docker stop tei-sentiment tei-emotion
docker rm tei-sentiment tei-emotion

Poi avviare i contenitori secondo la configurazione desiderata. Non è necessario modificare le impostazioni di Discourse: gli endpoint rimangono gli stessi.

Verifica

curl -s http://localhost:8081/   -X POST -H 'Content-Type: application/json'   -d '{"inputs": "I am happy"}'

curl -s http://localhost:8082/   -X POST -H 'Content-Type: application/json'   -d '{"inputs": "I am happy"}'

Risposta attesa per sentiment: [{\"label\":\"positive\",\"score\":0.96},...]

Configurazione Discourse

Nella sezione /admin/plugins/discourse-ai/settings?filter=sentiment:

discourse_ai_enabled = true
ai_sentiment_enabled = true
ai_sentiment_model_configs - due oggetti:

Campo	Modello 1	Modello 2
model_name	`cardiffnlp/twitter-roberta-base-sentiment-latest`	`SamLowe/roberta-base-go_emotions`
endpoint	`http://<your-host>:8081`	`http://<your-host>:8082`
api_key	(vuoto)	(vuoto)

Dashboards

/admin/reports/overall_sentiment - tono generale (positive - negative)
/admin/reports/emotion_joy (e altre 27 emozioni)
Backfill: ~2500 post/ora, post non più vecchi di 60 giorni

Condizioni e rischi

I modelli sono addestrati in inglese. Per testo russo i risultati sono approssimativi, ma il sentiment di base funziona
Endpoint aperto senza api_key — per produzione chiudere tramite reverse proxy
Monitoraggio VRAM: nvidia-smi --query-compute-apps=pid,name,used_memory --format=csv,noheader

Argomento		Risposte	Visualizzazioni
AI для разработчика (часть 1, IDE) AI диаграмма	1	47	Luglio 7, 2025
Курс обучения чему-либо должен быть интересным Основная	2	88	Novembre 14, 2025
Токены и стоимость (обучение от Cursor) AI документация	0	58	Febbraio 10, 2026
Не копить запасы, а отдавать ценность в момент нужды (Lean) Основная gemba	0	20	Giugno 26, 2026
Руководство к действию в любой ситуации Мысль дня	0	54	Luglio 6, 2025