AI Agent "Forum Search": Setting up semantic search and embeddings

What’s Done

An AI agent named “Forum Search” (ID=5) has been created for semantic search across forum materials from discuss.rabkesov.ru, with the ability to enrich results from the internet.

Agent

Tools

Tool       | Purpose                                        | Parameters
Search     | Semantic + keyword search                      | max_results=20, search_private=true
Read       | Read full topic content                        | read_private=true
Researcher | Deep analysis/synthesis across multiple topics | max_results=10, LLM=qwen3-VL-8b
WebBrowser | Browse web pages (for web enrichment)          | -

Behavior

  1. Searches only within the forum for each query (Search + Read + Researcher)
  2. Cites every fact with a link to the topic/post
  3. At the end of the response, asks: “Would you like to enrich the results with information from the internet?”
  4. Upon confirmation, uses WebBrowser to visit priority websites:
  5. Responds in the language of the question

How to Use

Embedding Model Replacement: nomic v1.5 → v2-moe

Problem

The model nomic-embed-text-v1.5 groups texts by language, not by meaning. Test:

Pair                                                       | Cosine Similarity
RU “vLLM setup” ←→ EN “vLLM config” (same topic)           | 0.634
RU “vLLM setup” ←→ RU “borscht recipe” (different topics)  | 0.650
Gap (semantic separation)                                  | -0.016

A Russian text about vLLM scores closer to a Russian borscht recipe than to an English text about vLLM. With this model, semantic search on the Russian-language forum did not work correctly.
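
The test above can be reproduced with a few lines of Python. This is a minimal sketch, not the script that produced the numbers in this note: it assumes the embedding model is served through LM Studio's OpenAI-compatible endpoint on the default local port, and the helper names (embed, cosine_similarity, semantic_gap, language_gap) are made up for illustration.

```python
import json
import math
import urllib.request

# Assumption: LM Studio's OpenAI-compatible server on its default local port.
EMBEDDINGS_URL = "http://localhost:1234/v1/embeddings"


def embed(text, model, url=EMBEDDINGS_URL):
    """Request one embedding vector from an OpenAI-compatible endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_gap(same_topic, different_topic):
    """Positive when the same-topic pair scores higher than the off-topic pair."""
    return same_topic - different_topic


def language_gap(model, anchor, same_topic_other_lang, other_topic_same_lang):
    """Embed three texts and return the gap for the cross-language test."""
    v_anchor = embed(anchor, model)
    same = cosine_similarity(v_anchor, embed(same_topic_other_lang, model))
    diff = cosine_similarity(v_anchor, embed(other_topic_same_lang, model))
    return semantic_gap(same, diff)
```

A negative result from language_gap means the model ranks the same-language, off-topic text above the cross-language, on-topic one, which is the failure described above. The model argument should be whatever identifier the server reports for the embedding model.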

Solution

Switched to nomic-embed-text-v2-moe (MoE, 8x277M, 512 MB Q8_0, 100+ languages). Results:

Pair                                          | v1.5   | v2-moe
RU “vLLM” ←→ EN “vLLM” (same topic)           | 0.634  | 0.924
RU “vLLM” ←→ RU “borscht” (different topics)  | 0.650  | 0.163
Gap                                           | -0.016 | +0.761

Final Embedding Settings

Parameter                                   | Value
ai_embeddings_enabled                       | true
ai_embeddings_selected_model                | 12 (nomic-embed-text-v2-moe)
ai_embeddings_semantic_search_enabled       | true
ai_embeddings_semantic_search_use_hyde      | true
ai_embeddings_semantic_search_hyde_agent    | -32 (“Content Author”)
ai_embeddings_semantic_quick_search_enabled | true

The model nomic-embed-text-v2-moe must be loaded permanently in LM Studio (~488 MB VRAM). Backfilling embeddings is automatically triggered via Sidekiq after switching the model.
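
Since search depends on the model being available in LM Studio, a quick availability check can help. The sketch below assumes LM Studio's OpenAI-compatible /v1/models endpoint on the default local port. The function names are hypothetical, and the exact model id reported by the server may differ from the name used in this note, so the check matches by substring. Depending on LM Studio settings, the list may also include downloaded models that are not currently in memory, so treat a positive result as "known to the server" rather than "loaded".

```python
import json
import urllib.request

# Assumption: LM Studio's OpenAI-compatible server on its default local port.
MODELS_URL = "http://localhost:1234/v1/models"


def listed_model_ids(models_response):
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in models_response.get("data", [])]


def is_model_listed(models_response, name):
    """True if any listed id contains `name` (ids may carry a prefix)."""
    return any(name in model_id for model_id in listed_model_ids(models_response))


def fetch_models(url=MODELS_URL):
    """GET the model list from the server and parse it as JSON."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

Typical use would be is_model_listed(fetch_models(), "nomic-embed-text-v2-moe").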

Limitation: Web Search

The Google tool (full web search) is unavailable — ai_google_custom_search_api_key is not configured. The agent uses WebBrowser to visit specific URLs. For full internet search, you need:

LLM Server Logs

Path: ~/.lmstudio/server-logs/YYYY-MM/YYYY-MM-DD.N.log

  • One folder per month, with one or more log files per day inside it
  • N = rotation number for the day
  • Example: ~/.lmstudio/server-logs/2026-03/2026-03-30.1.log
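
The path pattern above can be built programmatically. This is a small sketch with hypothetical helper names. It assumes the rotation number is a plain integer and defaults it to 1, as in the example path.

```python
from datetime import date
from pathlib import Path

# Log root as given in this note.
LOG_ROOT = Path.home() / ".lmstudio" / "server-logs"


def server_log_path(day, rotation=1, root=LOG_ROOT):
    """Build <root>/YYYY-MM/YYYY-MM-DD.N.log for the given day and rotation N."""
    month_dir = day.strftime("%Y-%m")
    file_name = f"{day.isoformat()}.{rotation}.log"
    return root / month_dir / file_name


def logs_for_day(day, root=LOG_ROOT):
    """Existing log files for the given day, sorted by rotation number N."""
    month_dir = root / day.strftime("%Y-%m")
    files = month_dir.glob(f"{day.isoformat()}.*.log")
    # Assumes N is always an integer; the name splits into date, N, "log".
    return sorted(files, key=lambda p: int(p.name.split(".")[-2]))
```

For example, server_log_path(date(2026, 3, 30)) points at the 2026-03/2026-03-30.1.log file under the log root.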