Article updated 2025.11.10, see comment.
Introduction
In this article, I will explain how to deploy LM Studio as a service on Ubuntu 25.04 (the steps may also work on other versions).
Running LM Studio as a server not only lets you load language models and work with them directly, but also exposes an API that external services can connect to. You can return to local use at any time by stopping the service and launching the application as usual.
You will be able to:
- work with different language models (loaded on demand, unloaded when idle)
- connect your applications or plugins to the API
Download
Download the AppImage (the Linux build) from the LM Studio website:
Place the file in the ~/llm directory and make it executable:
chmod +x ~/llm/LM-Studio-0.3.27-4-x64.AppImage
At the time of writing, this version was current.
headless
Deploying as a service is needed when you work with the machine remotely. Decide right away which case applies:
a) the server (PC, laptop, etc.) has a graphical environment and the user has already logged into the desktop shell at least once, or you work in the UI anyway. In that case, skip this step.
b) the server has no graphical environment, or the user never logs into the UI. In that case you will need a couple of additional commands, executed once:
sudo loginctl enable-linger $USER
this enables a persistent systemd user session ("lingering"), so user services can run without anyone logging into a graphical interface
loginctl show-user $USER | grep Linger
this command shows whether lingering is enabled (substitute the actual username for $USER if checking another account); expected output:
Linger=yes
Commands to check that nothing is interfering (a readable status such as running or degraded is fine; errors are not):
systemctl --user status
systemctl --user is-system-running
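The checks above can be combined into one quick sketch that degrades gracefully when a tool or user session is unavailable (the fallback messages are my own):

```shell
# Verify headless prerequisites: lingering enabled, user manager healthy.
# Each command falls back to a note instead of failing hard.
loginctl show-user "$USER" 2>/dev/null | grep Linger \
  || echo "lingering not reported (loginctl unavailable or no user record)"
systemctl --user is-system-running 2>/dev/null \
  || echo "user manager not reachable"
```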
systemd
If you have not opted for headless mode (no graphics) after all, skip this step.
But if autostart is what you are after, create a unit file:
~/.config/systemd/user/lm-studio.service
And let's assume that your LM Studio executable is located at
~/llm/LM-Studio-0.3.27-4-x64.AppImage
(inside the unit this path is written with %h, the systemd specifier for the user's home directory)
content:
[Unit]
Description=LM Studio Service
After=network.target
[Service]
Type=simple
ExecStart=/usr/bin/xvfb-run -a --server-args="-screen 0 1920x1080x24" %h/llm/LM-Studio-0.3.27-4-x64.AppImage --run-as-service
ExecStartPost=/bin/bash -c 'sleep 10 && exec lms server start'
Restart=always
RestartSec=10
Environment=PATH=%h/.local/bin:/usr/local/bin:/usr/bin:/bin:%h/.lmstudio/bin
Environment=DISPLAY=:99
WorkingDirectory=%h/llm
[Install]
WantedBy=default.target
The unit does two things:
- launches the application
- starts the server component
In essence, such a unit is fragile, because its two parts know nothing about each other's state. Do not use solutions like this in production. (For production, skip LM Studio entirely; vLLM is much faster.)
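A somewhat more robust variant (a sketch, untested here; the unit file names are my own) splits the two parts into separate user units with an explicit dependency, so systemd tracks each one's state individually:

```ini
# ~/.config/systemd/user/lm-studio-app.service  (hypothetical name)
[Unit]
Description=LM Studio Application (headless)
After=network.target

[Service]
ExecStart=/usr/bin/xvfb-run -a --server-args="-screen 0 1920x1080x24" %h/llm/LM-Studio-0.3.27-4-x64.AppImage --run-as-service
Restart=always
RestartSec=10

[Install]
WantedBy=default.target

# ~/.config/systemd/user/lm-studio-api.service  (hypothetical name)
[Unit]
Description=LM Studio API Server
Requires=lm-studio-app.service
After=lm-studio-app.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Crude wait for the application to come up; still a race, just a smaller one.
ExecStartPre=/bin/sleep 10
ExecStart=%h/.lmstudio/bin/lms server start

[Install]
WantedBy=default.target
```

This still polls with a sleep rather than a readiness check, so it remains a sketch rather than a production-grade setup.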
Note that a virtual screen is emulated here (the AppImage needs an X display even without a monitor), so you additionally need to install xvfb:
sudo apt update && sudo apt install xvfb
Please note that in this example the unit is installed as a systemd user service, not as a system (root) one.
Perform the standard operations for a startup unit (apply changes and enable autostart):
systemctl --user daemon-reload
systemctl --user enable lm-studio.service
Launch
systemctl --user status lm-studio.service
At this point the service should be inactive, since it has not been started yet.
systemctl --user start lm-studio.service
and now check the server status:
lms server status
it should be listening on port 1234.
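If lms server status shows nothing useful, the service journal is the first place to look. A quick sketch using standard systemd tooling (the fallback message is my own):

```shell
# Show the last 50 log lines of the user service; fall back to a note when
# no user journal is available (e.g. in a minimal container).
journalctl --user -u lm-studio.service --no-pager -n 50 2>/dev/null \
  || echo "journal not available"
```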
Strictly speaking, you should first play with the LM Studio UI to set the necessary parameters and download models. Also, switch the listener from address 127.0.0.1 to 0.0.0.0 if you need to allow external connections to your API (this is potentially dangerous, so set up encryption and authorization first).
CURL
Check your server endpoint:
curl -v http://127.0.0.1:1234/v1/models
you should get back the available models:
* Trying 127.0.0.1:1234...
* Connected to 127.0.0.1 (127.0.0.1) port 1234
* using HTTP/1.x
> GET /v1/models HTTP/1.1
> Host: 127.0.0.1:1234
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
<
{
"data": [
{
"id": "nvidia_nvidia-nemotron-nano-9b-v2",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "text-embedding-qwen3-embedding-0.6b",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "qwen/qwen3-8b",
"object": "model",
"owned_by": "organization_owner"
},
{
"id": "google/gemma-3-4b",
"object": "model",
"owned_by": "organization_owner"
}
],
"object": "list"
* Connection #0 to host 127.0.0.1 left intact
REST API
Now you can connect to your REST API using the endpoints:
GET http://127.0.0.1:1234/v1/models
POST http://127.0.0.1:1234/v1/chat/completions
POST http://127.0.0.1:1234/v1/completions
POST http://127.0.0.1:1234/v1/embeddings
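For example, a chat completion request can be sent like this (a sketch: the model id google/gemma-3-4b is taken from the /v1/models output above, so substitute one that your server actually lists; the prompt and parameters are arbitrary):

```shell
# Request body for the OpenAI-compatible chat completions endpoint.
BODY='{
  "model": "google/gemma-3-4b",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Say hello in one short sentence."}
  ],
  "temperature": 0.7,
  "max_tokens": 64
}'

# Sanity-check the JSON before sending (catches quoting mistakes early).
echo "$BODY" | python3 -m json.tool > /dev/null && echo "payload ok"

# POST to the local server; falls through with a note if it is not running.
curl -s -H "Content-Type: application/json" \
     -d "$BODY" http://127.0.0.1:1234/v1/chat/completions \
  || echo "server not reachable on 127.0.0.1:1234"
```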
