Atelier User Guide
A step-by-step walkthrough of every feature in Atelier.
Table of Contents
- Getting Started
- Deploying Your First App
- Interacting with Your App
- Updating Your App
- Build History & Rollback
- Security
- Managing Secrets
- App Storage
- Webhooks & Auto-Deploy
- LAN Subdomain Access
- Public Access (Cloudflare Tunnel)
- Pause, Resume & Archive
- Build Errors
- Crash Alerts
- Nova — AI Operations Agent
- Settings
- MCP Servers
- API Tokens
- External MCP Integration (deprecated)
- Connecting an Agent (Skills)
- Backup & Restore
- Registry Management
- User Management (Admin)
- Activity & System Logs
Getting Started
Accessing Atelier
Open your browser and navigate to your Atelier instance (e.g. http://atelier.home.arpa). You’ll see the login screen.
Registering Your Account
If this is a fresh installation with no users, you’ll see a registration form instead of the login form.
- Enter a username and password.
- Optionally enter an email address.
- Click Register.
The first user registered is automatically promoted to Admin, giving you full control over the platform.
Logging In
If an account already exists:
- Enter your username and password.
- Click Login.
You’ll be taken to the main workspace.
The Main Interface
The interface has two main areas:
- Icon sidebar (left) — compact navigation rail. From top to bottom: the Atelier logo, primary nav (Apps, Nova, and Build with AI for Developer and Admin roles), utility nav (Approvals with pending-count badge, Activity, Archived, Settings), and your user avatar (click for Change Password / Sign Out).
- Content area (right) — shows whichever section you selected. The default landing page is Apps.
Apps page (the default view) is a grid of your app cards. A toolbar runs across the top:
- Search — filter the grid by app name.
- Sort — Manual, Recent, Name, or Status.
- + Folder — create a folder to group apps into.
- Cluster summary — current CPU and memory use across the cluster.
- + Create — opens the creation menu: Deploy an Image, Clone a Git Repo, Push Your Own Code…, and Build with AI (the last shown only when the bundled agent is enabled — see Deploying Your First App).
Just below the toolbar, a row of status filter chips (All / Running / Paused / Error) narrows the grid to apps in that state.
Each app card shows:
- Display name (editable inline via the three-dot menu)
- Status badge with a colour-coded indicator:
  - Green — running
  - Blue — building
  - Yellow — updating
  - Red — error
  - Grey — paused
- Build number and last-updated timestamp
- An Open ↗ button that opens the app’s URL in a new tab
- A ★ to pin the app, and a three-dot menu for per-app actions (rename, pause/resume, view in Gitea, archive)
Organising your apps
For larger fleets you can shape the dashboard, and your layout is saved on the platform (it follows you across browsers and devices):
- Folders — click + Folder to create one, then drag apps into it. Folders are collapsible sections; the ✎ and ✕ on a folder header rename or delete it (deleting a folder moves its apps back to Ungrouped — it never deletes apps).
- Pinning — click a card’s ★ to lift it into a Pinned section at the top, across folders.
- Manual order — with Sort: Manual (the default), drag a card by its ⠿ handle to reorder it, or drop it onto another card / a folder to move it. Choosing Recent, Name, or Status sorts automatically instead (dragging is disabled while a non-Manual sort, a search, or a status filter is active).
Settings menu — clicking the gear icon in the sidebar opens a submenu with Settings, Users (admin only), Registry, and a Documentation ↗ link to the in-app docs.
Optional: configure an LLM
Atelier works fully without any language model configured. Deploying images, building from your code, rollbacks, secrets, metrics, logs — none of it touches an LLM.
A configured LLM unlocks the platform’s assistive features: the Nova operations agent, automatic crash diagnosis, the supervisor’s analysis of cluster issues, post-build code review, and the written build summary. With no LLM configured these features simply stay quiet — nothing errors, nothing blocks.
To set one up, open Settings → AI → LLM Profiles, choose a provider (Anthropic, OpenAI, OpenRouter, or a local Ollama-compatible endpoint), paste in your API key, and save. You pay the provider directly; Atelier adds no markup. See LLM Profiles & Roles.
Deploying Your First App
Atelier hosts containerised apps on your own hardware: you bring a container image or code with a Dockerfile, and the platform handles everything after that — image builds, Kubernetes manifests, routing, secrets, storage, monitoring. There are several ways in, all on the + Create menu on the Apps page:
| Menu entry | What it does | Best for |
|---|---|---|
| Deploy an Image | Run an existing container image | Off-the-shelf software (nginx, Grafana, a registry image you built elsewhere) |
| Clone a Git Repo | Containerise a public Git repository that contains Dockerfile(s) | Getting an existing project hosted in one step |
| Push Your Own Code… | Create an empty per-app Git repo; every push builds + deploys | Active development — yours or your coding agent’s |
| Build with AI | Open a terminal to the bundled build agent and describe the app you want | Letting an AI agent write and ship the code for you |
Build with AI is shown to Developer and Admin users once an admin has enabled the bundled agent — see Build with AI.
Deploy an Image
Deploy a pre-built Docker image directly — no build step at all.
- On the Apps page, click + Create and choose Deploy an Image.
- Give it an app name and enter the image reference (e.g. nginx:latest or ghcr.io/user/app:v1).
- Set the container port your app listens on — this is the primary port, and it’s the one your app’s URL routes to.
- Optionally add environment variables.
- Advanced (optional) — for images that aren’t a simple single-port web server:
  - Command / Args — override the image’s entrypoint/command (e.g. a specific subcommand the image needs to start).
  - Persistent volumes — attach storage at a mount path, sized in GiB, that survives restarts and redeploys.
  - Additional ports — expose more ports on the app’s Kubernetes Service (e.g. a dashboard on one port and an API on another). Note these are reachable in-cluster; only the primary port (step 3) is routed by the app’s public URL.
- Click Import →.
Imports are resilient: a slow-starting image gets a generous rollout window, and if a deploy fails your volumes are left intact rather than torn down — so a transient startup problem never costs you your data. Together these let you run real off-the-shelf images (e.g. a self-hosted agent gateway with a dashboard, an API port, and a data directory) straight from the UI.
Note: The image must be accessible from within the cluster. Public images from Docker Hub, GHCR, etc. work automatically.
You can also deploy any image already in the internal registry from the Registry page — its Deploy button opens this same modal pre-filled.
Clone a Git Repo
Containerise any public Git repository that ships its own Dockerfile(s).
- On the Apps page, click + Create and choose Clone a Git Repo.
- Give it an app name (kebab-case — it becomes the Kubernetes identifier).
- Enter the Git URL (public https:// repositories only).
- Optionally specify a branch (defaults to the repo’s default branch).
- Click Clone & Build →.
Atelier clones the repo, copies it into a per-app repository in the internal Gitea, builds the image(s) with BuildKit, generates vetted Kubernetes manifests, and deploys. You watch the pipeline progress live in the modal.
Two requirements to know about:
- The repo must contain at least one Dockerfile — at the root for a single-image app, or one per subdirectory (backend/Dockerfile, frontend/Dockerfile, …) for a multi-service app. The build fails with a clear error if none is found.
- Text files only. Binary files (images, fonts, archives) are skipped during the copy, with a warning. For repos that need binary assets, use Push Your Own Code instead — a real git push carries everything.
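Before cloning, a quick local check can show whether a checkout meets both requirements. The sketch below is not Atelier’s own detection logic, which this guide does not spell out. It assumes Dockerfiles are looked for only at the root and one directory level down, and it uses a common heuristic (a NUL byte in the first 8 KiB) to guess which files are binary. The function names are illustrative.

```python
from pathlib import Path


def find_dockerfiles(repo_root: str) -> list[str]:
    """Return Dockerfile paths at the repo root or one directory below it."""
    root = Path(repo_root)
    candidates = [root / "Dockerfile"]
    candidates += sorted(
        p / "Dockerfile" for p in root.iterdir() if p.is_dir() and p.name != ".git"
    )
    return [c.relative_to(root).as_posix() for c in candidates if c.is_file()]


def looks_binary(path: Path, sample_size: int = 8192) -> bool:
    """Heuristic (an assumption): a NUL byte near the start means binary."""
    with path.open("rb") as fh:
        return b"\x00" in fh.read(sample_size)


def binary_files(repo_root: str) -> list[str]:
    """List files that a text-only copy would probably skip."""
    root = Path(repo_root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file()
        and ".git" not in p.relative_to(root).parts
        and looks_binary(p)
    )
```

An empty result from find_dockerfiles suggests the build would fail, and a non-empty result from binary_files suggests Push Your Own Code is the better route.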
Push Your Own Code
The most capable path: Atelier scaffolds an empty app with its own Git repository and a push-to-build webhook. You write the code and a Dockerfile; every git push triggers a build + deploy. No AI in the loop, and it works equally well for you, your CI, or a coding agent.
Choosing + Create → Push Your Own Code… opens a guide with copy-ready commands for your instance. The flow:
- Scaffold the app (needs a Developer-role API token):

      curl -s http://<your-portal-domain>/api/apps/scaffold \
        -H "Authorization: Bearer $ATELIER_API_TOKEN" \
        -H 'Content-Type: application/json' \
        -d '{"name":"my-app"}'

  This creates the app (status scaffolded), an empty per-app Gitea repository, and the push webhook. The response includes the clone_url.

- Clone, add your code + a Dockerfile:

      git clone http://<your-portal-domain>/api/git/my-app.git
      cd my-app
      # … add your app + a Dockerfile …

  Authenticate git with any username and your API token as the password — it goes through Atelier’s git proxy, so no separate Gitea credentials are needed.

- Push — the build and deploy start automatically:

      git add -A && git commit -m "initial app" && git push

  The app goes scaffolded → building → running. Every subsequent push rebuilds and redeploys.
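For scripts or CI, the scaffold call can also be made from code. This is a minimal sketch using only the Python standard library. The endpoint, headers, and request body mirror the curl command in the flow; reading clone_url as a top-level key of the JSON response is an assumption, since the guide only says the response includes it. The function names are illustrative.

```python
import json
import urllib.request


def build_scaffold_request(portal_url: str, api_token: str, app_name: str) -> urllib.request.Request:
    """Build the POST request for /api/apps/scaffold without sending it."""
    body = json.dumps({"name": app_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{portal_url.rstrip('/')}/api/apps/scaffold",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
    )


def scaffold_app(portal_url: str, api_token: str, app_name: str) -> str:
    """Send the request and return the clone URL from the response."""
    request = build_scaffold_request(portal_url, api_token, app_name)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumption: clone_url sits at the top level of the response object.
    return payload["clone_url"]
```

Calling scaffold_app("http://<your-portal-domain>", token, "my-app") would then give you the URL to pass to git clone.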
What the build needs:
- At least one Dockerfile — at the repo root (one image) or per subdirectory for a multi-service app (backend/Dockerfile, frontend/Dockerfile, …). The build context is the Dockerfile’s directory, so COPY paths must be relative to it and everything must be committed.
- Each Dockerfile should EXPOSE <port> to declare its served port (defaults to 8080 if omitted).
- You never write Kubernetes YAML. Atelier generates the manifests (Deployments, Services, Ingress) and injects the app’s secrets automatically.
- Need persistent state? Write to /data — every app gets a volume there.
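As an illustration of an app that fits these rules, here is a small sketch of a standard-library web server that listens on port 8080 and keeps a visit counter on the /data volume. The DATA_DIR and PORT environment variables, the visits.txt file name, and the handler are all hypothetical choices for this example, not something Atelier requires.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

# Hypothetical overrides; the defaults match the /data volume and port 8080.
DATA_DIR = Path(os.environ.get("DATA_DIR", "/data"))
PORT = int(os.environ.get("PORT", "8080"))


def bump_counter(data_dir: Path) -> int:
    """Increment a counter stored on the persistent volume and return it."""
    counter_file = data_dir / "visits.txt"
    count = int(counter_file.read_text()) if counter_file.exists() else 0
    counter_file.write_text(str(count + 1))
    return count + 1


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        body = f"visits: {bump_counter(DATA_DIR)}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def main() -> None:
    # Bind to all interfaces so the container port is reachable from outside.
    HTTPServer(("0.0.0.0", PORT), Handler).serve_forever()
```

Calling main() from the container’s start command runs the server. Because the counter lives under /data, it should survive restarts and redeploys.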
Build with AI
Build with AI opens a terminal to a build agent that runs inside your cluster. You describe the app you want in plain language; the agent writes the code and ships it through the same build-and-deploy pipeline as everything else. It’s the AI on-ramp — without you leaving Atelier or wiring up your own tooling.
Turning it on (admin, once). The agent is off by default — no agent runs until you enable it. An admin goes to Settings → AI → Bundled Agent, chooses an agent and clicks Enable agent:
- Claude Code — Anthropic’s CLI agent. Paste an Anthropic API key; it’s stored only in the agent’s Kubernetes Secret inside your cluster — never in the platform database, and never readable back through the UI.
- opencode — an open-source, multi-provider CLI agent. No key needed here; the first time you open the terminal you run /connect to sign it in to your provider (Anthropic, OpenAI, a local model, …). The sign-in is saved on the agent’s storage volume, so it persists across restarts.
The status badge moves to running once the agent pod is up.
Using it (Developer or Admin). Click + Create → Build with AI (or the terminal icon in the sidebar). The terminal opens straight into the agent, already attached and authenticated — just describe what you want and watch it work. For example:
A todo app — React frontend, FastAPI backend, in-memory store.
The agent scaffolds the app, writes the code, pushes it, watches the build, and gives you the URL when it’s live — all the things you’d do by hand in the Push Your Own Code flow, done for you.
A few things worth knowing:
- Your session survives. Close the tab and the agent keeps working; reopen Build with AI and you’re back in the same session where you left off. (It’s a tmux session in the agent pod.)
- Everything is attributed to you. Each terminal session gets its own access token scoped to your role, so apps the agent creates show up as created by you, and a Viewer-role user gets no terminal at all (building is a Developer-and-up action).
- Your provider, your bill. The agent uses the key you supplied (or signed in with) and bills you directly — there’s no markup and no per-seat fee.
- Best on a real keyboard. The terminal works in any browser but isn’t much fun on a phone.
Disabling it. Settings → AI → Bundled Agent → Disable agent removes the agent pod and revokes every active session token. The agent’s saved sign-in is kept on its storage volume, so re-enabling picks up where you left off. Anything the agent hadn’t pushed yet is lost.
Prefer your own agent? You don’t have to use the bundled one. Any skill-aware coding agent — Claude Code, Claude Desktop, Cursor — can build apps on Atelier from your own machine through the Push Your Own Code flow; see Connecting an Agent (Skills). And if you’d rather swap a different agent image into the in-cluster terminal, that’s supported too — see Bringing your own agent image.
Interacting with Your App
The App Detail View
Click any app card on the Apps page to open its detail view. At the top you’ll see:
- App name and status badge
- Open app ▾ — a popover listing every URL the app is reachable at
- Action buttons — Pause, Re-deploy, the Auto-deploy toggle, and a menu with more options
Down the left edge is a vertical icon tab strip:
| Tab | What it shows |
|---|---|
| Build | The build workspace — live pipeline stages, BuildKit output, and the persisted build log of the current/most recent build |
| Logs | Build log and live container output (two sub-tabs, with download buttons) |
| Jobs | Job run history, schedule editing, and Run Now (legacy scheduled apps only) |
| Deploy | Three sub-tabs: History (builds with rollback), Commits (git history with rebuild), Source (repo link, clone command, Rebuild from Gitea) |
| Resources | CPU and memory usage per pod, resource limits, storage and disk usage |
| Security | Vulnerability scans and code review |
| Secrets | Environment variables |
| Public | Public exposure via Cloudflare Tunnel |
| Activity | Event log for this app |
| Inspect | Deep diagnostic view — consistency checks, pods, Deployments, Services, Ingresses, registry images, and source files (Developer/Admin users) |
Viewing Live Logs
- Click the Logs tab and switch to the Live sub-tab.
- Logs stream in real-time from your running containers.
- If your app has multiple services (e.g. frontend + backend), use the service filter to pick which one to view.
Logs auto-scroll as new entries arrive. Timestamps are shown in your local timezone. The Build sub-tab shows the persisted build transcript instead, and both offer a Download button.
Checking Resource Metrics
- Click the Resources tab.
- You’ll see CPU usage (in millicores) and memory usage (in MiB) for each pod, along with the app’s storage allocation and disk usage.
Metrics refresh every 15 seconds while the tab is open.
Viewing Running Containers
The app detail header shows a pod count indicator. This tells you how many container replicas are currently running.
Updating Your App
Your app’s source of truth is its Git repository in the internal Gitea — Atelier created it when the app was first built (Clone Repo and Push Your Own Code apps both have one). Updating the app means pushing new code:
- Open the app and go to Deploy → Source. You’ll find the repository link and a copy-ready git clone command.
- Clone the repo (or edit files directly in Gitea’s web UI), make your changes, and push to main.
- Trigger a rebuild:
  - If Auto-deploy is on (see Webhooks & Auto-Deploy), the push itself starts the build — nothing else to do.
  - Otherwise click Rebuild from Gitea on the Source panel.
The build runs from your committed Dockerfile(s) — fast, deterministic, no LLM involved — and the new version rolls out automatically.
Re-deploying Without Changes
Sometimes you want to rebuild and redeploy without any code changes (for example, after updating secrets or storage settings, or to pick up a refreshed base image).
- Click Re-deploy ↺ in the app detail header.
- Confirm. The pipeline rebuilds from the repo’s committed Dockerfile(s) and redeploys.
Apps created via Deploy an Image have no repo to rebuild from — re-importing the image (or using the Registry page’s Deploy button) is the equivalent.
Build History & Rollback
Viewing Past Builds
- Select your app and click the Deploy tab (the History sub-tab is the default).
- You’ll see a list of all builds, newest first.
- Each build shows:
  - Build number (the current build is tagged “current”)
  - Commit SHA — click to view the code in Gitea
  - Commit message
  - Timestamp
  - Security scan badge — green “clean” or coloured with severity counts
Rolling Back to a Previous Build
- In the Deploy tab, find the build you want to restore.
- Click Rollback on that row.
- A confirmation appears — click Confirm to proceed.
The rollback restores the exact images and manifests from that build. It does not re-run the build pipeline — it re-applies the saved configuration.
Rebuilding from a Specific Commit
- Click the Deploy tab, then the Commits sub-tab.
- Find the commit you want to rebuild from.
- Click Re-deploy next to that commit.
- Confirm when prompted.
This triggers a direct rebuild from the code (and Dockerfiles) at that specific commit.
Security
Atelier automatically scans every built image for known vulnerabilities using Trivy.
Viewing Scan Results
- Select your app and click the Security tab.
- You’ll see scan results for each image in your app.
- Vulnerabilities are grouped by severity:
  - Critical (red) — should be fixed immediately
  - High (orange) — should be fixed soon
  - Medium (yellow) — fix when practical
  - Low (grey) — informational
Click on a severity group to expand the full CVE list with package names, installed versions, and fixed versions.
Scan Badges in Build History
Each build in the Deploy tab shows a scan badge:
- Green “clean” — no critical or high vulnerabilities
- Red with count — critical vulnerabilities found (e.g. “2C”)
- Amber with count — high vulnerabilities found (e.g. “3H”)
Click the badge to jump to the full scan results.
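The badge rules can be read as a small decision function. The sketch below is only an interpretation of the list above: it assumes a critical finding takes precedence when a build has both critical and high findings, and that medium and low findings never change the badge. The function name and return shape are illustrative.

```python
def scan_badge(critical: int, high: int) -> tuple[str, str]:
    """Map severity counts to a (colour, label) pair."""
    if critical > 0:
        # Assumption: critical findings win over high findings.
        return ("red", f"{critical}C")
    if high > 0:
        return ("amber", f"{high}H")
    return ("green", "clean")
```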
Triggering a Rescan
- In the Security tab, click Rescan on any image.
- A new Trivy job runs against the current image.
- Results update when the scan completes.
Security Holds
Trivy runs after the container is already deployed, so the scan can never stop a rollout from happening — it only flags it once the CVE list is in. When Block on Critical is enabled in Settings and the scan finds critical CVEs, the app is marked with a security hold:
- A red “Security hold” banner appears on the app detail page naming the build and critical count.
- A scan.blocked event is written to the activity feed.
- The app card on the Apps page shows the same indicator.
The hold is advisory — the app keeps running and serving traffic as normal. Its purpose is to flag that a human needs to look at the findings before the next change lands.
To clear it:
- Click Override on the banner.
- Click Confirm Override.
A subsequent clean build (no criticals) also clears the hold automatically.
Code Review
If post-build code review is enabled (and an LLM is configured), an LLM reviews the code in your app’s repository in the background after each build — pure analysis, it never changes your code.
- In the Security tab, scroll down to Code Review.
- Review findings are shown as markdown with specific recommendations.
- You can trigger a new review by clicking Review.
Managing Secrets
Secrets are environment variables injected into your app’s containers at runtime.
Adding Secrets
- Select your app and click the Secrets tab.
- Enter a key and value in the input fields.
- Click Save.
Secrets are stored as Kubernetes Secrets and mounted as environment variables in your app’s containers. Your app needs to be redeployed for new secrets to take effect.
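Inside the container, a secret is read like any other environment variable. A minimal sketch, assuming a Python app and a hypothetical key named DATABASE_URL:

```python
import os


def require_env(name: str) -> str:
    """Return a required environment variable or fail fast with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value
```

Calling require_env("DATABASE_URL") at startup makes a missing or not-yet-redeployed secret show up as an immediate, readable error in the app’s logs instead of a failure later on.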
Editing Secrets
- Add a new secret with the same key — it will overwrite the existing value.
- Re-deploy your app for the change to take effect.
Deleting Secrets
- Click the delete button next to the secret you want to remove.
- The secret is removed immediately.
- Re-deploy for the change to take effect.
Note: Secret values are never shown in the UI after being saved. You’ll only see the key names.
App Storage
Each app can have a persistent volume for data that survives container restarts.
Viewing Storage
Storage size and disk usage are shown on the Resources tab of the app detail view. By default, apps get 1Gi of persistent storage mounted at /data.
Changing Storage Size
- On the Resources tab, find the Storage section.
- Enter the desired size in Gi (e.g. 5 for 5 GiB).
- Click Save.
Note: Storage can be increased but typically cannot be decreased (Kubernetes PVC limitation). The app needs to be redeployed for the change to take effect.
Webhooks & Auto-Deploy
Webhooks allow your app to automatically rebuild whenever code is pushed to Gitea.
Enabling a Webhook
- Select your app to open its detail view.
- In the app detail header, click the Auto-deploy ○ toggle (top-right of the header). It appears only when the app has a Gitea repository and isn’t mid-build.
- The toggle fills in (Auto-deploy ●, green) to show auto-deploy is on.
A Gitea webhook is created that triggers a rebuild whenever changes are pushed to the main branch of the app’s repository.
Apps created via Push Your Own Code get this webhook automatically at scaffold time — for them, auto-deploy is the whole point.
How It Works
- You push code to the app’s Gitea repository.
- Gitea sends a webhook notification to Atelier.
- Atelier verifies the HMAC signature and that the push was to main.
- A direct rebuild starts automatically: the platform builds your committed Dockerfile(s) as-is and rolls out the result. Fast, deterministic, no LLM involved — every push behaves like a small CI/CD pipeline.
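The verification step can be sketched as follows. This is not Atelier’s implementation, which the guide does not show. It relies on how Gitea signs webhooks: the X-Gitea-Signature header carries the hex HMAC-SHA256 of the raw request body, keyed with the webhook secret, and a push payload names the branch in its ref field. The function names are illustrative.

```python
import hashlib
import hmac
import json


def verify_signature(secret: str, body: bytes, signature_hex: str) -> bool:
    """Check a hex HMAC-SHA256 signature over the raw request body."""
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def should_rebuild(secret: str, body: bytes, signature_hex: str) -> bool:
    """Accept only correctly signed pushes to the main branch."""
    if not verify_signature(secret, body, signature_hex):
        return False
    return json.loads(body).get("ref") == "refs/heads/main"
```

The signature is checked before the payload is parsed, so unsigned or tampered requests are rejected without their contents being trusted.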
Disabling a Webhook
- Click the Auto-deploy ● toggle again.
- The Gitea webhook is deleted and auto-deploy stops.
LAN Subdomain Access
Every app you build gets a default LAN subdomain that serves it at root —
http://<app-name>.<your-portal-domain> (e.g. http://atelier-blog.atelier.home.arpa).
This is the LAN analogue of Public Access: same “hostname at root” model, but
for traffic inside your network. No Cloudflare, no public exposure — just an
extra hostname your local DNS routes to the cluster.
Why it exists
The portal proxy at /apps/<name> strips the path prefix before forwarding
to your app. That works for relative-path apps but breaks anything that
emits root-absolute URLs — Next.js (/_next/...), Vite SSR, Astro, plain
SSR with href="/about". The browser fetches the asset path against the
portal host root instead of the app, and either 404s or returns the portal’s
own HTML.
The LAN subdomain serves the app at root, so root-absolute URLs resolve
correctly. The portal proxy at /apps/<name> keeps working unchanged — use
whichever fits the app.
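The difference is easy to reproduce with standard URL resolution, which follows the same rules a browser applies to links. In this sketch the app name blog and the trailing slash on the portal page URL are assumptions made for illustration.

```python
from urllib.parse import urljoin

PORTAL_PAGE = "http://atelier.home.arpa/apps/blog/"  # app behind the portal proxy
LAN_PAGE = "http://blog.atelier.home.arpa/"          # same app on its LAN subdomain


def resolve(page_url: str, href: str) -> str:
    """Resolve a link relative to the page that contains it."""
    return urljoin(page_url, href)


relative = resolve(PORTAL_PAGE, "static/app.js")
root_absolute = resolve(PORTAL_PAGE, "/_next/static/app.js")
on_subdomain = resolve(LAN_PAGE, "/_next/static/app.js")
```

Here relative stays under /apps/blog/, so the proxy can forward it. root_absolute becomes http://atelier.home.arpa/_next/static/app.js, which points at the portal rather than the app. on_subdomain becomes http://blog.atelier.home.arpa/_next/static/app.js, which reaches the app directly.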
How to use it
Open any app in App Detail and click Open app ▾ in the header. The popover shows every URL the app is reachable at, with the LAN subdomain marked recommended for SSR / Vite / Astro apps. Copy the URL with the clipboard button or open it in a new tab.
For agents / API consumers, GET /api/apps/{name} now returns lan_url
alongside portal_url, public_url, and in_cluster_url.
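A client that consumes this response might choose among the four URLs like this. The field names come from the API as described above. The preference order, and the assumption that an unavailable URL is absent, null, or empty, are illustrative choices for a client on the LAN with wildcard DNS in place.

```python
from typing import Optional

# Illustrative order for a LAN client; reorder to suit where the caller runs.
URL_PREFERENCE = ("lan_url", "public_url", "portal_url", "in_cluster_url")


def pick_app_url(app: dict, preference: tuple[str, ...] = URL_PREFERENCE) -> Optional[str]:
    """Return the first non-empty URL field, in order of preference."""
    for key in preference:
        if app.get(key):
            return app[key]
    return None
```

An agent running inside the cluster would more likely pass a preference tuple that starts with in_cluster_url.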
Prerequisite: wildcard DNS
The subdomain URL only resolves if you’ve configured your local DNS to point
*.<portal_domain> at your cluster’s node IP. Two recipes follow; pick the one
that fits how you run DNS today:
dnsmasq (recommended for homelab + multi-machine setups)
Drop a one-line config into dnsmasq:
    # /etc/dnsmasq.d/atelier.conf — match the apex AND any subdomain
    address=/atelier.home.arpa/192.168.0.22

That single line covers both atelier.home.arpa (the portal) and any
<app>.atelier.home.arpa. Restart dnsmasq (systemctl restart dnsmasq on
most distros; a plain reload does not re-read configuration files) and any
machine using this dnsmasq resolves all of them.
Tailscale Split DNS (if your platform is on a tailnet)
In the Tailscale admin → DNS → Search domains add atelier.home.arpa,
then in Split DNS point that domain at the tailnet IP of your cluster
node. The split-DNS resolution covers the apex and wildcard subdomains
together. Any machine on your tailnet picks up the new hostnames
automatically.
/etc/hosts does NOT work for this
/etc/hosts does not support wildcards — you’d need a per-app entry for
every new subdomain. If you’re on a single dev machine and only use one or
two app subdomains, that’s tolerable; for anything else, use dnsmasq or
Tailscale Split DNS.
What if I haven’t set up wildcard DNS?
The Ingress rule is added unconditionally. The portal proxy keeps working
exactly as before, so apps stay reachable at <portal_domain>/apps/<name>.
You only notice the LAN subdomain when you configure DNS for it — until
then, the subdomain URLs in App Detail just won’t resolve. No errors, no
broken state.
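To confirm that wildcard DNS is working from a given machine, you can check that both the portal domain and an arbitrary subdomain resolve to the node IP. This is a minimal sketch: the probe label wildcard-probe is a made-up name that should only resolve if the wildcard is in place, and the resolver is injectable so the check can be exercised without a network.

```python
import socket
from typing import Callable


def wildcard_dns_ok(
    portal_domain: str,
    node_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Check that the apex and an arbitrary subdomain both resolve to node_ip."""
    probes = [portal_domain, f"wildcard-probe.{portal_domain}"]
    for host in probes:
        try:
            if resolve(host) != node_ip:
                return False
        except OSError:
            # Resolution failures (socket.gaierror) are OSError subclasses.
            return False
    return True
```

For the dnsmasq example above, the call would be wildcard_dns_ok("atelier.home.arpa", "192.168.0.22").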
Public Access (Cloudflare Tunnel)
Atelier runs on your own infrastructure, so apps you create are not reachable from the public internet by default. The Public Access feature exposes individual apps through a Cloudflare Tunnel — outbound-only, no port-forwarding, free TLS at Cloudflare’s edge.
The setup has two parts:
- One-time, platform-wide (admin): paste tunnel credentials + a Cloudflare API token into Settings → System → Public Access.
- Per app (developer): toggle the Public tab in App Detail and pick a hostname like blog.example.com.
Once both are configured, toggling Public on for an app is enough — Atelier creates the DNS record + tunnel routing rule via the Cloudflare API. No manual dashboard work per app.
Platform setup (admin)
Settings → System → Public Access has two paste-in fields:
1. Tunnel credentials. Either:
   - The connector token from a tunnel you created in Zero Trust → Networks → Connectors → Create a tunnel (the long string after --token in the install commands), or
   - The credentials JSON from a tunnel created via the CLI (cloudflared tunnel create <name> — the {"AccountTag","TunnelSecret","TunnelID"} file).
Either form works; Atelier converts internally. Saving the credentials brings up the in-cluster cloudflared Deployment.
2. Cloudflare API token. Required for Atelier to drive per-app routing automatically. Mint at Cloudflare dashboard → My Profile → API Tokens → Create Token → Create Custom Token with these scopes:
| Scope | Resource | Permission |
|---|---|---|
| Account | Cloudflare Tunnel | Edit |
| Zone | DNS | Edit |
Set Zone Resources to the specific zone hosting your public hostnames (e.g. example.com). Account Resources can be “All accounts” or scoped to a specific account.
Paste the token into the Cloudflare API token field and Save. Atelier validates the token (via Cloudflare’s verify endpoint) before storing it.
The status indicator at the top of the panel reflects automation state:
- Green — “Fully automated”: both credentials and API token configured; per-app toggles propagate without any manual dashboard work.
- Amber — “Manual dashboard work needed”: only credentials configured; the tunnel runs but you’d have to add a Published Application + DNS record per app in the Cloudflare dashboard manually.
- Gray / red: tunnel not configured or unhealthy.
Per-app usage (developer)
Once the platform tunnel is set up, every app gets a Public tab in its App Detail sidebar (globe icon, between Secrets and Activity).
To expose an app publicly:
- App Detail → Public tab.
- Tick the Expose this app publicly checkbox.
- Enter a hostname under the zone your API token covers (e.g. blog.example.com).
- Click Save.
Within ~10 seconds:
- Atelier persists the toggle to its database.
- Atelier calls the Cloudflare API to update the tunnel’s published-application list and create a CNAME pointing the hostname at the tunnel.
- The cloudflared connector picks up the new routing config live — no pod restart.
The public URL is then live, served via Cloudflare’s edge (free HTTPS, included DDoS protection, etc.).
Hostname rules:
- Must be a valid DNS-1123 hostname (lowercase letters/digits/hyphens in dot-separated labels).
- Must be unique across apps on this platform — two apps can’t claim the same hostname.
- Must sit on a zone the platform API token has Zone:DNS:Edit access to.
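The first two rules can be checked before you click Save. The sketch below is an approximation of DNS-1123 hostname validation, not Atelier’s own validator: it uses the standard limits of 63 characters per label and 253 characters overall. The set of already claimed hostnames is a hypothetical input, and the zone rule is not covered because it depends on the API token.

```python
import re

# One label: lowercase alphanumerics and hyphens, 1 to 63 characters,
# starting and ending with an alphanumeric character.
_LABEL = r"[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?"
_HOSTNAME_RE = re.compile(rf"{_LABEL}(?:\.{_LABEL})*")


def is_valid_hostname(hostname: str) -> bool:
    """True if hostname is made of lowercase DNS-1123 labels joined by dots."""
    return len(hostname) <= 253 and _HOSTNAME_RE.fullmatch(hostname) is not None


def is_unclaimed(hostname: str, claimed: set[str]) -> bool:
    """Uniqueness rule: no other app on the platform may hold the hostname."""
    return hostname not in claimed
```

With these definitions, blog.example.com passes, while Blog.example.com (uppercase) and -blog.example.com (leading hyphen) are rejected.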
To disable public access, untick the checkbox and Save. Atelier deletes the DNS record and removes the route from the tunnel — the public URL stops resolving within ~10 seconds. The hostname is preserved in the form so you can re-enable later without retyping.
Archiving an app also clears its public hostname and reclaims the route automatically.
What if the API token isn’t configured?
The per-app Public tab still lets you toggle and save a hostname — the setting is persisted — but Atelier shows an amber banner explaining that routes won’t propagate automatically. In this mode you’d manage Published Applications and DNS records yourself in the Cloudflare dashboard. Useful as a fallback; not the recommended path.
Apps that can’t be exposed
- Archived apps — restore the app first.
Multi-image apps
Atelier routes to the ingress-routed Service for the app (the same one it routes the in-portal URL to). For single-image apps that’s <app-name>; for multi-image apps it’s typically <app-name>-frontend. The reconciler reads the K8s Ingress to find the right Service, so it works uniformly without per-app configuration.
Pause, Resume & Archive
Pausing an App
Pausing scales your app to zero replicas — it stops running but keeps all configuration and data.
- Click the Pause button in the app detail header (or right-click the app card on the Apps page).
- The status changes to paused (grey indicator).
The app’s URL will no longer respond while paused. No compute resources are consumed.
Resuming a Paused App
- Click the Resume button.
- The app scales back to one replica and starts serving traffic again.
Archiving an App
Archiving removes all Kubernetes resources (pods, services, etc.) but preserves the app’s data in the database for future restoration.
- Click the menu and select Archive.
- Confirm in the modal dialog.
- The app disappears from the main Apps page.
Viewing & Restoring Archived Apps
- Click the Archived icon in the sidebar’s utility section.
- You’ll see a list of all archived apps.
- Click Restore to bring an app back — its latest build will be redeployed.
Permanently Deleting an App
- In the Archived apps view, click Delete Permanently.
- Confirm in the dialog. This is irreversible — all data and build history are deleted.
Build Errors
When a build fails, you’ll see a red error banner in the build view with details about what went wrong, and the app’s status turns red.
Common causes include:
- A missing or broken Dockerfile (the build needs at least one, with COPY paths relative to its directory)
- Missing dependencies or lockfile drift
- Docker build failures — a failing RUN step, an unavailable base image
- Deployment timeout (image built fine but the pod won’t start)
To diagnose:
- Check the Build tab (or Logs → Build) for the full build transcript — BuildKit’s output pinpoints the failing step.
- If the image built but the rollout failed, check Logs → Live for the container’s startup output.
- You can also ask Nova — the Ask Nova → shortcut passes the error context straight into a chat.
The fix is always the same shape: correct the code or Dockerfile in your repo, push, and the app rebuilds (or click Rebuild from Gitea on Deploy → Source). The previously running version stays up while you sort it out — a failed build never takes down what was already deployed.
Crash Alerts
Atelier monitors your running apps for container crashes and provides diagnostic suggestions.
What Crash Alerts Look Like
When a container enters a failure state (CrashLoopBackOff, OOMKilled, ImagePullBackOff, etc.), an alert banner appears on the app’s card (Apps page) and in the app detail view.
The alert includes:
- Failure type (e.g. “CrashLoopBackOff”, “OOMKilled”)
- Diagnostic suggestion — an analysis generated from pod logs and Kubernetes events (requires an LLM to be configured; without one you still get the failure detection, just not the written diagnosis)
Dismissing Alerts
Once you’ve addressed the issue, click Dismiss to clear the alert. If the problem recurs, a new alert will be generated.
Tip: Click Ask Nova → on the alert banner to continue troubleshooting with the Nova agent — the diagnosis is carried into the chat for you.
Nova — AI Operations Agent
Nova is an AI assistant specialised in cluster operations and troubleshooting. It needs an LLM profile assigned to the Nova role — without one, the panel will tell you so rather than answering.
Accessing Nova
Click the Nova icon in the sidebar to open the Nova chat panel.
Asking Nova for Help
Type your question or describe a problem. Nova can:
- Diagnose why an app is crashing
- Explain error messages
- Suggest configuration changes
- Help with Kubernetes concepts
Nova has access to your cluster state and can inspect pods, logs, and events when actions are enabled.
Nova Memory
Nova can remember context across conversations. Memory entries are shown in the panel and can be:
- Viewed — see what Nova remembers
- Deleted individually — click the delete button on any memory entry
- Cleared entirely — click “Clear All Memory”
Enabling/Disabling Actions
By default, Nova can only answer questions. To let Nova inspect your cluster:
- Go to Settings → AI → Nova.
- Enable Actions.
With actions enabled, Nova can read pod logs, check deployment status, and view Kubernetes events. The same panel has a Memory toggle — turning it off puts Nova in amnesia mode (it neither reads nor writes memory).
Settings
Access settings by clicking the gear icon in the sidebar’s utility section and choosing Settings from the popover.
The Settings page is organised into five tabs:
| Tab | What’s there |
|---|---|
| Overview | Platform health, storage report, system logs |
| AI | LLM Profiles & Roles, system prompts, Nova settings |
| Automation | Crash monitoring, image update monitoring, supervisor, Longhorn volume checks, alert channels |
| Build | Pipeline profile and the analysis-stage toggles it controls |
| System | System info, MCP Servers, Public Access credentials, API Tokens, Backup & Restore |
Pipeline Profile
The pipeline profile (Settings → Build) is a single control that flips the optional analysis stages on or off together. Pick one to match how much scrutiny you want on each build — or choose Custom and set each stage individually.
What the stages are
Every build runs as a sequence of stages. The profile decides which of the optional ones are switched on:
| Stage | What it does |
|---|---|
| Build | BuildKit builds the container image(s) from your committed Dockerfile(s), and the platform generates the Kubernetes manifests. Always runs. |
| Lint | Runs a standard code-quality linter against the source that was just built — ruff (Python), eslint (Node, when the project ships an eslint config), cargo clippy (Rust), golangci-lint (Go). This is static analysis of your code: syntax errors, unused imports, obvious bugs. It is not the security scan — see the note below. Optional. |
| Deploy | Rolls the new image out to the cluster. Always runs unless the lint gate is holding it. |
| Vulnerability scan | After deploy, Trivy scans the finished image for known CVEs in OS packages and dependencies. This checks published vulnerabilities, not code style. Optional. |
| Post-build code review | An LLM reviews the app’s code for quality and posts its notes in the Security tab. Runs in the background after deploy; needs an LLM configured. Optional. |
Lint and scan are not the same thing. Lint is static code-quality analysis of the source you just built — does this code look right? The vulnerability scan is Trivy checking the finished image against the CVE databases — does this image ship any known-vulnerable packages? They sit at opposite ends of the build and a profile can run either, both, or neither.
Two gates that can hold a deploy
- Lint hold — when Block on lint findings is on, any lint finding holds the deploy before it ships. The previous version keeps running; fix the code and redeploy, or click Override to ship this build anyway (skips lint once).
- Security hold — when Block on critical is on and Trivy finds CRITICAL CVEs, the app is flagged with an advisory hold after it’s deployed. The container keeps running — the hold is a flag for you to review or override, not a rollout gate. See Security Holds.
The profiles
| Profile | Lint | Block on lint findings | Post-build code review | Vulnerability scan | Block on critical CVE |
|---|---|---|---|---|---|
| Fast | off | off | off | off | off |
| Standard (default) | on | off | on | on | off |
| Hardened | on | on | on | on | on |
| Custom | — | — | — | — | — |
- Fast: Build → Deploy. Every optional stage is skipped. Quickest iteration; use for throwaway prototypes.
- Standard (default): Build → Lint → Deploy → Scan + Code review (background). The analysis stages all run, but nothing holds the rollout; lint findings and CVEs are informational.
- Hardened: Build → Lint (holds on findings) → Deploy → Scan (holds on CRITICAL) → Code review. Everything on, including both holds. Production-grade.
- Custom: shown when the individual toggles below don’t match any named profile. Selecting a named profile overwrites the toggles to match.
Switching profile applies the toggles first, then updates the profile label. A diff preview shows exactly which toggles are about to change before you save.
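As a rough illustration of how the named profiles relate to the five toggles, the table above can be written as data. This is only a sketch: the toggle keys (`lint`, `block_on_lint`, and so on) and the function names are made up for the example and are not Atelier's internal setting names.

```python
from typing import Dict

# Illustrative keys for the five toggle columns in the profile table.
TOGGLES = ("lint", "block_on_lint", "code_review", "vuln_scan", "block_on_critical")

PROFILES: Dict[str, Dict[str, bool]] = {
    "fast": dict.fromkeys(TOGGLES, False),
    "standard": {
        "lint": True,
        "block_on_lint": False,
        "code_review": True,
        "vuln_scan": True,
        "block_on_critical": False,
    },
    "hardened": dict.fromkeys(TOGGLES, True),
}


def profile_for(toggles: Dict[str, bool]) -> str:
    """Return the named profile whose toggles match exactly, else 'custom'."""
    for name, expected in PROFILES.items():
        if toggles == expected:
            return name
    return "custom"


def diff_preview(current: Dict[str, bool], target_profile: str) -> Dict[str, bool]:
    """Toggles (with their new values) that would change on switching profile."""
    target = PROFILES[target_profile]
    return {key: value for key, value in target.items() if current.get(key) != value}
```

Moving from Standard to Hardened, for instance, only flips the two hold toggles. Any hand-edited combination that matches no row is reported as Custom.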
LLM Profiles & Roles
Atelier drives every AI surface — Nova, the supervisor, code review, the build summary — through the same profile-based LLM system. You save one or more named profiles (each a provider + model + credentials), then assign a profile to each role. Roles without an explicit assignment fall back to the legacy flat llm.* settings, then to the startup env-var configuration — not to any named profile.
An LLM is optional: every deploy path works without one, and unconfigured AI features simply stay quiet (see Optional: configure an LLM).
Fresh installs. A new install ships a single Default profile, already assigned to every role — but with no provider key. To turn the AI features on, open it, fill in a provider + API key, and save; every role is configured at once.
Note: Older versions of Atelier used a single flat LLM configuration. That flat form still works for backward compatibility, and on first startup Atelier migrates it into the named Default profile so it shows up in the Profiles panel like any other. The Default profile is just a regular entry in the list: it isn’t a special fallback, and clearing a role assignment does not implicitly route to it.
Profile fields
| Field | Description |
|---|---|
| Name | Human-friendly label (e.g. Haiku Fast, Local LM Studio) |
| Provider | claude, openai, openrouter, or ollama |
| Model | Model identifier (defaults per-provider if left blank) |
| API Key | Provider API key — stored, never returned by the API. ollama doesn’t need one |
| Base URL | Only used when Provider is ollama. Point at any OpenAI-compatible local endpoint — Ollama itself, LM Studio, vLLM, llama.cpp, etc. Must include /v1 — e.g. http://192.168.0.54:1234/v1. The field is silently ignored for claude, openai, and openrouter providers. |
| Max output tokens | Optional cap on response length |
| Price per 1M input tokens | Optional. When set (together with the output price), LLM usage can be costed in dollars. Both prices must be set for the calculation to run. |
| Price per 1M output tokens | Optional. Same as above but for output tokens. |
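The two price fields suggest a simple per-million-token calculation. The sketch below shows the arithmetic one would expect. The function name and the example prices are hypothetical, and Atelier's own rounding or reporting may differ.

```python
from typing import Optional


def llm_cost_usd(
    input_tokens: int,
    output_tokens: int,
    price_in_per_m: Optional[float],
    price_out_per_m: Optional[float],
) -> Optional[float]:
    """Dollar cost of one call, or None when either price is unset."""
    if price_in_per_m is None or price_out_per_m is None:
        return None  # both prices must be set for the calculation to run
    input_cost = (input_tokens / 1_000_000) * price_in_per_m
    output_cost = (output_tokens / 1_000_000) * price_out_per_m
    return input_cost + output_cost
```

With made-up prices of $1 per 1M input tokens and $4 per 1M output tokens, a call using 2,000,000 input and 500,000 output tokens would cost 2.00 + 2.00 = 4.00 dollars.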
The four roles
| Role | What uses it |
|---|---|
| Build | Writes the post-build summary — the short write-up of what each build produced |
| Nova | The operations assistant in the sidebar (crash diagnosis, cluster questions, action dispatch). Also used by the crash monitor for automatic post-incident analysis |
| Supervisor | Background cluster checks and the analysis behind the proposals on the Approvals page |
| Code Review | Automated review of the app’s code after each build |
Assigning a profile to a role is what makes it take effect. For example, if you want Nova to run against a local LM Studio model but keep a hosted model for code review, you’d have two profiles — one Claude, one Ollama-style pointing at LM Studio — and assign the Ollama profile to Nova only.
Auto-assign on create
When you add a new profile, Atelier assigns it to every role that doesn’t already have an explicit assignment. This means a first-time user who creates a single profile gets a working system immediately without hunting through the role table.
To split roles across profiles:
- Add all the profiles you want.
- Use the Role Assignments table at the bottom of the LLM Profiles panel to re-point individual roles — the dropdown under each role lets you pick any profile.
- Choosing Default (env config) for a role removes the explicit assignment; that role then falls back to the startup configuration.
Role assignments are read fresh on every LLM call — no pod restart required.
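The auto-assign rule and the fallback order described in this section can be modelled in a few lines. This is a sketch of the documented behaviour, not Atelier's code: the role keys, function names, and return values are illustrative.

```python
from typing import Dict, Optional, Tuple

ROLES = ("build", "nova", "supervisor", "code_review")


def auto_assign(assignments: Dict[str, Optional[str]], new_profile: str) -> Dict[str, Optional[str]]:
    """Give new_profile to every role that has no explicit assignment yet."""
    return {role: assignments.get(role) or new_profile for role in ROLES}


def resolve(
    role: str,
    assignments: Dict[str, Optional[str]],
    legacy_flat: Optional[str] = None,
    env_config: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Fallback order: explicit profile, then legacy flat llm.*, then startup env."""
    explicit = assignments.get(role)
    if explicit:
        return ("profile", explicit)
    if legacy_flat:
        return ("legacy", legacy_flat)
    if env_config:
        return ("env", env_config)
    return ("none", None)  # unconfigured: the AI feature stays quiet
```

Note that a cleared assignment never resolves to another named profile, which matches the point that the Default profile is not a special fallback.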
Practical sizings
- Homelab / solo dev — one profile is fine. Auto-assign fills everything with it.
- Cost-conscious — put a small model (e.g. Claude Haiku, GPT-4o-mini) on Build and Supervisor, keep a capable model on Code Review.
- Local-first Nova — point Nova at Ollama / LM Studio so operational chat doesn’t leave the network.
- Experimentation — add a second profile with a different model, swap between them via the role dropdown without editing configuration files.
Diagnosing “the LLM isn’t being used”
If a role seems to ignore its assigned profile, check kubectl logs deploy/atelier-core for a line that starts No usable LLM provider could be constructed for this role. The message names the specific failure (missing API key, unreachable base URL, etc.) and the role that tried to resolve. See Troubleshooting for the common causes.
Code Review
The Post-build code review toggle lives in the Pipeline Stages group on the Build tab (set by the profile, or edit it directly under Custom). When on — and an LLM is configured — a review runs automatically in the background after every build and posts its findings to the app’s Security tab.
The review prompt is edited under AI ▸ System Prompts (see Custom Prompts); the model comes from the Code Review role assignment in LLM Profiles & Roles.
Vulnerability Scanning
| Setting | Default | Description |
|---|---|---|
| Enabled | On | Run Trivy scans after each build |
| Block on critical | Off | Flag the app with an advisory security hold when critical CVEs are found. The container keeps running — see Security Holds. |
Pre-deploy Lint
A lint stage runs after the build and before deploy, so broken source
never ships. For each language it detects in the repo it runs a standard
linter in a short-lived Job — ruff (Python), eslint (Node, only when the
project ships an eslint config), cargo clippy (Rust), golangci-lint (Go) —
against the exact commit that was built.
This is code-quality static analysis of your source, and is distinct from the Vulnerability Scanning above: lint asks does this code look right?, while the Trivy scan asks does the built image contain known CVEs? They run at different points (lint before deploy, scan after) and have independent toggles.
| Setting | Default | Description |
|---|---|---|
| Enabled | On (Standard / Hardened) | Run the lint stage before deploy |
| Block on findings | Off (Standard), On (Hardened) | When on, lint findings hold the deploy — the app shows a lint hold banner. Fix the code and redeploy, or click Override to deploy this build anyway (skips lint once). |
| Timeout | 300s | Per-language lint Job timeout |
This is the gate that catches a committed source file with a stray markdown code fence (a common copy-paste from an AI chat) before it crash-loops a pod. Lint that can’t run (e.g. a clone failure) never blocks a deploy — only real findings do.
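The lint gate's outcome can be summarised as a small decision function. The sketch below encodes only the rules stated in this section. The function name, the argument names, and the use of `None` to mean "lint could not run" are conventions invented for the example.

```python
from typing import Optional


def lint_gate(
    lint_enabled: bool,
    block_on_findings: bool,
    findings: Optional[int],
    override: bool = False,
) -> str:
    """Return 'deploy' or 'hold'. findings=None means the lint Job could not run."""
    if not lint_enabled or override:
        return "deploy"  # stage off, or the operator chose Override for this build
    if findings is None:
        return "deploy"  # lint that can't run (e.g. a clone failure) never blocks
    if findings > 0 and block_on_findings:
        return "hold"  # previous version keeps running until fixed or overridden
    return "deploy"  # findings are informational when blocking is off
```

Under Standard (lint on, blocking off) every build deploys. Under Hardened a single finding holds the rollout.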
Automation
The Automation tab gathers the background monitors:
| Setting | Description |
|---|---|
| Crash monitoring | Detect pod failures and generate diagnostic crash alerts |
| Image update monitoring | Check for upstream base image updates |
| Check interval | Hours between image update checks (default: 24) |
| Supervisor mode | off, supervised (proposals wait for your approval on the Approvals page), or autonomous |
| Supervisor interval | Minutes between supervisor sweeps (default: 15) |
| Longhorn volume checks | Volume health checks and orphaned-volume detection, with a minimum age before an orphan is proposed for cleanup |
| Alerts | Outbound alert channels per severity, routed through MCP server tools (e.g. a Slack or Telegram sender), with a Send test button |
Custom Prompts
Under AI ▸ System Prompts you can customise the code review prompt — the instructions the LLM follows when reviewing your code after a build. A Reset button restores the default.
MCP Servers
MCP (Model Context Protocol) servers extend Atelier’s AI surfaces with additional capabilities — web search, URL fetching, message sending, and more — without requiring custom code in Atelier. Each MCP server runs as a container in the cluster and exposes tools that Nova can call in chat and that the alert system can use as outbound channels.
Accessing MCP Servers
- Open Settings from the sidebar’s gear-icon popover.
- Go to the System tab and find the MCP Servers section.
You’ll see three areas: Deployed Servers (servers currently running), Available (servers you can deploy from the catalog), and an Add custom MCP server link.
Deploying from the Catalog
- In the Available section, find the server you want (e.g. Fetch or Brave Search).
- If the server requires configuration (like an API key), fill in the required fields.
- Click Deploy.
- The server status shows deploying while the container starts up.
- Once running, the status changes to running (green badge).
Note: The Fetch and Brave Search images are included with Atelier.
Deploying a Custom MCP Server
Most community MCP servers are distributed as npm or pip packages. You can deploy them directly from the UI — no Docker builds required.
From an npm or pip Package (recommended)
- Open Settings > MCP Servers.
- Click + Add MCP server.
- Select the npm package or pip package tab.
- Fill in:
  - Name: a lowercase DNS label (e.g. github-search)
  - Package: the npm or pip package name (e.g. @modelcontextprotocol/server-github)
  - Command: the stdio binary the package installs (e.g. mcp-server-github)
- Optionally set a Display Name and add Environment Variables (e.g. API keys).
- Click Deploy.
The package is installed automatically when the container starts. The server will appear with a deploying status, changing to running once ready. First startup may take a minute while the package installs.
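If you generate server names in a script, it can help to check them before submitting the form. The guide only says the Name must be a lowercase DNS label. The sketch below assumes the usual RFC 1123 label rule used for Kubernetes names (1 to 63 characters, lowercase letters, digits, and hyphens, starting and ending with a letter or digit). Atelier's exact validation may be stricter or looser.

```python
import re

# Assumed rule: RFC 1123 DNS label, as used for Kubernetes object names.
_DNS_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def is_dns_label(name: str) -> bool:
    """True when name looks like a lowercase DNS label of at most 63 characters."""
    return _DNS_LABEL.fullmatch(name) is not None
```

By this rule `github-search` is accepted, while `GitHub_Search` and `-search` are rejected.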
From a Custom Docker Image (advanced)
For MCP servers not available as npm/pip packages, you can deploy a custom Docker image:
- Build or pull the image for your cluster’s architecture (linux/amd64 or linux/arm64), and import it into the cluster:
  ```bash
  docker save mcp-my-server:latest | gzip > /tmp/mcp-my-server.tar.gz
  rsync -az /tmp/mcp-my-server.tar.gz atelier@<cluster-ip>:/tmp/
  ssh atelier@<cluster-ip> "\
    sudo k3s ctr images import /tmp/mcp-my-server.tar.gz \
    && sudo k3s ctr images tag docker.io/library/mcp-my-server:latest \
       registry.atelier.local/mcp-my-server:latest"
  ```
- In Settings > MCP Servers, click + Add MCP server > Docker image tab.
- Enter the Name and Image reference (e.g. registry.atelier.local/mcp-my-server:latest).
- Click Deploy.
Researching a new MCP server with an LLM
Most of the work in wiring up a new MCP server is research: which npm/pip
package is the right one, what binary does it expose, what tools does it
expose, what arguments do those tools take, what environment variables does
it need. That’s exactly the kind of task an LLM is good at. Paste the prompt
below into any chat LLM (Claude, ChatGPT, etc.) with the destination filled
into the {GOAL} slot — it’ll come back with the field values for the
MCP Servers form (and, optionally, the Alerts channel form).
```text
You are helping me configure an MCP (Model Context Protocol) server to
deploy on Atelier, a self-hosted platform that runs MCP servers as
Kubernetes pods.

My goal: {GOAL}

(Examples: "send Slack messages from Atelier alerts", "let Nova search
GitHub issues", "send platform alerts to a Telegram chat".)

How Atelier deploys MCP servers from packages:
- The user fills in a form: package name, command, environment variables.
- A "universal runner" container runs `npm install -g <package>` (or
  `pip install` for Python), then spawns
  `supergateway --stdio <command> --outputTransport streamableHttp --port 3000`
  to expose the stdio MCP server over Streamable HTTP at /mcp.
- The container is `mcp-universal-node` or `mcp-universal-python`.

Two upstream packaging quirks you should watch for and warn me about:
1. Some published packages have a `bin` field pointing to a JS file that
   is NOT chmod +x'd in the tarball. `npm install -g` silently skips
   creating the /usr/local/bin/<name> symlink and the supergateway child
   fails with exit 127.
2. Some published packages have NO `bin` field at all. Same outcome.

The safe workaround in both cases: set the Command field to
    node /usr/local/lib/node_modules/<package-name>/<main-or-bin-path>
where the path is the value of `main` or `bin` in the package's
package.json. For pip packages the equivalent is
    python -m <module>
when the package doesn't install a console script.

If a target service has an OFFICIAL MCP server (published by the service's
own org or maintained inside a well-known monorepo like
modelcontextprotocol/servers), prefer it over community alternatives.

Please research and answer:

1. Recommended package: name on npm or PyPI, with a one-line
   justification. If there are 2-3 reasonable options, mention them and
   pick a default.
2. Where to get credentials (e.g. "Get an API key from {service-name}
   dashboard → API Keys") and the exact env variable names the package
   reads.
3. **MCP Server form**: exact values to paste into Atelier's
   Settings → MCP Servers → + Add (npm-package or pip-package tab):
   - Name: a lowercase DNS label (suggest one)
   - Package: the npm/pip name
   - Command: prefer `<bin-name>` if you're confident the bin is properly
     installable; otherwise the safe `node /usr/local/lib/...` or
     `python -m ...` form
   - Environment variables: KEY=value pairs
4. The MCP tool(s) the server exposes: names + argument schemas.
   Highlight any field that's strictly-typed (e.g. "chatId must be a
   string, even though it looks like a number").
5. If my goal is "wire up an outbound alert channel":
   - **Alert channel form**: values for
     Settings → Automation → Alerts → <severity>:
     - MCP server: the name from step 3
     - MCP tool: the tool name from step 4
     - Payload template: a JSON object using Atelier's placeholders
       `{title}`, `{body}`, `{severity}`, `{source}`, respecting the
       strict typing you flagged.

Output format: a Markdown answer, with the form-field values in fenced
code blocks so I can copy them straight into Atelier's UI.
```
Verify the answer against the live package. LLMs occasionally invent package names or tool names. After the LLM gives you its answer, double-check the package exists on the relevant registry (npm: https://www.npmjs.com/package/<name>, PyPI: https://pypi.org/project/<name>/). If something doesn’t behave once deployed, Atelier’s Send test button (Settings → Automation → Alerts) and the alert-history tooltip surface the MCP server’s actual error message; often that’s the fastest way to spot a wrong argument type or a missing env var.
Enabling Servers for Nova
By default, deployed servers’ tools are not offered to Nova. To enable a server:
- Find the server in the Deployed Servers list.
- Check the Chat checkbox.
- The server’s tools are now available to Nova in chat.
When enabled, Nova discovers the server’s tools at the start of each conversation and can call them as needed. For example, with the Fetch server enabled, Nova can retrieve content from URLs while answering you. (Alert channels reference an MCP server + tool directly in the channel form and don’t need this checkbox.)
Discovering Available Tools
- Click Tools on a deployed server to expand its tool list.
- Each tool shows its name and a description of what it does.
Tool names are prefixed with the server name (e.g. mcp_fetch_fetch, mcp_brave-search_brave_web_search) to avoid conflicts.
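Judging by the two examples above, the prefixed name appears to be `mcp_`, the server name, an underscore, and the tool name. The helper below is an inference from those examples only, and the function name is hypothetical.

```python
def prefixed_tool_name(server: str, tool: str) -> str:
    """Build the prefixed tool name in the pattern the examples above follow."""
    return f"mcp_{server}_{tool}"
```

Going the other way (splitting a prefixed name back into server and tool) is ambiguous in general because both parts may contain underscores, so it is safer to look tools up by their full prefixed name.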
Deleting an MCP Server
- Click the Delete button on the deployed server.
- Confirm when prompted.
- The server’s Kubernetes Deployment and Service are removed.
Available MCP Servers
| Server | Description | Required Config |
|---|---|---|
| Fetch | Fetches web content and converts it to markdown | None |
| Brave Search | Web and local search via Brave Search API | BRAVE_API_KEY |
API Tokens
Anything that talks to the Atelier API programmatically — an MCP client, an agent,
a script, a CI job — should authenticate with a persistent API token.
(A short-lived login token works on the same Authorization: Bearer … path, but
expires in 24 hours — see Getting an API Token.) Persistent
tokens are separate from your login session: they don’t expire on their own, you
can see and revoke them, and each carries a chosen role. Use one instead of copying
a login token out of your browser.
Creating a token
API tokens are managed in Settings → System → API Tokens. This page is admin-only — you must be signed in as an Admin to create or revoke tokens.
- Enter a name (e.g. hermes-agent, claude-desktop, ci-deploy).
- Pick the role the token acts as, scoped no higher than the client needs:
  - Viewer: read-only (list/inspect apps, logs, metrics).
  - Developer: also manage apps (deploy, redeploy, secrets, pause/resume).
  - Admin: full access, including settings and token management.
- Optionally set an expiry (defaults to never; 30- or 90-day options are available for tighter hygiene).
- Click Create. The token, a string starting with atl_, is shown once. Copy it now; it can’t be retrieved again.
Using a token
Send it as a bearer header on every request:
```bash
curl -s "$ATELIER_API_URL/api/apps" -H "Authorization: Bearer atl_…"
```
A 401 means the token is missing, revoked, or expired; a 403 means the token’s role is too low for that operation.
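The same request can be made from a script. The sketch below uses only Python's standard library and builds the request without sending it. The helper names are hypothetical, and only the two status codes described above are interpreted.

```python
import urllib.request


def api_request(path: str, token: str, base_url: str) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying the bearer token."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def explain_status(status: int) -> str:
    """Map the status codes described above to a short explanation."""
    if 200 <= status < 300:
        return "ok"
    if status == 401:
        return "token missing, revoked, or expired"
    if status == 403:
        return "token role too low for this operation"
    return "unexpected status"
```

To send it, pass the request to `urllib.request.urlopen`. A 401 or 403 is raised as `urllib.error.HTTPError`, whose `code` attribute can be handed to `explain_status`.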
Listing & revoking
The panel lists every token by name and short prefix (atl_a1b2c3d4…) with its role,
last-used time, and expiry. Click Revoke to disable one immediately — any
client using it loses access on its next request. The full secret is never
displayed again after creation.
External MCP Integration (Claude Desktop, Cursor, etc.)
Deprecated and removed. Atelier previously shipped atelier-mcp, a standalone binary that exposed the REST API as MCP tools for external clients (Claude Desktop, Cursor, VS Code, etc.). It is no longer built or published.
The canonical way to drive Atelier from an external agent or AI client is now the Agent Skills route described in Connecting an Agent (Skills) below: portable SKILL.md files that teach any skill-aware agent to build on and operate your instance over the REST API directly, with no extra binary to install or run and no per-client MCP config. Pair a skill with a persistent API token and any agent host can manage your apps, read logs, deploy changes, and query Nova.
(Unrelated: Atelier can still host MCP servers as apps for Nova and the alert system; see MCP Servers above. That is a different feature and is unaffected.)
Connecting an Agent (Skills)
Atelier ships Agent Skills — portable SKILL.md files that teach any
skill-aware agent (Hermes, Claude Code, Claude Desktop) to drive Atelier over its
REST API directly, with no extra server or binary to run. A skill is just a
Markdown document the agent reads; it pairs with an API token for
auth. The skills are the interface for agents: any agent host that can read a
skill can build for and operate your Atelier instance.
The two skills
- atelier-operate (manage running apps): list/inspect, logs, metrics, pause/resume/redeploy/delete, secrets, settings, and Nova. The right skill for an operations agent that watches and tends the platform.
- atelier-build (deploy apps the agent authors): scaffold an app, push code + a Dockerfile through the git proxy, watch build progress stream, and wire up secrets. The platform doesn’t write code (the agent does), so “build me an app on Atelier” means the agent codes it and ships it through the Push Your Own Code flow this skill documents.
Give an agent only the skill it needs — an operator agent shouldn’t carry the authoring surface.
Getting the skills
The skills ship as a skills.tar.gz bundle with each release. Download and
unpack it:
```bash
curl -Lo skills.tar.gz https://tryatelier.blob.core.windows.net/tryatelier/latest/skills.tar.gz
tar xzf skills.tar.gz   # -> skills/atelier-operate/SKILL.md, skills/atelier-build/SKILL.md
```
Then drop the relevant SKILL.md into your agent’s skills directory (e.g. for Hermes, ~/.hermes/skills/ or via hermes skills).
Pointing the skill at your instance
Both skills read two values:
- ATELIER_API_TOKEN: a persistent token (see API Tokens).
- ATELIER_API_URL: choose by where the agent runs:
  - In-cluster (the agent itself runs as an Atelier app, e.g. Hermes): use the internal service address http://atelier-core.atelier.svc.cluster.local:8080. Cluster DNS resolves it directly, with no hosts-file or ingress changes. (Your portal hostname does not resolve inside the cluster.)
  - External (e.g. Claude Desktop on your laptop): a hostname or IP that resolves on that machine, either your portal domain (added to its hosts file/DNS) or the node IP.
Example: a Telegram-driven operator agent (Hermes)
- Mint a token: Settings → System → API Tokens → name it hermes-agent, role Developer, expiry never. Copy it.
- Install the operate skill: from the unpacked bundle, put atelier-operate/SKILL.md in Hermes’ skills directory.
- Set the env on the Hermes deployment: ATELIER_API_TOKEN=atl_… and ATELIER_API_URL=http://atelier-core.atelier.svc.cluster.local:8080.
- Ask it: from Telegram, “list my Atelier apps” / “why is notes-app unhealthy?” / “restart the metrics dashboard.” Hermes calls the API per the skill and answers in chat.
Bring-your-own-code (push to build)
For agents (or people) that write the code themselves, Atelier is the build-and-host target via a git push:
- Scaffold an empty app + repo with a push-to-build webhook: POST /api/apps/scaffold with {"name": "my-app"}. The response includes a clone_url.
- Clone, add your code + a Dockerfile, and push. Authenticate git with your API token as the password (any username). It goes through Atelier’s git proxy, so no separate Gitea credentials are needed. Clone with a username-only URL and let git prompt for the password (paste the token) rather than putting it in the URL (URL-embedded credentials leak via shell history):
  ```bash
  git clone http://x-access-token@<host>/api/git/my-app.git
  ```
  For non-interactive use (CI, an agent), supply the token via a credential helper or GIT_ASKPASS instead of prompting.
- The push triggers a direct build (build-from-source, no LLM) and deploys it. The app goes scaffolded → building → running.
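For an agent or CI job, the scaffold call and the non-interactive git authentication could be scripted roughly as follows. This is a sketch under stated assumptions: the helper names and the `askpass.sh` filename are invented, the JSON `Content-Type` header is assumed to be acceptable to the API, and the askpass helper requires a POSIX shell. `GIT_ASKPASS` itself is a standard git mechanism: git runs the named program and reads the password from its output.

```python
import json
import os
import stat
import urllib.request


def scaffold_request(base_url: str, token: str, app_name: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/apps/scaffold request."""
    body = json.dumps({"name": app_name}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/apps/scaffold",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed; not stated in the guide
        },
    )


def git_env_with_askpass(token: str, directory: str) -> dict:
    """Write a tiny askpass helper and return an environment for git subprocesses.

    The token stays in the environment; the helper script only echoes it.
    """
    helper = os.path.join(directory, "askpass.sh")  # hypothetical filename
    with open(helper, "w") as fh:
        fh.write('#!/bin/sh\nprintf "%s\\n" "$ATELIER_API_TOKEN"\n')
    os.chmod(helper, stat.S_IRWXU)  # owner-only read/write/execute
    env = dict(os.environ)
    env.update({
        "GIT_ASKPASS": helper,
        "ATELIER_API_TOKEN": token,
        "GIT_TERMINAL_PROMPT": "0",  # fail instead of prompting on a terminal
    })
    return env
```

A caller would send the request with `urllib.request.urlopen`, read `clone_url` from the JSON response, and then run `subprocess.run(["git", "clone", clone_url], env=env, check=True)` with a username-only URL so that git asks the helper for the password.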
What the build needs: at least one Dockerfile — at the repo root
(one image) or per subdirectory for a multi-service app (backend/Dockerfile,
frontend/Dockerfile, …). Each should EXPOSE <port> to set its served port
(if omitted, the port defaults to 8080). The build context is the Dockerfile’s
directory, so COPY paths must be relative to it and everything must be
committed. Atelier auto-generates the Kubernetes manifests and injects the
app’s secrets — you don’t write any YAML, and the build does not write
anything back to your repo: it builds from the commit you pushed and leaves
main untouched (no generated atelier-spec.yaml or manifests committed, and no
rebase needed before your next push).
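Before pushing, a quick local check can confirm that the repo has the Dockerfile layout described above and show which port each service would serve. This is a sketch, not Atelier's detection logic: it assumes only the repo root and first-level subdirectories are considered, and that the first EXPOSE line wins when there are several.

```python
import re
from pathlib import Path
from typing import Dict

_EXPOSE = re.compile(r"^\s*EXPOSE\s+(\d+)", re.IGNORECASE | re.MULTILINE)


def served_port(dockerfile_text: str, default: int = 8080) -> int:
    """Port from the first EXPOSE line, or the documented default of 8080."""
    match = _EXPOSE.search(dockerfile_text)
    return int(match.group(1)) if match else default


def find_services(repo: Path) -> Dict[str, int]:
    """Map each build context ('.' for the repo root) to its served port."""
    found: Dict[str, int] = {}
    candidates = [repo / "Dockerfile"] + sorted(repo.glob("*/Dockerfile"))
    for dockerfile in candidates:
        if dockerfile.is_file():
            context = dockerfile.parent.relative_to(repo).as_posix()
            found[context] = served_port(dockerfile.read_text())
    return found
```

An empty result means the build would have nothing to work from. Remember that the check reads the working tree, so uncommitted Dockerfiles will show up here but not in the build.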
Every push is built directly from your committed Dockerfile(s), as-is. There are no other push build modes.
This is the atelier-build skill’s flow — ideal for a coding agent like
Claude Code or Claude Desktop that authors locally and ships to Atelier.
Bringing your own agent image
The Build with AI terminal runs a bundled agent image (Claude Code plus the Atelier skills). If you’d rather run a different agent in that terminal — a different CLI agent, your own custom build, an air-gapped mirror — you can swap the image without touching anything else.
The contract a replacement image must honour is deliberately small: it just needs a shell. When a user attaches, the platform:
- sets ATELIER_API_URL in the pod environment (the in-cluster API address);
- writes a role-scoped session token to a per-user file in the session’s home directory; your image’s shell profile reads that file and exports it as ATELIER_API_TOKEN (the bundled image’s profile does exactly this);
- attaches the user to a tmux session (so install tmux, or adjust the expectation; sessions and reconnection rely on it);
- makes the Atelier skills available at /opt/atelier/skills.
Anything your agent does against the platform goes through that token, so it’s
scoped to the attaching user’s role and attributed to them — same as the
bundled agent. To use your image, set it under Settings → AI → Bundled Agent
→ Advanced → Image override (or the agent.image platform setting) and
enable. For an air-gapped install, set the ATELIER_AGENT_IMAGE environment
variable on atelier-core instead, pointing at your in-cluster registry.
This is what keeps the bundled agent from being special-cased: it drives the platform through exactly the surface — skills, token, REST — that any external agent uses. The bundled one is just the default, not the only option.
Monitoring with Event Webhooks
The sections above let an agent drive the platform. Event webhooks are the other direction: they let the platform push events to your agent (or any HTTP endpoint), so an external monitor can watch Atelier and react. Where an alert channel delivers a notification to a chat tool via an MCP server, an event webhook POSTs a structured, signed JSON event to a URL you control — meant for programmatic reaction, not a human inbox.
How it works
You register a subscription with a URL, a shared secret, and the event types
you care about. When a matching event fires, Atelier sends a POST to your URL
with the event as the body and a signature header. Every request carries:
```
X-Webhook-Signature: <hex HMAC-SHA256 of the raw body, keyed with your secret>
```
Verify that signature before trusting the payload; it’s the same scheme Atelier uses for its inbound Git webhooks. The body looks like:
```json
{
  "event_id": "evt_…",
  "event_type": "alert.raised",
  "timestamp": "2026-06-15T14:30:00Z",
  "severity": "critical",
  "source": { "app_name": "my-app", "namespace": "atelier-apps" },
  "message": "Node 'node-1' at 95% pod capacity (104/110)",
  "details": { "dedup_key": "pod_capacity:node-1:critical", "source": "supervisor" }
}
```
Registering a subscription
Manage event webhooks in the UI under Settings → Automation → Event webhooks (admin only): register a receiver URL + shared secret, pick event types (or all), enable/disable, send a test event, and see each subscription’s last delivery result. The same operations are available on the admin API with an Admin API token:
| Method & path | Purpose |
|---|---|
| POST /api/event-webhooks | Register { "url", "secret", "event_types": ["*"] } (returns { "id" }). ["*"] subscribes to everything; otherwise list exact types. |
| GET /api/event-webhooks | List subscriptions (secrets are never returned). |
| POST /api/event-webhooks/{id}/test | Send a synthetic signed event and report the delivery status; handy for confirming your receiver and signature check. |
| PATCH /api/event-webhooks/{id} | { "enabled": false } to pause without deleting. |
| DELETE /api/event-webhooks/{id} | Remove a subscription. |
Delivery is best-effort: a non-2xx response is retried a few times with backoff,
then recorded (last_status / last_error on the subscription) and dropped, so
a slow or unreachable monitor never holds up the platform.
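As an illustration of the registration call in the table above, here is a minimal Python sketch that builds the POST /api/event-webhooks request and reads the returned id. The function names are hypothetical, and sending the Admin API token as a Bearer header is an assumption; check how your instance expects the token to be presented.

```python
import json
import urllib.request


def build_register_request(base_url, api_token, receiver_url, secret, event_types=("*",)):
    """Build (but do not send) the POST /api/event-webhooks request."""
    body = json.dumps({
        "url": receiver_url,
        "secret": secret,
        "event_types": list(event_types),
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/event-webhooks",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the Admin API token is passed as a Bearer token.
            "Authorization": f"Bearer {api_token}",
        },
    )


def register(request):
    """Send the request and return the new subscription id from { "id" }."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["id"]
```

Keeping the request construction separate from the network call makes it easy to inspect what would be sent before pointing it at a live instance.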
Which events fire
Build lifecycle. Every build emits build.started, then exactly one
terminal event:
- build.succeeded: built and live. details carry source_sha and build_id, so you can fetch the build (GET /api/apps/{name}/builds/{id}).
- build.failed: a hard build or deploy/rollout failure (details.source_sha).
- build.held: the pre-deploy lint gate held the deploy under the Hardened profile. This is not a failure: the prior version keeps running and an operator can override. Treat it differently from build.failed.
All three carry source_sha. (Problems before the build proper — a missing
Dockerfile, a failed clone or image import — don’t emit build events; they
surface as the app’s error status + the activity feed.)
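One way a receiver might tell the terminal build events apart is sketched below. The action labels (record_success, page_oncall, request_review) are hypothetical placeholders for whatever your monitor does, and the sketch assumes build events use the same envelope as the sample payload (source.app_name, details).

```python
def plan_build_reaction(event):
    """Map a build event to an (action, source_sha) pair."""
    event_type = event.get("event_type", "")
    details = event.get("details") or {}
    sha = details.get("source_sha")
    if event_type == "build.succeeded":
        return ("record_success", sha)
    if event_type == "build.failed":
        return ("page_oncall", sha)
    if event_type == "build.held":
        # Not a failure: the prior version keeps running, so ask for review.
        return ("request_review", sha)
    # build.started and anything unrecognised need no reaction here.
    return ("ignore", sha)


def build_detail_path(event):
    """Return the GET path for a succeeded build, or None if it cannot be formed."""
    app = (event.get("source") or {}).get("app_name")
    build_id = (event.get("details") or {}).get("build_id")
    if event.get("event_type") != "build.succeeded" or not app or build_id is None:
        return None
    return f"/api/apps/{app}/builds/{build_id}"
```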
Alerts. Event webhooks also ride the platform’s alert stream, so you
receive the same conditions the supervisor raises — alert.raised, plus
app.crash, app.oom, and image.cve_hold (a post-deploy CVE hold) inferred
from the alert. These fire at the same cadence as alert delivery
(de-duplicated), and snoozing an alert in the UI also quiets its webhook.
(Subscribe to ["*"] for everything, or list exact types — e.g.
["build.failed", "app.crash"] — to react only to what you care about.)
This is exactly how the Hermes operator agent monitors Atelier: it stands up a signed webhook receiver and subscribes, so platform problems reach it without a human relaying them.
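The signature check described under How it works can be sketched in a few lines of Python. This assumes the X-Webhook-Signature header carries the bare hex digest with no prefix, and that you verify against the raw request bytes before parsing the JSON; the function names are illustrative.

```python
import hashlib
import hmac


def expected_signature(secret: str, raw_body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body, keyed with the shared secret."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def is_authentic(secret: str, raw_body: bytes, header_value) -> bool:
    """Compare the received header against the expected digest in constant time."""
    if not header_value:
        return False
    received = header_value.strip().lower().encode("utf-8")
    expected = expected_signature(secret, raw_body).encode("utf-8")
    return hmac.compare_digest(expected, received)
```

Compute the digest over the body exactly as received: re-serializing the parsed JSON can change whitespace or key order and make a valid signature fail. The test endpoint (POST /api/event-webhooks/{id}/test) is a convenient way to confirm the check against a real delivery.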
Backup & Restore
Protect your platform data by creating downloadable backup archives and restoring from them when needed.
Creating a Backup
- Open Settings from the sidebar’s gear-icon popover.
- Scroll to the Backup & Restore section.
- Click Create Backup.
- A progress bar shows the current step (snapshotting database, archiving each repo).
- When complete, click Download backup to save the .tar.gz archive.
The backup includes:
- SQLite database — all app definitions, build records, settings, Nova memory, and events.
- Gitea repositories — the source code for every app.
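Because a restore replaces all platform data, it can be worth confirming that a downloaded archive opens cleanly before you rely on it. The sketch below only checks that the file is a readable gzip tar and lists its entries; it makes no assumption about the archive's internal layout, and the function name is hypothetical.

```python
import tarfile


def list_backup_members(path):
    """Return the entry names of a .tar.gz archive, raising if it cannot be read."""
    with tarfile.open(path, mode="r:gz") as archive:
        # getmembers() walks every header, so a truncated file fails here.
        return [member.name for member in archive.getmembers()]
```

A truncated or corrupted download raises an exception instead of returning a list, which is a quick signal to download the backup again.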
Restoring from a Backup
Warning: Restoring replaces all platform data. This action cannot be undone.
- Open Settings from the sidebar’s gear-icon popover.
- In the Backup & Restore section, click Choose File and select a backup archive.
- Click Restore.
- Read the confirmation warning carefully, then click Yes, restore from backup.
- The platform restores Gitea repositories and writes the backup database.
- The platform automatically restarts to apply the restored database.
After the restart, all apps, settings, and history from the backup will be in place. Container images will need to be rebuilt by redeploying apps.
What’s Not Included
- Container images — these are rebuilt from the Gitea source code when you redeploy.
- App persistent volumes — user data in PVCs is not included in the backup.
- Platform secrets — API keys and provider credentials should be re-entered in Settings after restoring to a new server.
Registry Management
Browse and manage container images stored in the internal registry.
Accessing the Registry
Open the Settings popover (gear icon in the sidebar) and choose Registry.
Browsing Images
The registry page shows all stored images grouped by repository. Each entry shows:
- Repository name (e.g. myapp-backend, myapp-frontend)
- Tags (e.g. latest, build-1, build-2)
Deploying from the Registry
- Click Deploy next to any image.
- The Import modal opens pre-filled with the image reference.
- Set the port and click Deploy.
Deleting Images
- Click the delete button next to a specific image tag.
- The image tag is removed from the registry.
Garbage Collection
After deleting images, the underlying storage isn’t freed immediately. Click Run GC to trigger garbage collection and reclaim disk space.
Disk-usage monitoring
The registry fills up over time because every build pushes new image tags. The platform watches the registry’s disk and alerts through your configured alert channels before space runs out: a warning as it approaches full, then a critical alert near capacity. That gives you time to reclaim space (or expand the volume) before builds break. A push that fails because the registry is genuinely out of space is reported as a clear “registry full” error rather than a confusing build failure.
User Management (Admin)
Admin users can manage other users on the platform.
Accessing User Management
Open the Settings popover (gear icon in the sidebar) and choose Users (only visible to admins).
Creating a User
- Click Add User.
- Enter a username, password, and optionally an email.
- Select a role:
- Viewer — read-only access (view apps, logs, metrics)
- Developer — can create, update, and deploy apps
- Admin — full access including user management and settings
- Click Create.
Changing a User’s Role
- Find the user in the list.
- Use the role dropdown to select a new role.
- The change takes effect immediately.
Activating / Deactivating Users
- Toggle the Active switch for the user.
- Deactivated users cannot log in but their account is preserved.
Deleting a User
- Click the delete button next to the user.
- Confirm when prompted. The user account is permanently removed.
Changing Your Password
- Click your avatar at the bottom of the sidebar.
- Select Change Password.
- Enter your current password and the new password.
- Click Save.
Forgotten Password / Locked Out
If you’ve forgotten your password and can’t log in, reset it from the server:
# Interactive: reads password from stdin (preferred)
kubectl exec -it -n atelier deployment/atelier-core -- \
  atelier-core reset-password --user <username>
This requires kubectl access to the cluster (i.e. SSH access to the server). The command validates the new password against the standard policy and updates the hash directly in the database.
For other recovery scenarios (Gitea admin, K10 dashboard, expired JWTs), see the Troubleshooting guide.
Activity & System Logs
App Activity Log
Each app tracks significant events. View them in the Activity tab within the app detail view.
Events include:
- App created
- Build triggered / completed / failed
- Deployment rollback
- Secrets updated
- Webhook configured
- App paused / resumed / archived
Global Activity Feed
Click the Activity icon in the sidebar’s utility section to see events across all apps in a single timeline.
System Logs
System-level events from background services are available in Settings > System Logs.
These cover:
- Crash monitor — pod failure detection and recovery
- Image monitor — base image update checks
- Background tasks — scan jobs, code review runs
Quick Reference
| Action | Where |
|---|---|
| Deploy a container image | Apps page > + Create > Deploy an Image |
| Clone a Git repo | Apps page > + Create > Clone a Git Repo |
| Push your own code | Apps page > + Create > Push Your Own Code… |
| View live logs | App detail > Logs tab > Live |
| Rollback a build | App detail > Deploy tab > History > Rollback |
| Rebuild from source | App detail > Deploy tab > Source > Rebuild from Gitea |
| Toggle auto-deploy on push | App detail header > Auto-deploy toggle |
| Add secrets | App detail > Secrets tab |
| View scan results | App detail > Security tab |
| Pause/Resume | App detail header buttons |
| Configure LLM | Sidebar gear icon > Settings > AI |
| Choose a pipeline profile | Sidebar gear icon > Settings > Build |
| Deploy MCP server (catalog) | Settings > System > MCP Servers > Deploy |
| Deploy custom MCP server | Settings > System > MCP Servers > + Add MCP server (npm/pip package or Docker image) |
| Enable MCP tools for Nova | Settings > System > MCP Servers > Chat checkbox |
| Create an API token | Settings > System > API Tokens (admin only) |
| Review supervisor proposals | Sidebar > Approvals icon |
| Ask Nova for help | Sidebar > Nova icon |
| View archived apps | Sidebar > Archived icon |
| Global activity feed | Sidebar > Activity icon |
| Manage users | Sidebar gear icon > Users (admin only) |
| Registry browser | Sidebar gear icon > Registry |
| Documentation | Sidebar gear icon > Documentation ↗ |