tabby Command: Examples, Options, and Usage

Common Examples

Start Tabby server

tabby serve --model [StarCoder-1B] --device cuda

Start server with chat model

tabby serve --model [StarCoder-1B] --chat-model [Qwen2-1.5B-Instruct] --device cuda

Run on CPU only

tabby serve --model [StarCoder-1B] --device cpu

Run via Docker

docker run -it --gpus all -p 8080:8080 -v $HOME/.tabby:/data tabbyml/tabby serve --model [StarCoder-1B] --device cuda

Specify port

tabby serve --model [StarCoder-1B] --port [8080]

Description

Tabby is a self-hosted AI coding assistant that provides code completion, inline edits, and chat. Unlike cloud-hosted alternatives, it runs entirely on your own infrastructure, so you keep control of models, data, and costs.

The serve command starts the Tabby API server, which exposes an HTTP API with an OpenAPI specification for IDE extensions and other clients. The server supports several families of code completion models, including StarCoder, CodeLlama, and CodeGen. It also serves a web UI on the configured port for administration, model management, and repository indexing.

Tabby is optimized for consumer-grade GPUs. It supports NVIDIA CUDA on Linux and Windows and Apple Metal on macOS. A CPU-only mode is available for machines without GPU acceleration, at reduced performance.

By default, data is stored in ~/.tabby, including model weights, configuration, and indexed code repositories.
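The device choice described above can be sketched as a small shell helper. This is only an illustration: the function name pick_tabby_device is hypothetical, and checking for nvidia-smi on the PATH is an assumed heuristic for "an NVIDIA GPU is usable", not something Tabby itself does. The script prints the resulting command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of tabby): guess a value for --device.
pick_tabby_device() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Assumption: NVIDIA driver tools on PATH imply a usable CUDA GPU.
        echo cuda
    elif [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; then
        # macOS on Apple Silicon: use Metal.
        echo metal
    else
        # Fallback when no GPU is detected.
        echo cpu
    fi
}

device=$(pick_tabby_device)
# Print the command for review rather than executing it.
echo "tabby serve --model StarCoder-1B --device $device"
```

Printing the command first makes it easy to confirm the detected device before starting a server that may download model weights.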

Options

--model _name_: Code completion model to use (e.g., StarCoder-1B, CodeLlama-7B).
--chat-model _name_: Conversational AI model for chat features (e.g., Qwen2-1.5B-Instruct).
--device _type_: Compute device to run on: cuda (NVIDIA GPU), metal (Apple Silicon), or cpu.
--port _port_: Port to expose the API server. Default: 8080.
--help: Display help information.
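As a rough sketch of how these options combine, the wrapper below assembles a tabby serve command line from environment variables. The TABBY_* variable names and the build_tabby_cmd function are hypothetical conveniences for this example; tabby itself is not assumed to read them. The defaults mirror the values used in the examples above, and the script only prints the command.

```shell
#!/bin/sh
# Hypothetical wrapper: the TABBY_* variables are illustrative only.
build_tabby_cmd() {
    model=${TABBY_MODEL:-StarCoder-1B}   # value for --model
    device=${TABBY_DEVICE:-cpu}          # value for --device
    port=${TABBY_PORT:-8080}             # value for --port
    cmd="tabby serve --model $model --device $device --port $port"
    # --chat-model is optional, so append it only when a value is set.
    if [ -n "${TABBY_CHAT_MODEL:-}" ]; then
        cmd="$cmd --chat-model $TABBY_CHAT_MODEL"
    fi
    printf '%s\n' "$cmd"
}

# Print the assembled command; run it manually once it looks right.
build_tabby_cmd
```

For instance, setting TABBY_DEVICE=cuda and TABBY_CHAT_MODEL=Qwen2-1.5B-Instruct before calling the function would yield a command equivalent to the chat-model example near the top of this page, with an explicit --port added.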

FAQ

What is the tabby command used for?

tabby runs Tabby, a self-hosted AI coding assistant that provides code completion, inline edits, and chat on your own infrastructure. Its serve command starts the API server and web UI that IDE extensions and other clients connect to.

How do I run a basic tabby example?

Run `tabby serve --model [StarCoder-1B] --device cuda` in a terminal, replacing the model name and device with values that suit your hardware (for example, `--device cpu` on a machine without a GPU).

What does --model _name_ do in tabby?

Code completion model to use (e.g., StarCoder-1B, CodeLlama-7B).
