OpenMOSS/MOSS-TTS Installation Guide

Automated Install (Recommended)

The commands below install OpenMOSS/MOSS-TTS through ipm. This is the fastest way to install and set up the project.

Install via curl

curl -fsSL https://hexmos.com/ipm-install | bash && ipm i OpenMOSS/MOSS-TTS

Install via PowerShell

iwr https://hexmos.com/ipm-install-ps -UseBasicParsing | iex; ipm i OpenMOSS/MOSS-TTS

Install via npx

npx @hexmos/ipm i OpenMOSS/MOSS-TTS

Prerequisites

Python (language): >=3.10
PyTorch (library): >=2.0
CUDA (toolkit): >=12.8
Conda (package manager): >=23.1
uv (package manager): >=0.1.14
llama.cpp (library): >=0.9.0
onnxruntime-gpu (library): no version given
tensorrt (library): no version given
Manual Installation Methods

Environment Setup using Conda

conda create -n moss-tts python=3.12 -y

conda activate moss-tts

git clone https://github.com/OpenMOSS/MOSS-TTS.git

cd MOSS-TTS

pip install --extra-index-url https://download.pytorch.org/whl/cu128 -e ".[torch-runtime]"

Environment Setup using uv

git clone https://github.com/OpenMOSS/MOSS-TTS.git

cd MOSS-TTS

uv venv --python 3.12 .venv

source .venv/bin/activate

uv pip install --torch-backend cu128 -e ".[torch-runtime]"
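After either environment setup above, it can be worth confirming that the installed PyTorch build targets CUDA and can see a GPU. The sketch below is an illustrative check, not a script shipped with MOSS-TTS; torch_cuda_summary is a hypothetical helper name. It uses only torch.__version__, torch.version.cuda, and torch.cuda.is_available(), and it returns a placeholder result if torch cannot be imported.

```python
def torch_cuda_summary():
    """Describe the installed torch build; safe to call when torch is absent."""
    try:
        import torch
    except ImportError:
        return {
            "installed": False,
            "torch": None,
            "cuda_build": None,
            "gpu_visible": False,
        }
    return {
        "installed": True,
        "torch": torch.__version__,
        # None on CPU-only builds; a cu128 wheel should report a 12.8 toolkit.
        "cuda_build": torch.version.cuda,
        "gpu_visible": bool(torch.cuda.is_available()),
    }


summary = torch_cuda_summary()
for key, value in summary.items():
    print(f"{key}: {value}")
```

If cuda_build is None after installing with the cu128 index, pip most likely resolved a CPU-only wheel, and re-running the install command inside the activated environment is a reasonable first step.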

Optional: Install FlashAttention 2 (using Conda/pip)

git clone https://github.com/OpenMOSS/MOSS-TTS.git

cd MOSS-TTS

pip install --extra-index-url https://download.pytorch.org/whl/cu128 -e ".[torch-runtime,flash-attn]"

Optional: Install FlashAttention 2 (using uv)

git clone https://github.com/OpenMOSS/MOSS-TTS.git

cd MOSS-TTS

uv venv --python 3.12 .venv

source .venv/bin/activate

uv pip install --torch-backend cu128 -e ".[torch-runtime,flash-attn]"
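Because FlashAttention 2 is optional, calling code usually needs a fallback when the flash_attn package is not present. The sketch below shows one way to make that choice. It assumes the model is loaded through Hugging Face Transformers, whose attn_implementation argument accepts the strings "flash_attention_2", "sdpa", and "eager"; whether a given MOSS-TTS entry point takes this argument is an assumption here, and both function names are hypothetical.

```python
import importlib.util


def flash_attn_installed():
    """Return True if the flash_attn package can be found in this environment."""
    return importlib.util.find_spec("flash_attn") is not None


def pick_attn_implementation(has_flash_attn, cuda_available):
    """Choose an attention backend name, preferring FlashAttention 2 on GPU."""
    if cuda_available and has_flash_attn:
        return "flash_attention_2"
    if cuda_available:
        # PyTorch's built-in scaled dot-product attention.
        return "sdpa"
    return "eager"


# Example: a GPU machine without the optional flash-attn extra installed.
print(pick_attn_implementation(has_flash_attn=False, cuda_available=True))
```

In practice the two flags would come from flash_attn_installed() and torch.cuda.is_available(). FlashAttention 2 also requires half-precision weights and a supported GPU architecture, which this sketch does not check.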

llama.cpp Backend (Torch-Free Inference)

pip install -e ".[llama-cpp-onnx]"

huggingface-cli download OpenMOSS-Team/MOSS-TTS-GGUF --local-dir weights/MOSS-TTS-GGUF

huggingface-cli download OpenMOSS-Team/MOSS-Audio-Tokenizer-ONNX --local-dir weights/MOSS-Audio-Tokenizer-ONNX

cd moss_tts_delay/llama_cpp && bash build_bridge.sh /path/to/llama.cpp && cd ../..

python -m moss_tts_delay.llama_cpp --config configs/llama_cpp/default.yaml --text "Hello, world!" --output output.wav
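The final command above depends on the two downloaded weight directories and the YAML config being in place. A small preflight check can catch a missing download before inference starts. This is a sketch under stated assumptions: the relative paths are copied from the commands above, it is meant to be run from the MOSS-TTS repository root, and missing_paths is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# Paths taken from the download and run commands above.
REQUIRED = [
    "weights/MOSS-TTS-GGUF",
    "weights/MOSS-Audio-Tokenizer-ONNX",
    "configs/llama_cpp/default.yaml",
]


def missing_paths(root, required=REQUIRED):
    """Return the required relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


missing = missing_paths(".")
if missing:
    print("Missing before llama.cpp inference can run:")
    for rel in missing:
        print(f"  {rel}")
else:
    print("All llama.cpp backend inputs found.")
```

The check only confirms that the paths exist. It does not validate the contents of the weight directories or that the bridge built by build_bridge.sh is present.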

SGLang Backend (Accelerated Inference)

git clone https://github.com/OpenMOSS/sglang.git

cd sglang && pip install -e "./python[all]"

pip install nvidia-cudnn-cu12==9.16.0.29

huggingface-cli download OpenMOSS-Team/MOSS-TTS --local-dir weights/MOSS-TTS

huggingface-cli download OpenMOSS-Team/MOSS-Audio-Tokenizer --local-dir weights/MOSS-Audio-Tokenizer

python scripts/fuse_moss_tts_delay_with_codec.py --model-path weights/MOSS-TTS --codec-model-path weights/MOSS-Audio-Tokenizer --save-path weights/MOSS-TTS-Delay-With-Codec

sglang serve --model-path weights/MOSS-TTS-Delay-With-Codec --delay-pattern --trust-remote-code
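Loading the fused model can take a while, so a client script may want to wait until the server accepts connections before sending requests. The sketch below polls a TCP port and makes no assumption about the server's HTTP routes. The helper name wait_for_port is hypothetical, and the host and port are whatever the server was started with; SGLang's usual default port of 30000 is an assumption here, since the serve command above does not set one.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Return True once a TCP connection succeeds, or False after the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as wait_for_port("127.0.0.1", 30000, timeout=300) would block for up to five minutes. An open port only shows that the process is listening, not that the model has finished loading, so the first real request may still be slow.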

Post Installation Steps

To run the basic usage example, activate the Python environment created above (conda activate moss-tts, or source .venv/bin/activate for uv) and change into the cloned MOSS-TTS directory.
The example Python script can then be run directly to generate audio samples.
