Features Audiobooks Podcasts Workflow Requirements Pricing Contact
Customise
Appearance
Accent Color
Professional AI Audio Production · On-Device

Your words,
broadcast quality.

Phonara Studio turns books, scripts, and podcast episodes into distribution-ready audio — ACX-compliant MP3 audiobooks and podcast episodes, generated entirely on your machine.

Buy Phonara Studio V1
Phonara Studio — Chapter 12 · The Signal
✓ 97% — Validated ⚠ 83% — Review ⚡ Generating… MP3 · ACX Ready
100%
On-Device
4
Voice Channels
192k
ACX MP3 Bitrate
44.1
kHz Output
−20
dB RMS Target

A complete production studio
in a single application.

📖
Smart Import
Import ePub, DOCX, Google Docs, or plain text. Phonara detects chapters automatically and shows word counts so you choose exactly what to produce.
🎤
Voice Cloning
Clone any voice from a 5–30 second reference clip. Up to 4 independent speaker channels per project for multi-voice narration and dialogue.
Full Automation
Set Save & Continue and walk away. Phonara generates, validates, saves, and advances through every chapter overnight without intervention.
🔍
Whisper Validation
Every paragraph is transcribed by OpenAI Whisper and compared word-for-word. Green, yellow, and red badges instantly surface lines that need a re-take.
🎙
Podcast Studio
Scaffold episodes with Intro, Main, and Outro segments. Pin show defaults once and reuse them across every episode automatically.
🎛
Timeline Editor
Visual waveform timeline with clip trimming, drag-to-reorder, multi-track layout, undo/redo, and real-time playback position tracking.
📡
Distribution Ready
Download produces ACX-compliant MP3 (192kbps) for audiobooks or 128kbps for podcasts — no DAW, no post-processing, no extra steps.
🔒
Fully Local
Everything runs on your hardware. No cloud API keys, no usage fees, no internet connection required after initial model download.
🎨
11 Themes
Sapphire, Midnight, Burgundy, Emerald, Plum, and six more. Dark and light mode. Your workspace, your aesthetic.

ACX-compliant.
Upload and go.

Phonara handles the entire technical spec automatically. Every download meets the Audible/ACX audio requirements out of the box.

Format MP3 · 192kbps CBR · Mono
Sample Rate 44.1 kHz
RMS Level −20 dB  (ACX: −23 to −18)
Peak −3 dBFS max
Room Tone 1s open · 1s close
Compatible ACX · Audible · KDP · Findaway
ACX Compliance Check
RMS Level−20.1 dB ✓
Peak Level−3.2 dBFS ✓
Noise Floor−62 dB ✓
Bitrate192 kbps CBR ✓
✓ Ready to submit to ACX
🎙 The Signal Podcast — Episode 12
🎵
Intro
Show default · 45 seconds
✓ Pinned
Main Content
Custom · 24 minutes 18 seconds
✓ Done
🎵
Outro
Show default · 30 seconds
✓ Pinned
⚡ Combine Episode MP3 · 128kbps · −16 dB RMS

Intro. Main. Outro.
Combined. Done.

Phonara manages show-level defaults so your intro and outro record once and reuse forever. Every episode is one click from a submission-ready MP3.

📌
Auto-pin show defaults
First recorded intro/outro becomes the show standard automatically
🔗
Smart episode library
All episodes indexed by show with segment done-states preserved
📦
Distribution-ready MP3
128kbps CBR · 44.1kHz · −16 dB RMS — upload directly to any host

From manuscript
to marketplace.

Five steps from import to a file ready to upload to ACX, Audible, Spotify, or any podcast host.

01
Load model & create voice
Select your device in Settings, load the AI model, and upload a 5–30 second reference clip to clone a voice. The speaker tab turns green when ready.
02
Import your content
Drop an ePub, DOCX, Google Doc, or text file. Phonara detects chapters, shows word counts, and lets you select exactly which sections to produce.
03
Parse & review
Text is split into paragraph cards. Edit inline, adjust pauses, assign speakers. Every detail is editable before a single byte of audio is generated.
04
Generate & validate
Click Generate All. Whisper validates every paragraph in real time. Green, yellow, and red badges surface issues. Regenerate any line with one click.
05
Download & distribute
Click Download. Receive an ACX-compliant MP3 — resampled, loudness-normalized, room tone added — ready to upload without any further processing.

Expressive syntax,
inline.

Embed non-verbal sounds and pronunciation overrides directly in your text. No post-processing. No audio editor.

Non-Verbal Sounds
[laughter] [sigh] [confirmation-en] [question-en] [question-ah] [question-oh] [surprise-ah] [surprise-oh] [surprise-wa] [dissatisfaction-hnn]
"You're serious?" she said. [question-oh] He laughed. [laughter] "Completely."
CMU Pronunciation Override
The algorithm at Theranos[TH EH R AH N OW S] had been designed to deceive. She worked at MIT[EH M AY T IY] for three years before leaving.
Use the CMU Pronouncing Dictionary to find phoneme strings for any English word. Copy the uppercase sequence in brackets immediately after the word.

The right format,
every time.

📖
Audiobook
MP3 · 192kbps CBR
Sample Rate44.1 kHz
RMS Level−20 dB
Peak−3 dBFS
Room Tone1s open + close
PlatformsACX · Audible · KDP
🎙
Podcast
MP3 · 128kbps CBR
Sample Rate44.1 kHz
RMS Level−16 dB
Peak−3 dBFS
Room ToneNone
PlatformsSpotify · Apple · Buzzsprout
Internal / Preview
Float32 · 24 kHz
UsageBrowser playback
Voice Refsvoice_temp/*.wav
ValidationWhisper input
NoteNot for distribution
Converts toMP3 on download

What you need
to get started.

Phonara runs entirely on your hardware. An NVIDIA GPU dramatically speeds up generation, but CPU mode is fully supported for all platforms.

Windows
Recommended
Full CUDA GPU support · Fastest generation
Linux
Fully Supported
CUDA GPU + CPU · Native performance
macOS
Supported
CPU mode · Apple Silicon MPS planned
Component
Minimum
Recommended
Notes
🐍
Python
Runtime
3.10
3.11
3.12 supported · 3.9 or earlier will not work
🎞
FFmpeg
Audio encoding
6.0
6.1+
Required for MP3 export · libmp3lame must be included
🔍
OpenAI Whisper
Validation engine
any
latest
Installed via pip · Auto-downloads model weights on first use
🔥
PyTorch
AI framework
2.1
2.8 + CUDA 12.8
CPU builds supported · CUDA 11.8+ for GPU acceleration
💾
GPU (NVIDIA)
Optional · strongly recommended
8 GB VRAM
16+ GB VRAM
RTX 3080+ ideal · CPU fallback ~10–20× slower
🖥
RAM
System memory
8 GB
16+ GB
More RAM helps with long chapters and multi-track sessions
💿
Disk Space
Model + project storage
10 GB
20+ GB
~5 GB for model weights · remaining for your episode library
Python Libraries — installed automatically by setup script
flaskWeb server
flask-corsCORS headers
omnivoiceTTS model
openai-whisperValidation
soundfileAudio I/O
scipyResampling
numpyAudio math
torchAI runtime
torchaudioAudio models
transformersHuggingFace
huggingface-hubModel downloads
ebooklibePub parsing
python-docxDOCX parsing
beautifulsoup4HTML parsing
requestsHTTP client

Up and running
in minutes.

Our platform setup scripts handle every dependency automatically — Python, PyTorch with the right CUDA version, FFmpeg, Whisper, and all libraries. One script, done.

1
Download & Extract
Download phonara-studio-windows.zip and extract it to your preferred location — e.g. C:\PhonaraStudio\
2
Run the setup script
Right-click setup_windows.bat and choose Run as administrator. The script will:
  • Check for Python 3.11 and install via winget if missing
  • Install FFmpeg via winget (required for MP3 export)
  • Create an isolated virtual environment in venv\
  • Detect your GPU and install the correct CUDA PyTorch build
  • Install all Python dependencies from requirements.txt
3
Launch Phonara Studio
Double-click start_phonara.bat. The app opens at http://localhost:7000 in your browser automatically.
setup_windows.bat
:: [1/6] Check / install Python 3.11
python --version >nul 2>&1 || (
  winget install --id Python.Python.3.11 --silent
)

:: [2/6] Install FFmpeg
ffmpeg -version >nul 2>&1 || (
  winget install --id Gyan.FFmpeg --silent
)

:: [3/6] Create virtual environment
python -m venv venv
call venv\Scripts\activate.bat

:: [4/6] Detect GPU — correct PyTorch build
nvidia-smi >nul 2>&1 && (
  pip install torch torchaudio ^
    --extra-index-url https://download.pytorch.org/whl/cu128
) || (
  pip install torch torchaudio ^
    --index-url https://download.pytorch.org/whl/cpu
)

:: [5/6] Install all Python dependencies
pip install -r requirements.txt

:: [6/6] Create project folders
mkdir templates episodes voices voice_temp media shows

echo Setup complete — run start_phonara.bat
First Launch Note — On the very first launch, Phonara downloads the OmniVoice AI model weights from HuggingFace (~5 GB). This happens once and is cached to your models/ folder. Subsequent launches start in seconds with no internet connection required.

Professional audio.
Your machine. Your rules.

No subscriptions. No cloud fees. No API keys. Run on your GPU, produce ACX-ready audiobooks and podcast-ready episodes, and keep every file.

Buy Phonara Studio V1

Windows · Linux · macOS  ·  NVIDIA GPU recommended  ·  CPU supported