UnaMentis - Open Source Voice AI Tutoring Platform

Why UnaMentis?

Existing voice AI tools struggle with extended educational sessions. UnaMentis was built from the ground up to enable natural, flowing conversations that can last 60-90+ minutes without degradation.

🎙️

Voice-Native Design

Built for voice from day one. Natural interruption handling, turn-taking logic, and voice activity detection create fluid conversations.

🔌

Provider Agnostic

Swap STT, TTS, and LLM providers without code changes. Support for OpenAI, Anthropic, ElevenLabs, Deepgram, and self-hosted options.

📚

Curriculum System

UMLCF (UnaMentis Markup Language Curriculum Format) is a purpose-built specification for conversational AI tutoring with voice-optimized content.

⚡

Low Latency

Sub-500ms median end-to-end turn latency through careful architecture, prefetching, and intelligent routing.

🏠

Self-Hosted Ready

Run your own STT, TTS, and LLM servers for privacy-first deployments. Full support for Ollama, llama.cpp, Piper, and more.

📊

Full Observability

Real-time telemetry, cost tracking per provider, latency metrics, and thermal monitoring for production deployments.

Architecture Overview

Mobile Application (iOS in development, Android planned)

SwiftUI Views: Session UI, Curriculum Navigator, Analytics Dashboard

↓

Core Business Logic: SessionManager, CurriculumEngine, TelemetryEngine, PatchPanel Router

↓

Service Layer (Actor-based): STT Services, TTS Services, LLM Services, VAD Service

↓

Infrastructure: Audio Engine, Core Data, URLSession, CoreML (VAD)
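
To make the layering concrete, here is a minimal sketch of one conversational turn passing through the service layer. It is written in Python for illustration only; the mobile client itself is Swift. The `SpeechToText`, `LanguageModel`, and `TextToSpeech` protocols, the `Services` bundle, `run_turn`, and the fake services are all hypothetical names, not the project's API.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class SpeechToText(Protocol):
    async def transcribe(self, audio: bytes) -> str: ...


class LanguageModel(Protocol):
    async def reply(self, prompt: str) -> str: ...


class TextToSpeech(Protocol):
    async def synthesize(self, text: str) -> bytes: ...


@dataclass
class Services:
    """Hypothetical bundle of the service-layer objects used by one session."""
    stt: SpeechToText
    llm: LanguageModel
    tts: TextToSpeech


async def run_turn(services: Services, audio: bytes) -> bytes:
    """One turn: learner audio in, tutor audio out."""
    transcript = await services.stt.transcribe(audio)
    answer = await services.llm.reply(transcript)
    return await services.tts.synthesize(answer)


# Fake services so the sketch runs without any real provider.
class FakeSTT:
    async def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class FakeLLM:
    async def reply(self, prompt: str) -> str:
        return f"You said: {prompt}"


class FakeTTS:
    async def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


def demo_turn() -> bytes:
    """Run a single turn with the fake services."""
    services = Services(stt=FakeSTT(), llm=FakeLLM(), tts=FakeTTS())
    return asyncio.run(run_turn(services, b"hello"))
```

Because `run_turn` depends only on the three protocols, any object with matching methods can stand in for a provider, which is the property the business-logic layer relies on when services are swapped.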

Supported Providers

UnaMentis supports a wide range of providers for each component of the voice pipeline. Mix and match based on your needs for quality, cost, latency, or privacy.

Speech-to-Text

Cloud: AssemblyAI Universal
Cloud: Deepgram Nova-3
Cloud: OpenAI Whisper
Device: Apple Speech
Device: GLM-ASR-Nano
Self-Hosted: GLM-ASR Server

Text-to-Speech

Cloud: ElevenLabs Flash/Turbo
Cloud: Deepgram Aura-2
Device: Apple TTS
Self-Hosted: Piper
Self-Hosted: VibeVoice

Large Language Models

Cloud: OpenAI GPT-4o / 4o-mini
Cloud: Anthropic Claude 3.5
Self-Hosted: Ollama
Self-Hosted: llama.cpp
Self-Hosted: vLLM
Device: llama.cpp (experimental)

Voice Activity Detection

Device: Silero VAD (CoreML)
Device: TEN VAD
Device: WebRTC VAD
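
One way to picture "mix and match" is a configuration that names a provider per pipeline stage and is resolved against a catalog. The sketch below is an assumption about how such a lookup could work, not the project's actual configuration format: the `Provider` record, `CATALOG`, `resolve`, `is_private`, and the stage keys are hypothetical, and the catalog holds only a subset of the providers listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    stage: str       # "stt", "tts", "llm", or "vad" (hypothetical keys)
    deployment: str  # "cloud", "device", or "self-hosted"
    name: str


# Subset of the providers listed above, for illustration.
CATALOG = [
    Provider("stt", "cloud", "Deepgram Nova-3"),
    Provider("stt", "device", "Apple Speech"),
    Provider("stt", "self-hosted", "GLM-ASR Server"),
    Provider("tts", "cloud", "ElevenLabs Flash/Turbo"),
    Provider("tts", "device", "Apple TTS"),
    Provider("tts", "self-hosted", "Piper"),
    Provider("llm", "cloud", "OpenAI GPT-4o / 4o-mini"),
    Provider("llm", "self-hosted", "Ollama"),
    Provider("vad", "device", "Silero VAD (CoreML)"),
]


def resolve(config: dict[str, str]) -> dict[str, Provider]:
    """Map each stage named in the config to its catalog entry."""
    by_key = {(p.stage, p.name): p for p in CATALOG}
    pipeline = {}
    for stage, name in config.items():
        try:
            pipeline[stage] = by_key[(stage, name)]
        except KeyError:
            raise ValueError(f"unknown {stage} provider: {name!r}") from None
    return pipeline


def is_private(pipeline: dict[str, Provider]) -> bool:
    """True when no stage sends data to a cloud provider."""
    return all(p.deployment != "cloud" for p in pipeline.values())


# A privacy-first pipeline: every stage runs on device or self-hosted.
PRIVATE_CONFIG = {
    "stt": "GLM-ASR Server",
    "tts": "Piper",
    "llm": "Ollama",
    "vad": "Silero VAD (CoreML)",
}
```

Switching the LLM from Ollama to a cloud model in this sketch is a one-line change to the configuration dictionary; `resolve` rejects names that are not in the catalog.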

UMLCF Curriculum Format

UMLCF (UnaMentis Markup Language Curriculum Format) is a JSON-based curriculum specification designed specifically for conversational AI tutoring.

Voice-Native

Every text field has optional spoken variants optimized for TTS pronunciation.

Standards-Grounded

Maps to IEEE LOM, LRMI, SCORM, xAPI, QTI, CASE, and Open Badges.

Tutoring-First

Built for natural conversation with stopping points and misconception handling.

Unlimited Depth

Topics can nest to arbitrary depth for complex subject matter.

{
  "umlcf_version": "1.0.0",
  "curriculum": {
    "id": "intro-machine-learning",
    "title": "Introduction to Machine Learning",
    "topics": [{
      "id": "supervised-learning",
      "title": "Supervised Learning",
      "learningObjectives": [
        "Explain the difference between supervised and unsupervised learning",
        "Identify when to use classification vs regression"
      ],
      "transcript": {
        "text": "Let's start with supervised learning...",
        "spokenText": "Let's start with supervised learning.",
        "stoppingPoints": [{
          "afterParagraph": 2,
          "comprehensionCheck": "Can you give me an example of a labeled dataset?"
        }]
      }
    }]
  }
}
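
A consumer of this format mainly needs to walk the topic tree and pick the right text for speech. The sketch below assumes two things the example does not confirm: that child topics nest under a `topics` key on each topic, and that `spokenText` should be preferred over `text` when present. The helper names and the nested "Classification" topic are hypothetical.

```python
import json
from typing import Iterator

SAMPLE = json.loads("""
{
  "umlcf_version": "1.0.0",
  "curriculum": {
    "id": "intro-machine-learning",
    "title": "Introduction to Machine Learning",
    "topics": [{
      "id": "supervised-learning",
      "title": "Supervised Learning",
      "transcript": {
        "text": "Let's start with supervised learning...",
        "spokenText": "Let's start with supervised learning."
      },
      "topics": [{
        "id": "classification",
        "title": "Classification",
        "transcript": {"text": "Classification predicts a label."}
      }]
    }]
  }
}
""")


def iter_topics(topics: list[dict], depth: int = 0) -> Iterator[tuple[int, dict]]:
    """Yield (depth, topic) pairs in document order, at any nesting depth."""
    for topic in topics:
        yield depth, topic
        # Assumption: child topics live under a nested "topics" key.
        yield from iter_topics(topic.get("topics", []), depth + 1)


def speech_for(topic: dict) -> str:
    """Prefer the TTS-optimized variant, falling back to the display text."""
    transcript = topic.get("transcript", {})
    return transcript.get("spokenText") or transcript.get("text", "")


def outline(document: dict) -> list[str]:
    """Indented list of topic titles for the whole curriculum."""
    topics = document["curriculum"].get("topics", [])
    return [f"{'  ' * depth}{topic['title']}" for depth, topic in iter_topics(topics)]
```

Here `outline(SAMPLE)` lists "Supervised Learning" with "Classification" indented beneath it, and `speech_for` returns the spoken variant for the first topic and the plain text for the second.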

Server Components

UnaMentis includes a Python-based management server and web dashboard for monitoring, curriculum management, and analytics.

Management Server

Async HTTP server with WebSocket support for:

Remote log aggregation from iOS clients
Real-time metrics streaming
Resource monitoring (CPU, memory, thermal)
Idle state management

Python 3.11+ / aiohttp / asyncio

Curriculum Database

Storage and retrieval for UMLCF curricula:

File-based storage for development
PostgreSQL support for production
Search and filtering by metadata
Topic hierarchy navigation

PostgreSQL / File-based / JSON
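
The file-based development mode can be pictured as one JSON file per curriculum plus a scan for metadata search. This is a minimal sketch under assumed conventions: the `<id>.json` file naming, the `save_curriculum` and `search` helpers, and the title-substring filter are hypothetical and not taken from the project.

```python
import json
import tempfile
from pathlib import Path


def save_curriculum(root: Path, document: dict) -> Path:
    """Write a curriculum document to <root>/<curriculum id>.json."""
    path = root / f"{document['curriculum']['id']}.json"
    path.write_text(json.dumps(document, indent=2), encoding="utf-8")
    return path


def search(root: Path, title_contains: str) -> list[str]:
    """Return ids of curricula whose title contains the query (case-insensitive)."""
    query = title_contains.lower()
    matches = []
    for path in sorted(root.glob("*.json")):
        curriculum = json.loads(path.read_text(encoding="utf-8"))["curriculum"]
        if query in curriculum["title"].lower():
            matches.append(curriculum["id"])
    return matches


def demo() -> list[str]:
    """Store two curricula in a temporary directory and search them."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        save_curriculum(root, {"curriculum": {
            "id": "intro-machine-learning",
            "title": "Introduction to Machine Learning",
        }})
        save_curriculum(root, {"curriculum": {
            "id": "algebra-1",
            "title": "Algebra I",
        }})
        return search(root, "machine")
```

A linear scan like this is adequate for a handful of files during development; a production deployment would push the same filter into PostgreSQL instead.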

Web Dashboard

React-based administration interface:

Curriculum browsing and management
Session analytics visualization
Real-time metrics streaming
Provider health monitoring

Next.js / React / TypeScript

Technical Stack

Mobile Client (iOS)

Swift 6.0 with strict concurrency
SwiftUI for all views
AVFoundation for audio
CoreML for on-device ML
Core Data for persistence

Backend

Python 3.11+
aiohttp for async HTTP
PostgreSQL (production)
Next.js dashboard
WebSocket for real-time

Performance

<500ms median turn latency
<1000ms P99 latency
90+ minute session stability
<50MB memory growth
Thermal monitoring
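
The two latency targets can be checked against recorded turn latencies as follows. This is a sketch, not the project's telemetry code: the function names are hypothetical, and the nearest-rank percentile definition is an assumption, since the page does not say how P99 is computed.

```python
import math
import statistics

MEDIAN_TARGET_MS = 500   # "<500ms median turn latency"
P99_TARGET_MS = 1000     # "<1000ms P99 latency"


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_targets(latencies_ms: list[float]) -> bool:
    """True when both the median and the P99 are under their targets."""
    return (
        statistics.median(latencies_ms) < MEDIAN_TARGET_MS
        and percentile(latencies_ms, 99) < P99_TARGET_MS
    )
```

With 100 samples, the nearest-rank P99 is the 99th smallest value, so a single slow outlier does not fail the check but two do.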

Quality

119+ automated tests
Real-over-mock testing
SwiftLint / SwiftFormat
Accessibility-first UI
Comprehensive docs

Team

UnaMentis is developed and maintained by a dedicated team committed to making high-quality AI tutoring accessible to everyone.

Richard Amerman

Founder and Project Lead

Cy Goerdt

Partner

Ready to Get Started?

UnaMentis is fully open source and open to contributions. Whether you want to use it, extend it, or help improve it, you are welcome.

Quick Start Guide | View on GitHub