Voice AI Tutoring for Everyone

An open source mobile platform for extended voice-based educational conversations. The iOS app is in development for the initial release, and an Android app is planned. Designed for 90+ minute sessions with sub-500ms median turn latency.

<500ms Turn Latency
90+ Minute Sessions
100% Open Source

Why UnaMentis?

Existing voice AI tools struggle with extended educational sessions. UnaMentis was built from the ground up to enable natural, flowing conversations that can last 60 to 90+ minutes without degradation in latency or stability.

🎙️

Voice-Native Design

Built for voice from day one. Natural interruption handling, turn-taking logic, and voice activity detection create fluid conversations.
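
As a rough illustration of interruption handling and turn-taking, the sketch below models a conversation as a small state machine driven by VAD and TTS events. The names `TurnState` and `TurnTaker` are hypothetical and not taken from the UnaMentis codebase; a real implementation would also cancel audio playback and hand the utterance to STT.

```python
from enum import Enum, auto


class TurnState(Enum):
    LISTENING = auto()       # waiting for the learner to speak
    USER_SPEAKING = auto()   # VAD currently reports learner speech
    TUTOR_SPEAKING = auto()  # TTS audio is playing


class TurnTaker:
    """Hypothetical turn-taking state machine (illustration only)."""

    def __init__(self) -> None:
        self.state = TurnState.LISTENING
        self.interruptions = 0

    def on_vad(self, speech_detected: bool) -> None:
        if speech_detected:
            if self.state is TurnState.TUTOR_SPEAKING:
                # Barge-in: the learner talked over the tutor. A real app
                # would stop TTS playback at this point.
                self.interruptions += 1
            self.state = TurnState.USER_SPEAKING
        elif self.state is TurnState.USER_SPEAKING:
            # Silence after speech ends the learner's turn.
            self.state = TurnState.LISTENING

    def on_tts_started(self) -> None:
        self.state = TurnState.TUTOR_SPEAKING

    def on_tts_finished(self) -> None:
        # Only return to listening if the learner has not already barged in.
        if self.state is TurnState.TUTOR_SPEAKING:
            self.state = TurnState.LISTENING
```

Keeping the barge-in check inside the VAD handler means a late "TTS finished" event cannot overwrite a turn the learner has already taken.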

🔌

Provider Agnostic

Swap STT, TTS, and LLM providers without code changes. Supports OpenAI, Anthropic, ElevenLabs, Deepgram, and self-hosted options.
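
One common way to make providers swappable through configuration alone is a shared interface plus a registry keyed by name. The sketch below is a minimal Python illustration of that pattern, not the project's actual router; `LLMProvider`, `LLM_REGISTRY`, `make_llm`, and the two echo classes are hypothetical stand-ins for real provider clients.

```python
from typing import Callable, Protocol


class LLMProvider(Protocol):
    """Interface every LLM backend is assumed to satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoCloudLLM:
    """Stand-in for a cloud provider; a real one would call an HTTP API."""

    def complete(self, prompt: str) -> str:
        return f"[cloud] {prompt}"


class EchoLocalLLM:
    """Stand-in for a self-hosted provider."""

    def complete(self, prompt: str) -> str:
        return f"[local] {prompt}"


# Hypothetical registry: configuration names mapped to provider factories.
LLM_REGISTRY: dict[str, Callable[[], LLMProvider]] = {
    "cloud": EchoCloudLLM,
    "self-hosted": EchoLocalLLM,
}


def make_llm(config: dict[str, str]) -> LLMProvider:
    """Build the provider named in the config, failing loudly on typos."""
    name = config["llm"]
    try:
        factory = LLM_REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown LLM provider: {name!r}") from None
    return factory()
```

With this shape, switching backends is a one-line config change such as `make_llm({"llm": "self-hosted"})`, and calling code depends only on `complete`.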

📚

Curriculum System

UMLCF (UnaMentis Markup Language Curriculum Format) is a purpose-built specification for conversational AI tutoring with voice-optimized content.

Low Latency

Sub-500ms median end-to-end turn latency through careful architecture, prefetching, and intelligent routing.

🏠

Self-Hosted Ready

Run your own STT, TTS, and LLM servers for privacy-first deployments. Full support for Ollama, llama.cpp, Piper, and more.

📊

Full Observability

Real-time telemetry, cost tracking per provider, latency metrics, and thermal monitoring for production deployments.
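
Per-provider cost tracking can be sketched as a running total of usage multiplied by a unit rate. The example below assumes a hypothetical `CostTracker` class; the provider names and rates in the usage note are made-up placeholders, not real prices or UnaMentis telemetry fields.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CostTracker:
    """Hypothetical accumulator of spend per provider."""

    # Price per billing unit (for example per audio minute or per 1K tokens).
    rates: dict[str, float]
    totals: defaultdict = field(default_factory=lambda: defaultdict(float))

    def record(self, provider: str, units: float) -> None:
        # A KeyError here means usage was reported for an unpriced provider.
        self.totals[provider] += units * self.rates[provider]

    def report(self) -> dict[str, float]:
        # Sorted by provider name so reports are stable between runs.
        return {name: round(cost, 6) for name, cost in sorted(self.totals.items())}
```

For instance, after `tracker = CostTracker({"stt-a": 0.5, "llm-b": 2.0})`, calling `tracker.record("stt-a", 3)` and then `tracker.report()` yields the accumulated cost for `stt-a` only.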

Architecture Overview

Mobile Application (iOS in development, Android planned)

  • SwiftUI Views: Session UI, Curriculum Navigator, Analytics Dashboard
  • Core Business Logic: SessionManager, CurriculumEngine, TelemetryEngine, PatchPanel Router
  • Service Layer (Actor-based): STT Services, TTS Services, LLM Services, VAD Service
  • Infrastructure: Audio Engine, Core Data, URLSession, CoreML (VAD)

Supported Providers

UnaMentis supports a wide range of providers for each component of the voice pipeline. Mix and match based on your needs for quality, cost, latency, or privacy.

Speech-to-Text

  • Cloud: AssemblyAI Universal
  • Cloud: Deepgram Nova-3
  • Cloud: OpenAI Whisper
  • Device: Apple Speech
  • Device: GLM-ASR-Nano
  • Self-Hosted: GLM-ASR Server

Text-to-Speech

  • Cloud: ElevenLabs Flash/Turbo
  • Cloud: Deepgram Aura-2
  • Device: Apple TTS
  • Self-Hosted: Piper
  • Self-Hosted: VibeVoice

Large Language Models

  • Cloud: OpenAI GPT-4o / 4o-mini
  • Cloud: Anthropic Claude 3.5
  • Self-Hosted: Ollama
  • Self-Hosted: llama.cpp
  • Self-Hosted: vLLM
  • Device: llama.cpp (experimental)

Voice Activity Detection

  • Device: Silero VAD (CoreML)
  • Device: TEN VAD
  • Device: WebRTC VAD

UMLCF Curriculum Format

UMLCF (UnaMentis Markup Language Curriculum Format) is a JSON-based curriculum specification designed specifically for conversational AI tutoring.

Voice-Native

Every text field has optional spoken variants optimized for TTS pronunciation.

Standards-Grounded

Maps to IEEE LOM, LRMI, SCORM, xAPI, QTI, CASE, and Open Badges.

Tutoring-First

Built for natural conversation with stopping points and misconception handling.

Unlimited Depth

Topics can nest to arbitrary depth for complex subject matter.

{
  "umlcf_version": "1.0.0",
  "curriculum": {
    "id": "intro-machine-learning",
    "title": "Introduction to Machine Learning",
    "topics": [{
      "id": "supervised-learning",
      "title": "Supervised Learning",
      "learningObjectives": [
        "Explain the difference between supervised and unsupervised learning",
        "Identify when to use classification vs regression"
      ],
      "transcript": {
        "text": "Let's start with supervised learning...",
        "spokenText": "Let's start with supervised learning.",
        "stoppingPoints": [{
          "afterParagraph": 2,
          "comprehensionCheck": "Can you give me an example of a labeled dataset?"
        }]
      }
    }]
  }
}
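
To show how a client might consume a document like this, the sketch below walks topics recursively and prefers `spokenText` over `text` when choosing what to send to TTS. It assumes child topics are nested under a `topics` key inside each topic, which the example above does not show, and the `classification` sub-topic is invented for illustration. The helper names `spoken`, `walk_topics`, and `outline` are hypothetical.

```python
import json
from typing import Iterator

# Trimmed sample; the nested "classification" topic is a hypothetical addition.
SAMPLE = """
{
  "umlcf_version": "1.0.0",
  "curriculum": {
    "id": "intro-machine-learning",
    "title": "Introduction to Machine Learning",
    "topics": [{
      "id": "supervised-learning",
      "title": "Supervised Learning",
      "transcript": {
        "text": "Let's start with supervised learning...",
        "spokenText": "Let's start with supervised learning."
      },
      "topics": [{
        "id": "classification",
        "title": "Classification",
        "transcript": {"text": "Classification predicts a label."}
      }]
    }]
  }
}
"""


def spoken(transcript: dict) -> str:
    """Prefer the TTS-optimized variant, falling back to the plain text."""
    return transcript.get("spokenText") or transcript.get("text", "")


def walk_topics(topics: list[dict], depth: int = 0) -> Iterator[tuple[int, dict]]:
    """Yield (depth, topic) pairs in document order, to arbitrary depth."""
    for topic in topics:
        yield depth, topic
        # ASSUMPTION: child topics live under a nested "topics" key.
        yield from walk_topics(topic.get("topics", []), depth + 1)


def outline(document: dict) -> list[tuple[int, str, str]]:
    """Flatten a curriculum into (depth, topic id, text to speak) rows."""
    return [
        (depth, topic["id"], spoken(topic.get("transcript", {})))
        for depth, topic in walk_topics(document["curriculum"]["topics"])
    ]


rows = outline(json.loads(SAMPLE))
```

Because the traversal is a generator, a session can pause at any topic without materializing the whole tree.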

Server Components

UnaMentis includes a Python-based management server and a Next.js web dashboard for monitoring, curriculum management, and analytics.

Management Server

Async HTTP server with WebSocket support for:

  • Remote logging aggregation from iOS clients
  • Real-time metrics streaming
  • Resource monitoring (CPU, memory, thermal)
  • Idle state management

Stack: Python 3.11+ / aiohttp / asyncio

Curriculum Database

Storage and retrieval for UMLCF curricula:

  • File-based storage for development
  • PostgreSQL support for production
  • Search and filtering by metadata
  • Topic hierarchy navigation

Stack: PostgreSQL / File-based / JSON
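
The file-based development storage with metadata search could look roughly like the sketch below, which keeps one JSON file per curriculum and filters by title. `FileCurriculumStore` and its methods are hypothetical names, and title matching stands in for whatever metadata fields the real server indexes.

```python
import json
import tempfile
from pathlib import Path


class FileCurriculumStore:
    """Hypothetical one-file-per-curriculum store for development use."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, curriculum_id: str) -> Path:
        return self.root / f"{curriculum_id}.json"

    def save(self, document: dict) -> Path:
        path = self._path(document["curriculum"]["id"])
        path.write_text(json.dumps(document, indent=2), encoding="utf-8")
        return path

    def load(self, curriculum_id: str) -> dict:
        return json.loads(self._path(curriculum_id).read_text(encoding="utf-8"))

    def search(self, title_contains: str) -> list[str]:
        """Return ids of curricula whose title contains the given text."""
        needle = title_contains.lower()
        hits = []
        for path in sorted(self.root.glob("*.json")):
            curriculum = json.loads(path.read_text(encoding="utf-8"))["curriculum"]
            if needle in curriculum["title"].lower():
                hits.append(curriculum["id"])
        return hits
```

A linear scan like this is fine for a handful of local files; the production path described above would push the same filter into PostgreSQL.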

Web Dashboard

React-based administration interface:

  • Curriculum browsing and management
  • Session analytics visualization
  • Real-time metrics streaming
  • Provider health monitoring

Stack: Next.js / React / TypeScript

Technical Stack

Mobile Client (iOS)

  • Swift 6.0 with strict concurrency
  • SwiftUI for all views
  • AVFoundation for audio
  • CoreML for on-device ML
  • Core Data for persistence

Backend

  • Python 3.11+
  • aiohttp for async HTTP
  • PostgreSQL (production)
  • Next.js dashboard
  • WebSocket for real-time

Performance

  • <500ms median turn latency
  • <1000ms P99 latency
  • 90+ minute session stability
  • <50MB memory growth
  • Thermal monitoring
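
The latency targets above can be checked against recorded turn latencies with a few lines of Python. This is a sketch under stated assumptions: it uses the nearest-rank definition of P99, which may differ from how the project's telemetry computes percentiles, and `percentile` and `meets_targets` are hypothetical helpers.

```python
import math
import statistics

MEDIAN_TARGET_MS = 500   # from the "<500ms median turn latency" target
P99_TARGET_MS = 1000     # from the "<1000ms P99 latency" target


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not samples:
        raise ValueError("no latency samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_targets(latencies_ms: list[float]) -> bool:
    """True when both the median and the P99 are under their targets."""
    return (
        statistics.median(latencies_ms) < MEDIAN_TARGET_MS
        and percentile(latencies_ms, 99) < P99_TARGET_MS
    )
```

Checking both numbers matters because a session can have a comfortable median while a few slow turns still push the tail past one second.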

Quality

  • 119+ automated tests
  • Real-over-mock testing
  • SwiftLint / SwiftFormat
  • Accessibility-first UI
  • Comprehensive docs

Team

UnaMentis is developed and maintained by a dedicated team committed to making high-quality AI tutoring accessible to everyone.

Richard Amerman

Founder and Project Lead

Cy Goerdt

Partner

Ready to Get Started?

UnaMentis is fully open source and ready for contribution. Whether you want to use it, extend it, or help improve it, we welcome you.