Our Philosophy: Experience-Driven AI Development
UnaMentis is built with AI-assisted development from the ground up, but this is not "vibecoding." The project founder brings over 30 years of experience in technology, most of it as a developer, including years of contributions to open source projects. Every architectural decision, every tool selection, and every quality standard is informed by decades of real-world software engineering experience.
The goal is ambitious: use AI not just to move faster, but to approach the quality and review standards achieved by a thoughtful, attentive human developer. We have deep respect for that standard. A skilled developer bringing full attention to code review, architecture decisions, and quality assurance is not easily replicated by any single tool. However, when six or seven layers of AI-driven tools and processes work together, each with overlapping review and complementary perspectives, the cumulative effect can begin to approximate that level of rigor. We are documenting an ongoing experiment in what becomes possible when deep experience guides this layered approach with intention.
A Living Story
This page documents an evolving journey. We use Claude Code as our primary development partner, supplemented by a carefully chosen ecosystem of AI-powered tools. As new capabilities emerge and our understanding deepens, we continuously review and adapt our approach. What you read here represents our current state, our intentions, and our commitment to improvement.
AI handles the repetitive, error-prone aspects of software development while human experience guides architecture, quality standards, and the creative problem-solving that makes UnaMentis unique. The combination enables a small team to build and maintain a sophisticated, multi-platform voice AI tutoring system.
Shift Left
Catch issues at commit time, not in production. AI helps enforce quality standards before code ever leaves the developer's machine.
Automate Everything
Humans should not do what machines can do better. Every manual quality check becomes an automated gate.
Measure Continuously
You cannot improve what you do not measure. AI-powered observability gives us real-time insight into code quality and performance.
Learn from Data
DORA metrics and quality dashboards guide engineering decisions, creating a feedback loop that continuously improves our process.
AI Tools We Use
Our AI-assisted development workflow combines multiple specialized tools, each chosen for its strength in a specific domain. Together, they form a comprehensive system that touches every aspect of our development process.
Claude Code
Primary Development Partner
Our primary AI coding assistant for:
- Code generation and refactoring
- Architecture design and review
- Documentation writing
- Test creation and debugging
- Cross-platform development (iOS, Web, Server)
CodeRabbit
Automated PR Review
AI-powered code review on every pull request:
- Language-specific analysis (Swift, Python, TypeScript)
- Concurrency safety checks for Swift 6
- Security vulnerability detection
- Architecture diagram generation
- Free for open source projects
Intelligent Automation
CI/CD & Quality Gates
Automated quality enforcement:
- Pre-commit hooks for linting and formatting
- Renovate for dependency management
- CodeQL for security analysis
- Gitleaks for secrets detection
- DevLake for DORA metrics
AI-Assisted Development Workflow
The Code Quality Initiative
To achieve enterprise-grade quality with a small team, we implemented a systematic 5-phase Code Quality Initiative. Each phase builds on the previous, creating layers of automated protection that catch issues progressively earlier in the development cycle.
The Impact
This infrastructure enables a two-person team to maintain quality standards that would typically require 10+ engineers, while preserving the agility and velocity that make small teams effective. Every commit passes the same quality checks. Every PR gets reviewed by AI. Every deployment is monitored.
Key Achievements
| Capability | Status | Impact |
|---|---|---|
| Pre-commit quality gates | Implemented | Issues caught before commit |
| Automated dependency management | Implemented | Zero manual dependency tracking |
| 80% code coverage enforcement | Implemented | CI fails below threshold |
| Performance regression detection | Implemented | Automated latency monitoring |
| Security scanning | Implemented | Secrets, CodeQL, dependency audits |
| Feature flag lifecycle | Implemented | Safe rollouts with cleanup tracking |
| DORA metrics & observability | Implemented | Engineering health visibility |
| AI-powered code review | Implemented | Every PR reviewed by CodeRabbit |
Phase 1: Foundation
The foundation phase automates existing manual quality gates across iOS, Server, and Web components. The goal: make quality enforcement invisible and unavoidable.
Pre-Commit Hooks
A unified hook system runs automatically before every commit, catching issues before they enter the repository:
Swift
SwiftLint (strict mode) and SwiftFormat validation ensure consistent iOS code style.
Python
Ruff handles linting and format checking, replacing Black and Flake8 with a faster, unified tool.
JavaScript/TypeScript
ESLint and Prettier enforce consistent web code formatting across React and Next.js components.
Secrets
Gitleaks scans all files for accidentally committed API keys, passwords, or tokens.
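Taken together, the runner amounts to a loop over these gates. A simplified Python sketch of that loop follows; the tools are the ones named above, but the wiring and flags are illustrative, not our actual hook configuration:

```python
#!/usr/bin/env python3
"""Illustrative unified pre-commit runner; flags and wiring are simplified."""
import subprocess
import sys

# Each gate is (name, command); any non-zero exit blocks the commit.
GATES = [
    ("SwiftLint", ["swiftlint", "--strict"]),
    ("SwiftFormat", ["swiftformat", "--lint", "."]),
    ("Ruff lint", ["ruff", "check", "."]),
    ("Ruff format", ["ruff", "format", "--check", "."]),
    ("ESLint", ["npx", "eslint", "."]),
    ("Prettier", ["npx", "prettier", "--check", "."]),
    ("Gitleaks", ["gitleaks", "protect", "--staged"]),
]

def main() -> int:
    failed = [name for name, cmd in GATES if subprocess.run(cmd).returncode != 0]
    if failed:
        print(f"Commit blocked by: {', '.join(failed)}", file=sys.stderr)
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```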
Dependency Automation (Renovate)
Manual dependency tracking is eliminated. Renovate handles everything automatically:
- Schedule: Updates run Mondays before 6am, minimizing disruption
- Grouping: iOS, Python, and npm dependencies grouped separately for focused review
- Auto-merge: Security patches, patch updates, and dev dependencies merge automatically
- Manual review: Major version updates and breaking changes require human approval
Coverage Enforcement
Code coverage is not a suggestion. It is a gate. The iOS build fails if coverage drops below 80%. This is enforced automatically in CI, with coverage extracted from Xcode result bundles.
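A minimal sketch of that gate, assuming coverage is read from the Xcode result bundle via `xcrun xccov` (the bundle path and JSON handling are simplified):

```python
#!/usr/bin/env python3
"""Coverage gate sketch: fail the build below 80% line coverage."""
import json
import subprocess
import sys

THRESHOLD = 0.80
RESULT_BUNDLE = sys.argv[1] if len(sys.argv) > 1 else "Build.xcresult"

# xccov can emit the coverage report as JSON; the top-level lineCoverage
# field is the aggregate fraction across targets.
report = json.loads(subprocess.check_output(
    ["xcrun", "xccov", "view", "--report", "--json", RESULT_BUNDLE]
))
coverage = report["lineCoverage"]

print(f"Line coverage: {coverage:.1%} (gate: {THRESHOLD:.0%})")
if coverage < THRESHOLD:
    print("Coverage gate failed; blocking the build.", file=sys.stderr)
    sys.exit(1)
```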
Phase 2: Enhanced Quality Gates
Phase 2 extends quality enforcement with nightly testing, performance regression detection, and comprehensive security scanning.
Nightly End-to-End Testing
Every night at 2am UTC, comprehensive end-to-end tests run against the full system:
- iOS E2E tests with real API keys (from GitHub Secrets)
- Latency regression tests using the provider comparison suite
- Full voice pipeline validation
- Automatic GitHub issue creation on failure with "nightly-failure" label
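The failure-to-issue step reduces to one call to the GitHub REST API. An illustrative sketch, with the repository path as a placeholder:

```python
#!/usr/bin/env python3
"""File a GitHub issue when the nightly run fails (illustrative sketch)."""
import os
import requests

REPO = "example-org/unamentis"  # placeholder, not the real repository path

resp = requests.post(
    f"https://api.github.com/repos/{REPO}/issues",
    headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
    json={
        "title": "Nightly E2E failure",
        "body": "The 2am UTC end-to-end run failed; see the workflow logs.",
        "labels": ["nightly-failure"],
    },
    timeout=30,
)
resp.raise_for_status()
print(f"Opened issue #{resp.json()['number']}")
```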
Performance Regression Detection
Voice applications live and die by latency. Our latency test harness ensures we never ship a slower release:
- 500 ms: end-to-end turn latency median (P50) target
- 1000 ms: 99th percentile (P99) latency ceiling
- +10%: regression warning threshold
- +20%: CI blocks at this level of regression
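In sketch form, the harness's gate logic is a percentile check against those targets (sample ingestion and baseline tracking are simplified here):

```python
"""Latency regression gate sketch: warn at +10% over target, block at +20%.
Reads one latency sample (ms) per line on stdin."""
import statistics
import sys

TARGETS_MS = {"P50": 500.0, "P99": 1000.0}
WARN_FACTOR, BLOCK_FACTOR = 1.10, 1.20

def percentile(samples: list[float], p: int) -> float:
    """p-th percentile via statistics.quantiles (99 cut points; index p-1)."""
    return statistics.quantiles(samples, n=100)[p - 1]

def gate(latencies_ms: list[float]) -> int:
    exit_code = 0
    observed = {"P50": percentile(latencies_ms, 50),
                "P99": percentile(latencies_ms, 99)}
    for name, target in TARGETS_MS.items():
        value = observed[name]
        if value > target * BLOCK_FACTOR:
            print(f"{name} {value:.0f}ms exceeds +20% over {target:.0f}ms: blocking CI")
            exit_code = 1
        elif value > target * WARN_FACTOR:
            print(f"{name} {value:.0f}ms exceeds +10% over {target:.0f}ms: warning")
    return exit_code

if __name__ == "__main__":
    samples = [float(line) for line in sys.stdin if line.strip()]
    sys.exit(gate(samples))
```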
Multi-Layer Security Scanning
Security is not a single check. It is a layered defense:
| Scanner | Purpose | Schedule |
|---|---|---|
| Gitleaks | Secrets detection (full git history) | Every PR + weekly |
| CodeQL | Static analysis (Swift, Python, JavaScript) | Every PR + weekly |
| pip-audit | Python dependency vulnerabilities | Every PR + weekly |
| npm audit | JavaScript dependency vulnerabilities | Every PR + weekly |
Phase 3: Feature Flag System
Feature flags enable safe experimentation. New features can be developed, deployed, and tested in production without affecting all users. If something goes wrong, we flip a switch instead of rolling back a deployment.
Self-Hosted Unleash Infrastructure
We run our own feature flag system using Unleash, giving us full control over flag management without subscription costs:
Unleash Server
Port 4242: Core flag management and administration interface.
Unleash Proxy
Port 3063: Edge proxy for client SDK connections with caching.
iOS SDK
Actor-based service with SwiftUI view modifier for seamless integration.
Web SDK
React context and hooks (useFlag, useFlagVariant) for web components.
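Both SDKs ultimately talk to the proxy over HTTP, so any component can query flag state directly. An illustrative Python check against a local proxy, with the client key and flag name as placeholders:

```python
"""Illustrative flag check against the local Unleash proxy."""
import requests

PROXY_URL = "http://localhost:3063/proxy"
CLIENT_KEY = "example-client-key"  # placeholder

resp = requests.get(PROXY_URL, headers={"Authorization": CLIENT_KEY}, timeout=10)
resp.raise_for_status()
toggles = {t["name"]: t["enabled"] for t in resp.json()["toggles"]}

if toggles.get("new-voice-pipeline", False):  # hypothetical flag name
    print("Feature enabled for this context")
```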
Flag Lifecycle Management
Feature flags have a lifecycle. Forgotten flags become technical debt. Our automated audit system tracks every flag from creation to cleanup:
- Flags have target removal dates and designated owners
- Weekly automated scans detect overdue flags (90-day maximum age)
- CI creates GitHub issues for flags approaching expiration
- PR comments highlight any flag changes for review
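A simplified version of the audit logic, assuming a hypothetical flags.json metadata file that records each flag's owner, creation date, and target removal date (the 14-day warning window shown is illustrative):

```python
"""Flag lifecycle audit sketch. flags.json and its fields are illustrative
assumptions; the 90-day maximum matches our policy."""
import json
from datetime import date, timedelta

MAX_AGE = timedelta(days=90)
WARNING_WINDOW = timedelta(days=14)  # illustrative, not our exact window

with open("flags.json") as f:
    flags = json.load(f)  # [{"name", "owner", "created", "remove_by"}, ...]

today = date.today()
for flag in flags:
    age = today - date.fromisoformat(flag["created"])
    remove_by = date.fromisoformat(flag["remove_by"])
    if age > MAX_AGE:
        print(f"OVERDUE: {flag['name']} (owner {flag['owner']}) is {age.days} days old")
    elif remove_by - today <= WARNING_WINDOW:
        print(f"EXPIRING: {flag['name']} should be removed by {remove_by}")
```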
Phase 4: Observability & DORA Metrics
You cannot improve what you do not measure. Phase 4 provides visibility into quality trends and engineering health through industry-standard DORA metrics.
DORA Metrics (Apache DevLake)
We track the four key metrics that distinguish elite engineering teams:
| Metric | What It Measures | Elite Target |
|---|---|---|
| Deployment Frequency | How often code ships to production | Multiple per day |
| Lead Time for Changes | Commit to production duration | Less than 1 hour |
| Change Failure Rate | Deployments causing failures | 0-15% |
| Mean Time to Recovery | Incident to resolution duration | Less than 1 hour |
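DevLake computes these for us, but the underlying arithmetic is simple. A sketch of two of the four metrics, using made-up merge and deploy timestamps:

```python
"""Sketch: deriving two DORA metrics from (merged_at, deployed_at) pairs.
The data below is invented; DevLake does this for us in practice."""
from datetime import datetime
from statistics import median

changes = [  # (merged_at, deployed_at) for changes that reached production
    (datetime(2025, 1, 6, 9, 0), datetime(2025, 1, 6, 9, 40)),
    (datetime(2025, 1, 6, 13, 5), datetime(2025, 1, 6, 13, 50)),
    (datetime(2025, 1, 7, 10, 0), datetime(2025, 1, 7, 11, 10)),
]

lead_times = [deploy - merge for merge, deploy in changes]
deploys = [deploy for _, deploy in changes]
days = max((max(deploys) - min(deploys)).days, 1)

print(f"Median lead time: {median(lead_times)}")               # elite: < 1 hour
print(f"Deployment frequency: {len(deploys) / days:.1f}/day")  # elite: multiple/day
```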
Quality Dashboard
Daily automated metrics collection provides ongoing visibility:
- CI/CD success rates across iOS, Server, and Web pipelines
- Pull request metrics (count, average size, review time)
- Bug metrics (open count, closed in 30 days, age distribution)
- Trend analysis with 90-day retention for pattern detection
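The collection itself is plain API plumbing. A sketch of the PR-metrics slice using the GitHub REST API (repository path is a placeholder; authentication and paging omitted):

```python
"""PR-metrics collection sketch via the GitHub REST API."""
import statistics
import requests

REPO = "example-org/unamentis"  # placeholder

prs = requests.get(
    f"https://api.github.com/repos/{REPO}/pulls",
    params={"state": "closed", "per_page": 50},
    timeout=30,
).json()

# The list endpoint omits diff stats, so fetch each PR's detail record.
sizes = []
for pr in prs:
    detail = requests.get(pr["url"], timeout=30).json()
    sizes.append(detail["additions"] + detail["deletions"])

print(f"Closed PRs sampled: {len(prs)}")
print(f"Average PR size: {statistics.mean(sizes):.0f} changed lines")
```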
Phase 5: AI-Powered Code Review
The final phase brings AI into the review process. Every pull request receives automated analysis from CodeRabbit, configured for maximum issue detection with language-specific rules for our tech stack.
Review Configuration
CodeRabbit is configured in "assertive" mode for comprehensive coverage:
Swift Reviews
- Swift 6.0 concurrency safety verification
- Actor isolation violation detection
- Sendable conformance checks
- Data race identification in async code
- Memory leak and retain cycle detection
- Force unwrap usage analysis
Python Reviews
- Async/await usage patterns
- Exception handling completeness
- Type hint coverage
- Security vulnerability scanning
- aiohttp-specific best practices
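As a concrete illustration of what these checks push toward, here is the shape of aiohttp code that passes review cleanly: typed, a shared session, and the library's own exception hierarchy (example code, not from our codebase):

```python
"""Illustrative pattern: typed async aiohttp with explicit error handling."""
import aiohttp

async def fetch_transcript(session: aiohttp.ClientSession, url: str) -> str | None:
    """Reuse one ClientSession per app, never one per request."""
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=10)) as resp:
            resp.raise_for_status()
            return await resp.text()
    except aiohttp.ClientError as exc:
        # Catch the library's error hierarchy, not a bare `except:`.
        print(f"fetch failed: {exc}")
        return None
```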
TypeScript/React Reviews
- React hook dependency arrays
- Server/client component boundaries
- Accessibility (a11y) compliance
- Next.js App Router patterns
- Type safety enforcement
CI/CD Reviews
- GitHub Action version pinning
- Permissions scope verification
- Secrets handling review
- Cache configuration optimization
- Workflow efficiency suggestions
Cost: Free for Open Source
CodeRabbit provides this enterprise-grade AI review capability free for open source projects. The same service costs $24-30 per seat per month for private repositories, a significant saving for UnaMentis.
Results & The Road Ahead
The Code Quality Initiative is an ongoing journey. Phases 1-4 are complete, with Phase 5 actively in progress. Here is where we stand and where we are heading:
Current Quality Gates
| Gate | Threshold | Enforcement |
|---|---|---|
| Code Coverage | 80% minimum | CI fails if below |
| Latency P50 | 500ms | Warns at +10%, fails at +20% |
| Latency P99 | 1000ms | Warns at +10%, fails at +20% |
| Lint (all languages) | Zero violations | Pre-commit hook blocks |
| Secrets Detection | Zero findings | Pre-commit + CI blocks |
| Feature Flag Age | 90 days maximum | Weekly audit creates issues |
| Security Vulnerabilities | Zero critical/high | Security workflow blocks |
Planned Advanced Features
Mutation Testing
Proves tests catch bugs, not just hit lines. Using Muter (Swift), mutmut (Python), and Stryker (Web).
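The idea in one file: a suite can cover a line yet miss the bug a mutant introduces, and only a boundary assertion kills the operator mutation (example code, not from our suite):

```python
def can_retry(attempts: int, limit: int = 3) -> bool:
    return attempts < limit  # mutant: `<` becomes `<=`

def test_can_retry_boundary():
    # A test that merely covers the line (e.g. asserting can_retry(0)) passes
    # under the mutant too; this boundary assertion actually kills it.
    assert can_retry(2) and not can_retry(3)
```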
Voice Pipeline Resilience
Network degradation simulation: high latency, packet loss, disconnection handling, graceful degradation.
Contract Testing
Ensures iOS client and Server API stay in sync using Pact. Deferred until APIs stabilize.
Predictive Alerts
Move from reactive to proactive: detect performance degradation before it impacts users.
This Story Continues
AI-assisted development is not a destination. It is an evolving practice. As new tools emerge and our understanding deepens, we will continue to push the boundaries of what a small team can accomplish with intelligent automation. This page will be updated as our journey continues.