A.L.F.R.E.D — James Strohm

Overview

A.L.F.R.E.D is an AI assistant with three entry points — a voice CLI, a PySide6 desktop GUI, and a FastAPI REST API — all sharing one Supabase backend. The project demonstrates end-to-end backend engineering: API design, containerization, CI/CD, test coverage, and cloud deployment.

The system uses a RAG pipeline backed by Supabase + pgvector for persistent semantic memory, a multi-provider LLM fallback chain (Claude 3.5 Sonnet → GPT-4o-mini), and integrations with Google Calendar, weather APIs, file management, and system monitoring.

Architecture

Entry points:
Voice CLI: python main.py
PySide6 GUI: python -m ui.app
FastAPI: python -m api.run

↓

Command Router (Memory / Services / LLM) → LLM + RAG (Claude 3.5 → GPT-4o-mini)

↓

Supabase + pgvector (Semantic Memory · Conversations · OpenAI Embeddings)

REST API

A thin HTTP layer over existing logic: all endpoints delegate to brain.py and memory_manager.py with zero duplication. Auto-generated OpenAPI docs are available.

Method	Endpoint	Description
POST	/chat	Send a message, get AI response
GET	/chat/history	Retrieve conversation history
POST	/memories	Store a key-value memory
GET	/memories/{key}	Recall a specific memory
DELETE	/memories/{key}	Forget a memory
POST	/memories/search	Semantic vector search
GET	/system/health	Health check + Supabase status
WS	/ws/chat	Real-time WebSocket chat with auth handshake
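
As a rough client-side illustration of the table above, the sketch below builds (without sending) requests for two of the endpoints using only the standard library. The base URL, the "message" field name, and the helper names are assumptions for illustration; the actual request schema is whatever the project's Pydantic models define, as shown in the generated OpenAPI docs.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical local address, not from the project


def build_chat_request(message: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request with a JSON body."""
    # The "message" field name is an assumption about the request schema.
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_forget_request(key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a DELETE /memories/{key} request, URL-quoting the key."""
    quoted_key = urllib.parse.quote(key, safe="")
    return urllib.request.Request(
        url=f"{base_url}/memories/{quoted_key}",
        method="DELETE",
    )
```

Passing either object to urllib.request.urlopen would send it; building the request separately keeps the sketch side-effect free.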

Key Features

FastAPI REST + WebSocket

9 endpoints with Pydantic validation, auto-generated Swagger docs, CORS middleware, and a WebSocket endpoint for real-time chat authenticated via an initial auth message instead of URL query params. Deployed on Railway via Docker.
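
The auth-message handshake can be sketched as a small validation function applied to the first WebSocket frame. This is a minimal sketch, not the project's code: the JSON shape ({"type": "auth", "token": ...}) and the function name are assumptions for illustration.

```python
import hmac
import json


def check_auth_message(raw: str, expected_token: str) -> bool:
    """Return True only if the first frame is a well-formed auth message
    carrying the expected token. The message shape is assumed."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(msg, dict) or msg.get("type") != "auth":
        return False
    token = msg.get("token")
    if not isinstance(token, str):
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(token.encode("utf-8"), expected_token.encode("utf-8"))
```

A server would typically run a check like this once after accepting the connection and close the socket when it returns False. Sending the token in the message body rather than the URL keeps it out of places where URLs are commonly recorded, such as access logs.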

Multi-Provider LLM + RAG

Claude 3.5 Sonnet (via OpenRouter) with GPT-4o-mini fallback. RAG pipeline injects semantically relevant memories and recent facts into every query via pgvector.
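
A fallback chain of this kind can be sketched as an ordered list of provider callables tried in turn. The names below (complete_with_fallback, AllProvidersFailed, the Provider type) are hypothetical; the real providers would wrap the OpenRouter and OpenAI clients, which are omitted here.

```python
from typing import Callable, Sequence

# A provider takes a prompt and returns a completion; real ones would call an API.
Provider = Callable[[str], str]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain raised an error."""


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each (name, provider) pair in order.

    Returns (provider_name, completion) from the first provider that succeeds.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # any provider error moves on to the next one
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the text makes it easy to log which model actually answered. The same pattern would fit the TTS chain described later, with a text-only provider as the last entry.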

Semantic Memory

Supabase PostgreSQL + pgvector for cloud-backed vector search with OpenAI text-embedding-3-small. Full CRUD: remember, recall, forget, list, and semantic search.
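
To illustrate the memory interface, here is an in-memory stand-in for the five operations. It is only a sketch under stated assumptions: the class and method names are hypothetical, the embedder is injected (the project uses OpenAI text-embedding-3-small), and cosine similarity is computed in Python rather than by pgvector. The default threshold mirrors the 0.5 value mentioned under Challenges.

```python
import math
from typing import Callable, Optional

Embedder = Callable[[str], list[float]]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryMemoryStore:
    """Hypothetical stand-in for the Supabase-backed store."""

    def __init__(self, embed: Embedder) -> None:
        self._embed = embed
        self._rows: dict[str, tuple[str, list[float]]] = {}

    def remember(self, key: str, value: str) -> None:
        # Store the value together with an embedding of "key: value".
        self._rows[key] = (value, self._embed(f"{key}: {value}"))

    def recall(self, key: str) -> Optional[str]:
        row = self._rows.get(key)
        return row[0] if row else None

    def forget(self, key: str) -> bool:
        # True if something was actually removed.
        return self._rows.pop(key, None) is not None

    def list_keys(self) -> list[str]:
        return sorted(self._rows)

    def search(self, query: str, threshold: float = 0.5, limit: int = 5) -> list[tuple[str, float]]:
        """Return (key, similarity) pairs at or above the threshold, best first."""
        q = self._embed(query)
        scored = [(key, cosine_similarity(q, vec)) for key, (_, vec) in self._rows.items()]
        hits = [(key, score) for key, score in scored if score >= threshold]
        return sorted(hits, key=lambda h: h[1], reverse=True)[:limit]
```

In the real system the search step would be a database query rather than a Python loop; injecting the embedder also makes the store easy to test without network access.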

Production Infrastructure

Docker containerization, GitHub Actions CI (ruff lint + pytest), pinned dependencies, pyproject.toml packaging, and 65+ unit tests in which every external call is mocked.

Voice Interface + GUI

Google STT input with ElevenLabs/pyttsx3 TTS fallback chain. PySide6 GUI with dual waveform visualization, system dashboard, and JARVIS-inspired dark theme.

Service Integrations

Google Calendar (natural language scheduling), OpenWeather with caching, a secure file manager with path validation and explicit delete confirmation, and real-time system monitoring.
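
Path validation with explicit delete confirmation might look roughly like the following. This is a sketch, not the project's file manager: the function names, the UnsafePathError exception, and the confirmed flag are assumptions for illustration.

```python
from pathlib import Path


class UnsafePathError(ValueError):
    """Raised when a requested path escapes the allowed root directory."""


def resolve_inside(root: Path, user_path: str) -> Path:
    """Resolve user_path against root and reject anything outside root.

    Resolving first collapses ".." segments and follows symlinks, so the
    containment check runs on the real target path.
    """
    root = root.resolve()
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise UnsafePathError(f"{user_path!r} is outside the allowed directory")
    return candidate


def delete_file(root: Path, user_path: str, confirmed: bool = False) -> bool:
    """Delete a file only when the caller has explicitly confirmed.

    Returns False (and deletes nothing) if confirmation is missing.
    """
    target = resolve_inside(root, user_path)
    if not confirmed:
        return False
    target.unlink()
    return True
```

Validating before checking the confirmation flag means an unsafe path is rejected even when the user never confirms, so the confirmation prompt is only ever shown for paths inside the allowed directory.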

Challenges & Solutions

API Without Duplication: The FastAPI layer needed to expose existing logic without rewriting it. Used synchronous handlers (plain def), which FastAPI runs in a thread pool rather than on the event loop, so the existing sync codebase works unchanged with zero refactoring.
Import Isolation: Desktop dependencies (PySide6, PyAudio, numpy) can't install in a slim Docker container. Moved GUI/voice imports to lazy loading so the API server starts cleanly with only backend dependencies.
Memory Relevance: Not all past context is useful. Set the pgvector similarity threshold to 0.5 and capped RAG context at ~500 tokens, combining semantic search results with the 5 most recently updated facts.
Provider Reliability: API outages are unpredictable. Built a fallback chain (Claude → GPT-4o-mini) and triple TTS fallback (ElevenLabs → pyttsx3 → text-only) for resilience.
CI Without Secrets: Tests needed to run without API keys. All external calls are mocked, and CI injects dummy environment variables so module-level client initialization succeeds.
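
The Memory Relevance approach above can be sketched as a small context builder. The function names are hypothetical, and the four-characters-per-token estimate is a stand-in heuristic rather than a real tokenizer; only the 0.5 threshold, the ~500-token cap, and the five recent facts come from the description above.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (~4 characters per token); an assumption, not a tokenizer."""
    return max(1, len(text) // 4)


def build_rag_context(
    semantic_hits: list[tuple[str, float]],
    recent_facts: list[str],
    threshold: float = 0.5,
    max_recent: int = 5,
    token_budget: int = 500,
) -> str:
    """Combine relevant semantic hits with recent facts under a token budget.

    semantic_hits: (text, similarity) pairs from vector search.
    recent_facts: fact texts ordered newest first.
    """
    ranked = sorted(semantic_hits, key=lambda h: h[1], reverse=True)
    relevant = [text for text, score in ranked if score >= threshold]
    candidates = relevant + list(recent_facts[:max_recent])

    lines: list[str] = []
    seen: set[str] = set()
    used = 0
    for text in candidates:
        if text in seen:
            continue  # a fact can be both relevant and recent; include it once
        cost = estimate_tokens(text)
        if used + cost > token_budget:
            break  # stop at the first entry that would exceed the budget
        seen.add(text)
        lines.append(text)
        used += cost
    return "\n".join(lines)
```

Placing semantic hits ahead of recent facts means that, when the budget is tight, the most relevant memories are the ones that survive.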

Tech Stack

Python · FastAPI · Supabase · pgvector · Docker · GitHub Actions · OpenRouter (Claude 3.5) · OpenAI API · PySide6 · ElevenLabs · Google Calendar API · Railway · pytest · ruff