# pyJarvis

A Python implementation of the Jarvis text-to-voice assistant with an animated digital face, LLM integration, and speech recognition.

It is a modular Python application following Clean Architecture, SOLID principles, and common design patterns. Modules:
- pyjarvis_shared: Shared types, messages, and centralized configuration
- pyjarvis_core: Core domain logic (text analysis, TTS processors, audio processing, animations)
- pyjarvis_service: Service layer for text-audio processing (TCP IPC)
- pyjarvis_cli: CLI application to send text to service
- pyjarvis_ui: Desktop UI with animated robot face (Pygame)
- pyjarvis_llama: LLM integration with speech recognition (Ollama + RealtimeSTT)
## Prerequisites

- Python 3.10+
- FFmpeg (for MP3 processing) — see the "Install FFmpeg" section below
- Optional: Ollama (for LLM features)
- See `requirements.txt` for all dependencies
## Quick Start

Install dependencies:

```bash
pip install -r requirements.txt
```

Then run each component in its own terminal:

- Terminal 1, start the service: `python -m pyjarvis_service`
- Terminal 2, start the UI: `python -m pyjarvis_ui`
- Terminal 3, send text: `python -m pyjarvis_cli "Hello, I am Jarvis"`

You can also specify the language manually:

```bash
python -m pyjarvis_cli "Hola, soy Jarvis" --language es
python -m pyjarvis_cli "Olá, eu sou Jarvis" --language pt-BR
```

## LLM CLI (Ollama)

- Start Ollama (if using a local LLM): `ollama serve`
- Start the service and UI as above.
- Start the LLM CLI in another terminal: `python -m pyjarvis_llama`

In the LLM CLI you can:
- Type messages and press Enter for text input
- Use `/m` to record audio from the microphone (press Enter to stop)
- Use `/lang <code>` to change the speech-recognition (STT) language
- Use `/persona <name>` to change the AI persona
- Automatic language detection: the system detects the language of LLM responses and uses the matching TTS voice
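Under the hood, the LLM CLI talks to a local Ollama server over HTTP. A minimal, non-streaming client can be sketched with only the standard library; the endpoint and payload follow Ollama's documented `/api/generate` route, but the function names here are illustrative, not the project's actual code:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default base URL

def build_generate_request(model: str, prompt: str) -> bytes:
    """Build a non-streaming JSON request body for Ollama's /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(payload).encode("utf-8")

def ask_ollama(model: str, prompt: str, base_url: str = OLLAMA_URL) -> str:
    """POST the prompt to Ollama and return the generated text."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=build_generate_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the full reply in "response".
        return json.loads(resp.read())["response"]
```

For example, `ask_ollama("llama3", "Say hello in one sentence.")` would return the model's reply, provided Ollama is running and that model has been pulled.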
The UI can run standalone but is much more useful connected to the service:
```bash
python -m pyjarvis_ui
```

## Project Structure

Top-level layout (canonical):

```
pyJarvis/
├── README.md
├── requirements.txt
├── pyjarvis_shared/
├── pyjarvis_core/
├── pyjarvis_service/
├── pyjarvis_cli/
├── pyjarvis_ui/
├── pyjarvis_llama/
├── audio/
├── assets/
├── models/
└── docs/
```

- `pyjarvis_shared/`: AppConfig, message types, and shared utilities
- `pyjarvis_core/`: TextAnalyzer, AnimationController, TTS factory and processors
- `pyjarvis_service/`: IPC server and TextProcessor orchestration
- `pyjarvis_ui/`: Pygame-based UI (FaceRenderer, AudioPlayer, ServiceClient)
- `pyjarvis_llama/`: LLM CLI, Ollama client, STT recorder, personas
## Features

- Animated robot face with lip-sync and emotion-driven effects
- Multiple TTS engines (Edge-TTS default, gTTS available)
- Automatic language detection for TTS (Portuguese, English, Spanish) using `langdetect` with a heuristic fallback
- STT integration via RealtimeSTT / Whisper models
- LLM support via Ollama and configurable AI personas
- Multi-language support: Automatic detection and manual override for TTS language selection
- TCP-based IPC; UI registers for broadcast updates from service
## Configuration

All runtime configuration is centralized in `pyjarvis_shared/config.py` (`AppConfig`):

- TTS processor selection (`tts_processor`)
- Audio output directory and auto-delete behavior
- Edge-TTS voice mapping (`edge_tts_voices`)
- STT model and language
- Ollama base URL and model
- Language detection: the `langdetect` library with a heuristic fallback
## Language Detection

PyJarvis includes automatic language detection for TTS voice selection:

- Primary method: the `langdetect` library for accurate language detection
- Fallback: heuristic detection based on language-specific patterns and keywords
- Supported languages: Portuguese (pt-BR), English (en-US), Spanish (es-ES, es-MX, es-AR, etc.)
- Manual override: specify the language explicitly with the `--language` flag in the CLI
The system automatically detects the language of:
- Text sent via CLI (if no language is specified)
- LLM responses in the interactive CLI (detected automatically and sent to TTS with the correct language code)

Accepted language codes:

- Portuguese: `pt`, `pt-BR`, `portuguese`
- English: `en`, `en-US`, `en-GB`, `english`
- Spanish: `es`, `es-ES`, `es-MX`, `es-AR`, `spanish`, `español`
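To make the two-stage scheme concrete, here is a small sketch of alias normalization plus a keyword-scoring fallback. The alias table mirrors the codes listed above; the keyword lists and scoring are illustrative only, since the real implementation tries `langdetect` first:

```python
import re

# Alias table mirroring the accepted codes listed above.
LANGUAGE_ALIASES = {
    "pt": "pt-BR", "pt-br": "pt-BR", "portuguese": "pt-BR",
    "en": "en-US", "en-us": "en-US", "en-gb": "en-GB", "english": "en-US",
    "es": "es-ES", "es-es": "es-ES", "es-mx": "es-MX", "es-ar": "es-AR",
    "spanish": "es-ES", "español": "es-ES",
}

def normalize_language(code: str) -> str:
    """Map a user-supplied language alias to a canonical TTS code."""
    return LANGUAGE_ALIASES.get(code.strip().lower(), code)

def detect_language_heuristic(text: str) -> str:
    """Keyword-based fallback detection (illustrative keyword lists)."""
    # Strip punctuation so "Olá," still matches the keyword " olá ".
    padded = " " + re.sub(r"[^\w\s]", " ", text.lower()) + " "
    keywords = {
        "pt-BR": (" não ", " você ", " olá ", " obrigado ", " eu sou "),
        "es-ES": (" hola ", " gracias ", " soy ", " usted ", " está "),
        "en-US": (" hello ", " the ", " i am ", " you ", " thanks "),
    }
    scores = {lang: sum(k in padded for k in words)
              for lang, words in keywords.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "en-US"  # default to English
```

For instance, `detect_language_heuristic("Olá, eu sou Jarvis")` scores Portuguese highest and returns `"pt-BR"`.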
## TTS Engines

- Edge-TTS (Microsoft) is the default high-quality engine. Voice mapping is configurable by language.
- gTTS (Google) is supported as a fallback (requires internet and FFmpeg for MP3→WAV conversion).
- Automatic language detection ensures the correct voice is used for each language.
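The engine choice sits behind the TTS factory in `pyjarvis_core`. Conceptually it can be sketched like this; the class and function names are illustrative, and the `synthesize` bodies are stubbed out rather than calling the real engines:

```python
from abc import ABC, abstractmethod

class TTSProcessor(ABC):
    """Common interface every TTS engine implements."""

    @abstractmethod
    def synthesize(self, text: str, voice: str) -> bytes:
        """Return synthesized audio bytes for the given text and voice."""

class EdgeTTSProcessor(TTSProcessor):
    def synthesize(self, text: str, voice: str) -> bytes:
        raise NotImplementedError("would call edge-tts here")

class GTTSProcessor(TTSProcessor):
    def synthesize(self, text: str, voice: str) -> bytes:
        raise NotImplementedError("would call gTTS, then FFmpeg for MP3 to WAV")

# Registry keyed by the (hypothetical) AppConfig.tts_processor value.
_REGISTRY = {"edge-tts": EdgeTTSProcessor, "gtts": GTTSProcessor}

def create_tts_processor(name: str) -> TTSProcessor:
    """Instantiate the processor selected by configuration."""
    try:
        return _REGISTRY[name.lower()]()
    except KeyError:
        raise ValueError(f"unknown TTS processor: {name!r}") from None
```

Keeping engines behind one interface is what lets the service swap Edge-TTS for gTTS (or a WAV-emitting engine) with a single config change.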
## Install FFmpeg

PyJarvis uses FFmpeg (via pydub) to convert MP3 audio (e.g., produced by gTTS) to WAV.
Windows options:
- Chocolatey (recommended): `choco install ffmpeg`
- winget: `winget install ffmpeg`
- Manual download:
  - Download from: https://www.gyan.dev/ffmpeg/builds/
  - Extract the ZIP and add the `bin` folder to your PATH (e.g. `C:\ffmpeg\bin`).
  - Restart your terminal.

Verify the installation:

```bash
ffmpeg -version
```

If you prefer not to install FFmpeg, use a TTS engine that emits WAV directly.
## Voice Configuration

Configure voices in `pyjarvis_shared/config.py` (example):

```python
from pyjarvis_shared import AppConfig

config = AppConfig()
config.edge_tts_voices = {
    "pt-br": "pt-BR-HumbertoNeural",
    "pt": "pt-BR-FranciscaNeural",
    "en": "en-US-AriaNeural",
    "en-us": "en-US-GuyNeural",
    "es": "es-ES-ElviraNeural",
    "es-es": "es-ES-ElviraNeural",
    "es-mx": "es-MX-DaliaNeural",
    "es-ar": "es-AR-ElenaNeural"
}
```

### Portuguese (pt-BR) voices

- `pt-BR-FranciscaNeural` - female (default)
- `pt-BR-HumbertoNeural` - male
- `pt-BR-AntonioNeural` - male
- `pt-BR-BrendaNeural` - female
- `pt-BR-DonatoNeural` - male
- `pt-BR-ElzaNeural` - female
- `pt-BR-FabioNeural` - male
- `pt-BR-GiovannaNeural` - female
- `pt-BR-JulioNeural` - male
- `pt-BR-LeilaNeural` - female
- `pt-BR-LeticiaNeural` - female
- `pt-BR-ManuelaNeural` - female
- `pt-BR-NicolauNeural` - male
- `pt-BR-ThalitaNeural` - female
- `pt-BR-ValerioNeural` - male
- `pt-BR-YaraNeural` - female
### English (en-US) voices

- `en-US-AriaNeural` - female (default)
- `en-US-GuyNeural` - male
- `en-US-JennyNeural` - female
- `en-US-AmberNeural` - female
- `en-US-AnaNeural` - female (child)
- `en-US-AshleyNeural` - female
- `en-US-BrandonNeural` - male
- `en-US-ChristopherNeural` - male
- `en-US-CoraNeural` - female
- `en-US-ElizabethNeural` - female
- `en-US-EricNeural` - male
- `en-US-JacobNeural` - male
- `en-US-JaneNeural` - female
- `en-US-JasonNeural` - male
- `en-US-MichelleNeural` - female
- `en-US-MonicaNeural` - female
- `en-US-NancyNeural` - female
- `en-US-RogerNeural` - male
- `en-US-SaraNeural` - female
- `en-US-TonyNeural` - male

### Spanish (es-ES) voices

- `es-ES-ElviraNeural` - female (default; Bright, Clear)
- `es-ES-AlvaroNeural` - male (Confident, Animated)
- `es-ES-AbrilNeural` - female
- `es-ES-ArabellaMultilingualNeural` - female (Cheerful, Friendly, Casual, Warm, Pleasant)
- `es-ES-ArnauNeural` - male
- `es-ES-DarioNeural` - male
- `es-ES-EliasNeural` - male
- `es-ES-EstrellaNeural` - female
- `es-ES-IreneNeural` - female (Curious, Cheerful)
- `es-ES-IsidoraMultilingualNeural` - female (Cheerful, Friendly, Warm, Casual)
- `es-ES-LaiaNeural` - female
- `es-ES-LiaNeural` - female (Animated, Bright)
- `es-ES-NilNeural` - male
- `es-ES-SaulNeural` - male
- `es-ES-TeoNeural` - male
- `es-ES-TrianaNeural` - female
- `es-ES-VeraNeural` - female
- `es-ES-XimenaNeural` - female

### Spanish (es-MX) voices

- `es-MX-DaliaNeural` - female (default)
- `es-MX-DaliaMultilingualNeural` - female
- `es-MX-BeatrizNeural` - female
- `es-MX-CandelaNeural` - female
- `es-MX-CarlotaNeural` - female
- `es-MX-CecilioNeural` - male
- `es-MX-GerardoNeural` - male
- `es-MX-JorgeNeural` - male
- `es-MX-JorgeMultilingualNeural` - male
- `es-MX-LarissaNeural` - female
- `es-MX-LibertoNeural` - male
- `es-MX-LucianoNeural` - male
- `es-MX-MarinaNeural` - female
- `es-MX-NuriaNeural` - female
- `es-MX-PelayoNeural` - male
- `es-MX-RenataNeural` - female
- `es-MX-YagoNeural` - male

### Spanish (es-AR) voices

- `es-AR-ElenaNeural` - female (Bright, Clear)
- `es-AR-TomasNeural` - male

### Spanish (es-CO) voices

- `es-CO-SalomeNeural` - female
- `es-CO-GonzaloNeural` - male

### Other Spanish regional voices

- `es-CL-CatalinaNeural` - female (Chile)
- `es-CL-LorenzoNeural` - male (Chile)
- `es-PE-AlexNeural` - male (Peru)
- `es-PE-CamilaNeural` - female (Peru)
- `es-US-AlonsoNeural` - male (US Spanish)
- `es-US-PalomaNeural` - female (US Spanish)
- `es-UY-MateoNeural` - male (Uruguay)
- `es-UY-ValentinaNeural` - female (Uruguay)
- `es-VE-PaolaNeural` - female (Venezuela)
- `es-VE-SebastianNeural` - male (Venezuela)
For a complete list of all available Spanish voices, run:
```bash
edge-tts --list-voices | grep "^es-"
```

## Testing

Quick test flow:
- Start the service: `python -m pyjarvis_service` (should listen on 127.0.0.1:8888)
- Start the UI: `python -m pyjarvis_ui` (a window should open and attempt to connect)
- Send text via the CLI: `python -m pyjarvis_cli "Hello, I am Jarvis"`
Expected: service processes text, generates an audio file in ./audio/, broadcasts a VoiceProcessingUpdate, and the UI plays the audio while animating the face.
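A quick programmatic smoke check for the first step (is the service port accepting connections?) could look like this; the helper name is illustrative, and 8888 is the service's default port:

```python
import socket

def service_reachable(host: str = "127.0.0.1", port: int = 8888,
                      timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the service can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Running `service_reachable()` before starting the UI or CLI gives a fast yes/no answer without parsing service logs.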
Testing checklist (manual):
- Service starts without errors
- UI connects to service
- CLI can send text
- Audio is generated and played
- Robot face animates during speech
- Audio files are cleaned up (if configured)
- LLM CLI connects to Ollama (if used)
- Speech recognition works (`/m` in the LLM CLI)
- Language detection works correctly (test with Portuguese, English, and Spanish text)
- Manual language override works (`--language` flag in the CLI)


