AI & Software Integration

Memory Mirror

Multimodal Voice Assistant Pipeline

Executive Summary

Developed a fully local, multimodal voice assistant during an intensive university sprint project in early 2026. The system avoids cloud dependencies by connecting on-device Speech-to-Text (STT) directly to locally hosted Large Language Models (LLMs), creating a natural, interactive dialogue interface.

Pipeline Architecture

  • Programmed the core logic and API bridges using Python.
  • Integrated open-source LLMs for on-device natural language processing, targeting low-latency response generation.
  • Engineered a custom Speech-to-Text pipeline to accurately capture and transcribe voice commands in real time.
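
The flow described in the list above (audio in, transcript, LLM reply) could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the class name VoicePipeline, the method handle_utterance, and the stand-in functions fake_stt and fake_llm are all hypothetical, and a real build would pass in whichever local STT and LLM backends it uses.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class VoicePipeline:
    """Glue between a speech-to-text step and a local LLM step.

    Both steps are injected as plain callables, so this sketch makes no
    assumption about which STT or LLM library is actually used.
    """
    transcribe: Callable[[bytes], str]   # raw audio -> transcript
    generate: Callable[[str], str]       # transcript -> assistant reply
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle_utterance(self, audio: bytes) -> str:
        """Run one captured utterance through STT and then the LLM."""
        text = self.transcribe(audio).strip()
        if not text:
            # Nothing intelligible was captured; skip the costly LLM call.
            return ""
        reply = self.generate(text)
        self.history.append((text, reply))
        return reply


# Hypothetical stand-ins so the sketch runs without any model installed.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8", errors="ignore")


def fake_llm(prompt: str) -> str:
    return f"You said: {prompt}"


pipeline = VoicePipeline(transcribe=fake_stt, generate=fake_llm)
```

Injecting the two stages as callables keeps the bridge logic testable without a microphone or a model, and lets either backend be swapped out independently.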

Hardware Integration

The physical build required synchronizing multi-channel microphone arrays with visual processing units. A major technical hurdle was managing the processing load so that the hardware did not thermally throttle while running the LLM locally alongside the audio capture algorithms.
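
The write-up does not say how the processing load was managed. One possible approach, sketched here purely as an illustration, is a small governor that pauses LLM inference when the device runs hot and resumes only after it has cooled, with a gap between the two thresholds to avoid rapid on/off toggling. The ThermalGovernor class, the read_temp_c callback, and the 80 °C / 70 °C thresholds are all assumptions, not values from the project.

```python
from typing import Callable


class ThermalGovernor:
    """Gate heavy work on a temperature reading, with hysteresis.

    read_temp_c is a hypothetical callback returning the current device
    temperature in degrees Celsius; how it is obtained is hardware-specific.
    """

    def __init__(
        self,
        read_temp_c: Callable[[], float],
        pause_above_c: float = 80.0,   # assumed threshold, tune per device
        resume_below_c: float = 70.0,  # assumed threshold, tune per device
    ) -> None:
        if resume_below_c >= pause_above_c:
            raise ValueError("resume_below_c must be lower than pause_above_c")
        self.read_temp_c = read_temp_c
        self.pause_above_c = pause_above_c
        self.resume_below_c = resume_below_c
        self.paused = False

    def allow_inference(self) -> bool:
        """Return True if it is currently safe to start an LLM request."""
        temp = self.read_temp_c()
        if self.paused and temp <= self.resume_below_c:
            self.paused = False   # cooled down enough to resume
        elif not self.paused and temp >= self.pause_above_c:
            self.paused = True    # too hot, hold off on new requests
        return not self.paused
```

In this sketch audio capture would keep running while inference is paused, so spoken commands are queued rather than lost; that queuing behaviour is likewise an assumption.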

Figures: Hardware Setup; Python Terminal Logs; Microphone Array Integration; Software Pipeline Diagram