cross-posted from: https://lemmit.online/post/225981
This is an automated archive made by the Lemmit Bot.
The original was posted on /r/homeassistant by /u/janostrowka on 2023-07-19 12:49:02.
Hopefully this will come in handy for our Year of the Voice.
TL;DR: Justin Alvey replaces Google Nest Mini PCB with ESP32 custom PCB which he’s open-sourcing. Shows demo of running LLM voice assistant paired with Beeper to send and receive messages.
Tweet text thread (I would also highly recommend checking out the video demos on Twitter):
I “jailbroke” a Google Nest Mini so that you can run your own LLM’s, agents and voice models. Here’s a demo using it to manage all my messages (with help from @onbeeper) 📷 on, and wait for surprise guest! I thought hard about how to best tackle this and why
After looking into jailbreaking options, I opted to completely replace the PCB. This let’s you use a cheap ($2) but powerful & developer friendly WiFi chip with a highly capable audio framework. This allows a paradigm of multiple cheap edge devices for audio & voice detection…
& offloading large models to a more powerful local device (whether your M2 Mac, PC server w/ GPU or even "tinybox"!) In most cases this device is already trusted with your credentials and data so you don’t have to hand these off to some cloud & data need never leave your home
The custom PCB uses @EspressifSystem's ESP32-S3 I went through 2 revisions from a module to a SoC package with extra flash, simplifying to single-sided SMT (< $10 BOM) All features such as LED’s, capacitive touch, mute switch are working, & even programmable from Arduino (/IDF)
For this demo I used a custom “Maubot” with my @onbeeper credentials (a messaging app which securely bridges your messaging clients using the Matrix protocol & e2e encryption) which runs locally serving an API
I’m then using GPT3.5 (for speed) with function calling to query this
Fro the prompt I added details such as family & friends, current date, notification preferences & a list additional character voices that GPT can respond in. The response is then parsed and sent to @elevenlabsio
I've been experimenting with multiple of these, announcing important messages as they come in, morning briefings, noting down ideas and memos, and browsing agents. I couldn’t resist - here's a playful (unscripted!) video of two talking to each other prompted to be AI’s from "Her
I’m working on open sourcing the PCB design, build instructions, firmware, bot & server code - expect something in the next week or so. If you don't want to source Nest Mini's (or shells from AliExpress) it's still a great dev platform for developing an assistant! Stay tuned!