LLM Assistant for Markdown Documents
(quokk.au)
Welcome to Free Open-Source Artificial Intelligence!
We are a community dedicated to advancing the availability of, and access to:
Free Open Source Artificial Intelligence (F.O.S.A.I.)
RAG is an outdated mechanism; a full agentic workflow is much better imo. I've written my own custom thing that uses a Matrix account, pi, a vector embedder via local Ollama, and Chroma as the vector store. The agent has custom tools to query the vector store, run bash, etc. My Logseq notes sync to my server via Syncthing, and a file watcher updates the vector store as my notes change. The agent can edit notes like any other file. I then simply have a Matrix client I can use to communicate with the agent. The file watcher looks for "/sydney" (that's my agent's name) and sends a message via Matrix to get the agent to go look at that file/note and make the changes requested in the command. It's kinda OpenClaw-ish, but a lot less context heavy, and it doesn't run forever unless triggered.
This sounds really cool. I hadn't heard of a vector embedder / vector store before. Definitely need to look into those.
Do you have a big GPU to run Ollama locally?
So I do inference over API via OpenRouter. I wish I had the GPU, or an Apple machine, to run a decent LLM locally. Embedding is very cheap comparatively; I use a CLIP embedding model so I can have images and text in the same vector space.
I'm out of my depth here but trying to piece this together.
If I understand correctly, the first component of this workflow is to use an inference API (like Hugging Face or so) to convert each file from your notes into semantic vectors and store them in ChromaDB, ready to be used in future prompts.
Are you using any software to do that or have you written some code to load the files from disk, call the API, and store the response?
So my notes are just a directory of thousands of MD files. I wrote some code that watches the files in this dir to see when anything changes, and when it does it re-embeds the changed file and updates the vector store.
My AI agent is a separate component (just another Docker container, with the notes dir mounted as a volume) using pi, which uses an LLM via a remote API (OpenRouter). I have a custom tool for that agent where it can run a text search that returns the top-n most semantically similar chunks of text (along with some metadata, notably the filename and line numbers the chunk came from). The vectors are never seen by the LLM; they exist purely for the search ranking. The agent also has file-editing capabilities, so it can then go read or modify that file like any coding agent. The agent also has a tool to send messages via Matrix.
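The search tool is, in spirit, just a cosine-similarity ranking over stored chunk vectors plus metadata. A toy sketch (the real version queries Chroma, and the vectors would come from an embedding model rather than being hand-written):

```python
import math
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    filename: str    # metadata the agent uses to go open the file
    start_line: int
    vector: list[float]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either is all-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_n(query_vec: list[float], store: list[Chunk], n: int = 3) -> list[Chunk]:
    """Rank stored chunks by similarity to the query and return the best n."""
    return sorted(store, key=lambda c: cosine(query_vec, c.vector), reverse=True)[:n]
```

Only the `text`, `filename`, and `start_line` of the winners go back to the LLM; the vectors stay on the search side, exactly as described above.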
I have a service that watches a specific Matrix chat, and if a message is received it does one of two things. Option 1: if an agent is already running, it passes the message into the existing agent as a user message. Option 2: if no agent is running, it starts a new agent instance and passes the message in as the user message. This agent manager service runs from the same Docker image as the agent, and when the agent finishes running, it takes the final agent output and sends it to the Matrix chat as the agent's Matrix user.
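That two-option dispatch fits in a tiny manager class. A sketch only: the Matrix I/O and the real agent process are replaced by plain callables here, and the names are made up.

```python
from typing import Callable

class AgentManager:
    """Route incoming chat messages to a running agent, or start one.

    Option 1: an agent is running -> feed the message in as a user turn.
    Option 2: no agent is running -> spawn one with the message as the prompt.
    """

    def __init__(self, start_agent: Callable[[str], None],
                 send_to_agent: Callable[[str], None]):
        self._start = start_agent    # stub for launching a new agent instance
        self._send = send_to_agent   # stub for injecting a user message
        self.running = False

    def on_message(self, text: str) -> str:
        if self.running:
            self._send(text)         # Option 1: existing agent gets the message
            return "forwarded"
        self.running = True
        self._start(text)            # Option 2: fresh agent, message as prompt
        return "started"

    def on_agent_finished(self) -> None:
        """Called when the agent exits; its final output would be posted to Matrix here."""
        self.running = False         # the next message spawns a fresh agent
```

Because the manager and the agent share one Docker image, "start a new agent instance" can be as simple as forking the same entrypoint with a different argument.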
I got an agent to write all this code, so it's probably dodgy as shit with all sorts of security holes, hence I haven't published it on GitHub (security through obscurity etc etc lol).
I also have a SearXNG instance running, accessible to the agent via MCP. And I have a Chrome MCP allowing the agent to do things from inside a virtual Chrome browser.