submitted 1 week ago by fizzle@quokk.au to c/fosai@lemmy.world

I keep a lot of notes in markdown files, and I'd like an LLM to assist.

I regularly use Open WebUI with inference routed through Hugging Face. Open WebUI sort of has this functionality: you can upload a markdown file and prompt it to improve it in whatever way, but of course that's a fairly clunky workflow.

I really want something built into the editor that can use RAG to consider other files in context.

I also don't want to be locked into a specific LLM or provider; I'd like to be able to link it to OpenRouter or similar.

top 8 comments
[-] Smorty@lemmy.blahaj.zone 2 points 1 week ago

what volume of markdown files are we talking?

also, just so i understand this right:
you are looking for a markdown editor which has a chat window on the side that can look at other files to assist in writing.
is that correct?

which editor do you use right now for editing the files? does it need to support vim-movements? (if u dont know what that is, it doesnt matter)

what exactly would the LM be assisting in? should it just read files and respond, or edit them itself as well, or suggest edits?

suggestion for under 200 files: depending on the number of files, a simple index and read-tool functionality might be enough. Here is how you would create such an index:

  1. LM looks at each md file
  2. Then generates a one-liner about the content, like:

Static functions and variables in Godot, using autoload scripts and scenes (e.g. loading screens, overlays, save systems, anything permanently loaded)

  3. store that into a file alongside each file's path, perhaps like this:
~/Documents/
  file.md (one-liner here)
  another_file.md (one-liner here)
  topic/
    file_about_topic.md (one-liner here)
[...]

These three steps can be done using any coding agent you've got lying around, using this prompt:

Goal: FILE_INDEX.txt which contains all files in <folder with the documents> in tree form (reduce redundant dir paths) with a concise and keyword-heavy one-liner about each file's content.

FILE_INDEX.txt format:

~/SomeDir/SomeDirBelowIt/
  a_file_in_that_dir.md (description)
  another_file.md (description)
  dir_inside_it/
    file_in_that_dir.md (description)

How: First use `tree` to see the directory's contents, then use subagents to delegate the generation of the descriptions. They should output them into files themselves. An agent can summarize 5 files at most in one go. Tell each agent exactly what name the output file should have. Finally, after having received all descriptions with their file names, combine them all into one final FILE_INDEX.txt.

This index can then be fed into any agent to let it find files quicker, without having to hope for good chunking settings in a RAG pipeline.
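
For illustration, a rough Python sketch of the same indexing idea without a coding agent, assuming an OpenAI-compatible endpoint such as OpenRouter; the model id and paths are placeholders, and it writes a flat list rather than the tree form above:

```python
# Rough sketch of the indexing idea above, done directly in Python instead of
# via a coding agent. Assumes an OpenAI-compatible endpoint (e.g. OpenRouter);
# the model id and the notes directory are placeholders.
import os
from pathlib import Path
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",   # any OpenAI-compatible endpoint
    api_key=os.environ["OPENROUTER_API_KEY"],
)

NOTES_DIR = Path.home() / "Documents"          # folder with the markdown notes
MODEL = "meta-llama/llama-3.1-8b-instruct"     # placeholder model id

def one_liner(text: str) -> str:
    """Ask the model for a concise, keyword-heavy one-line summary."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system",
             "content": "Summarize this note in one concise, keyword-heavy line."},
            {"role": "user", "content": text[:8000]},   # crude truncation
        ],
    )
    return resp.choices[0].message.content.strip()

lines = []
for path in sorted(NOTES_DIR.rglob("*.md")):
    rel = path.relative_to(NOTES_DIR)
    summary = one_liner(path.read_text(encoding="utf-8", errors="ignore"))
    lines.append(f"{rel} ({summary})")

(NOTES_DIR / "FILE_INDEX.txt").write_text("\n".join(lines), encoding="utf-8")
```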


all the advice above was written by a human, even if it might not seem like it.

[-] fizzle@quokk.au 1 points 1 week ago

you are looking for a markdown editor which has a chat window on the side that can look at other files to assist in writing.

Yes, although I'm interested in what other solutions might be around.

I use a variety of editors, whichever suits my mood in a given moment. Sometimes Helix, which is a TUI; sometimes GNOME's text editor; sometimes Zettlr, which is a more "zettelkasten"-type knowledge base manager. I'm flexible. If there was a text editor that had this type of functionality built in, then that's what I'd use for tasks where I wanted this type of assistance. I don't care about vim shortcuts.

I appreciate your proposing a solution. It still feels like there's a yawning chasm in my knowledge. As in, what platform or software can I use to implement tools like you propose? I haven't been able to figure out how to get Open WebUI to provide that sort of functionality, although I suspect that it can.

I think I'm asking: what is the software that connects my selection of markdown files to the LLM? I know you can upload a bunch of text files to Open WebUI, but you can't edit in place. It would be upload, edit, download, upload, edit, download, ad nauseam.

RAG is an outdated mechanism; a full agentic workflow is much better imo. I've written my own custom thing that uses a Matrix account, pi, a vector embedder via local Ollama, and Chroma as the vector store. The agent has custom tools to query the vector store, run bash, etc. My Logseq notes sync to my server via Syncthing, and I have a file watcher that updates the vector store as my notes change. The agent can edit notes like any other file. I then simply have a Matrix client I can use to communicate with the agent. The file watcher looks for "/sydney" (that's my agent's name) and sends a message via Matrix to get the agent to go look at that file/note and make changes as requested via the command. It's kinda openclaw-ish but a lot less context heavy, and it doesn't run forever unless triggered.
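
Not the commenter's actual code (they note later that it isn't published), but a minimal sketch of the file-watcher part of such a setup, assuming the Python `watchdog` library; the notes path is a placeholder and the two hooks are stubs for the embedding pipeline and the Matrix notification:

```python
# Minimal sketch of the file-watcher idea, not the commenter's actual code.
# Assumes the `watchdog` library; reindex_file() and notify_agent() are
# placeholder hooks for the embedding pipeline and the Matrix message.
import time
from pathlib import Path
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

NOTES_DIR = "/home/user/notes"   # e.g. a Logseq dir synced via Syncthing

def reindex_file(path: str, text: str) -> None:
    ...   # placeholder: chunk, embed, and upsert into the vector store

def notify_agent(path: str) -> None:
    ...   # placeholder: post a Matrix message that triggers the agent

class NoteHandler(FileSystemEventHandler):
    def on_modified(self, event):
        if event.is_directory or not event.src_path.endswith(".md"):
            return
        text = Path(event.src_path).read_text(encoding="utf-8", errors="ignore")
        reindex_file(event.src_path, text)   # keep the vector store in sync
        if "/sydney" in text:                # agent trigger keyword
            notify_agent(event.src_path)

observer = Observer()
observer.schedule(NoteHandler(), NOTES_DIR, recursive=True)
observer.start()
try:
    while True:
        time.sleep(1)
finally:
    observer.stop()
    observer.join()
```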

[-] fizzle@quokk.au 2 points 1 week ago

This sounds really cool. I hadn't heard of a vector embedder / vector store before. Definitely need to look into those.

Do you have a big GPU to run local Ollama?

So I do inference over API to OpenRouter; I wish I had the GPU or an Apple machine to run a decent LLM locally. Embedding is very cheap comparatively. I use a CLIP embedding model so I can have images and text in the same vector space.

[-] fizzle@quokk.au 1 points 1 week ago

I'm out of my depth here but trying to piece this together.

If I understand correctly, the first component of this workflow is to use an inference API (like Hugging Face or so) to convert each file from your notes into semantic vectors and store them in ChromaDB, ready to be used in future prompts.

Are you using any software to do that or have you written some code to load the files from disk, call the API, and store the response?

So my notes are just a directory of thousands of MD files. I wrote some code that watches the files in this dir to see when anything changes, and when it does it will do the following:

  1. Splits the file into chunks with some overlap on each side. It does something like 300-token chunks with a 25-token overlap. This is done by loading the model tokeniser via the Hugging Face Python library and using the Hugging Face chunker (this happens locally).
  2. I send each chunk to my local Ollama instance, which converts it to a semantic vector (just another local docker container).
  3. I then delete all semantic vectors in chromadb for that file and create new entries for the updated file.
  4. If /sydney is contained within the file, it sends a message to the Matrix chat as the user saying "read and follow the instructions provided by the /sydney command". The agent manager will then get this message and pass it off to an agent to handle. All this happens locally.
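
A rough sketch of what steps 1-3 could look like, assuming Ollama's local HTTP embeddings endpoint, a Hugging Face tokenizer, and ChromaDB's Python client; the tokenizer and embedding model names are placeholders (the commenter uses a CLIP model), and this is not their published code:

```python
# Rough sketch of steps 1-3: token-based chunking, embedding via a local
# Ollama instance, and replacing the file's vectors in ChromaDB.
# Model names, chunk sizes, and paths are placeholders.
import requests
import chromadb
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
chroma = chromadb.PersistentClient(path="./chroma")
notes = chroma.get_or_create_collection("notes")

CHUNK_TOKENS, OVERLAP = 300, 25

def chunk(text: str) -> list[str]:
    """Split text into ~300-token chunks with 25-token overlap."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    chunks, start = [], 0
    while start < len(ids):
        chunks.append(tokenizer.decode(ids[start:start + CHUNK_TOKENS]))
        start += CHUNK_TOKENS - OVERLAP
    return chunks

def embed(text: str) -> list[float]:
    """Get an embedding vector from the local Ollama server."""
    r = requests.post("http://localhost:11434/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return r.json()["embedding"]

def reindex_file(path: str, text: str) -> None:
    # Step 3: drop the old vectors for this file, then add the new ones.
    notes.delete(where={"source": path})
    pieces = chunk(text)
    notes.add(
        ids=[f"{path}:{i}" for i in range(len(pieces))],
        documents=pieces,
        embeddings=[embed(p) for p in pieces],
        metadatas=[{"source": path, "chunk": i} for i in range(len(pieces))],
    )
```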

My AI agent is a separate component (just another docker container, with the notes dir mounted as a volume) using pi, which uses an LLM via a remote API (OpenRouter). I have a custom tool for that agent where the agent can run a text search that returns the top n most semantically similar chunks of text (along with some metadata, notably the filename and line numbers the chunk came from). The vectors are never seen by the LLM; they exist purely for the search ranking. The agent also has file editing capabilities, so it can then go read or modify that file like any coding agent. The agent also has a tool to send messages via Matrix.
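
A minimal sketch of such a search tool, reusing the `notes` collection and `embed()` helper from the sketch above; the function name and return shape are made up for illustration, not the commenter's actual tool:

```python
# Sketch of the agent's search tool: embed the query, ask ChromaDB for the
# closest chunks, and return the text plus its source metadata.
def search_notes(query: str, n: int = 5) -> list[dict]:
    result = notes.query(query_embeddings=[embed(query)], n_results=n)
    hits = []
    for doc, meta in zip(result["documents"][0], result["metadatas"][0]):
        hits.append({"text": doc, **meta})   # e.g. source file and chunk index
    return hits
```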

I have a service that watches a specific Matrix chat, and if a message is received it does one of two things. Option 1: if an agent is already running, it passes the message into the existing agent as a user message. Option 2: if no agent is running, it starts a new agent instance and passes the message in as the user message. This agent manager service is the same docker image that runs the agent; when the agent finishes running, it takes the final agent output and sends it to the Matrix chat as the agent's Matrix user.
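
A compressed sketch of that listener using the matrix-nio library; the homeserver, credentials, room id, and the `run_agent()` hook are placeholders, not the commenter's actual setup:

```python
# Sketch of the Matrix listener: one callback that either feeds the message to
# a running agent or starts a new one. Uses matrix-nio; credentials, room id,
# and run_agent() are placeholders.
import asyncio
from nio import AsyncClient, RoomMessageText

ROOM_ID = "!notesroom:example.org"
agent = None   # handle to a running agent instance, if any

def run_agent(initial_message: str):
    ...   # placeholder: start a new agent container with this user message

async def on_message(room, event):
    global agent
    if room.room_id != ROOM_ID:
        return
    if agent is not None and agent.running:
        agent.send_user_message(event.body)              # Option 1: existing agent
    else:
        agent = run_agent(initial_message=event.body)    # Option 2: new agent

async def main():
    client = AsyncClient("https://matrix.example.org", "@bot:example.org")
    await client.login("password")
    client.add_event_callback(on_message, RoomMessageText)
    await client.sync_forever(timeout=30000)

asyncio.run(main())
```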

I got an agent to write all this code, so it's probably dodgy as shit with all sorts of security holes, hence I haven't published it on GitHub (security through obscurity etc etc lol).

I also have a SearXNG instance running, accessible to the agent via MCP. And I have a Chrome MCP allowing the agent to do things from inside a virtual Chrome browser.

[-] Bluefruit@lemmy.world 1 points 1 week ago

Koboldcpp has a pretty easy way to add a text DB built right into it. You can use any model you want, and it's pretty customizable.
