paperless-ngx - self-hosted document scanning/management (hexbear.net)

submitted 2 months ago by [email protected] to c/[email protected]

I spun up an instance of paperless-ngx on my Docker host a couple days ago, and just yesterday got my document scanner configured to send things to its Consume folder. So far I'm beyond impressed and I wish I'd learned about it much sooner! I run a FreeNAS server which has collected a lot of important documents in its 10 years of life... all of them arranged in folders as best as I could. Fuck folders, tags are the way.

It was easier than I expected to get the container running and tell it to watch a folder on the FreeNAS share. So I have a decade of pseudo-organized archives to import? Click and drag the folder, and it's done. Amazing.

The automatic tagging seems OK so far. If I'm working on several documents of a similar provenance, it starts suggesting appropriate tags after I manually tag 10 or so. I'll be interested to see how it does as I train it more.

I was never going to pay for a service like this, even though I really needed it. Finding out about paperless has been a revelation for me, haha. And on top of that it's the most "just works" of anything I've tried self-hosting so far. Easy to set up, and it seems feature-rich with a good UI. What's not to love?

Anyone else out there using paperless-ngx and have any tips or tricks to share? Things you wish you knew before?

https://github.com/paperless-ngx/paperless-ngx

[–] [email protected] 4 points 2 months ago* (last edited 2 months ago) (1 children)

I was looking to set this up, but can you manually upload documents? I get why it's there, but I don't like consume folders that delete the content before I've been able to confirm everything is fine. I don't plan to actually scan anything, but want to organize the files I already have.

Edit: it sounds like you can based on what you said about importing?

[–] [email protected] 3 points 2 months ago* (last edited 2 months ago) (1 children)

I'm away from a terminal rn, but there's an "Upload files" button on the Dashboard if memory serves, which presumably lets you send files thru the web interface. I haven't used it yet.
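For scripted uploads instead of the Dashboard button, paperless-ngx also exposes a REST API. What follows is a rough sketch only, assuming documents can be posted to `/api/documents/post_document/` as a multipart form with a `document` field and token authentication; check the API docs for your version. The helper name `build_upload_request`, the host URL, and the token are placeholders for illustration.

```python
import uuid
import urllib.request
from pathlib import Path


def build_upload_request(base_url: str, token: str, file_path: Path) -> urllib.request.Request:
    """Build (but do not send) a multipart POST that uploads one file."""
    boundary = uuid.uuid4().hex
    data = file_path.read_bytes()
    # Multipart body with a single form field named "document".
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="document"; filename="{file_path.name}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/documents/post_document/",
        data=body,
        method="POST",
        headers={
            # Assumed auth scheme: an API token created for your user.
            "Authorization": f"Token {token}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would actually send it. Nothing here touches the Consume folder, so the local file stays where it is.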

Edit: Forgot to say, my original document archive, the Consume folder, and the Media folder all live on the same NAS share, which is mounted on my main PC. So to import all those documents I'm just literally cutting and pasting files from within my file manager. I'm only using the web interface to manage what's been consumed.

I had the same reservations, but the way it seems to work out of the box is, nothing is deleted. Anything it can consume will get turned into a PDF, OCR'd, and indexed. Then the new PDF and the original file get moved to the media folder you specify. The original is always retained and can be downloaded to your client machine at any time alongside the new PDF.
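For anyone who still wants an untouched copy around until the results have been checked, one option is to copy files into the Consume folder instead of cutting and pasting them. A minimal sketch: the function name and both directories are placeholders, the two directories are assumed to be separate (not nested), and deciding which file types are accepted is left to paperless.

```python
import shutil
from pathlib import Path


def copy_to_consume(archive_dir: Path, consume_dir: Path) -> list[Path]:
    """Copy every file under archive_dir into consume_dir, leaving the originals alone."""
    copied = []
    for src in sorted(archive_dir.rglob("*")):
        if not src.is_file():
            continue
        # Flatten the relative path into the file name so same-named files
        # from different folders don't overwrite each other.
        rel = src.relative_to(archive_dir)
        dest = consume_dir / "__".join(rel.parts)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        copied.append(dest)
    return copied
```

Once everything shows up correctly in the web interface, the old folder tree can be deleted by hand.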

I have mine set to apply an inbox tag to anything newly consumed. I also saved a custom view which filters in any document with that tag. I'm using that View as my work queue so to speak. The inbox tag is removed after I give the file a once-over. Makes it easy to see which documents still need my attention.
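The same work queue could be pulled out of the API for scripting. A sketch, assuming the documents endpoint accepts a `tags__id__all` filter and returns paginated JSON with a `results` list (verify both against your version). The tag ID `1` and the sample response are made up for illustration.

```python
from urllib.parse import urlencode


def inbox_query_url(base_url: str, inbox_tag_id: int) -> str:
    """URL listing documents that still carry the inbox tag."""
    query = urlencode({"tags__id__all": inbox_tag_id})
    return f"{base_url.rstrip('/')}/api/documents/?{query}"


def pending_titles(response: dict) -> list[str]:
    """Pull document titles out of one page of a paginated response."""
    return [doc["title"] for doc in response.get("results", [])]


# Made-up response, shaped like one page of a paginated listing.
sample = {"count": 2, "results": [{"id": 12, "title": "Lease"}, {"id": 13, "title": "W-2"}]}
print(inbox_query_url("http://localhost:8000", 1))
print(pending_titles(sample))
```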

[–] [email protected] 2 points 2 months ago (1 children)

Thanks for sharing! That sounds perfect since it keeps the original file or else lets me manually upload and delete myself.

[–] [email protected] 1 points 2 months ago

You're welcome, happy to help.

One other thing I should note: the default behavior is to rename your original files to match the ID number assigned by paperless. I'm not sure if this can be changed... I had a few reservations about this, but I've accepted it. I can let it do its thing, and all it'll cost me is being super careful about db backups.
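Because the renamed originals only map back to their old names through the database, a cheap extra safety net (on top of db backups, not instead of them) is to record the original names and a checksum before importing. A sketch under stated assumptions: the function names and the JSON manifest format are invented for illustration and are not anything paperless-specific. The docs also appear to describe a `PAPERLESS_FILENAME_FORMAT` setting for changing how stored files are named; check that before relying on it.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(archive_dir: Path) -> dict[str, str]:
    """Map each file's SHA-256 digest to its original relative path."""
    manifest = {}
    for path in sorted(archive_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            # Byte-identical duplicates share a digest; the last path wins.
            manifest[digest] = path.relative_to(archive_dir).as_posix()
    return manifest


def save_manifest(archive_dir: Path, out_file: Path) -> None:
    """Write the manifest as JSON so it can be kept with other backups."""
    out_file.write_text(json.dumps(build_manifest(archive_dir), indent=2))
```

With the manifest saved, hashing any original downloaded from paperless later and looking the digest up gives back its old path, even without the database.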