this post was submitted on 24 Oct 2023
This is how Postgres stores data, as documents, on the local filesystem:
There are hundreds, even thousands, of documents in a typical Postgres database. And similar for other databases.
But anyway, the other side of the issue is more problematic: converting relational data to, for example, an HTTP response.
Yep... it's pretty easy to write a query on a moderately large database that returns 1kb of data and takes five minutes to execute. You won't have that issue if your 1kb is a simple file on disk. It'll read in a millisecond.
Those are not documents for a definition of document that works with the rest of your comment. If by document you mean "any kind of data structure", then yes, those are documents. But then the term becomes meaningless, as literally anything is a document.
Sure, but then finding that document takes 5 minutes because you need to read a few million files first.
Yep — that is what I mean by documents, and it's what I meant all along. The beauty of documents is how simple and flexible they are. Here's a URL (or path), and here's the contents of that URL. Done.
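For anyone who wants the "path in, contents out" idea spelled out, here is a minimal sketch in Python. It assumes documents are JSON files under a root directory; the `write_document` and `read_document` helpers and the `users/42/profile.json` key are made up for illustration, not taken from any real system.

```python
import json
import tempfile
from pathlib import Path


def write_document(root: Path, key: str, doc: dict) -> Path:
    """Store one document as a JSON file at root/key (key is a relative path)."""
    path = root / key
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(doc), encoding="utf-8")
    return path


def read_document(root: Path, key: str) -> dict:
    """Fetch a document directly by its path; there is no query to plan."""
    return json.loads((root / key).read_text(encoding="utf-8"))


# Hypothetical usage: a throwaway directory stands in for the real storage root.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    write_document(root, "users/42/profile.json", {"name": "Ada", "plan": "free"})
    profile = read_document(root, "users/42/profile.json")
    print(profile["name"])
```

The lookup is cheap only because the key is already known. Finding a document by anything other than its path still needs some kind of index, which is the objection raised above.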
No, because you can't store "literally anything" in a Postgres database. You can only store data that matches the structure of the database. And the structure is also limited, it has to be carefully designed or else it will fall over (e.g. if you put an index on this column, inserts will be too slow, if you don't have an index on that column selects will be too slow, if you join these two tables the server will run out of memory, if you store these columns redundantly to avoid a join the server will run out of disk space...)
Sure - you can absolutely screw up and design a system where you need to read millions of files to find the one you're looking for, but because it's so flexible you should be able to design something efficient easily.
I'm definitely not saying documents should be used for everything. But I am saying they should be used a lot more than they are now. It's so easy to just write up the schema for a few tables and columns, hit migrate, and presto! You've got a data storage system that works well. Often it stops working well a year later when users have spent every day filling it with data.
What I'm advocating is stop, and think, should this be in a relational database or would a document work better? A document is always more work in the short term, you need to carefully design every step of the process... but in the long term it's often less work.
Almost everything I work with these days is a hybrid - with a combination of relational and document storage. And often the data started in the relational database and had to be moved out because we couldn't figure out how to make it performant with large data sets.
Also, I'm leaning more and more into using SQLite, with multiple relational databases instead of just a single database. Often I'm treating that database as a document. And I'm not alone: SQLite is very widely used. Document storage is very widely used. They're popular because they work, and if you are never using them, then I'd suggest you're probably compromising the quality of your software.
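To make the "SQLite database as a document" idea concrete, here is a rough sketch, assuming one SQLite file per project. The `open_project_db` helper, the `notes` table, and the project names are all hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_project_db(root: Path, project_id: str) -> sqlite3.Connection:
    """Open (or create) the SQLite file that holds a single project's data."""
    conn = sqlite3.connect(str(root / f"{project_id}.sqlite"))
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


# Hypothetical usage: each project gets its own small database file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for project_id, body in [("alpha", "first note"), ("beta", "other note")]:
        conn = open_project_db(root, project_id)
        conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
        conn.commit()
        conn.close()

    # Reading project "alpha" never touches the rows of project "beta".
    conn = open_project_db(root, "alpha")
    alpha_notes = [row[0] for row in conn.execute("SELECT body FROM notes")]
    conn.close()
    db_files = sorted(p.name for p in root.iterdir())
    print(alpha_notes, db_files)
```

Each file can be copied, backed up, or deleted on its own, which is what makes it behave like a document. The trade-off is that a query across all projects has to open every file.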
But that content is meaningless, because you just saved an arbitrary data structure. It’s not as if you can do anything with those Postgres files. Or those possibly multi GB MSSQL .mdf, .ldf, .ndf documents. That’s data (a word that’s imo far clearer than document) stored in a very specific way that you need to know the exact structure of to make any sense of. It’s not usable directly in any way. Not "Done."
Yes you can. You can either add space for what you need to store, or you can, again, store e.g. a JSON blob.
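As a rough illustration of the JSON blob option: a table with a couple of fixed columns plus a free-form payload column. This sketch uses SQLite from Python's standard library only so that it runs without a server; the thread is about Postgres, where a `json` or `jsonb` column would play the same role. The `events` table and the helper names are made up.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, payload TEXT NOT NULL)"
)


def add_event(kind: str, payload: dict) -> None:
    """Store the structured part in columns and everything else as a JSON string."""
    conn.execute(
        "INSERT INTO events (kind, payload) VALUES (?, ?)",
        (kind, json.dumps(payload)),
    )


def events_of_kind(kind: str) -> list:
    """Filter on the relational column, then decode the blob in application code."""
    rows = conn.execute(
        "SELECT payload FROM events WHERE kind = ? ORDER BY id", (kind,)
    )
    return [json.loads(row[0]) for row in rows]


# The two payloads have different shapes, and no schema change was needed.
add_event("signup", {"user": "ada", "referrer": "newsletter"})
add_event("signup", {"user": "bob", "tags": ["beta", "trial"]})
add_event("login", {"user": "ada"})

signups = events_of_kind("signup")
print(signups)
```

The relational part (`kind`) stays queryable and indexable, while the blob accepts whatever shape turns up.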
Or don’t, and it will only be as slow as a NoSQL database …
It’s the opposite, a document db is far easier in the short term, that’s why everyone jumped on them before seeing the limitations.
Yeah, a relational DB is harder because you have to have a good design that allows you to do what you actually want to do. And if none of your devs are good at SQL, then a document DB is probably better. And yes, sometimes you need nothing but a document DB. But I still heavily disagree that most of the time you want one.