this post was submitted on 07 Apr 2024

cross-posted from: https://discuss.tchncs.de/post/13814482

I just noticed that eza can now display total disk space used by directories!

I think this is pretty cool. I wanted it for a long time.

There are other ways to get the information, of course. But having it integrated with all the other options for listing directories is fab. eza has features like --git-awareness, --tree display, clickable --hyperlink, filetype --icons and other display options, permissions, dates, ownerships, and other stuff. Being able to mash everything together in any arbitrary way that is useful is handy. And of course you can --sort=size

docs:

  --total-size               show the size of a directory as the size of all
                             files and directories inside (unix only)

It also (optionally) color codes the information. Values measured in KB, MB, and GB are easy to tell apart. Here is a screenshot to show that:

eza --long -h --total-size --sort=oldest --no-permissions --no-user

Of course it takes a little while to load large directories, so you will not want to use it by default.

Looks like it was first implemented Oct 2023 with some fixes since then. (Changelog). PR #533 - feat: added recursive directory parser with `--total-size` flag by Xemptuous

[–] [email protected] 8 points 7 months ago (2 children)

I just tested this, and the sizes reported by eza -l --total-size are wrong for me. I compared them to du --human-readable --apparent-size --all --max-depth 1 and to the properties dialog in my Dolphin file manager. Some are way off. For example, du and Dolphin report "149M" for a certain projects folder of mine, while eza reports "184M".

[–] [email protected] 11 points 7 months ago (2 children)

This looks like one is using the SI 1000-based units instead of the binary 1024-based ones. I'm pretty sure du has a --si option.

The B (for bytes) is omitted, so each value is ambiguous as to whether it's MiB (mebibytes, binary) or MB (megabytes, SI).

I may be wrong on the technical details, but you get the gist.

[–] [email protected] 15 points 7 months ago

The difference is too large for that. 184 MB is about 175 MiB, not 149.
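The arithmetic behind this objection can be checked with a minimal sketch. The helper names `mb_to_mib` and `mib_to_mb` are made up for illustration; the only inputs are the two figures quoted in the thread.

```python
MB = 1000 ** 2   # SI megabyte, in bytes
MIB = 1024 ** 2  # binary mebibyte, in bytes


def mb_to_mib(mb: float) -> float:
    """Convert SI megabytes to binary mebibytes."""
    return mb * MB / MIB


def mib_to_mb(mib: float) -> float:
    """Convert binary mebibytes to SI megabytes."""
    return mib * MIB / MB


# If eza's "184M" were SI megabytes, the binary figure would be:
print(f"184 MB  = {mb_to_mib(184):.1f} MiB")  # 175.5
# If du's "149M" were mebibytes, the SI figure would be:
print(f"149 MiB = {mib_to_mb(149):.1f} MB")   # 156.2
```

Either way the unit base accounts for a difference of about 5%, while 149 vs. 184 is a gap of more than 20%.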

[–] [email protected] 8 points 7 months ago (1 children)

No, the difference is way too high to be explained like this. There is no way that a 1024 vs. 1000 base could explain an increase of approx. "35M" for a "149M" directory. Other folders are much closer, like "20K" and "20K", or "44M" vs. "45M". Also, as I said, the Dolphin file manager reports the same output as du. I even tested du with the --si option, which uses powers of 1000 instead of 1024 (I'm pretty sure eza does it correctly with 1024, so this option is not necessary for the comparison anyway).

[–] [email protected] 6 points 7 months ago (1 children)

No, @[email protected] is correct.

I just did a test using dd - I created 100 files of exactly 1 MiB each (1048576 bytes). du reported the size as "100M" as expected, whereas eza reported it as "105M" - which is what you'd get if you divided 104857600 by 1000000 (= 104.8576 or 105M if you round it off).
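For anyone who wants to repeat that test without dd, here is a rough Python equivalent. It is only a sketch: it creates the same 100 files of 1 MiB in a temporary directory and formats the total in both unit systems. It does not call eza or du, and the helper names are made up for this example.

```python
import os
import tempfile


def make_files(directory: str, count: int, size: int) -> None:
    """Create `count` files of exactly `size` zero bytes each."""
    for i in range(count):
        with open(os.path.join(directory, f"file{i:03d}.bin"), "wb") as fh:
            fh.write(b"\0" * size)


def total_bytes(directory: str) -> int:
    """Sum the apparent sizes of the regular files directly inside `directory`."""
    return sum(e.stat().st_size for e in os.scandir(directory) if e.is_file())


with tempfile.TemporaryDirectory() as tmp:
    make_files(tmp, count=100, size=1024 * 1024)
    total = total_bytes(tmp)

print(total)                               # 104857600
print(f"{total / 1024 ** 2:.0f}M binary")  # 100M binary
print(f"{total / 1000 ** 2:.0f}M SI")      # 105M SI
```

The same byte count prints as "100M" or "105M" depending only on the divisor, which matches the two outputs described above.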

[–] [email protected] 6 points 7 months ago (2 children)

He is wrong. As I have explained multiple times, this is not the issue here. Install eza and compare it to du and possibly some other application that reports the directory size. The difference in file size cannot be explained by a 1000 vs. 1024 base. Do the math if you don't believe me.

eza is reporting a wrong directory size for me, unless there is an explanation.

[Desktop]$ du --human-readable --apparent-size --all --max-depth 1 ./trampoline
518     ./trampoline/src
148M    ./trampoline/target
1,1M    ./trampoline/doc
8       ./trampoline/.gitignore
26K     ./trampoline/.git
330     ./trampoline/Cargo.toml
2,1K    ./trampoline/Cargo.lock
149M    ./trampoline
[Desktop]$ du --human-readable --apparent-size --all --max-depth 1 --si ./trampoline
518     ./trampoline/src
155M    ./trampoline/target
1,2M    ./trampoline/doc
8       ./trampoline/.gitignore
27k     ./trampoline/.git
330     ./trampoline/Cargo.toml
2,2k    ./trampoline/Cargo.lock
157M    ./trampoline
[Desktop]$ eza -l --total-size --no-permissions --no-user ./trampoline
2,1k 25 Feb 21:36 Cargo.lock
330  4 Mär 09:21 Cargo.toml
1,1M  5 Apr 12:34 doc
518  5 Apr 12:49 src
183M  4 Apr 20:26 target

And for reference, Dolphin, the file manager of KDE Plasma, reports 149,1 MiB (156.366.443), which aligns with du without the --si option. Even the single folder "target" is at 183M with eza (it is the biggest folder in that directory anyway).

[–] [email protected] 9 points 7 months ago* (last edited 7 months ago)

I was talking about the 1000 vs 1024 issue, do the dd test yourself and it's easy to verify that he was right.

As for the specific discrepancy that you're seeing, lots of things can throw off a file size calculation: symlinks, sparse files, reflinks, compression, etc. Since you're the only one with access to your files, you'll need to investigate and come to a conclusion yourself (and file a bug report if necessary).

[–] [email protected] 5 points 7 months ago

Could it be this AND block size vs actual used size?
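One way to rule this in or out is to compare a file's apparent size (`st_size`) with the space actually allocated for it (`st_blocks`, counted in 512-byte units on Linux). The sketch below is Unix-only, the helper name is made up, and the allocated numbers depend on the filesystem, so no expected output is shown for them.

```python
import os
import tempfile


def apparent_and_allocated(path: str) -> tuple[int, int]:
    """Return (apparent size, allocated size) in bytes for one file."""
    st = os.stat(path)
    # st_blocks counts 512-byte units, independent of the filesystem block size.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    # A tiny file: usually occupies a whole block even though it holds 10 bytes.
    small = os.path.join(tmp, "small.txt")
    with open(small, "wb") as fh:
        fh.write(b"x" * 10)

    # A sparse file: 10 MiB apparent size, but no data was ever written.
    sparse = os.path.join(tmp, "sparse.bin")
    with open(sparse, "wb") as fh:
        fh.truncate(10 * 1024 * 1024)

    small_sizes = apparent_and_allocated(small)
    sparse_sizes = apparent_and_allocated(sparse)

print("small  (apparent, allocated):", small_sizes)
print("sparse (apparent, allocated):", sparse_sizes)
```

Note that du --apparent-size, as used earlier in the thread, reports the first number of each pair, so a block-size effect would show up as a difference between du with and without that flag.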

[–] [email protected] 2 points 7 months ago* (last edited 7 months ago) (1 children)

Hmm, I didn't think to actually test the results. But now that I do, I get the same sort of discrepancy.

How about this?

eza --long -h --total-size --sort=size --no-permissions --no-user --no-time -a --blocksize --binary

That works in a couple of test directories with the Blocksize column.

Also, it might (??) be ignoring files according to your .gitignore, if that is relevant? Or behaving differently wrt symlinks?

Seems like the default behavior should be whatever is most expected, standard and obvious. Or else give the user a hint.

I found this in the repo, is it relevant?: bug: Inconsistent Size Display in `exa` Command for Large Files (1024 vs. 1000 Conversion) · Issue #519.

Don't forget eza --version. I find it is not updated quickly in every distro. See the changelog; it looks like there might have been a relevant update as recently as [0.18.6] - 2024-03-06. Actually, my system is only updated to 0.17.3, now that I check this too.

[–] [email protected] 2 points 7 months ago (1 children)

With the --binary option I get a size of 174Mi in eza. Experimenting with some other options didn't help. If something is ignored (maybe via .gitignore), then it would be that du AND the Dolphin file manager ignore those files, and eza does not. I find that hard to believe. I also deleted the .gitignore file and the .git folder to see if it makes any difference, and no, it did not.

The only thing I can think of is maybe something going on with link files, but no idea how or what to test for here.

[–] [email protected] 2 points 7 months ago (1 children)

Well, I guess a way to test would be to create a new directory and copy or create some files in it, rather than using a working directory where there are unknown complexities. IIRC dd can create files according to parameters.

Start with a single file in a normal location, see how to get it to output the correct info, and then complicate things until you find out where it breaks.

That's what I would do, but maybe a dev would have a more sophisticated method. Might be worthwhile to read the PR where the feature was introduced.

Also kind of a shot in the dark but do you have an ext4 filesystem? I have been dabbling with btrfs lately and it leads to some strange behaviors. Like some problems with rsync. Ideally this tool would be working properly for all use cases but it's new so perhaps the testing would be helpful. I also noticed that this feature is unix only. I didn't read about why.

"it would be that du AND Dolphin filemanager would ignore those files, and eza would not. Which its hard to believe for me."

Although it is only one of various potential causes, I don't think it is implausible on its face. du probably doesn't know about git at all, right? If nautilus has a VCS extension installed, I doubt it would specifically ignore files for the purposes of calculating file size.

I have found a lot of these rust alternatives ignore .git and other files a little too aggressively for my taste. Both fd (find), and ag (grep) require 1-2 arguments to include dotfiles, git-ignored and other files. There are other defaults that I suppose make lots of sense in certain contexts. Often I can't find something I know is there and eventually it turns out it's being ignored somehow.

[–] [email protected] 5 points 7 months ago* (last edited 7 months ago) (1 children)

About the gitignore behavior of Rust tools: it's the opposite in my results, in that eza reports the bigger size. And the fact that the independent program Dolphin aligns with the output of the standard du tool (for which I don't have a config file, I think) suggests that theirs is the more correct output.

Ok so I found it: Hardlinks

$ \ls -l
total 9404
-rwxr-xr-x 2 tuncay tuncay 4810688  5. Apr 10:47 build-script-main
-rwxr-xr-x 2 tuncay tuncay 4810688  5. Apr 10:47 build_script_main-947fc87152b779c9
-rw-r--r-- 1 tuncay tuncay    2298  5. Apr 10:47 build_script_main-947fc87152b779c9.d

$ md5sum *
6ce0dea7ef5570667460c6ecb47fb598  build-script-main
6ce0dea7ef5570667460c6ecb47fb598  build_script_main-947fc87152b779c9
68e78f30049466b4ca8fe1f4431dbe64  build_script_main-947fc87152b779c9.d

I went down into the directories and compared some outputs until I could narrow it down. Look at the number 2 in the second column, which means those files are hardlinks. Their md5 checksums are identical. So it's what I was thinking all along: some kind of linking weirdness (which in itself is not weird at all). eza is not aware of hardlinks and counts them as individual files, which is simply wrong from the perspective of how much space those files occupy. The file exists once on the disk and takes up space only once.
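A minimal sketch of how a size calculation can avoid this double counting: remember each (device, inode) pair that has already been seen and skip later names pointing at the same inode. This is not how eza or du is actually implemented; `dir_size` and its `count_hard_links` parameter are hypothetical names chosen for this example, and the function only sums regular file sizes (it ignores the size of directory entries themselves, which du includes).

```python
import os
import tempfile


def dir_size(root: str, count_hard_links: bool = False) -> int:
    """Sum apparent file sizes under `root`, counting each inode once by default."""
    seen: set[tuple[int, int]] = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not count_hard_links and st.st_nlink > 1:
                key = (st.st_dev, st.st_ino)  # identifies the file, not the name
                if key in seen:
                    continue  # another name for data already counted
                seen.add(key)
            total += st.st_size
    return total


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "build-script-main")
    with open(original, "wb") as fh:
        fh.write(b"\0" * 4096)
    # Second name for the same inode, like the link count of 2 in the ls output.
    os.link(original, os.path.join(tmp, "build_script_main-hardlink"))

    deduplicated = dir_size(tmp)
    naive = dir_size(tmp, count_hard_links=True)

print(deduplicated, naive)  # 4096 8192
```

The naive total is exactly double, which is the same pattern as the inflated "target" size above.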

EDIT: BTW sorry that my replies turned your news post into a troubleshooting post. :-(

[–] [email protected] 2 points 7 months ago (1 children)

For my part, I think all this troublefinding and troublesolving is a great use of a thread. :D Especially if it gets turned into a bug report and eventually a PR. I had a quick look in the repo and I don't see anything relevant, but it could be hidden where I can't see it. Since you've already gone and found the problem, it would be a shame to leave it here where it'll never be found or seen. I hope you will send it to them.

I also reproduced the bug by moving an ISO file into a directory and then hardlinking it in the same dir. Each file is counted individually and the dir is 2x the size it should be! I can't find any way to fix it.

The best I can come up with is to show the links but it only works when you look at the linked file itself:

$ eza --long -h --total-size --sort=oldest --no-permissions --no-user --no-time --tree --links LinuxISOs
Links Size Name
    1 3.1G LinuxISOs
    2 1.5G ├── linux.iso
    2 1.5G └── morelinux.iso

If you look further up the filetree you could never guess. (I will say again that my distro is not up to date with the latest release and it is possible this is already fixed.)

This should be an option. In dua-cli, another one of the Rust terminal tools I love, you can choose:

$ dua  LinuxISOs
      0   B morelinux.iso
   1.43 GiB linux.iso
   1.43 GiB total

$ dua --count-hard-links LinuxISOs
   1.43 GiB linux.iso
   1.43 GiB morelinux.iso
   2.86 GiB total
[–] [email protected] 4 points 7 months ago (1 children)

BTW I actually did a bug report. :-) -> https://github.com/eza-community/eza/issues/923

So nothing wasted. Without your post I would not have been curious enough to test this, and who knows, maybe it gets fixed or gets an option to handle it.

[–] [email protected] 2 points 7 months ago (1 children)

Nice! I'm sure they will appreciate your thorough report.

I wonder if they also plan to make an option about crossing filesystem boundaries. I have seen it commonly in this sort of use case.

Maybe all this complexity is the reason why total dir size has not previously been integrated into this kind of tool. (Notable exception: lsd, if you are interested.) I really hope the development persists though, because being able to easily manipulate so many different kinds of information about the filesystem without spending hours/days/weeks/years creating bespoke shell scripts is super handy.

[–] [email protected] 1 points 7 months ago

I used lsd before switching to exa. BTW I was the one who suggested integrating the hyperlink option into lsd. :-) Not saying it wouldn't have been added otherwise, but at least it sped up the process back then.^^

On the topic of filesystem boundaries, this is something I always have in mind. Hardlinks, for example, cannot span two different drives, by their nature. It's an option I often use with my own du script: -x. It helps to keep things in check and simplifies some things, as I have a complex setup of multiple drives that are mounted into my home from various places, plus the symbolic links from Steam and so on. Avoiding other filesystems is part of my "workflow", if you will, and I would like to see such an option as well.
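For illustration, here is roughly what staying on one filesystem means in code: compare each entry's device number (`st_dev`) with that of the starting directory and skip anything that differs. This is a sketch of the idea behind du -x, not of any existing or planned eza option, and `dir_size_one_fs` is a made-up name.

```python
import os
import tempfile


def dir_size_one_fs(root: str) -> int:
    """Sum apparent file sizes under `root` without crossing onto other filesystems."""
    root_dev = os.lstat(root).st_dev
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune mount points: editing dirnames in place stops os.walk from
        # descending into directories that live on a different device.
        dirnames[:] = [
            d for d in dirnames
            if os.lstat(os.path.join(dirpath, d)).st_dev == root_dev
        ]
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if st.st_dev == root_dev:
                total += st.st_size
    return total


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "a.bin"), "wb") as fh:
        fh.write(b"\0" * 100)
    with open(os.path.join(tmp, "sub", "b.bin"), "wb") as fh:
        fh.write(b"\0" * 200)
    one_fs_total = dir_size_one_fs(tmp)

print(one_fs_total)  # 300, since everything here is on one filesystem
```

os.walk does not follow symbolic links to directories by default, so links like the Steam ones mentioned above are not descended into either.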

I just noticed exa has an option -H or --links to list each file's number of hard links. So at least it is aware of hardlinks and could calculate the total size correctly.