Who is using my file? (thelemmy.club)
[-] oats@piefed.zip 124 points 3 days ago

Well actshly, rm removes the inode, not the file. If it's still in use it'll stay on the disk until the last fd is closed.

  • with most file systems that are usual on linux
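
A minimal sketch of that behaviour, assuming a Linux shell and a scratch directory from mktemp (the file name and descriptor number 3 are arbitrary choices for the demo): the name disappears on rm, but the data stays readable through the descriptor that is still open.

```shell
dir=$(mktemp -d)
echo foo > "$dir/bar"
exec 3< "$dir/bar"        # hold the file open on descriptor 3
rm "$dir/bar"             # removes the name; the data stays while fd 3 is open
if [ -e "$dir/bar" ]; then name_gone=no; else name_gone=yes; fi
content=$(cat <&3)        # still readable through the open descriptor
exec 3<&-                 # closing the last descriptor lets the kernel free the blocks
rmdir "$dir"
echo "name gone: $name_gone, content via fd: $content"
```

Only after the `exec 3<&-` line does the space actually become free again.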
[-] sol6_vi@lemmy.makearmy.io 3 points 1 day ago

I keep seeing things about inodes lately. Can you ELI5 for a novice? Where are the inodes? Are they in the room with us now? 👀

[-] oats@piefed.zip 4 points 1 day ago

Extremely simplified:

Your file system consists of a whole lot of blocks to write data to. Let's say you have a block size of 512kB, so a 4MB file would span 8 blocks. A 3.7MB file would span 8 blocks, too, as the remaining space can't get used otherwise.
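
The rounding in that example is just a ceiling division. A quick sketch, assuming 1 MB = 1024 kB and using a made-up helper name (blocks_for):

```shell
# Ceiling division: blocks needed = (size + block_size - 1) / block_size
block_kb=512                       # block size from the example, in kB
blocks_for() { echo $(( ($1 + block_kb - 1) / block_kb )); }

full=$(blocks_for 4096)            # 4 MB file
partial=$(blocks_for 3789)         # roughly 3.7 MB
echo "4 MB -> $full blocks, 3.7 MB -> $partial blocks"
```

Both files end up occupying 8 blocks; the 3.7 MB one simply leaves part of its last block unused.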

Now to get what file exists on which blocks, there's a large index table, consisting of a number of index nodes (shortened to inode). Each inode saves multiple data fields of a file, like its owner, permissions, timestamps, and the file's blocks. The file's name is not in the inode; it lives in a directory entry that points to the inode.

If you link a file to a second name (hard link), a second directory entry gets created that points to the same inode.

That's about it. It used to be important to choose the right inode size and count at filesystem creation for the average data you'd save on the filesystem, as inodes have a fixed count, and the index table takes disk space, too. Too many inodes and you waste space that you could use for precious data; too few inodes and you can't save new files even when you have free data blocks. With growing disk sizes people just went with massive indexes. Who cares about a few wasted megs?
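
If you want to see how close a filesystem is to running out of inodes, df can report inode counts instead of blocks. A small sketch, assuming a df that supports -i (GNU coreutils does); filesystems that allocate inodes dynamically may report zeros or dashes here:

```shell
# Inode totals for the filesystem holding the current directory.
# Columns in GNU df: Inodes, IUsed, IFree, IUse%.
inode_report=$(df -i .)
echo "$inode_report"
```

An IUse% near 100 with plenty of free space left is the "can't save new files" situation described above.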

Modern filesystems (like ext3 and up) introduced journals, which complicate things.

[-] sol6_vi@lemmy.makearmy.io 2 points 1 day ago

Thank you so much for taking the time to explain it. Actually makes sense.

[-] DaGammla@lemmy.ml 89 points 3 days ago

Yep, it's a smart system.
For the user, the file is immediately no longer in the tree, so for them it's considered done.
The OS should handle all the hassle, not the user.

[-] mogoh@lemmy.ml 61 points 3 days ago

I think "consider it done" puts it well.

[-] oats@piefed.zip 12 points 2 days ago

"I'll get to it, eventually" would ruin the meme but be more fitting, in my opinion.

Had multiple occasions where people fought against filling disks and just couldn't see why. Well, that 10 gig log file you deleted two weeks ago? It's 20 gig now, and still being written to.

lsof shows stuff like that.
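
As far as I know, lsof +L1 is the option that narrows the output to open files that have been unlinked. If lsof isn't installed, a rough equivalent is to walk /proc yourself. A sketch, assuming Linux with /proc mounted; the function name list_deleted_open and the demo file big.log are made up for illustration, and without root you only see your own processes:

```shell
# Print every open descriptor whose target the kernel marks as " (deleted)".
list_deleted_open() {
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}

# Demo: hold a file open, delete it, then look for it.
dir=$(mktemp -d)
echo data > "$dir/big.log"
exec 4< "$dir/big.log"
rm "$dir/big.log"
found=$(list_deleted_open | grep -c "big.log (deleted)" || true)
exec 4<&-
rmdir "$dir"
echo "descriptors still pointing at the deleted file: $found"
```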

[-] Redjard@reddthat.com 7 points 2 days ago* (last edited 2 days ago)

Nothing new can open it immediately.
So it's effectively deleted with old references slowly phasing out.

Zombie files are an issue though. A while back I had a huge zombie file on a tmpfs which was filling all my ram. So I built a tool to track it down and traced it to a konsole instance with a killed tab that previously had billions of lines of stdout history.

https://github.com/redjard/zombie-file-list

zombie-file-list

Lists Linux files that are still opened in a process but were deleted. These "zombie files" use up space and inodes but are hard to find.

I wrote this because my /tmp tmpfs was taking up 32GB of ram despite the files inside summing to only 3MB.

Usage:

zombie-file-list <path to filesystem>

Note:

  • This command is designed to be run on filesystem roots, not paths in general.
  • Sizes are apparent sizes, e.g. on ext4 the actual sizes are rounded up to the next 4KiB
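
A quick way to spot this kind of mismatch before reaching for a tool (this is not how zombie-file-list works internally, just a sketch): compare what df says is used with what du can still see. It assumes /tmp is its own mount; otherwise df reports the containing filesystem and the gap means nothing.

```shell
mount_point=/tmp   # assumption: a filesystem root, as the note above asks for

# Space the filesystem reports as used (POSIX output format, kB).
used_kb=$(df -Pk "$mount_point" | awk 'NR==2 {print $3}')
# Space reachable through directory entries, staying on this filesystem.
visible_kb=$(du -skx "$mount_point" 2>/dev/null | awk '{print $1}')

used_kb=${used_kb:-0}
visible_kb=${visible_kb:-0}
echo "df: $used_kb kB, du: $visible_kb kB, gap: $((used_kb - visible_kb)) kB"
```

A large positive gap hints at deleted files that some process still holds open, though unreadable directories also inflate it when you are not root.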
[-] tal@lemmy.today 2 points 2 days ago* (last edited 2 days ago)

I wrote this because my /tmp tmpfs was taking up 32GB of ram despite the files inside summing to only 3MB.

Note that tmpfs doesn't force its contents to remain in memory; the kernel can move stuff there to swap space if it needs to do so.

Ramfs is the filesystem that keeps things locked in memory:

https://www.kernel.org/doc/Documentation/filesystems/ramfs-rootfs-initramfs.txt

With ramfs, there is no backing store.  Files written into ramfs allocate
dentries and page cache as usual, but there's nowhere to write them to.
This means the pages are never marked clean, so they can't be freed by the
VM when it's looking to recycle memory.

ramfs and tmpfs:
----------------

One downside of ramfs is you can keep writing data into it until you fill
up all memory, and the VM can't free it because the VM thinks that files
should get written to backing store (rather than swap space), but ramfs hasn't
got any backing store.  Because of this, only root (or a trusted user) should
be allowed write access to a ramfs mount.

A ramfs derivative called tmpfs was created to add size limits, and the ability
to write the data to swap space.  Normal users can be allowed write access to
tmpfs mounts.  See Documentation/filesystems/tmpfs.txt for more information.
[-] Redjard@reddthat.com 1 points 2 days ago

Maybe. It's been a while so I don't know 100% this was put to the test, but I wanna say the system has a weird kernel which leads to it not swapping out tmpfs properly.

But ordinarily you should be right, this would simply ruin the stats visually until something forced it to swap out, since konsole shouldn't be accessing it.

[-] Dave@lemmy.nz 1 points 2 days ago* (last edited 2 days ago)

Could you have restarted to allow the OS to clean it up?

[-] Redjard@reddthat.com 7 points 2 days ago

I could have, but the system wasn't set up to restart without downtime, and the server was also remote and not easily accessible.
It did actually die due to a power outage some months later and took 2 days to get restarted.
So yeah, sometimes restarting is way more undesirable than losing access to 32GB of ram. I would have just eaten that cost otherwise until a more opportune chance to restart.

Besides, restarting to fix a problem is equivalent to giving up on understanding the issue, learning new stuff, and maybe finding a way better solution or preventing the type of error entirely.
I get not finding the motivation when your software is working against you and learning is ultimately fruitless like on Windows, or not having the time in the moment to figure it out properly, but a perfectly good bug on a Linux system when you have time is prime real estate to grow your skills and find fulfillment.

[-] Dave@lemmy.nz 1 points 2 days ago

Ah that makes sense. I had considered it might be a server but you mentioned a Konsole tab so my mind decided it must have been local machine.

Crazy it took 2 days to restart the server!

[-] Redjard@reddthat.com 2 points 2 days ago

There was a dedicated person on call, but it happened to be when they were away.
The Konsole was left running from a local access, with a while true loop of a service status command. When that service was stopped later, the while loop started rerunning the script every second, filling the buffer with error messages.
The tab was then killed remotely, but the Konsole window left running. Process ram usage went down but the file remained on tmpfs, which is not counted as ram usage so wasn't noticed.
Then it took some time to notice the ram usage mismatch, so no one thought of that konsole incident.

[-] tal@lemmy.today 29 points 2 days ago* (last edited 2 days ago)

You can also still access the file as long as there's a process that still has it open. I have, in the past, "undeleted" a file or two doing that.

$ echo foo > bar
$ tail -f bar
foo

In another terminal:

$ pidof tail
1525534
$ ls -l /proc/1525534/fd|grep bar
lr-x------ 1 tal tal 64 Jun 16 06:42 3 -> /home/tal/bar
$ rm bar
$ ls -l /proc/1525534/fd|grep bar
lr-x------ 1 tal tal 64 Jun 16 06:42 3 -> /home/tal/bar (deleted)
$ cat /proc/1525534/fd/3
foo
$ cat /proc/1525534/fd/3 > bar-recovered
$ cat bar-recovered
foo
$

That is, the /proc entry for tail's file descriptor 3 there looks kinda like a symlink, but the kernel doesn't actually make it behave in quite the same way as a normal symlink.

That being said, getting back to the original point about unlinking not being able to remove the directory entry...it won't sit there blocking you from putting a new directory entry there with the same name, the way Windows file semantics mandate.
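
A small sketch of that point, assuming GNU stat (for -c %i) and a scratch directory; the descriptor number 5 is arbitrary. The name can be reused immediately while a reader still holds the old file, and the new file gets its own inode:

```shell
dir=$(mktemp -d)
echo old > "$dir/bar"
exec 5< "$dir/bar"                  # a reader keeps the old file open
old_inode=$(stat -c %i "$dir/bar")
rm "$dir/bar"
echo new > "$dir/bar"               # same name is free again right away
new_inode=$(stat -c %i "$dir/bar")
via_fd=$(cat <&5)                   # the reader still sees the old contents
via_name=$(cat "$dir/bar")          # the path resolves to the new file
exec 5<&-
rm -r "$dir"
echo "old inode $old_inode ($via_fd), new inode $new_inode ($via_name)"
```

The two inode numbers have to differ, since the old inode stays allocated for as long as descriptor 5 is open.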

EDIT: Also, what rm removes is the directory entry rather than the inode. The inode sticks around as long as the file data is there. You can have multiple directory entries for an inode, or none at all, but file data will have an inode associated with it.

$ touch a
$ sudo ln -T a b
$ stat -c %i a
216538023
$ stat -c %i b
216538023

Same inode, different directory entries.
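
The inode also keeps a count of how many directory entries point at it. A sketch, again assuming GNU stat (%h prints the hard link count):

```shell
dir=$(mktemp -d)
touch "$dir/a"
ln "$dir/a" "$dir/b"                 # second directory entry, same inode
links_two=$(stat -c %h "$dir/a")
rm "$dir/a"                          # drops one entry; the inode lives on via b
links_one=$(stat -c %h "$dir/b")
rm -r "$dir"
echo "link count with a and b: $links_two, after rm a: $links_one"
```

When that count reaches zero and no process has the file open, the filesystem is free to reclaim the inode and its blocks.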

https://en.wikipedia.org/wiki/Inode

inode persistence and unlinked files

An inode may have no links. An inode without links represents a file with no remaining directory entries or paths leading to it in the filesystem. A file that has been deleted or lacks directory entries pointing to it is termed an 'unlinked' file.

Such files are removed from the filesystem, freeing the occupied disk space for reuse. An inode without links remains in the filesystem until the resources (disk space and blocks) freed by the unlinked file are deallocated or the file system is modified.

Although an unlinked file becomes invisible in the filesystem, its deletion is deferred until all processes with access to the file have finished using it, including executable files which are implicitly held open by the processes executing them.

Do modern file systems actually remove it from disk? It's been a while since I did any forensics and didn't do much of it, but I remember being able to batch restore files from inodes as long as that part of the disk hadn't been overwritten. That's why you're supposed to overwrite disks with random data if you want the data gone.

[-] oats@piefed.zip 9 points 2 days ago

Nah, they just throw away the block markings, absolutely.

Overwriting an SSD is difficult as well; better to encrypt the drive and trash the key when you decommission.

[-] spicehoarder@lemmy.zip 1 points 2 days ago
[-] oats@piefed.zip 1 points 2 days ago

You mean like in your kitchen? Too much metal, you'll damage your magnetron.

You could use thermite and melt it to a pulp. Dangerous as well, though.

Really, just encrypt. Your CPU has AES extensions, performance impact is negligible. Simple, clean, and a protection against involuntary decommission as well.

[-] cmnybo@discuss.tchncs.de 1 points 2 days ago

If the file is on an SSD and trim is enabled, the blocks will be erased eventually.

[-] d00ery@lemmy.world 4 points 2 days ago

Interesting TIL

this post was submitted on 16 Jun 2026
373 points (97.0% liked)

linuxmemes
