this post was submitted on 30 Oct 2024
techsupport
Rambling Story
Once, I had an El Cheapo and very questionable SATA SSD fail on my system. It had similar symptoms: Windows would hang and crash at random, more frequently over time. While digging through the Windows logs and troubleshooting, I found that the system would crash when trying to access the drive via File Explorer, because the drive would disconnect. The SSD seemed to fail slowly, but I was using it as a faster workspace and saving everything to an HDD, so I never looked into the possibility of a failing drive until the system wouldn't boot. Removing the drive cured everything. I should probably note that the failed SSD wasn't the boot drive. It was used strictly for data, so the OS itself wasn't being unmounted. I think the drive was shorting out some of the SATA pins, scrambling the whole bus.
Several years later, on the Linux side of things, I found out that fstab can prevent booting if a storage device is missing. On a fresh install, fstab had been auto-configured with an external drive enclosure as a critical component. I'm not sure what the error messages would look like if an internal data drive mounted as critical disconnected on a running system, but I would assume Linux would halt even if no processes were running from the drive.
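For anyone wanting to check their own fstab for this, here is a rough Python sketch. It assumes the usual whitespace-separated fstab layout (device, mount point, filesystem type, options) and relies on the `nofail` mount option, which as far as I know is what tells a systemd-based distro not to treat a missing device as a boot failure. The function name `entries_without_nofail` and the sample entries are made up for illustration; they are not from my actual system.

```python
def entries_without_nofail(fstab_text):
    """Return mount points of non-root, non-swap entries lacking 'nofail' or 'noauto'."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # A usable entry needs at least: device, mount point, type, options.
        if len(fields) < 4:
            continue
        _device, mount_point, fs_type, options = fields[:4]
        # The root filesystem and swap are expected to be required.
        if mount_point == "/" or fs_type == "swap":
            continue
        opts = options.split(",")
        if "nofail" not in opts and "noauto" not in opts:
            flagged.append(mount_point)
    return flagged


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """
# <device>  <mount point>  <type>  <options>        <dump> <pass>
UUID=aaaa   /              ext4    defaults         0      1
UUID=bbbb   /mnt/external  ext4    defaults         0      2
UUID=cccc   /mnt/data      ext4    defaults,nofail  0      2
UUID=dddd   none           swap    sw               0      0
"""

for mount_point in entries_without_nofail(SAMPLE_FSTAB):
    print(f"{mount_point} has no 'nofail' option; a missing device may block boot")
```

To run it against a real system, pass the contents of `/etc/fstab` in place of `SAMPLE_FSTAB`. It only reads the file and makes no changes, so whether to add `nofail` to a flagged entry is still your call.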
I'm not sure what the symptoms would have been if my SSD had failed while running Linux. My gut says it would look similar to your Linux dmesg output, with boot drive I/O disconnecting or becoming inaccessible.
I've also had a system with an AMD processor fail to boot, but that one wouldn't even POST. Fixed that one by finally reseating the CPU. Turns out that's a common issue with some AMD CPUs using the AM4 socket, found a lot of complaints online for that one after the fact.
Since your system runs fine from a live USB, and you've already replaced the M.2 drive, I would try running the system without any SATA drives installed, and try to force a crash until you feel confident the issue is gone.
If the problems persist, I would look at getting a cheap fresh HDD and a new SATA cable, installing a temporary OS, and running the test again.
If it STILL crashes, I would look at removing all unnecessary hardware from the motherboard and slowly testing each stage as you rebuild.
I unplugged the SATA cables last night and booted from a Windows USB to install it, and the SSD disconnected again midway through :) The SSD gets disconnected somehow, and if that happens while running the OS installed on it, it causes a crash. From USB, there is no crash. It's not the HDD, not the memory or CPU, and not the SSD (it's already brand new). I'm down to the motherboard at this point.