author | w96k <w96k@debian> | 2022-11-13 03:40:57 +0400 |
---|---|---|
committer | w96k <w96k@debian> | 2022-11-13 03:40:57 +0400 |
commit | 58548d3374588467636b4c4846a5a87568b6d0b2 (patch) | |
tree | 1cbf1190ec824dc8cab5698c401f9bb3889bd7c9 | |
parent | 5d02e4122773e73ce389bb7f0c409240f0afec72 (diff) |
Add RAM fix article
-rw-r--r-- | content/en/posts/ram-fix.org | 159 |
-rw-r--r-- | public/images/thinkpad_ram_fail.jpg | bin, 0 -> 13046 bytes |
-rw-r--r-- | public/images/thinkpad_ram_repair.jpg | bin, 0 -> 8873 bytes |
3 files changed, 159 insertions, 0 deletions
diff --git a/content/en/posts/ram-fix.org b/content/en/posts/ram-fix.org
new file mode 100644
index 0000000..01db750
--- /dev/null
+++ b/content/en/posts/ram-fix.org
@@ -0,0 +1,159 @@
+#+TITLE: RAM failures: How to detect and fix
+#+DATE: <2022-10-26 Wed>
+#+LANGUAGE: en
+
+* RAM failures: How to detect and fix
+
+#+BEGIN_abstract
+If your system breaks in unexpected ways and you can't understand why, it
+might be a RAM issue. In this article, I will try to explain why RAM
+breaks, how to detect it, and how to potentially fix it
+without going too deep into details.
+#+END_abstract
+
+#+ATTR_HTML: :loading lazy
+[[file:../../../public/images/thinkpad_ram_fail.jpg]]
+
+I started struggling with segfaults and other problems more often
+after years of using my laptop. It was not critical, but annoying, and it
+turned out that the problem was a broken RAM module. I tested it and,
+after confirming that it was broken, removed it completely. I
+had only 2GiB of RAM available after that, but I came to the conclusion
+that 2GiB is enough for my work if I apply some optimizations
+to my GNU+Linux system.
+
+** What RAM is used for
+RAM holds a program's code and data while it runs. When you run a binary,
+its CPU instructions are loaded into RAM, and the CPU then
+fetches them from there as it executes them. While
+running, your program stores intermediate results such as
+variables, data structures, and so on in RAM. After the program
+finishes, the memory that held its instructions and intermediate
+results is released.
+
+[[[https://en.wikipedia.org/wiki/Random-access_memory][Wikipedia: Random-access memory]]]
+
+** Why RAM can be damaged
+I can't say exactly why it gets damaged; overheating is probably one
+of the factors. There is a chance that the RAM itself is not damaged at all.
+Actually, it seems that RAM itself rarely breaks. What can be "damaged" are
+the module's contacts, and that can be easily fixed. What you need to do is
+take out the RAM modules and clean their contacts with an eraser. But how
+do you find out that you have a problem in the first place?
+
+** How to understand that RAM is broken
+The most obvious sign of a RAM failure is [[https://computerhope.com/beep.htm][BIOS signalling]] that it is
+broken. It should beep a special signal through the PC speaker. You can read
+your motherboard manual to find out what the signal means. Usually it
+means that the computer won't start with a "completely" broken RAM module.
+
+If the system boots just fine, but you experience problems along the way
+such as random segfaults, program crashes, kernel panics and so on,
+you might have broken segments of RAM. To detect such segments you can
+use the programs listed below. RAM checking is usually not a fast
+process, so you will probably need to leave your device running for
+several hours.
+
+*** Memtest86+
+Memtest86+ runs from the GRUB menu before your OS. It needs to run this
+way because it needs the whole range of RAM, while a running system occupies part
+of the RAM and would not allow it to be checked properly. It runs a lot of tests
+and checks every segment of your RAM. While checking, it lists the
+broken segments so that you can write them down.
+
+You can install it using your GNU+Linux package manager such as apt. The
+package is usually called ~memtest86+~. But there is a small caveat: if
+you use an old version, it won't work on UEFI systems.
+
+If it doesn't work, you can download a newer version of memtest86+,
+write it to your USB stick, and boot memtest from that. It should
+work on both UEFI and BIOS systems. It can be downloaded from the official
+website.
+
+[[[https://memtest.org/][Official Website]]]
+
+*** Memtester
+It has the same purpose as memtest86+, but it runs while your system is
+running, so it doesn't check the whole RAM range, but only the amount of free
+RAM that you specify and that is available at the moment of running it. It
+can be installed using your package manager of choice by typing
+~memtester~ as the package name.
+
+[[[https://pyropus.ca./software/memtester/][Official Website]]]
+
+** How to fix broken RAM
+#+ATTR_HTML: :loading lazy
+[[file:../../../public/images/thinkpad_ram_repair.jpg]]
+
+First of all, if memtest86+ and/or memtester don't show you any
+errors, congratulations! You don't have any problems with your RAM.
+
+If it shows a small number of errors, like one or two, you can make the Linux
+kernel ignore such segments of RAM, so programs don't use the broken
+segments and stay stable. To do that, you need to use the
+~memmap~ kernel argument in your GRUB configuration. For example:
+~memmap=0x100000$762ce9c38420,0x100000$34e03060,0x100000$87fce060,0x100000$23c63060,0x100000$87b6c060~. There
+is also a GRUB config variable called ~GRUB_BADRAM~, but it looks like it is
+deprecated and memmap is preferred.
+
+For more details about blacklisting bad segments of RAM, read [[https://unix.stackexchange.com/questions/75059/how-to-blacklist-a-correct-bad-ram-sector-according-to-memtest86-error-indicati][this
+comprehensive Unix & Linux Stack Exchange answer]].
+
+If it shows a huge number of errors, like many thousands, it probably means that
+one of your sticks of RAM is broken. To find out exactly which one is
+broken, you can look at the failing addresses or
+run another test with only specific stick(s) of RAM installed and see if the errors
+are gone.
+
+Be aware that if you are left with one RAM stick, there is a chance that the machine
+will only boot with it in a specific RAM slot. Read your motherboard manual if
+something doesn't work.
+
+If you have a RAM stick with tons of errors, you can try to
+repair it. I can't tell exactly how it is done or why it is done
+that way. You can find videos on fixing RAM sticks
+on YouTube and other resources. Here is [[https://youtu.be/KVR91p-Bd6M][a link to one such video]].
+
+** RAM optimizations for a GNU+Linux system
+If your RAM was broken and you are left with much less memory than you
+expected, don't rush to buy new RAM sticks. There is a chance that even
+with less RAM the system will work completely fine. Linux is pretty good
+at running on low-end machines and it has different ways to handle a lack
+of memory. Nowadays it is common to have devices that are overpowered for their
+tasks, like a gaming laptop with a very powerful
+CPU and GPU that is used mostly to render text in a text editor.
+
+*** Swap
+Swap is a partition on your hard drive that is used
+when there is no RAM left. It is used for other reasons too, and having such a
+partition is recommended on most GNU+Linux systems.
+
+You can configure how readily a Linux system uses swap by changing
+~swappiness~. You can read about changing that setting and learn about
+swap in general at the link below.
+
+[[[https://wiki.archlinux.org/title/swap][Arch Linux Wiki: Swap]]]
+
+*** Zram
+Zram is something that sits in between RAM and swap in terms of
+performance. It helps your system stay performant if it uses swap a
+lot, but it increases CPU usage because of that. I use Zram on a
+machine with 2GiB of RAM and 16GiB of swap and it works great even with
+many programs open at the same time (text editor, browser, Docker
+container, messenger).
+
+[[[https://wiki.archlinux.org/title/Improving_performance#zram_or_zswap][Arch Linux Wiki: Zram]]]
+
+*** Less bloated software
+Alternatively, you can simply use less bloated software, so you
+don't need so much RAM in the first place. In many cases good software
+doesn't require a lot of RAM, but bad software often leaks memory, so
+you would need many GiBs of RAM to use it properly. The most bloated
+software is web browsers such as Chromium and Firefox and browser-based
+apps built with Electron such as Slack, VSCode and other proprietary
+products.
+
+** Conclusions
+Now you have directions for what to do when you suspect a RAM
+failure. That knowledge can also be used for testing when you buy used
+memory sticks from someone else.
diff --git a/public/images/thinkpad_ram_fail.jpg b/public/images/thinkpad_ram_fail.jpg
new file mode 100644
index 0000000..dcf769b
--- /dev/null
+++ b/public/images/thinkpad_ram_fail.jpg
Binary files differ
diff --git a/public/images/thinkpad_ram_repair.jpg b/public/images/thinkpad_ram_repair.jpg
new file mode 100644
index 0000000..fe4b290
--- /dev/null
+++ b/public/images/thinkpad_ram_repair.jpg
Binary files differ