Tuesday, May 13, 2014

A DIMM which won't work with any other DIMMs in the same channel

Ever since I got my Gigabyte GA-P35-DS3R motherboard, I have used it with 2 GB of DDR2 RAM. This was more than enough at first, but now it results in too much disk access even when running only KDE and Firefox.

The old RAM is an OCZ2G8002GK DDR2-800 OCZ Gold 2*1GB 5-5-5-15 kit consisting of two OCZ28001G modules. I was running it at stock speeds and voltages, and it seemed perfectly stable, never producing any errors in tests. I concluded that 4GB would probably be enough, but I chose to upgrade using 2*2GB for two reasons: I would still have 4GB even if I couldn't get the kits working together, and it's always better to have more memory than you think you need. I found a really good deal on eBay for OCZ2P8004GK DDR2-800 OCZ Platinum 2*2GB, consisting of two OCZ2P8002G modules.

When I put in the new RAM together with the old RAM, I got a hang at the initial graphical BIOS screen, but I could boot if I only put in the new RAM. At first, this seemed like a compatibility problem, maybe because the old RAM required 1.8V and the new RAM required 2.1V. I had forgotten about OCZ Platinum requiring 2.1V, and that requirement wasn't stated anywhere on the eBay item page or on the labels visible in the photos of the RAM. I became skeptical when I saw that the old and new kits both worked alone at 5-5-5-15 timings, at either normal voltage or 2.1V. They only failed to work together.

Then I tried relaxing various primary and secondary timings and reducing the frequency. It seems the OCZ28001G modules couldn't handle CL6, but I could relax all the other timings. Nothing helped. In most cases, my computer would power off and back on twice and then hang. That seemed to be the motherboard's attempt to switch to more conservative settings. I assume it is meant to recover from a failed overclock, but it never managed to recover from this. I would have to remove a DIMM to get into the BIOS setup and change settings for another attempt. Early on in this process I removed the hard drive so it wouldn't be subjected to all these power cycles. Eventually I was forced to give up because I couldn't imagine what else I could change.

I tried another experiment, putting the old RAM in the slots closest to the CPU, and the new RAM in the slots farther away. The intention was to put the old RAM in one channel and the new RAM in another channel, in case they were incompatible in the same channel. This configuration allowed me to boot, but caused lots of Memtest86+ errors past 4 GB. According to DMI data, this address was in the middle of one of the new DIMMs. I didn't conclude anything based on this, because I didn't know if the configuration was supposed to work, and because it was weird to see errors start in the middle of a DIMM.
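For anyone who wants to repeat the address check, here is a minimal sketch of the lookup I did by hand. The address ranges below are made up for illustration; they are not the values my board reported, and the real ones would have to be read from the DMI data. The function name and slot labels are also just my own placeholders. Note that when channels are interleaved, a physical address may not correspond to a single module this neatly, so treat the result as a hint only.

```python
GIB = 1 << 30

# Hypothetical (label, start, end) ranges, end exclusive.
# Real values would come from the DMI mapped-address records.
DIMM_RANGES = [
    ("slot 1 (2 GiB)", 0 * GIB, 2 * GIB),
    ("slot 2 (1 GiB)", 2 * GIB, 3 * GIB),
    ("slot 3 (2 GiB)", 3 * GIB, 5 * GIB),
    ("slot 4 (1 GiB)", 5 * GIB, 6 * GIB),
]


def locate(address, ranges=DIMM_RANGES):
    """Return (label, offset, fraction) for the range containing address.

    offset is the distance from the start of that range in bytes, and
    fraction is how far into the range the address lies (0.0 to 1.0).
    Returns None if no range contains the address.
    """
    for label, start, end in ranges:
        if start <= address < end:
            offset = address - start
            return label, offset, offset / (end - start)
    return None


if __name__ == "__main__":
    # With the made-up layout above, 4 GiB lands halfway into slot 3.
    print(locate(4 * GIB))
```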

Later, I was inspired to try yet another experiment: two DIMMs in one channel, with nothing in the other channel. Testing the DIMMs in pairs like this gave me more combinations to try with only 4 DIMMs. I found that one of the DIMMs wouldn't work with any other DIMM in the same channel, but the other DIMMs would work together. This finally made it seem like one DIMM is defective.
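The reasoning behind the pairwise test can be written down as a small sketch. The DIMM names and the result table here are placeholders that follow the pattern I saw (one DIMM failing with every partner, all other pairs fine); they are not a log of my actual test runs, and the function is just my own illustration of the logic.

```python
from itertools import combinations


def suspects(dimms, pair_ok):
    """Return DIMMs that fail in every tested pair they appear in,
    while every tested pair that excludes them passes.

    pair_ok maps frozenset({a, b}) -> True if that pair booted and
    tested cleanly when sharing a channel, False otherwise.
    """
    found = []
    for dimm in dimms:
        with_dimm = [ok for pair, ok in pair_ok.items() if dimm in pair]
        without_dimm = [ok for pair, ok in pair_ok.items() if dimm not in pair]
        if with_dimm and not any(with_dimm) and all(without_dimm):
            found.append(dimm)
    return found


if __name__ == "__main__":
    # Placeholder names: two old 1GB modules and two new 2GB modules.
    dimms = ["old_a", "old_b", "new_a", "new_b"]
    # Illustrative results: every pair containing new_b fails.
    pair_ok = {
        frozenset(pair): "new_b" not in pair
        for pair in combinations(dimms, 2)
    }
    print(suspects(dimms, pair_ok))
```

If no single DIMM explains all the failures, the function returns an empty list, which would point back toward a compatibility or settings problem rather than one bad module.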

After getting a new G.Skill F2-6400CL5D-4GBPQ set, the suspect DIMM wouldn't work with either of those in the same channel, but the other OCZ2P8002G DIMM worked fine as part of a 2*2+1*1 GB DDR2-800 5-5-5-15 configuration with one of the new G.Skill DIMMs. This seems to confirm that one OCZ2P8002G DIMM is defective.

It's surprising that a DIMM can be bad in such a way that it passes tests when alone in a channel but fails when there is another DIMM in the channel. However, it makes sense. Diagnostic programs can only tell you if the memory subsystem of that computer, as currently configured, is reliably storing and retrieving data. They can't tell you if a DIMM is meeting its electrical specifications.
