Wednesday, November 25, 2020

I give up on running recent NetBSD on old hardware

NetBSD is an amazingly cross-platform operating system. It also supports some very old hardware, such as most of the first small VAX systems. However, software in general gets bigger and slower over time. I recently tried to run NetBSD 9.1 on my KA650-based MicroVAX with 16 MB of RAM. It takes about 12 minutes to boot to a login prompt in multi-user mode with a stripped-down configuration. There doesn't seem to be any single reason why things are so slow, and comparing various NetBSD releases, I see that things slowed down gradually. So I don't see hope for reasonable performance, and I give up on running NetBSD there.

I also couldn't run NetBSD 9.1 on my SPARCstation ELC. It has 16 MB RAM, but the real problem is that each of the 4 MB memory modules is isolated in the address space, and the kernel needs to fit in one of the modules. Probably I could run NetBSD 9.1 if I had bigger memory modules, but I expect the experience would be disappointing.

This isn't meant to be a complaint about NetBSD. The same trend has been followed by other operating systems, and software in general. I still feel it's a bit sad. I'm not sure that the bloat offers a proportional improvement in functionality. But computers have gotten so much faster, storage has gotten so much bigger, and it has all gotten cheaper, so it's not really a problem in practice.

Finally, here's a boot log of that KA650 based MicroVAX. The first number is the number of seconds from the start, and the second number is time since the previous line.

  0.00   0.00  >> NetBSD/vax boot [1.12 (Sun Oct 18 19:24:30 UTC 2020)] <<
  4.91   4.91  >> Press any key to abort autoboot 0
  4.93   0.02  Trying BOOTP
  4.97   0.05  Using IP address: 192.168.1.4
  5.00   0.03  myip:  (192.168.1.4)
  5.06   0.06  root addr=192.168.1.42 path=/srv/nfs/vax/netbsd/anoat
  5.56   0.50  open netbsd.vax: No such file or directory
  5.58   0.02  > boot netbsd
 36.93  31.34  2043220+79088 [169744+162143]=0x25768c
 57.60  20.67  [   1.0000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
 57.70   0.10  [   1.0000000]     2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
 57.80   0.10  [   1.0000000]     2018, 2019, 2020 The NetBSD Foundation, Inc.  All rights reserved.
 57.87   0.07  [   1.0000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
 57.97   0.10  [   1.0000000]     The Regents of the University of California.  All rights reserved.
 57.97   0.00  
 58.05   0.08  [   1.0000000] NetBSD 9.1 (ANOAT) #1: Thu Nov 19 22:04:31 EST 2020
 58.11   0.06  [   1.0000000] root@:/usr/src/sys/arch/vax/compile/ANOAT
 58.16   0.04  [   1.0000000] MicroVAX 3500/3600
 58.22   0.06  [   1.0000000] total memory = 16328 KB
 58.27   0.05  [   1.0000000] avail memory = 13032 KB
 60.40   2.13  [   1.0000000] mainbus0 (root)
 60.49   0.09  [   1.0000000] cpu0 at mainbus0: KA650, CVAX microcode rev 4 Firmware rev 83
 60.55   0.06  [   1.0000000] lance at mainbus0 not configured
 60.61   0.05  [   1.0000000] uba0 at mainbus0: Q22
 64.54   3.94  [   1.0000000] sgmap exclusion at 0x3f0000 - 0x3fdfff
 64.96   0.42  [   1.0000000] uda0 at uba0 csr 172150 vec 774 ipl 17
 65.43   0.47  [   1.0000000] mscpbus0 at uda0: version 4 model 3
 65.48   0.05  [   1.0000000] mscpbus0: DMA burst size set to 4
 72.80   7.32  [   1.0000000] uda1 at uba0 csr 160334 vec 770 ipl 17
 79.99   7.19  [   1.0000000] mscpbus1 at uda1: version 6 model 13
 80.05   0.06  [   1.0000000] mscpbus1: DMA burst size set to 4
 82.49   2.44  [   1.0000000] qe0 at uba0 csr 174440 vec 764 ipl 17: deqna, hardware address 08:00:2b:XX:XX:XX
 82.69   0.20  [   1.0300080] ra0 at mscpbus0 drive 0: RD53
 82.74   0.06  [   1.0600080] ra1 at mscpbus0 drive 1: RD32
 82.80   0.06  [   1.0900080] rx0 at mscpbus0 drive 2: RX50
 82.86   0.06  [   1.1200080] rx1 at mscpbus0 drive 3: RX50
 88.46   5.60  [   1.1900080] boot device: qe0
 88.50   0.04  [   1.1900080] root on qe0
 88.55   0.05  [   1.1900080] nfs_boot: trying DHCP/BOOTP
 91.68   3.13  [   4.2200080] nfs_boot: DHCP next-server: 192.168.1.42
 91.73   0.05  [   4.2200080] nfs_boot: my_addr=192.168.1.4
 91.79   0.06  [   4.2200080] nfs_boot: my_mask=255.255.255.0
 91.84   0.05  [   4.2200080] nfs_boot: gateway=0.0.0.0
 98.20   6.36  [  10.3300080] root on synthesis:/srv/nfs/vax/netbsd/anoat
 98.24   0.05  [  10.3300080] root file system type: nfs
 98.31   0.07  [  10.3300080] kern.module.path=/stand/vax/9.1/modules
 98.39   0.08  [  10.3400080] TODR too smallWARNING: preposterous TOD clock time
 98.45   0.05  [  10.3400080] WARNING: using filesystem time
 98.50   0.05  [  10.3400080] WARNING: CHECK AND RESET THE DATE!
122.88  24.38  Fri Nov 20 04:01:44 UTC 2020
155.75  32.87  Not checking /: not listed in /etc/fstab
162.67   6.92  mount: Unknown special file or file system `/'
177.78  15.11  Starting file system checks:
204.22  26.44  random_seed: /var/db/entropy-file: Bad file system type nfs
205.15   0.94  /etc/rc.d/random_seed exited with code 1
208.30   3.14  Setting tty flags.
218.78  10.49  Setting sysctl variables:
221.91   3.13  ddb.onpanic: 1 -> 0
226.99   5.08  Starting network.
227.28   0.29  Hostname: anoat.tff.ca
243.03  15.75  Configuring network interfaces:.
243.75   0.72  Adding interface aliases:.
245.54   1.79  Waiting for DAD to complete for statically configured addresses...
347.59 102.05  Building databases: dev, utmp, utmpx.
369.39  21.81  Starting syslogd.
428.49  59.10  Mounting all file systems...
444.08  15.59  Clearing temporary files.
460.34  16.26  Creating a.out runtime link editor directory cache.
546.65  86.31  Setting securelevel: kern.securelevel: 0 -> 1
558.94  12.29  /etc/rc: WARNING: No swap space configured!
559.65   0.71  /etc/rc.d/swap2 exited with code 1
562.11   2.46  Starting virecover.
581.65  19.54  Checking for core dump...
585.27   3.62  savecore: no core dump (no dumpdev)
607.05  21.78  Starting local daemons:.
616.16   9.11  Updating motd.
704.94  88.78  Starting inetd.
719.23  14.28  Starting cron.
735.03  15.80  The following components reported failures:
735.07   0.05      /etc/rc.d/random_seed /etc/rc.d/swap2
735.13   0.05  See /var/run/rc.log for more information.
735.26   0.14  Fri Nov 20 04:11:44 UTC 2020
741.85   6.59  
741.90   0.05  NetBSD/vax (anoat.tff.ca) (constty)
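
The second column of the log can be derived from the first. Here is a minimal sketch in C of that calculation, assuming the timestamps have already been parsed into an array. The function names log_deltas and slowest_line are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Fill delta[i] with the time since the previous line, like the second
   column of the log. The first line has no predecessor, so its delta is
   measured from time zero. */
void log_deltas(const double *stamp, double *delta, size_t n)
{
    double prev = 0.0;

    for (size_t i = 0; i < n; i++) {
        assert(stamp[i] >= prev);   /* timestamps must not go backwards */
        delta[i] = stamp[i] - prev;
        prev = stamp[i];
    }
}

/* Return the index of the line preceded by the longest pause. */
size_t slowest_line(const double *delta, size_t n)
{
    size_t worst = 0;

    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        if (delta[i] > delta[worst])
            worst = i;
    return worst;
}
```

In the log above, the longest pause is the 102.05 seconds before "Building databases", which covers the wait for DAD to complete.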

Sunday, October 25, 2020

ULTRIX 4.00 problems writing to devices hosted on a Linux NFS server

When the VAX ULTRIX 4.00 Rev. 158 /usr/diskless/genvmunix kernel does a non-appending write to a device, like "echo foo > /dev/console", the write fails if the device file is located on an x86_64 Ubuntu 20.10 Linux 5.8.0-25-generic NFS server.

ULTRIX seems to be at fault. It sends an NFS V2 SETATTR call setting the length of that device file (e.g. /dev/console) to zero, and that fails. This behaviour was observed after netbooting ULTRIX, on the root NFS file system, and is a problem for running netbooted ULTRIX. Even if I mount another NFS file system read-only, ULTRIX still sends the SETATTR there. Surely it shouldn't be trying to change things on a file system that's mounted read-only!

Use of SETATTR to truncate a file is standard behaviour when overwriting a file, like "echo foo > bar". In that case it works fine on a writable file system. Also, ULTRIX doesn't make attempts to change ordinary files on read-only mounted NFS file systems.

Appending to devices, like "echo foo >> /dev/console" works properly, both on writable and read-only file systems. 
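
For reference, here is a sketch in C of what a truncating SETATTR carries. The field order follows the sattr structure in RFC 1094 (NFS version 2), where a size of zero means truncate and -1 means leave the field alone. The type and function names are made up for illustration. That ULTRIX sets every other field to -1 is an assumption, since only the zero length was noted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SATTR_IGNORE 0xFFFFFFFFu   /* -1: leave this attribute unchanged */

struct nfs2_timeval { uint32_t seconds, useconds; };

/* Settable attributes of NFS version 2, in the field order of RFC 1094. */
struct nfs2_sattr {
    uint32_t mode, uid, gid, size;
    struct nfs2_timeval atime, mtime;
};

/* Attributes of a pure truncation: size 0, everything else ignored
   (the "everything else" part is an assumption about ULTRIX). */
struct nfs2_sattr truncate_sattr(void)
{
    struct nfs2_sattr sa = {
        SATTR_IGNORE, SATTR_IGNORE, SATTR_IGNORE, 0,
        { SATTR_IGNORE, SATTR_IGNORE }, { SATTR_IGNORE, SATTR_IGNORE }
    };
    return sa;
}

/* Nonzero if a SETATTR with these attributes would change the file length. */
int sattr_sets_size(const struct nfs2_sattr *sa)
{
    assert(sa != NULL);
    return sa->size != SATTR_IGNORE;
}
```

An appending write has no reason to send such a request, which would fit with ">>" working where ">" fails.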

Monday, October 19, 2020

Installing MIT Project Athena 4.3BSD onto a VAX

Athena 4.3BSD is an MIT modification of 4.3BSD for their Project Athena distributed computing environment. Although it is designed to work in such an environment, it can be installed and used on a single machine. A few things do break, but there are ways around that.

First, you need the Athena 4.3BSD srvd tree, and a way to boot the VAX into a minimal environment. Both are available via http at those links, though if you want proper Unix permissions and ownerships, you should download srvd via AFS if possible.

You may also need a VAX. I have successfully installed on a VAXstation 2000, MicroVAX II, and VAXstation 3100 m38, but I have not successfully installed in simh yet, though the RD53 drive image from the MicroVAX II does work in simh.

Export the srvd tree read-only via NFS, using a server which accepts version 2 NFS and mountd protocols. In Linux you can add --nfs-version 2,3,4 to nfsd and mountd options. In Ubuntu and probably also Debian, in /etc/default/nfs-kernel-server add --nfs-version 2,3,4 to both at the start of RPCNFSDCOUNT (because RPCNFSDARGS won't make any difference) and in RPCMOUNTDOPTS.

I have not mastered netbooting yet. The first stage seems to be etftp loaded via MOP, but then it seems you cannot simply TFTP load the kernel and expect it to run, and I'm not sure the kernel supports root over NFS. If interested take a look at the vax650 folder in bootkit and the LORE file there. In this case, I'm using an RX33 floppy image from bootkit, 61+.floppy, to boot a VAXstation 2000. That's an ordinary 5.25" 1.2 MB floppy image, which can be written on a PC using dd or rawritewin. Maybe you could netboot NetBSD/VAX on the same VAXstation 2000 and write from there, but I don't know which version has working floppy support.

If not using an attached display and keyboard, hook up the serial console. The VAXstation 2000 uses 9600 baud. The initial steps seem to be 8-N-1, but later on the OS switches to 7-E-1, meaning you'll see lots of characters with the high bit set if you don't reconfigure your terminal emulator.
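
To illustrate why only some characters look wrong, here is a small sketch in C of what an 8-N-1 receiver sees when the sender uses 7-E-1. It assumes plain even parity placed in bit 7, and the function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Byte as an 8-N-1 receiver sees a 7-bit character sent as 7-E-1:
   bit 7 carries the even parity bit. */
uint8_t encode_7e1(uint8_t ch)
{
    uint8_t ones = 0;

    assert(ch < 0x80);              /* only 7 data bits are available */
    for (int bit = 0; bit < 7; bit++)
        ones += (ch >> bit) & 1;
    /* Even parity: set the parity bit when the data has an odd number of ones. */
    return (uint8_t)(ch | ((ones & 1) << 7));
}

/* Recover the character by discarding the parity bit. */
uint8_t strip_parity(uint8_t byte)
{
    return byte & 0x7F;
}
```

For example, 'A' (0x41) has two one bits and arrives unchanged, while 'C' (0x43) has three and arrives as 0xC3.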

After booting from the floppy, you'll see kernel output, a few lines from the startup script, a big warning that this destroys the contents of the hard disk, and a prompt for the hostname. I don't believe you can get past this prompt, because it relies on Athena infrastructure to query the hostname. So just control-C out of that. Then you get a prompt. It's a very minimal environment, without even ls.

Now you need to configure networking and mount the srvd tree via NFS. You need to use get_pack to mount via NFS, because mount doesn't seem to understand it. Change the network device name and IP addresses as needed. Note that the NFS server address and path are separate arguments, and not connected with a colon like you would use with mount.

# stty erase ^H
# ifconfig se0 192.168.1.5
# get_pack nfs 192.168.1.42 /srv/nfs/vax/athena/srvd /srvd
/srv/nfs/vax/athena/srvd server ok
/srv/nfs/vax/athena/srvd mounted on /srvd

At this point you could run stuff from /srvd if you wanted and explore. Note that the root filesystem is on a floppy and there is little space and few inodes available there. To continue the installation, you need to set some variables and run the phase 2 installer.

# MACH=VS2000
# save_site_flag=false
# . /srvd/install/phase2.sh

Actually, you might want to first look at phase2.sh, because it assumes some things. For me it was a perfect fit because it assumed the disk was an RD32, but you might have a different hard disk. Also, if your VAX has multiple hard disks, and you want to install onto a particular one, be careful. Remember the front panel write protect buttons.

The installer worked fine, except umount didn't work, and at the end I got a "/dev/rrd0a: DIRECTORY /etc/athena: LENGTH 1040 NOT MULTIPLE OF 512 (ADJUSTED)" warning. This should put a bootable system on your hard disk, but some customizations are needed for a successful multi-user boot. If you try to boot you'll get errors which might be mostly unreadable over serial because new console output starts before old output is finished, making network configuration failure look like "Setticannot opeUnablePleaseHaltinsyncing disks... done". If you do end up in that situation, boot with "b/2 dua0" to enter single user mode and fix things from there.

Since /etc/athena/rc.net relies on Athena infrastructure, I replaced that line in /etc/rc with just a simple "/etc/ifconfig se0 192.168.1.5". Also, change the configuration in /etc/athena/rc.conf. You can set everything false except for NFSCLIENT if you need it to mount srvd. In particular AFSADJUST takes a long time or maybe never completes, and needs to be disabled.

The installer only creates the root filesystem and a site filesystem for local storage. Most of the operating system is in the srvd tree, which will need to be read-only mounted at /srvd. If you don't have space to put it somewhere on the VAX, again mount it via NFS. Add the hostname of the server to /etc/hosts, and add the line for mounting it to /etc/fstab. You can't mount via IP address in fstab.

In /etc/ttys, there are lines which either start an X display manager or getty in text mode. Booting the VAXstation 2000 from serial, X fails to start. This means a wait for a timeout before getting a login prompt. Commenting out the /etc/athena/dm line for console and uncommenting the /etc/getty line fixes that.

You can get rid of the root password by removing the 2pEdLRdD8rMnk from /etc/passwd. (Documents about Athena say the root password of workstations is identical and not secret, but I don't know what it is.) Trying to set a new password via passwd doesn't work because it wants to talk to a Kerberos server.
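
On the VAX itself an editor does the job. Purely as an illustration of which field is involved, here is a sketch in C that empties the second colon-separated field of a passwd line. The function name clear_password is made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy one /etc/passwd line into out with the second colon-separated field
   (the password hash) emptied. Returns 0 on success, -1 if the line has
   fewer than two colons or out is too small. */
int clear_password(const char *line, char *out, size_t outsize)
{
    const char *first, *second;
    int len;

    assert(line != NULL && out != NULL);
    first = strchr(line, ':');
    if (first == NULL)
        return -1;
    second = strchr(first + 1, ':');
    if (second == NULL)
        return -1;
    /* Keep "name:", skip the hash, keep ":rest". */
    len = snprintf(out, outsize, "%.*s%s",
                   (int)(first - line + 1), line, second);
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```

Given a line starting with "root:2pEdLRdD8rMnk:", the output starts with "root::", which means no password.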

That should be it. You now have a VAX which runs Athena 4.3BSD. You can even run X on a VAXstation or a KA630 MicroVAX with QDSS. (The KA650 kernel doesn't include QDSS support, though the bootkit one seems to have it.) It's leaner than NetBSD, even working fine on a VAXstation 2000 with 4 MB RAM.

Friday, October 16, 2020

AT motherboard serial connector pinouts can differ between motherboards

Older AT format motherboards don't have a back panel with many connectors, like an ATX motherboard. The only direct external connection is for a keyboard. Many motherboards have onboard serial and parallel ports. These connect to the outside via ribbon cables which go to standard connectors. The standard connector is mounted on an expansion slot cover or elsewhere on the case (so slots aren't wasted for this), and the ribbon cable connects to the motherboard.

Different motherboards can have the same ribbon cable connectors for serial and parallel ports. However, the pinout of those connectors can differ, and swapping motherboards may require swapping those connectors.

Thursday, October 15, 2020

ATI Mach 64 GX character generator failure

This is a PCI video card from ATI, based on the 210888GX00 Mach64 chip. I think it's a Graphics Pro Turbo. It is only designed to accelerate the Windows desktop, and was released well before 3D cards became common. To the right you see a 2 MB VRAM expansion board which plugs into the main board.

Here's the image produced by the card:

Note that this isn't just random garbage. It is a text mode screen. Some characters are blank and the rest are all the same horizontal lines. Colours are still there. This is a BIOS startup screen. At the top right you would see the Energy Star logo in yellow and EPA pollution preventer in green. Those graphics are created in text mode by changing some characters of the font and putting them there. So, both standard and user defined characters are wrong the same way.

It's weird that such a card failed while sitting unused in a PC for a long time. I don't know what's wrong with it. If anyone knows, leave a comment.

An ugly fix for an ODIN OEC12C887A RTC

The Odin OEC12C887A chip contains a battery backed real time clock and a small amount of battery backed RAM. The battery and the crystal needed for this are part of the chip. Many old motherboards use these chips, and have them soldered onto the board. Of course, batteries don't last forever, and when they fail, motherboards lose the time, date and settings. Unfortunately it seems settings cannot be retained even while the PC is on. After applying settings, the BIOS reboots and then reports they're invalid. Some motherboards can function this way, with problems related to few settings, like floppy drive type. Others may not allow you to boot.

The proper fix is to unsolder the chip, solder a socket, and put a new chip into the socket. Though it's also possible to drill into the chip and connect a new 3V lithium non-rechargeable battery, such as a CR2032. I don't have a Dremel or equivalent, so I used a soldering iron. The result is ugly but it works. Here is a photo of the chip with exposed connections:

You can see 5 metal pieces among the melted plastic. Going clockwise you see a pin from the chip, a tiny bit of metal which it was pried away from (which connects to the bottom of the battery), the top of the battery, a bit of metal connecting to the top of the battery, and another pin from the chip which was pried away from that.

To understand this better, imagine a normal dual in-line package (DIP) chip. The pins come out the side and bend downwards, in order to go through holes in the circuit board and connect to the rest of the circuit. In this case, some of the pins instead bend upwards, and connect to the battery and crystal. If you look underneath the board you'll see that some pins are missing, and some or all of those are bent upwards. Probably these devices start off as a typical DIP package, which is then encased in a second layer of plastic after those components are added.

The pinout can be found at https://www.betaarchive.com/forum/viewtopic.php?t=22000, in particular in this image. Pin 16 is negative, and pin 20 is positive. Pin 16 has continuity with motherboard ground. Pin 20 has continuity with the larger surface of the coin cell, which is generally positive. Probably the crystal is connected to pins 2 and 3.

It doesn't smell too bad, and controlling the temperature so it's mostly melting and not burning plastic helps. I'm not going to recommend it because plastic fumes might be unhealthy, and I certainly wouldn't want to do this on a regular basis. But the repair was a success. The PCI slot is fine too, with only very minor cosmetic damage to the outer surface.

Monday, October 12, 2020

Fix for Tseng ET3000AX SVGA card displaying monochrome

This is a Micro-Labs VGA Solution, an ISA super VGA card based on the Tseng Labs ET3000AX. It seems to be a reference design because other brands of cards also exist with the same board layout. It displayed in monochrome on the SVGA monitor.

At first I thought the switches on the back were set incorrectly. They tell the board what kind of monitor is connected and whether it's primary or secondary. They're documented here, but basically they all need to be off. That tells it that it's the primary card, a "multisync analog monitor" is connected, and a secondary monochrome card may or may not be present. Up, away from the circuit board, is off.

The switches seemed to make no difference. So, I thought the board was broken. But no, the VGA connector standard changed. Those switches only talk about colour and monochrome regarding TTL monitors, using the DB9 connector. The VGA monitor is detected via monitor ID bits according to the old VGA connector standard. The current standard doesn't include those and instead uses an EEPROM providing information about the monitor. It seems that based on what the board detects on those ID bits, it can go into monochrome mode. All the palette entries get transformed via gray-scale summing, making everything black and white. This even happens if you try to set the palette via int 10h.
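
As a rough illustration of gray-scale summing, here is a sketch in C using the 30% red, 59% green and 11% blue weights of the standard VGA BIOS. Whether the ET3000AX BIOS uses exactly these weights and this rounding is an assumption, and the names are made up.

```c
#include <assert.h>
#include <stdint.h>

struct dac_entry { uint8_t r, g, b; };   /* 6-bit components, 0..63 */

/* Replace a palette entry by its gray equivalent, using the 30% red,
   59% green, 11% blue weights of the standard VGA BIOS. Rounding to the
   nearest value is an assumption. */
struct dac_entry gray_sum(struct dac_entry in)
{
    struct dac_entry out;
    unsigned gray;

    assert(in.r < 64 && in.g < 64 && in.b < 64);
    gray = (30u * in.r + 59u * in.g + 11u * in.b + 50u) / 100u;
    out.r = out.g = out.b = (uint8_t)gray;
    return out;
}
```

With these weights, pure red at full intensity (63, 0, 0) becomes the dark gray (19, 19, 19), so any colour written to the palette comes out as a shade of gray.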

A software solution is coloron.com, which uses int 10h functions to switch the card to colour mode from DOS. Another solution is to boot with the monitor disconnected. This doesn't only mean power-on, but also if you reboot via the reset button or control-alt-delete. I opted for a hardware solution, cutting the circuit board trace between those connector pins and the rest of the card:

One last thing. The old VGA connector standard has a key at pin 9, meaning no pin there in the male and no hole there in the female. Now that is +5 V DC supplied to the monitor for the identification EEPROM. You might want to drill a hole there in the connector to avoid breaking pins on more modern VGA connectors.

Monday, October 05, 2020

Telequipment D54 oscilloscope bad MPS6518 transistor in sweep-gating bistable

My Telequipment D54 oscilloscope didn't show a trace. Instead, if I turned up brightness all the way there were some very faint blobs which changed a bit with the horizontal position adjustment. If I pulled the plug, the spot appeared coming from the left of the screen. This is similar to when trigger stability has been turned up and it is not being triggered, but stability couldn't make it work and even external X input mode did not work.

The service manual (available at radiomuseum.org) contains schematics and a detailed circuit description.

Clearly the horizontal deflection circuit was pushing the beam all the way to the left. I first removed JFET TR107 and connected a resistor to ground from the sawtooth output. This affected the voltage at the collector of TR108, and the horizontal deflection amplifier worked as expected. But there was still no spot on the screen. I assumed that was due to blanking.

The fact the CRT was blanked and even external X input didn't work made me focus on the sweep-gating bistable built around TR105 and TR106 transistors. After probing it a bit I unplugged TR105, an MPS6518 transistor, and to my surprise the base seemed open circuit. I don't know what made it fail there. Putting in a 2N2907A made it work. This replacement was very easy thanks to the fact all the transistors are socketed. Here's the right side board with the timebase circuitry, a closeup on TR105, and the bad MPS6518 transistor:




For completeness, here's the left side board, with the two channel inputs, vertical amplifiers, and trigger circuitry which provides an input to that sweep-gating bistable:



Friday, October 02, 2020

DEC KA650 (MicroVAX III / 3500 / 3600) error 62 can mean a bad keyboard connected to QDSS

My MicroVAX produced no output via the QDSS. I saw the monitor flash as if the QDSS was initialized, but no video output appeared. Instead text went to the serial console of the KA650 CPU board. There, at the start of self test, I saw an error:

?62 2 08 FF 00 0000


According to the KA650 CPU Module Technical Manual, this is:

62 2004 E254 console QDSS mark_notpresent self test r0 self test r1 ****

It appeared at the same time as a keyboard beep. At the end of self tests there was the ominous "Normal operation not possible" warning. Swapping the LK201 keyboard fixed the problem. The keyboard causing the problem had all 4 LEDs lit, while the working keyboard had them all unlit except for a flash during initialization. There is no more error, and QDSS console works fine.

The MicroVAX can still boot NetBSD/VAX 1.3_BETA from the RA90 drive. Digital sure built this stuff to last!

The known failing parts are the RD53 drives. They can spin up with the trick of disconnecting the head actuator coil at startup, but have many read errors. I'll try reformatting them.

Thursday, September 24, 2020

Netbooting SunOS on a SPARC workstation from Ubuntu 20.04

The general procedure for booting a Sun SPARC workstation over the network is well documented. I liked this guide. The post here exists to document ways in which things break in modern Ubuntu and changes which need to be made. I was booting a SPARCstation ELC.

First you need an IP address. It is best to choose a name, add that line to /etc/hosts, and use the name whenever possible.

The boot process starts with the workstation requesting its IP address via Reverse Address Resolution Protocol (RARP). This simply works. Install the rarpd package and put the address in /etc/ethers.

You can now load boot (like kvm/stand/boot.sun4c) or the kernel (kvm/stand/vmunix) via TFTP. Note that the file name is the IP address followed by period and the architecture, capitalized, like C0A8017B.SUN4C for a SPARCstation ELC at 192.168.1.123.
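
The file name can be computed from the address. A minimal sketch in C, with a made-up function name, where each octet becomes two uppercase hexadecimal digits:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the TFTP boot file name for the IPv4 address a.b.c.d and an
   architecture suffix such as "SUN4C". Returns 0 on success, -1 if the
   buffer is too small. */
int tftp_boot_name(unsigned a, unsigned b, unsigned c, unsigned d,
                   const char *arch, char *out, size_t outsize)
{
    int len;

    assert(a < 256 && b < 256 && c < 256 && d < 256);
    assert(arch != NULL && strlen(arch) > 0);
    /* Each octet becomes two uppercase hexadecimal digits. */
    len = snprintf(out, outsize, "%02X%02X%02X%02X.%s", a, b, c, d, arch);
    return (len < 0 || (size_t)len >= outsize) ? -1 : 0;
}
```

Calling it with 192, 168, 1, 123 and "SUN4C" gives C0A8017B.SUN4C, matching the example above.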

In either case the next step is an answer from bootparamd. Install the bootparamd package, but it won't see the requests, because they're broadcasts and not sent to the server's IP address. You need to add -r to OPTIONS in /etc/default/rpcbind for bootparamd to see the requests. For some reason the lines adding to options weren't expanding ${OPTIONS}, so I had to set them all at once. It will then answer. You must supply both root and swap values in /etc/bootparams.

After that the final step is NFS. Install nfs-kernel-server. SunOS will want NFS and rpc.mountd version 2 protocols, which are disabled. In /etc/default/nfs-kernel-server add --nfs-version 2,3,4 to both at the start of RPCNFSDCOUNT (because RPCNFSDARGS won't make any difference) and in RPCMOUNTDOPTS. Note that /etc/exports requires directories, so to share a swap file you need to share a directory containing it.

If you sent boot via TFTP, it will make a bootparamd request and then load the kernel via NFS. That might be faster than loading the kernel via TFTP, because TFTP is simple and inefficient. The kernel makes another bootparamd request, and then mounts root and swap. What happens afterwards is up to what's in the root file system.

Don't forget to restart daemons after changing configuration files.

When creating devices in the root file system, ./MAKEDEV std is insufficient. You need ./MAKEDEV pty for telnet or xterm, and ./MAKEDEV win for the windowing system. The MAKEDEV script worked fine from Linux. The arch -k error is harmless unless you're using a sun4m system.

SPARCstation ELC repair

The SPARCstation ELC is a SPARC workstation built into a monochrome CRT monitor. It is fanless, and gets quite hot, which is bad for electrolytic capacitors. I also suspect that some of the particular capacitor models used are bad.

Accessing the computer motherboard (under the cover at the back of the top) is very easy. Unfortunately, disassembling the rest isn't very convenient. Maybe the design tried to make it convenient at first, and then changes defeated that.

Before disassembling further, consider the shock and electrocution hazards. Obviously the line voltage side of the power supply is dangerous. The secondary side is dangerous too, because besides supplying safe voltages to the computer, it also supplies higher voltages to the monitor. The main monitor board uses those to make even higher voltages, and sends some of these to the CRT socket board. Due to capacitors, voltages can persist even when the device is turned off and unplugged.

The first problem was the power supply, the board to the right when viewed from the rear. It seems like it could be possible to lift out, but the speaker bracket screwed into its bottom prevents that. So the whole side needs to be freed and moved outwards.

The power supply was pulsing. This can happen when an SMPS detects a fault, shuts down and restarts, but that's not what was happening here. The -12 V rail had excessive ripple, and I replaced its 100 µF 16 V filter capacitor (the missing C528 in the photo) to fix that, but it wasn't causing the problem either.



Just guessing, I noticed the 100 µF 35 V capacitor near what seems to be the SMPS controller IC. The board in that area was browned due to heat around various holes, so it makes sense that the capacitor may have been ruined by heat. Also, a failure in power supply to that IC could cause the symptoms. After testing the theory by temporarily placing a capacitor in parallel, I replaced the capacitor and fixed the problem.



Now the picture was barely visible. The firmware screen has black text on a white background, but the text was just a bit darker than the background. Replacing the two visibly leaking pairs of back to back 100 µF 10 V capacitors at the bottom of the CRT socket board took care of that. It turned out those transistors, labelled ITT 895 115C, near the corner of the board were bad as well. According to NTE's cross reference, 2N4401 was a good substitute. I don't know why they failed. I thought maybe I damaged them by accidentally solder bridging surface mount tubular C417S to the neighbouring resistor on the underside while replacing the rightmost capacitor, but that solder bridge apparently needs to be there. I also replaced some other capacitors but am not sure any of that was necessary. This board is probably the worst for capacitors because it is in its own RFI shield box inside the monitor.


Note that the CRT socket board requires the RFI shield to make ground connections. The main ground area visible at the CRT socket and the three tabs to the left of it need to be connected by wires if you remove that shield and want to power on the monitor.

Finally there was some vertical foldover at the top. It was interesting to note that its start stayed in place as vertical size and position were adjusted. Multiple guides about this problem with CRTs in general say that the prime suspect is the pump-up capacitor. The vertical deflection circuit needs a pulse of higher voltage to overcome inductance and quickly move the beam back to the top. This is accomplished via a charge pump, which charges a capacitor in parallel with the supply, and then connects it in series, to provide that higher voltage. That was another one of those 100 µF 35 V light blue capacitors, like in the power supply (the large cap in this photo). Here are the definitely bad components:

If you want to see more photos, take a look at the album.

Friday, September 04, 2020

Fixing Whirlpool part 3378207 to make dishwasher start washing reliably

A dishwasher doesn't simply recirculate water while washing. It tries to separate solids coming off the dishes from the water, and recirculate only the water. Some dishwashers have filters. This dishwasher has a soil separator. First, a chopper breaks up any large particles. Then part of the water being pumped by the wash impeller goes through the plastic doughnut surrounding it, where baffles try to trap particles. Later, when the dishwasher pumps out water, it pumps from the soil separator, sending the particles down the drain.

The outlet of the soil separator connects to the drain pump via the drain pump cover. Also, the outlet of the soil separator has a simple valve operated by water pressure. Under that little cover to the right is a diaphragm which is pushed down by a spring to open the passage. When pressure builds up, it acts against the spring, pushing the diaphragm upwards, and pulling a rubber cone which blocks the passage.

But, how does that pressure build up in the first place, while the passage is open? I'm assuming it builds up due to the drain impeller spinning the wrong way. Both the wash and drain impellers are on the same shaft, and the current function depends on the rotation direction, determined via the motor start winding.

However, if water leaks out elsewhere, then pressure may not build up enough to push up the diaphragm and seal off the passage. That Whirlpool 3378207 drain pump cover has a rubber gasket on top. It has a thin plastic ridge surrounding that gasket, to keep the gasket in place. That ridge breaks off, and then pressure stretches the gasket, allowing water to leak out. The result was that sometimes the dishwasher would fill with water normally and not start washing, even though the pump was running. It's surprising that enough water can escape that way to cause this, but apparently it can.

If I stopped and restarted it while full of water, it always started properly. I guess the big pulse happening when the pump starts provided enough pressure to move the diaphragm, but the slow pressure increase happening as it was filling didn't.

I cut two thin slices from a copper water line, and soldered them into the appropriate shape for holding the gasket in place. The gasket had been stretched a lot, but I managed to squeeze it into place.


Thursday, May 14, 2020

Using SpeedFan's driver to call Dell BIOS functions for fan control

SpeedFan uses a signed driver to talk to hardware for temperature measurement and fan control. The fact it's signed is important because Windows 10 makes running unsigned drivers difficult. I wasn't satisfied with how SpeedFan fights with the Dell Inspiron 6400 laptop's built-in fan control. I8kfanGUI was better, but didn't have a signed driver, so I only used that in Windows 7 and earlier.

I also wasn't satisfied with fan control options in Linux. The Linux kernel provides access to Dell BIOS temperature measurement and fan control functions via the i8k driver, but attempting fan control led to the same issues. So, I first created a small fan control program in C in Linux. Getting it to run in Windows was surprisingly easy, thanks to SpeedFan's driver. First one needs to open the driver:

    sfdrv = CreateFile("\\\\.\\SpeedFan",
                       FILE_ALL_ACCESS, FILE_SHARE_READ, NULL,
                       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

Then one can use the IOCTL to call Dell BIOS functions:

#define SFD_CALL_DELL 0x9C402424

int call_dell(uint32_t eax, uint32_t ebx)
{
    uint32_t inbuf[4], outbuf[4];
    DWORD readbytes = 0;

    inbuf[0] = eax;
    inbuf[1] = ebx;
    inbuf[2] = 0;
    inbuf[3] = 0;

    if (DeviceIoControl(sfdrv, SFD_CALL_DELL,
                        &inbuf, sizeof(inbuf),
                        &outbuf, sizeof(outbuf),
                        &readbytes, NULL) == 0 ||
        readbytes < 4) {
        return -1;
    } else {
        return outbuf[0] & 0xFF;
    }
}

I don't know if you need to read and write 16 bytes. SpeedFan always did it this way. I guess they're probably eax, ebx, ecx and edx, in little endian order of course. Certainly the first two seem to be eax and ebx. The i8k Linux driver, now part of dell-smm-hwmon.c, will show you what to do with this.

#define I8K_SMM_SET_FAN     0x01a3
#define I8K_SMM_GET_FAN     0x00a3
#define I8K_SMM_GET_TEMP    0x10a3

int get_temp(unsigned int which)
{
    return call_dell(I8K_SMM_GET_TEMP, which);
}

int get_fan(void)
{
    return call_dell(I8K_SMM_GET_FAN, 0);
}

int set_fan_real(int speed)
{
    if (speed < 0 || speed > 2) return -1;
    return call_dell(I8K_SMM_SET_FAN, (speed << 8) | 0);
}

In my program I used the following rules to reduce fighting between my fan control and the built in fan control: Read fan speed setting (not RPM, but off / low / high) before setting it. Only write it to change it (because otherwise it's pointless). It is always okay to raise fan speed. Once the program raises fan speed, it is allowed to lower it. If the program reads a speed that is higher than the last one it set, it is not allowed to lower it anymore until after the next time it raises fan speed.
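
The rules above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the actual program: the names fan_arbiter, fan_arbiter_init and fan_arbiter_decide are made up here, speeds are the 0 / 1 / 2 settings used by set_fan_real, and -1 is used to mean "write nothing".

```c
#include <stdbool.h>

/* Hypothetical state for the arbitration rules; names are illustrative. */
struct fan_arbiter {
    int  last_set;   /* last speed this program wrote, or -1 if none yet */
    bool may_lower;  /* true after we raised, until something else raises above us */
};

void fan_arbiter_init(struct fan_arbiter *a)
{
    a->last_set = -1;
    a->may_lower = false;
}

/*
 * current: speed setting just read back (off / low / high as 0 / 1 / 2)
 * wanted:  speed setting this program would like
 * Returns the speed to write, or -1 if nothing should be written.
 */
int fan_arbiter_decide(struct fan_arbiter *a, int current, int wanted)
{
    /* Something else set a higher speed than our last write:
       we lose the right to lower until we raise again. */
    if (a->last_set >= 0 && current > a->last_set)
        a->may_lower = false;

    /* Only write to change the speed. */
    if (wanted == current)
        return -1;

    /* Raising is always okay, and it grants the right to lower later. */
    if (wanted > current) {
        a->last_set = wanted;
        a->may_lower = true;
        return wanted;
    }

    /* Lowering is only allowed while we still hold that right. */
    if (!a->may_lower)
        return -1;
    a->last_set = wanted;
    return wanted;
}
```

In a polling loop, the value from get_fan() would be passed as current, and set_fan_real() would be called only when the result is not -1.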

I used the rohitab.com API Monitor to figure this out by watching the DeviceIoControl calls that SpeedFan was using.

Other people have also used the SpeedFan driver. Although you need to have Administrator rights to talk to the driver, the driver seems to be a bit of a security hole because even Administrator isn't supposed to have such absolute total control in Windows. Here's one example: https://github.com/SamLarenN/SpeedFan-Exploit/

Here's an example of MSR reading to get the temperature from the CPU's internal thermal sensor:

#define SFD_READ_MSR 0x9C402438

int read_msr(uint32_t msr, uint64_t *dest)
{
    DWORD readbytes = 0;
    if (DeviceIoControl(sfdrv, SFD_READ_MSR,
                        &msr, sizeof(msr),
                        dest, sizeof(*dest),
                        &readbytes, NULL) == 0 ||
        readbytes != sizeof(*dest)) {
        return -1;
    } else {
        return 0;
    }
}

int get_coretemp(void)
{
    uint64_t msrdata;
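    if (read_msr(0x19C, &msrdata) < 0) return -1;  /* 0x19C: IA32_THERM_STATUS */
    if (msrdata & 0x80000000) {
        /* Bit 31 set: reading valid. Bits 22:16 are degrees below TjMax,
           which is assumed to be 100 here. */
        return 100 - ((msrdata >> 16) & 0x7F);
    } else  {
        /* Reading not valid */
        return -1;
    }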
}

I'm not using that because the CPU sensor reported by the Dell BIOS gives the temperature of the hottest core. It seems if I read the MSR directly I would have to run the code on each core I want to measure. Calling Dell BIOS once is simpler.
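
The decoding step in get_coretemp above can also be written as a standalone function that doesn't touch the driver, which makes it easy to check against known values. This is a sketch, assuming the hard-coded 100 is the CPU's TjMax. The name decode_therm_status and the tjmax parameter are not part of the original program.

```c
#include <stdint.h>

/*
 * Decode a raw IA32_THERM_STATUS (MSR 0x19C) value.
 * tjmax is the assumed maximum junction temperature in degrees C;
 * the original code uses 100.
 * Returns degrees C, or -1 if the "reading valid" bit is clear.
 */
int decode_therm_status(uint64_t msrdata, int tjmax)
{
    /* Bit 31: reading valid */
    if (!(msrdata & (UINT64_C(1) << 31)))
        return -1;

    /* Bits 22:16: digital readout, in degrees below tjmax */
    return tjmax - (int)((msrdata >> 16) & 0x7F);
}
```

For example, a raw value with bit 31 set and a readout of 30 decodes to 70 when tjmax is 100.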

I hope the code didn't get mangled by Blogger. I haven't written anything here in a long time. Trying to format stuff as code, in a fixed font, made it too wide, so I didn't bother.

Opening a "single use" armband plastic snap together button

As you can see the armband's button was secured by 4 plastic ribs which stick out from the post. The ribs on the post and the profile of the ring act like a wedge, temporarily deforming plastic to allow you to close and latch the button. But, once latched, the slope you would have to work against to unlatch is very steep. So, the button is easy to close but very hard to open via pulling.

In order to open the button less destructively, a pipe is needed, to go in the slot between the post and ring. Finding a pipe of the right diameter and thickness would be difficult. So, I took a sheet of aluminum, cut a small piece, and shaped it into a pipe using a suitably sized wire and pliers. Such aluminum can easily be cut with scissors, though you probably shouldn't use good scissors which you want to keep sharp. Once I made the pipe, I adjusted it, making it a bit smaller by cutting off a tiny sliver with scissors to reduce the circumference and shaping it again with pliers. Then I forced the pipe into the gap between the ring and core.