Thursday, December 31, 2015

APC Back-UPS ES 500 BE500U-CN fluctuation in reported battery voltage due to power switch

I first noticed the problem when the Cinnamon desktop in Linux reported the battery was critically low, and even inappropriately hibernated the system although AC power was available. After rebooting into Windows, PowerChute Personal Edition 3.0.2 also said the battery was at 0%, but failed to report this as a problem.

The UPS also reports battery voltage, but that can only be viewed via apcupsd apcaccess or HWMonitor in Windows. HWMonitor gives faster updates, but it can cause blue screen Windows crashes, even after quitting the program, so I recommend apcupsd. Both programs showed voltage fluctuations, from below 11.1V to 13.3V.

The real battery voltage was 13.60V with the UPS off, and 13.55V with the UPS on. So, the battery and the charger were working properly, and the UPS was measuring an incorrect voltage. The UPS was functional, though the incorrectly low voltage measurements could lead to a short runtime.

I removed the circuit board from the UPS, and only powered it via the battery connections using a bench supply. This is a safe method for testing. The board does not have any high voltage capacitors, and it cannot produce high voltage without connection to the large transformer mounted elsewhere in the UPS case. The small transformer on the board is only used for powering the UPS circuit from AC and charging the battery.

The microcontroller on the board is the Microchip PIC16C745. It has an analog to digital converter, so it's reasonable to suspect that the battery is measured that way. According to the data sheet, only 5 pins are usable as ADC channels. By varying the bench supply voltage while measuring voltage at those pins I saw that pin 3 was probably used to measure battery voltage. R15 and R2 form a voltage divider, and C18 is used to smooth the measured voltage. Fluctuations were also happening on the other end of R15, which is the positive "12V" supply to lots of things on the board. The supply comes from the positive battery terminal via the power switch and one of the transistors above the microcontroller.

I removed C18 just to make sure it is fine, and it was, with extremely low leakage. The problem was the switch for turning on the UPS. Its resistance was unpredictable, sometimes a few ohms and sometimes a few tens of ohms. It only gave zero resistance while I was pushing it in hard.
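
To see why a few tens of ohms in the switch matters, here is a rough sketch of the measurement path. The resistor values and the load current are made-up placeholders, not values read from the board. Only the 13.55 V battery voltage comes from the measurements above.

```python
def supply_after_switch(v_battery, r_switch, i_load):
    """Voltage left after a series switch resistance carrying i_load amps."""
    return v_battery - i_load * r_switch


def divider_output(v_supply, r_top, r_bottom):
    """Voltage at the junction of a two-resistor divider."""
    return v_supply * r_bottom / (r_top + r_bottom)


# Hypothetical values, NOT measured from the board
R_TOP = 30_000.0     # stands in for R15, ohms
R_BOTTOM = 10_000.0  # stands in for R2, ohms
I_LOAD = 0.1         # assumed current drawn through the switch, amps

for r_switch in (0.1, 3.0, 20.0):
    v12 = supply_after_switch(13.55, r_switch, I_LOAD)
    v_pin = divider_output(v12, R_TOP, R_BOTTOM)
    print(f"{r_switch:5.1f} ohm switch: supply {v12:.2f} V, ADC pin {v_pin:.2f} V")
```

With these assumed numbers, 20 ohms in the switch costs about 2 V, which is similar in size to the swing the UPS reported.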

Although it is a latching pushbutton, it works on the principle of a slide switch. It is DPDT, but is used as an SPST switch with the poles connected in parallel. The switch is hard to unsolder due to all the pins, but it is very easy to disassemble by prying the sides to release the bottom with the contacts. Be careful with the sliding contacts. They are tiny, light, delicate, and nothing is holding them in place. I cleaned them by passing paper moistened in isopropyl alcohol inside them. I also cleaned the stationary contacts that way, and cleaned them from the outside with a pen eraser. After that, the resistance was less than 0.1 Ω and the problem is fixed. I hope it stays fixed.

Further comments on the APC Back-UPS ES 500 BE500U-CN

Voltage at the battery terminals goes above 14V when the UPS is AC powered and turned off, and no battery is connected. This would be bad for a battery, but even a 10 kΩ resistor is enough to bring the voltage down within acceptable levels. I don't think this is a problem, because there's probably more than 1.4 mA flowing into a fully charged battery.
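
The 1.4 mA figure is just Ohm's law applied to those two numbers, taking 14 V as a round value:

```python
def load_current_ma(volts, ohms):
    """Current through a resistor in milliamps (I = V / R)."""
    return volts / ohms * 1000.0


# 14 V across the 10 kΩ test resistor
print(f"{load_current_ma(14.0, 10_000.0):.1f} mA")  # prints 1.4 mA
```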

Note the big metal blocks on the circuit board near the power switch. Those are heat sinks for the inverter MOSFETs. A normal heat sink would have fins to radiate heat. The metal blocks mainly provide heat capacity. The run time of the UPS is limited by battery capacity and charging speed. It is okay if those blocks need an hour to cool down, because battery charging takes even longer. Those heat sinks would need to be replaced if you want to use a much higher capacity battery or use the UPS continuously as an inverter.

Monday, December 28, 2015

Not disabling cache before reinitializing TCC76X led to some weird things

My RCA RC3000A digital boombox port of Rockbox started getting unexpected shutdowns at startup after I upgraded it to a newer version of Rockbox. A binary search showed it happened when the audio thread starts, but that made no sense! Simply starting the thread and making it execute a loop which only sleeps was enough to cause shutdowns.

Then I saw that they didn't happen when I loaded via USB boot. It made me suspect that some initialization performed by the Rockbox bootloader caused problems later on when the chip is re-initialized by Rockbox. My first guess was cache, prompted by how ROLO flushes ARM cache but doesn't disable it. Disabling the cache fixed the problem when loading Rockbox via ROLO. It also had to be disabled in another place to fix the bootloader.

The TCC760 uses standard ARM940T cache. I'm not sure if this affects other similar ARM cores or other Telechips SoC families.

Wednesday, December 23, 2015

Switching to tmpfs and unmounting root in Raspbian

This describes how to make a Raspberry Pi running Raspbian use RAM for the root file system. This allows you to do things which cannot be done while the root file system is mounted, such as shrinking it with resize2fs. It would also allow you to write a totally new image to the SD card, overwriting everything. It might also allow you to swap SD cards and write an image to a different card, but I'm not sure that the SD card driver allows it. Here is the script:

# Work from the root directory so the relative paths below resolve
cd /
# pivot_root needs the new root to be a mount point, so put a tmpfs on /mnt
mount -t tmpfs tmpfs mnt
# Copy the minimum needed to run a shell from RAM
cp -r lib bin sbin etc mnt
mkdir -p mnt/usr/lib/arm-linux-gnueabihf
cp usr/lib/arm-linux-gnueabihf/libarmmem.so mnt/usr/lib/arm-linux-gnueabihf
# Move other mounts
mkdir mnt/dev
mount --move dev mnt/dev
# Pivot root using instructions from pivot_root(8) man page
cd mnt
mkdir old_root
pivot_root . old_root
# The current directory now seems invalid, so fix it
cd /
# Old root can only be unmounted once sh running from old root finishes.
# If enough was copied, you could continue startup normally using init.
exec old_root/usr/sbin/chroot . bin/sh

First, save the script somewhere. To use it, add init=/bin/sh to /boot/cmdline.txt and reboot. When the boot text finishes, you may not see the # prompt because of kernel output, but the shell should be running, and you'll get another prompt if you press enter.

You must run the script in the shell which is running as init, not a shell spawned from it. That is because the final line needs to end that shell. You won't be able to unmount old_root if you're still running a shell from it. So, for example if the script is /root2ram, use . /root2ram to run it.

The final line is the place where any errors will manifest themselves. You would get a kernel panic if chroot or sh can't run. If you experience a problem, comment out the last line and look at the state of the system at that point.

The script is intentionally minimalist. All of Raspbian won't fit into RAM, so it only copies some parts. Some binaries in /bin and /sbin require libraries from /usr/lib/arm-linux-gnueabihf/, and only the most commonly used one gets copied. The script does not umount old_root at the end so you can test whatever you want to run and copy anything else you need before manually unmounting it.

This has been tested on a Raspberry Pi 2 B running Raspbian Jessie. If you want to run it on a Raspberry Pi with less RAM, you might need to be more specific when copying from /lib. It is big and there are un-needed things there.

When you are done and you want to boot normally, simply remove init=/bin/sh from /boot/cmdline.txt. You would need to mkdir boot && mount boot. Vi is in /usr/bin, but sed is in /bin and you can use sed -i 's,init=/bin/sh *,,' /boot/cmdline.txt to edit it. Normal rebooting won't work without init, so umount /boot ; sync && reboot -f.

Thursday, December 17, 2015

WM_SYSCOMMAND SC_SCREENSAVE doesn't work anymore for activating the screen saver?

I use HoeKey as my hotkey program in Windows 7. One hotkey is supposed to activate the screen saver:

^`=Wait|1000
=Msg|0|274|61760|0      ; start screensaver


This waits one second to avoid the key release event, and then sends a WM_SYSCOMMAND SC_SCREENSAVE message. It failed a few times in the past, but it generally worked. Now it fails all the time. I wonder why?

This doesn't seem to be a HoeKey problem, because a trivial C program can't activate a screen saver this way either. However, SC_MONITORPOWER always works in both HoeKey and the C program, shutting off the monitor when lParam is 2.
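
For reference, the numbers in the hotkey decode to the usual Windows constants, and a test program along these lines is easy to write. This is only a sketch in Python using ctypes, not the C program mentioned above. Posting to the desktop window is an assumption on my part, as is the helper name post_syscommand.

```python
import sys

WM_SYSCOMMAND = 0x0112    # 274, the message number in the hotkey
SC_SCREENSAVE = 0xF140    # 61760, the wParam in the hotkey
SC_MONITORPOWER = 0xF170  # wParam for monitor power requests


def post_syscommand(command, lparam=0):
    """Post WM_SYSCOMMAND to the desktop window (Windows only)."""
    import ctypes
    from ctypes import wintypes

    user32 = ctypes.windll.user32
    user32.GetDesktopWindow.restype = wintypes.HWND
    user32.PostMessageW.argtypes = [
        wintypes.HWND, wintypes.UINT, wintypes.WPARAM, wintypes.LPARAM]
    user32.PostMessageW.restype = wintypes.BOOL
    hwnd = user32.GetDesktopWindow()
    return bool(user32.PostMessageW(hwnd, WM_SYSCOMMAND, command, lparam))


if __name__ == "__main__" and sys.platform == "win32":
    post_syscommand(SC_SCREENSAVE)         # the request that stopped working
    # post_syscommand(SC_MONITORPOWER, 2)  # lParam 2 turns the monitor off
```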

In the past when it failed I thought it was due to the adaptive screen saver timeout, but disabling that doesn't help.

Emscripten 1.34.2 and later can reveal more alignment problems

Em-DOSBox started getting graphical corruption when built with recent versions of Emscripten. This was due to improper alignment of a statically allocated variable, which was declared as 8 bit values but also accessed as 16 and 32 bit.

Here is the Emscripten change which triggered the problem. It changes the alignment of variables which are allocated at compile time. Previously, they were 64 bit aligned. Afterwards, they are aligned as appropriate for their data type. 8 bit values don't need any alignment, and so they can end up improperly aligned for 16 and 32 bit access. This change is in emscripten-fastcomp, between 1.34.1 and 1.34.2 tags. Variable alignment depends on the build used during the final Emscripten link.

This is not an Emscripten problem, but an issue in DOSBox which started causing problems after that Emscripten change. If you think you may be dealing with an alignment problem, link the program with -s SAFE_HEAP=1. That will cause misaligned access to report an alignment error and throw an exception. You may also want to use a lower level of optimization or at least --profiling-funcs, so it is easier to find where the problem is occurring.
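
The rule being violated is simple: an access of N bytes should be at an address divisible by N. The sketch below illustrates that check with made-up addresses. It is not how SAFE_HEAP is implemented, and checked_access is a name invented for this example.

```python
def checked_access(address, size):
    """Raise if a size-byte access at address is not naturally aligned."""
    if address % size != 0:
        raise ValueError(f"unaligned {size * 8} bit access at address {address}")
    return address


# Hypothetical start addresses for a statically allocated array of 8 bit values
for base in (1024, 1027):
    for size in (1, 2, 4):
        try:
            checked_access(base, size)
            print(f"base {base}: {size * 8} bit access is fine")
        except ValueError as err:
            print(f"base {base}: {err}")
```

An array that starts at a multiple of 8 passes every check, which is why the old 64 bit alignment hid the problem. The same array at an odd address only works for 8 bit access.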

Thursday, December 10, 2015

Western Digital PM2 (power up in standby) was not useful

I have a 1TB WD1002FAEX Western Digital Black which used to be my main drive but is now just a data drive. Operating systems and commonly used data can all fit on the PNY CS1211 240GB SSD. As a result, the 1TB drive doesn't really need to spin up every time the computer turns on. Enabling power up in standby seems like a good idea.

Western Digital drives support power up in standby via a jumper, which they document in a PDF. The drive did not support setting of the feature using software via hdparm -s. This is probably wise. If power up in standby is enabled via software, you could end up in a situation where your operating system doesn't recognize the drive and you can't disable the feature. Such a drive isn't bricked, but you may need to connect it to a different computer running a different operating system in order to rescue it. A jumper is very easy to remove.

After placing the jumper, the drive didn't spin up when I turned on my computer, as expected. The Gigabyte GA-P35-DS3R F13 BIOS identified a hard drive there, but didn't identify what kind and didn't spin up the drive. There was a long delay before booting, but Linux successfully booted from the SSD. Linux spun up the drive and mounted the partitions on there. It complained about failing to spin up the drive, but I didn't notice any actual problems. So far, that is all okay. Then I put the computer to sleep and woke it up, hoping the drive wasn't going to spin up. Unfortunately, Linux spun it up. So, the drive remains usable with Linux, but this feature is pointless because Linux spins up the drive anyways.

Next, I tested Windows 7. It acted as if the drive was totally non-existent. Since it wasn't detected, other software couldn't be used to spin up the drive. This means power up in standby is totally incompatible with Windows 7. It could only work if the BIOS or a 3rd party driver spun up the drive during boot and after waking from sleep.

Wednesday, December 09, 2015

Comparing an Olympus Li-10B battery to a "1400 mAh" replacement

Both of these batteries had to be discarded because they were swelling. The Li-10B was purchased along with my Olympus C-770 Ultra Zoom camera in May 2005. I'm pretty sure the replacement was purchased from DX (DealExtreme) in January 2011.

I wanted to open them up for two main reasons: to see if it's possible to replace the cell, and to examine deception in the replacement. Both batteries were easy to open up by cutting along the seam. There are tabs, but I think the cases were welded or glued. Here's what's inside:
The Olympus Li-10B has a Sony cell inside, and the "1400 mAh" replacement contains a 750 mAh cell. It's probably decent quality, because in June 2013, after almost two and a half years of use, I measured 760 mAh.  The part number seems to be 063040AL.

In the original Li-10B, the cell fills the case. There is only some very thin double-sided tape on one side, and a bit of very thin insulating tape. The replacement contains a thick foam rubber spacer, and a thinner paper spacer. The cell thickness difference is easy to see. This is at least part of why the "1400 mAh" battery was lighter than the Li-10B.

Finally, here are the protection boards. The top is the original, and the bottom is the replacement. In the original board, the small black square is the control chip, and the much larger component is a low Rds(on) MOSFET. I'm not sure about the components in the replacement, or why one of the chips is cracked in a corner.

The replacement never had as much battery life as the original when it was new, and it probably wasn't much better than the already worn out Li-10B. However, at $5.18 US I can't say it was a ripoff, and it was helpful.

Preserving order when importing Internet Explorer Favourites

Internet Explorer stores its favourites as individual shortcut files in %USERPROFILE%\Favorites. Folders of favourites correspond to directories there. This is all easy to deal with, and other browsers can easily import it.

Ordering of favourites within a folder is stored separately, in HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\MenuOrder\Favorites. It uses an undocumented binary format. If you want to import into Internet Explorer elsewhere, you can simply export that branch of the registry using regedit, and then import it to restore the order. Exporting the ordering into other browsers is more difficult.

A CodeProject article explains the registry format. The code there is compatible with Windows XP and Vista, but not later versions. I also had some other problems, which I listed in a comment over there. Nevertheless, the code there was a very helpful starting point and only a few changes were needed.

The problem then is using the bookmarks.xml file, which is in XBEL format. Firefox is incapable of reading that format, and the XBELFox extension didn't work either. Debian had some XBEL tools in the past, but those depend on Python 2.1. I eventually ended up installing Konqueror and exporting to Mozilla format from there. That may seem like a ridiculously inefficient method, but it's much faster than spending more time searching for a suitable format conversion utility. Afterwards, it was easy to use dpkg.log to purge Konqueror and all of the KDE stuff it installed.

Tuesday, December 01, 2015

Windows 7 won't let you delete a hard-linked read-only file

Normally, Windows 7 file management ignores the read-only attribute. You can delete read-only files in Explorer or in cmd.exe command prompt without any need to remove the attribute. There is no additional prompting or notification, so it's effectively as if the attribute wasn't set. Some attempts to overwrite the file can be stopped by the attribute, so the read-only attribute isn't totally pointless.

When a file has hard links (or in other words, is associated with multiple directory entries), weird things happen. It seems as if there is one read-only attribute for the file itself which affects all of the file's directory entries, and an attribute cache associated with directory entries (file names). When you change the attribute for one name, you change it for all names, but attrib will give you incorrect old information for other names. The cache is updated when you attempt to overwrite or delete the file, and attrib will then provide accurate results.

If you try to delete a read-only file with multiple directory entries in cmd.exe, it will fail with "Access is denied." Explorer may fail to delete the file in some situations, incorrectly telling you that the file is in use.

Wednesday, November 11, 2015

Simple Linux boot repair by loading GRUB stage 1 from a file.

When booting on a PC with a legacy BIOS and MBR partitioning, Linux has a problem. At the start of the boot process, the BIOS can only load the first sector from the drive. That's only 512 bytes, and because it contains the partition table, only 446 bytes are available for code which could continue boot-up. The standard MBR code would load the first sector from the active partition, which is once again 512 bytes.

That's not enough space for GRUB, and it's not even enough space for code which could load a file from ext4 or other complex file systems used by Linux. The default solution is to put some more code right after the first sector, in the free space between it and the first partition. The BIOS loads the stage 1 code in the MBR, that code loads the stage 1.5 code which follows, and then finally it can load stage 2 from the Linux file system.

The problem is that it depends on GRUB stage 1 being in the MBR sector. When Windows and other operating systems are installed, they can overwrite stage 1 with their own code. The MBR code installed by Windows is reasonable. It will boot from whatever partition is active. However, if you installed GRUB in the MBR, the Linux partition is not bootable, so that won't help you.

There is plenty of information online on how to fix this by booting from removable media and re-installing GRUB. I'm proposing a much simpler fix: load GRUB stage 1 from a file or write it to the MBR sector yourself. If you're running Windows you can take GRUB stage1 or boot.img files, put them somewhere on the Windows partition, and set up the Windows boot loader to boot them. In Windows Vista and later, you can do this via bcdedit. In earlier versions, it's just a single line of text in C:\boot.ini, like C:\stage1="GRUB stage 1".

If you want to write stage 1 to the MBR sector yourself, you can do that with a disk editor or dd in Cygwin. It's critical to remember to only overwrite the first 446 bytes, so you don't overwrite the partition table! This would mean using bs=446 count=1 arguments with dd. (If you overwrite the partition table, all is not lost though. TestDisk should be able to recover it.) GRUB stage 1 has some code after the first 446 bytes which doesn't get copied, but that code is only needed when booting from floppy. It's also critical to understand where you're writing, because writing to the wrong place could corrupt a file system! It is easier and safer to boot stage 1 from a file and then once booted into Linux use grub-install to do this.
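
The 446 byte rule can be checked on plain byte strings before touching a real disk. This sketch only builds the merged sector in memory. Reading the current MBR and writing the result back is left out on purpose, and merge_boot_code is a name made up for this example.

```python
BOOT_CODE_SIZE = 446  # bytes of boot code before the partition table
SECTOR_SIZE = 512


def merge_boot_code(current_mbr, stage1):
    """Return a sector with stage1's boot code and current_mbr's partition table."""
    if len(current_mbr) != SECTOR_SIZE:
        raise ValueError("current MBR must be exactly 512 bytes")
    if len(stage1) < BOOT_CODE_SIZE:
        raise ValueError("stage 1 image is too short")
    # Bytes 446..511 (partition table and boot signature) come from the disk
    return stage1[:BOOT_CODE_SIZE] + current_mbr[BOOT_CODE_SIZE:]


# Synthetic data standing in for real sectors
old_mbr = bytes([0xAA]) * 446 + bytes(range(64)) + b"\x55\xaa"
boot_img = bytes([0x90]) * 512
new_mbr = merge_boot_code(old_mbr, boot_img)
print(len(new_mbr), new_mbr[446:] == old_mbr[446:])
```

Comparing the last 66 bytes before and after is a cheap way to confirm the partition table survived.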

On a system with multiple operating systems, I prefer installing GRUB in the Linux partition. In that situation, grub-install will complain about blocklists, and --force will be needed to make it install. The problem here is that the location of the file to load next is hard-coded in the boot sector, and if that file moves you can't boot anymore. It hasn't been a problem in practice when booting normally from a Linux partition. It is however a problem when using EasyBCD, because it makes a copy of the Linux boot sector which becomes out of date. My solution there is to chain load the Linux boot sector.

Tuesday, October 13, 2015

Anatomy of a miswired power supply

This Thermaltake power supply was working fine with an IDE/PATA hard drive. Then when I added a SATA SSD, the computer would turn off immediately after turning on. I first thought the motherboard was doing something because of some weird incompatibility, but the same thing happened with the data cable disconnected.

After finding that the SSD doesn't work anymore in another PC, I checked the SATA power connectors. The colours seemed right, but the black wires were at +12V. The other wires were fine. That means the SSD got -7V power instead of +5V, and its ground was at -7V relative to SATA signal ground. Surprisingly, I managed to fix the SSD by bypassing a damaged component. This post is about fixing the power supply.

Here's the power supply label:

It has passed all the quality control checks, and the warranty sticker is intact:


The SATA power connectors look fine based on the wire colours:



Here's an external demonstration that they're not okay, by measuring resistance from a yellow +12V wire to their black wires. This is a cheap multimeter, and less than 3 ohms basically means zero ohms.


Now it's time to open up the PSU. It looks nice inside, without obviously bulging capacitors, and with very little dust considering how long it has been in use:

Note that the two big capacitors at the bottom of the picture could hold a dangerous high voltage charge, which could give you a big shock. The low voltage capacitors at the top should not be a risk.

The circuit board is held in by 4 screws, but the fan also had to be unscrewed to free it more. I couldn't unplug the fan connector, because it was glued.

Here's a closeup of the problem. There are markings on top of the circuit board, indicating areas which connect to a specific voltage. One of the black ground wires connects to the wrong area, among the yellow +12V wires. Further up the wire, you can see heat shrink tubing covering where the wire splits into two wires. That way this one connection goes to both of the SATA ground wires.


There already was an unused hole in the ground area. A high wattage soldering iron made the repair easy.



Putting the power supply back together was a bit tricky. There are several places where things need to interlock properly. Take care around the grommet where the wires come out. The top of the case is supposed to fit into a groove in the grommet, so the wires don't chafe against the metal edge.

Here's a photo of the nicely voided warranty sticker. I still hope Thermaltake will reimburse me for this. Power supplies should not have defects like this, and most users have no way to protect themselves from this. It would be easy to check for miswiring with a tester at the factory.


Finally, here's a photo of the SSD repair. I'm not sure what was damaged, because it's hard to find information on some surface mount parts. I'm guessing it's a regulator that comes before the regulators which provide voltages that the SSD actually needs, providing greater voltage stability. I just bypassed it with a diode, which had to be filed down to fit inside the SSD case.


Monday, July 27, 2015

Intel Rapid Storage Technology 9.5.0.1037 may be incompatible with SSDs

After installing a PNY CS1111 SSD, I started getting occasional hangs in Windows 7. Sometimes shutdown or standby would take way too long, and sometimes a hang would happen during normal use. Often, the hard drive LED would stay on all the time, but the computer would work normally, with no signs of actual constant disk activity. Also, trimcheck 0.7 reported that TRIM isn't working.

I just upgraded Intel Rapid Storage Technology from 9.5.0.1037 to 10.8.0.1003. The hard drive LED is behaving properly now, and trimcheck shows that TRIM works. I haven't gotten any hangs, though it hasn't been long enough to be sure.

There are newer versions of Intel Rapid Storage Technology available, but I don't know of any newer versions which work with the ICH9R on this GA-P35-DS3R motherboard.

Sunday, June 28, 2015

Getting rid of the Tools pane in Acrobat Reader DC

Acrobat Reader is the best PDF viewer. It has the best performance on documents with huge amounts of stuff on a page, such as maps. It also has the fastest searching on documents with huge amounts of text.

Adobe stuffs it with all sorts of crap I do not want in an attempt to sell their online services, but the program starts up reasonably fast despite being bloated. There's just one important annoyance that the preferences can't fix: the Tools pane. It wastes the right side of the screen for a bunch of functions which I never use. It can be hidden, but there's no option to hide it permanently, so it has to be hidden again every time I open a document.

The Tools pane can be disabled by moving or deleting the Viewer.aapp plugin which creates it. For me, it is located at C:\Program Files\Adobe\Acrobat Reader DC\Reader\AcroApp\ENU\Viewer.aapp. In 64-bit Windows it would be in C:\Program Files (x86)\, and the ENU part could be different if you have a different language installed. I created a new folder within ENU and put the file there, so I can move it back if I ever actually need it or if updates break because it's not there. It may be necessary to repeat this procedure after every Reader update.
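
Since this may need repeating after updates, the move can be scripted. This is a minimal sketch: the function name and the "disabled" folder name are arbitrary choices, and on a real install it would probably need to run with administrator rights.

```python
import shutil
from pathlib import Path


def disable_tools_pane(enu_dir, holding_name="disabled"):
    """Move Viewer.aapp into a subfolder of enu_dir and return its new path."""
    enu = Path(enu_dir)
    plugin = enu / "Viewer.aapp"
    holding = enu / holding_name
    holding.mkdir(exist_ok=True)
    # The file is kept, not deleted, so it can be moved back later
    return Path(shutil.move(str(plugin), str(holding / plugin.name)))
```

It would be called with the ENU folder, for example disable_tools_pane(r"C:\Program Files\Adobe\Acrobat Reader DC\Reader\AcroApp\ENU").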

Wednesday, June 10, 2015

AA cell diameter differs


I got some AA to D adapters so I can use AA rechargeables with the RCA RC3000A digital boombox.



The cell I tried was a new Duracell DX1500 pre-charged rechargeable NiMH, rated 2400 mAh. It was a very tight fit, and hard to remove. These adapters lock in the cell when you push it in all the way. You can start removal by pushing on the positive contact, which is like a button. After that, all you can do is pull on the negative side, which sticks out a bit. It's hard to apply a lot of pulling force by hand to a cell that sticks out very little. When I finally got the cell out, I saw the adapter had damaged it.

My first thought was that the inexpensive no-name adapters sucked. Then I tried three other kinds of cells. They all fit nicely. There was a bit of friction with some, but it wasn't a problem. Also, there was no damage. Even the older 2000 mAh Duracell DX1500 (same model as the problem cells) fit properly. Here are the cells. Only the cell at the right has a problem fitting in the adapters.


Does this PNY CS1111 SSD use a SandForce controller?

I recently finally upgraded to an SSD. I'm not too impressed. Some things are much faster, but those are generally rare operations such as rebooting and installing software or updates. Operating system caching and preloading was taking care of common operations. Practically speaking, going from a 160 GB Seagate 7200.7 to a 1 TB WD Black and later upgrading from 2 GB to 6 GB RAM were both more useful.

I chose a PNY CS1111 series 120 GB drive. It doesn't have the fastest read speeds, but it's faster than 3 Gbit/s SATA or 1x PCI Express 1.x, so the GA-P35-DS3R motherboard is the bottleneck.

According to PNY's 2015 SSD Product Comparison PDF, the CS1111 series uses a Silicon Motion SM2246EN controller. However, the SMART attributes don't make sense as SM2246EN attributes, and make sense as SandForce attributes. For example 241 and 242 are definitely measuring gigabytes. So, is PNY's information wrong? Are there multiple versions of this drive with different controllers? PNY's support didn't answer. I don't feel like opening up the drive to see if there's a SandForce controller inside, because that would void warranty. Here are the SMART attributes, as reported by smartmontools.

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0000   100   100   000    Old_age   Offline      -       0
  5 Reallocated_Sector_Ct   0x0000   100   100   000    Old_age   Offline      -       0
  9 Power_On_Hours          0x0000   100   100   000    Old_age   Offline      -       82
 12 Power_Cycle_Count       0x0000   100   100   000    Old_age   Offline      -       85
171 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
172 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      -       0
174 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline      -       2
177 Wear_Leveling_Count     0x0000   000   000   000    Old_age   Offline      -       0
181 Program_Fail_Cnt_Total  0x0000   100   100   000    Old_age   Offline      -       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      -       0
187 Reported_Uncorrect      0x0000   100   100   000    Old_age   Offline      -       0
194 Temperature_Celsius     0x0000   033   100   000    Old_age   Offline      -       33
195 Hardware_ECC_Recovered  0x0000   100   100   000    Old_age   Offline      -       1262
196 Reallocated_Event_Count 0x0000   098   098   003    Old_age   Offline      -       0
201 Unknown_SSD_Attribute   0x0000   100   100   000    Old_age   Offline      -       0
204 Soft_ECC_Correction     0x0000   100   100   000    Old_age   Offline      -       0
230 Unknown_SSD_Attribute   0x0000   100   100   000    Old_age   Offline      -       39
231 Temperature_Celsius     0x0000   100   100   010    Old_age   Offline      -       33
233 Media_Wearout_Indicator 0x0000   000   000   000    Old_age   Offline      -       0
234 Unknown_Attribute       0x0000   000   000   000    Old_age   Offline      -       1
241 Total_LBAs_Written      0x0000   000   000   000    Old_age   Offline      -       125
242 Total_LBAs_Read         0x0000   000   000   000    Old_age   Offline      -       52
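
To compare raw values between drives or over time, a few lines of Python can pull them out of this kind of output. This sketch assumes each attribute is on a single line, as smartctl -A prints it in a wide enough terminal, and it skips rows whose raw value is not a plain number. Treating 241 and 242 as gigabytes is my reading of the data, not something the tool reports.

```python
def raw_values(table_text):
    """Map SMART attribute ID to raw value from smartctl -A style rows."""
    values = {}
    for line in table_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns and start with a numeric ID
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            values[int(fields[0])] = int(fields[9])
    return values


sample = """\
241 Total_LBAs_Written      0x0000   000   000   000    Old_age   Offline      -       125
242 Total_LBAs_Read         0x0000   000   000   000    Old_age   Offline      -       52
"""
raw = raw_values(sample)
print(f"written: {raw[241]} GB, read: {raw[242]} GB")
```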

If the Malicious Software Removal Tool won't go away, next month's updates might fix it

In May I stopped installation of Windows 7 updates because one simple update seemed to be taking a long time with no activity. After that, the May 2015 Malicious Software Removal Tool would not go away. It would install successfully every time, writing to C:\Windows\debug\mrt.log and setting HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\RemovalTools\MRT\Version to the proper GUID. Nevertheless it would reappear after the next check for updates.

There were various ideas online, and I tried everything except deleting C:\Windows\SoftwareDistribution. Nothing helped. Eventually I just created a DWORD at HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\MRT\DontOfferThroughWUAU, setting it to 1 to disable offering of the tool via Windows Update. You cannot simply hide the update, because then last month's tool is offered instead, and so on.

I just removed that setting, wondering if the June updates might fix it. They did. I don't know what happened. Maybe the June Malicious Software Removal Tool set something that Windows Update finally recognized. All I know is I spent way too much time on this little problem.

BTW There is some useful information in KB891716.

Monday, May 11, 2015

Performance of the Em-DOSBox CPU interpreter

When I first got DOSBox to run in a web browser, performance was terrible. The problem was the CPU interpreter. A single function fetches, decodes and executes most x86 instructions. Most of the function consists of a big switch statement with many cases. It is big because there are many x86 instructions.

The first problem was Emscripten converting the switch statement into a long chain of else if comparisons. Actually, a switch statement was used, but in most cases it merely set a variable which was later tested via comparisons. Instruction decoding, which needs to be done for every instruction, changed from O(1) into O(n).

Emscripten could generate a much better switch statement with a patch. This made DOSBox run fast in Firefox, but it was much too slow to use in Chrome. When I profiled it, I saw a warning triangle by the CPU interpreter function, telling me it's not optimized because the switch statement is too big. There was already a V8 bug filed about this issue.

I solved this problem by transforming the cases of the big switch statement into functions using extractfun.py. This reduces function size and allows a function pointer to be used instead of a switch statement. The process is somewhat convoluted because the switch statement is normally built using the preprocessor. First, preprocessor output files need to be produced. In order to get Automake to create them using proper dependencies, it needs to create a library which is otherwise unnecessary. Then the Python script parses the preprocessor files. It stores functions into a function store, removing duplicates and fixing name collisions. Finally, it creates header files which are used when building the final version of the CPU interpreters. Three CPU interpreters are processed this way: the simple, normal and prefetch cores.
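The idea of replacing one huge switch with a table of small functions can be shown with a toy interpreter. This is not DOSBox code and not what extractfun.py generates; the opcodes, the State class and the handler names are invented for illustration. In C the table would be an array of function pointers.

```python
# Toy interpreter: every opcode handler is its own small function, and
# decoding is a single table lookup instead of one giant switch statement.
# Opcodes and State are made up; they are not x86.

class State:
    def __init__(self):
        self.acc = 0        # accumulator
        self.pc = 0         # index of the next opcode
        self.halted = False


def op_inc(state):
    state.acc += 1


def op_dec(state):
    state.acc -= 1


def op_double(state):
    state.acc *= 2


def op_halt(state):
    state.halted = True


# Index = opcode value. Each entry is small enough for a JIT to optimize.
HANDLERS = [op_inc, op_dec, op_double, op_halt]


def run(program):
    """Execute a sequence of opcodes and return the final accumulator."""
    state = State()
    while not state.halted and state.pc < len(program):
        opcode = program[state.pc]
        state.pc += 1
        HANDLERS[opcode](state)  # constant-time dispatch
    return state.acc
```

For example, run([0, 0, 2, 1, 3]) increments twice, doubles, decrements and halts. The point is that no single function grows with the number of instructions; only the table does.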

Since then, the Emscripten bug has been fixed, I assume by the switch to Fastcomp. When Chrome started using TurboFan for asm.js, it could finally get good performance with an un-transformed CPU interpreter. This led me to check whether extractfun.py is still necessary.

Safari 8.0.6 and Internet Explorer 11 still get terrible performance without the transformation. Use of --llvm-opts '["-lowerswitch"]' doesn't seem to help. Looking at the JavaScript, I can confirm that it changes the big switch into a binary search, so this probably means the problem is due to the size or complexity of the function, and not just due to switch statements. I also experimented with the Emscripten outlining limit, with or without -lowerswitch. I assume that transforming switch cases into functions is a more efficient split than what's done by Emscripten outlining.

Friday, May 01, 2015

Gigabyte GA-P35-DS3R enables wake on any unicast packet by default

Recently I found that if I totally cut power to my computer (including standby power), booted to Linux and went into suspend, it would wake unexpectedly very soon. Booting into Windows would prevent this problem from happening until power is totally cut again. I suppose this problem existed before, but I never noticed it because I rarely fully cut power and I was using Linux less.

At first I thought this was a Linux bug, but actually, it is the result of a crazy default setting for wake on LAN. By default, wake on unicast packet and magic packet are both enabled. If there was some network activity which caused ordinary unicast packets to arrive while the computer was sleeping, it woke up. That's why I found that with a minimal X setup using twm, the wakeups only happened if I was running a web browser.

This setting can be seen by running sudo ethtool eth0. Its output included:
        Supports Wake-on: pumbg
        Wake-on: ug

From the ethtool man page:
              u Wake on unicast messages
              g Wake on MagicPacket™

The solution was adding ethtool -s eth0 wol d to /etc/rc.local to disable wake on LAN. Then sudo ethtool eth0 would report Wake-on: d, which means "Disable (wake on nothing)". It's possible to also use g instead of d, which should only enable wake on magic packet.
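To check the setting from a script, the Wake-on line can be pulled out of the ethtool output and decoded. This is a rough sketch: the function names are my own, and the flag table follows the letters listed in the ethtool man page.

```python
import re

# Wake-on flag letters as described in the ethtool man page.
WOL_FLAGS = {
    "p": "PHY activity",
    "u": "unicast messages",
    "m": "multicast messages",
    "b": "broadcast messages",
    "a": "ARP",
    "g": "MagicPacket",
    "s": "SecureOn password for MagicPacket",
    "d": "disabled (wake on nothing)",
}


def parse_wake_on(ethtool_output):
    """Return the active wake-on flag letters, e.g. ['u', 'g']."""
    # Anchor at line start so "Supports Wake-on:" is not matched by mistake.
    match = re.search(r"^[ \t]*Wake-on:[ \t]*(\S+)", ethtool_output, re.MULTILINE)
    if match is None:
        return []
    return list(match.group(1))


def wakes_on_unicast(flags):
    """True if ordinary unicast traffic can wake the machine."""
    return "u" in flags


def describe(flags):
    """Human-readable descriptions for a list of flag letters."""
    return [WOL_FLAGS.get(flag, "unknown flag " + flag) for flag in flags]
```

Feeding it the output shown above gives the flags u and g, and wakes_on_unicast() returns True, which is exactly the condition that caused the unexpected wakeups.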

The GA-P35-DS3R rev 1.0 motherboard F13 BIOS does not seem to have any options for changing wake on LAN settings, so this seems to be the only way to do it. I had already disabled wake on LAN in Windows via the Advanced settings in Device Manager. That setting persisted until standby power was cut.

It sure seemed like a Linux bug at first, so here's the Ubuntu bug I reported. Now I just wish Linux would tell me the wake reason. If something told me these wakeups were a result of wake on LAN, I would have wasted a lot less time on this.

Thursday, April 30, 2015

Sleep and wake notifications with systemd

After I upgraded to Ubuntu 15.04 Vivid Vervet, the Audacious plugin I used to control a display device I built had a problem. Functionality related to sleep and wake didn't work anymore. That's because UPower doesn't handle sleep and wake anymore. Notification events instead come from systemd-logind. In particular, it is the "PrepareForSleep" signal on the "org.freedesktop.login1.Manager" interface. When its argument is true, the system is preparing to go to sleep, and when it's false, the system is waking up. Here's the new code.
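The plugin itself is not written in Python, but the signal is easy to observe from a short script. This sketch assumes the dbus-python and PyGObject packages are installed; the bus name org.freedesktop.login1 and object path /org/freedesktop/login1 are the standard logind ones, and the function names are made up.

```python
# Sketch: listen for logind's PrepareForSleep signal on the system bus.
# Not the plugin's code; requires dbus-python and PyGObject when main() is called.

def describe_prepare_for_sleep(going_to_sleep):
    """Map the signal's boolean argument to the event it announces."""
    return "going to sleep" if going_to_sleep else "woke up"


def on_prepare_for_sleep(going_to_sleep):
    print(describe_prepare_for_sleep(bool(going_to_sleep)))


def main():
    # Imported here so the helpers above work without these packages.
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SystemBus()
    bus.add_signal_receiver(
        on_prepare_for_sleep,
        signal_name="PrepareForSleep",
        dbus_interface="org.freedesktop.login1.Manager",
        bus_name="org.freedesktop.login1",
        path="/org/freedesktop/login1",
    )
    GLib.MainLoop().run()  # blocks; call main() to start listening
```

Calling main() and then suspending the machine should print one line before sleep and one after wake.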

I still need to find a way to determine the last wake time. Formerly I was using the /var/log/pm-suspend.log modification time, but that is also now unavailable because systemd is handling it instead. If the plugin observes a wakeup, it gets the correct time from that, but when started it needs another method.

This illustrates a major difference between Windows and Linux. The Windows version of the plugin can still use the WM_POWERBROADCAST message, with parameters dating from Windows 2000. Linux is a moving target, and things will break unless you keep updating them. This probably also extends to Linux application APIs.

Tuesday, April 28, 2015

CPUID HWMonitor 1.20 causes blue screens

32-bit Windows 7 is normally totally stable for me. Today I was running CPUID HWMonitor 1.20 to troubleshoot a UPS problem, and I got two blue screens. One was BAD_POOL_HEADER (19) and the other DRIVER_CORRUPTED_EXPOOL (c5). Both of these point to memory corruption.

HWMonitor is a nice utility which shows temperature, voltage and other measurements. It supports the motherboard, hard drive, graphics card and UPS. However, I've seen blue screens after using it in the past, and this experience further confirms that HWMonitor causes blue screens.

I just upgraded to version 1.27. I hope that version will work better.

Edit: Nope, it's not fixed. I didn't trust version 1.27, so wanted to restart after using it. I got another memory corruption crash on shutdown.

I wonder what's wrong with this APC BE500U-CN Back-UPS ES 500?

I saw Cinnamon claiming that the UPS is empty (0%), which made me assume there was some bug in Linux. It even hibernated the PC even though AC power wasn't interrupted, which is definitely a bug. Then I double-checked via apcupsd in Linux, and then in Windows, with PowerChute Personal Edition 3.0.2 and CPUID HWMonitor 1.20.0. The UPS was definitely claiming it was empty. However, PowerChute said it was operating normally and charging.

HWMonitor provides the most interesting information. It constantly updates the battery voltage and levels, and keeps minimum and maximum values. The voltage was fluctuating from below 12 to above 13. That does not seem like proper charging. When I took the battery out, it was around 12.6. My first thought was that there was a bad capacitor causing excessive ripple and giving bad readings. This led me to open up the UPS:



I saw high ripple at the leftmost capacitor, near the red wiring fault light and the heat sinks. After replacing that 220µF 50V capacitor, ripple went down from 5V to 3V. I'm not sure that this helped though. When it started charging, HWMonitor reported voltage steadily rising toward 13.50V. It stayed stable for a while. Then it fluctuated, even going down to 11.54V. Now, the voltage is stable again, around 13.26V, and percentage is rising.

The stability makes me think the fluctuations were the firmware interrupting charging and doing some discharge tests. Surely 11.54V is too low though. Maybe the battery is bad and the firmware isn't smart enough to report that. I saw it quickly go below 12V with a car headlight, so it probably is worn out.

Sunday, April 26, 2015

Goodbye KDE!

I switched to KDE when GNOME 3 was released. Over time, version 4 generally got better and less buggy. I was satisfied and even happy with it. Now Plasma 5 seems to have thrown away a lot of that progress. There are far more bugs and fewer features. It's not as bad as the GNOME 3 change, but Plasma 5 is bad enough that I don't want to use it.

It's tragic how these free software desktop environments get to be good and then revert to a state which should be called an alpha release. Of course GNOME 2 and KDE 4 had some limitations and disadvantages. A big bold change can help there, but I think the only way to truly improve is to slowly evolve into something better. Even Microsoft can't successfully make sudden big changes.

GNOME 3 has improved in the meantime. I still think the window switching, application launcher and top bar are inefficient and limiting, and it wastes screen space. So, I definitely won't be choosing GNOME 3. MATE reminded me that things have improved since GNOME 2, and I don't want that either.

Cinnamon seems good now. It seems to be a combination of the best of GNOME 2 and GNOME 3. Its web page may be unimpressive, but the software works well and has enough features. It seems significantly faster than KDE, and I'm forced to install a lot less stuff. Losing KDE 4 was annoying at first, but now it seems I'm switching to something even better.

Thursday, April 23, 2015

My favourite music visualization program running in a web browser

Synaesthesia is my favourite music visualization program. I created an updated Windows port and also ported it to Emscripten. The code is on GitHub. Here is a screen shot, but you have to see it in action to really appreciate it.


Click here to run the program. Then start visualization by dragging an audio file from a file browser window on your computer to the web page. No information is sent over the network. The file is played by the web browser and visualized by asm.js running in the browser.

If you find that Firefox can't play MP3 files in Linux, install gstreamer1.0-fluendo-mp3.

Optimizing Emscripten SDL 1 settings

Emscripten's SDL tries to emulate desktop SDL. This involves some costly operations which many programs don't need. Performance can be improved by changing settings to prevent those operations.

Consider the multiple copies

The image exists in 3 places in the web page: program memory, canvas ImageData, and the canvas element. Normally, SDL_LockSurface() copies from the canvas, to ImageData and then to program memory, and SDL_UnlockSurface() copies from program memory to ImageData and to the canvas. Conversion may be needed between program memory and ImageData.

SDL.defaults.copyOnLock = false

SDL_LockSurface() will copy from canvas to ImageData but not from ImageData into program memory.

SDL.defaults.discardOnLock = true

SDL_LockSurface() will use createImageData() once to initially create ImageData, and never copy from the canvas. Copying from ImageData to program memory is prevented regardless of SDL.defaults.copyOnLock.

SDL.defaults.opaqueFrontBuffer = false

With normal SDL you can write only the RGB values and get opaque pixels of the requested colour. Canvas pixels also have an alpha value, which needs to be set to 255 to make pixels fully opaque.

Normally, both SDL_LockSurface() and SDL_UnlockSurface() set alpha values in ImageData to 255. This option prevents those operations. With it, the SDL_HWPALETTE 8 bpp mode works normally, but otherwise your code that writes pixels into memory must set the alpha values. You can simply bitwise or pixels with Amask from the surface SDL_PixelFormat.
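The bitwise or is trivial, but here it is spelled out. This is only an illustration of the arithmetic in Python rather than C; the mask value assumes alpha lives in the top byte of a 32 bpp pixel, and real code should read Amask from the surface's pixel format instead of hard-coding it.

```python
# Assumed layout for illustration: alpha in the top byte of a 32-bit pixel.
# In a real program, take this value from the surface's SDL_PixelFormat (Amask).
AMASK = 0xFF000000


def make_opaque(pixel, amask=AMASK):
    """Set every alpha bit so the canvas shows the pixel fully opaque."""
    return (pixel | amask) & 0xFFFFFFFF


def fill_opaque(pixels, amask=AMASK):
    """Apply make_opaque() to a whole row or buffer of pixels."""
    return [make_opaque(pixel, amask) for pixel in pixels]
```

A pixel written as 0x00336699 becomes 0xFF336699. Without this step the canvas would treat the pixel as fully transparent.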

Use SDL_HWPALETTE flag for 8 bpp modes

It's possible to use 8 bpp without SDL_HWPALETTE. However, that uses less optimized code when converting to 32 bpp for the canvas, and doesn't work with SDL.defaults.opaqueFrontBuffer = false.

SDL_HWPALETTE requires that SDL_LockSurface() copying is disabled. 8 bpp modes without the flag don't have that requirement, but you'll end up with 32 bpp RGB values copied back, which you probably don't want.

Module.screenIsReadOnly = true

This prevents SDL_LockSurface() copying. You could use it instead of SDL.defaults.discardOnLock = true. The only difference is that ImageData is copied from the canvas the first time SDL_LockSurface() is called instead of being created via createImageData(). It's probably better to use the SDL.defaults options instead because they're better documented and better named.

Sample code

Here is a code fragment which uses the recommended optimization settings and enters 8 bpp mode in the recommended way. Some of this is redundant as noted above, but there's no harm in that.

EM_ASM(
    SDL.defaults.copyOnLock = false;
    SDL.defaults.discardOnLock = true;
    SDL.defaults.opaqueFrontBuffer = false;
);
surface = SDL_SetVideoMode(WIDTH, HEIGHT, 8, SDL_HWPALETTE);

Wednesday, April 22, 2015

Compiling the iPodLinux 2.6 kernel

Normally, iPodLinux uses a Linux 2.4 kernel. The SVN repository on SourceForge also has a 2.6 kernel. It's an abandoned work in progress which lacks some drivers and support for PP502x iPods. Nevertheless, it may be a good starting point.

The port is based on Linux 2.6.7 with linux-2.6.7-hsc0.patch.gz applied. (This is the MMU-less ARM patch. It was mainlined in later 2.6 kernels.) Then, the iPodLinux files need to be copied into that tree, overwriting some Linux files.

This runs into several problems. Building with the old GCC 2.95.3 toolchain fails because the assembler is too old. With arm-uclinux-elf-tools.tar.bz2, the assembler is fine, but old GCC options are used, which aren't accepted by GCC 3.4.3. There are also two other small errors, and one error preventing make menuconfig. I'm distributing a fixed version in the form of a bare Git repository. It successfully builds linux.bin, which I didn't try to run because I don't have an old PP5002 based iPod.

One build problem remains: make always rebuilds everything. I'm not sure what's going on. If I run make -n --trace after successfully building linux.bin, I see a lot of stuff being re-built due to FORCE, as if this is intentional.

Tuesday, April 21, 2015

Audacious plugin for opening folder where files are located

I've been using Winamp for a long time. With a stripped-down configuration and the 2.x interface it is perfect. It supports a lot of formats, plugins and a playlist. With a nice skin it looks good, but takes up very little screen space and resources. I'm used to playing files from folders, and not using any sort of database.

Audacious seems to be the best replacement for Winamp in Linux these days. Its default interface is too big and plain, but it supports Winamp 2.x skins. It also uses plugins to implement various functionality and has an even better plugin API than Winamp.

There didn't seem to be any equivalent of the Winamp "Find File on Disk" plugin, so I created one. This gist contains the source code. To use it, right click on a file in the playlist, go into the "Services" sub-menu and select "Open File Location". You can also select multiple files in the playlist, and they will all be selected in the file manager.

Right now the plugin is written to use the KDE Dolphin file manager, which offers the nice --select flag. I'm not sure how to make this automatically work well with any file manager.
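What the plugin asks Dolphin to do boils down to one command line. This is a sketch of that step only, with made-up function names, and it assumes the playlist entries have already been converted to plain file paths.

```python
import subprocess


def dolphin_select_command(paths):
    """Build the argv that opens the containing folders and selects the files."""
    if not paths:
        raise ValueError("need at least one file to select")
    return ["dolphin", "--select", *paths]


def open_file_locations(paths):
    """Launch Dolphin without waiting for it to exit."""
    # Popen returns immediately, so the caller is not blocked while Dolphin runs.
    return subprocess.Popen(dolphin_select_command(paths))
```

Supporting another file manager would mean swapping out dolphin_select_command() for whatever selection option that program offers, if it has one.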

Zooming out in Google Maps and Earth without multitouch

The Android versions of Google Maps and Google Earth are optimized for multi-touch. Double clicking zooms in a set amount, but the way to zoom out with a mouse isn't obvious.

To zoom in or out, double click, but don't release the second click. Keep holding the button and drag the mouse up or down to zoom in or out.

Getting iPod hard drive SMART data via iPodLinux

Hard drives found in iPods generally support SMART. It is useful for checking drive health and running tests. The flash-based diagnostic mode reports only a few of the attributes. Rockbox lacks support for USB mass storage features which would allow programs running on a computer to use iPod hard drive SMART functions. I believe the original firmware's disk mode and the emergency disk mode in flash also lack this support.

Fortunately, there's iPodLinux. With some changes it's possible to cross-compile smartmontools for iPodLinux and run it there. Here's a link to the modified smartmontools-6.1 source with the smartctl binary inside. I did this quickly a while ago. I'm sure it's possible to port smartmontools more elegantly, but this works.

I find iPodLinux text input annoyingly inefficient, so I prepared three text files with these common commands:

smartctl -d ata -a /dev/hda > sao
smartctl -d ata -t long /dev/hda
smartctl -d ata -t short /dev/hda


You must specify -d ata to tell smartctl what type of interface to use. I'm sure it's possible to improve the port to remove that requirement.

Wednesday, April 08, 2015

Porting Rockbox to the RCA RC3000A digital boombox

This is an old post that I didn't get around to completing, and kept as a draft. It talks about how I ported Rockbox to the RCA RC3000A digital boombox. Code is available on GitHub.

Summary

First I opened it up and noted the chips inside. Some already have Rockbox drivers. Then, I figured out how to use USB boot mode to run code using tcctool with slight modifications. I used code to dump the firmware, outputting data via a GPIO pin and reading it via the sound card. Then I did some reverse engineering, figured out various functionality, got a Rockbox bootloader running, got Rockbox running and finally made playback work.

The LCD

First, I wanted a better form of feedback than GPIO pin toggling, so I figured out the LCD. The LCD command and data writing routines can be reached by following subroutine calls from a function that displays strings on the LCD. Then, other functions can be identified which use these functions to do stuff with the LCD. The initialization sequence was interesting. Instead of writing register indexes and then writing data to those registers, some commands have a command number in the high bits and data in the lower bits. I identified the controller by searching through Rockbox LCD driver files for | (which is bitwise or in C). The lcd-archos-bitmap.c driver for the graphical LCD on old Archos devices seemed like a match, so the LCD controller is probably a SSD1815. However, the initialization routine uses different values because the LCD panel is different, and it's important to use those values.

The call to the LCD initialization routine was surrounded by other interesting and relevant code. Just before the initialization, a call to a long function set the TCC760 chip select register for access to the LCD. Instead of trying to understand the function, I simply executed it in ARMSim to get the value. There was also a nearby GPIO call which controls the backlight.

Now it was possible to execute some code and write to the LCD. I thought building even a Rockbox bootloader would be difficult, so I just used a modified lcd-archos-bitmap.c file. This worked, but there was nothing interesting to display. I also verified that the nearby GPIO controlled the backlight. This was tricky, because although the CPU and LCD can run from USB power only, the backlight requires external power.

Building a Rockbox bootloader

This success made me want to try running more of Rockbox. I first decided to build a bootloader, because it uses only selected components of Rockbox. The Rockbox Wiki provides some basic information about how to create a new target. I started off by copying files from the Sansa C100, which uses the related Telechips TCC770 CPU. There were plenty of things to edit and rename, but it was all fairly straightforward, and error messages can serve as a guide showing what needs to be done.

Uploading code to SDRAM

The resulting bootloader was too big for loading in TCC760 internal SRAM, so I needed to use DRAM. Unlike with later chips, USB boot mode doesn't provide a way to set the SDCFG register. So, I used a small bit of assembler code to set SDCFG and then jump back to 0 to restart USB boot mode and enable uploading of a longer program to SDRAM. It was easy to get the bootloader running, but timer interrupts took a bit more work. The TCC760 is similar to TCC770, but there are some key differences with some peripheral register bits.

Figuring out the buttons

The bootloader allowed viewing of GPIO and ADC values which helped me figure out the buttons and some other values. The power/play/pause button is connected to its own GPIO pin, but all the other buttons are connected to one pin, with each connecting it to ground via a different value resistor. This means the ADC must be used to read these buttons. I ended up finding the original firmware routine which reads buttons and using its thresholds between different buttons. I suppose that's better than using measured values, because the measured values are affected by resistor tolerances.
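Decoding a resistor-ladder button input comes down to comparing the ADC reading against a sorted list of thresholds. The sketch below shows the technique only: the threshold numbers and button names are hypothetical, not the values from the original firmware, and a 10-bit ADC is assumed.

```python
# Hypothetical thresholds for a 10-bit ADC (0..1023). The real values come
# from the original firmware's button routine and are not reproduced here.
# Each entry is (upper limit, button name), sorted by upper limit.
BUTTON_THRESHOLDS = [
    (100, "stop"),
    (300, "previous"),
    (500, "next"),
    (700, "menu"),
    (900, "volume"),
]


def decode_button(adc_value):
    """Return the button whose range contains adc_value, or None if idle."""
    for upper_limit, name in BUTTON_THRESHOLDS:
        if adc_value < upper_limit:
            return name
    return None  # reading is near full scale: no button is pressed
```

Placing the thresholds between the nominal readings, as the original firmware does, leaves a margin on both sides of each button, so resistor tolerances don't cause wrong matches.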

SD card access

The RC3000A has 512MB of internal NAND flash storage, and an SD card. I chose to first use the SD card, because that seems safer. When I viewed GPIO values, I found pins for SD card write protect and SD card detection. I used these to find the SD card code. ARM code typically loads a base address into a register and then accesses a memory mapped peripheral register using that register plus an offset encoded into an instruction. In this case, 0x80000000 is always loaded into the register, so the offsets are predictable and they can be used to search disassembled code for GPIO access with few false matches. There is one unfortunate complication though. All the code I've seen makes the GPIO port obvious, but some uses a value loaded from RAM to define the bit.

I assumed that the SD card uses bit banging because there is no SD controller and USB access to the card is ridiculously slow. I did find bit banging code, but actually the SD card is hooked up to GSIO0, and there is also code for that. I chose to initially use bit banging code, because it is simpler to be sure I'm doing it correctly. GSIO could have left me wondering if the problem was the way I'm using GSIO or the interface to the SD card.

SD card bit bang SPI interface code can be used to add SD cards to routers, and source is available there. However, I chose to start with code from within Rockbox, from the Archos Ondio MMC card driver. Its structure more closely matches my intended final structure, with GSIO and hopefully also DMA.

After learning about the SD card protocol and commands, I first worked on getting initialize_card() working. At first this seemed futile, but then I added the initial synchronizing code sending 0xFF bytes with chip select not asserted and it worked. Then I just had an endianness issue in card_extract_bits() and after that the function succeeded. Then it was simple to read a sector.
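The start of SPI-mode initialization can be sketched as data: first the dummy 0xFF bytes clocked out with chip select not asserted, then the CMD0 frame. This is not the Rockbox driver code; the function names are my own, and it only builds the bytes rather than talking to hardware. The SD specification asks for at least 74 clocks before the first command, so 10 bytes are used here.

```python
def crc7(data):
    """CRC-7 used by SD/MMC command frames (polynomial x^7 + x^3 + 1)."""
    crc = 0
    for byte in data:
        for bit in range(7, -1, -1):
            crc <<= 1
            # XOR in the polynomial when the incoming bit differs from the
            # bit just shifted out of the 7-bit register.
            if ((byte >> bit) & 1) ^ ((crc >> 7) & 1):
                crc ^= 0x09
            crc &= 0x7F
    return crc


def sd_command(index, argument=0):
    """Build a 6-byte command frame: start bits, index, argument, CRC, stop bit."""
    frame = bytes([0x40 | (index & 0x3F)]) + argument.to_bytes(4, "big")
    return frame + bytes([(crc7(frame) << 1) | 1])


def init_sequence():
    """Return (dummy clocks, CMD0 frame) for the start of initialization."""
    dummy_clocks = b"\xff" * 10  # 80 clocks, sent with chip select NOT asserted
    return dummy_clocks, sd_command(0)  # CMD0 = GO_IDLE_STATE
```

The CMD0 frame comes out as 40 00 00 00 00 95, the familiar reset command. Note that the 32-bit argument is sent most significant byte first.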

Building Rockbox, adding functionality, and getting sound

(Post was interrupted here and continued much later.)

Once SD card reading was available and mountable, it seemed like a good time to run a full build of Rockbox. Since this includes files and functionality not compiled into the bootloader, some work was needed to get it to compile. Then it was time to add more functionality.

First I added I2C support, testing it using the E8564 RTC chip, which Rockbox already supports. The RC3000A uses purely software based bit-banging I2C, although the TCC760 chip does contain an I2C controller. 

I wanted to first set up the CODEC for FM radio playback, which should be simpler than music playback. At first I thought I could use the CS42L55 code present in Rockbox, but the CS42L51 is significantly different, so I ended up creating a new driver using parts of the Linux driver. I got the radio working using the pre-existing TEA5767 driver. Only the headphone jack was usable, because the amplifier driving the speakers was off.

Then it was time to do various tweaks to CODEC and I2S configuration and make playback work. However, the result was interrupted playback, because the device was too slow. I enabled the CPU caches, and started using GSIO for SD card access, and sped up the SD clock. Finally, I got perfect playback of a FLAC file.

Cleanups, tweaks and optimizations

There was still a lot of optimizing to do. The fast internal RAM in the TCC760 could be used to speed up code. Thumb and INIT_ATTR could free up memory. The SD driver needed more work. DMA could be used for playback.

I spent a lot of time trying to get DMA to work with GSIO for SD card accesses, but it seemed impossible and eventually I gave up. I optimized things the best I could with 16 bit GSIO transfers.

Getting USB to work was interesting, because I had never worked with a USB device before.

There were a lot of features to support. Rockbox includes various LCD settings such as flip and inverse, and a way to display greyscale on an LCD which doesn't officially support it. I also enabled sound bass and treble setting via the codec and backlight PWM fading.

Flashing the bootloader

So far, Rockbox could only be loaded via USB boot. It was time to flash a Rockbox bootloader with dual-boot support, allowing for choosing between original firmware and Rockbox. I was a bit anxious because this could brick the device, but there was no real danger because USB boot should allow recovery.

Initially, I was thinking I would create a flash plugin and use it to flash the bootloader. However, I ended up creating a firmware upgrade file and flashing it via the original firmware. My first attempt allowed access to original firmware, but failed to run Rockbox. After adding some short initialization code from the original firmware I could load Rockbox.

The end?

Once I had a working Rockbox port, I stopped working on this. The RCA RC3000A devices are probably quite rare, and I've never encountered anyone else interested in running Rockbox on one. It doesn't even seem like there's any interest in running Rockbox on TCC76X devices. Therefore, I don't think adding this port to Rockbox would benefit anyone.

One thing that still interests me is making use of the OHCI USB host controller on the device. The original firmware supports it but fails badly when accessing a large device because it tries to scan all the files and keep a list in memory. I wonder if there's a good free software embedded USB stack? The SeaBIOS code might be useful, but it is GPL 3 and Rockbox is GPL 2.

I've also thought about adding an ESP8266 module and creating a Rockbox plugin for playing Internet radio. Such a plugin could also be used with other targets with externally accessible serial ports, such as the iPods.

Sunday, April 05, 2015

Where is the iPodLinux kernel source?

iPodLinux is a Linux distribution for iPods based on PortalPlayer chips: the old hard disk based iPods up to the 5th generation, and the 1st generation Nano. It's actually a port of uClinux, because the processor lacks the memory management hardware needed for ordinary Linux. iPodLinux consists of a kernel port, the Podzilla application which provides an iPod-like interface, and various modules for Podzilla which are in effect small programs.

I have iPodLinux installed on my 5th generation iPod. It's interesting but not very useful for me. I normally run Rockbox, and I've only used iPodLinux for running Smartmontools smartctl to check hard disk health. Nevertheless, I'm interested in learning more about the Linux kernel, so I just spent some time tracking down the kernel source code.

Source locations

The old home page at www.ipodlinux.org has been down for a long time. Don't bother trying Archive.org, because the site wasn't archived due to its robots.txt.

There is an ipodlinux SourceForge project. Its files section contains patches and binaries. The latest 2.4.24-ipod2 version there is old and does not contain PP502x code needed for 5th generation iPods. If you want to use it, get linux-2.4.24, apply uClinux-2.4.24-uc0.diff and then apply the ipod2 patch. You can build with the arm-elf-tools-20030314.sh toolchain.

The same SourceForge project also has a SVN repository which has a later version. The kernel was initially at trunk/linux/, and then r2251, "The Great Befuddlement", moved it to linux/2.4/. The repository only contains changed files, so you have to take an unmodified Linux directory, apply the uClinux patch, and then copy files from the repository into that tree. This kernel builds with arm-elf-tools-20030314.sh and works with my iPod. The repository seems abandoned, with the last commit in 2009. For convenience I'm providing a Git repository of just the kernel.

The SVN repository also has a 2.6 port at linux/2.6/ but it's at a very early stage. It might partially work with some older iPod, but it certainly won't have full functionality. I didn't even get it to compile successfully yet.

The uClinux-2.4.x patches starting with uClinux-2.4.24-uc1.diff.gz also contain iPodLinux code. It seems to be a stripped-down version of the ipod0 code, meaning it's very old. This complicates attempts to upgrade later iPodLinux to a later kernel version, but it wasn't too hard to do with a git merge.

The latest version?

Finally I found ipodlinux on GitHub. The linux-2.4.32-ipod kernel seems to contain everything from the SourceForge SVN repository, but it switches over to uClinux 2.4.32 earlier, so it is not an exact copy. After that it contains some iPod specific patches, and then various Linux fixes.

The repository contains the entire kernel, so you can simply build from there. It is meant to be compiled with a later version of the toolchain. I used arm-uclinux-elf-tools.tar.bz2 with GCC 3.4.3. It compiles without problems and works on my 5th generation iPod.

Installing the kernel

You need to convert the kernel to a binary linux.bin image. Take a look at the instructions on SourceForge. Since I mainly use Rockbox, I have the Rockbox bootloader installed. It loads /linux.bin from the FAT32 partition if I hold down the play/pause key at the bottom of the wheel. You need a separate Linux partition for iPodLinux. If you need to resize partitions, be careful, because the original firmware partition is marked as unused but is actually required.

Binaries

There are some binaries at http://rvvs89.ucc.asn.au/ipl/. This includes a Linux-2.4.32-ipod2-cc.bin.gz kernel which I was using previously. I cannot find the source for that kernel. I just installed the new kernel I compiled from GitHub source.

SansaLinux

The V1 Sansa players also use PortalPlayer chips. SansaLinux seems to be an adaptation of iPodLinux to those players. In some parts kernel code is the same but "ipod" has been replaced by "sansa". I have not investigated this in depth. It could potentially have some useful new PP502x functionality.


Friday, April 03, 2015

The Dell Inspiron 6400/E1505 hinge switch is easy to remove

My laptop would sometimes wake up from sleep while closed. After some searching online I learned about the hinge switch. Replacement switches are inexpensive, but the replacement procedure seemed complicated and involved removing the top half of the laptop.

After opening up the laptop, I found that the switch can be removed without removing the top half. Just the hinge cover and keyboard are in the way. After unplugging the switch cable beneath the keyboard, the switch can be carefully manoeuvred out.

The hinge cover was the worst part. It's easy to open the screen 180 degrees and pry it up via the slot on the right, but the clips at the left side didn't disengage easily. I was afraid I would break something, but after looking at some videos I just continued, with carefully distributed force and some wiggling. The keyboard is easy to move out of the way, with just two screws at the top and easy pry points near Backspace and Esc. There is no need to unplug it, but be careful to not put stress on the flexible cable.

I also tightened screws connecting hinges to the bottom part of the laptop. They were a bit loose and this improved things, but the plastic itself flexes anyways. The top half would need to be opened up to tighten the hinge screws there, and that involves a lot of plastic clips.

I simply took the switch out and didn't replace it. I don't think I'll miss it. It's such a tiny delicate thing! Whoever designed that part of the laptop is an idiot. I assumed they would be using a magnet and reed switch or hall effect sensor.

This picture is with the switch removed. Its location is behind the black sleeved WiFi antenna cables where they enter the top half. The oval shape of the plastic where the wires enter actuates the switch.

Monday, March 02, 2015

HTTP/2 is being used as a tool to promote encryption, and I approve of that

When I read that Firefox and Chrome will only allow encrypted HTTP/2 connections, I was shocked and disappointed. The standard itself does not require encryption, so this is an intentional limitation that those browsers chose to add.

Encryption by itself can only be trusted if you can be sure who you're talking to. Otherwise, you could have an encrypted connection to an adversary, who then makes an encrypted connection to the web server you wanted to reach. Because of this, web servers need certificates from recognized certification authorities. Without a trusted certificate, web browsers show scary warnings, as if something is horribly insecure. Firefox also makes getting around that warning annoying. In reality, such an encrypted connection is no worse than an unencrypted connection. It could be better, but you have no proof of that.

Because of these warnings, if you want to set up an HTTP/2 server, you effectively require a certificate from a recognized certification authority. This is an unprecedented limitation! The need to register somewhere to run a web server is reminiscent of what a totalitarian state might do. You generally even have to pay money, as if HTTP/2 is shareware with nag screens if you don't register.

Then I learned that StartSSL is offering free dynamic DNS and certificates. This means you can get a subdomain and associated recognized certificate for free. This gives you a way to use HTTP/2 for free, and you can probably also avoid supplying accurate personal information. It definitely helps, but having just one company in the whole world offering this isn't good.

What finally changed my opinion was thinking about the big picture. We know that there is extensive eavesdropping going on, by entities which have absolutely no respect for privacy. You can choose to use encryption wherever possible by installing HTTPS Everywhere, but many sites still don't support encryption or don't have it properly configured. HTTP/2 is being used as a tool to induce sites to start supporting encryption. What is being done with HTTP/2 may seem wrong, but it is being used to fight against something that is far more wrong. So, I think requiring encryption with HTTP/2 is justified.

Perhaps the best thing here is how those implementing new technology are leveraging that power, and not giving it away.

Friday, February 06, 2015

SDL 2 text input and Emscripten

SDL 1 used SDL_EnableUNICODE() to enable and disable text input. It started off disabled, and you needed to call SDL_EnableUNICODE() if you depended on the unicode member of SDL_keysym.

SDL 2 has a different text input API which is designed to accommodate international users and touchscreen devices. There, SDL video initialization turns on text input if an on-screen keyboard is not needed. This is normally harmless, but it causes a problem in the Emscripten port. 

Normally, many keys trigger browser actions. Those actions should be prevented if you want to use the same keys in an SDL application. In Firefox, it is possible to prevent browser actions from keypress events, but in Chrome, Safari and IE they need to be prevented in keydown events. However, preventing default actions from keydown events prevents keypress events which are needed to get text characters.

If you need to prevent browser actions and don't need SDL 2 text input, simply disable text input after initializing SDL video:
if (SDL_IsTextInputActive()) SDL_StopTextInput();

You can find more information about these JavaScript events on quirksmode.org. They also have a test page which you can use to examine their behaviour on different browsers.

Tuesday, February 03, 2015

Fixing the hard problem in Em-DOSBox using Emscripten emterpreter sync

If you just want to use this

Use Emscripten incoming until a new version is released, after 1.29.4. Configure with: ./configure --enable-sync --with-sdl2 . The dosbox.html.mem file that is produced must be in the same directory as dosbox.js. The .mem file is big, but will compress well. Ensure your web server can serve files in compressed format. Ideally pre-compress them so they don't need to be compressed repeatedly.

Technical explanation

I previously wrote about the hard problem with porting DOSBox to Emscripten. Basically, DOSBox cannot easily work as one Emscripten main loop. I had an idea about how I could modify functions to make them resumable, but it would be messy and many functions would need modifications.

Fortunately, around the same time I learned about a new Emscripten feature: emterpreter sync. Instead of compiling to JavaScript, Emscripten can compile functions to a bytecode, and include an interpreter for that bytecode. Code running in emterpreter can be interrupted using emscripten_sleep(). This will call JavaScript setTimeout() and stop execution. The timeout function will then restore state as if the emscripten_sleep() call returned.

There is a catch here. Code running in emterpreter runs much more slowly than asm.js JavaScript, far too slowly for DOSBox. The solution is to use emterpreter only for functions which could be interrupted by emscripten_sleep(), meaning every function that may be on the call stack when emscripten_sleep() is called. Functions compiled to ordinary JavaScript must not be on the call stack at that point, because their state cannot be saved and resumed.

Once again, the way DOSBox is structured is a problem. The CPU interpreter can recursively call itself, leading to emscripten_sleep(), but emterpreter would make it too slow. A reasonable compromise is possible there: preventing sleep during those nested calls. They are typically CPU exceptions which should return quickly, so there is no need for sleep there.

The main remaining problem is the DOSBox paging code. It recursively runs the CPU interpreter, and only returns when execution returns to where the page fault occurred. This is a problem if execution never returns there, or if another page fault happens and the faults don't return in last-in, first-out order. Ordinary DOSBox has the same problem, but it is worse with Em-DOSBox: where DOSBox would just run slowly, Em-DOSBox would hang the browser. A timeout check is used to prevent browser hangs.

Finding all the functions which require emterpreter was made easier by linking with -profiling, then using csplit dosbox.js '/^function /' '{*}' to split up functions and using grep to search for calls. Using console.trace() at the start of emscripten_sleep() is also helpful. If an abort occurs after a resume, the previous backtrace will show what function requires emterpreter. Note that -O3 is required, because otherwise functions may have too many local variables for emterpreter.

Originally emterpreter only supported a blacklist of functions which should not use emterpreter. That was unusable in this situation due to the number of functions and the changing name mangling of some. Alon Zakai helpfully added a whitelist, so functions which need emterpreter can be listed instead. Right now, there are 40 functions on that list. That's a small number compared to the thousands of functions in Em-DOSBox, but manually transforming all those functions to make them resumable would be a lot of work. It would also complicate using new changes from SVN or applying DOSBox patches. Based on performance and compatibility with DOS games, emterpreter sync was the right choice.

Sunday, January 25, 2015

The DOSBox FPU emulator and ole2disp.dll

Running Netscape 3 in Windows 3.11 caused Em-DOSBox to fail with an "FPU stack underflow". At first I couldn't reproduce this in DOSBox on Linux, but when I recompiled with --disable-fpu-x86, I reproduced it.

DOSBox has two FPU (floating point unit) emulators. One is used by default when running DOSBox on x86 hardware. It uses actual x86 FPU instructions, and it should provide the full 80-bit long double precision. The other one does not require x86 hardware, and uses standard doubles. That means it does not give the full precision one would expect from a real FPU.

In OLE2DISP.DLL 2.3.3027.1, there is a check for precision loss. The code loads a 64-bit integer using FILD, stores a 10 byte BCD number using FBSTP, and then tests the last four digits. If the digits aren't correct, it pops another value, causing the stack underflow.

Normally, this would cause an FPU underflow exception to be noticed by Windows, but DOSBox doesn't pass on those exceptions and instead quits emulation. This is of course an inaccuracy in CPU emulation. DOSBox isn't a very good general purpose x86 CPU emulator, partly because it is a hybrid of an operating system and CPU emulator. DOSBox is designed for running old games and good at that, but not general purpose emulation. I'm tempted to try to port Bochs to make a better general purpose emulator available. For now, I simply disabled the FPU stack underflow check, because that allows many Windows 3.x apps to work.

Here is an image of execution diverging after the test. I used this to find the location where the test was. Note that due to relocations, searching for this in files can be tricky. The image at the right is from the very helpful http://ref.x86asm.net/coder32.html table.



Finally, here's some x86 assembler code performing this test. This fails in DOSBox 0.74 in 64-bit Linux, but works in my SDL 2 branch, which is based on r3869. This is probably due to r3851. I hadn't written a DOS assembler program in a long time, so this was fun:

; FPU test like OLE2DISP.DLL 2.3.3027.1
; Build using: nasm ole2disp.asm  -o ole2disp.com
segment code
    org 100h

; This is the test
    wait
    fild qword [input]
    wait
    fbstp tword [output]
    nop
    wait

    mov dx, header
    mov ah, 9
    int 21h

; This displays output
    mov cx, 10
    mov si, output+9
    std
outloop:
    mov al, [si]
    shr al, 1
    shr al, 1
    shr al, 1
    shr al, 1
    call outdig
    lodsb
    call outdig
    loop outloop
    int 20h

; Display a single hex digit.
outdig:
    xor ah, ah
    and al, 0fh
    mov bx, hex
    add bx, ax
    mov dl, byte [bx]
    mov ah, 2
    int 21h
    ret
 
segment data
input:   times 6 db 0
         db 0dfh, 00dh
output:  times 10 db 0
hex:     db "0123456789ABCDEF"
header:  db "FILD, FBSTP FPU test like OLE2DISP.DLL 2.3.3027.1", 13, 10
         db "00999517642299539456 is correct result. "
         db "Last 4 digits of following must match:", 13, 10, '$'