If you just want to use this
Use the Emscripten incoming branch until a version newer than 1.29.4 is released. Configure with:
./configure --enable-sync --with-sdl2
The dosbox.html.mem file that is produced must be in the same directory as dosbox.js. The .mem file is big, but it compresses well. Ensure your web server can serve files in compressed format. Ideally, pre-compress them so they don't need to be compressed again on every request.
Technical explanation
I previously wrote about the hard problem with porting DOSBox to Emscripten. Basically, DOSBox cannot easily work as one Emscripten main loop. I had an idea about how I could modify functions to make them resumable, but it would be messy and many functions would need modifications.
Fortunately, around the same time I learned about a new Emscripten feature: emterpreter sync. Instead of compiling to JavaScript, Emscripten can compile functions to a bytecode and include an interpreter for that bytecode. Code running in emterpreter can be interrupted using emscripten_sleep(). This will call JavaScript setTimeout() and stop execution. The timeout function will then restore state as if the emscripten_sleep() call returned.
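The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual Em-DOSBox code: run_emulator(), yield_to_browser(), yield_count and the tick parameters are hypothetical names invented for the example. Only emscripten_sleep() comes from Emscripten, and it is compiled out in native builds so the sketch can be tried anywhere.

```c
#include <assert.h>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

/* Counts yields so the behaviour can be observed; not part of any real API. */
static int yield_count = 0;

/* Hypothetical helper: hand control back to the browser event loop.
 * Under Emscripten this suspends execution and resumes it from a timeout.
 * In a native build the call is compiled out and only the counter changes. */
static void yield_to_browser(void) {
    ++yield_count;
#ifdef __EMSCRIPTEN__
    emscripten_sleep(0);
#endif
}

/* Hypothetical emulator loop: an ordinary blocking loop that yields
 * every ticks_per_yield ticks instead of being split into callbacks. */
static int run_emulator(int total_ticks, int ticks_per_yield) {
    assert(ticks_per_yield > 0);
    int executed = 0;
    while (executed < total_ticks) {
        ++executed; /* stand-in for one slice of emulation work */
        if (executed % ticks_per_yield == 0) {
            yield_to_browser();
        }
    }
    return executed;
}
```

The point of the sketch is that the loop keeps its natural blocking shape. The browser gets control at each yield, and execution later continues from the same place in the loop.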
There is a catch here. Code running in emterpreter runs much more slowly than asm.js JavaScript. It is far too slow for DOSBox. The solution is to use emterpreter only for functions which could be interrupted by emscripten_sleep(), meaning every function that may be on the call stack when emscripten_sleep() is called. Functions compiled to regular JavaScript must not be on the call stack at that point, because their state cannot be resumed.
Once again, the way DOSBox is structured is a problem. The CPU interpreter can recursively call itself, leading to emscripten_sleep(), but running the CPU interpreter in emterpreter would make it too slow. A reasonable compromise is possible there: preventing sleep during those nested calls. They are typically CPU exceptions which should return quickly, so there is no need for sleep there.
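One way to express that compromise is a nesting counter that suppresses sleeping whenever the interpreter has been entered recursively. The sketch below is an assumption about how such a guard could look, not the code Em-DOSBox uses. run_cpu(), maybe_sleep() and the counters are hypothetical names.

```c
#include <assert.h>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

static int nesting_depth = 0;  /* how many interpreter invocations are active */
static int sleeps_taken = 0;   /* sleeps performed at the outermost level */
static int sleeps_skipped = 0; /* sleeps suppressed inside nested calls */

/* Hypothetical guard: sleep only when the outermost interpreter is running. */
static void maybe_sleep(void) {
    if (nesting_depth > 1) {
        ++sleeps_skipped; /* nested call, expected to return quickly */
        return;
    }
    ++sleeps_taken;
#ifdef __EMSCRIPTEN__
    emscripten_sleep(0);
#endif
}

/* Hypothetical interpreter entry point. nested_calls_left simulates
 * recursive entry, for example while handling a CPU exception. */
static void run_cpu(int nested_calls_left) {
    ++nesting_depth;
    if (nested_calls_left > 0) {
        run_cpu(nested_calls_left - 1);
    }
    maybe_sleep();
    --nesting_depth;
    assert(nesting_depth >= 0);
}
```

Calling run_cpu(2) enters the interpreter three levels deep. Only the outermost level sleeps, so the nested invocations never need to be on the call stack during a sleep.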
The main remaining problem is the DOSBox paging code. It recursively runs the CPU interpreter, and only returns when execution returns to where the page fault occurred. This is a problem if execution never returns there, or if another page fault happens and the faults don't return in last-in, first-out order. This is also a problem with ordinary DOSBox, but it is worse with Em-DOSBox: ordinary DOSBox would just run slowly, whereas Em-DOSBox would hang the browser. A timeout check is used to prevent browser hangs.
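A timeout check of this kind might look something like the following. This is only a sketch under assumptions: the real check in Em-DOSBox is not shown here, and run_until_returned(), the clock callback, fake_clock() and fake_step() are hypothetical names. The time source is passed in as a function pointer so the behaviour can be demonstrated with a fake clock.

```c
#include <assert.h>
#include <stdbool.h>

typedef long (*clock_ms_fn)(void);

/* Hypothetical nested loop: keep stepping until step() reports that
 * execution returned to the faulting location, or give up on timeout.
 * Returns true on a normal return and false if the timeout expired. */
static bool run_until_returned(bool (*step)(void), clock_ms_fn now_ms,
                               long timeout_ms) {
    assert(step != 0 && now_ms != 0);
    long start = now_ms();
    for (;;) {
        if (step()) {
            return true;
        }
        if (now_ms() - start > timeout_ms) {
            return false; /* bail out instead of hanging the browser */
        }
    }
}

/* Fake clock for demonstration: advances 10 ms on every query. */
static long fake_time = 0;
static long fake_clock(void) { return fake_time += 10; }

/* Fake step for demonstration: succeeds once the counter reaches zero. */
static int steps_until_return = 0;
static bool fake_step(void) { return --steps_until_return <= 0; }
```

With the fake helpers, a loop that finishes within a few steps returns true, while one that would need a very large number of steps gives up once the fake clock passes the timeout.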
Finding all the functions which require emterpreter was made easier by linking with -profiling, then using csplit dosbox.js '/^function /' '{*}' to split up functions and using grep to search for calls. Using console.trace() at the start of emscripten_sleep() is also helpful. If an abort occurs after a resume, the previous backtrace will show what function requires emterpreter. Note that -O3 is required, because otherwise functions may have too many local variables for emterpreter.
Originally emterpreter only supported a blacklist of functions which should not use emterpreter. That was unusable in this situation due to the number of functions and the changing name mangling of some of them. Alon Zakai helpfully added a whitelist, so functions which need emterpreter can be listed instead. Right now, there are 40 functions on that list. That's a small number compared to the thousands of functions in Em-DOSBox, but manually transforming all those functions to make them resumable would be a lot of work. It would also complicate picking up new changes from SVN and applying DOSBox patches. Based on performance and compatibility with DOS games, emterpreter sync was the right choice.