Running PalmOS without PalmOS

A traditional PalmOS emulator requires a ROM: a binary object that contains the original PalmOS compiled and linked for the 68K architecture. When you run an application PRC in those emulators, everything is emulated down to the hardware layer, so the ROM thinks it is talking to an actual device. Therefore, as an emulator developer, your job is to provide an implementation of the CPU, memory, display, serial port, and so on, taking into account the low-level differences between the myriad devices that ran PalmOS back then. As long as your implementation of the physical layer is accurate, applications will generally run fine.

PumpkinOS also allows you to run binary 68K applications, but does not require a copyrighted PalmOS ROM. The short story is this: the developers of PalmOS devised a clever way to implement system calls (also used in other 68K systems, I think). They used a feature of the 68K CPU called a trap. A trap is like a subroutine call, but instead of jumping to a different memory address depending on the system call, it jumps to a fixed address, passing an argument identifying the system call. PumpkinOS takes advantage of this fact: whenever a trap is issued, it intercepts the execution flow, identifies the system call, extracts the parameters and calls a native implementation inside PumpkinOS, bypassing the ROM altogether. It is very similar to the way PACE (Palm Application Compatibility Environment) was implemented when PalmOS 5 was introduced. If the 68K application plays by the rules and only calls the OS through system traps, never accessing hardware directly, it will also run fine on PumpkinOS. Now, if you want to know the long version of this story, keep reading.

As usually happens in real life, though, the devil is in the details. First, 68K is a 32-bit, big-endian architecture. This means that a 32-bit hexadecimal integer like 0x11223344 is stored in memory with the most significant byte coming first: 0x11, 0x22, 0x33, 0x44. Both x86 and ARM (as typically configured) are little-endian. The same number would be stored in memory as 0x44, 0x33, 0x22, 0x11, with the least significant byte first. When mediating calls between emulated 68K code and native x86 code, a byte-swapping operation is needed for input parameters, output parameters and function return codes.
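
The byte swapping described above can be sketched with a few small helpers. This is only an illustration, not the actual PumpkinOS code, and the function names (be32_read, be32_write, be16_read) are made up for this example. Assembling values byte by byte gives the same result regardless of the endianness of the host.

```c
#include <stdint.h>

/* Read a 32-bit big-endian value from emulated 68K memory.
   Building the result byte by byte works on any host endianness. */
static uint32_t be32_read(const uint8_t *p) {
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a host 32-bit value into emulated memory in big-endian order. */
static void be32_write(uint8_t *p, uint32_t v) {
  p[0] = (uint8_t)(v >> 24);
  p[1] = (uint8_t)(v >> 16);
  p[2] = (uint8_t)(v >> 8);
  p[3] = (uint8_t)v;
}

/* Same idea for 16-bit values. */
static uint16_t be16_read(const uint8_t *p) {
  return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}
```

With helpers like these, a native function can read its arguments from 68K memory and write its results back without ever casting a byte pointer to a wider integer type.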

Another problem is word size. On 68K, addresses are 32 bits. On x86_64, addresses are 64 bits. What happens if a 68K application passes a 32-bit address to PumpkinOS? And how does PumpkinOS return the 64-bit address of a newly allocated memory region back to the application? Consider this example code that a 68K application would typically use:

void *p = MemPtrNew(1024);

The MemPtrNew system call is one way to allocate memory in PalmOS. A single argument specifies the number of bytes to allocate and the returned value is a pointer to the allocated memory block. PumpkinOS cannot simply call malloc() and pass the (64-bit) address to the 68K application. The solution is simple: before an application is started, PumpkinOS allocates a big contiguous memory block for it. This block has, of course, a 64-bit address, but the offsets within this block can be expressed as 32-bit numbers. Back to our example, the native implementation of MemPtrNew finds a free region of 1024 bytes inside the big block and returns its offset from the start of the block. When the 68K application executes an instruction that reads from memory, the 68K emulator calls this function:

uint32_t cpu_read_long(uint32_t address) {
  return read_32(ram + address);
}

Here, ram is a 64-bit pointer to the big memory block. When an offset is added to the block start, we get back a valid 64-bit address. Of course, the real implementation performs additional checks, like proper address alignment and range checking (is address a valid offset within the big memory block?), among other things. Instead of using malloc() and friends, I opted for another heap management library. I found a nice implementation called dlmalloc that is compact and easy to customize.
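
To make the offset scheme and the extra checks concrete, here is a minimal sketch. It is not the PumpkinOS implementation: the arena_t type and the arena_* functions are hypothetical, a trivial bump allocator stands in for dlmalloc, and even alignment is assumed to be required for 32-bit accesses.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container for the application's big memory block. */
typedef struct {
  uint8_t *ram;   /* native (possibly 64-bit) pointer to the block */
  uint32_t size;  /* block size in bytes */
  uint32_t next;  /* next free offset (toy bump allocator, not dlmalloc) */
} arena_t;

static int arena_init(arena_t *a, uint32_t size) {
  a->ram = calloc(size, 1);
  a->size = size;
  a->next = 4;    /* keep offset 0 unused so it can play the role of NULL */
  return a->ram != NULL;
}

/* Toy stand-in for MemPtrNew: returns a 32-bit offset, or 0 on failure. */
static uint32_t arena_alloc(arena_t *a, uint32_t n) {
  uint32_t start = (a->next + 1u) & ~1u;  /* keep allocations even-aligned */
  if (n == 0 || start > a->size || n > a->size - start) return 0;
  a->next = start + n;
  return start;
}

/* True if a 4-byte access at this offset is aligned and inside the block. */
static int arena_valid_long(const arena_t *a, uint32_t address) {
  if (address & 1u) return 0;                 /* assumed alignment rule */
  return a->size >= 4 && address <= a->size - 4;
}

/* Checked big-endian 32-bit read: returns 1 on success, 0 on a bad address. */
static int arena_read_long(const arena_t *a, uint32_t address, uint32_t *out) {
  if (!arena_valid_long(a, address)) return 0;
  const uint8_t *p = a->ram + address;        /* offset -> native pointer */
  *out = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
  return 1;
}

/* Checked big-endian 32-bit write. */
static int arena_write_long(arena_t *a, uint32_t address, uint32_t v) {
  if (!arena_valid_long(a, address)) return 0;
  uint8_t *p = a->ram + address;
  p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
  p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
  return 1;
}
```

The 68K side only ever sees the 32-bit offsets returned by arena_alloc; the native pointer is formed at the last moment, inside the checked accessors.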

We have seen how 68K code can talk to native PumpkinOS code, but before anything else we must answer another question: how does a 68K application start? How are the code, data and stack segments created and prepared? In classic PalmOS, code and data segments are stored in resources inside the application PRC. Take for example MINEHUNT.PRC, an application bundled with early versions of the PalmOS SDK. It has two code resources and one data resource, as we can see in the output of pilot-file:

$ pilot-file -l MINEHUNT.PRC
entries
index   size    type    id
0       6304    code    1
1       24      code    0
2       56      data    0
3       6       tver    1
4       144     tAIB    1000
5       154     tFRM    1000
6       148     MBAR    1000
7       508     tSTR    1001
8       48      Talt    1002
9       43      Talt    1003
10      46      Tbmp    1000
11      46      Tbmp    1001
12      46      Tbmp    1002
13      46      Tbmp    1003
14      46      Tbmp    1004
15      39      Tbmp    1005
16      314     tFRM    1200
17      10      pref    0

In PalmOS, code.0 is not actually executable code, but meta-information describing how the other segments are arranged. Even back when PalmOS was still rolling, there was very little information describing this resource. The only reference I could find is available here. PumpkinOS uses only the first two 32-bit words of this resource. The first contains the size of the jump table and parameters (not used here), and the second the size of the data segment (application globals). The remainder of the resource is silently ignored. The actual application 68K code resides in the code.1 resource, and is limited to 32KB. Later on, when support for multiple code segments was added, applications could provide additional code resources (code.2, code.3, etc). Initialization code placed at the beginning of code.1, and not PalmOS itself, would handle loading and linking of the other code resources. And this is a blessing, because I do not have to worry about them either. PumpkinOS only cares about code.0 and code.1. If an application uses other code resources, it is already equipped to handle them using standard PalmOS calls.
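
Reading the two words that matter from code.0 could look like the sketch below. The struct, its field names and the function name are invented for this example, and the only thing taken from the text is that the resource begins with two big-endian 32-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the two code.0 fields described above. */
typedef struct {
  uint32_t above_a5_size;  /* jump table and parameters (unused per the text) */
  uint32_t data_size;      /* size of the data segment (application globals) */
} code0_info_t;

/* Big-endian 32-bit read, independent of host endianness. */
static uint32_t code0_rd32(const uint8_t *p) {
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Parse the first two words of a code.0 resource.
   Returns 1 on success, 0 if the resource is too short. */
static int code0_parse(const uint8_t *res, size_t len, code0_info_t *out) {
  if (res == NULL || out == NULL || len < 8) return 0;
  out->above_a5_size = code0_rd32(res);
  out->data_size = code0_rd32(res + 4);
  /* Everything after the first 8 bytes is ignored, as in the text. */
  return 1;
}
```

The data_size value is what a loader would use to reserve room for the globals before decoding data.0 into it.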

Then we have the globals in resource data.0. In a simple world, we would just load it into memory, point register A5 to its end, and the application would magically have access to its globals (negative offsets from A5 are used to access initialized globals). But, to save space, data.0 is encoded in a form of RLE (Run Length Encoding). Special byte sequences tell PalmOS whether zero-filled regions are present, or whether a fixed byte value is repeated, plus some other corner cases. The document describing code.0 linked above also talks about the data.0 encoding. The data.0 resource may also have information about XREF (cross reference) sections for both code and data. I found no documentation describing these sections, so for now PumpkinOS cannot run programs that depend on them (so far I have found only one such application: SFCave, an early game for PalmOS 1.0).

Figure: data.0 resource from SFCave
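
To give a feel for the kind of run-length decoding mentioned above, here is a toy decoder. The opcode values are assumptions chosen for illustration only and are not a verified copy of the data.0 format, which has more cases (including the XREF sections) and should be taken from the reference mentioned earlier.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative opcodes (assumptions, not the verified data.0 format):
     0x00       end of stream
     0x80 | n   copy the next n+1 bytes literally
     0x40 | n   emit n+1 zero bytes
     0x20 | n   emit n+2 copies of the next byte
   Returns the number of bytes written, or -1 on malformed input
   or when the output buffer is too small. */
static long rle_decode(const uint8_t *in, size_t in_len,
                       uint8_t *out, size_t out_cap) {
  size_t i = 0, o = 0;
  while (i < in_len) {
    uint8_t op = in[i++];
    size_t n;
    if (op == 0x00) return (long)o;            /* end marker */
    if (op & 0x80) {                           /* literal run */
      n = (size_t)(op & 0x7F) + 1;
      if (n > in_len - i || n > out_cap - o) return -1;
      memcpy(out + o, in + i, n);
      i += n;
    } else if (op & 0x40) {                    /* zero-filled region */
      n = (size_t)(op & 0x3F) + 1;
      if (n > out_cap - o) return -1;
      memset(out + o, 0, n);
    } else if (op & 0x20) {                    /* repeated byte */
      n = (size_t)(op & 0x1F) + 2;
      if (i >= in_len || n > out_cap - o) return -1;
      memset(out + o, in[i++], n);
    } else {
      return -1;                               /* not handled by this sketch */
    }
    o += n;
  }
  return -1;                                   /* no end marker found */
}
```

A loader built on this idea would decode straight into the region reserved for globals, so that the end of the decoded data lines up with the address placed in A5.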

After both code and data segments are loaded/decoded into the memory block belonging to the application, stack space is also allocated there. Then the Musashi 68K emulator is set up and emulation can begin:

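m68k_set_cpu_type(M68K_CPU_TYPE_68020);
m68k_pulse_reset();
m68k_set_reg(M68K_REG_PC, code1_start);  // entry point: where code.1 was loaded
m68k_set_reg(M68K_REG_A5, data0_end);    // globals sit at negative offsets from A5
m68k_set_reg(M68K_REG_SP, stack_end);    // stack grows towards lower addresses

// run in slices of 100000 cycles until an error occurs or PilotMain returns
while (some_condition) {
  m68k_execute(m68k_state, 100000);
}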

Here, code1_start is the offset within the big memory block where code.1 was loaded, data0_end is the offset of the end of data.0 (remember that globals are accessed using negative offsets from there), and stack_end is the offset of the end of the stack (which grows towards lower addresses). The emulation loop runs a fixed number of cycles each time and bails out in case of errors or if the application returns from PilotMain.

Let’s view another system call example, this time with more details. PalmOS applications must call EvtGetEvent from time to time to retrieve and process events from the event queue. Its prototype is like this:

void EvtGetEvent(EventType *event, Int32 timeout);

It expects a pointer to an EventType structure and a 32-bit signed integer specifying the number of ticks to wait for the event (-1 meaning wait forever). The following excerpt is a condensed debug trace from PumpkinOS running MineHunt, at a point where it is calling the EvtGetEvent system trap:

MineHunt M68K: 00152132: 2f2e 0008 : move.l  ($8,A6), -(A7)
MineHunt M68K: 00152136: 486e ffe8 : pea     (-$18,A6)
MineHunt M68K: 0015213A: 4e4f      : trap    #$f
MineHunt M68K: 0015213C: a11d
MineHunt EmuPalmOS: EvtGetEvent(0x00153592, -1)

The 68K instruction at program counter 0x00152132 pushes the second argument to EvtGetEvent on the stack (which in this case comes from a parameter in the stack frame pointed to by register A6). The next instruction pushes the first argument on the stack, the address of a local variable in the stack frame. The next instruction issues trap $f (15 in decimal), which is used as the system call trap in PalmOS. The next two bytes, 0xA11D at address 0x0015213C, are not 68K code, but contain the 16-bit trap number. The code in Musashi was adapted to call a function in PumpkinOS named palmos_systrap() whenever a trap instruction is executed:

static void m68k_op_trap(void) {
  if ((REG_IR & 0xf) == 0x0f) {
    uint16_t trap = m68k_read_memory_16(REG_PC);
    REG_PC += 2;
    palmos_systrap(trap);
  } else {
    m68ki_exception_trapN(EXCEPTION_TRAP_BASE + (REG_IR & 0xf));
  }
}

palmos_systrap() has a switch statement with a case for each trap number. The case handling sysTrapEvtGetEvent (0xA11D) is shown below. First, both arguments are fetched from the 68K stack (the ARG32 macro does this). Then EvtGetEvent is called. It receives a pointer to a native EventType structure. The function encode_event() takes the native EventType structure filled in by EvtGetEvent and encodes it into the 68K space pointed to by eventP. Besides handling endianness conversion and address mapping as shown before, it must understand the arguments to all PalmOS events. Finally, the debug() function call is what produced the “EvtGetEvent(0x00153592, -1)” line in the excerpt above.

case sysTrapEvtGetEvent: {
    // void EvtGetEvent(EventType *event, Int32 timeout)
    uint32_t eventP = ARG32;
    int32_t timeout = ARG32;
    EventType event;
    EvtGetEvent(&event, timeout);
    encode_event(eventP, &event);
    debug("EmuPalmOS", "EvtGetEvent(0x%08X, %d)", eventP, timeout);
  }
  break;
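
The ARG32 macro itself is not shown here, so the following is only a guess at how argument fetching of this kind could work. The trap_args_t type and the next_arg32 function are hypothetical. The sketch assumes that, because the trap is intercepted before any exception frame is pushed, the emulated stack pointer points directly at the first argument, with the remaining arguments at increasing addresses.

```c
#include <stdint.h>

/* Hypothetical view of the emulator state needed to fetch trap arguments. */
typedef struct {
  const uint8_t *ram;  /* native pointer to the application's memory block */
  uint32_t sp;         /* emulated stack pointer (offset into ram) at trap time */
  uint32_t arg_off;    /* bytes of arguments consumed so far */
} trap_args_t;

/* Fetch the next 32-bit argument from the emulated stack (big-endian)
   and advance the cursor, so consecutive calls walk the argument list. */
static uint32_t next_arg32(trap_args_t *t) {
  const uint8_t *p = t->ram + t->sp + t->arg_off;
  t->arg_off += 4;
  return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
         ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Applied to the trace above, the first call would yield the event pointer 0x00153592 (pushed last, so it sits at the top of the stack) and the second call, cast to int32_t, would yield the timeout of -1.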

When palmos_systrap() returns, the emulator resumes the 68K application right after the “trap #$f” instruction (also skipping the two bytes storing the trap number). The application thinks it has just called the PalmOS ROM, but in fact it called the implementation of EvtGetEvent inside PumpkinOS. Again, for simplicity, some details were omitted from this explanation, but in essence this is how PumpkinOS works with classic 68K applications.

8 thoughts on “Running PalmOS without PalmOS”

  1. Thanks for the feedback. I am not familiar with Legacy. Is it a game? A few months ago I experimented with running PumpkinOS on Android. It kind of worked, but I got lost in the myriad of permission/authorization calls necessary just to make the program read a file. Maybe I will give it another try.

  2. Looking forward to the release. I’m hoping it will support playing Legacy by Redshift and will provide a more comfortable platform than the Palm Simulator which you can’t even scale up. Perhaps you could provide a better core to Retroarch than the current very limited Palm emulator they have. An Android port should be nice down the line too. I can imagine a galaxy note or Tab S8 being the perfect platform, with their stylus.

  3. How creating native program for this system?
    simple hello world , can You put repo with makefile etc?

    (blender is open source, big campaining big money and create fundations)

  4. this is open source? I can run this on raspberry pi 2040 or esp with screen?
    i need normal os and need device than play music, compiling my code (cc) and can write a text in my text editor, for week or month. I no need faster cpu, internet , etc. no need play video
