Using the One-Pass RISC-V Assembler

14 Feb 2015

Not too long ago, I completed a RISC-V RV32I and RV64I-subset assembler written using SwiftForth. In this article, I’d like to illustrate how I’ve been using it. Please note, the assembler has not landed in master branch, as I haven’t completed the complete set of RV32I and RV64I instructions yet.

The assembler requires a 32-bit ANS-compliant Forth environment for a little-endian processor. I’ve been using SwiftForth i386-Linux 3.5.6 26-Oct-2014. It may work with a 64-bit Forth; I haven’t tried it. It definitely will not work on a big-endian platform without code changes. I can’t imagine getting it working should be terribly difficult though, as code which is environment-specific has appropriate commentary.

In this article, I’ll dissect one of the Kestrel emulator’s verification programs. The complete listing appears in the conclusion, below.

Assembling and Building the Software

To build the software in this article, I have a script romfile.f:

    include src/vasm/va.f
    include src/e/m4.asm
    2 argv >rom

It is invoked via a redo script like so:

    sf src/e/romfile.f $3 >&2

where sf invokes the above-mentioned SwiftForth interpreter.

You’ll notice that we start by including the assembler. The assembly listing, then, comes as a separate include file. The >rom ( c-addr u — ) word saves the assembled binary image to the named file. If you wish to statically provide the ROM-file name, you can use rom", like so:

    rom" my-rom-file"

That’s basically it, at this point.

The assembler does not provide a relocatable object file format at this time. I simply don’t need it just yet. However, when I port the STS operating system to the Kestrel-3 environment, I’ll almost certainly need to implement a relocatable file format then.

Assembly Listing Structure

You’ll notice right away that the assembler listing consists of Forth code. The dead give-aways include comments in parentheses or after back-slashes, and the reverse-polish notation (operands precede operations).

    \ This comment extends to the end of the line.
    \ Parentheses are used for smaller comments, like so:
    1 ( a number ) 2 ( another number ) + . ( should print 3 )

Perhaps more importantly, for those who remember the horrors of Kestrel-2 machine Forth, you’ll notice an absence of commas everywhere.

Assembly listings have less rigorously defined program structure than the Kestrel-2 Machine Forth listings. Make no mistake: this is not a Forth subset compiler; this is a real assembler, producing arbitrary programs with no presumptions of being used in a Forth environment.

Setting Assembly Origin

You might recall from the days of CP/M or Commodore 8-bit machines that your assembly listings included an ORG statement to set the “origin”. This origin determined, exclusively, where in memory your assembly language program would (often) load and (always) run.

The ORG Directive

The RISC-V assembler, similarly, includes an ORG directive:

    $0000 ORG

Because all RISC-V opcodes are position-independent by design, this directive doesn’t mean as much as it used to. Indeed, if you do not provide it, the origin defaults to zero.

Please observe: if opcodes are position independent by design, why include an ORG directive at all? It proves more useful when building complex data structures using the D,, W,, and related construction directives. (Discussed below.)

Location Counter

The assembler also provides a pseudo-symbol called LC. This location counter always holds the address of the next byte to be placed by the assembler.

    LC x0 jal               ( Halt. )

This instruction, for example, will always jump to itself; even if we never know its absolute address.

More on the Influence of ORG on Data Structures

Encoding the location of something can take two forms: relative or absolute. Encoding absolute addresses in a binary image requires a base address to perform address arithmetic on. This is the job of ORG.

For example, let’s pretend someone ported AROS to the Kestrel-3, and we wanted to use this assembler to write a library or device driver. To do so, you need to embed a Resident structure somewhere in the code. Here’s how you’d do it:

    -> romtag
    $4AFC H,        ( match word )
    romtag W,       ( points back to itself )
    0 W,            ( endSkip parameter )
    129 B, 33 B,    ( flags and version )
    NT_DEVICE B,    ( type of resident module )
    100 B,          ( initialization/discovery priority )
    AW> devname W,  ( device name )
    AW> idstr W,    ( device ID string )
    AW> init W,     ( initialization function address )

    -> devname
    S" typical.device" ASCII, 0 B,

    -> idstr
    S" typical 33.10 (1990-Mar-21)" ASCII, 0 B,

    -> init
            \ initialization logic here...

Notice that we include both forward and backward references to symbols. Without setting the origin, the statement romtag W, will produce a wrong address, as will anything that starts with AW>.

Laying Data Structures in Memory

The assembler includes several methods of laying data into memory, depending on the type of data you want to place. Here, we place strings and individual bytes:

    -> msg1 S" Polaris V1 RISC-V CPU" ASCII, 13 B, 10 B,
    -> msg1end

    -> msg2 S" Hello world!" ASCII, 13 B, 10 B,
    -> msg2end

The assembler also provides H,, W,, and D, for 16-bit, 32-bit, and 64-bit numbers, respectively:

    -> uart-tx      $0F000000.00000000 D,

Advancing to an Arbitrary Address

Let’s go back to ORG briefly. You might recall in days of yore that you could use ORG to relocate the assembler’s location counter arbitrarily. Programmers often exploited this to leave gaps in the program, often for buffer allocation, or just to ensure proper program alignment.

This assembler doesn’t support this use of the ORG directive. It could, but it’d introduce a level of complexity that I felt wasn’t worth the investment. Instead, a simpler, more explicit directive exists for this purpose:

    $CC $2000 ADVANCE       ( RISC-V ISA spec says we boot here. )

The ADVANCE directive takes two arguments. The first is the value to fill in the unused space with. The second argument is the address we wish to continue assembling at. In the example above, instructions assembled after the ADVANCE directive will be placed at address $2000 and higher.

I picked $CC for the fill byte above because it has several benefits with respect to debugging: 1) All CPU instructions have bits 1 and 0 set, so any attempt to execute an instruction in this unused space will produce an illegal instruction exception. 2) Kestrel addresses start with $00, $01, or $0F. Thus, attempting to reference data with an address that starts with $CC will result in an illegal access exception. 3) $CC..CC is a pattern which is easily visible to programmers.

You can use the ADVANCE directive to fill in arbitrary numbers of bytes by using address arithmetic:

    $CC  LC 16 + ADVANCE

will provide exactly 16 bytes of $CC in memory.

Forward References

The RISC-V instruction set architecture specifies at least five different ways of encoding operands for its instructions. This means we need (at least) five forward-parsing words to keep track of forward references. These are as follows:

  • AW> — 32-bit absolute forward reference to a symbol.
  • AD> — 64-bit absolute forward reference to a symbol.
  • B> — 12-bit signed forward reference suitable for conditional branches.
  • JAL> — 20-bit signed forward reference suitable for the JAL instruction.
  • GL> — 12-bit signed forward reference suitable for memory load and JALR instructions relative to a global pointer.
  • GS> — 12-bit signed forward reference suitable for memory store instructions, also relative to a global pointer.

One not nice feature of the assembler is that you, the programmer, need to remember when to use the appropriate forward reference. In practice, this is learned quickly; still, I’d rather this be automated somehow. If/when I complete the two-pass assembler, this burden will disappear.

Here’s a simple example:

    0 x31 auipc             ( X31 = address of next instruction )
    x0 msg1 x1 addi         ( X1 -> msg to print )
    x0 msg1end msg1 - x2 addi       ( X2 = length of msg )
    x31 gl> putstr x3 jalr  ( try printing the string)

The AUIPC instruction adds (sign-extended, upper 20 bits of) the immediate parameter to the current PC register contents, leaving the result in the specified (X31) register. In other words, after running the above instruction, X31 will point to the x0 msg1 x1 addi instruction. This lets us reference absolutely-specified data in close proximity to our routine. In this case, X31 becomes our global pointer, or GP.

The ASSUME-GP directive tells the assembler what value we think is in our global pointer register. In this case, it’s the current location counter, which is the next ADDI instruction. Once the assembler knows this, we can make forward references to symbols in program space. For example, observe the JALR instruction, used to invoke the putstr subroutine. This instruction takes a source register and a 12-bit, sign-extended displacement to add to its value to arrive at the subroutine’s address.

We also find operand access via the global pointer in the putstr subroutine.

    x31 gl> uart-tx x4 ld   ( X4 -> UART TX register )

Because the Kestrel-3’s address space is 64-bits wide, but the instructions only encode 20 or fewer operand bits (depending on instruction form), we need to load absolute addresses from memory explicitly. The frequency of global pointer-relative loads, stores, and subroutine calls justify the GL>, GS>, and <G prefixes. I’ll discuss the <G form shortly.

Backward References

You should never need any kind of prefixes for backward references, except in one special case (<G), which I’ll discuss shortly.

    -> agn
    x1 0 x8 lb              ( X8 = byte to print )
    x8 x4 0 sb              ( print it. )
    x1 1 x1 addi            ( Next byte to print )
    x2 -1 x2 addi           ( One less to print )
    x0 x2 agn bne           ( Print more characters until no more )

In the above loop, you’ll notice that the BNE instruction directly references the agn symbol. We didn’t need a B> prefix here, because agn is already defined by this point.

Likewise, when computing the lengths of the messages to print to the console, we do so using address arithmetic:

    x0 msg1end msg1 - x2 addi       ( X2 = length of msg )

There is a special case where you still need a prefix though. Consider this program fragment:

    -> aSubroutine
    x0 42 x2 addi
    x1 0 x0 jalr

    -> ultimateAnswer
    0 x31 auipc
    x31 aSubroutine x1 jalr         ( point 1; crash! )
    \ ...

When we call aSubroutine at point 1, the program will branch to the wrong address. At this point, X31 points to the JALR instruction. Thus, we expect to find a negative offset to the JALR instruction in order to call backwards in program space. The problem is, a simple reference to aSubroutine will leave its address, which is a positive number. Thus, the CPU will branch to an address that is X31+(a potentially huge number).

We use the <G prefix to fix this problem. It calculates the GP-relative displacement for a backward reference. (This is why the ‘arrow’ points to the left.)

    x31 <G aSubroutine x1 jalr      ( This is correct. )

Note that, unlike GL> and GS>, <G applies for both loads and stores, as <G does not produce a relocation record for later resolution.

Defining Symbols

By now, you probably already guessed that -> defines new symbols. From the point of definition,

    -> putstr

is equivalent to this Forth code:

    LC VALUE putstr

In addition to defining a VALUE-like Forth word, it also takes care of any pending forward references to the symbol.


In this post, I illustrated several directives of the assembler. I showed how I use the assembler to build flat ROM files for emulator consumption. I also illustrated how a typical listing is structured by examining smaller pieces in isolation and exploring their relationships.

The listing below puts all the pieces together. This program, in fact, is the Polaris emulator’s second verification and validation milestone.

    \ This software is the Kestrel-3's second milestone: print a message and halt.
    \ We do this by calling a subroutine.

            $0000 ORG

    -> msg1 S" Polaris V1 RISC-V CPU" ASCII, 13 B, 10 B,
    -> msg1end

    -> msg2 S" Hello world!" ASCII, 13 B, 10 B,
    -> msg2end

            $CC $2000 ADVANCE       ( RISC-V ISA spec says we boot here. )

            0 x31 auipc             ( X31 = address of next instruction )

            x0 msg1 x1 addi         ( X1 -> msg to print )
            x0 msg1end msg1 - x2 addi       ( X2 = length of msg )
            x31 gl> putstr x3 jalr  ( try printing the string)

            x0 msg2 x1 addi
            x0 msg2end msg2 - x2 addi
            x31 gl> putstr x3 jalr

            LC x0 jal               ( Halt. )

    -> putstr
            x31 0 x30 addi          ( Save and assume a GP )
            0 x31 auipc

            x31 gl> uart-tx x4 ld   ( X4 -> UART TX register )
    -> agn
            x1 0 x8 lb              ( X8 = byte to print )
            x8 x4 0 sb              ( print it. )
            x1 1 x1 addi            ( Next byte to print )
            x2 -1 x2 addi           ( One less to print )
            x0 x2 agn bne           ( Print more characters until no more )

            x30 0 x31 addi          ( Restore GP )
            x3 0 x0 jalr            ( return )

    -> uart-tx      $0F000000.00000000 D,

Samuel A. Falvo II
Twitter: @SamuelAFalvoII
Google+: +Samuel A. Falvo II

About the Author

Software engineer by day. Amateur computer engineer by night. Founded the Kestrel Computer Project as a proof-of-concept back in 2007, with the Kestrel-1 computer built around the 65816 CPU. Since then, he's evolved the design to use a simple stack-architecture CPU with the Kestrel-2, and is now in the process of refining the design once more with a 64-bit RISC-V compatible engine in the Kestrel-3.

Samuel is or was:

  • a Forth, Oberon, J, and Go enthusiast.
  • an amateur radio operator (KC5TJA/6).
  • an amateur photographer.
  • an intermittent amateur astronomer, astrophotographer.
  • a student of two martial arts (don't worry; he's still rather poor at them, so you're still safe around him. Or not, depending on your point of view).
  • a former semiconductor verification technician for the HIPP-II and HIPP-III line of Hifn, Inc. line-speed compression and encryption VLSI chips.
  • the co-founder of Armored Internet, a small yet well-respected Internet Service Provider in Carlsbad, CA that, sadly, had to close its doors after three years.
  • the author of GCOM, an open-source, Microsoft COM-compatible component runtime environment. I also made a proprietary fork named Andromeda for Amiga, Inc.'s AmigaDE software stack. It eventually influenced AmigaOS 4.0's bizarre "interface" concept for exec libraries. (Please accept my apologies for this architectural blemish; I warned them not to use it in AmigaOS, but they didn't listen.)
  • the former maintainer and contributor to Gophercloud.
  • a contributor to Mimic.

Samuel seeks inspirations in many things, but is particularly moved by those things which moved or enabled him as a child. These include all things Commodore, Amiga, Atari, and all those old Radio-Electronics magazines he used to read as a kid.

Today, he lives in the San Francisco Bay Area with his beautiful wife, Steph, and four cats; 13, 6.5, Tabitha, and Panther.