OpenTitan Big Number Accelerator (OTBN) Instruction Set Architecture
This document describes the instruction set for OTBN. For more details about the processor itself, see the OTBN Technical Specification. In particular, this document assumes knowledge of the Processor State section from that guide.
The instruction set is split into base and big number subsets. The base subset (described first) is similar to RISC-V’s RV32I instruction set. It also includes a hardware call stack and hardware loop instructions. The big number subset is designed to operate on 256b WDRs. It doesn’t include any control flow instructions, and just supports load/store, logical and arithmetic operations.
In the instruction documentation that follows, each instruction has a syntax example.
For example, the SW instruction has syntax:
SW <grs2>, <offset>(<grs1>)
This means that it takes three operands, called grs2, offset and grs1.
These operands are further documented in a table.
Immediate operands like offset show their valid range of values.
Below the table of operands is an encoding table.
This shows how the 32 bits of the instruction word are filled in.
Ranges of bits that map to an operand are named (in capitals) and those names are used in the operand table.
For example, the SW instruction’s offset operand is split across two ranges of bits (31:25 and 11:7) called OFF_1 and OFF_0, respectively.
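As a rough illustration of how a split operand can be put back together, the sketch below pulls OFF_1 and OFF_0 out of an instruction word and joins them. It assumes the standard RISC-V S-type layout (OFF_1 holds offset bits 11:5, OFF_0 holds bits 4:0) and a signed 12-bit result; the operand table for SW is the authority on the real decoding. The function name decode_sw_offset is hypothetical and not part of the OTBN tooling.

```python
def decode_sw_offset(insn: int) -> int:
    """Reassemble the SW offset from an instruction word (assumed S-type layout)."""
    assert 0 <= insn < (1 << 32)
    off_1 = (insn >> 25) & 0x7f   # bits 31:25 of the instruction word
    off_0 = (insn >> 7) & 0x1f    # bits 11:7 of the instruction word
    raw = (off_1 << 5) | off_0    # assumed: OFF_1 is the upper part of the offset
    # Assumed: the 12-bit value is sign-extended
    return raw - (1 << 12) if raw & (1 << 11) else raw


# Encoding of `sw x5, 8(x2)` under the assumptions above
word = (0 << 25) | (5 << 20) | (2 << 15) | (0b010 << 12) | (8 << 7) | 0b0100011
offset = decode_sw_offset(word)
```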
Pseudo-code for operation descriptions
Each instruction has an Operation section.
This is written in a Python-like pseudo-code, generated from the instruction set simulator (which can be found at hw/ip/otbn/dv/otbnsim).
The code is generated from Python, but there are some extra changes made to aid readability.
All instruction operands are considered to be in scope and have integer values. These values come from the encoded bits in the instruction and the operand table for the instruction describes exactly how they are decoded. Some operands are encoded PC-relative. Such an operand has its absolute value (an address) when it appears in the Operation section.
Some state updates are represented as an assignment, but take effect at the end of the instruction. This includes register updates or jumps and branches (updating the PC). To denote this, we use the ⇐ symbol, reminiscent of Verilog’s non-blocking assignment.
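The end-of-instruction behaviour of ⇐ can be pictured with a small model. This is only a sketch: the class DeferredRegs and its method names are hypothetical and are not taken from the simulator.

```python
class DeferredRegs:
    """Hypothetical register file whose writes land at the end of the instruction."""

    def __init__(self, count: int) -> None:
        self.values = [0] * count
        self.pending = {}

    def __getitem__(self, idx: int) -> int:
        # Reads always see the value from before the current instruction
        return self.values[idx]

    def write_later(self, idx: int, value: int) -> None:
        # Stands in for `GPRs[idx] ⇐ value`
        self.pending[idx] = value

    def commit(self) -> None:
        # Called once the instruction has finished
        for idx, value in self.pending.items():
            self.values[idx] = value
        self.pending.clear()


regs = DeferredRegs(32)
regs.values[2] = 7
regs.write_later(2, 99)
seen_during = regs[2]   # still the old value
regs.commit()
seen_after = regs[2]    # now the new value
```

Buffering the write is what lets an instruction read a register and update the same register without the read observing the new value.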
The program counter (PC) is represented as a variable called PC.
Machine registers are accessed with an array syntax. These arrays are:
- GPRs: General purpose registers
- WDRs: Wide data registers
- CSRs: Control and status registers
- WSRs: Wide special purpose registers
Accesses to these arrays are as unsigned integers.
The instruction descriptions are written to ensure that any value written to a register is representable.
For example, a write to GPRs[2] will always have a non-negative value less than 1 << 32.
Memory accesses are represented as function calls.
This is because the memory can be accessed on either the narrow or the wide side, which isn’t easy to represent with an array syntax.
Memory loads are represented as DMEM.load_u32(addr) and DMEM.load_u256(addr).
Memory stores are represented as DMEM.store_u32(addr, value) and DMEM.store_u256(addr, value).
In all cases, memory values are interpreted as unsigned integers and, as for register accesses, the instruction descriptions are written to ensure that any value stored to memory is representable.
Some instructions can stall for one or more cycles (those instructions that access memory, CSRs or WSRs).
To represent this precisely in both the pseudo-code and the simulator reference model, such instructions execute a yield statement to stall the processor for a cycle.
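One way to read the yield convention is as a Python generator that a driver steps once per cycle. The sketch below is a simplified model under that assumption; two_cycle_insn and run_insn are hypothetical names, not simulator functions.

```python
from typing import Iterator, List


def two_cycle_insn(log: List[str]) -> Iterator[None]:
    """Hypothetical instruction body that stalls for one cycle."""
    log.append("cycle 1: issue request")
    yield None                      # stall: the processor waits one cycle
    log.append("cycle 2: write result")


def run_insn(gen: Iterator[None]) -> int:
    """Step an instruction to completion and return the cycles it took."""
    cycles = 1
    for _ in gen:                   # each yield costs one extra cycle
        cycles += 1
    return cycles


log: List[str] = []
cycles = run_insn(two_cycle_insn(log))
```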
There are a few other helper functions, defined here to avoid having to inline their bodies into each instruction.
def from_2s_complement(n: int) -> int:
    '''Interpret the bits of unsigned integer n as a 32-bit signed integer'''
    assert 0 <= n < (1 << 32)
    return n if n < (1 << 31) else n - (1 << 32)

def to_2s_complement(n: int) -> int:
    '''Interpret the bits of signed integer n as a 32-bit unsigned integer'''
    assert -(1 << 31) <= n < (1 << 31)
    return (1 << 32) + n if n < 0 else n

def logical_byte_shift(value: int, shift_type: int, shift_bytes: int) -> int:
    '''Logical shift value by shift_bytes to the left or right.

    value should be an unsigned 256-bit value. shift_type should be 0 (shift
    left) or 1 (shift right), matching the encoding of the big number
    instructions. shift_bytes should be a non-negative number of bytes to shift
    by.

    Returns an unsigned 256-bit value, truncating on an overflowing left shift.
    '''
    mask256 = (1 << 256) - 1
    assert 0 <= value <= mask256
    assert 0 <= shift_type <= 1
    assert 0 <= shift_bytes
    shift_bits = 8 * shift_bytes
    shifted = value << shift_bits if shift_type == 0 else value >> shift_bits
    return shifted & mask256

def extract_quarter_word(value: int, qwsel: int) -> int:
    '''Extract a 64-bit quarter word from a 256-bit value.'''
    assert 0 <= value < (1 << 256)
    assert 0 <= qwsel <= 3
    return (value >> (qwsel * 64)) & ((1 << 64) - 1)
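As a quick check of how quarter word selection behaves, the following sketch splits a 256-bit value into its four quarter words and glues them back together. The helper is repeated so that the example runs on its own; the test values are arbitrary.

```python
def extract_quarter_word(value: int, qwsel: int) -> int:
    # Copied from the helper above so that this example is self-contained
    assert 0 <= value < (1 << 256)
    assert 0 <= qwsel <= 3
    return (value >> (qwsel * 64)) & ((1 << 64) - 1)


# Build a 256-bit value whose quarter words are 1, 2, 3 and 4 (lowest first)
value = sum((i + 1) << (64 * i) for i in range(4))
quarters = [extract_quarter_word(value, i) for i in range(4)]

# Gluing the quarter words back together recovers the original value
rebuilt = sum(q << (64 * i) for i, q in enumerate(quarters))
```

Quarter word 0 is the least significant 64 bits, so qwsel counts upwards from the bottom of the value.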
Errors
OTBN can detect various errors when it is operating.
For details about OTBN’s approach to error handling, see the Errors section of the Technical Specification.
The instruction descriptions below describe any software errors that executing the instruction can cause.
These errors are listed explicitly and also appear in the pseudo-code description, where the code sets a bit in the ERR_BITS register with a call to state.stop_at_end_of_cycle().
Other errors are possible at runtime.
Specifically, any instruction that reads from a GPR or WDR might detect a register integrity error.
In this case, OTBN will set the REG_INTG_VIOLATION bit.
Similarly, an instruction that loads from memory might detect a DMEM integrity error.
In this case, OTBN will set the DMEM_INTG_VIOLATION bit.
TODO: Specify interactions between these fatal errors and any other errors. In particular, how do they interact with instructions that could cause other errors as well?
Base Instruction Subset
The base instruction set of OTBN is a limited 32b instruction set. It is used together with the 32b wide General Purpose Register file. The primary use of the base instruction set is the control flow in applications.
The base instruction set is an extended subset of RISC-V’s RV32I_Zicsr. Refer to the RISC-V Unprivileged Specification for a detailed instruction specification. Not all RV32 instructions are implemented. The implemented subset is shown below.
ADD
Add.
This instruction is defined in the RV32I instruction set.
Errors
ADD might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
ADD <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
ADD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 0 | 0 | 0 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = (val1 + val2) & ((1 << 32) - 1)
GPRs[grd] ⇐ result
ADDI
Add Immediate.
This instruction is defined in the RV32I instruction set.
Errors
ADDI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
ADDI <grd>, <grs1>, <imm>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
ADDI | IMM | GRS1 | 0 | 0 | 0 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = (val1 + imm) & ((1 << 32) - 1)
GPRs[grd] ⇐ result
LUI
Load Upper Immediate.
This instruction is defined in the RV32I instruction set.
Errors
LUI might cause the following software errors:
- A CALL_STACK error from using x1 as grd when the call stack is full.
Syntax
LUI <grd>, <imm>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
LUI | IMM | GRD | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
Operation
GPRs[grd] ⇐ imm << 12
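A common use of LUI is to build a full 32-bit constant together with ADDI. The sketch below shows one way to split a constant for that pair. It assumes, as in RV32I, a 20-bit LUI immediate and a signed 12-bit ADDI immediate; the function names are hypothetical.

```python
MASK32 = (1 << 32) - 1


def split_constant(value: int) -> tuple:
    """Split a 32-bit constant into (lui_imm, addi_imm).

    Assumes a 20-bit LUI immediate and a signed 12-bit ADDI immediate.
    """
    assert 0 <= value <= MASK32
    lo = value & 0xfff
    if lo >= 0x800:
        lo -= 0x1000            # ADDI immediate is signed, so borrow from the upper part
    hi = ((value - lo) >> 12) & 0xfffff
    return hi, lo


def lui_addi(hi: int, lo: int) -> int:
    """Model `LUI x, hi` followed by `ADDI x, x, lo` using the Operation sections."""
    reg = (hi << 12) & MASK32   # LUI: GPRs[grd] ⇐ imm << 12
    return (reg + lo) & MASK32  # ADDI: (val1 + imm) & ((1 << 32) - 1)


hi, lo = split_constant(0xdeadbeef)
rebuilt = lui_addi(hi, lo)
```

The adjustment of the upper part is needed whenever bit 11 of the constant is set, because the ADDI immediate is then negative.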
SUB
Subtract.
This instruction is defined in the RV32I instruction set.
Errors
SUB might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
SUB <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SUB | 0 | 1 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 0 | 0 | 0 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = (val1 - val2) & ((1 << 32) - 1)
GPRs[grd] ⇐ result
SLL
Logical left shift.
This instruction is defined in the RV32I instruction set.
Errors
SLL might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
SLL <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SLL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 0 | 0 | 1 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2] & 0x1f
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = (val1 << val2) & ((1 << 32) - 1)
GPRs[grd] ⇐ result
SLLI
Logical left shift with Immediate.
This instruction is defined in the RV32I instruction set.
Errors
SLLI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
SLLI <grd>, <grs1>, <shamt>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SLLI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | SHAMT | GRS1 | 0 | 0 | 1 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = (val1 << shamt) & ((1 << 32) - 1)
GPRs[grd] ⇐ result
SRL
Logical right shift.
This instruction is defined in the RV32I instruction set.
Errors
SRL might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
SRL <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SRL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 1 | 0 | 1 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2] & 0x1f
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 >> val2
GPRs[grd] ⇐ result
SRLI
Logical right shift with Immediate.
This instruction is defined in the RV32I instruction set.
Errors
SRLI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
SRLI <grd>, <grs1>, <shamt>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SRLI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | SHAMT | GRS1 | 1 | 0 | 1 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 >> shamt
GPRs[grd] ⇐ result
SRA
Arithmetic right shift.
This instruction is defined in the RV32I instruction set.
Errors
SRA might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
SRA <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SRA | 0 | 1 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 1 | 0 | 1 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = from_2s_complement(GPRs[grs1])
val2 = GPRs[grs2] & 0x1f
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 >> val2
GPRs[grd] ⇐ to_2s_complement(result)
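The difference between SRL and SRA only shows up when the top bit of the source value is set. The sketch below applies both Operation sections to the same input; the two conversion helpers are repeated from earlier in this document so that the example runs on its own, and the wrappers srl and sra are hypothetical names.

```python
def from_2s_complement(n: int) -> int:
    assert 0 <= n < (1 << 32)
    return n if n < (1 << 31) else n - (1 << 32)


def to_2s_complement(n: int) -> int:
    assert -(1 << 31) <= n < (1 << 31)
    return (1 << 32) + n if n < 0 else n


def srl(val: int, shamt: int) -> int:
    # Logical shift: zeros enter from the top
    return val >> (shamt & 0x1f)


def sra(val: int, shamt: int) -> int:
    # Arithmetic shift: copies of the sign bit enter from the top
    return to_2s_complement(from_2s_complement(val) >> (shamt & 0x1f))


x = 0x80000010          # top bit set: negative when read as signed
logical = srl(x, 4)     # 0x08000001
arithmetic = sra(x, 4)  # 0xf8000001
```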
SRAI
Arithmetic right shift with Immediate.
This instruction is defined in the RV32I instruction set.
Errors
SRAI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
SRAI <grd>, <grs1>, <shamt>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SRAI | 0 | 1 | 0 | 0 | 0 | 0 | 0 | SHAMT | GRS1 | 1 | 0 | 1 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = from_2s_complement(GPRs[grs1])
val2 = shamt
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 >> val2
GPRs[grd] ⇐ to_2s_complement(result)
AND
Bitwise AND.
This instruction is defined in the RV32I instruction set.
Errors
AND might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
AND <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
AND | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 1 | 1 | 1 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 & val2
GPRs[grd] ⇐ result
ANDI
Bitwise AND with Immediate.
This instruction is defined in the RV32I instruction set.
Errors
ANDI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
ANDI <grd>, <grs1>, <imm>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
ANDI | IMM | GRS1 | 1 | 1 | 1 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = to_2s_complement(imm)
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 & val2
GPRs[grd] ⇐ result
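Because the immediate passes through to_2s_complement before the AND, a negative immediate acts as a mask with its upper bits set. The sketch below illustrates this, assuming the immediate is a signed 12-bit value as in RV32I; the wrapper andi is a hypothetical name.

```python
def to_2s_complement(n: int) -> int:
    # Copied from the helper earlier in this document
    assert -(1 << 31) <= n < (1 << 31)
    return (1 << 32) + n if n < 0 else n


def andi(val1: int, imm: int) -> int:
    """Model of the ANDI Operation section (imm assumed to be signed 12-bit)."""
    assert -2048 <= imm <= 2047
    return val1 & to_2s_complement(imm)


rounded_down = andi(0x1234567f, -16)  # -16 becomes 0xfffffff0: clears the low 4 bits
low_nibble = andi(0x1234567f, 0xf)    # a positive immediate only keeps low bits
```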
OR
Bitwise OR.
This instruction is defined in the RV32I instruction set.
Errors
OR might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
OR <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
OR | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 1 | 1 | 0 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 | val2
GPRs[grd] ⇐ result
ORI
Bitwise OR with Immediate.
This instruction is defined in the RV32I instruction set.
Errors
ORI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
ORI <grd>, <grs1>, <imm>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
ORI | IMM | GRS1 | 1 | 1 | 0 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = to_2s_complement(imm)
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 | val2
GPRs[grd] ⇐ result
XOR
Bitwise XOR.
This instruction is defined in the RV32I instruction set.
Errors
XOR might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and neither grs1 nor grs2 is x1.
Syntax
XOR <grd>, <grs1>, <grs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
XOR | 0 | 0 | 0 | 0 | 0 | 0 | 0 | GRS2 | GRS1 | 1 | 0 | 0 | GRD | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 ^ val2
GPRs[grd] ⇐ result
XORI
Bitwise XOR with Immediate.
This instruction is defined in the RV32I instruction set.
Errors
XORI might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
XORI <grd>, <grs1>, <imm>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
XORI | IMM | GRS1 | 1 | 0 | 0 | GRD | 0 | 0 | 1 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = to_2s_complement(imm)
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
result = val1 ^ val2
GPRs[grd] ⇐ result
LW
Load Word.
Loads a 32b word from address offset + grs1 in data memory, writing the result to grd.
Unaligned loads are not supported.
Any address that is unaligned or is above the top of memory will result in an error (setting bit bad_data_addr in ERR_BITS).
This instruction takes 2 cycles.
This instruction is defined in the RV32I instruction set.
Errors
LW might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A BAD_DATA_ADDR error if the computed address is not a valid 4-byte aligned DMEM address.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
Syntax
LW <grd>, <offset>(<grs1>)
Operands
Operand | Description |
---|---|
|
Decode as |
|
Valid range: Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
LW | OFF | GRS1 | 0 | 1 | 0 | GRD | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
Operation
# LW executes over two cycles. On the first cycle, we read the base
# address, compute the load address and check it for correctness, then
# perform the load itself, returning the result.
#
# On the second cycle, we write the result to the destination register.
base = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return None
addr = (base + offset) & ((1 << 32) - 1)
if not DMEM.is_valid_32b_addr(addr):
    state.stop_at_end_of_cycle(ErrBits.BAD_DATA_ADDR)
    return None
result = DMEM.load_u32(addr)
# Stall for a single cycle for memory to respond
yield None
if result is None:
    state.stop_at_end_of_cycle(ErrBits.DMEM_INTG_VIOLATION)
    return None
GPRs[grd] ⇐ result
return None
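The address check in LW can be sketched as follows. The body of is_valid_32b_addr is not given in this document, so the version here is a hypothetical stand-in, and the DMEM size is an arbitrary value chosen for illustration.

```python
DMEM_SIZE_BYTES = 4096   # assumed size, for illustration only


def is_valid_32b_addr(addr: int) -> bool:
    """Hypothetical stand-in for DMEM.is_valid_32b_addr."""
    return addr % 4 == 0 and 0 <= addr <= DMEM_SIZE_BYTES - 4


def lw_address(base: int, offset: int):
    """Return the load address, or None where LW would raise BAD_DATA_ADDR."""
    addr = (base + offset) & ((1 << 32) - 1)
    return addr if is_valid_32b_addr(addr) else None


ok = lw_address(0x100, 8)        # aligned and in range
unaligned = lw_address(0x100, 2) # not a multiple of 4
wrapped = lw_address(0, -4)      # wraps to 0xfffffffc, above the top of memory
```

Note that the address is truncated to 32 bits before it is checked, so a negative offset from a small base wraps to a large address rather than going below zero.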
SW
Store Word.
Stores a 32b word in grs2 to address offset + grs1 in data memory.
Unaligned stores are not supported.
Any address that is unaligned or is above the top of memory will result in an error (setting bit bad_data_addr in ERR_BITS).
This instruction is defined in the RV32I instruction set.
Errors
SW might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A BAD_DATA_ADDR error if the computed address is not a valid 4-byte aligned DMEM address.
Syntax
SW <grs2>, <offset>(<grs1>)
Operands
Operand | Description |
---|---|
|
Decode as |
|
Valid range: Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
SW | OFF_1 | GRS2 | GRS1 | 0 | 1 | 0 | OFF_0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 |
Operation
base = GPRs[grs1]
addr = (base + offset) & ((1 << 32) - 1)
value = GPRs[grs2]
bad_grs1 = state.gprs.call_stack_err and (grs1 == 1)
saw_err = False
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    saw_err = True
if not DMEM.is_valid_32b_addr(addr) and not bad_grs1:
    state.stop_at_end_of_cycle(ErrBits.BAD_DATA_ADDR)
    saw_err = True
if saw_err:
    return
DMEM.store_u32(addr, value)
BEQ
Branch Equal.
This instruction is defined in the RV32I instruction set.
Errors
BEQ might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A BAD_INSN_ADDR error if the branch is taken and the computed address is not a valid PC.
- A LOOP error if this instruction appears as the last instruction of a loop body.
Syntax
BEQ <grs1>, <grs2>, <offset>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BEQ | OFF_3 | OFF_1 | GRS2 | GRS1 | 0 | 0 | 0 | OFF_0 | OFF_2 | 1 | 1 | 0 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
tgt_pc = offset & ((1 << 32) - 1)
if val1 == val2:
    if not state.is_pc_valid(tgt_pc):
        state.stop_at_end_of_cycle(ErrBits.BAD_INSN_ADDR)
    else:
        PC ⇐ tgt_pc
BNE
Branch Not Equal.
This instruction is defined in the RV32I instruction set.
Errors
BNE might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- A BAD_INSN_ADDR error if the branch is taken and the computed address is not a valid PC.
- A LOOP error if this instruction appears as the last instruction of a loop body.
Syntax
BNE <grs1>, <grs2>, <offset>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BNE | OFF_3 | OFF_1 | GRS2 | GRS1 | 0 | 0 | 1 | OFF_0 | OFF_2 | 1 | 1 | 0 | 0 | 0 | 1 | 1 |
Operation
val1 = GPRs[grs1]
val2 = GPRs[grs2]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
tgt_pc = offset & ((1 << 32) - 1)
if val1 != val2:
    if not state.is_pc_valid(tgt_pc):
        state.stop_at_end_of_cycle(ErrBits.BAD_INSN_ADDR)
    else:
        PC ⇐ tgt_pc
JAL
Jump And Link.
The JAL instruction has the same behavior as in RV32I, jumping by the given offset and writing PC+4 as a link address to the destination register.
OTBN has a hardware managed call stack, accessed through x1, which should be used when calling subroutines.
Do so by using x1 as the link register: jal x1, <offset>.
This instruction is defined in the RV32I instruction set.
Errors
JAL might cause the following software errors:
- A CALL_STACK error from using x1 as grd when the call stack is full.
- A BAD_INSN_ADDR error if the computed address is not a valid PC.
- A LOOP error if this instruction appears as the last instruction of a loop body.
Syntax
JAL <grd>, <offset>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
JAL | OFF_3 | OFF_0 | OFF_1 | OFF_2 | GRD | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
Operation
mask32 = ((1 << 32) - 1)
link_pc = (state.pc + 4) & mask32
GPRs[grd] ⇐ link_pc
next_pc = offset & mask32
if not state.is_pc_valid(next_pc):
    state.stop_at_end_of_cycle(ErrBits.BAD_INSN_ADDR)
else:
    PC ⇐ next_pc
JALR
Jump And Link Register.
The JALR instruction has the same behavior as in RV32I, jumping to <grs1> + <offset> and writing PC+4 as a link address to the destination register.
OTBN has a hardware managed call stack, accessed through x1, which should be used when calling and returning from subroutines.
To return from a subroutine, use jalr x0, x1, 0.
This pops a link address from the call stack and branches to it.
To call a subroutine through a function pointer, use jalr x1, <grs1>, 0.
This jumps to the address in <grs1> and pushes the link address onto the call stack.
This instruction is defined in the RV32I instruction set.
Errors
JALR might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
- A BAD_INSN_ADDR error if the computed address is not a valid PC.
- A LOOP error if this instruction appears as the last instruction of a loop body.
Syntax
JALR <grd>, <grs1>, <offset>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
JALR | OFFSET | GRS1 | 0 | 0 | 0 | GRD | 1 | 1 | 0 | 0 | 1 | 1 | 1 |
Operation
val1 = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
mask32 = ((1 << 32) - 1)
link_pc = (state.pc + 4) & mask32
GPRs[grd] ⇐ link_pc
next_pc = (val1 + offset) & mask32
if not state.is_pc_valid(next_pc):
    state.stop_at_end_of_cycle(ErrBits.BAD_INSN_ADDR)
else:
    PC ⇐ next_pc
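The call and return idiom built on x1 can be pictured with a small stack model: a write to x1 pushes a link address and a read from x1 pops one. This is a sketch only. The class names are hypothetical, and the depth of 8 is an assumption here; the Technical Specification defines the real call stack.

```python
class CallStackError(Exception):
    """Stands in for the CALL_STACK bit being set in ERR_BITS."""


class CallStack:
    def __init__(self, depth: int = 8) -> None:   # depth is an assumption
        self.depth = depth
        self.entries = []

    def pop(self) -> int:
        # Reading x1 pops a link address
        if not self.entries:
            raise CallStackError("read of x1 with an empty call stack")
        return self.entries.pop()

    def push(self, link_pc: int) -> None:
        # Writing x1 pushes a link address
        if len(self.entries) >= self.depth:
            raise CallStackError("write of x1 with a full call stack")
        self.entries.append(link_pc)


stack = CallStack()
pc = 0x40
stack.push(pc + 4)       # jal x1, <offset>: call a subroutine
return_pc = stack.pop()  # jalr x0, x1, 0: return to the caller
```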
CSRRS
Atomic Read and Set bits in CSR.
Reads the value of the CSR csr, and writes it to the destination GPR grd.
The initial value in grs1 is treated as a bit mask that specifies bits to be set in the CSR.
Any bit that is high in grs1 will cause the corresponding bit to be set in the CSR, if that CSR bit is writable.
Other bits in the CSR are unaffected (though CSRs might have side effects when written).
If csr isn’t the index of a valid CSR, this results in an error (setting bit illegal_insn in ERR_BITS).
This instruction is defined in the RV32I instruction set.
Errors
CSRRS might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- An ILLEGAL_INSN error if csr doesn’t name a valid CSR.
Syntax
CSRRS <grd>, <csr>, <grs1>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Valid range: Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
CSRRS | CSR | GRS1 | 0 | 1 | 0 | GRD | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
if not state.csrs.check_idx(csr):
    # Invalid CSR index. Stop with an illegal instruction error.
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    return None
bits_to_set = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return None
if csr == 0xfc0:
    # A read from RND. If a RND value is not available, request_value()
    # initiates or continues an EDN request and returns False. If a RND
    # value is available, it returns True.
    while not state.wsrs.RND.request_value():
        # There's a pending EDN request. Stall for a cycle.
        yield None
# At this point, the CSR is ready. Read it, write the old value to grd and,
# unless grs1 is x0, write the updated value back to the CSR.
old_val = CSRs[csr]
new_val = old_val | bits_to_set
GPRs[grd] ⇐ old_val
if grs1 != 0:
    CSRs[csr] ⇐ new_val
return None
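Leaving aside stalls and error handling, the read-and-set step can be reduced to a pure function. The sketch below assumes a CSR in which every bit is writable; the name csrrs_values is hypothetical.

```python
def csrrs_values(csr_val: int, mask: int, grs1_idx: int) -> tuple:
    """Return (value written to grd, CSR value afterwards).

    Assumes every bit of the CSR is writable.
    """
    old_val = csr_val
    new_val = old_val | mask
    # As in the Operation section, the CSR is only written when grs1 is not x0
    return old_val, (new_val if grs1_idx != 0 else old_val)


read_back, updated = csrrs_values(0b0101, 0b0011, grs1_idx=5)
plain_read = csrrs_values(0b0101, 0, grs1_idx=0)   # csrrs grd, csr, x0
```

Using x0 as grs1 gives a plain CSR read with no write to the CSR.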
CSRRW
Atomic Read/Write CSR.
Atomically swaps values in the CSR csr with the value in the GPR grs1.
Reads the old value of the CSR, and writes it to the GPR grd.
Writes the initial value in grs1 to the CSR csr.
If grd == x0 the instruction does not read the CSR or cause any read-related side-effects.
If csr isn’t the index of a valid CSR, this results in an error (setting bit illegal_insn in ERR_BITS).
This instruction is defined in the RV32I instruction set.
Errors
CSRRW might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 when the call stack is empty.
- A CALL_STACK error from using x1 as grd when the call stack is full and grs1 is not x1.
- An ILLEGAL_INSN error if csr doesn’t name a valid CSR.
Syntax
CSRRW <grd>, <csr>, <grs1>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Valid range: Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
CSRRW | CSR | GRS1 | 0 | 0 | 1 | GRD | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
if not state.csrs.check_idx(csr):
    # Invalid CSR index. Stop with an illegal instruction error.
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    return None
new_val = GPRs[grs1]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return None
if csr == 0xfc0 and grd != 0:
    # A read from RND. If a RND value is not available, request_value()
    # initiates or continues an EDN request and returns False. If a RND
    # value is available, it returns True.
    while not state.wsrs.RND.request_value():
        # There's a pending EDN request. Stall for a cycle.
        yield None
# At this point, the CSR is either ready or unneeded. Read it if
# necessary and write to grd, then overwrite with new_val.
if grd != 0:
    old_val = CSRs[csr]
    GPRs[grd] ⇐ old_val
CSRs[csr] ⇐ new_val
return None
ECALL
Environment Call.
Triggers the done interrupt to indicate completion of the operation.
This instruction is defined in the RV32I instruction set.
Errors
ECALL cannot cause any software errors.
Syntax
ECALL
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
ECALL | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 |
Operation
# Set INTR_STATE.done and STATUS, reflecting the fact we've stopped.
state.stop_at_end_of_cycle(err_bits=0)
LOOP
Loop (indirect).
Repeats a sequence of code multiple times.
The number of iterations is read from grs, treated as an unsigned value.
The number of instructions in the loop is given in the bodysize immediate.
The LOOP instruction doesn’t support a zero iteration count.
If the value in grs is zero, OTBN stops, setting bit loop in ERR_BITS.
Starting a loop pushes an entry on to the loop stack.
If the stack is already full, OTBN stops, setting bit loop in ERR_BITS.
LOOP, LOOPI, jump and branch instructions are all permitted inside a loop but may not appear as the last instruction in a loop.
OTBN will stop on that instruction, setting bit loop in ERR_BITS.
For more information on how to correctly use LOOP see loop nesting.
Errors
LOOP might cause the following software errors:
- A CALL_STACK error from using x1 as grs when the call stack is empty.
- A LOOP error if the value in grs is zero.
- A LOOP error if this instruction appears as the last instruction of a loop body.
Syntax
LOOP <grs>, <bodysize>
Operands
Operand | Description |
---|---|
|
Name of the GPR containing the number of iterations Decode as |
|
Number of instructions in the loop body Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
LOOP | SZ | GRS | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
num_iters = GPRs[grs]
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    return
if num_iters == 0:
    state.stop_at_end_of_cycle(ErrBits.LOOP)
else:
    state.loop_start(num_iters, bodysize)
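The two start-of-loop error conditions described above (zero iteration count, full loop stack) can be modelled in a few lines. This is only a sketch: LoopStack, LoopError and the max_depth parameter are hypothetical names, the zero check is folded into loop_start for brevity, and the real loop stack depth is not stated in this section.

```python
class LoopError(Exception):
    """Stands in for OTBN stopping with the loop bit set in ERR_BITS."""


class LoopStack:
    def __init__(self, max_depth):
        self.max_depth = max_depth  # hypothetical parameter, not specified here
        self.entries = []           # one (iterations, bodysize) pair per active loop

    def loop_start(self, num_iters, bodysize):
        if num_iters == 0:
            raise LoopError("zero iteration count")
        if len(self.entries) >= self.max_depth:
            raise LoopError("loop stack is full")
        self.entries.append((num_iters, bodysize))
```

Nesting one loop inside another pushes a second entry, so the depth chosen for max_depth bounds how deeply LOOP and LOOPI may be nested in this model.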
LOOPI
Loop Immediate.
Repeats a sequence of code multiple times.
The number of iterations is given in the iterations immediate.
The number of instructions in the loop is given in the bodysize immediate.
The LOOPI instruction doesn’t support a zero iteration count.
If the value of iterations is zero, OTBN stops, setting bit loop in ERR_BITS.
Starting a loop pushes an entry on to the loop stack.
If the stack is already full, OTBN stops, setting bit loop in ERR_BITS.
LOOP, LOOPI, jump and branch instructions are all permitted inside a loop but may not appear as the last instruction in a loop.
OTBN will stop on that instruction, setting bit loop in ERR_BITS.
For more information on how to correctly use LOOPI see loop nesting.
Errors
LOOPI might cause the following software errors:
- A LOOP error if iterations is zero.
- A LOOP error if this instruction appears as the last instruction of a loop body.
Syntax
LOOPI <iterations>, <bodysize>
Operands
Operand | Description |
---|---|
|
Number of iterations Valid range: Decode as |
|
Number of instructions in the loop body Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
LOOPI | SZ | ITERATIONS_1 | 0 | 0 | 1 | ITERATIONS_0 | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
if iterations == 0:
    state.stop_at_end_of_cycle(ErrBits.LOOP)
else:
    state.loop_start(iterations, bodysize)
NOP
No Operation. A pseudo-operation that has no effect.
This instruction is defined in the RV32I instruction set.
Syntax
NOP
This instruction is a pseudo-operation and expands to the following instruction sequence:
ADDI x0, x0, 0
LI
Load Immediate. Loads a 32b signed immediate value into a GPR. This uses ADDI and LUI, expanding to one or two instructions, depending on the immediate (small non-negative immediates or immediates with all lower bits zero can be loaded with just ADDI or LUI, respectively; general immediates need a LUI followed by an ADDI).
This instruction is defined in the RV32I instruction set.
Syntax
LI <grd>, <imm>
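The one-or-two instruction expansion can be illustrated with the usual RISC-V split of a 32-bit value into an upper 20-bit part and a sign-extended lower 12-bit part. The sketch below assumes RV32I semantics for LUI and ADDI; expand_li is a hypothetical helper, and the actual assembler may pick a different sequence (for example, this sketch also uses a lone ADDI for small negative values).

```python
def expand_li(rd, imm):
    """Return an illustrative LUI/ADDI expansion for `LI rd, imm`."""
    if not -(1 << 31) <= imm < (1 << 32):
        raise ValueError("immediate does not fit in 32 bits")
    imm &= 0xFFFFFFFF                    # work on the 32-bit pattern
    lo = imm & 0xFFF
    if lo >= 0x800:                      # ADDI sign-extends its 12-bit immediate
        lo -= 0x1000
    hi = ((imm - lo) >> 12) & 0xFFFFF    # upper 20 bits, compensated for a negative lo
    if hi == 0:
        return [f"ADDI {rd}, x0, {lo}"]
    if lo == 0:
        return [f"LUI {rd}, {hi:#x}"]
    return [f"LUI {rd}, {hi:#x}", f"ADDI {rd}, {rd}, {lo}"]
```

The compensation step matters when bit 11 of the immediate is set: the ADDI then adds a negative value, so the LUI part has to be one larger than the plain upper 20 bits.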
LA
Load absolute address. Loads an address given by a symbol into a GPR. This is represented as a LUI and an ADDI.
This instruction is defined in the RV32I instruction set.
Syntax
LA <grd>, <imm>
RET
Return from subroutine.
This instruction is defined in the RV32I instruction set.
Syntax
RET
This instruction is a pseudo-operation and expands to the following instruction sequence:
JALR x0, x1, 0
UNIMP
Illegal instruction. Triggers an illegal instruction error and aborts the program execution. Commonly used in code which is meant to be unreachable.
This instruction is defined in the RV32I instruction set.
Syntax
UNIMP
This instruction is a pseudo-operation and expands to the following instruction sequence:
CSRRW x0, 0xC00, x0
Big Number Instruction Subset
All Big Number (BN) instructions operate on the Wide Data Registers (WDRs).
BN.ADD
Add. Adds two WDR values, writes the result to the destination WDR and updates flags. The content of the second source WDR can be shifted by an unsigned immediate before it is consumed by the operation.
Errors
BN.ADD cannot cause any software errors.
Syntax
BN.ADD <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.ADD | FG | ST | SB | WRS2 | WRS1 | 0 | 0 | 0 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
full_result = a + b_shifted
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
WDRs[wrd] ⇐ masked_result
FLAGs[flag_group] ⇐ flags
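The masking and carry extraction in the listing can be tried out directly in Python. The sketch below leaves out the optional shift and assumes that mlz_for_result derives M from the top bit, L from the bottom bit and Z from an all-zero result; that reading is inferred from the flag names and is not confirmed by this section. bn_add is a hypothetical helper name.

```python
MASK256 = (1 << 256) - 1


def bn_add(a, b, carry_in=0):
    """Model a 256-bit add; returns (result, flags)."""
    full = a + b + carry_in
    result = full & MASK256
    flags = {
        "C": bool((full >> 256) & 1),    # carry out of bit 255
        "M": bool((result >> 255) & 1),  # assumed: most significant bit
        "L": bool(result & 1),           # assumed: least significant bit
        "Z": result == 0,                # assumed: result is zero
    }
    return result, flags
```

Adding 1 to the all-ones value wraps to zero, which sets both C and Z in this model.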
BN.ADDC
Add with Carry. Adds two WDR values and the Carry flag value, writes the result to the destination WDR, and updates the flags. The content of the second source WDR can be shifted by an unsigned immediate before it is consumed by the operation.
Errors
BN.ADDC cannot cause any software errors.
Syntax
BN.ADDC <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.ADDC | FG | ST | SB | WRS2 | WRS1 | 0 | 1 | 0 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
carry = int(FLAGs[flag_group].C)
full_result = a + b_shifted + carry
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
WDRs[wrd] ⇐ masked_result
FLAGs[flag_group] ⇐ flags
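Reading the Carry flag as an input is what allows additions wider than one WDR to be chained. The following sketch models a 512-bit addition as one carry-less add on the low words followed by one add-with-carry on the high words; add_with_carry and add512 are hypothetical helpers, and the shift operand is left out.

```python
MASK256 = (1 << 256) - 1


def add_with_carry(a, b, carry_in):
    """256-bit add returning (result, carry_out)."""
    full = a + b + carry_in
    return full & MASK256, (full >> 256) & 1


def add512(x, y):
    """Add two 512-bit values held as pairs of 256-bit words."""
    lo, carry = add_with_carry(x & MASK256, y & MASK256, 0)   # like BN.ADD
    hi, carry = add_with_carry(x >> 256, y >> 256, carry)     # like BN.ADDC
    return (hi << 256) | lo, carry
```

Both steps must use the same flag group, since the second step reads the Carry flag that the first one wrote.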
BN.ADDI
Add Immediate. Adds a zero-extended unsigned immediate to the value of a WDR, writes the result to the destination WDR, and updates the flags.
Errors
BN.ADDI cannot cause any software errors.
Syntax
BN.ADDI <wrd>, <wrs>, <imm>[, FG<flag_group>]
Operands
Operand | Description |
---|---|
|
Name of the destination WDR Decode as |
|
Name of the source WDR Decode as |
|
Immediate value Valid range: Decode as |
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.ADDI | FG | 0 | IMM | WRS | 1 | 0 | 0 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
a = WDRs[wrs]
b = imm
full_result = a + b
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
WDRs[wrd] ⇐ masked_result
FLAGs[flag_group] ⇐ flags
BN.ADDM
Pseudo-Modulo Add. Add two WDR values, modulo the MOD WSR.
The values in <wrs1> and <wrs2> are summed to get an intermediate result (of width WLEN + 1).
If this result is greater than or equal to MOD then MOD is subtracted from it.
The result is then truncated to 256 bits and stored in <wrd>.
This operation correctly implements addition modulo MOD, providing that the intermediate result is less than 2 * MOD.
The intermediate result is small enough if both inputs are less than MOD.
Flags are not used or saved.
Errors
BN.ADDM cannot cause any software errors.
Syntax
BN.ADDM <wrd>, <wrs1>, <wrs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.ADDM | 0 | WRS2 | WRS1 | 1 | 0 | 1 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
a = WDRs[wrs1]
b = WDRs[wrs2]
result = a + b
mod_val = MOD
if result >= mod_val:
    result -= mod_val
result = result & ((1 << 256) - 1)
WDRs[wrd] ⇐ result
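The "pseudo" in the name reflects that only a single conditional subtraction is performed. The sketch below mirrors the listing as a plain function (bn_addm is a hypothetical name) so that the precondition on the inputs can be checked by hand; the modulus in the usage note is just an example value.

```python
MASK256 = (1 << 256) - 1


def bn_addm(a, b, mod):
    """One conditional subtraction of mod, then truncate to 256 bits."""
    result = a + b
    if result >= mod:
        result -= mod
    return result & MASK256
```

With mod = 2**255 - 19, reduced inputs such as a = mod - 1 and b = 5 give 4, matching (a + b) % mod. With an unreduced input such as a = 2 * mod, the single subtraction is not enough and the result differs from (a + b) % mod.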
BN.MULQACC
Quarter-word Multiply and Accumulate.
Multiplies two WLEN/4 WDR values, shifts the product by acc_shift_imm bits, and adds the result to the accumulator.
For versions of the instruction with writeback, see BN.MULQACC.WO and BN.MULQACC.SO.
Errors
BN.MULQACC cannot cause any software errors.
Syntax
BN.MULQACC[<zero_acc>] <wrs1>.<wrs1_qwsel>, <wrs2>.<wrs2_qwsel>, <acc_shift_imm>
Operands
Operand | Description |
---|---|
|
Zero the accumulator before accumulating the multiply result. To specify, use the literal syntax Decode as |
|
First source WDR Decode as |
|
Quarter-word select for Valid values:
Valid range: Decode as |
|
Second source WDR Decode as |
|
Quarter-word select for Valid values:
Valid range: Decode as |
|
The number of bits to shift the Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.MULQACC | 0 | 0 | Q2 | Q1 | WRS2 | WRS1 | SHIFT | ZA | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand zero_acc is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
a_qw = extract_quarter_word(a, wrs1_qwsel)
b_qw = extract_quarter_word(b, wrs2_qwsel)
mul_res = a_qw * b_qw
acc = ACC
if zero_acc:
    acc = 0
acc += (mul_res << acc_shift_imm)
truncated = acc & ((1 << 256) - 1)
ACC ⇐ truncated
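To see how quarter-word products combine, the sketch below builds a 128-bit by 128-bit multiplication out of four multiply-accumulate steps. It assumes that extract_quarter_word selects the 64-bit slice starting at bit 64 * qwsel (selector 0 being the least significant quarter); quarter_word, mulqacc and mul128 are hypothetical helper names.

```python
MASK256 = (1 << 256) - 1
MASK64 = (1 << 64) - 1


def quarter_word(value, sel):
    """Assumed behaviour: the 64-bit slice starting at bit 64 * sel."""
    return (value >> (64 * sel)) & MASK64


def mulqacc(acc, a, a_sel, b, b_sel, shift, zero_acc=False):
    """One multiply-accumulate step, following the listing above."""
    if zero_acc:
        acc = 0
    acc += (quarter_word(a, a_sel) * quarter_word(b, b_sel)) << shift
    return acc & MASK256


def mul128(a, b):
    """Schoolbook 128x128-bit product from four quarter-word products."""
    acc = mulqacc(0, a, 0, b, 0, 0, zero_acc=True)  # a0 * b0
    acc = mulqacc(acc, a, 0, b, 1, 64)              # a0 * b1, shifted by 64
    acc = mulqacc(acc, a, 1, b, 0, 64)              # a1 * b0, shifted by 64
    acc = mulqacc(acc, a, 1, b, 1, 128)             # a1 * b1, shifted by 128
    return acc
```

Because a 128-bit by 128-bit product always fits in 256 bits, the truncation in each step never discards anything in this particular sequence.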
BN.MULQACC.WO
Quarter-word Multiply and Accumulate with full-word writeback.
Multiplies two WLEN/4 WDR values, shifts the product by acc_shift_imm bits, and adds the result to the accumulator.
Writes the resulting accumulator to wrd.
Errors
BN.MULQACC.WO cannot cause any software errors.
Syntax
BN.MULQACC.WO[<zero_acc>] <wrd>, <wrs1>.<wrs1_qwsel>, <wrs2>.<wrs2_qwsel>, <acc_shift_imm>[, FG<flag_group>]
Operands
Operand | Description |
---|---|
|
Zero the accumulator before accumulating the multiply result. To specify, use the literal syntax Decode as |
|
Destination WDR. Decode as |
|
First source WDR Decode as |
|
Quarter-word select for Valid values:
Valid range: Decode as |
|
Second source WDR Decode as |
|
Quarter-word select for Valid values:
Valid range: Decode as |
|
The number of bits to shift the Valid range: Decode as |
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.MULQACC.WO | FG | 0 | 1 | Q2 | Q1 | WRS2 | WRS1 | SHIFT | ZA | WRD | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand zero_acc is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
a_qw = extract_quarter_word(a, wrs1_qwsel)
b_qw = extract_quarter_word(b, wrs2_qwsel)
mul_res = a_qw * b_qw
acc = ACC
if zero_acc:
    acc = 0
acc += (mul_res << acc_shift_imm)
truncated = acc & ((1 << 256) - 1)
WDRs[wrd] ⇐ truncated
ACC ⇐ truncated
state.set_mlz_flags(flag_group, truncated)
BN.MULQACC.SO
Quarter-word Multiply and Accumulate with half-word writeback.
Multiplies two WLEN/4 WDR values, shifts the product by acc_shift_imm bits and adds the result to the accumulator.
Next, shifts the resulting accumulator right by half a word (128 bits).
The bits that are shifted out are written to a half-word of wrd, selected with wrd_hwsel.
This instruction never changes the C flag.
If wrd_hwsel is zero (so the instruction is updating the lower half-word of wrd), it updates the L and Z flags and leaves M unchanged.
The L flag is set iff the bottom bit of the shifted-out result is one.
The Z flag is set iff the shifted-out result is zero.
If wrd_hwsel is one (so the instruction is updating the upper half-word of wrd), it updates the M and Z flags and leaves L unchanged.
The M flag is set iff the top bit of the shifted-out result is one.
The Z flag is left unchanged if the shifted-out result is zero and cleared if not.
Errors
BN.MULQACC.SO cannot cause any software errors.
Syntax
BN.MULQACC.SO[<zero_acc>] <wrd>.<wrd_hwsel>, <wrs1>.<wrs1_qwsel>, <wrs2>.<wrs2_qwsel>, <acc_shift_imm>[, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Zero the accumulator before accumulating the multiply result. To specify, use the literal syntax Decode as | ||||||
|
Updated WDR. Decode as | ||||||
|
Half-word select for
Decode as | ||||||
|
First source WDR Decode as | ||||||
|
Quarter-word select for Valid values:
Valid range: Decode as | ||||||
|
Second source WDR Decode as | ||||||
|
Quarter-word select for Valid values:
Valid range: Decode as | ||||||
|
The number of bits to shift the Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.MULQACC.SO | FG | 1 | DH | Q2 | Q1 | WRS2 | WRS1 | SHIFT | ZA | WRD | 0 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operands zero_acc and wrd_hwsel are referred to by their integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
a_qw = extract_quarter_word(a, wrs1_qwsel)
b_qw = extract_quarter_word(b, wrs2_qwsel)
mul_res = a_qw * b_qw
acc = ACC
if zero_acc:
    acc = 0
acc += (mul_res << acc_shift_imm)
truncated = acc & ((1 << 256) - 1)
# Split the result into low and high parts
lo_part = truncated & ((1 << 128) - 1)
hi_part = truncated >> 128
# Shift out the low part of the result
hw_shift = 128 * wrd_hwsel
hw_mask = ((1 << 128) - 1) << hw_shift
old_wrd = WDRs[wrd]
new_wrd = (old_wrd & ~hw_mask) | (lo_part << hw_shift)
WDRs[wrd] ⇐ new_wrd
# Write back the high part of the result
ACC ⇐ hi_part
old_flags = FLAGs[flag_group]
if wrd_hwsel:
    new_flags = FlagReg(C=old_flags.C,
                        M=bool((lo_part >> 127) & 1),
                        L=old_flags.L,
                        Z=old_flags.Z and lo_part == 0)
else:
    new_flags = FlagReg(C=old_flags.C,
                        M=old_flags.M,
                        L=bool(lo_part & 1),
                        Z=lo_part == 0)
FLAGs[flag_group] ⇐ new_flags
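The half-word writeback and the split flag update are easier to follow when isolated from the multiply. The sketch below takes an already accumulated value and performs only the shift-out step; shift_out is a hypothetical helper, and flags are represented as a plain dictionary rather than the simulator's FlagReg.

```python
MASK128 = (1 << 128) - 1


def shift_out(acc, old_wrd, hwsel, old_flags):
    """Return (new_wrd, new_acc, new_flags) for the shift-out step."""
    lo, hi = acc & MASK128, acc >> 128
    shift = 128 * hwsel
    # Replace only the selected half-word of the destination.
    new_wrd = (old_wrd & ~(MASK128 << shift)) | (lo << shift)
    flags = dict(old_flags)                # C is never changed
    if hwsel:
        flags["M"] = bool((lo >> 127) & 1)
        flags["Z"] = old_flags["Z"] and lo == 0
    else:
        flags["L"] = bool(lo & 1)
        flags["Z"] = lo == 0
    return new_wrd, hi, flags
```

Writing the lower half-word first and the upper half-word second, with the same flag group, leaves Z set only if both halves were zero, so the flags end up describing the full 256-bit value in wrd.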
BN.SUB
Subtraction. Subtracts the second WDR value from the first one, writes the result to the destination WDR and updates flags. The content of the second source WDR can be shifted by an unsigned immediate before it is consumed by the operation.
Errors
BN.SUB cannot cause any software errors.
Syntax
BN.SUB <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.SUB | FG | ST | SB | WRS2 | WRS1 | 0 | 0 | 1 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
full_result = a - b_shifted
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
WDRs[wrd] ⇐ masked_result
FLAGs[flag_group] ⇐ flags
BN.SUBB
Subtract with borrow. Subtracts the second WDR value and the Carry from the first one, writes the result to the destination WDR, and updates the flags. The content of the second source WDR can be shifted by an unsigned immediate before it is consumed by the operation.
Errors
BN.SUBB cannot cause any software errors.
Syntax
BN.SUBB <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.SUBB | FG | ST | SB | WRS2 | WRS1 | 0 | 1 | 1 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
borrow = int(FLAGs[flag_group].C)
full_result = a - b_shifted - borrow
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
WDRs[wrd] ⇐ masked_result
FLAGs[flag_group] ⇐ flags
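In the subtraction listings the Carry flag doubles as a borrow: bit 256 of the (possibly negative) full result is set exactly when the subtraction underflowed. The sketch below chains a plain subtract and a subtract-with-borrow to handle 512-bit values; sub_with_borrow and sub512 are hypothetical helpers and the shift operand is left out.

```python
MASK256 = (1 << 256) - 1


def sub_with_borrow(a, b, borrow_in):
    """256-bit subtract returning (result, borrow_out)."""
    full = a - b - borrow_in
    # Python's >> on a negative int is arithmetic, so bit 256 is 1 on underflow.
    return full & MASK256, (full >> 256) & 1


def sub512(x, y):
    """Subtract two 512-bit values held as pairs of 256-bit words."""
    lo, borrow = sub_with_borrow(x & MASK256, y & MASK256, 0)   # like BN.SUB
    hi, borrow = sub_with_borrow(x >> 256, y >> 256, borrow)    # like BN.SUBB
    return (hi << 256) | lo, borrow
```

A final borrow of 1 indicates that y was larger than x and that the returned value has wrapped modulo 2**512.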
BN.SUBI
Subtract Immediate. Subtracts a zero-extended unsigned immediate from the value of a WDR, writes the result to the destination WDR, and updates the flags.
Errors
BN.SUBI cannot cause any software errors.
Syntax
BN.SUBI <wrd>, <wrs>, <imm>[, FG<flag_group>]
Operands
Operand | Description |
---|---|
|
Name of the destination WDR Decode as |
|
Name of the source WDR Decode as |
|
Immediate value Valid range: Decode as |
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.SUBI | FG | 1 | IMM | WRS | 1 | 0 | 0 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
a = WDRs[wrs]
b = imm
full_result = a - b
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
WDRs[wrd] ⇐ masked_result
FLAGs[flag_group] ⇐ flags
BN.SUBM
Pseudo-modulo subtraction.
Subtract <wrs2> from <wrs1>, modulo the MOD WSR.
The intermediate result is treated as a signed number (of width WLEN + 1).
If it is negative, MOD is added to it.
The 2’s-complement result is then truncated to 256 bits and stored in <wrd>.
This operation correctly implements subtraction modulo MOD, provided that the intermediate result is at least -MOD and at most MOD - 1.
This is guaranteed if both inputs are less than MOD.
Flags are not used or saved.
Errors
BN.SUBM cannot cause any software errors.
Syntax
BN.SUBM <wrd>, <wrs1>, <wrs2>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.SUBM | 1 | WRS2 | WRS1 | 1 | 0 | 1 | WRD | 0 | 1 | 0 | 1 | 0 | 1 | 1 |
Operation
a = WDRs[wrs1]
b = WDRs[wrs2]
mod_val = MOD
diff = a - b
if diff < 0:
    diff += mod_val
result = diff & ((1 << 256) - 1)
WDRs[wrd] ⇐ result
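As a quick check of the stated precondition, the sketch below mirrors the listing as a function (bn_subm is a hypothetical name) and can be used, for instance, to negate a reduced value modulo MOD by subtracting it from zero. The modulus in the usage note is only an example.

```python
MASK256 = (1 << 256) - 1


def bn_subm(a, b, mod):
    """One conditional addition of mod, then truncate to 256 bits."""
    diff = a - b
    if diff < 0:
        diff += mod
    return diff & MASK256
```

With mod = 2**255 - 19, bn_subm(0, 5, mod) gives mod - 5. If b is not reduced, for example b = 2 * mod, a single addition of mod leaves the intermediate value negative and the truncated result no longer equals (a - b) % mod.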
BN.AND
Bitwise AND.
Performs a bitwise and operation.
Takes the values stored in registers referenced by wrs1 and wrs2 and stores the result in the register referenced by wrd.
The content of the second source register can be shifted by an immediate before it is consumed by the operation.
The M, L and Z flags in flag group flag_group are updated with the result of the operation.
Errors
BN.AND cannot cause any software errors.
Syntax
BN.AND <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.AND | FG | ST | SB | WRS2 | WRS1 | 0 | 1 | 0 | WRD | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
result = a & b_shifted
WDRs[wrd] ⇐ result
state.set_mlz_flags(flag_group, result)
BN.OR
Bitwise OR.
Performs a bitwise or operation.
Takes the values stored in WDRs referenced by wrs1 and wrs2 and stores the result in the WDR referenced by wrd.
The content of the second source WDR can be shifted by an immediate before it is consumed by the operation.
The M, L and Z flags in flag group flag_group are updated with the result of the operation.
Errors
BN.OR cannot cause any software errors.
Syntax
BN.OR <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.OR | FG | ST | SB | WRS2 | WRS1 | 1 | 0 | 0 | WRD | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
result = a | b_shifted
WDRs[wrd] ⇐ result
state.set_mlz_flags(flag_group, result)
BN.NOT
Bitwise NOT.
Inverts every bit of the value in wrs and stores the result in the register referenced by wrd.
The source value can be shifted by an immediate before it is consumed by the operation.
The M, L and Z flags in flag group flag_group are updated with the result of the operation.
Errors
BN.NOT cannot cause any software errors.
Syntax
BN.NOT <wrd>, <wrs>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.NOT | FG | ST | SB | WRS | 1 | 0 | 1 | WRD | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs]
a_shifted = logical_byte_shift(a, shift_type, shift_bytes)
result = a_shifted ^ ((1 << 256) - 1)
WDRs[wrd] ⇐ result
state.set_mlz_flags(flag_group, result)
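The listing inverts the value by XORing with an all-ones mask rather than using Python's ~ operator. A short sketch shows why: Python integers are unbounded, so ~ yields a negative number unless the result is masked back to 256 bits. bn_not is a hypothetical helper and the optional shift is left out.

```python
MASK256 = (1 << 256) - 1


def bn_not(a):
    """Bitwise inversion restricted to 256 bits."""
    return a ^ MASK256
```

For any 256-bit value a, bn_not(a) equals ~a & MASK256, whereas ~a on its own is -a - 1.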
BN.XOR
Bitwise XOR.
Performs a bitwise xor operation.
Takes the values stored in WDRs referenced by wrs1 and wrs2 and stores the result in the WDR referenced by wrd.
The content of the second source WDR can be shifted by an immediate before it is consumed by the operation.
The M, L and Z flags in flag group flag_group are updated with the result of the operation.
Errors
BN.XOR cannot cause any software errors.
Syntax
BN.XOR <wrd>, <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.XOR | FG | ST | SB | WRS2 | WRS1 | 1 | 1 | 0 | WRD | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
result = a ^ b_shifted
WDRs[wrd] ⇐ result
state.set_mlz_flags(flag_group, result)
BN.RSHI
Concatenate and right shift immediate.
Concatenates the content of WDRs referenced by wrs1 and wrs2 (wrs1 forms the upper part), shifts it right by an immediate value and truncates to WLEN bits.
The result is stored in the WDR referenced by wrd.
Errors
BN.RSHI cannot cause any software errors.
Syntax
BN.RSHI <wrd>, <wrs1>, <wrs2> >> <imm>
Operands
Operand | Description |
---|---|
|
Name of the destination WDR Decode as |
|
Name of the first source WDR Decode as |
|
Name of the second source WDR Decode as |
|
Number of bits to shift the second source register by. Valid range: 0..(WLEN-1). Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.RSHI | IMM_1 | WRS2 | WRS1 | IMM_0 | 1 | 1 | WRD | 1 | 1 | 1 | 1 | 0 | 1 | 1 |
Operation
a = WDRs[wrs1]
b = WDRs[wrs2]
result = (((a << 256) | b) >> imm) & ((1 << 256) - 1)
WDRs[wrd] ⇐ result
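Because the shift acts on a 512-bit concatenation, the same operation can express several 256-bit shifts depending on which sources are used. The sketch below models the listing and derives a rotate, a logical right shift and a logical left shift from it; all function names are hypothetical, and the zero inputs stand for a WDR that holds zero.

```python
MASK256 = (1 << 256) - 1


def bn_rshi(a, b, imm):
    """Concatenate a:b, shift right by imm (0..255), keep the low 256 bits."""
    return (((a << 256) | b) >> imm) & MASK256


def rotate_right(x, n):
    return bn_rshi(x, x, n)        # bits shifted out of x re-enter from the top


def shift_right(x, n):
    return bn_rshi(0, x, n)        # zeros enter from the top


def shift_left(x, n):
    return bn_rshi(x, 0, 256 - n)  # valid for n in 1..255
```

The left shift works because shifting the concatenation x:0 right by 256 - n is the same as shifting x left by n and dropping the bits that leave the 256-bit window.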
BN.SEL
Flag Select. Returns in the destination WDR the value of the first source WDR if the flag in the chosen flag group is set, otherwise returns the value of the second source WDR.
Errors
BN.SEL cannot cause any software errors.
Syntax
BN.SEL <wrd>, <wrs1>, <wrs2>, [FG<flag_group>.]<flag>
Operands
Operand | Description | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
|
Name of the destination WDR Decode as | ||||||||||
|
Name of the first source WDR Decode as | ||||||||||
|
Name of the second source WDR Decode as | ||||||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as | ||||||||||
|
Flag to check. Valid values:
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.SEL | FG | FLAG | WRS2 | WRS1 | 0 | 0 | 0 | WRD | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand flag is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
flag_is_set = FLAGs[flag_group].get_by_idx(flag)
wrs = wrs1 if flag_is_set else wrs2
value = WDRs[wrs]
WDRs[wrd] ⇐ value
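One way to use a flag-driven select is a branch-free conditional subtraction: compute a - mod, then keep the original value if the subtraction borrowed. The sketch below models that pairing of a subtract and a select on the C flag; cond_reduce is a hypothetical helper, and it makes no claim about timing behaviour on real hardware.

```python
MASK256 = (1 << 256) - 1


def cond_reduce(a, mod):
    """Return a - mod if a >= mod, else a, without a data-dependent branch in the modelled code."""
    full = a - mod
    t = full & MASK256                 # like BN.SUB t, a, mod
    c = bool((full >> 256) & 1)        # C is set when the subtraction borrowed
    return a if c else t               # like BN.SEL result, a, t, C
```

The Python conditional expression here stands in for the select itself; in OTBN assembly both candidate values already sit in WDRs and BN.SEL just copies one of them.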
BN.CMP
Compare. Subtracts the second WDR value from the first one and updates flags. This instruction is identical to BN.SUB, except that no result register is written.
Errors
BN.CMP cannot cause any software errors.
Syntax
BN.CMP <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.CMP | FG | ST | SB | WRS2 | WRS1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
full_result = a - b_shifted
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
FLAGs[flag_group] ⇐ flags
BN.CMPB
Compare with Borrow. Subtracts the second WDR value and the Carry from the first one and updates flags. This instruction is identical to BN.SUBB, except that no result register is written.
Errors
BN.CMPB cannot cause any software errors.
Syntax
BN.CMPB <wrs1>, <wrs2>[ <shift_type> <shift_bits>][, FG<flag_group>]
Operands
Operand | Description | ||||||
---|---|---|---|---|---|---|---|
|
Name of the first source WDR Decode as | ||||||
|
Name of the second source WDR Decode as | ||||||
|
The direction of an optional shift applied to
Decode as | ||||||
|
Number of bits by which to shift Valid range: Decode as | ||||||
|
Flag group to use. Defaults to 0. Valid range: Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.CMPB | FG | ST | SB | WRS2 | WRS1 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operand shift_type is referred to by its integer value.
The operand table above shows how this corresponds to assembly syntax.
a = WDRs[wrs1]
b = WDRs[wrs2]
b_shifted = logical_byte_shift(b, shift_type, shift_bytes)
borrow = int(FLAGs[flag_group].C)
full_result = a - b_shifted - borrow
mask256 = (1 << 256) - 1
masked_result = full_result & mask256
carry_flag = bool((full_result >> 256) & 1)
flags = FlagReg.mlz_for_result(carry_flag, masked_result)
FLAGs[flag_group] ⇐ flags
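Because BN.CMPB consumes the C flag as a borrow, a compare can be chained across several 256-bit limbs. The sketch below illustrates this for a 512-bit comparison; it leaves out the optional shift, and the names cmp_step and less_than_512 are hypothetical.

```python
MASK256 = (1 << 256) - 1


def cmp_step(a, b, borrow_in=0):
    """One compare step on 256-bit limbs; returns the resulting C flag (0 or 1).

    borrow_in=0 models BN.CMP, borrow_in=C models BN.CMPB.
    """
    full_result = a - b - borrow_in
    return (full_result >> 256) & 1


def less_than_512(x, y):
    """Compare two 512-bit values limb by limb, low limb first."""
    x_lo, x_hi = x & MASK256, x >> 256
    y_lo, y_hi = y & MASK256, y >> 256
    c = cmp_step(x_lo, y_lo)        # models BN.CMP on the low limbs
    c = cmp_step(x_hi, y_hi, c)     # models BN.CMPB on the high limbs
    return bool(c)
```

After the last step, C is set exactly when the first value is smaller than the second, so a single flag check decides the whole multi-limb comparison.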
BN.LID
Load Word (indirect source, indirect destination). Load a WLEN-bit little-endian value from data memory.
The load address is offset plus the value in the GPR grs1.
The loaded value is stored into the WDR given by the bottom 5 bits of the GPR grd.
After the operation, either the value in the GPR grs1, or the value in grd can be optionally incremented.
Specifying both grd_inc and grs1_inc results in an error (with error code ErrCodeIllegalInsn).
- If grd_inc is set, grd is updated to be *grd + 1.
- If grs1_inc is set, the value in grs1 is incremented by value WLEN/8 (one word).
The memory address must be aligned to WLEN bits.
Any address that is unaligned or is above the top of memory results in an error (setting bit bad_data_addr in ERR_BITS).
Any *grd value greater than 31 before executing the instruction results in an error (setting bit illegal_insn in ERR_BITS) and no load or optional increment occurring.
This instruction takes 2 cycles.
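The rules above can be modelled in a few lines of Python. This is a simplified sketch, not the simulator's code: it ignores the call stack and the special behaviour of individual GPRs, reports only the first error it finds, and represents DMEM as a bytearray. The function name bn_lid and its signature are hypothetical.

```python
WLEN_BYTES = 32  # WLEN/8 for 256-bit WDRs


def bn_lid(gprs, wdrs, dmem, grd, grs1, offset, grd_inc=False, grs1_inc=False):
    """Apply a load to the given state; return an error name or None."""
    if grd_inc and grs1_inc:
        return "ILLEGAL_INSN"
    addr = (gprs[grs1] + offset) & 0xFFFFFFFF
    grd_val = gprs[grd]
    if grd_val > 31:
        return "ILLEGAL_INSN"
    if addr % WLEN_BYTES != 0 or addr + WLEN_BYTES > len(dmem):
        return "BAD_DATA_ADDR"
    # DMEM holds the value least significant byte first.
    value = int.from_bytes(dmem[addr:addr + WLEN_BYTES], "little")
    if grd_inc:
        gprs[grd] = grd_val + 1
    if grs1_inc:
        gprs[grs1] = (gprs[grs1] + WLEN_BYTES) & 0xFFFFFFFF
    wdrs[grd_val & 0x1F] = value
    return None
```

With grs1_inc set, repeated calls walk through consecutive words in memory while the destination WDR stays fixed; with grd_inc set, the address stays fixed and the destination WDR index advances.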
Errors
BN.LID might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grd when the call stack is empty.
- An ILLEGAL_INSN error if both grd_inc and grs1_inc are set.
- An ILLEGAL_INSN error if the value in GPR grd is greater than 31.
- A BAD_DATA_ADDR error if the computed address is not a valid DMEM address aligned to WLEN bits.
Syntax
BN.LID <grd>[<grd_inc>], <offset>(<grs1>[<grs1_inc>])
Operands
Operand | Description |
---|---|
|
Name of the GPR referencing the destination WDR Decode as |
|
Name of the GPR containing the memory byte address. The value contained in the referenced GPR must be WLEN-aligned. Decode as |
|
Offset value. Must be WLEN-aligned. Valid range: Decode as |
|
Increment the value in To specify, use the literal syntax Decode as |
|
Increment the value in To specify, use the literal syntax Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.LID | OFF_0 | GRD | GRS1 | 1 | 0 | 0 | OFF_1 | INC1 | INCD | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operands grs1_inc and grd_inc are referred to by their integer value.
The operand table above shows how this corresponds to assembly syntax.
# BN.LID executes over two cycles. On the first cycle, we read the base
# address, compute the load address and check it for correctness,
# increment any GPRs, then perform the load itself. On the second
# cycle, update the WDR with the result.
if grs1_inc and grd_inc:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    return None
grs1_val = GPRs[grs1]
addr = (grs1_val + offset) & ((1 << 32) - 1)
grd_val = GPRs[grd]
bad_grs1 = state.gprs.call_stack_err and (grs1 == 1)
bad_grd = state.gprs.call_stack_err and (grd == 1)
saw_err = False
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    saw_err = True
if grd_val > 31 and not bad_grd:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    saw_err = True
if not DMEM.is_valid_256b_addr(addr) and not bad_grs1:
    state.stop_at_end_of_cycle(ErrBits.BAD_DATA_ADDR)
    saw_err = True
if saw_err:
    return None
wrd = grd_val & 0x1f
value = DMEM.load_u256(addr)
if grd_inc:
    new_grd_val = grd_val + 1
    GPRs[grd] ⇐ new_grd_val
if grs1_inc:
    new_grs1_val = (grs1_val + 32) & ((1 << 32) - 1)
    GPRs[grs1] ⇐ new_grs1_val
# Stall for a single cycle for memory to respond
yield None
if value is None:
    state.stop_at_end_of_cycle(ErrBits.DMEM_INTG_VIOLATION)
    return None
WDRs[wrd] ⇐ value
return None
BN.SID
Store Word (indirect source, indirect destination). Store a WDR to memory as a WLEN-bit little-endian value.
The store address is offset plus the value in the GPR grs1.
The value to store is taken from the WDR given by the bottom 5 bits of the GPR grs2.
After the operation, either the value in the GPR grs1, or the value in grs2 can be optionally incremented.
Specifying both grs1_inc and grs2_inc results in an error (with error code ErrCodeIllegalInsn).
- If grs1_inc is set, the value in grs1 is incremented by the value WLEN/8 (one word).
- If grs2_inc is set, the value in grs2 is updated to be *grs2 + 1.
The memory address must be aligned to WLEN bits.
Any address that is unaligned or is above the top of memory results in an error (setting bit bad_data_addr in ERR_BITS).
Any *grs2 value greater than 31 before executing the instruction results in an error (setting bit illegal_insn in ERR_BITS) and no store or optional increment occurring.
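To make the byte layout concrete, the sketch below writes a 256-bit value into a bytearray standing in for DMEM. It is an illustration under simplifying assumptions: store_wdr is a hypothetical name, and a Python exception stands in for the bad_data_addr error.

```python
WLEN_BYTES = 32  # WLEN/8 for 256-bit WDRs


def store_wdr(dmem, addr, value):
    """Write a 256-bit value at addr, least significant byte first."""
    if addr % WLEN_BYTES != 0 or addr + WLEN_BYTES > len(dmem):
        # Stands in for the bad_data_addr error described above.
        raise ValueError("BAD_DATA_ADDR")
    dmem[addr:addr + WLEN_BYTES] = value.to_bytes(WLEN_BYTES, "little")
```

Storing the value 0x0102 at address 32 places byte 0x02 at address 32 and byte 0x01 at address 33, with the remaining 30 bytes zero. Reading the same 32 bytes back as a little-endian integer recovers the original value.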
Errors
BN.SID might cause the following software errors:
- A CALL_STACK error from using x1 as grs1 or grs2 when the call stack is empty.
- An ILLEGAL_INSN error if both grs1_inc and grs2_inc are set.
- An ILLEGAL_INSN error if the value in GPR grs2 is greater than 31.
- A BAD_DATA_ADDR error if the computed address is not a valid DMEM address aligned to WLEN bits.
Syntax
BN.SID <grs2>[<grs2_inc>], <offset>(<grs1>[<grs1_inc>])
Operands
Operand | Description |
---|---|
|
Name of the GPR containing the memory byte address. The value contained in the referenced GPR must be WLEN-aligned. Decode as |
|
Name of the GPR referencing the source WDR. Decode as |
|
Offset value. Must be WLEN-aligned. Valid range: Decode as |
|
Increment the value in To specify, use the literal syntax Decode as |
|
Increment the value in To specify, use the literal syntax Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.SID | OFF_0 | GRS2 | GRS1 | 1 | 0 | 1 | OFF_1 | INC1 | INC2 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operands grs1_inc and grs2_inc are referred to by their integer value.
The operand table above shows how this corresponds to assembly syntax.
if grs1_inc and grs2_inc:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    return None
grs1_val = GPRs[grs1]
addr = (grs1_val + offset) & ((1 << 32) - 1)
grs2_val = GPRs[grs2]
bad_grs1 = state.gprs.call_stack_err and (grs1 == 1)
bad_grs2 = state.gprs.call_stack_err and (grs2 == 1)
saw_err = False
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    saw_err = True
if grs2_val > 31 and not bad_grs2:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    saw_err = True
if not DMEM.is_valid_256b_addr(addr) and not bad_grs1:
    state.stop_at_end_of_cycle(ErrBits.BAD_DATA_ADDR)
    saw_err = True
if saw_err:
    return None
if grs1_inc:
    new_grs1_val = (grs1_val + 32) & ((1 << 32) - 1)
    GPRs[grs1] ⇐ new_grs1_val
if grs2_inc:
    new_grs2_val = grs2_val + 1
    GPRs[grs2] ⇐ new_grs2_val
yield None
wrs = grs2_val & 0x1f
wrs_val = WDRs[wrs]
DMEM.store_u256(addr, wrs_val)
return None
BN.MOV
Copy content between WDRs (direct addressing).
Errors
BN.MOV cannot cause any software errors.
Syntax
BN.MOV <wrd>, <wrs>
Operands
Operand | Description |
---|---|
|
Decode as |
|
Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.MOV | 0 | WRS | 1 | 1 | 0 | WRD | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
value = WDRs[wrs]
WDRs[wrd] ⇐ value
BN.MOVR
Copy content between WDRs (register-indirect addressing). The source and destination WDRs are selected by the values in the GPRs grs and grd.
After the operation, either the value in the GPR grd, or the value in grs can be optionally incremented.
Specifying both grd_inc and grs_inc results in an error (with error code ErrCodeIllegalInsn).
- If grd_inc is set, grd is updated to be *grd + 1.
- If grs_inc is set, grs is updated to be *grs + 1.
Any *grd or *grs value greater than 31 results in an error (setting bit illegal_insn in ERR_BITS).
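The indirect copy and its optional post-increment can be sketched as follows. This is a simplified model that ignores the call stack and returns only the first error; the function name bn_movr and its signature are hypothetical.

```python
def bn_movr(gprs, wdrs, grd, grs, grd_inc=False, grs_inc=False):
    """Copy one WDR to another, both selected through GPRs; return an error name or None."""
    if grd_inc and grs_inc:
        return "ILLEGAL_INSN"
    grd_val, grs_val = gprs[grd], gprs[grs]
    if grd_val > 31 or grs_val > 31:
        return "ILLEGAL_INSN"
    if grd_inc:
        gprs[grd] = grd_val + 1
    if grs_inc:
        gprs[grs] = grs_val + 1
    # The copy uses the GPR values read before any increment.
    wdrs[grd_val] = wdrs[grs_val]
    return None
```

Calling this four times with grd_inc set, a destination GPR that starts at 4 and a source GPR holding 0 copies the first WDR into WDRs 4 to 7. Only one of the two GPRs can advance per instruction, so moving a block of registers needs a separate instruction to advance the other one.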
Errors
BN.MOVR might cause the following software errors:
- A CALL_STACK error from using x1 as grs or grd when the call stack is empty.
- An ILLEGAL_INSN error if either the value in GPR grd or the value in GPR grs is greater than 31.
- An ILLEGAL_INSN error if both grs_inc and grd_inc are set.
Syntax
BN.MOVR <grd>[<grd_inc>], <grs>[<grs_inc>]
Operands
Operand | Description |
---|---|
|
Name of the GPR referencing the destination WDR. Decode as |
|
Name of the GPR referencing the source WDR. Decode as |
|
Increment the value in To specify, use the literal syntax Decode as |
|
Increment the value in To specify, use the literal syntax Decode as |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.MOVR | 1 | GRD | GRS | 1 | 1 | 0 | GRS_INC | GRD_INC | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
In the listing below, operands grd_inc and grs_inc are referred to by their integer value.
The operand table above shows how this corresponds to assembly syntax.
if grs_inc and grd_inc:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    return None
grd_val = GPRs[grd]
grs_val = GPRs[grs]
bad_grs = state.gprs.call_stack_err and (grs == 1)
bad_grd = state.gprs.call_stack_err and (grd == 1)
saw_err = False
if state.gprs.call_stack_err:
    state.stop_at_end_of_cycle(ErrBits.CALL_STACK)
    saw_err = True
if grd_val > 31 and not bad_grd:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    saw_err = True
if grs_val > 31 and not bad_grs:
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    saw_err = True
if saw_err:
    return None
wrd = grd_val & 0x1f
wrs = grs_val & 0x1f
if grd_inc:
    new_grd_val = grd_val + 1
    GPRs[grd] ⇐ new_grd_val
if grs_inc:
    new_grs_val = grs_val + 1
    GPRs[grs] ⇐ new_grs_val
yield None
value = WDRs[wrs]
WDRs[wrd] ⇐ value
return None
BN.WSRR
Read WSR to register.
Reads a WSR to a WDR.
If wsr isn't the index of a valid WSR, this results in an error (setting bit illegal_insn in ERR_BITS).
Errors
BN.WSRR might cause the following software errors:
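- An ILLEGAL_INSN error if wsr doesn't name a valid WSR.
- A KEY_INVALID error if wsr names a sideload key register for which no valid value is available (see the Operation listing below).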
Syntax
BN.WSRR <wrd>, <wsr>
Operands
Operand | Description |
---|---|
<wrd> | Destination WDR |
<wsr> | The WSR to read |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.WSRR | 0 | WSR | 1 | 1 | 1 | WRD | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
# The first, and possibly only, cycle of execution.
if not state.wsrs.check_idx(wsr):
    # Invalid WSR index. Stop with an illegal instruction error.
    state.stop_at_end_of_cycle(ErrBits.ILLEGAL_INSN)
    return None
if wsr == 0x1:
    # A read from RND. If a RND value is not available, request_value()
    # initiates or continues an EDN request and returns False. If a RND
    # value is available, it returns True.
    while not state.wsrs.RND.request_value():
        # There's a pending EDN request. Stall for a cycle.
        yield None
# At this point, the WSR is ready. Does it have a valid value? (It
# might not if this is a sideload key register and keymgr hasn't
# provided us with a value). If not, fail with a KEY_INVALID error.
if not state.wsrs.has_value_at_idx(wsr):
    state.stop_at_end_of_cycle(ErrBits.KEY_INVALID)
    return None
# The WSR is ready and has a value. Read it.
val = WSRs[wsr]
WDRs[wrd] ⇐ val
return None
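In the listing above, each yield None stands for one stall cycle. The sketch below shows how a Python generator can model that behaviour for a read of RND. FakeRnd, wsrr_rnd and run are hypothetical names, and the fixed poll count is an arbitrary stand-in for however long a real EDN request takes.

```python
class FakeRnd:
    """Hypothetical stand-in for the RND WSR: ready after a fixed number of polls."""

    def __init__(self, polls_until_ready, value):
        self.remaining = polls_until_ready
        self.value = value

    def request_value(self):
        """Return True once a value is available, False while still waiting."""
        if self.remaining == 0:
            return True
        self.remaining -= 1
        return False


def wsrr_rnd(rnd, wdrs, wrd):
    """Generator modelling a read of RND into a WDR; each yield is one stall cycle."""
    while not rnd.request_value():
        yield None
    wdrs[wrd] = rnd.value


def run(insn):
    """Drive an instruction generator to completion and count its stall cycles."""
    stalls = 0
    for _ in insn:
        stalls += 1
    return stalls
```

Driving wsrr_rnd with a source that needs three polls produces three stall cycles before the destination WDR is written, while a source that is ready immediately completes without stalling.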
BN.WSRW
Write WSR from register.
Writes a WDR to a WSR.
If wsr isn't the index of a valid WSR, this results in an error (setting bit illegal_insn in ERR_BITS).
Errors
BN.WSRW might cause the following software errors:
- An ILLEGAL_INSN error if wsr doesn't name a valid WSR.
Syntax
BN.WSRW <wsr>, <wrs>
Operands
Operand | Description |
---|---|
<wsr> | The WSR to write |
<wrs> | Source WDR |
Encoding
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | |
BN.WSRW | 1 | WSR | WRS | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
Operation
val = WDRs[wrs]
WSRs[wsr] ⇐ val