Mask ROM Shutdown Module

Objective

The Shutdown module is responsible for securely shutting down the OpenTitan chip in the event of an unrecoverable fault during the secure boot process. A secure shutdown should result in a system end-state where a chip reset is the only viable option.

Dependencies

Hardware:

  • Alert Handler
  • OTP Controller, with the Shutdown OTP policy words defined
  • Lifecycle controller

Software:

  • Alert Handler Driver
  • OTP Driver
  • PMP
  • Lifecycle Driver

Implementation Details

The shutdown module will use the alert handler to detect shutdown-worthy events within the hardware. Both hardware and software shutdown events will use the alert handler to trigger an escalation that results in chip shutdown.

Alert Class Specification

The Mask ROM will sort shutdown events into the categories shown below. OTP words will determine how the ROM configures the Alert Handler’s response to each class of alert.

Alert Handler Class  Severity    Description
Class A              Fatal       Chip cannot boot. Immediate shutdown & reboot recommended.
Class B              Near-Fatal  Intrusion detected. Immediate shutdown & reboot recommended.
Class C              Severe      Malfunction detected.
Class D              Minor       Recoverable error detected.

Alert Escalation Per-Class

The ROM’s alert escalation policy will be defined by OTP words. The suggested default policy is detailed in the following table. Alerts assigned to Class A will have their configuration locked by the Mask ROM. All other classes may be reconfigured and locked later by either ROM_EXT or owner code.

Alert Handler Class  Escalation   Enabled by ROM  Locked by ROM
Class A              Shutdown     Yes             Yes
Class B              Shutdown     Yes             No
Class C              No Response  No              No
Class D              No Response  No              No
Class X              Disabled     No              No

Note: Class X isn’t a real alert handler classification. It refers to a configuration where an alert is neither classified nor armed.
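
The default policy above could be captured in a small lookup table. The following is a minimal sketch, not the ROM's actual data layout: every identifier in it (alert_class_t, class_policy_t, kDefaultPolicy, class_policy_get) is hypothetical, and Class X is modelled as an extra enum value purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical names: none of these identifiers come from the ROM sources.
typedef enum alert_class {
  kAlertClassA = 0,
  kAlertClassB,
  kAlertClassC,
  kAlertClassD,
  kAlertClassX,  // Not a real class: alert is neither classified nor armed.
  kAlertClassCount,
} alert_class_t;

typedef enum alert_escalation {
  kEscalationDisabled = 0,
  kEscalationNoResponse,
  kEscalationShutdown,
} alert_escalation_t;

typedef struct class_policy {
  alert_escalation_t escalation;
  bool enabled_by_rom;
  bool locked_by_rom;
} class_policy_t;

// Suggested default policy, transcribed from the table above.
static const class_policy_t kDefaultPolicy[kAlertClassCount] = {
    [kAlertClassA] = {kEscalationShutdown, true, true},
    [kAlertClassB] = {kEscalationShutdown, true, false},
    [kAlertClassC] = {kEscalationNoResponse, false, false},
    [kAlertClassD] = {kEscalationNoResponse, false, false},
    [kAlertClassX] = {kEscalationDisabled, false, false},
};

// Returns the default policy for `cls`, or NULL if `cls` is out of range.
const class_policy_t *class_policy_get(alert_class_t cls) {
  if ((unsigned)cls >= kAlertClassCount) {
    return NULL;
  }
  return &kDefaultPolicy[cls];
}
```

In this sketch only Class A has both enabled_by_rom and locked_by_rom set, which mirrors the statement that the other classes may be reconfigured later by ROM_EXT or owner code.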

Alert Escalation Phase Actions

The alert handler allows each alert class to enable a number of escalation phases which map to certain actions. The specific action at each escalation phase (0-3) is left as a top-level integration decision. In the OpenTitan first silicon, the escalation actions are detailed in the following table.

Escalation Phase  Action
Phase 0           NMI to CPU
Phase 1           Wipe Secrets
Phase 2           Virtual Scrap
Phase 3           Reset

Alert Escalation Phase Configuration

Each alert class has tunable counts and timers that control how alert escalation starts and proceeds through the escalation phases. OTP will provide the following configuration values for each class:

Name                    Class A     Class B     Class C (No response)  Class D (No response)
Accumulation Threshold  0           0           0                      0
Timeout Cycles          0           0           0                      0
Phase 0 Cycles          0           0           0                      0
Phase 1 Cycles          10          10          0                      0
Phase 2 Cycles          10          10          0                      0
Phase 3 Cycles          0xFFFFFFFF  0xFFFFFFFF  0                      0
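
To make the timing values concrete, the sketch below stores them in a struct and computes how many cycles elapse before a given phase begins. It assumes that phases run back to back, each for its configured cycle count; the struct and function names are hypothetical and are not taken from the ROM sources.

```c
#include <stdint.h>

// Hypothetical container for the per-class tunables listed in the table above.
typedef struct class_timing {
  uint32_t accumulation_threshold;
  uint32_t timeout_cycles;
  uint32_t phase_cycles[4];  // Duration of escalation phases 0 through 3.
} class_timing_t;

// Class A and Class B share the same suggested values.
static const class_timing_t kTimingClassAB = {
    .accumulation_threshold = 0,
    .timeout_cycles = 0,
    .phase_cycles = {0, 10, 10, 0xFFFFFFFFu},
};

// Class C and Class D have no response, so every field is zero.
static const class_timing_t kTimingClassCD = {0};

// Cycles from the start of escalation until phase `phase` begins, assuming
// the phases run back to back. Uses 64-bit math so 0xFFFFFFFF cannot overflow.
uint64_t cycles_until_phase(const class_timing_t *timing, unsigned phase) {
  uint64_t total = 0;
  for (unsigned i = 0; i < phase && i < 4; ++i) {
    total += timing->phase_cycles[i];
  }
  return total;
}
```

Under that assumption, cycles_until_phase(&kTimingClassAB, 3) evaluates to 20: phase 0 takes no time, and phase 1 (Wipe Secrets) and phase 2 (Virtual Scrap) each last 10 cycles before phase 3 (Reset) begins.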

Alert Event Classification

The Alert Handler has many alert sources; the table below lists 58 of them (IDs 0 to 57). The ROM will use OTP words to assign each of these alerts a class, and thus an escalation response. The escalation response can vary based on the lifecycle state of the chip. The suggested default classification is detailed in the following table.

Name                          ID  RAW/TEST  PROD  PROD_END  DEV  RMA  SCRAP
Uart0FatalFault               0   X         C     C         X    X    X
Uart1FatalFault               1   X         C     C         X    X    X
Uart2FatalFault               2   X         C     C         X    X    X
Uart3FatalFault               3   X         C     C         X    X    X
GpioFatalFault                4   X         C     C         X    X    X
SpiDeviceFatalFault           5   X         C     C         X    X    X
SpiHost0FatalFault            6   X         C     C         D    X    X
SpiHost1FatalFault            7   X         C     C         D    X    X
I2c0FatalFault                8   X         C     C         X    X    X
I2c1FatalFault                9   X         C     C         X    X    X
I2c2FatalFault                10  X         C     C         X    X    X
PattgenFatalFault             11  X         C     C         X    X    X
RvTimerFatalFault             12  X         C     C         X    X    X
UsbdevFatalFault              13  X         C     C         X    X    X
OtpCtrlFatalMacroError        14  X         A     A         D    X    X
OtpCtrlFatalCheckError        15  X         A     A         D    X    X
OtpCtrlFatalBusIntegError     16  X         A     A         D    X    X
LcCtrlFatalProgError          17  X         A     A         D    X    X
LcCtrlFatalStateError         18  X         A     A         D    X    X
LcCtrlFatalBusIntegError      19  X         A     A         D    X    X
PwrmgrAonFatalFault           20  X         C     C         X    X    X
RstmgrAonFatalFault           21  X         C     C         X    X    X
ClkmgrAonRecovFault           22  X         C     C         X    X    X
ClkmgrAonFatalFault           23  X         C     C         X    X    X
SysrstCtrlAonFatalFault       24  X         C     C         X    X    X
AdcCtrlAonFatalFault          25  X         C     C         X    X    X
PwmAonFatalFault              26  X         C     C         X    X    X
PinmuxAonFatalFault           27  X         C     C         X    X    X
AonTimerAonFatalFault         28  X         C     C         X    X    X
SensorCtrlRecovAlert          29  X         C     C         X    X    X
SensorCtrlFatalAlert          30  X         C     C         X    X    X
SramCtrlRetAonFatalIntgError  31  X         B     B         D    X    X
FlashCtrlRecovErr             32  X         D     D         D    X    X
FlashCtrlFatalErr             33  X         A     A         D    X    X
RvDmFatalFault                34  X         C     C         X    X    X
RvPlicFatalFault              35  X         C     C         X    X    X
AesRecovCtrlUpdateErr         36  X         D     D         D    X    X
AesFatalFault                 37  X         A     A         D    X    X
HmacFatalFault                38  X         A     A         D    X    X
KmacFatalFault                39  X         A     A         D    X    X
KeymgrFatalFaultErr           40  X         A     A         D    X    X
KeymgrRecovOperationErr       41  X         D     D         D    X    X
CsrngRecovAlert               42  X         D     D         D    X    X
CsrngFatalAlert               43  X         A     A         D    X    X
EntropySrcRecovAlert          44  X         D     D         D    X    X
EntropySrcFatalAlert          45  X         A     A         D    X    X
Edn0FatalAlert                46  X         A     A         D    X    X
Edn0RecovAlert                47  X         D     D         D    X    X
Edn1FatalAlert                48  X         A     A         D    X    X
Edn1RecovAlert                49  X         D     D         D    X    X
SramCtrlMainFatalError        50  X         A     A         D    X    X
OtbnFatal                     51  X         A     A         D    X    X
OtbnRecov                     52  X         D     D         D    X    X
RomCtrlFatal                  53  X         A     A         D    X    X
RvCoreIbexFatalSwErr          54  X         A     A         D    X    X
RvCoreIbexRecovSwErr          55  X         D     D         D    X    X
RvCoreIbexFatalHwErr          56  X         A     A         D    X    X
RvCoreIbexRecovHwErr          57  X         D     D         D    X    X
Name                          ID  RAW/TEST  PROD  PROD_END  DEV  RMA  SCRAP
Alert Ping Fail                   X                              X    X
Alert Escalation Fail             X                              X    X
Alert Integrity Fail              X                              X    X
Integrity Escalation Fail         X                              X    X
Bus Integrity Failure             X                              X    X
Shadow Reg Update Failure         X                              X    X
Shadow Reg Storage Error          X                              X    X

Notes

  • In the RAW and TEST states, OTP may not be programmed yet.
  • In the RMA state, the chip has been returned for failure analysis. No alerts should be configured.
  • In the SCRAP state, the CPU cannot execute code and the cryptographic peripherals cannot perform any operations, so alert configuration is irrelevant.
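
The per-state classification can be illustrated with a small lookup keyed on alert ID and lifecycle state. This is a sketch under stated assumptions: the enum, struct, and function names are hypothetical, only a handful of rows from the table above are included, and the fallback to 'X' for unknown inputs is a choice made for this example only.

```c
#include <stddef.h>

// Hypothetical lifecycle state enum mirroring the table's column order.
typedef enum lc_state {
  kLcStateRawTest = 0,
  kLcStateProd,
  kLcStateProdEnd,
  kLcStateDev,
  kLcStateRma,
  kLcStateScrap,
  kLcStateCount,
} lc_state_t;

// One table row: the alert ID plus one class letter per lifecycle state.
// 'A'..'D' are alert classes; 'X' means unclassified and not armed.
typedef struct alert_row {
  unsigned id;
  char class_by_state[kLcStateCount + 1];
} alert_row_t;

// A few rows transcribed from the table above (not the full list).
static const alert_row_t kRows[] = {
    {0, "XCCXXX"},   // Uart0FatalFault
    {6, "XCCDXX"},   // SpiHost0FatalFault
    {14, "XAADXX"},  // OtpCtrlFatalMacroError
    {31, "XBBDXX"},  // SramCtrlRetAonFatalIntgError
    {32, "XDDDXX"},  // FlashCtrlRecovErr
};

// Returns the class letter for alert `id` in `state`. Unknown alerts or
// states return 'X' (a choice made for this sketch only).
char alert_class_for(unsigned id, lc_state_t state) {
  if ((unsigned)state >= kLcStateCount) {
    return 'X';
  }
  for (size_t i = 0; i < sizeof(kRows) / sizeof(kRows[0]); ++i) {
    if (kRows[i].id == id) {
      return kRows[i].class_by_state[state];
    }
  }
  return 'X';
}
```

For example, alert 14 (OtpCtrlFatalMacroError) resolves to class 'A' in PROD but to class 'D' in DEV, which shows how the same alert can escalate differently depending on the lifecycle state.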

Mask ROM (software) Events

A number of software conditions may require the Mask ROM to shut down the chip. When one of these shutdown events occurs, the Mask ROM will use the software alert mechanism to shut down the chip. In addition to scheduling a software alert, the shutdown module will make a best effort to place the chip into a non-functional state.

These events include:

  • No bootable image in either flash partition.
  • Boot Policy forbids booting (e.g. policy says boot “A” with no fallback to B, but A is invalid).
  • Errors during initialization of any ROM modules.
  • Any interrupt or CPU exception.

Software Shutdown

When a software shutdown event is detected, software will perform the following actions to shut the chip down safely:

  1. Schedule a software alert escalation.
  2. Disable all PMP regions except ROM (this step may need to be re-ordered).
  3. Disable crypto and keymanager blocks (send to virtual scrap state).
  4. Disable all memories except ROM.
  5. Shut down the chip.
  6. Enter a hardened spin-loop to prevent glitching into further execution.

The software shutdown handler will be strategically located in the ROM to allow for the most severe PMP lockdown and to make glitching out more difficult.
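
The ordering of the software shutdown steps can be checked on a host machine by injecting a hook in place of the real hardware accesses. The sketch below is illustrative only: all names are hypothetical, the hook stands in for the actual register writes, and the final hardened spin-loop is omitted because it never returns and cannot be exercised in a host-side test.

```c
#include <stddef.h>

// Hypothetical step identifiers; the order matches the list above.
typedef enum shutdown_step {
  kStepScheduleAlert = 0,
  kStepLockPmp,
  kStepDisableCrypto,
  kStepDisableMemories,
  kStepShutdown,
  kStepCount,
} shutdown_step_t;

// Hook invoked for each step. On real hardware each step would touch a
// peripheral; here the hook is injected so the ordering can be unit tested.
typedef void (*shutdown_hook_t)(shutdown_step_t step, void *ctx);

// Runs every step in order. The sequence is best effort, so there is no
// early exit and no return value.
void shutdown_run_steps(shutdown_hook_t hook, void *ctx) {
  for (int step = kStepScheduleAlert; step < kStepCount; ++step) {
    hook((shutdown_step_t)step, ctx);
  }
}

// Test double that records the order in which steps were requested.
typedef struct step_record {
  shutdown_step_t steps[kStepCount];
  size_t count;
} step_record_t;

void step_record_hook(shutdown_step_t step, void *ctx) {
  step_record_t *rec = (step_record_t *)ctx;
  if (rec->count < kStepCount) {
    rec->steps[rec->count++] = step;
  }
}
```

A unit test would pass step_record_hook with a zero-initialized step_record_t and then compare the recorded order with the expected sequence.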

Proposed API

Public APIs

// Reads OTP, initializes the Alert Handler, configures all interrupts
// and exceptions to enter shutdown_finalize.
void shutdown_init(void);

// Schedules a software alert escalation, then:
// - Disable all PMP regions except ROM.
// - Disable crypto and keymgr blocks (sent to virtual scrap state).
// - Disable all memories except ROM.
// - Shut down the chip.
// - Wait for watchdog reboot.
// - while(true) { asm("wfi"); }  // Note: this needs to be hardened against glitches.
void shutdown_finalize(rom_error_t reason);

Internal APIs & Pseudo Code

rom_error_t shutdown_alert_init(lifecycle_state_t state) {
  // escalation is 4 packed byte-enums, one per class (A/B/C/D).
  uint32_t escalation = otp_read(ROM_ALERT_ESCALATION);
  // Set the escalation policy per class (A/B/C/D).

  for (size_t i = 0; i < number_of_alerts; ++i) {
    uint32_t alert_class = otp_read(ROM_ALERT_CLASSIFICATION[i]);
    // Based on lifecycle state, program alert class.
  }

  // enables is 4 packed byte-enums, one per class (A/B/C/D).
  uint32_t enables = otp_read(ROM_ALERT_CLASS_EN);
  // Enable and maybe lock per class (A/B/C/D).

  return kErrorOk;  // Assumed name for the success value of rom_error_t.
}

rom_error_t shutdown_interrupt_init(void) {
  // Set up an interrupt vector table which directs every interrupt
  // to enter shutdown_finalize with a reason code of
  // kErrorShutdownInterruptn, where n is the interrupt number.
}
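
The pseudo code above describes the escalation and enable words as four packed byte-enums. A minimal sketch of unpacking one such field follows; the function name is hypothetical, and placing class A in the least significant byte is an assumption, since the packing order is not stated here.

```c
#include <stdint.h>

// Extracts the byte for class index `cls` (0 = A ... 3 = D) from a 32-bit
// OTP word. Assumes class A sits in the least significant byte; the real
// packing order is not stated in this document. Indices above 3 wrap around.
uint8_t packed_class_field(uint32_t word, unsigned cls) {
  return (uint8_t)(word >> ((cls & 3u) * 8u));
}
```

With this layout, the word 0x44332211 would yield 0x11 for class A and 0x44 for class D.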

Testing

Unit tests will be used to validate the initialization and shutdown flows.

Functional tests will be written to verify that the known software fault events enter the shutdown handler and cause the chip to shut down.