ESP32-S3 Device Provisioning System
A 6-step idempotent provisioning pipeline with zero bricked devices across production runs, featuring eFuse-based security, SHA-256-based serial generation, and automated AWS IoT integration.
Context
- System: ESP32-S3 based robotic devices (Ceily ceiling robot, Wally wall-mounted robot)
- Requirement: Secure, repeatable factory provisioning for mass production
- Constraint: Each device needs unique identity, encrypted firmware, and cloud connectivity
Core Problem
Production IoT devices require secure provisioning that:
- Burns security keys to one-time programmable eFuses
- Enables Secure Boot and Flash Encryption without bricking devices
- Generates unique device identities from hardware characteristics
- Provisions AWS IoT certificates for cloud connectivity
Why this was hard: eFuse burning is irreversible. A single mistake in the sequence or configuration permanently bricks the device. The process must work reliably across thousands of units in a factory environment with varying operator skill levels.
Key Insight
Design the provisioning as 6 independent, idempotent steps. Each step verifies prerequisites before execution and can be safely retried.
Why 6 steps? Each step has different failure modes and recovery strategies:
| Step | Failure Mode | Recovery |
|---|---|---|
| 1. Security eFuse | Wrong key burned | Device scrapped (irreversible) |
| 2. Factory firmware | Flash corruption | Re-flash from S3 |
| 3. Serial provision | Communication error | Retry (eFuse detects duplicate) |
| 4. AWS IoT setup | Network failure | Retry (AWS detects existing thing) |
| 5. Certificate flash | Power loss | Re-flash from local cache |
| 6. Production firmware | Any error | Re-flash from S3 |
Grouping them would lose granular recovery. Splitting further would add unnecessary complexity.
Approach
1) Flash Memory Layout
```text
ESP32-S3 Flash (8MB)
+------------------+ 0x00000000
| Bootloader       | Signed, encrypted
+------------------+ 0x00010000
| Partition Table  | Signed
+------------------+ 0x0001A000
| AWS Certificates | 16KB, encrypted
+------------------+ 0x0001E000
| (unused)         | 8KB gap; the app starts on a 64KB boundary
+------------------+ 0x00020000
| Application      | Signed, encrypted
+------------------+ 0x00800000
```
eFuse Layout:
- BLOCK_KEY0: Secure Boot digest (32 bytes)
- BLOCK_KEY1: XTS-AES-256 key, first half (32 bytes)
- BLOCK_KEY2: XTS-AES-256 key, second half (32 bytes)
- BLOCK3: Custom MAC / Serial (8 bytes)
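The flash layout diagram earlier in this section can be sanity-checked mechanically. The following is a minimal sketch rather than the project's own tooling: the region sizes are inferred from the boundary addresses in the diagram and the stated 16KB certificate partition, and `REGIONS`, `FLASH_SIZE`, and `check_layout` are hypothetical names introduced here.

```python
FLASH_SIZE = 0x800000      # 8MB
APP_ALIGNMENT = 0x10000    # ESP-IDF places app partitions on 64KB boundaries

# (name, offset, size); sizes are inferred from the diagram, not measured
REGIONS = [
    ("bootloader",      0x00000, 0x10000),
    ("partition_table", 0x10000, 0x0A000),
    ("aws_certs",       0x1A000, 0x04000),
    ("application",     0x20000, 0x7E0000),
]


def check_layout(regions, flash_size):
    """Return unused (start, size) gaps; raise on overlap or overflow."""
    gaps = []
    cursor = 0
    for name, offset, size in sorted(regions, key=lambda r: r[1]):
        if offset < cursor:
            raise ValueError(f"{name} overlaps the previous region")
        if offset > cursor:
            gaps.append((cursor, offset - cursor))
        cursor = offset + size
    if cursor > flash_size:
        raise ValueError("layout exceeds flash size")
    return gaps


app_offset = dict((name, off) for name, off, _ in REGIONS)["application"]
assert app_offset % APP_ALIGNMENT == 0

for start, size in check_layout(REGIONS, FLASH_SIZE):
    print(f"unused: 0x{start:06X} (+{size // 1024}KB)")
```

Under these assumptions the only unused span is the 8KB between the end of the certificate partition (0x1E000) and the 64KB-aligned application offset (0x20000).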
2) Security Provisioning (eFuse Burning)
Configure ESP32-S3 Secure Boot V2 and XTS-AES-256 Flash Encryption:
```python
def generate_commands(self, mode):
    commands = []
    # Secure Boot key digest → BLOCK_KEY0
    if not self.efuse_status.get("key0_burned"):
        commands.append({
            "cmd": ["espefuse", "burn_key", "BLOCK_KEY0",
                    "secure_boot_key_digest.bin", "SECURE_BOOT_DIGEST0"],
        })
    # XTS-AES-256 requires TWO key blocks (ESP32-S3 specific). Each block
    # takes a different 32-byte half of the 64-byte key, so the two commands
    # must not point at the same file.
    # NOTE: the half-key file names are illustrative, not the original ones.
    if not self.efuse_status.get("key1_burned"):
        commands.append({
            "cmd": ["espefuse", "burn_key", "BLOCK_KEY1",
                    "flash_encryption_key_1.bin", "XTS_AES_256_KEY_1"],
        })
    if not self.efuse_status.get("key2_burned"):
        commands.append({
            "cmd": ["espefuse", "burn_key", "BLOCK_KEY2",
                    "flash_encryption_key_2.bin", "XTS_AES_256_KEY_2"],
        })
    # Enable security features (same {"cmd": [...]} shape as the key burns,
    # so the executor only has to handle one command format)
    commands.append({"cmd": ["espefuse", "burn_efuse", "SECURE_BOOT_EN", "1"]})
    commands.append({"cmd": ["espefuse", "burn_efuse", "SPI_BOOT_CRYPT_CNT", "1"]})
    return commands
```
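Why XTS-AES-256? The ESP32-S3 flash encryption hardware supports both XTS-AES-128 and XTS-AES-256. XTS-AES-256 uses a 512-bit key, stored as two 256-bit halves in two eFuse key blocks, whereas XTS-AES-128 fits its 256-bit key in a single block. The 256 variant costs one extra key block and is the stronger choice for production.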
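Because the key spans two eFuse blocks, the 64-byte key material has to be split into two distinct 32-byte halves before burning. The sketch below illustrates only that split, with a guard against identical halves. `split_xts_key` is a hypothetical helper, and which half belongs under which key purpose, along with any byte-order handling, should be confirmed against the espefuse documentation before anything is burned.

```python
def split_xts_key(key: bytes) -> tuple[bytes, bytes]:
    """Split a 512-bit XTS-AES-256 key into two 256-bit halves.

    Assumption: the first 32 bytes go to the first key block and the
    remaining 32 bytes to the second; verify this against the tool docs.
    """
    if len(key) != 64:
        raise ValueError(f"expected 64 bytes, got {len(key)}")
    first, second = key[:32], key[32:]
    if first == second:
        # Identical halves would mean both eFuse blocks hold the same data.
        raise ValueError("key halves must differ")
    return first, second
```

A caller would write each half to its own file and pass one file per `burn_key` command.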
3) Serial Number Generation
Deterministic serial from hardware MAC address using SHA256:
```python
import hashlib
import os


class FactoryProvisioner:
    FACTORY_SECRET = os.environ["FACTORY_SECRET"]  # From secure storage

    def generate_hash64_id(self, wifi_mac: bytes) -> str:
        # Combine: WiFi MAC (6 raw bytes) + factory secret
        combined = wifi_mac + self.FACTORY_SECRET.encode("utf-8")
        # SHA256 → take first 8 bytes for 64-bit ID
        hash_bytes = hashlib.sha256(combined).digest()
        hash64 = hash_bytes[:8]
        return hash64.hex()  # 16-char hex, e.g. "4c694f4b2e2cc7b2"
```
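Why hash instead of raw MAC?
- A raw MAC is easy to read and spoof; a valid serial cannot be computed without the factory secret
- Same MAC always produces same serial (deterministic)
- 64-bit collision risk: any two given devices collide with probability 1 in 2^64 (~1.8 × 10^19); across n devices the birthday bound gives roughly n^2 / 2^65, about 2.7 × 10^-8 for one million units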
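The collision risk depends on fleet size, not only on the width of the ID. As a rough worked example, the standard birthday-bound approximation can be evaluated for a few fleet sizes. `collision_probability` is a name introduced here for illustration, and the fleet sizes are arbitrary.

```python
import math


def collision_probability(num_devices: int, id_bits: int = 64) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = num_devices * (num_devices - 1) / 2 ** (id_bits + 1)
    # expm1 keeps precision when the probability is tiny
    return -math.expm1(-exponent)


for n in (10_000, 1_000_000, 100_000_000):
    print(f"{n:>11,} devices -> p = {collision_probability(n):.2e}")
```

Even at a hundred million devices the estimate stays well below one in a thousand, so truncating SHA-256 to 64 bits leaves comfortable headroom for production volumes in the thousands.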
4) Certificate Partition Structure
Certificates stored in dedicated 16KB partition:
```python
CERT_MAGIC = 0x41575343  # ASCII "AWSC" as a big-endian u32 (bytes read "CSWA" if stored little-endian)
CERT_VERSION = 1
PARTITION_OFFSET = 0x1A000
PARTITION_SIZE = 0x4000  # 16KB

# Partition structure:
# +0x000: Magic (4B) + Version (4B)
# +0x008: Cert offset (4B) + Cert length (4B)
# +0x010: Key offset (4B) + Key length (4B)
# +0x018: CA offset (4B) + CA length (4B)
# +0x020: Reserved (8B)
# +0x028: Endpoint URL (128B, null-terminated)
# +0x0A8: Device certificate (~1.2KB)
# +0x????: Private key (~1.7KB)
# +0x????: CA certificate (~1.2KB)
```
Why separate partition? Certificates are ~4KB total. Embedding in app binary would require rebuilding firmware per device. Separate partition allows certificate-only updates.
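The header layout above maps directly onto a `struct` format string. The following sketch packs and parses a partition image under assumptions the source does not state: header fields are little-endian, the three blobs are stored back to back right after the header, and unused space is padded with 0xFF (the erased-flash value). `build_cert_partition` and `parse_cert_partition` are hypothetical names.

```python
import struct

CERT_MAGIC = 0x41575343
CERT_VERSION = 1
PARTITION_SIZE = 0x4000

# magic, version, 3 x (offset, length), 8 reserved bytes, 128-byte endpoint
HEADER_FORMAT = "<II" + "II" * 3 + "8x" + "128s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 0x0A8


def build_cert_partition(cert: bytes, key: bytes, ca: bytes, endpoint: str) -> bytes:
    url = endpoint.encode("ascii")
    if len(url) >= 128:
        raise ValueError("endpoint needs room for a NUL terminator")
    cert_off = HEADER_SIZE
    key_off = cert_off + len(cert)
    ca_off = key_off + len(key)
    if ca_off + len(ca) > PARTITION_SIZE:
        raise ValueError("certificates do not fit in the partition")
    header = struct.pack(HEADER_FORMAT, CERT_MAGIC, CERT_VERSION,
                         cert_off, len(cert), key_off, len(key),
                         ca_off, len(ca), url)
    image = header + cert + key + ca
    return image + b"\xff" * (PARTITION_SIZE - len(image))


def parse_cert_partition(image: bytes) -> dict:
    (magic, version, cert_off, cert_len, key_off, key_len,
     ca_off, ca_len, url) = struct.unpack_from(HEADER_FORMAT, image)
    if magic != CERT_MAGIC or version != CERT_VERSION:
        raise ValueError("not a valid certificate partition")
    return {
        "endpoint": url.split(b"\x00", 1)[0].decode("ascii"),
        "cert": image[cert_off:cert_off + cert_len],
        "key": image[key_off:key_off + key_len],
        "ca": image[ca_off:ca_off + ca_len],
    }
```

Validating the magic and version on read is also a natural way to implement a "certificates already present" check for the certificate flash step.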
5) Idempotent Orchestration
Each step checks completion before execution:
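```python
import subprocess


def run(cmd):
    """Assumed helper: run a command and return its stdout as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


class FullProvisioner:
    TOTAL_STEPS = 6

    def step1_security_provision(self):
        # Check: Are eFuses already burned?
        output = run(["espefuse", "--port", self.port, "summary"])
        # Inspect the SECURE_BOOT_EN line only; a bare "= True" anywhere in
        # the summary would match any other eFuse that happens to be set.
        for line in output.splitlines():
            if "SECURE_BOOT_EN" in line and "= True" in line:
                print("✓ Security already provisioned, skipping")
                return True
        # Execute: Burn eFuses
        return self.burn_security_efuses()

    def step3_serial_provision(self):
        # Check: Is serial already in eFuse?
        existing = self.read_serial_from_efuse()
        if existing:
            self.serial = existing
            print(f"✓ Serial exists: {existing}")
            return True
        # Execute: Generate and burn serial
        mac = self.get_wifi_mac()
        self.serial = self.generate_hash64_id(mac)
        return self.burn_serial_to_efuse(self.serial)
```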
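Because every step checks its own completion state, a driver can simply re-invoke a failed step. The sketch below shows one possible shape for such a driver. It is an assumption about how the steps might be wired together, not the production orchestrator, and `run_pipeline`, `max_attempts`, and `delay_s` are hypothetical names.

```python
import time
from typing import Callable, Sequence, Tuple

Step = Tuple[str, Callable[[], bool]]


def run_pipeline(steps: Sequence[Step], max_attempts: int = 3,
                 delay_s: float = 0.0) -> bool:
    """Run steps in order; retry a failing step, stop if it keeps failing."""
    for index, (name, step) in enumerate(steps, start=1):
        for attempt in range(1, max_attempts + 1):
            try:
                ok = step()
            except Exception as exc:  # treat any error as a failed attempt
                print(f"[{index}/{len(steps)}] {name}: error: {exc}")
                ok = False
            if ok:
                print(f"[{index}/{len(steps)}] {name}: ok (attempt {attempt})")
                break
            time.sleep(delay_s)
        else:
            # Retries exhausted: later steps depend on this one, so stop here.
            print(f"[{index}/{len(steps)}] {name}: failed, stopping")
            return False
    return True
```

Retrying is only safe because each step is idempotent. For the eFuse step, the retry re-reads the device state first and burns nothing that is already set.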
Tradeoffs
| Decision | Rationale | Tradeoff |
|---|---|---|
| XTS-AES-256 (2 key blocks) | ESP32-S3 hardware encryption standard | More complex than single-block AES-128 |
| Hash(MAC + secret) for serial | Deterministic, unforgeable, hardware-tied | Requires secure secret management |
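| Dev mode encryption (SPI_BOOT_CRYPT_CNT=1) | Allows field firmware updates via OTA | Limited re-flash opportunities (SPI_BOOT_CRYPT_CNT is a 3-bit eFuse field on ESP32-S3); less locked down than release mode |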
| JTAG enabled in production | Field debugging for RMA devices | Attack vector if device physically compromised |
| 6 idempotent steps | Granular recovery, power-loss safe | 6 status checks per device (~2s overhead) |
| S3 firmware staging | Version control, instant rollback | AWS dependency, network required |
| 16KB cert partition | Fits X.509 chain + endpoint + padding | Wastes 12KB (only 4KB used) |
Results
Performance (measured on production line):
- Total provision time: ~30 seconds per device
  - Security eFuse: ~5s (burn + verify)
  - Factory firmware flash: ~8s (cached locally)
  - Serial provision: ~2s
  - AWS IoT setup: ~4s (API calls)
  - Certificate flash: ~3s
  - Production firmware: ~8s (cached locally)
- Throughput: ~120 devices/hour per station
Reliability:
- Bricked devices: 0 across all production runs
- Retry rate: <2% (mostly network timeouts on AWS step)
- Power-loss recovery: 100% successful on next attempt
Provisioning Status Codes:
| Code | Meaning | Action |
|---|---|---|
| EFUSE_BURNED | Security already configured | Skip step 1 |
| SERIAL_EXISTS | Device has serial in eFuse | Skip steps 2-3 |
| THING_EXISTS | AWS IoT thing created | Skip step 4 |
| CERTS_VALID | Certificates in partition | Skip step 5 |
| PROVISION_COMPLETE | All steps done | Ready for packaging |
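The status table can be read as a mapping from detected state to the steps that may be skipped. A minimal sketch of that mapping follows. The `Status` enum and `remaining_steps` are illustrative names, and treating `PROVISION_COMPLETE` as "skip all six steps" is an assumption drawn from the table's "All steps done" wording.

```python
from enum import Enum, auto
from typing import Iterable, List


class Status(Enum):
    EFUSE_BURNED = auto()
    SERIAL_EXISTS = auto()
    THING_EXISTS = auto()
    CERTS_VALID = auto()
    PROVISION_COMPLETE = auto()


# Steps each status allows the pipeline to skip (from the table above)
SKIPS = {
    Status.EFUSE_BURNED: {1},
    Status.SERIAL_EXISTS: {2, 3},
    Status.THING_EXISTS: {4},
    Status.CERTS_VALID: {5},
    Status.PROVISION_COMPLETE: {1, 2, 3, 4, 5, 6},  # assumption
}


def remaining_steps(statuses: Iterable[Status], total_steps: int = 6) -> List[int]:
    """Return the step numbers that still need to run, in order."""
    skipped = set()
    for status in statuses:
        skipped |= SKIPS[status]
    return [n for n in range(1, total_steps + 1) if n not in skipped]
```

For a device that lost power after its serial was burned, `remaining_steps([Status.EFUSE_BURNED, Status.SERIAL_EXISTS])` leaves steps 4 through 6.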
Key Takeaway
Irreversible operations such as eFuse burning require defensive design: verify state before each write, verify again after it, and make every step independently retriable. With the 6-step idempotent architecture, power loss, network failures, and operator errors in steps 2 through 6 are recovered by retrying rather than by scrapping the device. Only the security eFuse step is truly unrecoverable, so it is triple-checked before execution.
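The verify-before, write, verify-after pattern can be expressed independently of any particular eFuse tool. The sketch below uses injected `read_state` and `write` callables so it runs without hardware. All names are hypothetical, and it does not model the triple-check mentioned above, whose details are not given here.

```python
from typing import Callable, Optional


class VerificationError(RuntimeError):
    """Raised when device state does not match what the write expects."""


def guarded_write(read_state: Callable[[], Optional[bytes]],
                  write: Callable[[bytes], None],
                  desired: bytes) -> str:
    """Write a one-time value only if it is safe, then read it back."""
    current = read_state()
    if current == desired:
        return "already-written"          # idempotent: nothing to do
    if current is not None:
        # Something else is already burned; writing over it cannot help.
        raise VerificationError("conflicting value present, refusing to write")
    write(desired)
    if read_state() != desired:
        raise VerificationError("read-back does not match written value")
    return "written"
```

Calling `guarded_write` twice with the same value performs the write once and reports `"already-written"` the second time, which is the behaviour a retry after power loss relies on.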