Fleet Provisioning by Claim Fails: MQTT Subscribe Timeout on AWS IoT Core Response Topics

 


Aaron Rose

Software Engineer & Technology Writer



Problem

When implementing Fleet Provisioning by Claim in AWS IoT Core, devices successfully connect using claim certificates and can publish provisioning requests, but subscribing to the response topics (accepted and rejected) times out. The device experiences repeated disconnect/reconnect cycles, no provisioning response is received, and no new Thing or certificate gets created.

Clarifying the Issue

This isn't a basic MQTT connectivity problem. The device establishes a secure TLS 1.2 connection with mutual authentication and successfully publishes to the provisioning request topic. The failure occurs specifically when attempting to subscribe to Fleet Provisioning response topics:

$aws/provisioning-templates/<TemplateName>/provision/json/accepted
$aws/provisioning-templates/<TemplateName>/provision/json/rejected

Key symptoms include:

  • MQTT subscribe() operations timing out after 5 seconds
  • Immediate disconnections following subscription attempts
  • "Subscribe timed out" errors in SDK logs
  • Successful connection and publish, but failed subscription
  • No SUBACK acknowledgment from AWS IoT Core

Why It Matters

Fleet Provisioning by Claim enables scalable device onboarding without pre-registering individual certificates. When subscription to response topics fails, the entire automated provisioning workflow breaks down. Devices can't receive their permanent certificates and Thing configurations, forcing manual intervention for every device deployment. This defeats the purpose of automated fleet provisioning and creates operational bottlenecks in production environments.

Key Terms

  • Fleet Provisioning by Claim – AWS IoT Core feature that provisions devices dynamically using shared claim certificates
  • Claim Certificate – Bootstrap certificate with limited permissions, used only to initiate provisioning
  • Provisioning Template – JSON template defining how Things, certificates, and policies are created during provisioning
  • SUBACK – MQTT protocol acknowledgment that confirms successful subscription
  • topicfilter ARN – AWS IoT policy resource (arn:aws:iot:...:topicfilter/...) that iot:Subscribe is authorized against; distinct from the topic ARN used by iot:Publish and iot:Receive

Steps at a Glance

  1. Add missing topicfilter ARN to claim certificate policy
  2. Subscribe to response topics before publishing provisioning request
  3. Verify exact template name matching across topics and payload
  4. Increase MQTT operation timeouts for provisioning workflows
  5. Test subscription permissions using AWS IoT MQTT test client

Detailed Steps

Step 1: Add missing topicfilter ARN to claim certificate policy

The most common cause is a missing topicfilter resource. AWS IoT authorizes iot:Subscribe against topicfilter ARNs, while iot:Publish and iot:Receive are authorized against topic ARNs, so a policy that lists only topic resources lets the device publish but not subscribe. Scope each action to its matching resource type, and avoid granting publish or subscribe on "*", which over-permits the claim certificate:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:Connect",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["iot:Publish", "iot:Receive"],
      "Resource": [
        "arn:aws:iot:us-east-1:<account-id>:topic/$aws/provisioning-templates/MyFleetProvisioningTemplate/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "iot:Subscribe",
      "Resource": [
        "arn:aws:iot:us-east-1:<account-id>:topicfilter/$aws/provisioning-templates/MyFleetProvisioningTemplate/*"
      ]
    }
  ]
}

Critical: The topicfilter ARN is essential for iot:Subscribe. Without it, subscribe operations time out.
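To make the topic/topicfilter split harder to get wrong, the policy can be generated rather than hand-edited. The following is a minimal sketch, not an official AWS helper: build_claim_policy and subscribe_resources are hypothetical names, and the account ID is a placeholder.

```python
import json


def build_claim_policy(region: str, account_id: str, template_name: str) -> dict:
    """Build a claim-certificate policy with separate topic and topicfilter resources."""
    prefix = f"arn:aws:iot:{region}:{account_id}"
    path = f"$aws/provisioning-templates/{template_name}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "iot:Connect", "Resource": "*"},
            {
                # Publish and Receive are evaluated against topic ARNs.
                "Effect": "Allow",
                "Action": ["iot:Publish", "iot:Receive"],
                "Resource": [f"{prefix}:topic/{path}"],
            },
            {
                # Subscribe is evaluated against topicfilter ARNs.
                "Effect": "Allow",
                "Action": "iot:Subscribe",
                "Resource": [f"{prefix}:topicfilter/{path}"],
            },
        ],
    }


def subscribe_resources(policy: dict) -> list:
    """Collect every resource on an Allow statement that grants iot:Subscribe."""
    found = []
    for statement in policy["Statement"]:
        actions = statement["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if statement["Effect"] == "Allow" and "iot:Subscribe" in actions:
            resources = statement["Resource"]
            found.extend([resources] if isinstance(resources, str) else resources)
    return found


# Placeholder account ID, for illustration only.
policy = build_claim_policy("us-east-1", "123456789012", "MyFleetProvisioningTemplate")
print(json.dumps(policy, indent=2))
```

Running subscribe_resources against an existing policy document is a quick way to see whether any topicfilter resource is attached to iot:Subscribe at all.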


Step 2: Subscribe to response topics before publishing provisioning request

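Establish subscriptions first so the device is already listening when AWS IoT Core sends the response. Assuming claim_client is an AWSIoTMQTTClient from the AWS IoT Device SDK for Python v1 (as the configure* calls in Step 4 suggest), subscribe() blocks until the SUBACK arrives or the operation times out, so the pause below is only a safety margin:

import json
import time

# Subscribe to response topics FIRST (QoS 1)
claim_client.subscribe(accepted_topic, 1, on_message_callback)
claim_client.subscribe(rejected_topic, 1, on_message_callback)

# Optional safety margin before publishing
time.sleep(2)

# Then publish the provisioning request (QoS 1)
claim_client.publish(provisioning_topic, json.dumps(payload), 1)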
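A fixed sleep does not tell the device whether a response actually arrived. One possible refinement, sketched here under assumptions, is to wait on a threading.Event that the callback sets. The sketch assumes the JSON payload format, where responses arrive on the request topic plus /accepted or /rejected, and a client exposing subscribe(topic, qos, callback) and publish(topic, payload, qos). request_provisioning and LoopbackClient are hypothetical names; LoopbackClient is an in-memory stand-in so the sketch runs without a broker, and its reply body is made up.

```python
import json
import threading


def request_provisioning(client, template_name, payload, wait_seconds=10.0):
    """Subscribe to both response topics, publish the request, wait for one reply."""
    base = f"$aws/provisioning-templates/{template_name}/provision/json"
    result = {}
    done = threading.Event()

    def on_response(_client, _userdata, message):
        # Record which topic answered and unblock the waiting thread.
        result["topic"] = message.topic
        result["body"] = json.loads(message.payload)
        done.set()

    # Subscriptions go first so the reply cannot arrive unobserved.
    client.subscribe(f"{base}/accepted", 1, on_response)
    client.subscribe(f"{base}/rejected", 1, on_response)
    client.publish(base, json.dumps(payload), 1)

    if not done.wait(wait_seconds):
        raise TimeoutError("no provisioning response received")
    return result


class LoopbackClient:
    """Hypothetical in-memory stand-in for a real MQTT client (illustration only)."""

    def __init__(self):
        self.callbacks = {}
        self.calls = []

    def subscribe(self, topic, qos, callback):
        self.calls.append(("subscribe", topic))
        self.callbacks[topic] = callback
        return True

    def publish(self, topic, payload, qos):
        self.calls.append(("publish", topic))
        # Pretend the broker accepted the request and replied immediately.
        reply_topic = f"{topic}/accepted"
        message = type("Message", (), {
            "topic": reply_topic,
            "payload": b'{"thingName": "TestDevice001"}',
        })()
        self.callbacks[reply_topic](self, None, message)
        return True


client = LoopbackClient()
reply = request_provisioning(
    client, "MyFleetProvisioningTemplate", {"parameters": {"SerialNumber": "TEST-001"}}
)
print(reply["topic"])
```

With a real client, the same function returns as soon as either topic answers, and raises TimeoutError instead of silently continuing when nothing arrives.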

Step 3: Verify exact template name matching

Template names are case-sensitive. Copy the exact name from IoT Core → Fleet Provisioning → Templates:

# Must match exactly - case sensitive
TEMPLATE_NAME = "MyFleetProvisioningTemplate"  # Not "myfleetprovisioningtemplate"

# Responses arrive on the request topic plus /accepted or /rejected
accepted_topic = f"$aws/provisioning-templates/{TEMPLATE_NAME}/provision/json/accepted"
rejected_topic = f"$aws/provisioning-templates/{TEMPLATE_NAME}/provision/json/rejected"
provisioning_topic = f"$aws/provisioning-templates/{TEMPLATE_NAME}/provision/json"
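A case or whitespace slip in the template name is easy to miss by eye. As a rough sketch, the configured name can be compared against the names that actually exist in the account (copied from the console or fetched with the ListProvisioningTemplates API). check_template_name is a hypothetical helper, and the list of known names is supplied by hand here.

```python
def check_template_name(configured, known_names):
    """Return (ok, hint): ok is True on an exact match; hint names a near miss."""
    if configured in known_names:
        return True, None
    for name in known_names:
        # Same characters apart from letter case or surrounding whitespace.
        if name.lower() == configured.strip().lower():
            return False, name
    return False, None


known = ["MyFleetProvisioningTemplate"]
print(check_template_name("myfleetprovisioningtemplate", known))
# -> (False, 'MyFleetProvisioningTemplate')
```

Running a check like this at startup surfaces a near miss before the device subscribes to topics that no template will ever answer on.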

Step 4: Increase MQTT operation timeouts

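Fleet provisioning can take longer than standard MQTT operations, so raise the timeouts before connecting. In the AWS IoT Device SDK for Python v1 (AWSIoTMQTTClient), the keep-alive interval is passed to connect() rather than set through a configure method:

claim_client.configureConnectDisconnectTimeout(30)  # Increased from 10
claim_client.configureMQTTOperationTimeout(10)      # Increased from 5
claim_client.connect(keepAliveIntervalSecond=120)   # Keep-alive is set at connect time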

Step 5: Test with AWS IoT MQTT test client

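Before deploying device code, check the topic names and template behavior from the AWS Console. The console test client connects with your signed-in IAM identity, not the claim certificate, so it confirms that the topics respond but does not exercise the claim certificate policy. To test the policy itself, repeat the subscription from an MQTT client that connects with the claim certificate.

  1. Navigate to IoT Core → Test → MQTT test client
  2. Subscribe to: $aws/provisioning-templates/MyFleetProvisioningTemplate/provision/json/+
  3. Publish test payload to: $aws/provisioning-templates/MyFleetProvisioningTemplate/provision/json
{
  "certificateOwnershipToken": "<token from CreateKeysAndCertificate>",
  "parameters": {
    "ThingName": "TestDevice001",
    "SerialNumber": "TEST-001"
  }
}

A request without a valid certificateOwnershipToken should come back on the rejected topic, which is still enough to confirm that the subscription works. If the subscription succeeds here but fails on the device, the claim certificate policy is the most likely cause.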
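When choosing a wildcard filter for the test subscription, keep in mind that + matches exactly one topic level. The sketch below is a simplified matcher for illustration, not the broker's implementation (it ignores the special handling of filters that begin with a wildcard). It assumes the JSON response topics end in /provision/json/accepted and /provision/json/rejected; topic_matches is a hypothetical helper name.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: valid only as the last level of the filter.
            return index == len(filter_levels) - 1
        if index >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[index]:
            return False
    # Without '#', filter and topic must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


base = "$aws/provisioning-templates/MyFleetProvisioningTemplate/provision"
print(topic_matches(f"{base}/json/+", f"{base}/json/accepted"))  # True
print(topic_matches(f"{base}/json/+", f"{base}/json/rejected"))  # True
print(topic_matches(f"{base}/+", f"{base}/json/accepted"))       # False
print(topic_matches(f"{base}/+", f"{base}/json"))                # True
```

The third call shows why a filter ending in provision/+ only sees the request topic itself: the response topics sit one level deeper.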


Complete Working Example

A full working Python script that implements all the fixes above is available in this GitHub Gist. The script includes proper error handling, thread-safe response processing, and detailed logging to help debug any remaining issues.

Conclusion

MQTT subscribe timeouts in Fleet Provisioning by Claim typically stem from missing topicfilter permissions in the claim certificate policy. The key fix is ensuring both topic and topicfilter ARNs are included for the provisioning response topics. Additionally, subscribing before publishing and using appropriate timeouts prevents timing-related failures.

The most critical takeaway: AWS IoT Core authorizes subscribe operations against topicfilter ARNs. This isn't obvious from the error messages, but a missing topicfilter resource makes every subscribe attempt fail. Following the steps above, especially the permissions fix and proper sequencing, resolves the majority of Fleet Provisioning subscription failures.

Once these fundamentals are correct, Fleet Provisioning becomes a reliable method for automated device onboarding at scale.



Aaron Rose is a software engineer and technology writer at tech-reader.blog.
