Fleet Provisioning by Claim Fails: MQTT Subscribe Timeout on AWS IoT Core Response Topics
Aaron Rose
Software Engineer & Technology Writer
Problem
When implementing Fleet Provisioning by Claim in AWS IoT Core, devices successfully connect using claim certificates and can publish provisioning requests, but subscribing to the response topics (accepted and rejected) times out. The device experiences repeated disconnect/reconnect cycles, no provisioning response is received, and no new Thing or certificate gets created.
Clarifying the Issue
This isn't a basic MQTT connectivity problem. The device establishes a secure TLS 1.2 connection with mutual authentication and successfully publishes to the provisioning request topic. The failure occurs specifically when attempting to subscribe to Fleet Provisioning response topics:
$aws/provisioning-templates/<TemplateName>/provision/json/accepted
$aws/provisioning-templates/<TemplateName>/provision/json/rejected
Key symptoms include:
- MQTT subscribe() operations timing out after 5 seconds
- Immediate disconnections following subscription attempts
- "Subscribe timed out" errors in SDK logs
- Successful connection and publish, but failed subscription
- No SUBACK acknowledgment from AWS IoT Core
Why It Matters
Fleet Provisioning by Claim enables scalable device onboarding without pre-registering individual certificates. When subscription to response topics fails, the entire automated provisioning workflow breaks down. Devices can't receive their permanent certificates and Thing configurations, forcing manual intervention for every device deployment. This defeats the purpose of automated fleet provisioning and creates operational bottlenecks in production environments.
Key Terms
- Fleet Provisioning by Claim – AWS IoT Core feature that provisions devices dynamically using shared claim certificates
- Claim Certificate – Temporary certificate with limited permissions used only for initiating provisioning
- Provisioning Template – JSON template defining how Things, certificates, and policies are created during provisioning
- SUBACK – MQTT protocol acknowledgment that confirms successful subscription
- topicfilter ARN – Policy resource identifier that iot:Subscribe is authorized against (distinct from the topic ARN used by iot:Publish and iot:Receive)
Steps at a Glance
- Add missing topicfilter ARN to claim certificate policy
- Subscribe to response topics before publishing provisioning request
- Verify exact template name matching across topics and payload
- Increase MQTT operation timeouts for provisioning workflows
- Test subscription permissions using AWS IoT MQTT test client
Detailed Steps
Step 1: Add missing topicfilter ARN to claim certificate policy
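The most common cause is missing topicfilter permissions. AWS IoT authorizes iot:Subscribe against topicfilter ARNs, while iot:Publish and iot:Receive are authorized against topic ARNs, so a claim certificate policy needs both resource types: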
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["iot:Connect"],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["iot:Publish", "iot:Receive"],
      "Resource": [
        "arn:aws:iot:us-east-1:<account-id>:topic/$aws/provisioning-templates/MyFleetProvisioningTemplate/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["iot:Subscribe"],
      "Resource": [
        "arn:aws:iot:us-east-1:<account-id>:topicfilter/$aws/provisioning-templates/MyFleetProvisioningTemplate/*"
      ]
    }
  ]
}
Critical: The topicfilter ARN is essential for iot:Subscribe actions. Without it, subscribe operations will time out.
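To avoid hand-editing ARNs for each region, account, or template, the policy document can be generated. The following is a minimal sketch, not an official tool: the function name build_claim_policy and the account ID in the usage line are hypothetical, and it assumes the policy only needs to cover the provisioning-template topics discussed here.

```python
import json


def build_claim_policy(region: str, account_id: str, template_name: str) -> dict:
    """Build a claim-certificate policy scoped to one provisioning template."""
    prefix = f"arn:aws:iot:{region}:{account_id}"
    scope = f"$aws/provisioning-templates/{template_name}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Connecting is allowed for any client ID in this sketch.
            {"Effect": "Allow", "Action": ["iot:Connect"], "Resource": "*"},
            # Publish and Receive are authorized against topic ARNs.
            {
                "Effect": "Allow",
                "Action": ["iot:Publish", "iot:Receive"],
                "Resource": [f"{prefix}:topic/{scope}"],
            },
            # Subscribe is authorized against topicfilter ARNs.
            {
                "Effect": "Allow",
                "Action": ["iot:Subscribe"],
                "Resource": [f"{prefix}:topicfilter/{scope}"],
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID, for illustration only.
    policy = build_claim_policy("us-east-1", "123456789012", "MyFleetProvisioningTemplate")
    print(json.dumps(policy, indent=2))
```

Keeping the subscribe statement separate makes it easy to see at a glance whether the topicfilter resource is present.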
Step 2: Subscribe to response topics before publishing provisioning request
Establish subscriptions first to ensure you're listening when AWS IoT Core sends the response:
import json
import time

# Subscribe to response topics FIRST (QoS 1, same callback for both)
claim_client.subscribe(accepted_topic, 1, on_message_callback)
claim_client.subscribe(rejected_topic, 1, on_message_callback)
# Allow time for subscription setup
time.sleep(2)
# Then publish the provisioning request (QoS 1)
claim_client.publish(provisioning_topic, json.dumps(payload), 1)
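A fixed sleep can be replaced by waiting explicitly for the response. The sketch below assumes a client whose subscribe(topic, qos, callback) returns a truthy value on success and whose callback receives (client, userdata, message) with message.topic and message.payload, as AWSIoTPythonSDK does. The function name request_provisioning and its parameters are hypothetical.

```python
import json
import threading


def request_provisioning(client, request_topic, accepted_topic, rejected_topic,
                         payload, timeout_s=10.0):
    """Subscribe to both response topics, publish, then wait for one reply."""
    result = {}
    done = threading.Event()

    def on_message(_client, _userdata, message):
        # Record whichever response arrives first and wake the waiter.
        result["topic"] = message.topic
        result["payload"] = json.loads(message.payload)
        done.set()

    # Subscriptions must be in place before the request is published.
    for topic in (accepted_topic, rejected_topic):
        if not client.subscribe(topic, 1, on_message):
            raise RuntimeError(f"subscribe failed for {topic}")

    client.publish(request_topic, json.dumps(payload), 1)

    if not done.wait(timeout_s):
        raise TimeoutError("no provisioning response received")
    return result
```

Because the function only depends on subscribe and publish, it can be exercised with a stub client before it is pointed at a real connection.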
Step 3: Verify exact template name matching
Template names are case-sensitive. Copy the exact name from IoT Core → Fleet Provisioning → Templates:
# Must match exactly - case sensitive
TEMPLATE_NAME = "MyFleetProvisioningTemplate"  # Not "myfleetprovisioningtemplate"
# Response topics are the request topic plus /accepted or /rejected
provisioning_topic = f"$aws/provisioning-templates/{TEMPLATE_NAME}/provision/json"
accepted_topic = f"{provisioning_topic}/accepted"
rejected_topic = f"{provisioning_topic}/rejected"
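A case-only mismatch is easy to miss by eye, so a small check can flag it before the device connects. This is an illustrative sketch: check_template_name is a hypothetical helper, and the expected name is assumed to have been copied from the console.

```python
from typing import Optional

PREFIX = "$aws/provisioning-templates/"


def check_template_name(expected: str, topic: str) -> Optional[str]:
    """Return None if the topic uses the expected template name, else a reason."""
    if not topic.startswith(PREFIX):
        return f"not a provisioning-template topic: {topic}"
    # The template name is the path segment right after the prefix.
    found = topic[len(PREFIX):].split("/", 1)[0]
    if found == expected:
        return None
    if found.lower() == expected.lower():
        return f"case mismatch: topic uses '{found}', template is '{expected}'"
    return f"name mismatch: topic uses '{found}', template is '{expected}'"
```

Running it over every topic string the device uses catches a mismatch in one place instead of during a failed subscribe.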
Step 4: Increase MQTT operation timeouts
Fleet provisioning can take longer than standard MQTT operations. Increase timeouts:
claim_client.configureConnectDisconnectTimeout(30)  # Increased from 10
claim_client.configureMQTTOperationTimeout(10)  # Increased from 5
# In AWSIoTPythonSDK the keep-alive interval is an argument to connect()
claim_client.connect(keepAliveIntervalSecond=120)  # Prevent premature disconnects
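Longer timeouts can be paired with a bounded retry for transient network delays. The sketch below is one possible approach, not part of any SDK: subscribe_with_retry is a hypothetical helper that takes a zero-argument callable returning a truthy value on success. Retrying does not help when the policy is missing the topicfilter ARN, so it complements the permissions fix rather than replacing it.

```python
import time


def subscribe_with_retry(subscribe, attempts=3, base_delay_s=1.0, sleep=time.sleep):
    """Call subscribe() up to `attempts` times with exponential backoff.

    Returns the number of attempts used on success.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            if subscribe():
                return attempt + 1
        except Exception as exc:  # e.g. an SDK timeout exception
            last_error = exc
        if attempt < attempts - 1:
            # Delays grow as base, 2*base, 4*base, ...
            sleep(base_delay_s * (2 ** attempt))
    raise RuntimeError(f"subscribe failed after {attempts} attempts") from last_error
```

A typical call wraps the real subscribe in a lambda, for example subscribe_with_retry(lambda: claim_client.subscribe(accepted_topic, 1, on_message_callback)).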
Step 5: Test with AWS IoT MQTT test client
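Before deploying device code, verify the topic names and template from the AWS Console:
- Navigate to IoT Core → Test → MQTT test client
- Note that the console test client connects with your signed-in IAM identity, not the claim certificate, so it confirms topic names and template behavior rather than the claim certificate policy
- Subscribe to:
$aws/provisioning-templates/MyFleetProvisioningTemplate/provision/json/accepted
$aws/provisioning-templates/MyFleetProvisioningTemplate/provision/json/rejected
- Publish test payload to:
$aws/provisioning-templates/MyFleetProvisioningTemplate/provision/json
{
  "parameters": {
    "ThingName": "TestDevice001",
    "SerialNumber": "TEST-001"
  }
}
A request without a valid certificateOwnershipToken is expected to arrive on the rejected topic, which still confirms that the subscription works. If the subscriptions work in the test client but time out on the device, the claim certificate policy is the most likely cause.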
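Once responses arrive, logging them in a readable form makes the accepted and rejected cases easy to tell apart. This is a minimal sketch assuming the response shapes in the AWS IoT provisioning MQTT API as I understand them: accepted payloads carry thingName, and rejected payloads carry statusCode, errorCode, and errorMessage. The helper name summarize_response is hypothetical.

```python
import json


def summarize_response(topic: str, payload: bytes) -> str:
    """Turn a provisioning response into a one-line log message."""
    body = json.loads(payload)
    if topic.endswith("/accepted"):
        return f"accepted: thing '{body.get('thingName', '<unknown>')}' registered"
    if topic.endswith("/rejected"):
        return (
            f"rejected: {body.get('statusCode')} "
            f"{body.get('errorCode')}: {body.get('errorMessage')}"
        )
    return f"unexpected topic: {topic}"
```

The function can be called from the message callback with message.topic and message.payload.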
Complete Working Example
A full working Python script that implements all the fixes above is available in this GitHub Gist. The script includes proper error handling, thread-safe response processing, and detailed logging to help debug any remaining issues.
Conclusion
MQTT subscribe timeouts in Fleet Provisioning by Claim typically stem from missing topicfilter permissions in the claim certificate policy. The key fix is ensuring the policy includes topicfilter ARNs (for iot:Subscribe) alongside topic ARNs (for iot:Publish and iot:Receive) for the provisioning topics. Additionally, subscribing before publishing and using appropriate timeouts prevents timing-related failures.
The most critical takeaway: AWS IoT Core authorizes subscribe operations against topicfilter ARNs. This isn't obvious from error messages, but it causes immediate timeouts when they are missing. Following the steps above, especially the permissions fix and proper sequencing, resolves the majority of Fleet Provisioning subscription failures.
Once these fundamentals are correct, Fleet Provisioning becomes a reliable method for automated device onboarding at scale.
Aaron Rose is a software engineer and technology writer at tech-reader.blog.