Summary
Enabling job_retry causes terraform apply to fail when creating the SQS event source mapping for the retry Lambda.
The failure is:
InvalidParameterValueException: The function execution role does not have permissions to call ReceiveMessage on SQS
What went wrong
The job-retry submodule creates these resources in the same apply:
- the retry SQS queue
- the retry Lambda
- the retry Lambda IAM role
- the inline IAM policy that grants the retry Lambda access to the retry queue
- the Lambda event source mapping from the retry queue to the retry Lambda
The retry policy is defined correctly and already includes the required permissions:
- sqs:ReceiveMessage
- sqs:GetQueueAttributes
- sqs:DeleteMessage
However, the event source mapping does not explicitly depend on that IAM policy resource.
Because of that, Terraform can create the event source mapping before the retry Lambda role has the queue permissions attached. AWS validates the execution role during CreateEventSourceMapping, does not see ReceiveMessage yet, and rejects the mapping.
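The missing graph edge can be sketched as a config fragment. This is an illustration only, not the module's actual source: the policy name and the aws_iam_role.job_retry_lambda reference are hypothetical placeholders, and the policy body is reconstructed from the three actions listed above. It references resources defined elsewhere in the submodule, so it is not meant to be applied on its own.

```hcl
# Inline policy granting the retry Lambda access to the retry queue.
# "name" and the role reference are hypothetical placeholders.
resource "aws_iam_role_policy" "job_retry" {
  name = "job-retry-sqs"                  # hypothetical
  role = aws_iam_role.job_retry_lambda.id # hypothetical role resource

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "sqs:ReceiveMessage",
        "sqs:GetQueueAttributes",
        "sqs:DeleteMessage",
      ]
      Resource = aws_sqs_queue.job_retry_check_queue.arn
    }]
  })
}

# Nothing in this block references aws_iam_role_policy.job_retry, so
# Terraform records no dependency between the two resources and is free
# to create the mapping before the policy is attached to the role.
resource "aws_lambda_event_source_mapping" "job_retry" {
  event_source_arn = aws_sqs_queue.job_retry_check_queue.arn
  function_name    = module.job_retry.lambda.function.arn
}
```

Terraform only orders resources by the references it can see (or by depends_on), so a permission that the mapping needs at creation time but never references is invisible to the graph.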
Observed behavior
In my case this was not intermittent. With job_retry enabled, apply failed consistently. It did not work even once before the dependency fix.
After adding an explicit dependency from the event source mapping to the retry IAM policy, the same apply succeeded cleanly.
Reproduction
- Use the multi-runner module
- Enable job_retry on one or more runner configs
- Run terraform apply
The apply fails while creating one or more *-job-retry event source mappings.
Expected behavior
The retry Lambda IAM policy should be attached before the SQS event source mapping is created.
Proposed fix
Add an explicit dependency in modules/runners/job-retry/main.tf:
resource "aws_lambda_event_source_mapping" "job_retry" {
  event_source_arn                   = aws_sqs_queue.job_retry_check_queue.arn
  function_name                      = module.job_retry.lambda.function.arn
  batch_size                         = var.config.lambda_event_source_mapping_batch_size
  maximum_batching_window_in_seconds = var.config.lambda_event_source_mapping_maximum_batching_window_in_seconds

  # Create the mapping only after the inline SQS policy is attached to the retry Lambda role.
  depends_on = [aws_iam_role_policy.job_retry]
}
Notes
This looks like a deterministic apply-ordering problem, not a missing-permission definition. The retry IAM policy itself already grants the correct SQS actions.