3.6.8. Retry Failed Steps
In this chapter, you’ll learn how to configure steps to allow retrial on failure.
What is a Step Retrial?
A step retrial is a mechanism that allows a step to be retried automatically when it fails. This is useful for handling transient errors, such as network issues or temporary unavailability of a service.
When a step fails, the workflow engine can automatically retry the step a specified number of times before marking the workflow as failed. This can help improve the reliability and resilience of your workflows.
You can also configure the interval between retries, allowing you to wait for a certain period before attempting the step again. This is useful when the failure is due to a temporary issue that may resolve itself after some time.
For example, if a step captures a payment, you may want to retry it the next day until the payment is successful or the maximum number of retries is reached.
Configure a Step’s Retrial
By default, when an error occurs in a step, the step and the workflow fail, and the execution stops.
You can configure the step to retry on failure. The createStep function can accept a configuration object instead of the step’s name as a first parameter.
For example:
import {
  createStep,
  createWorkflow,
  WorkflowResponse,
} from "@medusajs/framework/workflows-sdk"

const step1 = createStep(
  {
    name: "step-1",
    // Retry the step up to two times after its first failure.
    maxRetries: 2,
  },
  async () => {
    console.log("Executing step 1")

    // Always fail, so that every retry is triggered.
    throw new Error("Oops! Something happened.")
  }
)

const myWorkflow = createWorkflow(
  "hello-world",
  function () {
    const str1 = step1()

    return new WorkflowResponse({
      message: str1,
    })
  }
)

export default myWorkflow
The step’s configuration object accepts a maxRetries property, which is a number indicating the number of times a step can be retried when it fails.
When you execute the above workflow, the step's message, Executing step 1, is logged three times in the terminal.
The first line indicates the first time the step was executed, and the next two lines indicate the times the step was retried. After that, the step and workflow fail.
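The counting above (one initial execution plus maxRetries retries) can be sketched in plain TypeScript. This is only an illustration of the semantics described in this chapter, not Medusa's implementation: runWithRetries is a hypothetical helper, and it is synchronous for brevity, whereas real steps are asynchronous.

```typescript
// Hypothetical helper for illustration only; it is not part of Medusa's API.
function runWithRetries<T>(step: () => T, maxRetries: number): T {
  let lastError: unknown

  // One initial execution, followed by up to `maxRetries` retries.
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return step()
    } catch (error) {
      lastError = error
    }
  }

  // Every execution failed, so the failure reaches the caller.
  throw lastError
}

let executions = 0
let failed = false

try {
  runWithRetries(() => {
    executions++
    console.log("Executing step 1")
    throw new Error("Oops! Something happened.")
  }, 2)
} catch (error) {
  failed = error instanceof Error
}

console.log(`executions: ${executions}, failed: ${failed}`)
```

With maxRetries set to 2, the always-failing step runs three times in total before the error is surfaced.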
Step Retry Intervals
By default, a step is retried immediately after it fails. To specify a wait time before a step is retried, pass a retryInterval property to the step's configuration object. Its value is a number of seconds to wait before retrying the step.
For example, if retryInterval is set to 2 and the step fails, it will be retried after two seconds.
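As a rough sketch of how retryInterval and maxRetries combine, the snippet below builds a configuration object with the property names used in this chapter and computes when each execution could start. StepRetryOptions is a local type defined only for this illustration, and executionOffsets is a hypothetical helper that assumes every execution fails instantly. In a Medusa project, an object with these properties would be passed as the first parameter of createStep.

```typescript
// Local type for illustration only; it mirrors the options named in this chapter.
type StepRetryOptions = {
  name: string
  maxRetries?: number
  retryInterval?: number // seconds to wait before each retry
}

const options: StepRetryOptions = {
  name: "step-1",
  maxRetries: 2,
  retryInterval: 2,
}

// Hypothetical helper: earliest start offset (in seconds) of each execution,
// assuming every execution fails instantly.
function executionOffsets(opts: StepRetryOptions): number[] {
  const retries = opts.maxRetries ?? 0 // no retries by default
  const interval = opts.retryInterval ?? 0 // immediate retry by default
  return Array.from({ length: retries + 1 }, (_, i) => i * interval)
}

const offsets = executionOffsets(options)
console.log(offsets.join(", "))
```

Under these assumptions, the first execution starts at 0 seconds and the two retries start at 2 and 4 seconds.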
Maximum Retry Interval
The retryInterval property's maximum value is Number.MAX_SAFE_INTEGER, so you can set a very long wait time before the step is retried.
For example, to retry a step after a day, set retryInterval to 86400, which is the number of seconds in one day.
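The one-day value can be derived rather than hard-coded. The sketch below is a minimal illustration: StepRetryOptions is a local type defined only for this example, and SECONDS_PER_DAY is a hypothetical constant name. In a Medusa project, the object would be passed as the first parameter of createStep.

```typescript
// Local type for illustration only; it mirrors the options named in this chapter.
type StepRetryOptions = {
  name: string
  maxRetries?: number
  retryInterval?: number // seconds
}

// 24 hours * 60 minutes * 60 seconds = 86400 seconds.
const SECONDS_PER_DAY = 24 * 60 * 60

const retryNextDay: StepRetryOptions = {
  name: "step-1",
  maxRetries: 2,
  retryInterval: SECONDS_PER_DAY,
}

// A one-day interval is far below the maximum value stated above.
const withinLimit = SECONDS_PER_DAY <= Number.MAX_SAFE_INTEGER

console.log(retryNextDay.retryInterval, withinLimit)
```

Naming the constant keeps the unit (seconds) visible at the place where the interval is configured.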
Interval Changes Workflow to Long-Running
Setting retryInterval on a step turns any workflow that uses that step into a long-running workflow that runs asynchronously in the background. This is useful when creating workflows that may fail and should run for a long time until they succeed, such as waiting for a payment to be captured or a shipment to be delivered.
However, since the long-running workflow runs in the background, you won't receive its result or errors immediately when you execute the workflow.
Instead, you must subscribe to the workflow's execution using the Workflow Engine Module Service. Learn more about it in this chapter.