Jets Prewarming
Jets supports prewarming by periodically hitting the controller Lambda function with a noop prewarm request. This mitigates the Lambda cold-start issue and can be a more cost-effective way to avoid cold starts than using Provisioned Concurrency. Think of it as a smart, poor man’s version of Provisioned Concurrency.
Only the Lambda function that handles controller requests is prewarmed. Prewarming is enabled by default. To adjust the prewarming settings:
config/jets/deploy.rb

```ruby
Jets.deploy.configure do
  config.prewarm.enable = true # default: true
  config.prewarm.rate = "1m"   # default: 1 minute
  # config.prewarm.cron = "0 */12 * * ? *" # when configured, takes higher precedence than prewarm.rate
  # config.prewarm.threads = 2 # default: 2
end
```
Rate vs Threads
Concurrency helps when requests come in at the same time. Example: the Lambda function gets 60 requests/minute, and each request takes 1 second to process.
- Case 1: All 60 requests come in simultaneously within the first second. The desired concurrency should be 60; otherwise, 59 of 60 requests will be hit with a cold start (in the worst-case scenario).
- Case 2: The 60 requests come in serially each second for a minute. In theory, the same Lambda “container” can serve all the requests without a cold start.
The same applies to traditional servers like Puma, where you need more Puma threads and processes to handle increased concurrency. With servers, depending on the web server settings, requests can be queued and eventually returned with slow response times. With Lambda, you get a 429 Too Many Requests status code instead. In either case, the user has probably given up on the webpage and left.
Option | Explanation |
---|---|
rate | Controls how often the prewarming job runs. |
threads | Controls how many Ruby threads call the prewarm event in parallel. |
For example, with a rate of 1 hour and threads of 2, the Lambda functions are called with a prewarm request 48 times over 24 hours (24 runs x 2 threads).
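The math is simple, but here’s a small Ruby sketch (not part of Jets) that works it out for any rate and thread count. The values are illustrative:

```ruby
# prewarm requests = (period / rate) * threads
def prewarm_requests(period_minutes:, rate_minutes:, threads:)
  runs = period_minutes / rate_minutes
  runs * threads
end

puts prewarm_requests(period_minutes: 24 * 60, rate_minutes: 60, threads: 2)
# => 48 prewarm requests over 24 hours
```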
Threads < Reserved Concurrency
Theoretically, `config.prewarm.threads` should be less than `config.lambda.controller.reserved_concurrency`, otherwise the prewarm request could create 429 Too Many Requests errors. In practice, I’ve found that since the prewarm request is a noop operation that’s so fast, it does not cause throttling. It’s likely because the AWS Lambda scaling algorithm buffers a bit before deciding to scale. It’s probably safest to keep `config.prewarm.threads <= config.lambda.controller.reserved_concurrency`, since other requests in your app can take some time and keep the Lambda busy. It’s also more cost-effective.
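For example, a deploy configuration along these lines keeps the two settings in check. This is only a sketch: it assumes `config.lambda.controller.reserved_concurrency` can be set in the same config/jets/deploy.rb block, and the values are illustrative:

```ruby
Jets.deploy.configure do
  # Keep prewarm threads at or below the controller's reserved concurrency
  config.prewarm.threads = 2
  config.lambda.controller.reserved_concurrency = 10 # illustrative value
end
```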
Prewarm After Deployment
After a deployment finishes, Jets automatically prewarms the app immediately. This keeps your application nice and fast.
Prewarm Custom Headers
Jets appends `x-jets-*` headers to the response to help you see whether the Lambda function was prewarmed. The headers look something like this:
```
x-jets-boot-at: 2024-04-17 18:08:45 UTC
x-jets-prewarm-at: 2024-04-17 18:31:22 UTC
x-jets-prewarm-count: 22
x-jets-gid: cb205b47
```
Here’s a curl command that’s useful for seeing this info:

```sh
curl -svo /dev/null <REPLACE_URL> 2>&1 | grep 'x-jets'
```
We can see that the Lambda function has been prewarmed, and the `x-jets-boot-at` header shows the last time AWS Lambda recycled the Lambda function.
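If you’d rather check the headers from Ruby instead of curl, here’s a minimal sketch using the standard library’s net/http. The URL is a placeholder for your app’s endpoint:

```ruby
require "net/http"
require "uri"

# Replace with your app's URL
uri = URI("https://example.com/")
response = Net::HTTP.get_response(uri)

# Print only the x-jets-* headers to see the prewarm info
response.each_header do |name, value|
  puts "#{name}: #{value}" if name.start_with?("x-jets")
end
```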
Cost Analysis
Here’s some basic cost analysis of using the Jets prewarm feature vs Provisioned Concurrency. For these numbers, I’m tweaking the calculations by adding an extra 1M requests to cancel out the free tier, and then removing that 1M from the numbers below. The goal here is to get an idea of ballpark costs.
We’ll use the helpful AWS Lambda Calculator with baseline assumptions: ARM architecture and a 1.5GB Lambda function with an average duration of 150ms.
Here is what the costs look like if we prewarm at different rates and thread counts:
Rate and Threads | Requests/Mo | Cost/Mo |
---|---|---|
1 request/min with threads 1 | 43,800 | $0.01/mo |
1 request/min with threads 10 | 438,000 | $0.09/mo |
1 request/min with threads 30 | 1,314,000 | $1.71/mo |
10 requests/min with threads 30 | 10,314,000 | $29.03/mo |
1 request/5min with threads 10 | 87,600 | $0.02/mo |
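As a rough sanity check on the Requests/Mo column, here’s a small Ruby sketch (not part of Jets) that uses roughly 730 hours per month, which is the figure the table appears to assume:

```ruby
# requests per month = requests per minute * threads * minutes per month
MINUTES_PER_MONTH = 730 * 60 # ~43,800 minutes

def monthly_prewarm_requests(per_minute:, threads:)
  (per_minute * threads * MINUTES_PER_MONTH).round
end

puts monthly_prewarm_requests(per_minute: 1, threads: 1)        # => 43800
puts monthly_prewarm_requests(per_minute: 1, threads: 10)       # => 438000
puts monthly_prewarm_requests(per_minute: 1.0 / 5, threads: 10) # => 87600
```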
With Provisioned Concurrency, we’ll have a constant number of warm Lambda instances. Remember, these are always-on Lambdas. Here’s the baseline cost with no requests.
Provisioned Concurrency | Cost/Mo |
---|---|
1 | $12.66/mo |
30 | $379.70/mo |
As you can see, having always-running Lambdas costs you a lot more upfront.
The devil is in the details, as Provisioned Concurrency requests cost less than on-demand on a per-request basis. If you have a ton of requests, you eventually break even. For small and medium sites, on-demand Lambdas are more cost-effective. Please experiment with the calculator yourself to see the numbers. Good math students double-check their numbers.
Note: AWS updates their pricing periodically, so always check the numbers yourself. These calculations will likely be out-of-date.
Reference
The table below covers each setting. Each option is configured with `config.OPTION`. The `config.` portion is not shown for conciseness, e.g. `logger.level` vs `config.logger.level`.
Name | Default | Description |
---|---|---|
prewarm.cron | nil | When set, takes higher precedence than rate. |
prewarm.reserved_concurrency | 2 | The reserved concurrency for the prewarm Lambda function. Not a lot is needed because it only gets called at a controlled scheduled interval and after deployment. |
prewarm.enable | true | Enables prewarming. |
prewarm.memory | 1024 | Memory setting for the prewarm event. |
prewarm.rate | 1m | A rate expression. Both 1 minute and 1m work. See: AWS Scheduled Rate Expressions and fugit |
prewarm.threads | 2 | Number of threads used to send simultaneous prewarm requests to the controller Lambda function. prewarm.threads should be less than or equal to lambda.controller.reserved_concurrency. |
prewarm.timeout | 5m | Timeout of the prewarm event. |
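Putting it together, a config/jets/deploy.rb that touches each of these settings might look like the sketch below. The values are illustrative, not recommendations, and they assume the same value formats shown in the table above:

```ruby
Jets.deploy.configure do
  config.prewarm.enable = true             # default: true
  config.prewarm.rate = "30m"              # both "1 minute" and "1m" style expressions work
  # config.prewarm.cron = "0 */12 * * ? *" # when set, takes higher precedence than rate
  config.prewarm.threads = 2               # keep <= lambda.controller.reserved_concurrency
  config.prewarm.reserved_concurrency = 2  # reserved concurrency for the prewarm Lambda function
  config.prewarm.memory = 1024             # memory for the prewarm event
  config.prewarm.timeout = "5m"            # timeout of the prewarm event
end
```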