Scaling speed increased by 12 times! Big improvements to AWS Lambda functions!

2024.02.09


cloud computing, cloud native
The new independent scaling system allows functions to reach concurrency targets faster, which will be even better for scenarios such as breaking news or flash sales.

Compiled | Xingxuan

Produced by | 51CTO technology stack (WeChat ID: blog51cto)

Marcia is a Principal Developer Advocate for Amazon Web Services with 20 years of experience building and scaling applications in the software industry. She is passionate about designing systems that take full advantage of the cloud and embrace a DevOps culture. She recently published a blog post announcing a major improvement to AWS Lambda: functions now scale up 12 times faster.

1. Lambda functions updated: scaling is now 12 times faster

AWS Lambda now scales 12x faster. Each synchronously invoked Lambda function now scales by 1,000 concurrent executions every 10 seconds until the aggregate concurrency of all functions reaches the account's concurrency limit. Additionally, every function in an account now scales independently of the others, regardless of how it is invoked. These improvements come at no additional cost and require no configuration changes to existing functions.

Building scalable and high-performance applications using traditional architectures can be challenging, often requiring over-provisioning of computing resources or complex caching solutions to meet peak demand and unpredictable traffic. Many developers choose Lambda because it can scale on demand when applications face unpredictable traffic.

Prior to this update, Lambda functions could initially scale to 500-3,000 concurrent executions at the account level in the first minute (depending on region), then 500 concurrent executions per minute until the account's concurrency limit was reached.
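The 12x figure follows from comparing the two steady-state rates on a per-minute basis. The short sketch below only restates the numbers quoted in this article; the variable names are illustrative and not part of any AWS API.

```python
# Scaling rates quoted in the article, normalized to executions per minute.
OLD_RATE_PER_MINUTE = 500        # before the update, after the initial burst
NEW_RATE_PER_10_SECONDS = 1_000  # after the update, per function

# There are six 10-second intervals in one minute.
new_rate_per_minute = NEW_RATE_PER_10_SECONDS * (60 // 10)
speedup = new_rate_per_minute / OLD_RATE_PER_MINUTE

print(new_rate_per_minute)  # 6000
print(speedup)              # 12.0
```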

Because this scaling limit was shared by all functions in the same account and Region, a sudden influx of traffic to one function could reduce the throughput of the other functions in that account. This was a noisy neighbor scenario: one busy function lowered the concurrency available to the rest, and teams had to spend engineering effort monitoring functions that might push the account over its limits.

Now, with these scaling improvements, customers with large changes in traffic can hit their concurrency goals faster than before. For example, a news website that publishes breaking news stories or an online store that runs a flash sale will experience an influx of visitors. Thanks to these improvements, they now scale up to 12 times faster than before.

In addition, customers who use services such as Amazon Athena and Amazon Redshift with Lambda-based scalar UDFs to perform data enrichment or data transformation will benefit from these improvements. These services batch the data and pass it to Lambda in chunks, invoking many function instances in parallel. The improved concurrency scaling behavior helps Lambda scale quickly enough to meet service level agreements (SLAs).

2. How does it look in practice?

The following image shows a function receiving and processing requests in 10-second intervals. The account concurrency limit is set to 7,000 concurrent requests and is shared among all functions in the same account. The scaling rate for each function is fixed at 1,000 concurrent executions every 10 seconds. This rate is independent of other functions in the same account, making it easier to predict how the function will scale and throttle requests if needed.

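[Figure: concurrency of one function over time in 10-second intervals, with the account concurrency limit at 7,000]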

  • 09:00:00 – The function has been running for some time and has 1,000 concurrent executions being processed.
  • 09:00:10 – Ten seconds later, there is another burst of 1,000 new requests. The function handles them without any problem, because it can scale by 1,000 concurrent executions every 10 seconds.
  • 09:00:20 – The same thing happens here: a thousand new requests.
  • 09:00:30 – The function now receives 1,500 new requests. Because the function's maximum scaling capacity is 1,000 requests every 10 seconds, 500 of those requests are throttled.
  • 09:01:00 – At this point, the function is already processing 4,500 concurrent requests, and 3,000 new requests suddenly arrive. Lambda handles 1,000 of them and throttles the other 2,000, because the function can scale by only 1,000 requests every 10 seconds.
  • 09:01:10 – Ten seconds later, another burst of 2,000 requests arrives, and the function can now handle 1,000 more requests. The remaining 1,000 requests are throttled, again because of the scaling rate of 1,000 requests every 10 seconds.
  • 09:01:20 – The function is now handling 6,500 concurrent requests and receives 1,000 more. The first 500 of these requests are processed, but the other 500 are throttled because the function reaches the account concurrency limit of 7,000 requests. Remember that you can increase your account concurrency limit by creating a support ticket in the AWS Management Console.
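
The timeline above can be reproduced with a small model. This is a simplified sketch, not how Lambda is implemented: the function `admit` and its parameter names are made up for illustration, it looks at one 10-second interval at a time, and the "current concurrency" values are the ones stated in or implied by the bullets.

```python
def admit(current, incoming, scale_step=1_000, account_limit=7_000):
    """Return (accepted, throttled) for one 10-second interval.

    Assumption: a burst is capped both by the per-function scaling step
    and by the headroom left under the account concurrency limit.
    """
    headroom = max(account_limit - current, 0)
    accepted = min(incoming, scale_step, headroom)
    return accepted, incoming - accepted


# (time, concurrency already in use, new requests in this interval)
timeline = [
    ("09:00:10", 1_000, 1_000),
    ("09:00:20", 2_000, 1_000),
    ("09:00:30", 3_000, 1_500),
    ("09:01:00", 4_500, 3_000),
    ("09:01:10", 5_500, 2_000),
    ("09:01:20", 6_500, 1_000),
]

results = {t: admit(current, incoming) for t, current, incoming in timeline}
for t, (accepted, throttled) in results.items():
    print(t, accepted, throttled)
# 09:00:10 1000 0
# 09:00:20 1000 0
# 09:00:30 1000 500
# 09:01:00 1000 2000
# 09:01:10 1000 1000
# 09:01:20 500 500
```

In the first five rows the scaling step is the binding constraint. Only in the last row does the account limit take over, because just 500 units of headroom remain.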

If you have multiple functions in your account, they scale independently until the account's total concurrency limit is reached. After that, all new invocations are throttled.

3. Availability and specific rules

These scaling improvements are enabled by default for all functions. According to the announcement, AWS rolled them out gradually from November 26, 2023 through mid-December to all AWS Regions except the China and GovCloud Regions.

The specific rules are as follows:

Lambda does not accumulate the unused portion of the concurrency scaling rate. This means that, at any moment, your scaling rate is at most 1,000 concurrency units per 10-second interval. For example, if none of the available 1,000 units are used during one 10-second interval, they are not added to the next interval. For the next 10 seconds, the concurrency scaling rate remains at 1,000.
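
To make the "no accumulation" rule concrete, the sketch below contrasts it with a hypothetical budget that carries unused capacity forward. The function `admitted_per_interval` and the `carry_over` flag are illustrative only. Lambda's behavior as described above corresponds to `carry_over=False`, and the `carry_over=True` case exists purely for comparison.

```python
def admitted_per_interval(demand, step=1_000, carry_over=False):
    """Return how many new executions are admitted in each 10-second interval.

    demand: new requests arriving in each interval.
    carry_over=False models the rule described in the article: the budget
    resets to `step` every interval and unused capacity is discarded.
    carry_over=True is a hypothetical alternative, shown only for contrast.
    """
    admitted = []
    leftover = 0
    for requested in demand:
        budget = step + (leftover if carry_over else 0)
        accepted = min(requested, budget)
        admitted.append(accepted)
        leftover = budget - accepted
    return admitted


# An idle interval followed by a burst of 2,000 requests.
demand = [0, 2_000]
print(admitted_per_interval(demand))                   # [0, 1000]
print(admitted_per_interval(demand, carry_over=True))  # [0, 2000]
```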

As long as your function keeps receiving more and more requests, Lambda scales as fast as it can, up to your account's concurrency limit. You can limit the amount of concurrency that a single function can use by configuring reserved concurrency. If requests arrive faster than the function can scale, or if the function is already at maximum concurrency, additional requests fail with a throttling error (429 status code).
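
Because throttled requests fail with a 429 status code, callers usually retry them after a short wait. The following is a minimal sketch of retrying with exponential backoff and jitter. It is not part of the original post: `invoke_with_backoff` and the `invoke` callable are hypothetical stand-ins for whatever client code actually calls the function, and AWS SDKs may already retry throttling errors on their own.

```python
import random
import time


def invoke_with_backoff(invoke, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call invoke() until it returns a non-429 status or attempts run out.

    invoke: hypothetical zero-argument callable returning an HTTP-style
    status code. Returns (last_status, attempts_made).
    """
    for attempt in range(max_attempts):
        status = invoke()
        if status != 429:
            return status, attempt + 1
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter before the next try.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
    return 429, max_attempts


# Stand-in caller that is throttled twice and then succeeds.
responses = iter([429, 429, 200])
print(invoke_with_backoff(lambda: next(responses), sleep=lambda _: None))  # (200, 3)
```

Passing `sleep` as a parameter keeps the example fast to run and easy to test. Real callers would leave the default `time.sleep` in place.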

4. Summary

Previously, Lambda functions shared scaling limits at the account level, which could cause throughput problems for other functions when one function received heavy traffic. After the update, AWS Lambda functions scale up to 12 times faster. Each function now scales by 1,000 concurrent executions every 10 seconds, independent of other functions in the same account, until the account's aggregate concurrency limit is reached. The update requires no additional cost or configuration changes and will greatly benefit applications facing unpredictable traffic by allowing rapid scaling.

In addition, services such as Amazon Athena and Amazon Redshift that use Lambda for data processing will see performance gains as a result of this update. These improvements are enabled by default and apply to all AWS Regions except the China and GovCloud Regions.