Amazon’s cloud computing arm today introduced a variety of new products at its AWS Summit in San Francisco, including two serverless offerings. The first is the general availability of Amazon Aurora Serverless V2, a serverless database service that can now scale up and down substantially faster than its predecessor and in more fine-grained increments. The other is the general availability of SageMaker Serverless Inference. Both services initially launched in beta at AWS re:Invent in December. According to Swami Sivasubramanian, AWS’s VP of database, analytics, and machine learning, more than 100,000 AWS customers use Aurora for their database workloads today, and the service remains the fastest-growing AWS offering.
He explained that with version 1, scaling database capacity took anywhere from five to forty seconds, and capacity could only change by doubling. “Customers didn’t need to worry about maintaining database capacity since it’s serverless,” Sivasubramanian added. “However, when we were talking to customers more and more, they stated that to run a wide range of production workloads with [Aurora] Serverless V1, consumers need the capacity to expand in fractions of a second and then in much more fine-grained increments, not just doubling in terms of capacity.”
Compared with the cost of provisioning for peak capacity, Sivasubramanian claims this new approach can cut users’ database costs by up to 90%. Moving to v2 has no disadvantages, according to him, and all of the benefits of v1 remain available. To get there, the team reworked the underlying compute platform and storage engine, enabling the service to scale in these small increments and to do so much more quickly. “The crew did a very outstanding piece of engineering,” he remarked.
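In practice, that fine-grained scaling is expressed as a minimum/maximum capacity range measured in Aurora Capacity Units (ACUs), which Serverless v2 moves between in half-ACU steps. The sketch below shows how such a cluster could be created with boto3’s RDS API; the cluster identifier, engine choice, credentials, and capacity bounds are illustrative placeholders, not a production configuration:

```python
# Scaling bounds in Aurora Capacity Units (ACUs); Serverless v2 adjusts
# capacity between them in fine-grained 0.5-ACU increments.
scaling_config = {
    "MinCapacity": 0.5,   # illustrative lower bound
    "MaxCapacity": 16.0,  # illustrative upper bound
}

def create_serverless_v2_cluster(cluster_id: str):
    """Create an Aurora MySQL cluster that scales with Serverless v2.

    Requires AWS credentials with RDS permissions; the username and
    password below are placeholders.
    """
    import boto3  # imported here so scaling_config is usable without boto3
    rds = boto3.client("rds")
    return rds.create_db_cluster(
        DBClusterIdentifier=cluster_id,
        Engine="aurora-mysql",
        MasterUsername="admin",
        MasterUserPassword="change-me",  # placeholder
        ServerlessV2ScalingConfiguration=scaling_config,
    )
```

Because the scaling range is just a parameter on the cluster, widening or narrowing it later doesn’t require re-architecting the application.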
Customers including Venmo, Pagely, and Zendesk are already using the new system, which launched in beta last December, and AWS says migrating workloads from Aurora Serverless v1 to v2 isn’t difficult. Sivasubramanian also said that SageMaker Serverless Inference, now generally available, offers organizations a pay-as-you-go way to deploy their machine learning models — particularly those that sit idle much of the time — into production.
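The pay-as-you-go model works by attaching a serverless configuration to a SageMaker endpoint instead of provisioning instances: you cap memory and concurrency, and billing follows invocations. A minimal sketch using boto3’s SageMaker API, assuming a model has already been registered (the names and sizing values are illustrative):

```python
def build_serverless_endpoint_config(config_name: str, model_name: str) -> dict:
    """Parameters for a SageMaker endpoint config with ServerlessConfig.

    With ServerlessConfig set, no instance type is provisioned; memory
    and concurrency values here are illustrative.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "ServerlessConfig": {
                    "MemorySizeInMB": 2048,  # 1024-6144, in 1 GB steps
                    "MaxConcurrency": 5,     # cap on concurrent invocations
                },
            }
        ],
    }

def deploy_serverless_endpoint(config_name: str, model_name: str):
    """Create the endpoint config and endpoint.

    Requires AWS credentials with SageMaker permissions.
    """
    import boto3  # imported here so the builder above stays dependency-free
    sm = boto3.client("sagemaker")
    sm.create_endpoint_config(
        **build_serverless_endpoint_config(config_name, model_name)
    )
    sm.create_endpoint(EndpointName=config_name, EndpointConfigName=config_name)
```

For a model that is idle most of the day, this replaces an always-on instance with per-request billing, which is where the cost savings for intermittent workloads come from.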
AWS currently offers four inference options: Serverless Inference; Real-Time Inference for workloads where low latency is critical; SageMaker Batch Transform for processing data in batches; and SageMaker Asynchronous Inference for workloads with large payloads that may need longer processing times. With so many options, it’s no wonder that AWS also provides the SageMaker Inference Recommender to help customers determine the best way to deploy their models.