Amazon Web Services announced an AI chatbot for enterprise use, new generations of its AI training chips, expanded partnerships and more during AWS re:Invent, held from November 27 to December 1, in Las Vegas.
AWS CEO Adam Selipsky’s keynote on day two of the conference focused on generative AI and how cloud services can enable organizations to train powerful models.
AWS announced new generations of its Graviton chips, which are server processors for cloud workloads, and its Trainium chips, which provide compute power for training AI foundation models.
Graviton4 (Figure A) has 30% better compute performance, 50% more cores and 75% more memory bandwidth than Graviton3, Selipsky said. The first instances based on Graviton4 will be the R8g instances for Amazon EC2, aimed at memory-intensive workloads.
Trainium2 is coming to Amazon EC2 Trn2 instances, and each instance will be able to scale up to 100,000 Trainium2 chips. That provides the ability to train a 300-billion parameter large language model in weeks, AWS stated in a press release.
Figure A
Anthropic will use Trainium and Amazon’s high-performance machine learning chip Inferentia for its AI models, Selipsky and Dario Amodei, chief executive officer and co-founder of Anthropic, announced. These chips may help Amazon muscle into Microsoft’s space in the AI chip market.
Selipsky made several announcements about Amazon Bedrock, the foundation model building service, during re:Invent:
Amazon launched its own generative AI assistant, Amazon Q, designed for natural language interactions and content generation at work. It can integrate with an organization’s existing identities, roles and security permissions.
Amazon Q can be used throughout an organization and can access a wide range of other business software. Amazon is pitching Amazon Q as business-focused and specialized for individual employees who may ask specific questions about their sales or tasks.
Amazon Q is especially suited for developers and IT pros working within AWS CodeCatalyst because it can help troubleshoot errors or network connections. Amazon Q will be available in the AWS Management Console and documentation, within CodeWhisperer, in the serverless computing platform AWS Lambda, and in workplace communication apps like Slack (Figure B).
Figure B
Amazon Q has a feature that allows application developers to update their applications using natural language instructions. This feature of Amazon Q is available in preview in AWS CodeCatalyst today and will soon be coming to supported integrated development environments.
SEE: Data governance is one of the many factors that need to be considered during generative AI deployment. (TechRepublic)
Many Amazon Q features within other Amazon services and products are available in preview today. For example, contact center administrators can access Amazon Q in Amazon Connect now.
Amazon S3 Express One Zone, now in general availability, is a new S3 storage class purpose-built for high-performance, low-latency cloud object storage for frequently accessed data, Selipsky said. It’s designed for workloads that require single-digit millisecond latency, such as finance or machine learning. Today, customers move data from S3 to custom caching solutions; with Amazon S3 Express One Zone, they can choose their own geographical availability zone and place their frequently accessed data next to their high-performance computing. Selipsky said Amazon S3 Express One Zone offers 50% lower access costs than standard Amazon S3.
On Nov. 27, AWS announced that Salesforce’s partnership with Amazon will expand, making certain Salesforce CRM products available on AWS Marketplace. Specifically, Salesforce’s Data Cloud, Service Cloud, Sales Cloud, Industry Clouds, Tableau, MuleSoft, Platform and Heroku will be available for joint customers of Salesforce and AWS in the U.S. More products are expected to become available, and geographical availability is expected to expand next year.
New options include:
“Salesforce and AWS make it easy for developers to securely access and leverage data and generative AI technologies to drive rapid transformation for their organizations and industries,” Selipsky said in a press release.
Conversely, AWS will be using Salesforce products such as Salesforce Data Cloud more often internally.
ETL can be a cumbersome part of coding with transactional data. Last year, Amazon announced a zero-ETL integration between Amazon Aurora MySQL and Amazon Redshift.
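To illustrate why ETL is cumbersome, here is a minimal sketch of a hand-rolled extract-transform-load step — the kind of glue code a zero-ETL integration aims to eliminate. The table names, column names and data are all hypothetical, and SQLite stands in for both the transactional store and the analytics warehouse:

```python
import sqlite3

# Hypothetical transactional store (a stand-in for, e.g., Aurora).
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER, region TEXT)")
src.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(1, 1250, "us-east"), (2, 900, "us-east"), (3, 4100, "eu-west")])

# Hypothetical analytics store (a stand-in for, e.g., Redshift).
dst = sqlite3.connect(":memory:")
dst.execute("CREATE TABLE sales_by_region (region TEXT, total_dollars REAL)")

# Extract: pull raw transactional rows.
rows = src.execute("SELECT region, amount_cents FROM orders").fetchall()

# Transform: aggregate and convert units in application code.
totals = {}
for region, cents in rows:
    totals[region] = totals.get(region, 0) + cents / 100.0

# Load: write the reshaped data into the analytics store.
dst.executemany("INSERT INTO sales_by_region VALUES (?, ?)", totals.items())

print(dst.execute(
    "SELECT region, total_dollars FROM sales_by_region ORDER BY region").fetchall())
# -> [('eu-west', 41.0), ('us-east', 21.5)]
```

With a zero-ETL integration, this extract-transform-load plumbing (and the need to schedule and monitor it) disappears: the transactional data becomes queryable from the analytics side automatically.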
Today AWS introduced three more zero-ETL integrations with Amazon Redshift, all available globally in preview now.
Next, Amazon wanted to make searching transactional data smoother; many people use Amazon OpenSearch Service for this. In response, Amazon announced that the DynamoDB zero-ETL integration with OpenSearch Service is available today.
Plus, to make data more discoverable in Amazon DataZone, Amazon added a new generative AI capability that adds business descriptions to data sets.
Amazon One Enterprise enables security management for access to physical locations in industries such as hospitality, education or technology. It’s a fully managed online service paired with the Amazon One palm scanner for biometric authentication, administered through the AWS Management Console. Amazon One Enterprise is currently available in preview in the U.S.
NVIDIA announced a new set of GPUs available through AWS: the NVIDIA L4, L40S and H200 GPUs. AWS will be the first cloud provider to bring the H200 chips with NVLink to the cloud. Through this link, the GPU and CPU can share memory to speed up processing, NVIDIA CEO Jensen Huang explained during Selipsky’s keynote. Amazon EC2 G6e instances featuring NVIDIA L40S GPUs and Amazon EC2 G6 instances powered by L4 GPUs will start to roll out in 2024.
In addition, the NVIDIA DGX Cloud, NVIDIA’s AI building platform, is coming to AWS. An exact date for its availability hasn’t yet been announced.
NVIDIA brought on AWS as a primary partner in Project Ceiba, NVIDIA’s 65 exaflop supercomputer including 16,384 NVIDIA GH200 Superchips.
Also announced during re:Invent was NVIDIA NeMo Retriever, which lets enterprise customers get more accurate responses from their multimodal generative AI applications using retrieval-augmented generation.
Specifically, NVIDIA NeMo Retriever is a semantic-retrieval microservice that connects custom LLMs to applications. NVIDIA NeMo Retriever’s embedding models determine the semantic relationships between words. Then, that data is fed into an LLM, which processes and analyzes the textual data. Business customers can connect that LLM to their own data sources and knowledge bases.
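The retrieve-then-augment flow described above can be sketched in miniature. This is not NeMo Retriever’s API — it is a generic retrieval-augmented generation illustration, with a toy bag-of-words "embedding" standing in for a real embedding model, and all documents and names invented for the example:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical knowledge base an enterprise might connect.
docs = [
    "Trainium2 instances can scale up to 100,000 chips for model training",
    "S3 Express One Zone offers single-digit millisecond latency",
    "Amazon Q integrates with existing roles and permissions",
]

query = "how many chips can training instances scale to"

# Retrieve: rank documents by similarity to the query's embedding.
q_vec = embed(query)
best = max(docs, key=lambda d: cosine(q_vec, embed(d)))

# Augment: prepend the retrieved context to the prompt sent to the LLM.
prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
print(best)
# -> Trainium2 instances can scale up to 100,000 chips for model training
```

A production retriever replaces the word-count vectors with learned embeddings that capture semantic rather than purely lexical similarity, which is the part NeMo Retriever’s embedding models provide.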
NVIDIA NeMo Retriever is available in early access now through the NVIDIA AI Enterprise software platform, which can be accessed through AWS Marketplace.
Early partners working with NVIDIA on retrieval-augmented generation services include Cadence, Dropbox, SAP and ServiceNow.
Note: TechRepublic is covering AWS re:Invent virtually.