We are excited to announce that Hugging Face’s popular open models are now available on Amazon Bedrock through the new Bedrock Marketplace. AWS customers can now use Bedrock Marketplace to deploy 83 open models and build generative AI applications with them.
Under the hood, endpoints for Bedrock Marketplace models are managed by Amazon SageMaker JumpStart. With Bedrock Marketplace, you can now combine the ease of use of SageMaker JumpStart with the fully managed infrastructure of Amazon Bedrock, including compatibility with high-level APIs such as Agents, Knowledge Bases, Guardrails, and Model Evaluations.
When you register a SageMaker JumpStart endpoint with Amazon Bedrock, you pay only for the SageMaker compute resources, and the regular Amazon Bedrock API pricing applies.
This blog post describes how to deploy Google Gemma 2 27B Instruct and use the model with the Amazon Bedrock APIs. You will learn how to:
- Deploy Google Gemma 2 27B Instruct
- Send requests using the Amazon Bedrock APIs
- Clean up
Deploy Google Gemma 2 27B Instruct
There are two ways to deploy open models for use with Amazon Bedrock.
You can deploy open models from the Bedrock Model Catalog, or you can deploy open models with Amazon SageMaker JumpStart and register them with Bedrock.
Since both methods are similar, we will focus on the Bedrock Model Catalog approach.
First, in the Amazon Bedrock console, make sure you are in one of the 14 regions where Bedrock Marketplace is available. Then, select “Model catalog” in the “Foundation models” section of the navigation pane. Here you can find both serverless models and models available on Amazon Bedrock Marketplace. Filter the results by the “Hugging Face” provider to see the 83 open models available.
For example, search for “Google Gemma 2 27B Instruct” and select it.
Selecting a model opens its model details page, where you can see detailed information from the model provider, including model highlights and usage instructions with sample API calls.
Click “Deploy” in the top right.
The deployment page appears, where you can select the endpoint name, the instance configuration, and advanced settings related to the network configuration and the service role that SageMaker will use to perform the deployment. Let’s keep the default advanced settings and the recommended instance type.
You must also accept the model provider’s end user license agreement.
Click “Deploy” at the bottom right.
This starts the deployment of a Google Gemma 2 27B Instruct model on an ml.g5.48xlarge instance, hosted in your Amazon SageMaker tenant and compatible with the Amazon Bedrock APIs.
Deploying the endpoint may take several minutes. It appears on the “Marketplace deployments” page in the “Foundation models” section of the navigation pane.
Use models with the Amazon Bedrock API
You can quickly test the model in the playground through the UI. However, to invoke the deployed model programmatically with the Amazon Bedrock APIs, you need to obtain the endpoint ARN.
Select your model deployment from the list of managed deployments and copy its endpoint ARN.
You can use the AWS SDK in your preferred language or use the AWS CLI to query your endpoints.
Below is an example of using the Bedrock Converse API through the AWS SDK for Python (boto3).
import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

endpoint_arn = "arn:aws:sagemaker:::endpoint/"

inference_config = {
    "maxTokens": 256,
    "temperature": 0.1,
    "topP": 0.999,
}
additional_model_fields = {"parameters": {"repetition_penalty": 0.9, "top_k": 250, "do_sample": True}}

response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [{"text": "What is Amazon doing in the field of generative AI?"}],
        }
    ],
    inferenceConfig=inference_config,
    additionalModelRequestFields=additional_model_fields,
)
print(response["output"]["message"]["content"][0]["text"])
“Amazon is making great strides in the field of generative AI, applying it to a variety of products and services. Here’s a breakdown of the company’s key efforts:\n\n**1. Amazon Bedrock:**\n\n* This is Amazon’s foundational **fully managed service** that allows developers to build and scale generative AI applications using models from Amazon and other leading AI companies.\n* It provides access to a large family of language models, including Amazon Titan (LLM) and Cohere’s models for text generation.
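The print call above indexes directly into the response. If you prefer a reusable helper, the sketch below walks the same Converse-style response structure and joins any text blocks it finds; the sample response dict is a hypothetical stand-in for a real API result, not actual output.

```python
# Minimal sketch: extract the assistant's text from a Converse-API-shaped
# response dict. The sample below mimics the response structure used above;
# it is a stand-in for illustration, not a real Bedrock API result.

def extract_text(response: dict) -> str:
    """Return the concatenated text blocks of the assistant message."""
    content = response["output"]["message"]["content"]
    # The content list may hold several blocks; keep only the text ones.
    return "".join(block["text"] for block in content if "text" in block)

sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [{"text": "Amazon is investing heavily in generative AI."}],
        }
    }
}

print(extract_text(sample_response))
# → Amazon is investing heavily in generative AI.
```

A helper like this also degrades gracefully if the model returns several content blocks instead of one.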
That’s it! For more information, please refer to the Bedrock documentation.
Clean up
Remember to delete the endpoint when you are done experimenting, so you don’t incur unnecessary costs. On the page where you copied the endpoint ARN, you can delete the endpoint by clicking “Delete” at the top right.