Accelerating AI innovation: Scale MCP servers for enterprise workloads with Amazon Bedrock


Generative AI has been evolving at a rapid pace, with new tools, offerings, and models released frequently. According to Gartner, agentic AI is one of the top technology trends of 2025, and organizations are building prototypes of how to use agents in their enterprise environment. Agents depend on tools, and each tool might have its own mechanism to send and receive information. Model Context Protocol (MCP) by Anthropic is an open source protocol that attempts to solve this challenge. It provides a protocol and communication standard that is cross-compatible with different tools, and can be used by an agentic application's large language model (LLM) to connect to enterprise APIs or external tools using a common mechanism. However, large enterprise organizations like financial services tend to have complex data governance and operating models, which makes it challenging to implement agents working with MCP.

One major challenge is the siloed approach in which individual teams build their own tools, leading to duplicated effort and wasted resources. This approach slows down innovation and creates inconsistencies in integrations and enterprise design. Additionally, managing multiple disconnected MCP tools across teams makes it difficult to scale AI initiatives effectively. These inefficiencies hinder enterprises from fully taking advantage of generative AI for tasks like post-trade processing, customer service automation, and regulatory compliance.

In this post, we present a centralized MCP server implementation using Amazon Bedrock that offers an innovative approach by providing shared access to tools and resources. With this approach, teams can focus on building AI capabilities rather than spending time developing or maintaining tools. By standardizing access to resources and tools through MCP, organizations can accelerate the development of AI agents, so teams can reach production faster. Additionally, a centralized approach provides consistency and standardization and reduces operational overhead, because the tools are managed by a dedicated team rather than across individual teams. It also enables centralized governance that enforces controlled access to MCP servers, which reduces the risk of data exfiltration and prevents unauthorized or insecure tool use across the organization.

Solution overview

The following figure illustrates a proposed solution based on a financial services use case that uses MCP servers across multiple lines of business (LoBs), such as compliance, trading, operations, and risk management. Each LoB performs distinct functions tailored to its specific business. For instance, the trading LoB focuses on trade execution, while the risk LoB performs risk limit checks. To perform these functions, each division provides a set of MCP servers that facilitate actions and access to relevant data within their LoBs. These servers are accessible to agents developed within the respective LoBs and can also be exposed to agents outside those LoBs.

The development of MCP servers is decentralized. Each LoB is responsible for developing the servers that support its specific functions. When the development of a server is complete, it is hosted centrally and made accessible across LoBs. It takes the form of a registry or marketplace that facilitates integration of AI-driven solutions across divisions while maintaining control and governance over shared resources.

In the following sections, we explore what the solution looks like at a conceptual level.

Agentic application interaction with a central MCP server hub

The following flow diagram showcases how an agentic application built using Amazon Bedrock interacts with one of the MCP servers located in the MCP server hub.

The flow consists of the following steps:

  1. The application connects to the central MCP hub through the load balancer and requests a list of available tools from the specific MCP server. This can be fine-grained based on which servers the agentic application has access to.
  2. The trade server responds with the list of tools available, including details such as tool name, description, and required input parameters.
  3. The agentic application invokes an Amazon Bedrock agent and provides the list of tools available.
  4. Using this information, the agent determines what to do next based on the given task and the list of tools available to it.
  5. The agent chooses the most suitable tool and responds with the tool name and input parameters. Control comes back to the agentic application.
  6. The agentic application requests the execution of the tool through the MCP server using the tool name and input parameters.
  7. The trade MCP server executes the tool and returns the results of the execution back to the application.
  8. The application returns the results of the tool execution back to the Amazon Bedrock agent.
  9. The agent observes the tool execution results and determines the next step.
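The preceding steps can be sketched as a simple orchestration loop in the agentic application. The following is a minimal, self-contained sketch: the function names (`list_tools`, `invoke_agent`, `call_tool`) and the stubbed responses are illustrative assumptions, not part of any SDK; in a real application these would be network calls to the MCP hub and Amazon Bedrock.

```python
# Minimal sketch of the agentic loop described above. All MCP and
# Bedrock interactions are stubbed with hypothetical functions.

def list_tools():
    # Steps 1-2: the MCP server returns its tool catalog
    return [{
        "name": "executeTrade",
        "description": "Execute a trade for a ticker, quantity, and price",
        "inputSchema": {"ticker": "string", "quantity": "number", "price": "number"},
    }]

def invoke_agent(task, tools, observations):
    # Steps 3-5 and 9: the agent picks a tool, or finishes once it
    # has observed a successful execution result
    if any(obs.get("status") == "Executed" for obs in observations):
        return {"done": True, "answer": "Trade completed"}
    return {"done": False, "tool": "executeTrade",
            "input": {"ticker": "AMZN", "quantity": 100, "price": 186}}

def call_tool(name, tool_input):
    # Steps 6-7: the MCP server executes the requested tool
    return {"tradeId": "T12345", "status": "Executed"}

def run(task):
    tools = list_tools()                                  # steps 1-2
    observations = []
    while True:
        decision = invoke_agent(task, tools, observations)  # steps 3-5, 9
        if decision["done"]:
            return decision["answer"]
        result = call_tool(decision["tool"], decision["input"])  # steps 6-7
        observations.append(result)                              # step 8
```

The loop terminates when the agent decides no further tool calls are needed, which mirrors step 9 in the flow.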

Let's dive into the technical architecture of the solution.

Architecture overview

The following diagram illustrates the architecture to host the centralized cluster of MCP servers for an LoB.

The architecture can be split into four sections:

  • MCP server discovery API
  • Agentic applications
  • Central MCP server hub
  • Tools and resources

Let's explore each section in detail:

  • MCP server discovery API – This API is a dedicated endpoint for discovering the various MCP servers. Different teams can call this API to find what MCP servers are available in the registry; read their description, tool, and resource details; and decide which MCP server would be the right one for their agentic application. When a new MCP server is published, it is added to an Amazon DynamoDB database. MCP server owners are responsible for keeping the registry information up to date.
  • Agentic application – The agentic applications are hosted on AWS Fargate for Amazon Elastic Container Service (Amazon ECS) and built using Amazon Bedrock Agents. Teams can also use the newly launched open source AWS Strands Agents SDK, or other agentic frameworks of their choice, to build the agentic application, along with their own containerized solution to host it. The agentic applications access Amazon Bedrock through a secure private virtual private cloud (VPC) endpoint, and likewise use private VPC endpoints to access MCP servers.
  • Central MCP server hub – This is where the MCP servers are hosted. Access to the servers is enabled through an AWS Network Load Balancer. Technically, each server is a Docker container hosted on Amazon ECS, but you can choose your own container deployment solution. Each server can scale individually without impacting the others. These servers in turn connect to one or more tools using private VPC endpoints.
  • Tools and resources – This component holds the tools, such as databases, another application, Amazon Simple Storage Service (Amazon S3), or other tools. For enterprises, access to the tools and resources is provided only through private VPC endpoints.
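To make the registry concrete, the following sketch shows the shape of an entry an MCP server owner might publish to the DynamoDB-backed registry. The attribute names (`id`, `description`, `server`) match the fields the discovery API returns later in this post; the specific values, the helper function, and the table name in the comment are assumptions for illustration.

```python
# Sketch of publishing a new MCP server entry to the registry.
# Attribute names follow the discovery API's response fields;
# values and table name are hypothetical.

def build_registry_item(server_id, description, endpoint):
    """Build the registry entry an MCP server owner publishes."""
    return {
        "id": server_id,
        "description": description,
        "server": endpoint,
    }

item = build_registry_item(
    "trade-execution",
    "Executes equity trades and sends trade details to downstream systems",
    "http://mcp-hub.internal:8000/trade",
)

# With boto3, the owner would then write the item (not executed here):
#   boto3.resource("dynamodb").Table("mcp-server-registry").put_item(Item=item)
```

Keeping the entry this small means the discovery API can return the whole registry cheaply, while richer tool metadata stays with the MCP server itself.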

Benefits of the solution

The solution offers the following key benefits:

  • Scalability and resilience – Because you're using Amazon ECS on Fargate, you get scalability out of the box without managing infrastructure or handling scaling concerns. Amazon ECS automatically detects and recovers from failures by restarting failed MCP server tasks locally or reprovisioning containers, minimizing downtime. It can also redirect traffic away from unhealthy Availability Zones and rebalance tasks across healthy Availability Zones to provide uninterrupted access to the server.
  • Security – Access to MCP servers is secured at the network level through network controls such as AWS PrivateLink. This makes sure the agentic application only connects to trusted MCP servers hosted by the organization, and vice versa. Each Fargate workload runs in an isolated environment, which prevents resource sharing between tasks. For application authentication and authorization, we recommend using an MCP Auth Server (refer to the following GitHub repo) to hand off these tasks to a dedicated component that can scale independently.

At the time of writing, the MCP protocol doesn't provide built-in mechanisms for user-level access control or authorization. Organizations requiring user-specific access restrictions must implement additional security layers on top of the MCP protocol. For a reference implementation, refer to the following GitHub repo.

Let's dive deeper into the implementation of this solution.

Use case

The implementation is based on a financial services use case involving post-trade execution. Post-trade execution refers to the processes and steps that take place after an equity buy/sell order has been placed by a customer. It involves many steps, including verifying trade details, the actual transfer of assets, providing a detailed report of the execution, running fraud checks, and more. To simplify the demo, we focus on the order execution step.

Although this use case is tailored to the financial industry, you can apply the architecture and the approach to other enterprise workloads as well. The complete code of this implementation is available on GitHub. We use the AWS Cloud Development Kit (AWS CDK) for Python to deploy this solution, which creates an agentic application connected to tools through the MCP server. It also creates a Streamlit UI to interact with the agentic application.

The following code snippet provides access to the MCP discovery API:

import json

import boto3

# cors_headers and DDBTBL_MCP_SERVER_REGISTRY are defined elsewhere
# in the Lambda module (see the complete code in the GitHub repository)

def get_server_registry():
    # Initialize DynamoDB resource
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(DDBTBL_MCP_SERVER_REGISTRY)

    try:
        # Scan the table to get all items
        response = table.scan()
        items = response.get('Items', [])

        # Format the items to include only id, description, server
        formatted_items = []
        for item in items:
            formatted_item = {
                'id': item.get('id', ''),
                'description': item.get('description', ''),
                'server': item.get('server', ''),
            }
            formatted_items.append(formatted_item)

        # Return the formatted items as JSON
        return {
            'statusCode': 200,
            'headers': cors_headers,
            'body': json.dumps(formatted_items)
        }
    except Exception as e:
        # Handle any errors
        return {
            'statusCode': 500,
            'headers': cors_headers,
            'body': json.dumps({'error': str(e)})
        }

The preceding code is invoked by an AWS Lambda function. The complete code is available in the GitHub repository. The following graphic shows the response of the discovery API.
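On the consuming side, an agentic application can parse the discovery API response and look up the endpoint of the server it needs. The helper below is a hypothetical sketch (the function name and the sample entry are assumptions); it works directly against the `statusCode`/`body` response shape the Lambda above returns.

```python
import json

# Hypothetical client-side helper: given the discovery API response
# (a JSON list of id/description/server entries), return the endpoint
# of the server with the requested id.

def pick_server(api_response, server_id):
    servers = json.loads(api_response["body"])
    for entry in servers:
        if entry["id"] == server_id:
            return entry["server"]
    raise KeyError(f"No MCP server registered with id {server_id!r}")

# Sample response shaped like the Lambda's output, for illustration
sample = {
    "statusCode": 200,
    "body": json.dumps([
        {"id": "trade-execution",
         "description": "Executes equity trades",
         "server": "http://mcp-hub.internal:8000/trade"},
    ]),
}
```

For example, `pick_server(sample, "trade-execution")` returns the trade server's endpoint, which the application can then use to open its MCP connection.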

Let's explore a scenario where the user submits a question: "Buy 100 shares of AMZN at USD 186, to be distributed equally between accounts A31 and B12." To execute this task, the agentic application invokes the trade-execution MCP server. The following code is the sample implementation of the MCP server for trade execution:

from fastmcp import FastMCP
from starlette.requests import Request
from starlette.responses import PlainTextResponse

mcp = FastMCP("server")

@mcp.custom_route("/", methods=["GET"])
async def health_check(request: Request) -> PlainTextResponse:
    return PlainTextResponse("OK")

@mcp.tool()
async def executeTrade(ticker, quantity, price):
    """
    Execute a trade for the given ticker, quantity, and price.

    Sample input:
    {
        "ticker": "AMZN",
        "quantity": 1000,
        "price": 150.25
    }
    """
    # Simulate trade execution
    return {
        "tradeId": "T12345",
        "status": "Executed",
        "timestamp": "2025-04-09T22:58:00"
    }

@mcp.tool()
async def sendTradeDetails(tradeId):
    """
    Send trade details for the given tradeId.

    Sample input:
    {
        "tradeId": "T12345"
    }
    """
    return {
        "status": "Details Sent",
        "recipientSystem": "MiddleOffice",
        "timestamp": "2025-04-09T22:59:00"
    }

if __name__ == "__main__":
    mcp.run(host="0.0.0.0", transport="streamable-http")

The complete code is available in the following GitHub repo.

The following graphic shows the MCP server execution in action.

This is a sample implementation of the use case focusing on the deployment step. For a production scenario, we strongly recommend adding a human oversight workflow to monitor the execution and provide input at various steps of the trade execution.

Now you're ready to deploy this solution.

Prerequisites

Prerequisites for the solution are available in the README.md of the GitHub repository.

Deploy the application

Complete the following steps to run this solution:

  1. Navigate to the README.md file of the GitHub repository to find the instructions to deploy the solution. Follow these steps to complete the deployment.

A successful deployment will exit with a message similar to the one shown in the following screenshot.

  2. When the deployment is complete, access the Streamlit application.

You can find the Streamlit URL in the terminal output, similar to the following screenshot.

  3. Enter the URL of the Streamlit application in a browser to open the application console.

On the application console, different sets of MCP servers are listed in the left pane under MCP Server Registry. Each set corresponds to an MCP server and includes the definition of its tools, such as the name, description, and input parameters.

In the right pane, Agentic App, a request is pre-populated: "Buy 100 shares of AMZN at USD 186, to be distributed equally between accounts A31 and B12." This request is ready to be submitted to the agent for execution.

  4. Choose Submit to invoke an Amazon Bedrock agent to process the request.

The agentic application will evaluate the request together with the list of tools it has access to, and iterate through a series of tool executions and evaluations to fulfill the request. You can view the trace output to see the tools that the agent used. For each tool used, you can see the values of the input parameters, followed by the corresponding results. In this case, the agent operated as follows:

  • The agent first used the function executeTrade with input parameters of ticker=AMZN, quantity=100, and price=186
  • After the trade was executed, it used the allocateTrade tool to allocate the trade position between the two portfolio accounts
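The allocateTrade tool is not shown in the server snippet earlier in this post, but its core logic can be sketched as follows. This is a hypothetical illustration (the function name, return shape, and remainder-handling rule are assumptions, not the repository's implementation): split an executed trade's quantity equally across the given portfolio accounts.

```python
# Hypothetical sketch of the allocateTrade tool's core logic:
# distribute a trade's quantity equally across accounts, assigning
# any remainder one share at a time to the first accounts.

def allocate_trade(trade_id, quantity, accounts):
    """Split quantity across accounts as evenly as whole shares allow."""
    base, remainder = divmod(quantity, len(accounts))
    allocations = {
        account: base + (1 if i < remainder else 0)
        for i, account in enumerate(accounts)
    }
    return {"tradeId": trade_id, "allocations": allocations, "status": "Allocated"}
```

For the pre-populated request, `allocate_trade("T12345", 100, ["A31", "B12"])` assigns 50 shares to each account; the remainder rule only matters when the quantity does not divide evenly.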

Clean up

You will incur charges when you consume the services used in this solution. Instructions to clean up the resources are available in the README.md of the GitHub repository.

Summary

This solution presents a straightforward and enterprise-ready approach to implementing MCP servers on AWS. With this centralized operating model, teams can focus on building their applications rather than maintaining the MCP servers. As enterprises continue to embrace agentic workflows, centralized MCP servers offer a practical solution for overcoming operational silos and inefficiencies. With AWS scalable infrastructure and advanced tools like Amazon Bedrock Agents and Amazon ECS, enterprises can accelerate their journey toward smarter workflows and better customer outcomes.

Check out the GitHub repository to replicate the solution in your own AWS environment.

To learn more about how to run MCP servers on AWS, refer to the following resources:


About the authors

Xan Huang is a Senior Solutions Architect with AWS, based in Singapore. He works with major financial institutions to design and build secure, scalable, and highly available solutions in the cloud. Outside of work, Xan dedicates most of his free time to his family, where he lovingly takes direction from his two young daughters, aged one and four. You can find Xan on LinkedIn: https://www.linkedin.com/in/xanhuang/

Vikesh Pandey is a Principal GenAI/ML Specialist Solutions Architect at AWS, helping large financial institutions adopt and scale generative AI and ML workloads. He is the author of the book "Generative AI for financial services." He carries more than a decade of experience building enterprise-grade applications on generative AI/ML and related technologies. In his spare time, he plays an unnamed sport with his son that lies somewhere between soccer and rugby.
