How Planview constructed a scalable AI Assistant for portfolio and mission administration utilizing Amazon Bedrock


This put up is co-written with Lee Rehwinkel from Planview.

Companies at this time face quite a few challenges in managing intricate initiatives and packages, deriving worthwhile insights from large knowledge volumes, and making well timed selections. These hurdles regularly result in productiveness bottlenecks for program managers and executives, hindering their skill to drive organizational success effectively.

Planview, a number one supplier of related work administration options, launched into an bold plan in 2023 to revolutionize how 3 million international customers work together with their mission administration functions. To appreciate this imaginative and prescient, Planview developed an AI assistant referred to as Planview Copilot, utilizing a multi-agent system powered by Amazon Bedrock.

Growing this multi-agent system posed a number of challenges:

  • Reliably routing duties to acceptable AI brokers
  • Accessing knowledge from numerous sources and codecs
  • Interacting with a number of utility APIs
  • Enabling the self-serve creation of latest AI expertise by completely different product groups

To beat these challenges, Planview developed a multi-agent structure constructed utilizing Amazon Bedrock. Amazon Bedrock is a totally managed service that gives API entry to foundation models (FMs) from Amazon and different main AI startups. This enables builders to decide on the FM that’s finest fitted to their use case. This method is each architecturally and organizationally scalable, enabling Planview to quickly develop and deploy new AI expertise to fulfill the evolving wants of their clients.

This put up focuses totally on the primary problem: routing duties and managing a number of brokers in a generative AI structure. We discover Planview’s method to this problem throughout the improvement of Planview Copilot, sharing insights into the design selections that present environment friendly and dependable process routing.

We describe personalized home-grown brokers on this put up as a result of this mission was applied earlier than Amazon Bedrock Agents was typically accessible. Nevertheless, Amazon Bedrock Agents is now the really useful resolution for organizations trying to make use of AI-powered brokers of their operations. Amazon Bedrock Brokers can retain reminiscence throughout interactions, providing extra personalised and seamless consumer experiences. You’ll be able to profit from improved suggestions and recall of prior context the place required, having fun with a extra cohesive and environment friendly interplay with the agent. We share our learnings in our resolution that can assist you understanding use AWS expertise to construct options to fulfill your objectives.

Answer overview

Planview’s multi-agent structure consists of a number of generative AI elements collaborating as a single system. At its core, an orchestrator is chargeable for routing questions to varied brokers, accumulating the discovered data, and offering customers with a synthesized response. The orchestrator is managed by a central improvement group, and the brokers are managed by every utility group.

The orchestrator includes two most important elements referred to as the router and responder, that are powered by a large language model (LLM). The router makes use of AI to intelligently route consumer questions to varied utility brokers with specialised capabilities. The brokers may be categorized into three most important varieties:

  • Assist agent – Makes use of Retrieval Augmented Generation (RAG) to supply utility assist
  • Information agent – Dynamically accesses and analyzes buyer knowledge
  • Motion agent – Runs actions inside the utility on the consumer’s behalf

After the brokers have processed the questions and offered their responses, the responder, additionally powered by an LLM, synthesizes the discovered data and formulates a coherent response to the consumer. This structure permits for a seamless collaboration between the centralized orchestrator and the specialised brokers, which supplies customers an correct and complete solutions to their questions. The next diagram illustrates the end-to-end workflow.

End-to-end workflow showing responder and router components

Technical overview

Planview used key AWS companies to construct its multi-agent structure. The central Copilot service, powered by Amazon Elastic Kubernetes Service (Amazon EKS), is chargeable for coordinating actions among the many numerous companies. Its obligations embody:

  • Managing consumer session chat historical past utilizing Amazon Relational Database Service (Amazon RDS)
  • Coordinating site visitors between the router, utility brokers, and responder
  • Dealing with logging, monitoring, and accumulating user-submitted suggestions

The router and responder are AWS Lambda capabilities that work together with Amazon Bedrock. The router considers the consumer’s query and chat historical past from the central Copilot service, and the responder considers the consumer’s query, chat historical past, and responses from every agent.

Software groups handle their brokers utilizing Lambda capabilities that work together with Amazon Bedrock. For improved visibility, analysis, and monitoring, Planview has adopted a centralized immediate repository service to retailer LLM prompts.

Brokers can work together with functions utilizing numerous strategies relying on the use case and knowledge availability:

  • Current utility APIs – Brokers can talk with functions by their current API endpoints
  • Amazon Athena or conventional SQL knowledge shops – Brokers can retrieve knowledge from Amazon Athena or different SQL-based knowledge shops to supply related data
  • Amazon Neptune for graph knowledge – Brokers can entry graph knowledge saved in Amazon Neptune to help advanced dependency evaluation
  • Amazon OpenSearch Service for doc RAG – Brokers can use Amazon OpenSearch Service to carry out RAG on paperwork

The next diagram illustrates the generative AI assistant structure on AWS.

AWS services and data flow in Generative AI chatbot

Router and responder pattern prompts

The router and responder elements work collectively to course of consumer queries and generate acceptable responses. The next prompts present illustrative router and responder immediate templates. Extra immediate engineering could be required to enhance reliability for a manufacturing implementation.

First, the accessible instruments are described, together with their function and pattern questions that may be requested of every device. The instance questions assist information the pure language interactions between the orchestrator and the accessible brokers, as represented by instruments.

instruments=""'
<device>
<toolName>applicationHelp</toolName>
<toolDescription>
Use this device to reply utility assist associated questions.
Instance questions:
How do I reset my password?
How do I add a brand new consumer?
How do I create a process?
</toolDescription>
</device>
<device>
<toolName>dataQuery</toolName>
<toolDescription>
Use this device to reply questions utilizing utility knowledge.
Instance questions:
Which duties are assigned to me?
What number of duties are due subsequent week?
Which process is most in danger?
</toolDescription>
</device>

Subsequent, the router immediate outlines the rules for the agent to both reply on to consumer queries or request data by particular instruments earlier than formulating a response:

system_prompt_router = f'''
<position>
Your job is to resolve in case you want extra data to completely reply the Person's 
questions.
You obtain your purpose by selecting both 'reply' or 'callTool'.
You've gotten entry to your chat historical past in <chatHistory></chatHistory> tags.
You even have a listing of accessible instruments to help you in <instruments></instruments> tags.
</position>
<chatHistory>
{chatHistory}
</chatHistory>
<instruments>
{instruments}
</instruments>
<guidelines>
- If the chat historical past accommodates adequate data to reply the Person's questions, 
select the 'reply' motion.
- To assemble extra data earlier than responding, select the 'callTool' motion.
- You a lot solely select from the instruments within the <instruments></instruments> tags.
- If no device can help with the query, select the 'reply' motion.
- Place your chosen motion inside <motion></motion> tags.
- Whenever you selected the 'callTool' motion, present the <toolName> and the <toolQuestion> you
want to ask.
- Your <toolQuestion> must be verbose and keep away from utilizing pronouns.
- Begin by offering your step-by-step pondering in <pondering></pondering> tags.
- Then you'll give your reply in <reply></reply> tags.
- Your reply ought to observe the format of one in all these three examples:
When selecting the 'reply' motion, your reply ought to observe the beneath instance EXACTLY:
<instance>
<reply>
<motion>reply</motion>
</reply>
</instance>
When selecting the 'callTool' motion for a single Device:
<instance>
<reply>
<motion>callTool</motion>
<instruments>
<device>
<toolName>applicationHelp</toolName>
<toolQuestion>How do I reset my password?</toolQuestion>
</device>
</instruments>
</reply>
</instance>
Executing the above, would produce the next end result:
You too can name a number of Instruments utilizing this format:
<instance>
<reply>
<motion>callTool</motion>
<instruments>
<device>
<toolName>dataQuery</toolName>
<toolQuestion>What number of duties are assigned to me?</toolQuestion>
</device>
<device>
<toolName>applicationHelp</toolName>
<toolQuestion>How do I add a brand new process?</toolQuestion>
</device>
</instruments>
</reply>
</instance>
</guidelines>
'''

The next is a pattern response from the router part that initiates the dataQuery device to retrieve and analyze process assignments for every consumer:

<pondering>
To find out who has essentially the most duties assigned, I might want to question the applying knowledge. The "dataQuery" device appears most acceptable for this query.
</pondering>

<reply>
<motion>callTool</motion>
<instruments>
    <device>
        <toolName>dataQuery</toolName>
        <toolQuestion>Which consumer has essentially the most duties at the moment assigned to them?   </toolQuestion>
    </device>
</instruments>
</reply>

The next is a pattern response from the responder part that makes use of the dataQuery device to fetch details about the consumer’s assigned duties. It studies that the consumer has 5 duties assigned to them.

<pondering>
Primarily based on the chat historical past, I beforehand referred to as the dataQuery device to ask "What number of duties are at the moment assigned to the consumer?". The device responded that the consumer has 5 duties assigned to them.
</pondering>

<reply>
In line with the information I queried beforehand, you will have 5 duties assigned to you.
</reply>

Mannequin analysis and choice

Evaluating and monitoring generative AI mannequin efficiency is essential in any AI system. Planview’s multi-agent structure allows evaluation at numerous part ranges, offering complete high quality management regardless of the system’s complexity. Planview evaluates elements at three ranges:

  • Prompts – Assessing LLM prompts for effectiveness and accuracy
  • AI brokers – Evaluating full immediate chains to keep up optimum process dealing with and response relevance
  • AI system – Testing user-facing interactions to confirm seamless integration of all elements

The next determine illustrates the analysis framework for prompts and scoring.

Evaluation framework for prompts scoring

To conduct these evaluations, Planview makes use of a set of rigorously crafted check questions that cowl typical consumer queries and edge circumstances. These evaluations are carried out throughout the improvement section and proceed in manufacturing to trace the standard of responses over time. At the moment, human evaluators play an important position in scoring responses. To assist within the analysis, Planview has developed an inner analysis device to retailer the library of questions and observe the responses over time.

To evaluate every part and decide essentially the most appropriate Amazon Bedrock mannequin for a given process, Planview established the next prioritized analysis standards:

  • High quality of response – Assuring accuracy, relevance, and helpfulness of system responses
  • Time of response – Minimizing latency between consumer queries and system responses
  • Scale – Ensuring the system can scale to hundreds of concurrent customers
  • Value of response – Optimizing operational prices, together with AWS companies and generative AI fashions, to keep up financial viability

Primarily based on these standards and the present use case, Planview chosen Anthropic’s Claude 3 Sonnet on Amazon Bedrock for the router and responder elements.

Outcomes and influence

Over the previous yr, Planview Copilot’s efficiency has considerably improved by the implementation of a multi-agent structure, improvement of a strong analysis framework, and adoption of the most recent FMs accessible by Amazon Bedrock. Planview noticed the next outcomes between the primary era of Planview Copilot developed mid-2023 and the most recent model:

  • Accuracy – Human-evaluated accuracy has improved from 50% reply acceptance to now exceeding 95%
  • Response time – Common response occasions have been decreased from over 1 minute to twenty seconds
  • Load testing – The AI assistant has efficiently handed load checks, the place 1,000 questions have been submitted simultaneous with no noticeable influence on response time or high quality
  • Value-efficiency – The fee per buyer interplay has been slashed to at least one tenth of the preliminary expense
  • Time-to-market – New agent improvement and deployment time has been decreased from months to weeks

Conclusion

On this put up, we explored how Planview was in a position to develop a generative AI assistant to deal with advanced work administration course of by adopting the next methods:

  • Modular improvement – Planview constructed a multi-agent structure with a centralized orchestrator. The answer allows environment friendly process dealing with and system scalability, whereas permitting completely different product groups to quickly develop and deploy new AI expertise by specialised brokers.
  • Analysis framework – Planview applied a strong analysis course of at a number of ranges, which was essential for sustaining and bettering efficiency.
  • Amazon Bedrock integration – Planview used Amazon Bedrock to innovate quicker with broad mannequin selection and entry to varied FMs, permitting for versatile mannequin choice primarily based on particular process necessities.

Planview is migrating to Amazon Bedrock Brokers, which allows the combination of clever autonomous brokers inside their utility ecosystem. Amazon Bedrock Brokers automate processes by orchestrating interactions between basis fashions, knowledge sources, functions, and consumer conversations.

As subsequent steps, you may discover Planview’s AI assistant feature constructed on Amazon Bedrock and keep up to date with new Amazon Bedrock features and releases to advance your AI journey on AWS.


About Authors

Sunil Ramachandra is a Senior Options Architect enabling hyper-growth Impartial Software program Distributors (ISVs) to innovate and speed up on AWS. He companions with clients to construct extremely scalable and resilient cloud architectures. When not collaborating with clients, Sunil enjoys spending time with household, working, meditating, and watching films on Prime Video.

Benedict Augustine is a thought chief in Generative AI and Machine Studying, serving as a Senior Specialist at AWS. He advises buyer CxOs on AI technique, to construct long-term visions whereas delivering speedy ROI.As VP of Machine Studying, Benedict spent the final decade constructing seven AI-first SaaS merchandise, now utilized by Fortune 100 firms, driving vital enterprise influence. His work has earned him 5 patents.

Lee Rehwinkel is a Principal Information Scientist at Planview with 20 years of expertise in incorporating AI & ML into Enterprise software program. He holds superior levels from each Carnegie Mellon College and Columbia College. Lee spearheads Planview’s R&D efforts on AI capabilities inside Planview Copilot. Exterior of labor, he enjoys rowing on Austin’s Woman Chicken Lake.

Leave a Reply

Your email address will not be published. Required fields are marked *