Personalizing UX for Agentic AI
Fine-tuning AI Agents Based on User Personas for Enterprise Use-cases
1. Introduction
The discussion around ChatGPT (and generative AI in general) has now evolved into agentic AI. While ChatGPT is primarily a chatbot that generates text responses, AI agents can execute complex tasks autonomously, e.g., make a sale, plan a trip, book a flight, hire a contractor for a house job, or order a pizza. Fig. 1 below illustrates the evolution of agentic AI systems.
Bill Gates recently envisioned a future where we would have an AI agent that is able to process and respond to natural language and accomplish a number of different tasks. Gates used planning a trip as an example.
Ordinarily, this would involve booking your hotel, flights, restaurants, etc. on your own. But an AI agent would be able to use its knowledge of your preferences to book and purchase those things on your behalf.
We focus on this agent personalization aspect (based on user preferences) in this article.
An AI agent today is able to decompose a given task, monitor long-running sub-tasks, and autonomously adapt its execution strategy to achieve its goal. This has led to the rise of AI agents optimized to perform specific tasks, potentially provided by different vendors, and published / catalogued in an agent marketplace.
Analogous to the fine-tuning of large language models (LLMs) into domain-specific LLMs / small language models (SLMs), we argue that customization / fine-tuning of these (generic) AI agents with respect to enterprise-specific context (applicable user personas and use-cases) will be needed to drive their enterprise adoption.
The key benefits of AI agent personalization include:
- Personalized interaction: The AI agent adapts its language, tone, and complexity based on user preferences and interaction history. This ensures that the conversation is more aligned with the user’s expectations and communication style.
- Use-case context: The AI agent is aware of the underlying enterprise use-case processes, so that it can prioritize or highlight process features, relevant pieces of content, etc. — optimizing the interaction to achieve the use-case goal more efficiently.
- Proactive assistance: The AI agent anticipates the needs of different users and offers proactive suggestions, resources, or reminders tailored to their specific profiles or tasks.
To summarize, while the current focus in realizing AI agents remains on the functional aspects (and rightfully so), we highlight in this article that UI/UX for AI agents is equally important, as the last mile to drive enterprise adoption.
Towards this end, we outline a reference architecture for agentic AI platforms in Section 2, extending the same to provide the technical details of implementing an agent personalization layer in Section 3. Finally, we discuss agentic AI design principles and best practices in Section 4, enabling change management and driving successful rollout of agentic AI use-cases in enterprises.
2. Agentic AI Platform Reference Architecture
In this section, we highlight the key components of a reference AI agent platform — illustrated in Fig. 2:
- Agent marketplace
- Orchestration layer
- Integration layer
- Shared memory layer
- Governance layer, including explainability, privacy, security, etc.
with the Personalization layer added in Section 3.
Given a user task, we prompt an LLM for the task decomposition; this is the overlap with Gen AI. Unfortunately, this also means that agentic AI systems today are limited by the reasoning capabilities of large language models (LLMs). For example, the GPT-4 task decomposition of the prompt
Generate a tailored email campaign to achieve sales of USD 1 Million in 1 month. The applicable products and their performance metrics are available at [url]. Connect to CRM system [integration] for customer names, email addresses, and demographic details.
is detailed in Fig. 3: (Analyze products) — (Identify target audience) — (Create tailored email campaign).
The LLM then monitors the execution and the environment, adapting autonomously as needed. In this case, the agent realized that it was not going to achieve its sales goal and autonomously added the tasks: (Find alternative products) — (Utilize customer data to personalize the emails) — (Perform A/B testing).
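As a concrete illustration, below is a minimal sketch of how an orchestrator might prompt an LLM for such a task decomposition. The model name, the JSON-only output convention, and the decompose() helper are assumptions for this example, not a prescribed implementation.

```python
import json
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

DECOMPOSE_PROMPT = """Decompose the following task into an ordered list of
sub-tasks. Respond with a JSON array of strings only.

Task: {task}"""

def decompose(task: str) -> list[str]:
    """Ask the LLM for an ordered sub-task plan for the given task."""
    resp = client.chat.completions.create(
        model="gpt-4o",  # illustrative; any capable chat model works
        messages=[{"role": "user", "content": DECOMPOSE_PROMPT.format(task=task)}],
    )
    return json.loads(resp.choices[0].message.content)

plan = decompose("Generate a tailored email campaign to achieve sales of "
                 "USD 1 million in 1 month.")
# e.g. ["Analyze products", "Identify target audience",
#       "Create tailored email campaign"]
```

In a full orchestrator, each sub-task would then be dispatched to a specialized agent, with the LLM re-planning whenever execution monitoring signals that the goal is at risk.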
Given the need to orchestrate multiple agents, an integration layer must support different agent interaction patterns, e.g., agent-to-agent API calls, an agent API providing output for human consumption, a human triggering an AI agent, and agent-to-agent interaction with a human in the loop. These integration patterns need to be supported by the underlying AgentOps platform. Andrew Ng recently talked about this aspect from a performance perspective:
Today, a lot of LLM output is for human consumption. But in an agentic workflow, an LLM might be prompted repeatedly to reflect on and improve its output, use tools, plan and execute multiple steps, or implement multiple agents that collaborate. So, we might generate hundreds of thousands of tokens or more before showing any output to a user. This makes fast token generation very desirable and makes slower generation a bottleneck to taking better advantage of existing models.
It is also important to mention that integration with enterprise systems (e.g., the CRM in this case) will be needed for most use-cases. For instance, refer to the Model Context Protocol (MCP), recently proposed by Anthropic, for connecting AI agents to external systems where enterprise data resides.
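To make the pattern concrete, here is a hedged sketch of an integration-layer adapter that exposes a CRM lookup as a tool an agent can call. The endpoint, parameters, and authentication scheme are hypothetical; a production setup would use MCP or the vendor's SDK instead of raw REST calls.

```python
from dataclasses import dataclass
import requests  # pip install requests

@dataclass
class CrmConnector:
    """Illustrative integration-layer adapter exposing a CRM to agents as a tool."""
    base_url: str
    api_key: str

    def get_customers(self, segment: str) -> list[dict]:
        """Fetch customer records for a segment (hypothetical REST endpoint)."""
        resp = requests.get(
            f"{self.base_url}/customers",
            params={"segment": segment},
            headers={"Authorization": f"Bearer {self.api_key}"},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()

# The orchestrator registers the connector as a callable tool, e.g.:
# tools = {"crm.get_customers": CrmConnector(base_url, api_key).get_customers}
```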
Given the long-running nature of such complex tasks, memory management is key for agentic AI systems. Once the initial email campaign is launched, the agent needs to monitor the campaign for one month.
This entails both context sharing between tasks and maintaining execution context over long periods.
The standard approach here is to save embedding representations of agent information in a vector database that supports maximum inner product search (MIPS). For fast retrieval, approximate nearest neighbor (ANN) algorithms are used, returning the approximate top-k nearest neighbors and trading a small loss in accuracy for a large gain in speed.
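A minimal sketch of such a memory layer using FAISS follows; an exact inner-product index implements MIPS directly, while the ANN variants noted in the comment trade a little recall for large speed gains on big stores. The embedding dimension and helper names are illustrative.

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 768  # embedding dimension (model-dependent)

# Exact MIPS over agent memories; for large stores, swap in an ANN index
# such as faiss.IndexHNSWFlat or faiss.IndexIVFFlat for approximate top-k.
index = faiss.IndexFlatIP(d)
memories: list[str] = []  # memory texts, parallel to the index rows

def remember(text: str, embedding: np.ndarray) -> None:
    """Store one agent memory (text plus its embedding)."""
    index.add(embedding.reshape(1, d).astype("float32"))
    memories.append(text)

def recall(query_emb: np.ndarray, k: int = 5) -> list[str]:
    """Return the top-k memories by inner product with the query."""
    _, ids = index.search(query_emb.reshape(1, d).astype("float32"), k)
    return [memories[i] for i in ids[0] if i != -1]
```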
Finally, there is the governance layer. We need to ensure that data shared by the user for a specific task, or user profile data that cuts across tasks, is only shared with the relevant agents (privacy, authentication, and access control). Refer to my previous article on Responsible AI Agents for a discussion of the key dimensions needed to enable a well governed AI agent platform in terms of hallucination guardrails, data quality, privacy, reproducibility, explainability, etc.
3. User Persona based Agentic AI Personalization
Users today expect a seamless and personalized experience with customized execution to meet their specific requirements. However, enterprise user and process specific AI agent personalization remains difficult due to scale, performance, and privacy challenges.
User persona based agent personalization aims to overcome these challenges by segmenting the end-users of a service into a manageable set of user categories that represent the demographics and preferences of the majority of users. For example, the typical personas in an AI agent enabled IT service desk scenario (one of the areas with the highest Gen AI adoption) include:
- Leadership: Senior individuals (e.g., VPs, Directors) who require priority support with secure access to sensitive data, and assistance with high-level presentations and video conferencing.
- Knowledge workers: employees who rely heavily on technology to perform their daily tasks (e.g., analysts, engineers, designers).
- Field workers: employees who work primarily outside the office (e.g., sales representatives, service technicians). As such, their requirements are mostly focused on remote access to corporate systems, reliable VPNs, and support with offline work capabilities.
- Administrative / HR: support staff responsible for various administrative tasks (e.g., HR, Finance) with primary requirements around assistance with MS Office software, access to specific business applications, and quick resolution of routine IT issues.
- New employees / interns: individuals who are new to the organization and may not be fully familiar with the company's IT systems. As such, their requests mostly focus on onboarding topics.
In this article, we focus on LLM agents, which loosely translate to invoking (prompting) an LLM to perform natural language processing (NLP) tasks, e.g., processing documents, summarizing them, generating responses based on the retrieved data. For example, refer to the “researcher” agent scenario outlined by LangGraph.
Given this, the solution architecture to perform user persona based fine-tuning of AI agents is illustrated in Fig. 4.
The fine-tuning process consists of first parameterizing (aggregated) user data and conversation history and storing it as memory in the LLM via adapters, followed by fine-tuning the LLM for personalized response generation. The agent-user persona router performs user segmentation (scoring) and routes tasks / prompts to the most relevant agent persona.
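A minimal sketch of this routing step follows, assuming persona centroids derived from (aggregated) historical interaction embeddings. The embed() stub stands in for any real sentence-embedding model, the persona summaries simply echo the IT service desk example above, and all names are illustrative.

```python
import hashlib
import numpy as np

# Persona summaries from the IT service desk example; in practice, centroids
# would be mean embeddings of historical interactions per persona.
PERSONA_DESCRIPTIONS = {
    "leadership": "priority support, sensitive data access, presentations",
    "knowledge_worker": "daily tooling for analysts, engineers, designers",
    "field_worker": "remote access, VPN reliability, offline work",
    "admin_hr": "MS Office, business applications, routine IT issues",
    "new_hire": "onboarding questions, unfamiliar with IT systems",
}

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; swap in a real model (e.g. a SentenceTransformer)."""
    seed = int.from_bytes(hashlib.md5(text.encode()).digest()[:4], "little")
    v = np.random.default_rng(seed).standard_normal(384)
    return v / np.linalg.norm(v)

CENTROIDS = {p: embed(desc) for p, desc in PERSONA_DESCRIPTIONS.items()}

def route(user_profile: str, prompt: str) -> str:
    """Score the user/prompt against each persona; return the best match."""
    v = embed(user_profile + "\n" + prompt)
    return max(CENTROIDS, key=lambda p: float(CENTROIDS[p] @ v))

# The orchestrator then dispatches to the persona-specific adapter, e.g.:
# response = adapters[route(profile, prompt)].generate(prompt)
```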
For example, refer to the papers below for details of persona based LLM fine-tuning in educational and medical contexts, respectively.
- EduChat: considers pre-training models on an educational corpus to establish a foundational knowledge base, and subsequently fine-tuning them on personalized tasks, e.g., essay assessment.
- LLM based Medical Assistant Personalization: combines parameter-efficient fine-tuning (PEFT) with a memory retrieval module to generate personalized medical responses; a minimal PEFT sketch follows below.
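As referenced above, here is a minimal PEFT sketch using the Hugging Face peft library to attach one LoRA adapter per persona on top of a shared, frozen base model. The base model name, target modules, and hyperparameters are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model  # pip install peft transformers

BASE = "meta-llama/Llama-3.1-8B"  # illustrative base model
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

# One lightweight LoRA adapter per user persona; the frozen base is shared.
lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections; model-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of base parameters

# Train on persona-level (aggregated, consented) interaction data with the
# usual supervised fine-tuning loop, then save just the adapter weights:
# model.save_pretrained("adapters/knowledge_worker")
```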
3.1 User Data Embeddings
In this section, we focus on generating the agent-user interaction embeddings, a prerequisite for both fine-tuning and real-time retrieval-augmented generation (RAG) prompt context augmentation.
Fine-tuning AI agents on raw user data is often impractical, even at the (aggregated) persona level, primarily for the following reasons:
- Agent interaction data usually spans multiple journeys with sparse data points, various interaction types (multimodal), and potential noise or inconsistencies from incomplete query-response pairs.
- Moreover, effective personalization often requires a deep understanding of the latent intent / sentiment behind user actions, which can pose difficulties for generic (pre-trained) LLMs and LLM agents.
- Finally, fine-tuning is computationally intensive. Agent-user interaction data can be lengthy, and processing and modeling such long sequences (e.g., multiple years' worth of interaction history) with LLMs can be practically infeasible.
A good solution reference to overcome the above issues is Google's work on USER-LLM. According to the authors,
USER-LLM distills compressed representations from diverse and noisy user interactions, effectively capturing the essence of a user’s behavioral patterns and preferences across various interaction modalities.
This approach empowers LLMs with a deeper understanding of users’ latent intent (inc. sentiment) and historical patterns (e.g., temporal evolution of user queries — responses) enabling LLMs to tailor responses and generate personalized outcomes.
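In the spirit of USER-LLM, though far simpler than its trained encoder, the sketch below compresses a long, noisy interaction history into a single recency-weighted user vector that can condition retrieval or a soft prompt. The event encoder is a stand-in and the record fields are assumptions.

```python
import hashlib
import numpy as np

def encode_event(event: dict) -> np.ndarray:
    """Placeholder per-event encoder over {text, modality, age_days} records;
    USER-LLM itself learns this encoder end-to-end."""
    seed = int.from_bytes(hashlib.md5(event["text"].encode()).digest()[:4], "little")
    return np.random.default_rng(seed).standard_normal(256)

def user_embedding(history: list[dict], half_life_days: float = 30.0) -> np.ndarray:
    """Compress an interaction history into one compact vector;
    more recent events receive exponentially higher weight."""
    weights = [0.5 ** (e["age_days"] / half_life_days) for e in history]
    vecs = [encode_event(e) for e in history]
    v = np.average(vecs, axis=0, weights=weights)
    return v / np.linalg.norm(v)

history = [
    {"text": "VPN drops on hotel wifi", "modality": "chat", "age_days": 2},
    {"text": "Reset my hardware token", "modality": "ticket", "age_days": 40},
]
u = user_embedding(history)  # feeds RAG context or a soft-prompt projection
```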
3.2 Reinforcement Learning based Personalization
In this section, we show how LLM generated responses can be personalized based on a reinforcement learning (RL) enabled recommendation engine (RE).
RL is a powerful technique that can achieve complex goals by maximizing a reward function in real time. The reward function works like incentives for a child: the algorithm is penalized when it takes a wrong decision and rewarded when it takes a right one; this feedback is the reinforcement.
High-level, the RL based LLM response / action RE works as follows:
- The (current) user sentiment and the agent interaction history are combined to quantify the user sentiment curve and discount any sudden changes in user sentiment, leading to an aggregate reward value for the last LLM response provided to the user.
- This reward value is then provided as feedback to the RL agent, which chooses the next optimal LLM generated response / action to provide to the user.
More concretely, we can formulate the integration of an RL enabled RE with an LLM based chat app as follows — illustrated in Fig. 5:
Action (a): An action a in this case corresponds to an LLM generated response delivered to the user in response to a user task / prompt, as part of an ongoing agent interaction.
Agent (A): the entity performing actions. In this case, the agent is the chat app delivering LLM responses to the users, where an action is selected based on its policy (described below).
Environment: refers to the world with which the agent interacts, and which responds to the agent's actions. In our case, the environment corresponds to the user U interacting with the chat app. U responds to A's actions by providing different types of feedback, both explicit (in the form of a chat response) and implicit (e.g., a change in user sentiment).
Policy (𝜋): the strategy that the agent employs to select the next best action (NBA). Given a user profile Uₚ, (current) sentiment Uₛ, and query / task Uᵩ, the policy function computes the product of the response scores returned by the NLP and recommendation engines, respectively, selecting the response with the highest score as the NBA:
- The NLP engine (NE) parses the task / prompt and outputs a ranked list of responses.
- The recommendation engine (RE) scores each response based on the reward function, taking into account the use-case context, user profile, preferences, sentiment, and conversation history. The policy function can be formalized as follows:
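𝜋(Uₚ, Uₛ, Uᵩ) = argmaxₐ NE(a | Uᵩ) · RE(a | Uₚ, Uₛ, H)

where a ranges over the candidate responses, H denotes the conversation history, and NE(·) and RE(·) are the response scores returned by the NLP and recommendation engines, respectively (notation as introduced above).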
Reward (r): refers to the feedback by which we measure the success or failure of an agent's recommended action (response). Feedback can, for example, be the amount of time that a user spends reading a recommended article, or the change in user sentiment upon receiving a response. We consider a 2-step reward function computation where the feedback fₐ received with respect to a recommended action is first mapped to a sentiment score, which is then mapped to a reward:
r(a, fₐ) = s(fₐ)
where r and s refer to the reward and sentiment functions, respectively.
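Putting the pieces together, here is a minimal sketch of the policy and the two-step reward. The sentiment stub and the NE / RE interfaces are placeholders for real components, not a prescribed design.

```python
def sentiment(feedback: str) -> float:
    """Placeholder for s(f): map user feedback to a score in [-1, 1];
    any off-the-shelf sentiment classifier can stand in here."""
    return 1.0 if "thanks" in feedback.lower() else 0.0  # toy stand-in

def reward(action: str, feedback: str) -> float:
    """Two-step reward from above: r(a, f_a) = s(f_a)."""
    return sentiment(feedback)

def next_best_action(candidates: list[tuple[str, float]], re_score) -> str:
    """Policy pi: argmax over candidates of NE score x RE score.
    `candidates` holds (response, ne_score) pairs from the NLP engine;
    `re_score` maps a response to the recommendation engine's score."""
    return max(candidates, key=lambda c: c[1] * re_score(c[0]))[0]

# One interaction step (interfaces hypothetical):
#   nba = next_best_action(ne.rank(prompt), lambda a: re.score(a, user_ctx))
#   r = reward(nba, observed_feedback)
#   re.update(nba, r)  # feed the reward back to the RL agent
```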
4. Change Management for Agentic AI to drive Enterprise Adoption
In this section, we discuss design principles to enable the successful rollout of agentic AI use-cases in the enterprise. Given their complexity, there is a need for change management: actively educating users regarding the AI agent's capabilities (and limitations), and setting realistic user expectations.
Rather than trying to invent a new framework here, we take inspiration from the "enterprise-friendly" Microsoft, "developer-friendly" Google, and "user-friendly" Apple to seamlessly drive agentic AI adoption for enterprise use-cases.
Let us have a look at the AI design frameworks recommended by these 3 leaders:
- Guidelines for Human-AI Interaction by Microsoft
- People + AI Guidebook by Google
- Machine Learning: Human Interface Guidelines by Apple
The table below consolidates the principles and best practices from the three frameworks across the different phases of an AgentOps pipeline.
5. Conclusion
In this article, we considered personalization of AI agent interactions based on user personas for enterprise use-cases. Agentic AI personalization has the potential to significantly accelerate agentic AI adoption by improving user satisfaction rates.
We proposed a reference architecture for an agentic AI platform, and provided the details to implement a personalization layer for the platform based on (a) an agent-user persona router that performs user segmentation and maps tasks / prompts to the most relevant agent persona, and (b) agent-user interaction embeddings.
Finally, we discussed design principles to enable the successful rollout of agentic AI use-cases in the enterprise. This covers change management and best practices / guidelines to actively engage with end-users during all stages of the AgentOps lifecycle to drive adoption.