Using agents to breathe life into NPCs using CrewAI — Initial Conversation Analysis

Conversation analysis using output from: Using Agents to Breathe Life into NPCs using CrewAI

Analysis

  • Simulation 1: Population of a Software Engineer, Computer Scientist, Computer Engineer
  • Conclusions
  • Supporting

Methods of Analysis

  • Features extracted
  • Splitting global_conversations.txt
  • Sentiment, topics, lexical diversity, emotion
  • Self-similarity
  • Notebook

Background

Previously, I discussed why I'm interested in simulating 2D societies in my articles Using Agents to Breathe Life into NPCs and Using Agents to Breathe Life into NPCs using CrewAI. Let's analyze a brief conversation produced by the conversation-party simulator. Note that this is before any NPC-world interactions have been developed.

Population of a Software Engineer, Computer Scientist, Computer Engineer

We add NPCs whose bios place them in neighboring fields of study. Their initial tasks should allow them to easily converse with one another.

NPCs

Non-playable characters in the simulation

Carly Cummings — Computer Scientist, Initial Task: Apply bitwise operations

Katherine Jones — Computer Engineer, Initial Task: Create a half-adder

Ashley Brown — Electrical Engineer, Initial Task: Apply boolean logic
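To make the setup concrete, here is a minimal sketch of how these NPCs might be declared as CrewAI agents. The role and goal text mirrors the bios above; the backstory wording is invented for illustration and is not the simulator's exact configuration:

from crewai import Agent

## Hypothetical agent definitions mirroring the bios above;
## the field names follow CrewAI's Agent API, the wording is illustrative.
carly = Agent(
    role="Computer Scientist",
    goal="Apply bitwise operations",
    backstory="Carly Cummings, a computer scientist with a love of low-level tricks.",
)
katherine = Agent(
    role="Computer Engineer",
    goal="Create a half-adder",
    backstory="Katherine Jones, a computer engineer focused on digital logic.",
)
ashley = Agent(
    role="Electrical Engineer",
    goal="Apply boolean logic",
    backstory="Ashley Brown, an electrical engineer working with logic circuits.",
)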

Conclusions

  • The three NPCs stayed on topic; more precisely, they strayed very little from their starting topics.
  • Word use was impressively diverse within any given line of conversation; however, responses echoed earlier discussion, which is evident when comparing similarities between texts.
  • Topic analysis revealed two conversations. Note that this may be because two of the NPCs never interacted. The observed conversation links are Carly <-> Katherine (both directions) and Carly <- Ashley (one direction only); the sketch below shows one way to extract these links.
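Extracting those links is straightforward. Below is a minimal sketch, assuming the "<speaker> (talking to <listener>): ..." line format that the splitter in Methods of Analysis also relies on:

import re

def conversation_links(raw_conversation):
    # Extract (speaker, listener) pairs from lines like
    # "Carly (talking to Katherine): ..."
    pairs = re.findall(r'^([\w ]+?)\s*\(talking to\s+([\w ]+?)\)\s*:',
                       raw_conversation, flags=re.MULTILINE)
    directed = {(s.strip(), l.strip()) for s, l in pairs}
    # A link is bidirectional if both (a, b) and (b, a) occur
    bidirectional = {pair for pair in directed
                     if (pair[1], pair[0]) in directed}
    return directed, bidirectional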

Supporting

Text

Conversation text can be found at the beginning of this notebook

Sentiment Analysis

Remember: -1 is negative, 0 is neutral, and +1 is positive.

Looking at the conversations analyzed, we can see some very neutral and some relatively positive conversations occurring.
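For a feel of the scale, here is a toy illustration of TextBlob's polarity score (invented sentences, not simulator output):

from textblob import TextBlob

## Toy sentences; polarity lands in [-1, 1]
print(TextBlob("This design is terrible.").sentiment.polarity)   # negative, < 0
print(TextBlob("We discussed caching.").sentiment.polarity)      # roughly neutral, ~0
print(TextBlob("I love this algorithm!").sentiment.polarity)     # positive, > 0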

Topics

Two topic groupings were identified:

Topic 0 includes: ‘queries’, ‘caching’, ‘query’, ‘optimizing’, ‘database’, ‘efficient’, ‘strategies’, ‘performance’, ‘retrieval’, ‘data’

Topic 1 includes: ‘enhance’, ‘like’, ‘algorithm’, ‘data’, ‘efficient’, ‘boolean’, ‘operations’, ‘design’, ‘cpu’, ‘logic’

Lexical diversity

Remember: this is the ratio of unique words to total words (the type-token ratio). In other words, as lexical diversity approaches 1 (maximum uniqueness), more words are unique; as it approaches 0 (minimum uniqueness), more words are repeated. For example, "the cat sat on the mat" contains 5 unique words out of 6 total, for a diversity of about 0.83.

We can see from this graph that there is mid to high uniqueness in word usage per conversation.

Emotional Recognition

Emotion recognition using Hugging Face Transformers: emotion_pipeline = pipeline('sentiment-analysis', model='bhadresh-savani/distilbert-base-uncased-emotion')

Surprise and Joy were the emotions recognized in the exchange.
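For context, the pipeline returns a label and a confidence score per input. A toy call might look like this (illustrative input and score; the label set is from the model card):

from transformers import pipeline

## Same pipeline as above, shown with an illustrative input
emotion_pipeline = pipeline('sentiment-analysis',
                            model='bhadresh-savani/distilbert-base-uncased-emotion')

print(emotion_pipeline(["What a fascinating approach to caching!"]))
## e.g. [{'label': 'surprise', 'score': 0.87}] -- scores will vary;
## the model's labels are: sadness, joy, love, anger, fear, surprise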

Semantic Similarity

Remember: 1 is perfectly similar and -1 is perfectly dissimilar

Clearly the conversations stayed on similar topics; for example, our most dissimilar pair still has a cosine similarity of around 0.8, which is fairly similar.

Methods of Analysis

Features extracted

  1. Sentiment Analysis: Uses TextBlob for a quick polarity assessment.
  2. Topic Modeling: Uses LDA to identify topics; ensure data preprocessing is done for better results (e.g., stopword removal, lemmatization).
  3. Lexical Diversity: Simple type-token ratio.
  4. Emotion Recognition: Uses a BERT-based model available on Hugging Face for recognizing emotions beyond simple sentiment.
  5. Semantic Similarity: A basic example leveraging BERT for embedding similarity, which indicates the context alignment between conversational turns.

Splitting global_conversations.txt

We have a pattern for who is talking to whom, so we split on those markers, discarding the speaker and listener tags.

import re

def split_conversation(raw_conversation):
    # Our conversations go something like
    # "<person talking> (talking to <person>): ..."
    # so we split on that pattern.
    lines = re.split(r'^.*?\(talking to.*?\)\:', raw_conversation,
                     flags=re.MULTILINE)

    # remove empty lines and leading/trailing whitespace from each line
    return [line.strip() for line in lines if line.strip()]
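For example, on a two-line transcript (invented here for illustration):

sample = """Carly (talking to Katherine): Bitwise ops can mask flags efficiently.
Katherine (talking to Carly): That maps nicely onto a half-adder's XOR."""

print(split_conversation(sample))
## ['Bitwise ops can mask flags efficiently.',
##  "That maps nicely onto a half-adder's XOR."]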

Sentiment, Topics, Lexical Diversity, and Emotion

import nltk
nltk.download('punkt_tab')
nltk.download('punkt')
nltk.download('stopwords')
nltk.download('vader_lexicon')
import gensim
import numpy as np
import pandas as pd
from textblob import TextBlob
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from nltk.corpus import stopwords
import transformers
from transformers import pipeline, BertTokenizer, BertModel
from gensim import corpora, models
from collections import Counter
import networkx as nx

def sentiment_analysis(conversations):
    sentiments = []
    for conversation in conversations:
        blob = TextBlob(conversation)
        sentiments.append(blob.sentiment.polarity)
    return sentiments

def topic_modeling(conversations, num_topics=2):
    cv = CountVectorizer(stop_words='english')
    dtm = cv.fit_transform(conversations)
    lda = LatentDirichletAllocation(n_components=num_topics, random_state=0)
    lda.fit(dtm)
    topic_results = lda.transform(dtm)
    topic_words = {}
    for i, topic in enumerate(lda.components_):
        topic_words[f"Topic {i}"] = [cv.get_feature_names_out()[j] 
                                     for j in topic.argsort()[-10:]]
    return topic_words

def lexical_diversity(conversations):
    diversities = []
    for conversation in conversations:
        words = nltk.word_tokenize(conversation)
        diversity = len(set(words)) / len(words) #"set of words"/"num words"
        diversities.append(diversity)
    return diversities

## emotion recognition using Hugging Face Transformers
emotion_pipeline = pipeline('sentiment-analysis', 
                            model='bhadresh-savani/distilbert-base-uncased-emotion')

def detect_emotions(conversation_texts):
    return emotion_pipeline(conversation_texts)
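## Assumed from earlier in the notebook: load the raw transcript and split it
## with split_conversation() defined above ('global_conversations.txt' per the article)
with open('global_conversations.txt') as f:
    conversations = split_conversation(f.read())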

sentiments = sentiment_analysis(conversations)
topics = topic_modeling(conversations)
diversities = lexical_diversity(conversations)
emotions = detect_emotions(conversations)

print(f'Sentiments: {sentiments}')
print(f'Topics: {topics}')
print(f'Lexical Diversities: {diversities}')
print(f'Emotions: {emotions}')

Self-similarity

We use Hugging Face Transformers to initialize a BERT tokenizer and model, pre-trained from the 'bert-base-uncased' checkpoint, to compute semantic similarity among a given list of texts. The get_embeddings() function tokenizes a sentence, generates its embeddings with BERT, and averages the token vectors to produce a single embedding. The calculate_cosine_similarity() function computes the cosine similarity between two embedding vectors, quantifying their semantic similarity. The semantic_similarity() function processes all conversations, gets their embeddings, and populates a matrix with pairwise cosine similarities; when populating the matrix, it sets self-comparisons to 1.0. This matrix is the basis for the heatmap, where yellow means more similar and blue means less similar.

from transformers import BertTokenizer, BertModel
import torch
import numpy as np
import matplotlib.pyplot as plt

## init BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

def get_embeddings(sentence):
    inputs = tokenizer(sentence, return_tensors='pt', truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze()

def calculate_cosine_similarity(embed1, embed2):
    cos_sim = np.dot(embed1, embed2) / (np.linalg.norm(embed1) * np.linalg.norm(embed2))
    return cos_sim

def semantic_similarity(conversations):
    embeddings = [get_embeddings(conversation).numpy() for conversation in conversations]
    similarities = np.zeros((len(conversations), len(conversations)))

    for i in range(len(conversations)):
        for j in range(len(conversations)):
            if i != j:
                similarities[i][j] = calculate_cosine_similarity(embeddings[i], embeddings[j])
            else:
                similarities[i][j] = 1.0  # Similarity of a sentence with itself

    return similarities

## calculate semantic similarities for the entire conversation set
similarities = semantic_similarity(conversations)

## semantic similarity as a heatmap
plt.figure(figsize=(8, 6))
plt.imshow(similarities, cmap='viridis', interpolation='nearest')
plt.colorbar(label='Cosine Similarity')
plt.xticks(ticks=range(len(conversations)), labels=[f"Conv-{i+1}" for i in range(len(conversations))], rotation=45)
plt.yticks(ticks=range(len(conversations)), labels=[f"Conv-{i+1}" for i in range(len(conversations))])
plt.title('Semantic Similarity Heatmap Among Conversations')
plt.xlabel('Conversation Index')
plt.ylabel('Conversation Index')
plt.show()

Notebook

If you found this article insightful, please consider clapping for this piece — it not only supports the author but also helps others discover valuable insights. Additionally, don’t forget to subscribe for more articles that delve into innovative technologies and AI developments. Your engagement is greatly appreciated!
