How to Use Claude 3.5 for Data Analysis(Secret tool)
- Rifx.Online
- Data Science , Technology , Machine Learning
- 20 Nov, 2024
Most of us dont know about this feature
In this article we’ll talk about how Claude, an AI-powered data analyst that can help us analyze data and make cool interactive visualizations.
We’ll cover everything you need to know to get started, from turning on special features to using smaller datasets.
Let’s dive in!
What is Claude: An AI Data Analyst
Imagine having a really smart helper in your pocket who can look at data and give you answers.
That’s what Claude is like! Claude makes data analysis easy for everyone, whether you’re a professional data scientist or just someone who wants to explore data for fun.
Claude can analyze your data, make visualizations, and even write code to make interactive charts — all with just a few clicks.
In this article, we’ll show you how Claude works by analyzing a real dataset from Kaggle.
Keep in mind, this feature is still in a testing mode, so it might not be available right away.
You need to turn on the “feature preview” before starting data analysis.
Getting Started with Claude’s Analysis Tools
First, you need to go to the feature preview section in Claude and turn on the data analysis tool.
Once it’s on, you can upload datasets for Claude to analyze.
But remember — Claude has limits on the size of datasets it can handle.
If the dataset is too big, Claude might show an error like “Conversation exceeds the length limit.”
This means it works better with smaller datasets right now. It can be a bit annoying if you want to use big datasets, but it still does a great job for smaller ones.
Uploading a Dataset and Running the Analysis
For our example, we used a dataset from Kaggle called the “100 Healthiest Foods Dataset.”
This dataset has information like food names, nutrition values, origin, calories, protein, fiber, and other health facts.
Once you upload the dataset, Claude can do some basic analysis and even make charts.
Example 1: Top Foods by Antioxidant Score
The first step was to drag and drop the dataset into Claude.
I typed
“You are a data analyst, do your thing”
and sent it.
Claude quickly analyzed the dataset and showed me the top 10 foods with the highest antioxidant score: cinnamon, turmeric, ginger, cranberries, blueberries, cloves, and more.
code:
import React, { useState, useEffect } from 'react';
import { BarChart, Bar, XAxis, YAxis, CartesianGrid, Tooltip, Legend, ResponsiveContainer, PieChart, Pie, Cell } from 'recharts';
import { Card, CardHeader, CardTitle, CardContent } from '@/components/ui/card';
import Papa from 'papaparse';
import _ from 'lodash';
const COLORS = ['#0088FE', '#00C49F', '#FFBB28', '#FF8042', '#8884d8', '#82ca9d', '#ffc658'];
const HealthFoodsDashboard = () => {
const [data, setData] = useState({
topAntioxidants: [],
regionDistribution: [],
topVitaminC: [],
averages: null
});
useEffect(() => {
const fetchData = async () => {
try {
const response = await window.fs.readFile('Top 100 Healthiest Food in the World.csv', { encoding: 'utf8' });
const parsedData = Papa.parse(response, {
header: true,
dynamicTyping: true,
skipEmptyLines: true
});
// Process data for visualizations
const topAntioxidants = _.orderBy(parsedData.data, ['Antioxidant Score'], ['desc'])
.slice(0, 10)
.map(item => ({
name: item.Food,
value: item['Antioxidant Score']
}));
const regionGroups = _.groupBy(parsedData.data, 'Originated From');
const regionDistribution = Object.entries(regionGroups)
.map(([region, foods]) => ({
name: region,
value: foods.length
}))
.sort((a, b) => b.value - a.value)
.slice(0, 7);
const topVitaminC = _.orderBy(parsedData.data, ['Vitamin C (mg)'], ['desc'])
.slice(0, 5)
.map(item => ({
name: item.Food,
value: item['Vitamin C (mg)']
}));
setData({
topAntioxidants,
regionDistribution,
topVitaminC
});
} catch (error) {
console.error('Error fetching data:', error);
}
};
fetchData();
}, []);
return (
<div className="space-y-4 p-4">
<h1 className="text-2xl font-bold mb-4">World's Healthiest Foods Analysis</h1>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4">
{/* Top Antioxidant Foods */}
<Card className="col-span-1">
<CardHeader>
<CardTitle>Top 10 Foods by Antioxidant Score</CardTitle>
</CardHeader>
<CardContent>
<div className="h-96">
<ResponsiveContainer width="100%" height="100%">
<BarChart
data={data.topAntioxidants}
layout="vertical"
margin={{ top: 5, right: 30, left: 100, bottom: 5 }}
>
<CartesianGrid strokeDasharray="3 3" />
<XAxis type="number" />
<YAxis dataKey="name" type="category" />
<Tooltip />
<Bar dataKey="value" fill="#8884d8" />
</BarChart>
</ResponsiveContainer>
</div>
</CardContent>
</Card>
{/* Region Distribution */}
<Card className="col-span-1">
<CardHeader>
<CardTitle>Foods by Region</CardTitle>
</CardHeader>
<CardContent>
<div className="h-96">
<ResponsiveContainer width="100%" height="100%">
<PieChart>
<Pie
data={data.regionDistribution}
dataKey="value"
nameKey="name"
cx="50%"
cy="50%"
outerRadius={120}
fill="#8884d8"
label={({name, percent}) => `${name} (${(percent * 100).toFixed(0)}%)`}
>
{data.regionDistribution.map((entry, index) => (
<Cell key={`cell-${index}`} fill={COLORS[index % COLORS.length]} />
))}
</Pie>
<Tooltip />
</PieChart>
</ResponsiveContainer>
</div>
</CardContent>
</Card>
{/* Top Vitamin C Foods */}
<Card className="col-span-1">
<CardHeader>
<CardTitle>Top 5 Foods Highest in Vitamin C</CardTitle>
</CardHeader>
<CardContent>
<div className="h-80">
<ResponsiveContainer width="100%" height="100%">
<BarChart
data={data.topVitaminC}
margin={{ top: 5, right: 30, left: 20, bottom: 5 }}
>
<CartesianGrid strokeDasharray="3 3" />
<XAxis dataKey="name" />
<YAxis />
<Tooltip />
<Bar dataKey="value" fill="#82ca9d" />
</BarChart>
</ResponsiveContainer>
</div>
</CardContent>
</Card>
</div>
</div>
);
};
export default HealthFoodsDashboard;
These charts help you see which foods are best for antioxidants, which is great if you’re trying to eat healthier and many other.
Example 2: Foods High in Protein
Claude also looked at which foods had the most protein.
Some of the top foods were sardines, salmon, clams, lentils, Greek yogurt, and kidney beans.
If you’re an athlete or just want more protein in your diet, this is a really quick way to see what to eat.
Claude even made a simple bar chart to compare the protein levels, making it super easy to understand.
Example 3: Regional Analysis of Healthy Foods
Another cool thing Claude did was show which regions have the healthiest foods. For example, areas like the Mediterranean and Southeast Asia had a lot of healthy foods because they have a diverse range of nutrient-rich options.
Claude created a map that showed which parts of the world have the healthiest foods. This is helpful if you want to learn about food cultures or plan a healthy diet.
One problem with Claude is that it can’t handle very large datasets.
For example, I tried using a “Student Performance Dataset” with thousands of rows, and Claude couldn’t process it because of its size.
I ended up breaking the dataset into smaller parts, and that worked well.
Claude vs. ChatGPT: Interactive Visualizations
One of the coolest things about Claude is that it can make interactive visualizations.
It can write code to make charts that you can click on and explore.
On the other hand, ChatGPT’s advanced data analysis (ADA) tool usually makes visualizations using Python, which can be more advanced but isn’t always interactive.
Example 4: Interactive Charts with Claude
Claude can make charts that let you click on the data points to explore.
For instance, when I asked Claude to compare the calorie content of different foods, it made an interactive chart.
code:
import Papa from 'papaparse';
import _ from 'lodash';
const fileContent = await window.fs.readFile('Top 100 Healthiest Food in the World.csv', { encoding: 'utf8' });
const parsedData = Papa.parse(fileContent, {
header: true,
dynamicTyping: true,
skipEmptyLines: true
});
// Get highest and lowest calorie foods
const sortedByCalories = _.orderBy(parsedData.data, ['Calories'], ['desc']);
const highestCalories = sortedByCalories.slice(0, 10);
const lowestCalories = sortedByCalories.slice(-10).reverse();
console.log("Top 10 highest calorie foods:");
highestCalories.forEach(food => {
console.log(`${food.Food}: ${food.Calories} calories (per ${food.Quantity})`);
});
console.log("\nTop 10 lowest calorie foods:");
lowestCalories.forEach(food => {
console.log(`${food.Food}: ${food.Calories} calories (per ${food.Quantity})`);
});
// Calculate calorie statistics
const calorieStats = {
average: _.meanBy(parsedData.data, 'Calories'),
median: _.sortBy(parsedData.data, 'Calories')[Math.floor(parsedData.data.length / 2)].Calories,
max: _.maxBy(parsedData.data, 'Calories').Calories,
min: _.minBy(parsedData.data, 'Calories').Calories
};
console.log("\nCalorie Statistics:");
console.log(calorieStats);
// Group foods by calorie ranges
const calorieRanges = _.groupBy(parsedData.data, food => {
const calories = food.Calories;
if (calories >= 200) return '200+ calories';
if (calories >= 150) return '150-199 calories';
if (calories >= 100) return '100-149 calories';
if (calories >= 50) return '50-99 calories';
return 'Under 50 calories';
});
console.log("\nCalorie distribution:");
Object.entries(calorieRanges).forEach(([range, foods]) => {
console.log(`${range}: ${foods.length} foods`);
});
// Analyze calories by food origin
const caloriesByRegion = _(parsedData.data)
.groupBy('Originated From')
.map((foods, region) => ({
region,
avgCalories: _.meanBy(foods, 'Calories'),
count: foods.length
}))
.orderBy(['avgCalories'], ['desc'])
.value();
console.log("\nAverage calories by region (regions with 2+ foods):");
caloriesByRegion
.filter(region => region.count >= 2)
.forEach(region => {
console.log(`${region.region}: ${region.avgCalories.toFixed(1)} avg calories (${region.count} foods)`);
});
This made it easy to see the differences between foods, which is great for presentations when people want to click and learn more details.
Example 5: Comparing Claude and ChatGPT’s Visualization Approaches
When I compared Claude with ChatGPT’s ADA, I found that ChatGPT made more detailed and advanced charts like box plots and heat maps.
For example, I uploaded a “Customer Purchase Behavior” dataset to ChatGPT, and it made a heat map that showed the relationships between age, spending, and how often people buy things.
This helped me understand the connections between different factors.
While Claude could make simple charts, ChatGPT was better for handling bigger datasets and showing more complex information.
Claude’s interactive charts are easy to share and are great for quick summaries. But if you need really deep insights, ChatGPT is better.
It can do more and handle larger datasets without any problems.
When Should You Use Claude?
Claude’s data analysis tool is perfect for people who want quick and easy charts but don’t want to learn coding.
If you are a small business owner or someone who doesn’t know how to code, Claude can help you understand your data without hiring a data analyst.
Example 6: Small Business Sales Data
Imagine you own a small business and have sales data from the past year. You can upload your monthly sales numbers into Claude, and it can help you see trends.
For example, it can show you which months had the highest sales and which products sold the most.
It can also make interactive charts showing seasonal patterns, helping you decide when to stock up on certain items or run special promotions.
Example 7: School Project or Research
If you’re a student doing a school project, Claude can be really useful. For instance, if you are studying how different weather conditions affect plant growth, you can upload your data, and Claude will make charts that show how things change over time.
It’s a great way to turn your data into easy-to-understand charts that will impress your teacher.
But if you need to do really in-depth analysis or if you already know how to code and want to work with big datasets, ChatGPT’s ADA might be a better choice because it doesn’t have the same size limits that Claude does.
Final Thoughts
Claude’s data analysis feature is very exciting especially because it can make interactive charts.
But it still has limits, like not handling large datasets well.
Right now, Claude is a great choice if you need quick and simple insights. But if you need more advanced analysis, ChatGPT is probably better.
Whether you’re a student, a small business owner, or just curious about data, Claude gives you a simple way to explore and visualize information.
It might not replace a professional data scientist, but it can definitely help you get started.
I hope this guide helped you understand how to use Claude for data analysis!
If you have any questions, feel free to leave a comment or reach out. Happy analyzing, and see you in the next article!