Analysis & Visualization
Analyze data and create visualizations to generate actionable insights.
After data cleaning, the next major steps in any project are:
Data Analysis
Data Visualization
These steps help convert raw data into meaningful insights using tools like Pandas, Matplotlib, and Seaborn.
Data Analysis
What is Data Analysis?
Data Analysis is the process of:
Examining cleaned data
Finding patterns
Discovering trends
Testing relationships
Answering business questions
Simple meaning:
"Extracting meaningful insights from data."
Types of Data Analysis
1. Descriptive Analysis
What happened?
Example: Total sales last month.
2. Diagnostic Analysis
Why did it happen?
Example: Sales decreased due to low marketing spend.
3. Predictive Analysis
What will happen?
Example: Forecast next month's sales.
Example (Using Pandas)
Descriptive Analysis using Pandas
The following Python code performs basic descriptive analysis, which is the foundation for both diagnostic and predictive analysis.
import pandas as pd

# Load the sales data (expects "sales" and "category" columns)
df = pd.read_csv("sales.csv")

# Total sales
print(df["sales"].sum())

# Average sales
print(df["sales"].mean())

# Total sales per category
print(df.groupby("category")["sales"].sum())
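The code above covers only the descriptive step. As a rough sketch of what the diagnostic and predictive steps might look like, the block below uses a small made-up table instead of sales.csv. The `month_index` and `marketing_spend` columns and all values are assumptions for illustration. Correlation gives a first hint about why sales moved, and a straight-line fit with NumPy gives a very naive forecast.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly data; column names and values are made up for illustration.
df = pd.DataFrame({
    "month_index": [1, 2, 3, 4, 5, 6],
    "marketing_spend": [10, 12, 14, 16, 18, 20],
    "sales": [100, 115, 140, 150, 185, 200],
})

# Diagnostic: how strongly do sales move together with marketing spend?
# A value near 1 suggests a strong positive linear relationship (not proof of cause).
correlation = df["marketing_spend"].corr(df["sales"])
print("Correlation:", round(correlation, 2))

# Predictive: fit a straight line to sales over time and extend it one month ahead.
slope, intercept = np.polyfit(df["month_index"], df["sales"], deg=1)
next_month = df["month_index"].max() + 1
forecast = slope * next_month + intercept
print("Forecast for month", next_month, ":", round(forecast, 2))
```

A straight-line fit ignores seasonality and other effects, so treat it as a starting point rather than a real forecasting method.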
Data Visualization
What is Data Visualization?
Data Visualization is the graphical representation of data.
It helps to:
Understand trends quickly
Identify patterns
Detect outliers
Present insights clearly
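To illustrate the outlier point, here is a minimal sketch using a box plot. The sales values are hypothetical and one of them is deliberately extreme. The code first applies the common 1.5 × IQR rule with Pandas, then draws a Seaborn box plot, which uses the same rule by default to mark points beyond the whiskers.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sales values; 950 is deliberately far from the rest.
df = pd.DataFrame({"sales": [120, 135, 128, 142, 131, 950, 138, 125]})

# IQR rule: values more than 1.5 * IQR outside the quartiles are flagged as outliers.
q1 = df["sales"].quantile(0.25)
q3 = df["sales"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["sales"] < lower) | (df["sales"] > upper)]
print(outliers)

# The flagged points appear as isolated markers beyond the whiskers of the box plot.
sns.boxplot(x=df["sales"])
plt.title("Sales Distribution")
plt.show()
```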
Common Types of Charts
Example: Bar Chart
Sales by Category – Bar Chart Visualization
A bar chart is used to compare values across different categories. In this example:
X-axis → Category
Y-axis → Sales
Each bar represents the average sales of a category, because sns.barplot plots the mean by default; to show totals, sum the sales per category before plotting.
This type of visualization helps in diagnostic analysis to quickly identify:
Which category is performing best
Which category has low sales
import seaborn as sns
import matplotlib.pyplot as plt
sns.barplot(x="category", y="sales", data=df)
plt.title("Sales by Category")
plt.show()
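If the goal is to compare total sales rather than averages, one option is to aggregate with Pandas first and plot the result. The sketch below assumes a small hypothetical dataset standing in for sales.csv; the category names and values are made up.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data standing in for sales.csv.
df = pd.DataFrame({
    "category": ["Electronics", "Clothing", "Electronics", "Grocery", "Clothing"],
    "sales": [500, 200, 300, 150, 100],
})

# Sum sales per category first, so each bar shows a total rather than a mean.
totals = df.groupby("category", as_index=False)["sales"].sum()

# Sort so the best-performing category appears first.
totals = totals.sort_values("sales", ascending=False)

ax = sns.barplot(x="category", y="sales", data=totals)
ax.set_title("Total Sales by Category")
ax.set_ylabel("Total sales")
plt.show()
```

Because each category now has exactly one row, the bars carry no error bars, and the sorted order makes the best and weakest categories easy to spot.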
Example: Line Chart
Monthly Sales Trend – Line Chart
A line chart is used to show trends over time. In this example:
X-axis → Month
Y-axis → Sales
The line connects sales values month by month.
This helps in trend analysis and is useful for predictive analysis, as it shows whether sales are increasing, decreasing, or fluctuating over time.
sns.lineplot(x="month", y="sales", data=df)
plt.title("Monthly Sales Trend")
plt.show()