Data Cleaning & Transformation (Power Query)
-
Clean and transform datasets using Power Query in Power BI.
Importance of Data Cleaning
Raw data is usually:
Inconsistent
Contains duplicates
Has missing values
Wrong data types
Extra spaces / formatting issues
If data is not cleaned:
Wrong totals
Incorrect KPIs
Confusing dashboardsGolden Rule:
Garbage In → Garbage Out
Power Query helps convert messy raw data into structured, reliable data.
Removing Duplicates
Why Remove Duplicates?
Duplicate rows can:
Double-count sales
Inflate revenue
Distort analysis
How to Remove:
Select column(s)
Go to Home → Remove Rows
Click “Remove Duplicates”
Always decide carefully:
Remove duplicates based on ID? Email? Order number?Handling Null / Missing Values
Missing values are common in real datasets.
Problems Caused by Nulls:
Blank visuals
Calculation errors
Incorrect averages
Solutions:
Replace with default value (0, “Unknown”)
Remove rows
Fill Down / Fill Up
Use conditional columnExample:
Replace null in Sales → 0Correcting Data Types
Very important step ⚠
If data type is wrong:
Sum won’t work
Dates won’t group
Sorting fails
Common Data Types:
Text
Whole Number
Decimal Number
Date
Date/Time
Boolean
How to Fix:
Select column → Change Data Type
Example:
If “Sales” is Text → Change to Decimal Number.Split Columns
Used when multiple values exist in one column.
Example:
Full Name → “Daksha Rajput”
Split by space →
First Name | Last NameHow to Split:
Select column → Transform → Split Column →
By Delimiter
By Number of Characters
Useful for:
Dates in text format
Codes
Product categories
Merge Queries
Used to combine data from multiple tables.
Similar to SQL JOIN.
Example:
Sales Table + Customer Table
Merge using Customer ID
Types of Join:
Left Outer (most common)
Inner
Full Outer
How to Merge:
Home → Merge Queries →
Select common columnEssential for building star schema.
Basic Data Transformations
Power Query provides many transformations:
Remove Columns
Delete unnecessary columns to improve performance.
Rename Columns
Use clean naming conventions.
Example:
cust_nm ❌
Customer_Name ✅Filter Rows
Remove unwanted records.
Example:
Remove sales before 2022Group By
Summarize data.
Example:
Total Sales by RegionAdd Custom Column
Create new logic.
Example:
If Sales > 10000 → “High” else “Low”Trim & Clean Text
Remove extra spaces.
Transform → Format → TrimTypical Data Cleaning Workflow
Import Data
⬇
Remove unnecessary columns
⬇
Fix data types
⬇
Handle nulls
⬇
Remove duplicates
⬇
Split / Merge if needed
⬇
Close & Apply