📊 Overview
This dataset represents a simulated FinTech payment processing company.
It is designed for hands-on practice in:
Data Cleaning (Power Query)
Data Modeling (Star Schema)
DAX & Time Intelligence
Performance Optimization
RLS (Row-Level Security)
Power BI Service deployment
The dataset contains realistic transactional data with minor, intentional data quality issues that simulate real-world business scenarios.
📦 Dataset Size
Fact Table: 100,000 transaction rows
Dimensions: 8 tables
Date Range: 2023 – 2026
Total Tables: 9
🏗️ Data Model Structure
The dataset follows a Star Schema design:
Fact Table
FactTransactions
Dimension Tables
DimDate
DimCompany
DimSalesRep
DimManager
DimProduct
DimChannel
DimPaymentMethod
DimRegion
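To illustrate how a star schema is queried, here is a minimal Python sketch that looks up a dimension attribute for each fact row and aggregates by it. The column names (ProductKey, ProductName, Revenue) and the sample rows are made up for illustration and are not taken from the dataset; in Power BI the same result comes from a one-to-many relationship between DimProduct and FactTransactions.

```python
# Hypothetical rows and column names, for illustration only.
dim_product = {
    1: {"ProductName": "Product A"},
    2: {"ProductName": "Product B"},
}

fact_transactions = [
    {"TransactionID": "T1", "ProductKey": 1, "Revenue": 120.0},
    {"TransactionID": "T2", "ProductKey": 2, "Revenue": 80.0},
    {"TransactionID": "T3", "ProductKey": 1, "Revenue": 30.0},
    {"TransactionID": "T4", "ProductKey": None, "Revenue": 10.0},  # null foreign key
]


def revenue_by_product(facts, products):
    """Sum Revenue per product name, resolving each fact row via its key."""
    totals = {}
    for row in facts:
        product = products.get(row["ProductKey"])
        # Rows whose key is missing or unmatched fall into an "Unknown" bucket.
        name = product["ProductName"] if product else "Unknown"
        totals[name] = totals.get(name, 0.0) + row["Revenue"]
    return totals


print(revenue_by_product(fact_transactions, dim_product))
```

Routing unmatched keys to an "Unknown" bucket is one common convention; filtering those rows out during cleaning is an equally valid choice.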
💼 Business Context
The company processes digital transactions for corporate clients.
Key business metrics and attributes include:
GMV (Gross Merchandise Value)
Revenue
Net Revenue
Transaction Status
Refund logic
Payment Channels
Regional performance
Sales hierarchy
Time-based analysis (YTD, YoY, MTD)
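The time-based measures listed above can be sketched in plain Python to show the arithmetic behind them. In Power BI these would normally be DAX measures (for example with TOTALYTD, TOTALMTD, or SAMEPERIODLASTYEAR); the function names and sample rows below are hypothetical and only illustrate the logic.

```python
from datetime import date


def ytd_total(rows, as_of):
    """Sum amounts from 1 January of as_of's year through as_of, inclusive."""
    start = date(as_of.year, 1, 1)
    return sum(amount for day, amount in rows if start <= day <= as_of)


def mtd_total(rows, as_of):
    """Sum amounts from the first day of as_of's month through as_of, inclusive."""
    start = as_of.replace(day=1)
    return sum(amount for day, amount in rows if start <= day <= as_of)


def yoy_growth(current, previous):
    """Relative change versus the prior period; None when there is no baseline."""
    if previous == 0:
        return None
    return (current - previous) / previous


# Made-up (date, amount) pairs, for illustration only.
rows = [
    (date(2023, 1, 15), 80),
    (date(2023, 2, 20), 40),
    (date(2024, 1, 10), 100),
    (date(2024, 2, 5), 50),
    (date(2024, 3, 1), 25),
]

current_ytd = ytd_total(rows, date(2024, 2, 29))   # 150
previous_ytd = ytd_total(rows, date(2023, 2, 28))  # 120
print(current_ytd, previous_ytd, yoy_growth(current_ytd, previous_ytd))
```

The caller passes the prior-year cutoff date explicitly, which sidesteps the question of how to map 29 February onto a non-leap year.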
🧪 Included Real-World Data Challenges
To simulate real business environments, the dataset intentionally includes:
Duplicate Transaction IDs (small percentage)
Mixed date formats (regional variations)
Currency symbols in numeric fields
Null foreign keys
Negative GMV values (refund scenarios)
Inactive relationship scenario (CreatedDate vs TransactionDate)
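Several of the challenges above come down to small, repeatable cleaning steps. The sketch below shows one possible approach in Python; in practice the same steps would be built in Power Query. The date formats, the TransactionID column name, and the assumption that amounts use a dot as the decimal separator are illustrative guesses, not facts about the files.

```python
import re
from datetime import date, datetime
from decimal import Decimal

# Assumed formats; the real files may use others. Ambiguous day/month orderings
# (e.g. 03/04/2024) cannot be resolved by trial parsing alone.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def parse_amount(text: str) -> Decimal:
    """Strip currency symbols and thousands separators; keep sign and decimals.

    Assumes a dot is the decimal separator and a comma is a thousands separator.
    """
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    return Decimal(cleaned)


def parse_date(text: str) -> date:
    """Try each assumed format in order and return the first successful parse."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {text!r}")


def drop_duplicate_ids(rows, key="TransactionID"):
    """Keep the first row seen for each ID and discard later repeats."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] in seen:
            continue
        seen.add(row[key])
        unique.append(row)
    return unique


print(parse_amount("$1,234.50"), parse_date("15/03/2024"))
```

Negative amounts are preserved rather than removed, since the dataset uses negative GMV to represent refunds.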