Teradata is a massively parallel processing (MPP), enterprise-grade relational database management system (RDBMS) designed specifically for data warehousing and large-scale analytics.
🏗️ Core Purpose
Teradata is built to store and analyze massive volumes of structured data efficiently.
It helps large organizations—especially banks, telecoms, and retailers—run:
- Complex SQL analytics
- Fast reporting and BI
- Data mining and predictive modeling
- Regulatory and compliance reporting
🔧 Key Features of Teradata
| Feature | Description |
| --- | --- |
| MPP Architecture | Data is split across many processing nodes → parallel query execution |
| Shared-Nothing Architecture | Each node works independently → avoids bottlenecks |
| SQL Support | Full ANSI-compliant SQL plus Teradata extensions |
| Linear Scalability | Add more nodes and performance grows almost linearly |
| Integrated Storage + Compute | Tight integration for performance (unlike some cloud models) |
| High Availability | Redundant nodes, RAID disk configuration, and fault tolerance |
| Workload Management | Prioritizes queries based on SLAs and workload rules |
| Security & Auditing | Essential for BFSI industry compliance (GDPR, SOX, etc.) |
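The MPP and shared-nothing rows above come down to how rows are placed: Teradata hashes a table's primary index value and the hash determines which AMP owns each row. A minimal sketch in Teradata SQL (database, table, and column names are illustrative, not from any real system):

```sql
-- Rows are hashed on the PRIMARY INDEX column(s); each hash value maps to
-- one AMP, so a query filtering on txn_id touches a single AMP while a
-- full-table scan runs on all AMPs in parallel.
CREATE MULTISET TABLE bank_dw.transactions_fact
(
    txn_id      BIGINT        NOT NULL,
    account_id  BIGINT        NOT NULL,
    txn_ts      TIMESTAMP(0)  NOT NULL,
    amount      DECIMAL(18,2) NOT NULL,
    txn_type    CHAR(1)       NOT NULL   -- 'D' = debit, 'C' = credit
)
PRIMARY INDEX (txn_id);
```

A high-cardinality column is the usual primary-index choice: a skewed column (e.g. branch code) would pile rows onto a few AMPs and undercut the parallelism the table above depends on.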
🔐 Typical Banking Use Cases
| Use Case | Example |
| --- | --- |
| Risk Analysis | Credit scoring, risk modeling using huge historical data |
| Customer 360 | Build a full view of customer interactions, transactions |
| Regulatory Reporting | Basel II/III, AML, FATCA compliance data processing |
| Fraud Detection | Real-time or batch-based transaction pattern checks |
| Campaign Analytics | Which customers to target for loans/cards |
| Branch/ATM Performance | Operational reporting and optimization |
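As a flavor of the batch-based fraud-detection pattern checks mentioned above, a rule like "3 or more ATM withdrawals by one account within the same hour" can be expressed as a single set-based query (table and column names are assumptions for illustration):

```sql
-- Flag accounts with 3+ ATM withdrawals in any one-hour bucket,
-- a simple velocity rule often used in batch fraud screening.
SELECT  account_id,
        CAST(txn_ts AS DATE)      AS txn_date,
        EXTRACT(HOUR FROM txn_ts) AS txn_hour,
        COUNT(*)                  AS withdrawals,
        SUM(amount)               AS total_amount
FROM    bank_dw.atm_transactions
WHERE   txn_type = 'W'            -- 'W' = withdrawal (illustrative code)
GROUP BY 1, 2, 3
HAVING  COUNT(*) >= 3;
```

In an MPP system this aggregation runs on every AMP in parallel over its own slice of the data, which is why such scans stay fast even over billions of rows.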
🛠️ Tools & Technologies with Teradata
Languages & Utilities

- SQL (ANSI standard)
- BTEQ (Basic Teradata Query) – command-line scripting tool for batch execution
- Teradata SQL Assistant – GUI SQL interface
- FastLoad, MultiLoad, TPT – bulk data-loading utilities
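A BTEQ batch script mixes dot-commands with ordinary SQL. A minimal sketch of a nightly export job (host, credentials, paths, and table names are placeholders, not real values):

```sql
.LOGON tdprod/etl_user,etl_password;

-- Export yesterday's high-value transactions to a flat file
.EXPORT REPORT FILE = /data/exports/high_value_txns.txt

SELECT  account_id, txn_ts, amount
FROM    bank_dw.transactions_fact
WHERE   amount > 100000
AND     CAST(txn_ts AS DATE) = CURRENT_DATE - 1;

.EXPORT RESET
.LOGOFF;
.QUIT;
```

Scripts like this are typically scheduled by an external scheduler (cron, Control-M, etc.) as part of the end-of-day batch.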
Tools
| Tool | Purpose |
| --- | --- |
| Teradata Studio / SQL Assistant | SQL query development and execution |
| Teradata Viewpoint | Monitoring and management dashboard |
| Teradata Parallel Transporter (TPT) | Fast data movement tool |
| ETL Tools (Informatica, Talend, DataStage) | Often used for loading Teradata |
| BI Tools (Tableau, MicroStrategy, Power BI, Cognos) | Connect to Teradata for reporting |
- AMP = Access Module Processor → each AMP owns a portion of the data and processes it independently.
- BYNET = Teradata’s high-speed interconnect that carries messages between nodes and AMPs.
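Teradata exposes the row-placement chain directly through its hash functions, which makes it possible to check how evenly a candidate primary-index column would spread rows across AMPs. A common skew check (table and column names are illustrative):

```sql
-- HASHROW → HASHBUCKET → HASHAMP is the chain Teradata uses to place rows.
-- Counting rows per AMP for a candidate PI column reveals skew before
-- committing to that column as the primary index.
SELECT  HASHAMP(HASHBUCKET(HASHROW(account_id))) AS amp_no,
        COUNT(*)                                 AS row_cnt
FROM    bank_dw.transactions_fact
GROUP BY 1
ORDER BY 2 DESC;
```

A roughly equal `row_cnt` per AMP means good parallelism; one AMP with far more rows than the rest signals a skewed distribution column.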
🏁 Summary
| Aspect | Description |
| --- | --- |
| What | Scalable RDBMS for analytics and data warehousing |
| Built for | High-speed SQL analytics on massive datasets |
| Industry | Widely used in banking, telecom, retail |
| Type | On-Prem (mostly) – now also has Teradata Vantage for cloud |
| Strength | Parallelism, speed, reliability, banking-grade features |
The sections below give a step-by-step walkthrough of how data flows in a banking ecosystem, covering OLTP databases, Data Lakes, Data Warehouses, and analytics use cases.
🏦 1. Where Daily Banking Transactions Are Stored
✅ Source Systems: OLTP Databases
- Systems like: Core Banking, Loan Management, ATM Systems, Credit Card Systems, Mobile App Backend, CRM, etc.
- Databases used: Oracle, IBM Db2, PostgreSQL, SQL Server, MySQL

Characteristics:

- High speed, low latency
- Row-based storage
- Handles millions of inserts/updates per day
- Highly normalized schema
- Not suitable for heavy analytical queries
📝 Examples of Data Stored Here:
| Table | Sample Data |
| --- | --- |
| Accounts | Account number, customer ID, balance |
| Transactions | Debit/credit, timestamp, amount, location |
| Loans | EMI schedule, status, overdue days |
| Customers | KYC, contact info, risk rating |
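The "highly normalized schema" point above can be sketched as a pair of OLTP tables in ANSI SQL, where each fact lives in exactly one place and transactions reference accounts by key (names and columns are invented for illustration):

```sql
-- Normalized OLTP design: customer data, account data, and transaction
-- data each live in their own table, linked by foreign keys.
CREATE TABLE customers (
    customer_id  INTEGER      PRIMARY KEY,
    full_name    VARCHAR(100) NOT NULL,
    risk_rating  CHAR(1)      NOT NULL          -- e.g. 'L' / 'M' / 'H'
);

CREATE TABLE accounts (
    account_no   VARCHAR(20)   PRIMARY KEY,
    customer_id  INTEGER       NOT NULL REFERENCES customers(customer_id),
    balance      DECIMAL(18,2) NOT NULL
);

CREATE TABLE transactions (
    txn_id       BIGINT        PRIMARY KEY,
    account_no   VARCHAR(20)   NOT NULL REFERENCES accounts(account_no),
    txn_type     CHAR(1)       NOT NULL,        -- 'D' debit / 'C' credit
    txn_ts       TIMESTAMP     NOT NULL,
    amount       DECIMAL(18,2) NOT NULL,
    location     VARCHAR(50)
);
```

This layout makes single-row inserts and updates cheap, which is exactly why it suits OLTP and not wide analytical scans.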
📤 2. ETL or ELT to Central Storage
At end of day (EOD), or in real time for some events, data is:

- Extracted from OLTP systems
- Transformed: cleaned, enriched, joined
- Loaded into:
  - a Data Lake (for raw and semi-structured/unstructured data), or
  - a Data Warehouse (for structured, analytics-ready data)
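Inside the warehouse, the transform-and-load step is often a single set-based statement: raw rows land in a staging table, then one INSERT-SELECT cleans and enriches them into the target. A sketch with assumed staging and dimension tables:

```sql
-- Typical EOD transform step: move raw staged rows into the warehouse
-- fact table, applying cleansing rules and a key lookup in one pass.
INSERT INTO bank_dw.transactions_fact
    (txn_id, account_id, txn_ts, amount, txn_type)
SELECT  s.txn_id,
        a.account_id,                          -- surrogate-key lookup
        s.txn_ts,
        CAST(s.amount AS DECIMAL(18,2)),       -- type standardization
        UPPER(TRIM(s.txn_type))                -- value standardization
FROM    stage.transactions_raw AS s
JOIN    bank_dw.account_dim    AS a
  ON    a.account_no = s.account_no
WHERE   s.amount IS NOT NULL;                  -- basic cleansing rule
```

Set-based loads like this are preferred over row-by-row processing because the database can execute them in parallel across all AMPs.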
🗂️ 3. Data Lake in Banking: Use & Architecture
✅ When Data Lake is Used:
- Storing raw, unstructured, semi-structured, and structured data
- Keeping the full history of all ingested data
- Data science, ML, and sandbox experiments
- Regulatory traceability (raw audit logs)
🔧 Storage Technology:
- On-Prem: Hadoop (HDFS)
- Cloud: Azure Data Lake Storage, Amazon S3, Google Cloud Storage
📁 Typical Zones in a Data Lake:
| Zone | Purpose |
| --- | --- |
| Raw/Landing | Untouched source data from OLTP |
| Cleansed | Cleaned/standardized files |
| Curated | Joined, enriched, business-ready |
| Analytics/Sandbox | Data science, ML model data |
| Archived | Compressed historical logs |
📝 Data Formats:
- CSV, JSON, Avro, Parquet
- Images, PDFs (e.g., cheque scans), logs

📦 Examples:

- Log files from ATMs
- Mobile app clickstream events
- OCR output from cheque images
- Web scraping for credit intelligence
🏛️ 4. Data Warehouse in Banking
✅ When Data Warehouse is Used:
- Business reporting, dashboards, and regulatory reporting
- Fast SQL queries across large volumes of data
- Data that is cleaned, modeled, and has a defined schema
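"Cleaned, modeled, and schema defined" usually means a dimensional (star-schema) model: a large fact table joined to small dimension tables. A typical reporting query against such a model (fact and dimension names are illustrative):

```sql
-- Monthly debit volume per branch from a star schema:
-- one big fact table joined to a small branch dimension.
SELECT  b.branch_name,
        EXTRACT(YEAR  FROM f.txn_ts) AS yr,
        EXTRACT(MONTH FROM f.txn_ts) AS mth,
        SUM(f.amount)                AS debit_volume
FROM    bank_dw.transactions_fact AS f
JOIN    bank_dw.branch_dim        AS b
  ON    b.branch_id = f.branch_id
WHERE   f.txn_type = 'D'
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;
```

Queries of this shape are what the warehouse is optimized for, in contrast to the single-row lookups and updates handled by the OLTP systems described earlier.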