💡 What is Teradata?

Teradata is a massively parallel processing (MPP), enterprise-grade relational database management system (RDBMS) designed specifically for data warehousing and large-scale analytics.


🏗️ Core Purpose

Teradata is built to store and analyze massive volumes of structured data efficiently.

It helps large organizations—especially banks, telecoms, and retailers—run:

  • Complex SQL analytics
  • Fast reporting and BI
  • Data mining, predictive modeling
  • Regulatory and compliance reporting

🔧 Key Features of Teradata

| Feature | Description |
|---|---|
| MPP Architecture | Data is split across many processing nodes → parallel query execution |
| Shared-Nothing Architecture | Each node works on its own slice of the data with its own resources → avoids shared-resource bottlenecks |
| SQL Support | ANSI SQL support plus Teradata-specific extensions |
| Linear Scalability | Add more nodes and performance grows almost linearly |
| Integrated Data Storage + Compute | Tight integration for performance (unlike some cloud models that separate the two) |
| High Availability | Redundant nodes, RAID disk configuration, and fault tolerance |
| Workload Management | Prioritizes queries based on SLAs and workloads |
| Security & Auditing | Access control and audit features that support BFSI compliance (GDPR, SOX, etc.) |
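
To make the Workload Management row more concrete, here is a minimal Python sketch of priority-based query scheduling. It is a toy model only, not how Teradata's workload manager is implemented: the workload names, the priority numbers, and the `QueryScheduler` class are all assumptions made up for illustration.

```python
import heapq
import itertools

# Hypothetical workload classes; a lower number is served first.
PRIORITY = {"tactical": 0, "regulatory": 1, "reporting": 2, "adhoc": 3}


class QueryScheduler:
    """Toy scheduler: always hands out the queued query with the best priority."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a workload

    def submit(self, workload, sql):
        heapq.heappush(self._heap, (PRIORITY[workload], next(self._order), sql))

    def next_query(self):
        _, _, sql = heapq.heappop(self._heap)
        return sql


scheduler = QueryScheduler()
scheduler.submit("adhoc", "SELECT * FROM sandbox.big_join")
scheduler.submit("tactical", "SELECT balance FROM accounts WHERE account_id = 42")
scheduler.submit("reporting", "SELECT region, SUM(amount) FROM txn GROUP BY region")

run_order = [scheduler.next_query() for _ in range(3)]
print(run_order[0])  # the tactical lookup, even though it was submitted second
```

A real workload manager also handles throttles, resource limits, and SLA-based rules; the sketch only shows the ordering idea.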

🔐 Typical Banking Use Cases

| Use Case | Example |
|---|---|
| Risk Analysis | Credit scoring, risk modeling using huge historical data |
| Customer 360 | Build a full view of customer interactions, transactions |
| Regulatory Reporting | Basel II/III, AML, FATCA compliance data processing |
| Fraud Detection | Real-time or batch-based transaction pattern checks |
| Campaign Analytics | Which customers to target for loans/cards |
| Branch/ATM Performance | Operational reporting and optimization |

🛠️ Tools & Technologies with Teradata

Languages & Utilities

  • SQL (ANSI standard)
  • BTEQ (Basic Teradata Query) – scripting tool for batch execution
  • Teradata SQL Assistant – GUI SQL interface
  • FastLoad, MultiLoad, TPT – bulk data load utilities
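
As a rough illustration of what a BTEQ batch job looks like, the Python sketch below only assembles the text of a small BTEQ script. The helper name `build_bteq_script`, the system name `tdprod`, the user, the table `bank.transactions`, and the output file are hypothetical; the dot-commands used (`.LOGON`, `.EXPORT`, `.IF ERRORCODE`, `.LOGOFF`, `.QUIT`) are standard BTEQ commands.

```python
def build_bteq_script(tdpid, user, password, sql, export_file):
    """Return the text of a small BTEQ script that exports one query to a report file."""
    lines = [
        f".LOGON {tdpid}/{user},{password}",      # open a session
        f".EXPORT REPORT FILE = {export_file}",   # send query output to a file
        sql.rstrip().rstrip(";") + ";",           # the SQL request itself
        ".IF ERRORCODE <> 0 THEN .QUIT 1",        # stop with a non-zero return code on failure
        ".EXPORT RESET",                          # stop exporting
        ".LOGOFF",
        ".QUIT 0",
    ]
    return "\n".join(lines) + "\n"


# All values below are placeholders, not real systems or credentials.
script = build_bteq_script(
    tdpid="tdprod",
    user="etl_user",
    password="********",
    sql="SELECT COUNT(*) FROM bank.transactions",
    export_file="txn_count.txt",
)
print(script)
```

The generated text would typically be saved to a file and fed to the `bteq` command on standard input by a scheduler; production jobs usually avoid writing passwords in plain text.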

Tools

| Tool | Purpose |
|---|---|
| Teradata Studio / SQL Assistant | SQL query development and execution |
| Teradata Viewpoint | Monitoring and management dashboard |
| Teradata Parallel Transporter (TPT) | Fast data movement tool |
| ETL Tools | Informatica, Talend, DataStage often used for loading Teradata |
| BI Tools | Tableau, MicroStrategy, Power BI, Cognos connect with Teradata |

📦 Teradata Architecture (Simplified)

 +----------------+       +------------------+
 | Client (SQL)   | <-->  |  Parsing Engine  |  <-- manages SQL, session
 +----------------+       +------------------+
                                 |
                         +---------------+
                         | BYNET Network |
                         +---------------+
                                 |
              +-------------------+-------------------+
              |                   |                   |
        +----------+        +----------+        +----------+
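        |   AMP    |        |   AMP    |  ...   |   AMP    |
        +----------+        +----------+        +----------+
        | Storage  |        | Storage  |        | Storage  |
        +----------+        +----------+        +----------+

  • AMP = Access Module Processor → a virtual processor that owns a portion of the data (rows are distributed across AMPs by hashing the primary index) and processes it independently. A node typically runs several AMPs.
  • BYNET = Teradata’s interconnect that carries messages between the Parsing Engine and the AMPs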
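
The idea of "each AMP handles a portion of the data" can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `NUM_AMPS`, the `amp_for` helper, and the use of CRC32 are stand-ins chosen for illustration, whereas Teradata uses its own row-hash function and hash maps.

```python
import zlib
from collections import defaultdict

NUM_AMPS = 4  # assumed size of the toy system


def amp_for(primary_index_value):
    """Map a primary-index value to an AMP number via a hash (CRC32 as a stand-in)."""
    return zlib.crc32(str(primary_index_value).encode("utf-8")) % NUM_AMPS


# (account_id, amount) rows; account_id plays the role of the primary index.
rows = [(acct, 10.0 * acct) for acct in range(1, 101)]

# "Load": each row lands on exactly one AMP.
amps = defaultdict(list)
for account_id, amount in rows:
    amps[amp_for(account_id)].append((account_id, amount))

# "Query": every AMP sums its own slice independently, then the partial results are combined.
partial_sums = {amp: sum(amount for _, amount in slice_) for amp, slice_ in amps.items()}
total = sum(partial_sums.values())

print({amp: len(slice_) for amp, slice_ in sorted(amps.items())})  # rows held by each AMP
print(total)  # 50500.0
```

Because the same value always hashes to the same AMP, a lookup by primary index only needs to touch one AMP, while a full aggregation like the sum above keeps all AMPs busy in parallel.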

🏁 Summary

| Aspect | Description |
|---|---|
| What | Scalable RDBMS for analytics and data warehousing |
| Built for | High-speed SQL analytics on massive datasets |
| Industry | Widely used in banking, telecom, retail |
| Type | Historically on-prem; Teradata Vantage is also available in the cloud |
| Strength | Parallelism, speed, reliability, banking-grade features |

The rest of this post is a step-by-step walkthrough of how data flows in a banking ecosystem, covering OLTP databases, Data Lakes, Data Warehouses, and analytics use cases.


🏦 1. Where Daily Banking Transactions Are Stored

Source Systems: OLTP Databases

  • Systems like: Core Banking, Loan Management, ATM Systems, Credit Card Systems, Mobile App Backend, CRM, etc.
  • Databases Used: Oracle, IBM Db2, PostgreSQL, SQL Server, MySQL
  • Characteristics:
    • High speed, low latency
    • Row-based storage
    • Handles millions of inserts/updates per day
    • Highly normalized schema
    • Not suitable for heavy analytical queries

📝 Examples of Data Stored Here:

| Table | Sample Data |
|---|---|
| Accounts | Account number, customer ID, balance |
| Transactions | Debit/credit, timestamp, amount, location |
| Loans | EMI schedule, status, overdue days |
| Customers | KYC, contact info, risk rating |
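
The OLTP pattern described above (small, frequent, atomic writes against normalized tables) can be sketched with Python's built-in `sqlite3` module. SQLite is only a stand-in for Oracle, Db2, and similar systems, and the table layout, the `debit` helper, and the sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for an OLTP database
conn.executescript("""
    CREATE TABLE accounts (
        account_no  TEXT PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        balance     REAL NOT NULL CHECK (balance >= 0)
    );
    CREATE TABLE transactions (
        txn_id     INTEGER PRIMARY KEY AUTOINCREMENT,
        account_no TEXT NOT NULL REFERENCES accounts(account_no),
        txn_type   TEXT NOT NULL CHECK (txn_type IN ('DEBIT', 'CREDIT')),
        amount     REAL NOT NULL,
        txn_ts     TEXT NOT NULL,
        location   TEXT
    );
    INSERT INTO accounts VALUES ('ACC-001', 1, 500.0);
""")


def debit(conn, account_no, amount, location):
    """Record a debit and update the balance in one atomic transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute(
            "INSERT INTO transactions (account_no, txn_type, amount, txn_ts, location) "
            "VALUES (?, 'DEBIT', ?, datetime('now'), ?)",
            (account_no, amount, location),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE account_no = ?",
            (amount, account_no),
        )


debit(conn, "ACC-001", 120.0, "ATM-17")
try:
    debit(conn, "ACC-001", 1000.0, "ATM-17")  # would overdraw: CHECK fails, nothing is saved
except sqlite3.IntegrityError:
    pass

balance = conn.execute("SELECT balance FROM accounts").fetchone()[0]
txn_count = conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
print(balance, txn_count)  # 380.0 1
```

The failed debit leaves no trace in either table, which is the all-or-nothing behavior OLTP systems are built around.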

📤 2. ETL or ELT to Central Storage

At end of day (EOD) or in real-time (for some events), data is:

  • Extracted from OLTP systems
  • Transformed: cleaned, enriched, joined (before loading in ETL, after loading in ELT)
  • Loaded into:
    • Data Lake (for raw and semi-structured/unstructured data)
    • or Data Warehouse (for structured, analysis-ready data)
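
The extract → transform → load steps above can be sketched end to end in plain Python. This is a minimal sketch, not a production pipeline: the sample rows, the `transform` helper, and the `fact_transactions` table are made up, and SQLite stands in for the warehouse.

```python
import sqlite3

# EXTRACT: rows as they might arrive from an OLTP export (values are made up).
extracted = [
    {"txn_id": 1, "account_no": " acc-001 ", "type": "debit", "amount": "120.50", "ts": "2024-01-31T09:15:00"},
    {"txn_id": 2, "account_no": "ACC-002", "type": "CREDIT", "amount": "75", "ts": "2024-01-31T10:02:00"},
    {"txn_id": 3, "account_no": "ACC-001", "type": "debit", "amount": "n/a", "ts": "2024-01-31T11:40:00"},
]


def transform(row):
    """Clean one row; return None when it cannot be repaired."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None  # a real pipeline would route this to a reject file
    txn_type = row["type"].upper()
    signed = -amount if txn_type == "DEBIT" else amount
    return (row["txn_id"], row["account_no"].strip().upper(), txn_type, signed, row["ts"][:10])


# TRANSFORM: standardize casing, parse amounts, drop unusable rows.
cleaned = [t for t in map(transform, extracted) if t is not None]

# LOAD: bulk insert into the analytical table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_transactions "
    "(txn_id INTEGER PRIMARY KEY, account_no TEXT, txn_type TEXT, signed_amount REAL, txn_date TEXT)"
)
with warehouse:
    warehouse.executemany("INSERT INTO fact_transactions VALUES (?, ?, ?, ?, ?)", cleaned)

loaded = warehouse.execute(
    "SELECT account_no, signed_amount FROM fact_transactions ORDER BY txn_id"
).fetchall()
print(loaded)  # [('ACC-001', -120.5), ('ACC-002', 75.0)]
```

In an ELT variant the raw rows would be loaded first and the same cleaning logic would run inside the target platform, usually as SQL.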

🗂️ 3. Data Lake in Banking: Use & Architecture

✅ When Data Lake is Used:

  • For storing raw, unstructured, semi-structured, and structured data
  • To keep history of all ingested data
  • For data science, ML, sandbox experiments
  • For regulatory traceability (raw audit logs)

🔧 Storage Technology:

  • On-Prem: Hadoop (HDFS)
  • Cloud: Azure Data Lake, Amazon S3, Google Cloud Storage

📁 Typical Zones in a Data Lake:

| Zone | Purpose |
|---|---|
| Raw/landing | Untouched source data from OLTP |
| Cleansed | Cleaned/standardized files |
| Curated | Joined, enriched, business-ready |
| Analytics/Sandbox | Data Science, ML model data |
| Archived | Compressed historical logs |
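
One common way to realize these zones is a folder convention with date partitions. The sketch below assumes a hypothetical layout of `zone/source/dataset/ingest_date=YYYY-MM-DD` and uses a temporary local directory in place of HDFS, S3, or ADLS; the `zone_path` helper and the file names are illustrative, not a standard.

```python
import csv
import json
import tempfile
from pathlib import Path


def zone_path(lake_root, zone, source, dataset, ingest_date):
    """Build a date-partitioned folder path, e.g. raw/core_banking/transactions/ingest_date=2024-01-31."""
    return Path(lake_root) / zone / source / dataset / f"ingest_date={ingest_date}"


lake = Path(tempfile.mkdtemp())  # temporary directory standing in for the lake

# Raw zone: store the source records exactly as received.
raw_dir = zone_path(lake, "raw", "core_banking", "transactions", "2024-01-31")
raw_dir.mkdir(parents=True)
raw_file = raw_dir / "part-0000.json"
raw_file.write_text(json.dumps([
    {"txn_id": 1, "amount": "120.50", "type": "debit"},
    {"txn_id": 2, "amount": "75", "type": "CREDIT"},
]))

# Cleansed zone: same records with standardized types and casing; the raw copy stays untouched.
cleansed_dir = zone_path(lake, "cleansed", "core_banking", "transactions", "2024-01-31")
cleansed_dir.mkdir(parents=True)
cleansed_file = cleansed_dir / "part-0000.csv"
with cleansed_file.open("w", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["txn_id", "amount", "type"])
    for record in json.loads(raw_file.read_text()):
        writer.writerow([record["txn_id"], float(record["amount"]), record["type"].upper()])

print(cleansed_file.relative_to(lake).as_posix())
# cleansed/core_banking/transactions/ingest_date=2024-01-31/part-0000.csv
```

Keeping the raw file unchanged and writing cleaned output to a separate zone is what preserves the audit trail mentioned earlier.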

📝 Data Formats:

  • CSV, JSON, Avro, Parquet
  • Images, PDFs (e.g., cheque scans), logs

📦 Examples:

  • Log files from ATMs
  • Mobile app clickstream events
  • OCR from cheque images
  • Web scraping for credit intelligence

🏛️ 4. Data Warehouse in Banking

✅ When Data Warehouse is Used:

  • For business reporting, dashboards, regulatory reporting
  • For fast SQL queries across large volumes of data
  • When data is cleaned, modeled, and schema is defined
  • Used by Finance, Risk, Compliance, BI Teams

🔧 DW Technologies:

  • On-Prem: Teradata, Netezza, Oracle Exadata
  • Cloud: Snowflake, Redshift, Azure Synapse, BigQuery

🧱 Typical Design:

  • Star / Snowflake Schema
  • Facts + Dimensions (e.g., transactions fact, account dimension)
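
A star schema is easiest to see with a tiny example. The sketch below builds one fact table and one dimension in SQLite (used here only as a stand-in for a warehouse engine); the table names, columns, and sample rows are assumptions for illustration.

```python
import sqlite3

dw = sqlite3.connect(":memory:")
dw.executescript("""
    -- Dimension: one row per account, holding descriptive attributes.
    CREATE TABLE dim_account (
        account_key INTEGER PRIMARY KEY,
        account_no  TEXT,
        segment     TEXT
    );
    -- Fact: one row per transaction, holding measures plus keys into the dimensions.
    CREATE TABLE fact_transactions (
        txn_id      INTEGER PRIMARY KEY,
        account_key INTEGER REFERENCES dim_account(account_key),
        txn_date    TEXT,
        amount      REAL
    );
    INSERT INTO dim_account VALUES
        (1, 'ACC-001', 'Retail'),
        (2, 'ACC-002', 'Retail'),
        (3, 'ACC-003', 'Premium');
    INSERT INTO fact_transactions VALUES
        (1, 1, '2024-01-05', 100.0),
        (2, 2, '2024-01-06', 50.0),
        (3, 3, '2024-01-06', 900.0),
        (4, 3, '2024-02-01', 300.0);
""")

# Typical star-schema query: join the fact to a dimension, then aggregate by an attribute.
spend_by_segment = dw.execute("""
    SELECT d.segment, SUM(f.amount) AS total_spend
    FROM fact_transactions AS f
    JOIN dim_account AS d ON d.account_key = f.account_key
    GROUP BY d.segment
    ORDER BY total_spend DESC
""").fetchall()
print(spend_by_segment)  # [('Premium', 1200.0), ('Retail', 150.0)]
```

A fuller model would add more dimensions (date, customer, branch) around the same fact table, which is where the "star" shape comes from.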

🔁 Data Flow:

Daily Transactions → ETL → Data Warehouse (facts/dims) → Power BI / Tableau dashboards

📝 Example Queries:

  • “Show top 100 customers by loan amount”
  • “Find accounts inactive for 180 days”
  • “Credit card spend trend by customer segment”
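
Two of the example questions above translate into SQL roughly as follows. The sketch runs against SQLite with made-up `loans` and `accounts` tables and a fixed reporting date; it limits to the top 2 instead of 100 so the result fits on screen. Syntax differs between engines: Teradata SQL, for instance, would typically use `TOP n` rather than `LIMIT`, and has its own date arithmetic.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE loans (loan_id INTEGER PRIMARY KEY, customer_id INTEGER, loan_amount REAL);
    CREATE TABLE accounts (account_no TEXT PRIMARY KEY, customer_id INTEGER, last_txn_date TEXT);
    INSERT INTO loans VALUES (1, 101, 250000), (2, 102, 90000), (3, 101, 50000), (4, 103, 400000);
    INSERT INTO accounts VALUES
        ('ACC-001', 101, '2024-06-20'),
        ('ACC-002', 102, '2023-11-02'),
        ('ACC-003', 103, '2024-01-15');
""")

# "Show top N customers by loan amount" (N = 2 here; the text uses 100).
top_customers = db.execute("""
    SELECT customer_id, SUM(loan_amount) AS total_loans
    FROM loans
    GROUP BY customer_id
    ORDER BY total_loans DESC
    LIMIT ?
""", (2,)).fetchall()

# "Find accounts inactive for 180 days", measured against a fixed reporting date.
as_of = "2024-06-30"
inactive = db.execute("""
    SELECT account_no
    FROM accounts
    WHERE julianday(?) - julianday(last_txn_date) >= 180
    ORDER BY account_no
""", (as_of,)).fetchall()

print(top_customers)  # [(103, 400000.0), (101, 300000.0)]
print(inactive)       # [('ACC-002',)]
```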

🔄 5. Data Flow Overview in a Bank

+-------------+        +-----------+        +-----------+        +----------------+
| OLTP System | -----> | ETL Tools | -----> | Data Lake | -----> | Data Warehouse |
+-------------+        +-----------+        +-----------+        +----------------+
   (Oracle,          (Informatica, Talend)   (Azure ADLS, S3)     (Snowflake, Teradata)
    SQL Server)

       ↓                                       ↑            ↑
 Real-Time Kafka → Stream Processing (Spark, Flink) →      |
       ↓                                                    |
      ML Models, AI Apps, Risk Engine                     BI Tools
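
The real-time branch of the diagram can be illustrated without any streaming infrastructure. The sketch below applies a simple sliding-window "too many transactions on one card" rule to an in-memory list that stands in for a Kafka topic; the window length, the threshold, and the `detect_bursts` function are assumptions for illustration, and a real deployment would run similar logic inside Spark or Flink, often alongside an ML model.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60      # assumed look-back window
MAX_TXNS_IN_WINDOW = 3   # assumed threshold; more than this inside the window raises an alert


def detect_bursts(events):
    """Yield (card_id, timestamp) whenever a card exceeds the threshold inside the window.

    `events` is an iterable of (timestamp_seconds, card_id, amount) in time order,
    standing in for messages consumed from a stream.
    """
    recent = defaultdict(deque)  # card_id -> timestamps still inside the window
    for ts, card_id, _amount in events:
        window = recent[card_id]
        window.append(ts)
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()  # drop events that have aged out
        if len(window) > MAX_TXNS_IN_WINDOW:
            yield card_id, ts


stream = [
    (0, "card-A", 20.0),
    (5, "card-B", 15.0),
    (10, "card-A", 35.0),
    (20, "card-A", 50.0),
    (30, "card-A", 900.0),   # 4th card-A event within 60 s -> alert
    (200, "card-A", 10.0),   # earlier events have aged out -> no alert
]
alerts = list(detect_bursts(stream))
print(alerts)  # [('card-A', 30)]
```

Because the function only keeps a short per-card window in memory, it processes each event once as it arrives, which is the property that distinguishes this path from the end-of-day batch load.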

🧠 Common Tools in Each Layer

| Layer | Examples |
|---|---|
| OLTP DBs | Oracle, SQL Server, PostgreSQL, MySQL |
| ETL Tools | Informatica, Talend, SSIS, Apache NiFi |
| Data Lake | Hadoop HDFS, Azure ADLS, Amazon S3 |
| Data Warehouse | Teradata, Snowflake, Redshift, BigQuery |
| Streaming | Kafka, Spark Streaming, Flink |
| BI Tools | Power BI, Tableau, QlikView |
| ML/AI | Databricks, Python, SAS, H2O.ai |

📌 When to Use What?

| Scenario | Use |
|---|---|
| Raw data backup / ingest | ✅ Data Lake |
| Real-time fraud detection | ✅ Streaming + ML |
| Executive dashboards | ✅ Data Warehouse |
| Regulatory compliance reports | ✅ Data Warehouse |
| Data Science experimentation | ✅ Data Lake or Sandbox DW |
| Heavy structured SQL analytics | ✅ DW (e.g., Teradata, Snowflake) |
| Logs / semi-structured / image data | ✅ Data Lake |

✅ Summary: Roles of Databases, Lakes, and Warehouses in Banking

| Component | Role |
|---|---|
| OLTP (DBs) | Real-time transactions, high-frequency updates |
| ETL/ELT | Moves data from OLTP → DW / Data Lake |
| Data Lake | Raw storage, multi-format, used for ML, backup |
| Data Warehouse | Structured, optimized for analytics & reports |
