HintsToday
Hints and Answers for Everything
Recent Posts
- Date and Time Functions: PySpark DataFrames & PySpark SQL Queries
- Memory Management in PySpark: CPU Cores, Executors, Executor Memory
- Memory Management in PySpark: Scenarios 1 and 2
- Develop and Maintain CI/CD Pipelines Using GitHub for Automated Deployment and Version Control
- Complete Guide to Building and Managing Data Workflows in Azure Data Factory (ADF)
About
Author: lochan2014
RDD (Resilient Distributed Dataset) is the fundamental data structure in Apache Spark. It is an immutable, distributed collection of objects that can be processed in parallel across a cluster of machines.
Purpose of RDD / How RDD Is Beneficial
RDDs are the backbone of Apache Spark’s distributed computing capabilities. They enable scalable, fault-tolerant, and efficient processing…
Big Data Lake: Data Storage
HDFS is a scalable storage solution designed to handle massive datasets across clusters of machines. Hive tables provide a structured approach for querying and analyzing data stored in HDFS. Understanding how these components work together is essential for effectively managing data in your BDL ecosystem.
HDFS – Hadoop Distributed File…
Ordered Guide to Big Data, Data Lakes, Data Warehouses & Lakehouses
1. The Modern Data Landscape — Bird’s‑Eye View
Every storage paradigm slots into this flow at the Storage layer, but each optimises different trade‑offs for the rest of the pipeline.
2. Foundations: What Is Big Data?
The 5 Vs:
- Volume: Petabytes+ generated continuously
- Velocity: Milliseconds‑level arrival & processing
- Variety: Structured, semi‑structured, unstructured
- Veracity: Data quality…