SAS FIRST. and LAST.: Syntax and uses with examples

In SAS, FIRST.variable and LAST.variable are temporary automatic variables that the DATA step creates for each variable named in a BY statement. They flag the first and last observations of each BY group, which makes them useful whenever you need to initialize, accumulate, or output something at a group boundary. The input data must be sorted (or indexed) by the BY variables, and the FIRST. and LAST. variables themselves are not written to the output dataset. Here’s how you can use FIRST. and LAST. with examples:

Syntax:

  • FIRST.variable: Set to 1 on the first observation in a BY group, and 0 otherwise.
  • LAST.variable: Set to 1 on the last observation in a BY group, and 0 otherwise.
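
Because the FIRST. and LAST. variables are dropped from the output, one way to see their values is to copy them into ordinary variables. The sketch below does this on a small made-up dataset; the names demo, demo_flags, Grp, Val, Is_First, and Is_Last are hypothetical and used only for illustration.

```sas
/* Hypothetical three-row dataset, already in Grp order */
DATA demo;
  INPUT Grp $ Val;
  DATALINES;
A 1
A 2
B 3
;
RUN;

DATA demo_flags;
  SET demo;
  BY Grp;                 /* creates FIRST.Grp and LAST.Grp */
  Is_First = FIRST.Grp;   /* copy the temporary flags so they are kept */
  Is_Last  = LAST.Grp;
RUN;

PROC PRINT DATA=demo_flags NOOBS;
RUN;
```

With this input, the two A rows get Is_First/Is_Last values of 1/0 and 0/1, and the single B row gets 1/1, since it is both the first and the last observation of its group.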

Example:

Suppose we have a dataset named sales with the following structure:

ID   Product   Revenue
1    A         100
2    A         150
3    B         200
4    B         180
5    B         220
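
If you want to follow along, the dataset can be built with a short DATA step. This is a minimal sketch that assumes Product is a one-character text variable and that ID and Revenue are numeric; the PROC SORT step is included because BY-group processing expects the data in Product order (this sample already is, so the sort leaves it unchanged).

```sas
/* Build the sample sales dataset from the table above */
DATA sales;
  INPUT ID Product $ Revenue;
  DATALINES;
1 A 100
2 A 150
3 B 200
4 B 180
5 B 220
;
RUN;

/* Sort by the BY variable before using BY Product in a DATA step */
PROC SORT DATA=sales;
  BY Product;
RUN;
```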

We want to calculate the total revenue for each product and write a single row per product, produced when the last observation of each product group is reached.

SAS Code:

/* KEEP= limits the output dataset to the two summary columns */
DATA sales_summary (KEEP=Product Total_Revenue);
SET sales; /* sales must already be sorted by Product */
BY Product;

/* Calculate total revenue for each product */
IF FIRST.Product THEN Total_Revenue = 0; /* Reset the total at the start of each product group */
Total_Revenue + Revenue; /* Sum statement: accumulates and retains the running total */

/* Write one row per product, when the group's last observation is reached */
IF LAST.Product THEN OUTPUT;
RUN;

Explanation:

  • The BY Product; statement tells the DATA step to process the data in groups defined by the Product variable and creates FIRST.Product and LAST.Product. The input must already be sorted by Product.
  • The IF FIRST.Product THEN Total_Revenue = 0; statement resets the Total_Revenue variable to 0 on the first observation of each product group.
  • The Total_Revenue + Revenue; statement is a sum statement: it adds Revenue to Total_Revenue and automatically retains Total_Revenue from one observation to the next, so the running total carries forward within the group.
  • The IF LAST.Product THEN OUTPUT; statement writes a row only on the last observation of each product group, when Total_Revenue holds the complete group total.
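
To watch the accumulation happen, it can help to drop the final IF ... THEN OUTPUT so that every observation is written. The following sketch assumes the sales dataset shown earlier, sorted by Product; sales_running, Running_Total, and Is_Last are hypothetical names chosen for this illustration.

```sas
DATA sales_running;
  SET sales;
  BY Product;
  IF FIRST.Product THEN Running_Total = 0;  /* reset at each group start */
  Running_Total + Revenue;                  /* retained running sum */
  Is_Last = LAST.Product;                   /* 1 marks the row holding the group total */
RUN;

PROC PRINT DATA=sales_running NOOBS;
RUN;
```

For the five sample rows, Running_Total takes the values 100, 250, 200, 380, and 600, and Is_Last is 1 on the rows with 250 and 600, which are exactly the totals that the summary step keeps.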

Output (sales_summary dataset):

Product   Total_Revenue
A         250
B         600

In this example, FIRST.Product marks where each product group starts, so the total can be reset, and LAST.Product marks where it ends, so the finished total can be written out. The same pattern lets you perform any operation or calculation that depends on the position of an observation within a BY group in SAS.
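
Another common use of the same flags is to pull out the first or last row of each group without computing anything. As a rough sketch, again assuming the sorted sales dataset from above (first_rows and last_rows are hypothetical output names):

```sas
/* Route the first and last observation of each product to separate datasets */
DATA first_rows last_rows;
  SET sales;
  BY Product;
  IF FIRST.Product THEN OUTPUT first_rows;
  IF LAST.Product  THEN OUTPUT last_rows;
RUN;
```

With the sample data, first_rows receives the observations with ID 1 and 3, and last_rows receives those with ID 2 and 5. A group with a single observation would land in both datasets, because FIRST. and LAST. are both 1 for that row.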

