Thursday, October 29, 2015

String extraction using SAS Scan and Substr functions

In my last blog we saw string extraction using perl function. Today we are going to extract and concatenate character string using SAS Scan and Substr functions.

Syntax:

     SCAN(argument,n,<delimiters>)

where
argument specifies the character variable or expression to scan.
n specifies which word to read.
delimiters are special characters that must be enclosed in single quotation marks (' '). If you do not specify delimiters, default delimiters are used. 

     SUBSTR(argument,position,<n>) 
where
argument specifies the character variable or expression to scan.
position is the character position to start from.
n specifies the number of characters to extract. If n is omitted, all remaining characters are included in the substring. 

Let us see an example:

In the dataset work.Company there are four columns Name, Age, Sex and SSN.
We wants to create a new data set employeeID with a column named ID which contains unique Identification number for the employees based on their last name and last four digits of their  SSN.

Company
g
Obs
Name
Age
Sex
SSN
1
Morrison, Michael
32
M

2
Rudelich, Herbert
39
M
129-42-9261
3
Vincent, Martina
34
F
074-53-9892
4
Benito, Gisela
32
F
228-88-9649
5
Sirignano, Emily
12
F
442-21-8075
6
Harbinger, Nicholas
36
M
446-93-2122
7
Phillipon, Marie-Odile
28
F
776-84-5391
8
Gunter, Thomas
27
M
929-75-0218
     



data employeeID;
 set company;
 ID = (scan(name, 1, ","))!!(substr(SSN,8,4));
 run;
 proc print data=employeeID;
 run;

Explanation:
scan(name, 1, ",")
name is the character variable to scan
1 is the position of the word to read
"," is the delimiter

substr(SSN,8,4)
SSN is the character variable to scan.
8 is the character position to start from.
4 specifies the number of characters to extract


Result
Obs
Name
Age
Sex
SSN
ID
1
Morrison, Michael
32
M

Morrison
2
Rudelich, Herbert
39
M
129-42-9261
Rudelich9261
3
Vincent, Martina
34
F
074-53-9892
Vincent9892
4
Benito, Gisela
32
F
228-88-9649
Benito9649
5
Sirignano, Emily
12
F
442-21-8075
Sirignano8075
6
Harbinger, Nicholas
36
M
446-93-2122
Harbinger2122
7
Phillipon, Marie-Odile
28
F
776-84-5391
Phillipon5391
8
Gunter, Thomas
27
M
929-75-0218
Gunter0218


































































No comments:

Post a Comment