Thursday, October 29, 2015

String Extraction Using Perl Regular Expressions (PRX) in SAS

The PRXMATCH function searches source for a match to perl-regular-expression and returns the position at which the matched string begins. If there is no match, PRXMATCH returns zero.
Syntax: PRXMATCH(perl-regular-expression, source)
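
To illustrate the two possible outcomes, here is a minimal sketch. The sample strings and the variable names pos_hit and pos_miss are made up for illustration and are not part of the company data used later in this post.

```sas
data _null_;
   /* Hypothetical sample strings, chosen only to show the return values */
   pos_hit  = prxmatch('/cat/', 'black cat');   /* pattern is present: match starts at column 7 */
   pos_miss = prxmatch('/dog/', 'black cat');   /* pattern is absent: PRXMATCH returns 0 */
   put pos_hit= pos_miss=;                      /* write both values to the log */
run;
```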


Suppose we have a list of companies and their addresses:

7‑Eleven, Inc.community relations department dallas, TX 75221-0711
20 Century Fox 10201 pico blvd, los angeles, CA 90064
APACHE prime four business park kingswells scotland, United Kingdom

We wish to extract the names of the companies. The company names start with either upper-case letters or numbers, while the addresses are in lower case. To extract the names, we use the PRXMATCH function.

Explanation:
prxmatch('/\b[a-z]\w*\b/', source)
In this example we are using forward slashes (/) as Perl delimiters.
\b is a word boundary (the position between a word character and a non-word character, or the start or end of the string)
[a-z] matches a lower-case letter
\w matches any word character (upper- and lower-case letters, digits and underscore)
* matches the previous subexpression zero or more times
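
The post does not show the extraction step itself, so the following is only a sketch of one way it could be done. It assumes the three address lines are read from in-line data and that the company name is everything before the position PRXMATCH returns. Pairing PRXMATCH with SUBSTR, as well as the names companies, line, company and pos, are assumptions for illustration. The sample data uses a plain ASCII hyphen in "7-Eleven".

```sas
data companies;
   infile datalines truncover;
   input line $char80.;          /* read the whole record, embedded blanks included */
   length company $ 50;

   /* Position of the first word that starts with a lower-case letter */
   pos = prxmatch('/\b[a-z]\w*\b/', line);

   /* Assumption: the text before that position is the company name */
   if pos > 1 then company = substr(line, 1, pos - 1);
   datalines;
7-Eleven, Inc.community relations department dallas, TX 75221-0711
20 Century Fox 10201 pico blvd, los angeles, CA 90064
APACHE prime four business park kingswells scotland, United Kingdom
;
run;

proc print data=companies;
   var company pos;
run;
```

With this approach, the street number 10201 in the second record sits before the first lower-case word, so it would stay attached to the extracted name. A record with no lower-case word at all would return pos = 0 and leave company blank.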


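As can be seen in the result, the PRXMATCH function helped us extract the names of the companies: it returns the position of the first lower-case word, and the text before that position is the company name.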
