Transcript

Standardize School Address

Problems we have:

  • Potential extra blanks between words
  • Some addresses use 'Street', 'Avenue', and 'Road', and some use the abbreviations 'St.', 'Ave.', and 'Rd.'
  • Uppercase and lowercase are mixed together

How should we standardize school names using functions?

TRANWRD function

TRANWRD(source, target, replacement)
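A minimal sketch of how TRANWRD could be applied to the address problem above. The data set name, the variable name ADDRESS, and the sample rows are assumptions made for illustration; they are not from the slides.

```sas
/* Hypothetical sample data: names and rows are illustrative only. */
data schools;
   infile datalines truncover;
   input address $50.;
   datalines;
123 Main Street
45 Oak Avenue
9 Mill Road
77 Elm St.
;
run;

data schools_std;
   set schools;
   length address_std $50;
   /* Replace each full word with its abbreviation, one call per word. */
   address_std = tranwrd(address, 'Street', 'St.');
   address_std = tranwrd(address_std, 'Avenue', 'Ave.');
   address_std = tranwrd(address_std, 'Road', 'Rd.');
run;

proc print data=schools_std;
run;
```

TRANWRD is case-sensitive and replaces every matching substring, so standardizing the case first and checking for words that merely contain the target (for example 'Streetcar') is worth considering.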

Standardize School Name

Validate School QEDID

  • Uppercase and lowercase are mixed together
  • Potential extra blanks between words

How to fix the school names?

We need to verify that only the values 'MDR', 'PSS', '_', and numbers are valid data values for the variable QEDID.

VERIFY function

VERIFY (Source, valid-value)
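A small sketch of a VERIFY-based check, assuming a QEDID looks like a source prefix, an underscore, and a number (for example 'MDR_10234'). That layout and the sample rows are assumptions, not taken from the slides.

```sas
/* Hypothetical QEDID values; the layout is assumed for illustration. */
data qed;
   infile datalines truncover;
   input qedid $20.;
   datalines;
MDR_10234
PSS_00871
XYZ_55510
MDR-10234
;
run;

data qed_check;
   set qed;
   /* VERIFY returns 0 when every character of the first argument   */
   /* appears in the second argument; otherwise it returns the      */
   /* position of the first character that does not. TRIM keeps     */
   /* trailing blanks from being reported as invalid characters.    */
   bad_pos = verify(trim(qedid), 'MDRPS_0123456789');
   valid_chars = (bad_pos = 0);
run;

proc print data=qed_check;
run;
```

VERIFY checks individual characters rather than whole values, so a string such as 'SSS_1' would still pass. Confirming that the prefix is exactly 'MDR' or 'PSS' would need an extra step.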

  • Convert multiple blanks to a single blank

COMPBL function

COMPBL (Argument)
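As a quick illustration, COMPBL can be tried on a single made-up school name (the value below is hypothetical):

```sas
data _null_;
   name  = 'Lincoln    Elementary   School';   /* hypothetical value */
   /* Each run of two or more blanks becomes a single blank. */
   fixed = compbl(name);
   put fixed=;   /* writes fixed=Lincoln Elementary School to the log */
run;
```

COMPBL reduces leading blanks to one blank rather than removing them, so it is often paired with STRIP or LEFT.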

  • Convert strings to proper case

PROPCASE function

PROPCASE (Argument)
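A minimal sketch that combines PROPCASE with COMPBL to address both school-name problems at once. The sample value is made up for illustration.

```sas
data _null_;
   name  = 'LINCOLN   elementary  SCHOOL';   /* hypothetical value */
   /* Collapse repeated blanks first, then capitalize each word. */
   fixed = propcase(compbl(name));
   put fixed=;   /* writes fixed=Lincoln Elementary School to the log */
run;
```

PROPCASE lowercases everything after the first letter of each word, so acronyms and names such as 'McDonald' may need a follow-up correction.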

How to approach a dataset

Learning Objectives

Parsing a string

  • What SAS functions are and when to use them
  • Categories of SAS functions
  • Applications of functions
  • Q&A

Parsing a string means to take it apart based on some rules.

  • SUBSTR function
  • ANYALPHA function
  • ANYDIGIT function
  • COMPRESS function
  • SCAN function
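To illustrate how some of the functions listed above can work together, here is a small sketch that locates the first letter and the first digit in a string and then uses SUBSTR to take everything from the first digit onward. The sample value is an assumption.

```sas
data _null_;
   code = 'MDR_10234';                /* hypothetical value */
   first_letter = anyalpha(code);     /* position of the first letter, 0 if none */
   first_digit  = anydigit(code);     /* position of the first digit, 0 if none  */
   /* Guard against 0, which is not a valid start position for SUBSTR. */
   if first_digit > 0 then number_part = substr(code, first_digit);
   put first_letter= first_digit= number_part=;
   /* log: first_letter=1 first_digit=5 number_part=10234 */
run;
```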

  • Data structure/Key variables
  • Which variables need to be standardized for further use
  • Which variables need to be validated

Functions vs. Procedures

What is the fundamental difference between the SUM and MEAN functions and PROC MEANS?

Functions

Procedures
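One way to see the difference is the sketch below, which uses made-up test scores: the SUM and MEAN functions work across variables within a single observation (a row), while PROC MEANS summarizes each variable down the observations (a column). Data set and variable names are assumptions.

```sas
/* Hypothetical scores; names and values are for illustration only. */
data scores;
   input id test1 test2 test3;
   /* Functions: computed once per observation, across variables. */
   row_mean = mean(test1, test2, test3);
   row_sum  = sum(test1, test2, test3);
   datalines;
1 80 90 100
2 70 . 90
;
run;

/* Procedure: computed once per variable, across observations. */
proc means data=scores n mean sum;
   var test1 test2 test3;
run;
```

Both approaches ignore missing values, so the second observation's row_mean is based on two scores.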

SCAN FUNCTION

Extract Source info from School QEDID

Categories of SAS functions

SCAN(Source, count, delimiter)
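A short sketch of SCAN applied to a QEDID, assuming the value consists of a source code and a number separated by an underscore. The layout and the sample value are assumptions.

```sas
data _null_;
   qedid  = 'MDR_10234';            /* hypothetical layout: source, '_', number */
   source = scan(qedid, 1, '_');    /* first piece before the delimiter */
   number = scan(qedid, 2, '_');    /* second piece after the delimiter */
   put source= number=;             /* log: source=MDR number=10234 */
run;
```

Because SCAN splits on the delimiter rather than on fixed positions, it still works when the pieces vary in length.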

Functions can improve efficiency

SAS has more than 500 functions that belong to more than 30 categories, covering a variety of programming tasks.

  • Arithmetic/Mathematical/Probability/Financial
  • Array/Random number
  • Character Handling
  • Date and Time

The most common way to extract part of a string is to use the SUBSTR function.

  • Why is the SUBSTR function not that helpful for School QEDID extraction?
  • When is the SUBSTR function a good choice?

We will create an age group variable for the dataset admit. Ages 20 to 29 go into group 20, ages 30 to 39 into group 30, ages 40 to 49 into group 40, and so on. How should we program it?

  • Can we use SCAN to extract QEDID?
  • How to extract the names from the contact variable?
  • SCAN is more powerful when you use it with a DO loop
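A minimal sketch of SCAN inside a DO loop, pulling each name out of a contact string. The comma-separated layout of the contact variable and the names themselves are assumptions; the real variable may be structured differently.

```sas
data _null_;
   contact = 'Ann Lee, Bob Ray, Cy Day';   /* hypothetical value */
   /* COUNTW gives the number of comma-delimited pieces to loop over. */
   do i = 1 to countw(contact, ',');
      /* STRIP removes the blank that follows each comma. */
      name = strip(scan(contact, i, ','));
      put i= name=;
   end;
run;
```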

Old-fashioned way

Function way
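The two approaches to the age group problem might look like the sketch below. The slides do not show the actual code, so this is only one possible version. The data set admit_sample is a hypothetical stand-in for admit, and the variable names are assumptions.

```sas
/* Hypothetical stand-in for the admit data set. */
data admit_sample;
   input age;
   datalines;
22
29
30
47
;
run;

data admit_grouped;
   set admit_sample;
   /* Old-fashioned way: one condition per group. */
   if 20 <= age <= 29 then agegrp_if = 20;
   else if 30 <= age <= 39 then agegrp_if = 30;
   else if 40 <= age <= 49 then agegrp_if = 40;
   /* Function way: INT drops the fraction, so the last digit is removed. */
   agegrp_fn = int(age / 10) * 10;
run;

proc print data=admit_grouped;
run;
```

The function version covers every decade in one statement, while the IF/ELSE version needs a new line for each additional group.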

COMPRESS function

COMPRESS(Source,'take-out','Modifier')
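A brief sketch of COMPRESS with and without modifiers, using a made-up QEDID value. The 'k' modifier keeps the listed characters instead of removing them, 'd' adds digits to the list, and 'a' adds letters.

```sas
data _null_;
   qedid = 'MDR_10234';                        /* hypothetical value */
   no_underscore = compress(qedid, '_');       /* remove the underscore */
   digits_only   = compress(qedid, , 'kd');    /* keep only digits      */
   letters_only  = compress(qedid, , 'ka');    /* keep only letters     */
   put no_underscore= digits_only= letters_only=;
   /* log: no_underscore=MDR10234 digits_only=10234 letters_only=MDR */
run;
```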

ROUND(argument, rounding-unit)
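A quick illustration of how the rounding unit changes the result. The values are arbitrary examples.

```sas
data _null_;
   a = round(3.14159, 0.01);   /* nearest hundredth      */
   b = round(27, 10);          /* nearest multiple of 10 */
   c = round(25, 10);          /* a tie rounds upward    */
   put a= b= c=;               /* log: a=3.14 b=30 c=30  */
run;
```

ROUND returns the nearest multiple of the rounding unit, not the lower one, so an age of 27 rounds to 30 rather than 20.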

Introduction to SAS Functions

Wen Song
