How to Use SAS to Count Word Frequency in an Essay

The SCAN function in SAS

The SCAN function in SAS provides a simple and convenient way to parse out words from character strings. The SCAN function can be used to select individual words from text or variables which contain text and then store those words in new variables. This article provides a number of different examples and uses for the SCAN function, including some of the most commonly used options to help you get the most from this function.

 In particular, this article will cover:

  1. Selecting the nth word in a character string.
  2. Selecting the last word in a character string.
  3. Handling different word delimiters.
  4. Using SCAN with DO loops to parse long character strings.

Software

Before we continue, make sure you have access to SAS Studio. It's free!

Data Sets

In this article, the CARS and BASEBALL datasets from the SASHELP library will be used to illustrate a number of different uses for the SCAN function.

Selecting the Nth Word in a Character String

One of the simplest operations you can perform with the SCAN function is to find the nth word in a character string.

Let's start with an example to demonstrate how to find the first word in a character string and then store the result in a separate variable. The most basic use of the SCAN function requires only two arguments. After specifying SCAN and an open parenthesis, the first argument is the character string that you are planning to select words from. This can be either a variable or an explicit character string. In this first example we are using the explicit character string, "I am a SAS Programming Expert".

The second argument is the count, which is the numeric position of the word within the character string that you want to select. So, to return the first word, we can explicitly specify the number 1. This could also be replaced with a variable containing the desired count value.

The SAS syntax is as follows:

data example;
 first_word = scan( "I am a SAS Programming Expert",1 );
run;

After this step runs, the new variable FIRST_WORD has been created and its value is the first word, "I", from the character string "I am a SAS Programming Expert".

In most cases, the character string that you would like to select words from is contained in a variable itself. In this next example, the variable TEXT contains the character string "I am a SAS Programming Expert" and again we would like to extract the first word from the string. To do this, we simply replace the explicit character string in quotes with the variable name TEXT. The count, 1, stays the same as in the previous example since we are still interested in the first word:

data example;
 text = "I am a SAS Programming Expert";
 first_word = scan(text, 1);   /* first word of TEXT */
run;

The output dataset now contains both the original TEXT variable and the newly created FIRST_WORD variable, which contains the first word from the TEXT variable, "I".

To select additional words, such as the second, third and fourth word, we can modify the count argument of the SCAN function. To select the second word from a string, simply set the count argument to 2. For the third word, set the count equal to 3, and so on.

 In the following example, we create 3 additional variables, SECOND_WORD, THIRD_WORD and FOURTH_WORD, which select the second, third and fourth word respectively from the TEXT variable:

data example;
 text = "I am a SAS programming expert";
 first_word = scan(text,1);
 second_word = scan(text,2);
 third_word = scan(text,3);
 fourth_word = scan(text,4);
run;

The output data now has 5 variables: the original TEXT variable as well as the first through fourth words, each in separate variables.
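
As mentioned earlier, the count does not have to be a literal number. The step below is a minimal sketch of that idea: it stores the word position in a variable and passes the variable to SCAN. The names N and NTH_WORD are hypothetical, chosen only for this illustration.

```sas
data example;
 text = "I am a SAS Programming Expert";
 n = 4;                      /* hypothetical variable holding the word position */
 nth_word = scan(text, n);   /* count supplied by a variable, not a literal */
 put nth_word=;              /* write the result to the SAS log */
run;
```

With N set to 4, NTH_WORD should contain "SAS". Changing the value of N selects a different word without touching the SCAN call itself.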

Selecting the Last Word in a Character String

Using the SCAN function, you also have the ability to read from right to left, effectively allowing you to capture the last word in a character string.

To tell SAS to read from right to left, we simply change the count argument to a negative number to indicate the word number that we would like to read, starting from the right and moving left. So, to select the word "Expert" in our TEXT variable, we can use a count of -1, as shown here:

data example;
 text = "I am a SAS Programming Expert";
 last_word = scan(text, -1);   /* negative count reads from the right */
run;

The output data now has a new variable, LAST_WORD, which contains the last word of the text string, "Expert".

Alternatively, instead of using a negative count you can use the "b" modifier available with the SCAN function. By specifying the "b" modifier, you tell SAS to read from right to left instead of the default left to right. Note that a modifier must be the fourth argument, so you must always explicitly state the third argument (the delimiter) together with the modifier so that SAS won't treat your modifier as the delimiter!

Here is the syntax with the "b" modifier included:

data example;
 text = "I am a SAS Programming Expert";
 last_word = scan(text, 1, " ", "b");   /* "b" scans backward, so word 1 is the last word */
run;

The resulting output dataset shows the same result as before: the last word, "Expert", has been captured in the LAST_WORD variable.

Note that there are many other modifiers for the SCAN function to assist with special cases. These modifiers can be found in the SAS documentation.
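
To compare the two right-to-left approaches side by side, here is a small sketch. The variable names SECOND_TO_LAST and LAST_WORD_B are hypothetical and are used only for this illustration.

```sas
data example;
 text = "I am a SAS Programming Expert";
 last_word      = scan(text, -1);           /* last word, via a negative count */
 second_to_last = scan(text, -2);           /* second word counting from the right */
 last_word_b    = scan(text, 1, " ", "b");  /* last word, via the "b" modifier */
 put last_word= second_to_last= last_word_b=;
run;
```

LAST_WORD and LAST_WORD_B should both contain "Expert", and SECOND_TO_LAST should contain "Programming". The negative count is usually the shorter option, since it does not require spelling out the delimiter argument.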

Handling Different Word Delimiters

So far, the examples we have looked at have only had blanks or spaces as the delimiter between words. What happens when there is a different delimiter, such as a comma?

In the example below, the code has been modified so that the words in the character string of the TEXT variable are delimited with commas instead of spaces. Here, we are trying to select the fourth word:

data example;
 text = "I,am,a,SAS,Programming,Expert";
 fourth_word = scan(text, 4);   /* comma is one of the default delimiters */
run;

The SCAN function still works even with commas as the delimiter: FOURTH_WORD contains "SAS".

The reason this still works is that by default, on any computer using ASCII characters, the SCAN function automatically treats any of the following characters as delimiters:

blank ! $ % & ( ) * + , - . / ; < ^ |
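
Because every character in that list acts as a delimiter at the same time, a string that mixes several of them is still split word by word. The sketch below uses a made-up string that combines a blank, a comma, a hyphen, a period and a semicolon:

```sas
data example;
 /* illustrative string mixing several default delimiters */
 text = "I am,a-SAS.Programming;Expert";
 fourth_word = scan(text, 4);   /* no charlist given, so the defaults apply */
 put fourth_word=;
run;
```

FOURTH_WORD should contain "SAS", since the blank, comma and hyphen before it each end a word.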

When your data contains a delimiter between words that is not found in the default list, you can use the charlist argument (the third argument) with the SCAN function to specify your own custom delimiter.

For example, if the words in your character string are delimited with a plus sign (+), you simply need to enclose the plus sign in quotation marks as the third argument to the SCAN function.

The syntax below demonstrates how to select the fifth word from a plus sign delimited character string:

data example;
 text = "I+am+a+SAS+Programming+Expert";
 fifth_word = scan(text, 5, "+");   /* only the plus sign is a delimiter */
run;

The fifth word in the string, "Programming", has been successfully selected.
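
The charlist argument matters most when the delimiter is not in the default list at all. As a hedged sketch, the step below uses an underscore, which does not appear in the default ASCII list shown earlier. The variable names WITHOUT_CHARLIST and WITH_CHARLIST are hypothetical.

```sas
data example;
 text = "I_am_a_SAS_Programming_Expert";
 without_charlist = scan(text, 5);       /* no default delimiter found: whole string is word 1 */
 with_charlist    = scan(text, 5, "_");  /* underscore given explicitly as the delimiter */
 put without_charlist= with_charlist=;
run;
```

WITHOUT_CHARLIST should come back blank, because the string contains only one "word" under the default delimiters, while WITH_CHARLIST should contain "Programming".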

In some cases, you may also want to force SAS to use only one of the default delimiters. By default, SAS will use not just one but all of the delimiters in the default list. This can become problematic in certain cases when your data contains multiple delimiters.

In the SASHELP.BASEBALL dataset, the NAME variable contains a list of first, last and middle names. The structure is as follows: <last name>,<firstname><blank><middlename>. You would like to create two new variables: LASTNAME and GIVEN_NAMES.

Since commas and spaces are default delimiters, we start without specifying our own delimiter:

data baseball;
 set sashelp.baseball;

 lastname = scan(name,1);
 given_names = scan(name,2);

 keep name given_names lastname;
run;

At first glance it may appear as though the results are correct, but after further inspection you will find that some names were not parsed properly. For example, Andy Van Slyke's given name should have been "Andy" and not "Slyke".

To correct this, we can tell SAS to use only the comma as a delimiter so that "Van Slyke" will become the last name and "Andy" will be the given name:

data baseball;
 set sashelp.baseball;

 lastname = scan(name, 1, ",");      /* everything before the comma */
 given_names = scan(name, 2, ",");   /* everything after the comma */

 keep name given_names lastname;
run;

Now that the blanks are no longer considered delimiters and only the commas are, we get the desired result in our output data, with "Andy" now in the GIVEN_NAMES variable and "Van Slyke" in the LASTNAME variable.
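
One side effect of using only the comma as the delimiter is that any blank that follows the comma stays attached to the second word. The sketch below assumes a value with a blank after the comma (check this against your own data) and uses the STRIP function to remove it. The variable names GIVEN_RAW and GIVEN_CLEAN are hypothetical.

```sas
data example;
 /* assumption: a blank follows the comma in the name value */
 name = "Van Slyke, Andy";
 lastname    = scan(name, 1, ",");          /* Van Slyke */
 given_raw   = scan(name, 2, ",");          /* keeps the leading blank after the comma */
 given_clean = strip(scan(name, 2, ","));   /* leading and trailing blanks removed */
run;
```

If your values have no blank after the comma, STRIP simply leaves the result unchanged, so adding it is harmless.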

Using SCAN with DO Loops to Parse Long Character Strings

When combined with a simple DO loop and a SAS array, the SCAN function makes it easy to parse out each word from a character string into separate variables.

For example, in the SASHELP.CARS dataset, you would like to parse out each word from the MODEL variable into 5 separate variables. The words of the full model name are delimited by spaces, so the code below passes a comma and a blank as the delimiter list. This keeps other characters from the default list, such as a period or a slash inside a word, from being treated as delimiters.

The code below uses a DO loop to scan the MODEL variable and create the variables MODEL1 to MODEL5:

data cars_parse;
 set sashelp.cars;

 /* character array of 5 new variables, each 15 characters long */
 array modelname[5] $15 model1-model5;
 do i = 1 to 5;
  modelname[i] = scan(model, i, ", ");
 end;

 keep model model1-model5;
run;

We now have 5 new MODEL variables, with one word per variable. For model names with fewer than 5 words, the remaining variables are left blank.
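
The loop above assumes a fixed maximum of 5 words. When the number of words is not known in advance, one possible approach is to ask the COUNTW function how many words the string contains and write one row per word. This is a sketch rather than part of the original example; the dataset name WORDS and the variable name WORD are hypothetical.

```sas
data words;
 text = "I am a SAS Programming Expert";
 length word $ 20;                  /* set a length before SCAN assigns the value */
 do i = 1 to countw(text, " ");     /* number of blank-delimited words */
  word = scan(text, i, " ");
  output;                           /* one row per word */
 end;
 keep i word;
run;
```

For this string the step should write six rows. A long layout like this, with one word per row, is also a convenient starting point for counting how often each word occurs, for example with PROC FREQ.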


Source: https://sascrunch.com/scan-function/

Posted by: lewisplar1972.blogspot.com
