Listing data and basic command syntax in Stata

1. Command syntax

This chapter gives a basic lesson on Stata’s command syntax while showing how to control the appearance of a data list.

As we have seen throughout this manual, you have a choice between using menus and dialogs and using the Command window. Although many find the menus more natural and the Command window baffling at first, some practice makes working with the Command window often much faster than using menus and dialogs. The Command window can become a faster way of working because of the clean and regular syntax of Stata commands. We will cover enough to get you started; help language has more information and examples, and [U] 11 Language syntax has all the details.

The syntax for the list command can be seen by typing help list:

list [varlist] [if] [in] [, options]

Here is how to read this syntax:

  • Anything inside square brackets is optional. For the list command,
    1. varlist is optional. A varlist is a list of variable names.
    2. if is optional. The if qualifier restricts the command to run only on those observations for which the qualifier is true. We saw examples of this in [GSW] 6 Using the Data Editor.
    3. in is optional. The in qualifier restricts the command to run on particular observation numbers.
    4. , and options are optional. options are separated from the rest of the command by a comma.
  • Optional pieces do not preclude one another unless explicitly stated. For the list command, it is possible to use a varlist with if and in.
  • If a part of a word is underlined, the underlined part is the minimum abbreviation. Any abbreviation at least this long is acceptable.
    1. In the help file, the l in list is underlined, so l, li, and lis are all equivalent to list.
  • Anything not inside square brackets is required. For the list command, only the command itself is required.
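
As a rough sketch of how the pieces of the syntax diagram combine, the commands below use the auto dataset that ships with Stata. That dataset is an assumption made here so the example is self-contained; the chapter itself works with afewcarslab.dta.

```stata
* Load Stata's bundled example dataset (assumed here for illustration)
sysuse auto, clear

* The command alone: every optional piece is omitted
list

* A varlist only
list make mpg price

* A varlist combined with if, in, and an option after the comma
list make mpg price if mpg > 25 in 1/20, clean

* Minimum abbreviation: l is the same command as list
l make mpg in 1/5
```

Each optional piece can be added or dropped independently of the others, which is what the square brackets in the syntax diagram express.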

Keeping these rules in mind, let’s investigate how list behaves when called with different arguments. We will be using the dataset afewcarslab.dta from the end of the previous chapter.

2. List with a variable list

Variable lists (or varlists) can be specified in a variety of ways, all designed to save typing and encourage good variable names.

  • The varlist is optional for list. This means that if no variables are specified, it is equivalent to specifying all variables. Another way to think of it is that the default behavior of the command is to run on all variables unless restricted by a varlist.
  • You can list a subset of variables explicitly, as in list make mpg price.
  • There are also many shorthand notations:
    • m* means all variables starting with m.
    • price-weight means all variables from price through weight in the dataset order.
    • ma?e means all variables starting with ma, followed by any character, and ending in e.
  • You can list a variable by using an abbreviation unique to that variable, as in list gear_r~o. If the abbreviation is not unique, Stata returns an error message.
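
The varlist shorthands above might be tried as follows. This is only a sketch that assumes the auto dataset shipped with Stata, whose variables are ordered make, price, mpg, rep78, headroom, trunk, weight, and so on; the in 1/5 qualifier is added just to keep the output short.

```stata
* Bundled example dataset (an assumption; not the chapter's afewcarslab.dta)
sysuse auto, clear

* Explicit subset of variables
list make mpg price in 1/5

* Wildcard: all variables starting with m (here make and mpg)
list m* in 1/5

* Range: price through weight in dataset order
list price-weight in 1/5

* Single-character wildcard: matches make
list ma?e in 1/5

* Abbreviation unique to one variable: gear_ratio
list gear_r~o in 1/5

* m is not unique (make, mpg), so Stata reports an error;
* capture noisily shows the message without stopping a do-file
capture noisily list m in 1/5
```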

3. List with if

The if qualifier uses a logical expression to determine which observations to use. If the expression is true, the observation is used in the command; otherwise, it is skipped. The operators whose results are either true or false are

<     less than
<=    less than or equal
==    equal
>     greater than
>=    greater than or equal
!=    not equal
&     and
|     or
!     not (logical negation)
()    parentheses, for grouping to specify order of evaluation

In the logical expressions, & is evaluated before | (similar to multiplication before addition in arithmetic). You can use this in your expressions, but it is often better to use parentheses to ensure that the expressions are evaluated in the proper order. See [U] 13.2 Operators for complete details.
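
To illustrate the precedence rule, here is a small sketch using the auto dataset shipped with Stata (an assumption; the thresholds are arbitrary values chosen for illustration).

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* & is evaluated before |, so these two commands select the same observations
list make mpg price foreign if foreign == 1 | mpg > 30 & price < 5000
list make mpg price foreign if foreign == 1 | (mpg > 30 & price < 5000)

* Moving the parentheses changes the grouping and, in general, the selection
list make mpg price foreign if (foreign == 1 | mpg > 30) & price < 5000
```

Writing the parentheses explicitly, as in the second command, costs little and makes the intended grouping visible to the reader.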

Listings that use if also show that Stata treats missing numeric values as larger than any number, so take care when the if qualifier is applied to a variable with missing values. See [GSW] 6 Using the Data Editor.
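
The effect of missing values on an if qualifier can be seen in a short sketch. It assumes the auto dataset shipped with Stata, in which rep78 has some missing values.

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* Missing values count as larger than any number,
* so this also lists the cars whose rep78 is missing
list make rep78 if rep78 > 4

* Excluding missing values explicitly
list make rep78 if rep78 > 4 & !missing(rep78)
```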

4. List with if, common mistakes

Here is a series of listings with common errors and their corrections. See if you can find the errors before reading the correct entry.

The error arises because “equal” is expressed by ==, not by =. Corrected, it becomes
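
The original listing is not reproduced here. A sketch of the mistake and its correction, assuming the auto dataset shipped with Stata, might look like this:

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* Mistake: a single = is not a comparison, so Stata reports an error
* (capture noisily displays the message without stopping a do-file)
capture noisily list make mpg if mpg = 21

* Correction: == tests for equality
list make mpg if mpg == 21
```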

Other common errors with logic:

Joint tests are specified with &, not with the word and or multiple ifs. The if qualifier should be if mpg==21 & weight>4000, not if mpg==21 if weight>4000. Here is its correction:
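
As a sketch of these two mistakes and the corrected command, again assuming the auto dataset shipped with Stata:

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* Mistake: two if qualifiers in one command
capture noisily list make mpg weight if mpg == 21 if weight > 4000

* Mistake: the word and instead of the & operator
capture noisily list make mpg weight if mpg == 21 and weight > 4000

* Correction: one if qualifier, conditions joined with &
list make mpg weight if mpg == 21 & weight > 4000
```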

A problem with string variables:

Strings must be in double quotes, as in make=="Datsun 510". Without the quotes, Stata thinks that Datsun is a variable that it cannot find. Here is the correction:
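
A sketch of the unquoted and quoted forms, assuming the auto dataset shipped with Stata (which contains a car whose make is Datsun 510):

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* Mistake: without quotes, Datsun is read as a variable name
capture noisily list if make == Datsun 510

* Correction: string values go in straight double quotes
list if make == "Datsun 510"
```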

Confusing value labels with strings:

Value labels look like strings, but the underlying variable is numeric. Variable foreign takes on values 0 and 1 but has the value label that attaches 0 to “Domestic” and 1 to “Foreign” (see [GSW] 9 Labeling data). To see the underlying numeric values of variables with labeled values, use the label list command (see [D] label), or investigate the variable with codebook varname. We can correct the error here by looking for observations where foreign==0.

There is a second construction that also allows the use of the value label directly.
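
Both corrections can be sketched as follows. The example assumes the auto dataset shipped with Stata, where the value label attached to foreign is named origin; in another dataset, the label name would need to be checked first (for instance with describe or codebook).

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* Mistake: foreign is numeric, so comparing it with a string is an error
capture noisily list make foreign if foreign == "Domestic"

* Inspect the mapping between numeric values and labels
label list origin

* Correction 1: compare against the underlying numeric value
list make foreign if foreign == 0

* Correction 2: refer to the value label directly as "text":labelname
list make foreign if foreign == "Domestic":origin
```

The second form reads more naturally but depends on knowing the name of the value label, not just the name of the variable.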

5. List with in

The in qualifier uses a numlist to give a range of observations that should be listed. numlists have the form of one number or first/last. Positive numbers count from the beginning of the dataset. Negative numbers count from the end of the dataset. Here are some examples:
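
The original listings are not reproduced here. A sketch of the forms described above, assuming the auto dataset shipped with Stata:

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* A single observation: the first, then the last
list make mpg in 1
list make mpg in -1

* A range counted from the beginning: observations 2 through 4
list make mpg in 2/4

* A range counted from the end: third-to-last through second-to-last
list make mpg in -3/-2
```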

6. Controlling the list output

The fine control over list output is exercised by specifying one or more options. You can use sepby() to separate observations by variable. abbreviate() specifies the minimum number of characters to abbreviate a variable name in the output. divider draws a vertical line between the variables in the list.

The separator() option draws a horizontal line at specified intervals. When not specified, it defaults to a value of 5.
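
These options might be tried as in the sketch below, which assumes the auto dataset shipped with Stata; the observation ranges and option values are arbitrary choices for illustration.

```stata
* Bundled example dataset (assumed for illustration)
sysuse auto, clear

* Horizontal line after every 3 observations instead of the default 5
list make mpg foreign in 1/10, separator(3)

* Horizontal line whenever the value of foreign changes
list make mpg foreign in 48/57, sepby(foreign)

* Allow longer variable names in the header and draw vertical dividers
list make gear_ratio displacement in 1/5, abbreviate(12) divider
```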

7. Break

If you want to interrupt a Stata command, click on the Break button.

It is always safe to click on the Break button. After you click on Break, the state of the system is the same as if you had never issued the original command.

Source: STATA (2021), Getting Started with Stata for Windows, Stata Press Publication.
