NA items in R data

The NA item is a special value in R and represents “Not Available”. Sometimes this is because data were genuinely not collected (and therefore really are missing). Other times it is because you have columns of unequal length and your data.frame is padded out (with NA) to make a rectangular object with all columns containing the same number of elements.
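As a minimal sketch of the padding case, the following builds a data.frame from two samples of unequal length. The vectors a and b and their values are made up for illustration; assigning a larger value to length() is one way (not the only way) to pad a vector with NA.

```r
# Two hypothetical samples of unequal length (values invented for illustration)
a <- c(2, 4, 3, 6, 2, 8)
b <- c(5, 7, 1, 9)

# Extending the length of a vector pads the new elements with NA
n <- max(length(a), length(b))
length(a) <- n
length(b) <- n

# Both columns now contain the same number of elements
dat <- data.frame(a, b)
dat

# TRUE marks the padded (NA) items in column b
is.na(dat$b)
```

The last two rows of column b hold NA, which is what you typically see after importing a spreadsheet whose columns have different lengths.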

The na.rm = TRUE argument tells some summary commands, e.g. sum() or mean(), to remove NA items before calculating the result:

x
[1] 2 4 3 6 2 8 NA NA
mean(x)
[1] NA
mean(x, na.rm = TRUE)
[1] 4.166667
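Several other summary commands accept the same argument. The short sketch below recreates x with the values shown above and tries a few of them; the choice of commands is only a sample.

```r
# Recreate the example vector, including the two NA items
x <- c(2, 4, 3, 6, 2, 8, NA, NA)

# Without na.rm the NA items propagate to the result
sum(x)                    # NA

# With na.rm = TRUE the NA items are dropped before calculating
sum(x, na.rm = TRUE)      # 25
median(x, na.rm = TRUE)   # 3.5
range(x, na.rm = TRUE)    # 2 8
```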

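However, this does not always work, because not every command accepts an na.rm argument. The length() command, for example, takes only one argument: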

length(x, na.rm = TRUE)
Error in length(x, na.rm = TRUE) :
2 arguments passed to 'length' which requires 1

In this case the na.omit() command can be used to strip out the NA items:

length(na.omit(x))
[1] 6

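The na.omit() command essentially removes all the NA items. For a vector this means the individual NA elements; for a data.frame it means every row that contains at least one NA.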
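A few related approaches are sketched here, using is.na() to count or filter the NA items and applying na.omit() to a small data.frame. The data.frame dat and its contents are hypothetical, made up only to show what happens to rows.

```r
# Recreate the example vector
x <- c(2, 4, 3, 6, 2, 8, NA, NA)

# is.na() gives a logical vector, so sum() counts the TRUE values
sum(is.na(x))     # 2 items are NA
sum(!is.na(x))    # 6 items are not NA

# Logical indexing keeps only the non-NA items
x[!is.na(x)]

# A hypothetical data.frame with NA items in different rows
dat <- data.frame(a = c(1, 2, NA, 4),
                  b = c("w", NA, "y", "z"))

# na.omit() drops every row that contains an NA in any column
na.omit(dat)
nrow(na.omit(dat))    # 2

# complete.cases() flags the rows that na.omit() would keep
complete.cases(dat)   # TRUE FALSE FALSE TRUE
```

Note that on a data.frame a single NA costs you the whole row, so it can be worth checking complete.cases() first to see how much data would be lost.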
