Count NA's in a data frame.

countNa computes the number of missing values in a data frame. It counts the number of missing values in each column, the number of rows in which a value in at least one column is missing, and the expected number of rows with at least one missing value (computed under the assumption that missingness is independent across columns). The number of rows left is also given. Optionally, the combinations of columns reaching the highest joint missingness are also reported (if combColCount > 1 and d has at least 2 columns).

countNa(d, sort = TRUE, decreasing = FALSE, combColCount = 3)

Arguments

d	a data frame
sort	sort the columns of 'd' by the number of missing values?
decreasing	if sorting by the number of missing values, should the sort be decreasing or increasing?
combColCount	maximum number of columns to combine when finding a combination of columns reaching the highest number of missings

Value

A data frame (or a list of two data frames) describing the missingness. The rows of the first data frame describe the missingness in individual columns, plus the missingness in the combination of all columns (in a row called 'any'), plus the expected missingness (in a row called 'expected'). The second data frame (if requested) describes the combinations of at most combColCount columns of d reaching the highest joint missingness.

Examples

d<-data.frame(x1=1,x2=2,x3=1:4,y=c(1,NA,2,NA),z=c(NaN,NaN,3,4))
d
#>   x1 x2 x3  y   z
#> 1  1  2  1  1 NaN
#> 2  1  2  2 NA NaN
#> 3  1  2  3  2   3
#> 4  1  2  4 NA   4
countNa(d)
#> $columns
#>      names   missing missingPercent     left leftPercent
#> 1       x1 0.0000000        0.00000 4.000000   100.00000
#> 2       x2 0.0000000        0.00000 4.000000   100.00000
#> 3       x3 0.0000000        0.00000 4.000000   100.00000
#> 4        y 2.0000000       50.00000 2.000000    50.00000
#> 5        z 2.0000000       50.00000 2.000000    50.00000
#> 6      any 3.0000000       75.00000 1.000000    25.00000
#> 7 expected 0.9685669       24.21417 3.031433    75.78583
#> 
#> $columnCombinations
#>   missing   cols
#> 1       3    z+y
#> 2       3 x1+z+y
#> 3       2   x1+y
#> 4       2   x1+z
#>
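
The counts reported above can be cross-checked with base R alone. The following is a minimal sketch, not countNa's implementation: jointMissing is a hypothetical helper introduced only for this illustration, and the formula for p is an assumption inferred from the printed numbers rather than taken from the function's source.

```r
# Same example data as above; is.na() treats NaN as missing.
d <- data.frame(x1 = 1, x2 = 2, x3 = 1:4, y = c(1, NA, 2, NA), z = c(NaN, NaN, 3, 4))

na <- is.na(d)                       # logical matrix, TRUE where a value is missing
perColumn <- colSums(na)             # number of missing values in each column
anyMissing <- sum(rowSums(na) > 0)   # rows with at least one missing value

# Rows with a missing value in at least one of the selected columns.
# (jointMissing is a hypothetical helper, not part of countNa.)
jointMissing <- function(cols) sum(rowSums(na[, cols, drop = FALSE]) > 0)

perColumn
anyMissing
jointMissing(c("z", "y"))
jointMissing(c("x1", "y"))

# Assumption: a common per-column missingness rate p that, under independence
# across the ncol(d) columns, would yield the observed share of incomplete rows,
# i.e. p solves 1 - (1 - p)^ncol(d) == anyMissing / nrow(d).
p <- 1 - (1 - anyMissing / nrow(d))^(1 / ncol(d))
p * nrow(d)
```

On this data, perColumn and anyMissing agree with the 'missing' column of $columns (the per-column rows and the 'any' row), the two jointMissing calls agree with the z+y and x1+y rows of $columnCombinations, and p * nrow(d) agrees numerically with the 'missing' value printed in the 'expected' row.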
