countNa.Rd
countNa computes the number of missing values in a data frame.
It counts the number of missing values in each column, the number of rows
in which a value in at least one column is missing, and the
expected number of rows with at least one missing value (computed
under the assumption that missingness is independent across
columns). The number of rows left is also given.
Optionally, combinations of columns reaching the highest joint
missingness are also reported (if combColCount > 1 and the number of
columns in d is at least 2).
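The quantities described above can be approximated in base R. This is a minimal sketch, not the implementation of countNa: the data frame and all variable names are illustrative assumptions, and the independence-based figure uses the textbook formula n * (1 - prod(1 - p_j)), which may differ from how countNa defines its own summary rows.

```r
# Illustrative data (an assumption made for this sketch only)
d <- data.frame(
  x1 = c(1, 1, 1, 1),
  y  = c(1, NA, 2, NA),
  z  = c(NaN, NaN, 3, 4)
)

n <- nrow(d)

# Logical matrix of missing cells; is.na() is TRUE for both NA and NaN
isMissing <- is.na(d)

# Number of missing values in each column
missingPerColumn <- colSums(isMissing)

# Number of rows with a missing value in at least one column
rowsWithAnyMissing <- sum(rowSums(isMissing) > 0)

# Number of rows left after dropping incomplete rows
rowsLeft <- n - rowsWithAnyMissing

# Expected number of rows with at least one missing value if
# missingness were independent across columns (textbook formula,
# assumed here; not taken from the countNa source)
p <- missingPerColumn / n
expectedRowsWithAnyMissing <- n * (1 - prod(1 - p))
```

Comparing `rowsWithAnyMissing` with `expectedRowsWithAnyMissing` gives a rough sense of whether missing values tend to cluster in the same rows or spread across different rows.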
countNa(d, sort = TRUE, decreasing = FALSE, combColCount = 3)
Argument | Description |
---|---|
d | a data frame |
sort | sort the columns of d by the number of missing values? |
decreasing | if sorting by the number of missing values, should the order be decreasing (TRUE) or increasing (FALSE)? |
combColCount | maximum number of columns to combine when searching for combinations of columns reaching the highest joint missingness |
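To illustrate what ordering columns by their number of missing values means, here is a small base R sketch. The data frame and variable names are hypothetical, and the code only mimics the idea behind the sort and decreasing arguments; it is not taken from countNa itself.

```r
# Hypothetical data: u has 2 missing values, v has none, w has 1
d <- data.frame(
  u = c(NA, NA, 1),
  v = c(1, 2, 3),
  w = c(NA, 1, 2)
)

# Number of missing values in each column
missingPerColumn <- colSums(is.na(d))

# Columns ordered by increasing number of missing values
sortedIncreasing <- missingPerColumn[order(missingPerColumn, decreasing = FALSE)]

# Columns ordered by decreasing number of missing values
sortedDecreasing <- missingPerColumn[order(missingPerColumn, decreasing = TRUE)]
```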
A data frame (or a list of two data frames) describing the
missingness. The first data frame consists of rows describing the
missingness in individual columns, plus the missingness in the
combination of all columns (in a row called 'any'), plus the
average missingness (in a row called 'average').
The second data frame (if requested) describes the combinations
of at most combColCount columns of d reaching the highest joint
missingness.
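The search over column combinations might be sketched as follows in base R. This is an illustration under stated assumptions, not the countNa implementation: joint missingness of a combination is taken to mean the number of rows with a missing value in at least one of its columns (an interpretation that appears consistent with the example output), the data frame is made up, and all names other than combColCount are hypothetical.

```r
# Illustrative data (an assumption made for this sketch only)
d <- data.frame(
  x1 = c(1, 1, 1, 1),
  y  = c(1, NA, 2, NA),
  z  = c(NaN, NaN, 3, 4)
)
combColCount <- 2  # maximum combination size; assumes ncol(d) >= 2

isMissing <- is.na(d)  # logical matrix, TRUE for NA and NaN

# Enumerate all column combinations of size 2 up to combColCount
sizes <- 2:min(combColCount, ncol(d))
combos <- unlist(
  lapply(sizes, function(k) combn(colnames(d), k, simplify = FALSE)),
  recursive = FALSE
)

# Joint missingness (assumed definition): rows with a missing value
# in at least one column of the combination
jointMissing <- vapply(
  combos,
  function(cols) sum(rowSums(isMissing[, cols, drop = FALSE]) > 0),
  integer(1)
)

# Report combinations ordered from highest to lowest joint missingness
result <- data.frame(
  missing = jointMissing,
  cols = vapply(combos, paste, character(1), collapse = "+")
)
result <- result[order(result$missing, decreasing = TRUE), ]
```

Enumerating every combination grows quickly with the number of columns, which is presumably why an upper bound such as combColCount is useful.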
#>   x1 x2 x3  y   z
#> 1  1  2  1  1 NaN
#> 2  1  2  2 NA NaN
#> 3  1  2  3  2   3
#> 4  1  2  4 NA   4
countNa(d)
#> $columns
#>      names   missing missingPercent     left leftPercent
#> 1       x1 0.0000000        0.00000 4.000000   100.00000
#> 2       x2 0.0000000        0.00000 4.000000   100.00000
#> 3       x3 0.0000000        0.00000 4.000000   100.00000
#> 4        y 2.0000000       50.00000 2.000000    50.00000
#> 5        z 2.0000000       50.00000 2.000000    50.00000
#> 6      any 3.0000000       75.00000 1.000000    25.00000
#> 7 expected 0.9685669       24.21417 3.031433    75.78583
#>
#> $columnCombinations
#>   missing   cols
#> 1       3    z+y
#> 2       3 x1+z+y
#> 3       2   x1+y
#> 4       2   x1+z
#>