Question

I am analyzing two factor variables that have some missing values. How can I omit the missing values in the table command?

> table(code3,code4)
       code4
code3     HIGH LOW
        134    9   1
   HIGH  22    7   0
   LOW   19    0   8
> 
>
> round(prop.table(table(code3,code4),2),2)
       code4
code3      HIGH  LOW
        0.77 0.56 0.11
   HIGH 0.13 0.44 0.00
   LOW  0.11 0.00 0.89
>

I want the table to show only the "HIGH" and "LOW" rows and columns, i.e. with all missing values omitted.

Also, please tell me whether these missing values have any effect on chisq.test:

> 
> chisq.test(code3,code4)

        Pearson's Chi-squared test

data:  code3 and code4 
X-squared = 57.8434, df = 4, p-value = 8.231e-12

Warning message:
In chisq.test(code3, code4) :
  Chi-squared approximation may be incorrect
> 
>

I suspect this is a simple question, but I could not find a simple answer on the internet.

The "help(table)" command in R gives the following information:

## NA counting:
     is.na(d) <- 3:4
     d. <- addNA(d)
     d.[1:7]
     table(d.) # ", exclude = NULL" is not needed
     ## i.e., if you want to count the NA's of 'd', use
     table(d, useNA="ifany")

How can I adapt this to my requirement? Thanks for your help.

Answer 1

I suspect your 'missing values' are blanks (""). You can make your life easier by coding them as NA.

A small example (my guess at what is going on):

# sample data with some 'missing values'
x <- c("high", "", "low", "", "high", "")
x
table(x)
#   high  low 
# 3    2    1     

# replace "" with R:s 'official' missing values
x[x == ""] <- NA

table(x)
# x
# high  low 
#    2    1
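
The same idea can be sketched for two factors, which is closer to the situation in the question. The data below are made up for illustration, and `to_na` is a hypothetical helper, not a base R function. With factors, the blank level has to be dropped after recoding, otherwise it still shows up as an empty row or column. The sketch also compares the degrees of freedom reported by chisq.test before and after cleaning: df = 4 in the question's output corresponds to a 3 x 3 table, so the blank is being treated as a third category there.

```r
# Made-up data standing in for code3 and code4; "" marks a missing value
code3 <- factor(c("", "HIGH", "LOW", "", "HIGH", "LOW", "", "LOW"))
code4 <- factor(c("", "HIGH", "LOW", "HIGH", "", "LOW", "", "LOW"))

# Hypothetical helper: recode "" as NA, then drop the now-unused "" level
to_na <- function(f) {
  f[f == ""] <- NA
  droplevels(f)
}
code3_clean <- to_na(code3)
code4_clean <- to_na(code4)

# table() ignores NA by default (useNA = "no"), so only HIGH and LOW remain
tab <- table(code3_clean, code4_clean)
tab
round(prop.table(tab, 2), 2)

# Degrees of freedom with and without the blank category.
# suppressWarnings() hides the small-sample warning caused by this tiny data set.
df_raw   <- suppressWarnings(chisq.test(table(code3, code4)))$parameter
df_clean <- suppressWarnings(chisq.test(tab))$parameter
c(raw = unname(df_raw), clean = unname(df_clean))
```

The warning "Chi-squared approximation may be incorrect" is a separate matter: chisq.test issues it when some expected cell counts are small, and it may remain after the blanks are removed.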

The na.strings argument of read.table may also be relevant here.
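
A minimal sketch of that approach, assuming the data come from a comma-separated file in which missing values are empty fields. The inline text and the column names are invented for illustration; with a real file, the `text` argument would be replaced by the file name.

```r
# Invented inline data; empty fields stand for missing values
csv_text <- "code3,code4
HIGH,HIGH
,LOW
LOW,
LOW,LOW"

# Treat empty fields (and the default "NA") as missing while reading
dat <- read.table(text = csv_text, header = TRUE, sep = ",",
                  na.strings = c("", "NA"), stringsAsFactors = TRUE)

# The factors never get a "" level, so table() shows only HIGH and LOW
levels(dat$code3)
table(dat$code3, dat$code4)
```

Handling the blanks at import time avoids having to recode and drop levels afterwards.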

Next time, please provide a minimal, self-contained example. See the following links for general tips and for how to do this in R: here, here and here.

Handling missing values in the table command
