I have a vector as follows, and I want to get rid of the NAs in it:
> dput(v)
structure(c("1", "2", "2", "2", "2", "1", "2", "2", "1", "2",
"2", "1", "1", "2", "2", "2", "1", "2", "2", "2", "2", "1", "2",
"1", "1", "2", "1", "1", "1", "1", "1", "2", "2", "1", "2", "2",
"2", "2", "2", "2", "NA", "NA", "NA", "NA", "NA", "NA", "NA",
"NA", "NA", "2", "1", "2", "2", "1", "2", "2", "1", "1", "1",
"1", "2", "2", "1", "1", "1", "1", "1", "2", "2", "1", "2", "2",
"1", "1", "2", "2", "2", "1", "1", "2", "2", "2", "1", "1", "2",
"1", "2", "1", "2", "1", "2", "1", "1", "1", "2", "1", "2", "1",
"2", "2", "2", "1", "2", "2", "1", "1", "2", "2", "1", "1", "2",
"1", "2", "1", "2", "2", "1", "2", "1", "1", "2", "2", "1", "2",
"2", "2", "2", "2", "2", "2", "1", "2", "2", "1"), .Label = logical(0))
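I tried na.omit, but this does not work. I think the NAs are not of type NA but are literally the string "NA", so I tried the following:
v[] <- lapply(v, function(x) {
  is.na(levels(x)) <- levels(x) == "NA"
  x
})
which did not work the way I wanted.
*Edit
Given a data.frame like the following, I want to remove any row that contains an NA:
> dput(data)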
structure(list(w = c(2, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 2,
1, 1, 2, 1, 1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 1, 1, 1, 1, 2, 1,
1, 1, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 2, 2, 1, 2,
2, 1, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 2, 1, 2, 1, 1, 1, 2, 1,
1, 1, 2, 1, 2, 2, 2, 2, 1, 1, 2, 2, 2, 2, 2, 1, 2, 1, 2, 2, 2,
2, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1,
1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2), x = c("1", "2",
"2", "2", "2", "2", "2", "1", "1", "1", "2", "1", "1", "1", "2",
"1", "1", "2", "2", "2", "2", "1", "1", "1", "1", "2", "2", "1",
"1", "2", "1", "2", "2", "1", "2", "1", "2", "2", "1", "1", "NA",
"NA", "NA", "NA", "NA", "NA", "NA", "NA", "NA", "1", "1", "2",
"2", "1", "1", "2", "1", "2", "1", "1", "2", "2", "1", "1", "1",
"1", "1", "2", "2", "1", "2", "2", "2", "2", "2", "1", "2", "1",
"1", "2", "2", "2", "1", "1", "1", "1", "2", "1", "1", "1", "2",
"2", "2", "1", "2", "1", "2", "1", "2", "2", "2", "1", "2", "2",
"1", "1", "2", "2", "1", "2", "2", "1", "1", "2", "2", "1", "1",
"1", "1", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "1", "2", "1"), y = c("1", "2", "2", "2", "2", "1", "2",
"2", "1", "2", "2", "1", "1", "2", "2", "2", "1", "2", "2", "2",
"2", "1", "2", "1", "1", "2", "1", "1", "1", "1", "1", "2", "2",
"1", "2", "2", "2", "2", "2", "2", "NA", "NA", "NA", "NA", "NA",
"NA", "NA", "NA", "NA", "2", "1", "2", "2", "1", "2", "2", "1",
"1", "1", "1", "2", "2", "1", "1", "1", "1", "1", "2", "2", "1",
"2", "2", "1", "1", "2", "2", "2", "1", "1", "2", "2", "2", "1",
"1", "2", "1", "2", "1", "2", "1", "2", "1", "1", "1", "2", "1",
"2", "1", "2", "2", "2", "1", "2", "2", "1", "1", "2", "2", "1",
"1", "2", "1", "2", "1", "2", "2", "1", "2", "1", "1", "2", "2",
"1", "2", "2", "2", "2", "2", "2", "2", "1", "2", "2", "1"),
z = structure(c(2L, 1L, 3L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L, 2L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L,
2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 3L, 2L, 1L,
1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L, 3L, 1L, 2L,
2L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 2L, 3L, 2L, 2L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L, 1L), .Label = c("0", "1", "2"), class = "factor")), .Names = c("w",
"x", "y", "z"), row.names = c(11L, 12L, 14L, 16L, 19L, 20L, 24L,
29L, 30L, 34L, 36L, 38L, 41L, 42L, 44L, 63L, 66L, 69L, 74L, 76L,
78L, 80L, 81L, 91L, 93L, 96L, 97L, 98L, 103L, 104L, 106L, 109L,
117L, 118L, 120L, 124L, 125L, 126L, 129L, 133L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 10L, 13L, 15L, 17L, 18L, 21L, 22L, 23L, 25L,
26L, 27L, 28L, 31L, 32L, 33L, 35L, 37L, 39L, 40L, 43L, 45L, 46L,
47L, 48L, 49L, 50L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L,
60L, 61L, 62L, 64L, 65L, 67L, 68L, 70L, 71L, 72L, 73L, 75L, 77L,
79L, 82L, 83L, 84L, 85L, 86L, 87L, 88L, 89L, 90L, 92L, 94L, 95L,
99L, 100L, 101L, 102L, 105L, 107L, 108L, 110L, 111L, 112L, 113L,
114L, 115L, 116L, 119L, 121L, 122L, 123L, 127L, 128L, 130L, 131L,
132L, 134L), class = "data.frame")
I have tried na.omit, but because "NA" is not of type NA, it has no effect.
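Answer 0 (score: 2)
See the note at the end of this post.
Your object is a bit odd; it is a character vector, but it has an attribute "levels" that is a zero-length logical vector.
In any case, you want to look for the string "NA" here, because these are literal "NA" strings rather than NAs.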
xx[xx != "NA"]
> xx[xx != "NA"]
[1] "1" "2" "2" "2" "2" "1" "2" "2" "1" "2" "2" "1" "1" "2" "2" "2" "1" "2"
[19] "2" "2" "2" "1" "2" "1" "1" "2" "1" "1" "1" "1" "1" "2" "2" "1" "2" "2"
[37] "2" "2" "2" "2" "2" "1" "2" "2" "1" "2" "2" "1" "1" "1" "1" "2" "2" "1"
[55] "1" "1" "1" "1" "2" "2" "1" "2" "2" "1" "1" "2" "2" "2" "1" "1" "2" "2"
[73] "2" "1" "1" "2" "1" "2" "1" "2" "1" "2" "1" "1" "1" "2" "1" "2" "1" "2"
[91] "2" "2" "1" "2" "2" "1" "1" "2" "2" "1" "1" "2" "1" "2" "1" "2" "2" "1"
[109] "2" "1" "1" "2" "2" "1" "2" "2" "2" "2" "2" "2" "2" "1" "2" "2" "1"
(where xx is the object you posted.)
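To see why na.omit has no effect on such an object, here is a minimal sketch on a small made-up vector (toy, toy2, kept and kept2 are hypothetical names, not objects from the question). It compares the literal string "NA" with a real missing value and shows two possible ways of dropping the unwanted entries:

```r
# Hypothetical toy vector: it holds the literal string "NA", not a missing value
toy <- c("1", "2", "NA", "1")

is.na(toy)            # all FALSE, because "NA" is an ordinary string
length(na.omit(toy))  # still 4, so na.omit() removes nothing

# Option 1: drop the literal strings directly
kept <- toy[toy != "NA"]

# Option 2: turn the strings into real missing values, after which na.omit() works
toy2 <- toy
toy2[toy2 == "NA"] <- NA
kept2 <- as.vector(na.omit(toy2))  # as.vector() strips the na.action attribute
```

Both options should leave the same three elements; which one is preferable depends on whether real NA values are wanted later on.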
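Assuming your data frame is now in xxx, first find the elements that are not "NA":
xxx != "NA"
Then compute the row sums, noting that TRUE == 1 and FALSE == 0, and look for rows with fewer than ncol(xxx) (i.e. 4) TRUE values.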
ind <- rowSums(xxx != "NA") < ncol(xxx)
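(@DavidArenburg suggested the alternative rowSums(xxx == "NA") > 0, which is more concise than the version above and certainly more concise than my original.)
ind is TRUE for the rows that contain at least one "NA" string.
Then use ind to deselect those rows of xxx: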
XXX <- xxx[!ind, ]
> XXX <- xxx[!ind, ]
> nrow(xxx)
[1] 134
> nrow(XXX)
[1] 125
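The row-filtering steps can be tried out on a small example. The sketch below uses a made-up data frame (toy_df and the other names are hypothetical) and checks that the two ways of building the index agree:

```r
# Hypothetical toy data frame with literal "NA" strings in two columns
toy_df <- data.frame(w = c(1, 2, 1, 2),
                     x = c("1", "NA", "2", "1"),
                     y = c("2", "1", "NA", "1"),
                     stringsAsFactors = FALSE)

not_na <- toy_df != "NA"                # logical matrix, TRUE where a cell is not "NA"
ind1 <- rowSums(not_na) < ncol(toy_df)  # rows with fewer TRUE values than columns
ind2 <- rowSums(toy_df == "NA") > 0     # rows with at least one "NA" string

all(ind1 == ind2)                       # the two indices agree
toy_df[!ind1, ]                         # rows 1 and 4 remain
```

The second form reads more directly as "has at least one "NA"", which is probably why it was suggested as the shorter alternative.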
I would add that, like xx, xxx (your data frame) is also a bit odd:
> str(xxx)
'data.frame': 134 obs. of 4 variables:
$ w: num 2 1 1 1 1 2 1 2 2 1 ...
$ x: chr "1" "2" "2" "2" ...
$ y: chr "1" "2" "2" "2" ...
$ z: Factor w/ 3 levels "0","1","2": 2 1 3 2 2 1 1 1 1 1 ..
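You seem to have combined three different types of object, all apparently holding the values 0, 1, 2, but they are in fact subtly different objects. You also appear to have "NA" strings where you probably want real NA values. I would look into why and how you ended up with such a data frame.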
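Answer 1 (score: 2)
You moved the goalposts slightly in your edit, but:
anyCharNA <- apply(dd, 1, function(x) any(x == "NA"))
dim(dd)
## [1] 134 4
dim(dd[!anyCharNA, ])
## [1] 125 4
Notes:
Using data as a variable name is dangerous/confusing, because it is also the name of a built-in function. R can usually tell the difference, but not always ...
If you want to clean up your data, assuming you really want everything to be integer:
na.omit()
(the extra complexity is needed to make sure the factor is converted back to integer correctly)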
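The clean-up code itself did not survive in this copy of the answer, so the following is only a sketch of one way such a conversion could look, under the assumption that every column should end up as integer. The names toy_dd, to_int and clean are hypothetical, and this is not necessarily the code the answer originally showed:

```r
# Hypothetical toy data frame mixing numeric, character and factor columns
toy_dd <- data.frame(w = c(2, 1, 1),
                     x = c("1", "NA", "2"),
                     z = factor(c("1", "0", "2")),
                     stringsAsFactors = FALSE)

to_int <- function(col) {
  col <- as.character(col)  # for a factor this yields the labels, not the internal codes
  col[col == "NA"] <- NA    # literal "NA" strings become real missing values
  as.integer(col)
}

clean <- toy_dd
clean[] <- lapply(toy_dd, to_int)  # assigning into clean[] keeps the data.frame structure
clean <- na.omit(clean)            # now drops the row containing a missing value
```

Going through as.character() is the extra step for factors: calling as.integer() directly on a factor returns its internal level codes (1, 2, 3 for the levels "0", "1", "2") rather than the values the labels represent.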