The dataset is as follows:
"1" 10 40 "r" "q" "0" "r" "r" "0" "r" "0" "0" "0" "0" "0" "t" "q" "0" "0" "s" "0" "r" 0 "0" 0 "0" "0" 0 0 0 "0"
"2" 10 173 "s" "s" "s" "0" "0" "s" "s" "0" "t" "t" "s" "t" "t" "r" "s" "0" "q" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"3" 10 2107 "t" "0" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"4" 10 993 "s" "0" "q" "s" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"5" 10 1712 "t" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "s" "0" "t" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"6" 776 1872 "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "r" 0 "0" "0" 0 0 0 "s"
The output should be:
"1" 10 40 "r" "q" "0" "r" "r" "0" "r" "0" "0" "0" "0" "0" "t" "q" "0" "0" "s" "0" "r" 0 "0" 0 "0" "0" 0 0 0 "0"
"2" 10 173 "s" "s" "s" "0" "0" "s" "s" "0" "t" "t" "s" "t" "t" "r" "s" "0" "q" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"4" 10 993 "s" "0" "q" "s" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"5" 10 1712 "t" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "s" "0" "t" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
The code I tried is:
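x <- read.table("sample.txt")
for (i in seq_len(nrow(x))) {
    # count the non-zero entries in columns 3 to 30 of row i
    count <- 0
    for (j in 3:30) {
        if (x[i, j] != 0) count <- count + 1
    }
    # mark rows with fewer than four non-zero entries for removal
    if (count < 4) x[i, ] <- NA
}
# drop the rows that were marked with NA
x <- x[complete.cases(x), ]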
Please suggest some approaches that do not involve loops.
Answer 0 (score: 1)
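It looks like none of your rows has fewer than four non-zero entries when every column is counted.
For example, this prints the number of non-zero entries in each row, where tab is your table: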
apply(tab, 1, function(x)sum(x!="0"))
[1] 12 16 5 7 7 5
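One caveat with apply() on a data frame that mixes numeric and character columns: the frame is first converted to a character matrix, and numeric columns are padded to a common width, so a numeric 0 can turn into " 0" and no longer equal "0". The sketch below uses a made-up two-row data frame (df, via_apply and via_rowsums are hypothetical names) to show the difference from comparing the data frame directly.

```r
# Hypothetical toy data: one numeric column and one character column.
df <- data.frame(a = c(0, 10), b = c("0", "s"), stringsAsFactors = FALSE)

# apply() goes through as.matrix(), which formats column a as " 0" and "10",
# so the padded " 0" is counted as non-zero.
via_apply <- apply(df, 1, function(r) sum(r != "0"))

# Comparing the data frame itself works column by column, without padding.
via_rowsums <- rowSums(df != "0")

unname(via_apply)    # 1 2
unname(via_rowsums)  # 0 2
```

In the sample from the question every unquoted column is all zeros, so the counts printed above are not affected; this only matters if a numeric column holds values of different widths.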
For example, to remove every row with five or fewer non-zero entries, you could use:
tab[apply(tab, 1, function(x) sum(x != "0")) > 5, ]
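If the count should cover only the symbol columns, as in the loop from the question, a loop-free sketch could look like the following. It assumes the first quoted field is a row name (hence row.names = 1) and that the symbols start at the third remaining column; sym, nonzero and kept are names made up for this illustration.

```r
# Sample rows copied from the question.
txt <- '
"1" 10 40 "r" "q" "0" "r" "r" "0" "r" "0" "0" "0" "0" "0" "t" "q" "0" "0" "s" "0" "r" 0 "0" 0 "0" "0" 0 0 0 "0"
"2" 10 173 "s" "s" "s" "0" "0" "s" "s" "0" "t" "t" "s" "t" "t" "r" "s" "0" "q" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"3" 10 2107 "t" "0" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"4" 10 993 "s" "0" "q" "s" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"5" 10 1712 "t" "0" "s" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "s" "0" "t" "0" 0 "0" 0 "0" "0" 0 0 0 "0"
"6" 776 1872 "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" "0" 0 "r" 0 "0" "0" 0 0 0 "s"
'
# Assumption: the first field of each line is a row name, not data.
x <- read.table(text = txt, row.names = 1, stringsAsFactors = FALSE)

# Assumption: columns 1 and 2 are numeric, the symbols start at column 3.
sym <- x[, 3:ncol(x)]

# Comparing the data frame with "0" gives a logical matrix;
# rowSums() then counts the TRUE values in each row.
nonzero <- rowSums(sym != "0")
nonzero  # 9 13 2 4 4 2

# Keep the rows with at least four non-zero symbols.
kept <- x[nonzero >= 4, ]
rownames(kept)  # "1" "2" "4" "5"
```

Indexing with the logical vector nonzero >= 4 also behaves sensibly when no row falls below the threshold, whereas negative indexing with which() returns an empty table in that case.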
However, I am not sure whether the first column in your data is treated as a column of the data frame.
Does this help?
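One way to check this is to read a couple of lines both ways and compare the column counts. The two lines below are a shortened, hypothetical sample in the same layout as your file; as_column and as_rowname are made-up names.

```r
# Two shortened, hypothetical lines in the same layout as the question's file.
txt <- '"1" 10 40 "r" "0"
"2" 10 173 "s" "s"'

# Without a header line, the first field is read as an ordinary column (V1).
as_column <- read.table(text = txt)

# With row.names = 1, the first field becomes the row names instead.
as_rowname <- read.table(text = txt, row.names = 1)

ncol(as_column)       # 5
ncol(as_rowname)      # 4
rownames(as_rowname)  # "1" "2"
```

If your real file has a header line with one field fewer than the data lines, read.table() uses the first field as row names automatically, which shifts every column index by one compared with the first call.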