I want to search every value in my data frame to check whether any of them are infinite, because when I run a function I get this error:
Error in optim(apply(X, 2, median, na.rm = TRUE), fn = medfun, gr = dmedfun, :
non-finite value supplied by optim
In addition:
Warning message:
In betadisper(d, rep(1, ncol(as.matrix(d)))) :
Missing observations due to 'd' removed.
I have a huge file, and this only happens for some rows, but I don't understand why.
Thanks
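A check along these lines can be sketched in base R. The data frame and its column names below are hypothetical, made up only to illustrate; `is.finite()` is FALSE for `Inf`, `-Inf`, `NaN`, and `NA`, so this flags all of them, not only infinite values.

```r
# Hypothetical example data; the column names are assumptions for illustration.
df <- data.frame(
  name = c("a", "b", "c", "d"),
  x    = c(1, Inf, 3, NA),
  y    = c(0.5, 2, -Inf, 4)
)

# Only numeric columns can hold Inf/NaN, so restrict the check to them.
num_cols <- vapply(df, is.numeric, logical(1))

# Logical matrix: TRUE wherever a numeric cell is Inf, -Inf, NaN, or NA.
non_finite <- !is.finite(as.matrix(df[, num_cols, drop = FALSE]))

# Indices of rows that contain at least one non-finite value.
bad_rows <- which(rowSums(non_finite) > 0)
df[bad_rows, ]

# Number of non-finite values in each numeric column.
colSums(non_finite)
```

Replacing `!is.finite(...)` with `is.infinite(...)` would restrict the check to `Inf` and `-Inf` only, in case missing values should be handled separately.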
Answer 0 (score: 1)
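Here is how to do this with dplyr. Obviously, in this case it only works when your numbers are non-negative (a column sum of zero then means every value in that column is zero), but you can adapt it if your data is more complex: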
library(dplyr)  # for grouped aggregation

df <- read.table(header = FALSE, text = "name1 2 2
name2 1 0
name2 0 2
name3 0 2
name3 0 1")

# %>% replaces the old %.% operator, which has been removed from dplyr
df %>%
  group_by(V1) %>%                          # one group per value of V1 (col 1)
  summarise(prod = sum(V2) * sum(V3)) %>%   # prod is 0 if either column sums to 0
  filter(prod != 0) %>%                     # keep the groups where prod != 0
  select(V1) %>%                            # keep only the key column
  inner_join(df, by = "V1")                 # join back to the original df as a filter
V1 V2 V3
1 name1 2 2
2 name2 1 0
3 name2 0 2
Or with data.table:
library(data.table)  # prerequisite
merge(df,  # merge the original data frame with the groups to keep
  # per group: product of the column sums of all non-key columns, keeping groups where it is nonzero
  data.table(df, key = "V1")[, list(prod = prod(colSums(.SD))), by = "V1"][prod != 0, ]
)[, 1:ncol(df)]  # drop the "prod" column from the end