I have a data frame, df, that looks like this:
   id type start end features
1   5 word     1   2       NN
2   6 word     3   3        .
3   7 word     5  12       NN
4   8 word    14  19      VBZ
5   9 word    21  30       NN
6  10 word    32  32      WDT
7  11 word    34  37      VBP
8  12 word    39  41       IN
9  13 word    43  44       IN
10 14 word    46  46       DT
I want to create a new column "sum" that holds, for each row, the sum of the values in 'start' and 'end'.
I wrote the following function:
mySum <- function(row) {
row["start"]+row["end"]
}
df$sum <- apply(df,1, mySum );
But when I run this, I get the following error:
Error in row["start"] + row["end"] :
non-numeric argument to binary operator
However, if I keep only row["start"] or only row["end"] in the function, the column is created without an error.
I also tried coercing every value in both columns to integer:
df$start = as.integer(as.vector(df$start));
df$end = as.integer(as.vector(df$end));
However, I still get the same error, and only when I add the two values.
The structure of my data frame is shown below, as produced by dput(droplevels(head(df,10))):
structure(list(id = 5:14, type = c("word", "word", "word", "word",
"word", "word", "word", "word", "word", "word"), start = c(1L,
3L, 5L, 14L, 21L, 32L, 34L, 39L, 43L, 46L), end = c(2L, 3L, 12L,
19L, 30L, 32L, 37L, 41L, 44L, 46L), features = list(structure(list(
POS = "NN"), .Names = "POS"), structure(list(POS = "."), .Names = "POS"),
structure(list(POS = "NN"), .Names = "POS"), structure(list(
POS = "VBZ"), .Names = "POS"), structure(list(POS = "NN"), .Names = "POS"),
structure(list(POS = "WDT"), .Names = "POS"), structure(list(
POS = "VBP"), .Names = "POS"), structure(list(POS = "IN"), .Names = "POS"),
structure(list(POS = "IN"), .Names = "POS"), structure(list(
POS = "DT"), .Names = "POS"))), .Names = c("id", "type",
"start", "end", "features"), row.names = c(NA, 10L), class = "data.frame")
Answer 0 (score: 1)
Just do
df1$Sum <- df1[,'start']+ df1[,'end']
df1$Sum
#[1] 3 6 17 33 51 64 71 80 87 92
Or
rowSums(df1[c('start', 'end')], na.rm=TRUE)
#1 2 3 4 5 6 7 8 9 10
#3 6 17 33 51 64 71 80 87 92
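To try both approaches without the original object, here is a minimal sketch that rebuilds the numeric part of the question's data from the dput output. The list column `features` is left out for brevity, and the name `sums_by_row` is only illustrative.

```r
# Rebuild the question's data (the `features` list column is omitted here)
df1 <- data.frame(
  id    = 5:14,
  type  = rep("word", 10),
  start = c(1L, 3L, 5L, 14L, 21L, 32L, 34L, 39L, 43L, 46L),
  end   = c(2L, 3L, 12L, 19L, 30L, 32L, 37L, 41L, 44L, 46L),
  stringsAsFactors = FALSE
)

# `+` is vectorized, so whole columns can be added element-wise
df1$Sum <- df1$start + df1$end

# rowSums() over the two columns gives the same totals
sums_by_row <- rowSums(df1[c("start", "end")], na.rm = TRUE)
```

Neither version calls apply(), so the non-numeric columns `type` and `features` never take part in the arithmetic.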
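The error means that you have non-numeric columns. Check str(df1). If the class is factor or character, change it to numeric and then apply the code above. For example, if the columns are factor, convert them to numeric like this: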
df1[c('start', 'end')] <- lapply(df1[c('start', 'end')],
function(x) as.numeric(as.character(x)))
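The intermediate as.character() step matters for factors. A small sketch, using a made-up factor `f` rather than the question's data, shows what happens without it:

```r
# A hypothetical factor whose labels look like numbers
f <- factor(c("14", "3", "21"))

# Calling as.numeric() directly returns the internal level codes,
# not the values printed on screen
codes <- as.numeric(f)

# Going through as.character() first recovers the intended numbers
values <- as.numeric(as.character(f))
```

Here `codes` holds positions in the sorted level set ("14", "21", "3"), while `values` holds 14, 3 and 21.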
If the columns are character, just use as.numeric.
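One likely reason the original apply() call fails even after start and end were converted to integer: apply() coerces a data frame to a matrix with as.matrix(), and a matrix has a single type. With a character column such as `type` present, every cell becomes a character string. The sketch below uses a cut-down two-row data frame as an assumption for illustration. The question's data also has a list column, which changes the coerced type but still leaves `+` with non-numeric operands.

```r
# A cut-down stand-in for the question's data frame
df1 <- data.frame(
  type  = c("word", "word"),
  start = c(1L, 3L),
  end   = c(2L, 3L),
  stringsAsFactors = FALSE
)

# This is the coercion apply() performs on a data frame:
# the mixed columns collapse into a character matrix
m <- as.matrix(df1)
coerced_mode <- mode(m)

# Passing only the numeric columns keeps the matrix numeric,
# so the row-wise function from the question works
num_only <- df1[c("start", "end")]
sums <- apply(num_only, 1, function(row) row["start"] + row["end"])
```

If apply() is really needed, subsetting to the numeric columns first is one way around the error, though the vectorized forms above are simpler.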