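As an R beginner, I need some help:
I have written a function, call it fun(a, b, c), that returns, say, "d". a, b and c are column values in my dataset of 4 million records. The function applies some logic and returns a value in "d", which I would like to add to my dataset afterwards.
Could someone please explain how to 1. call a function with multiple arguments on a dataset, 2. add the new information in "d" to my dataset, and 3. do this efficiently enough to handle 4 million records?
Thanks in advance.
Please see the code below.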
#hybrid FUNCTION
hybridfun <- function(df, lookup, df_year, df_name, df_id, lup_year, lup_name, lup_id_digit, lup_id_letter){
  for (i in 1:nrow(lookup)){
    df$new = "NOT_SURE"
    if (df$df_year == lookup$lup_year)
      if (df$df_name == lookup$lup_name)
        if (substring(df$df_id, lookup$lup_id_digit, lookup$lup_id_digit) == lookup$lup_id_letter){
          df$new = "HYBRID"
          break
        }
  }
  print(fuel_type)
}
hybridfun(data, lookup, "data_year", "data_name", "data_id", "lookup_year", "lookup_name", "lookup_id_digit", "lookup_id_letter")
Answer 0 (score: 0)
I'm not entirely sure what you are trying to do. Perhaps something like this?
set.seed(2017)
df <- data.frame(
  a = rnorm(6),
  b = rnorm(6),
  c = rnorm(6))
df
#            a            b          c
#1  1.43420148 -1.958366456 -0.7467347
#2 -0.07729196 -0.001524259  0.3066498
#3  0.73913723 -0.265336001 -1.4304858
#4 -1.75860473  1.563222619  1.1944265
#5 -0.06982523  0.342768064 -0.4820681
#6  0.45190553  1.572425400  1.3178624
# Custom function that sums the entries of three columns
# whose names are passed in as strings
myfunc <- function(df, a, b, c) {
  # Look the columns up by name with [[ ]]; df$a would ignore the argument
  # and always use the column literally called "a"
  df$d <- df[[a]] + df[[b]] + df[[c]]
  return(df)
}
df2 <- myfunc(df, "a", "b", "c")
df2
#            a            b          c          d
#1  1.43420148 -1.958366456 -0.7467347 -1.2708997
#2 -0.07729196 -0.001524259  0.3066498  0.2278336
#3  0.73913723 -0.265336001 -1.4304858 -0.9566846
#4 -1.75860473  1.563222619  1.1944265  0.9990444
#5 -0.06982523  0.342768064 -0.4820681 -0.2091252
#6  0.45190553  1.572425400  1.3178624  3.3421933
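If the per-row logic is harder to express than a sum, one option is `mapply`, which calls a scalar function once per row with several column values. Below is a minimal sketch, assuming a hypothetical scalar function `fun(a, b, c)` that stands in for the one mentioned in the question (its logic and the toy data are invented for illustration). On 4 million rows, a vectorised form such as the nested `ifelse` shown next to it will generally be much faster than a per-row call.

```r
# Hypothetical scalar function standing in for the asker's fun(a, b, c);
# the logic is invented purely for illustration
fun <- function(a, b, c) {
  if (a > 0 && b > 0) "both_positive" else if (c > 0) "c_positive" else "other"
}

# Toy data (assumed, not from the question)
df <- data.frame(a = c(1, -1, 2), b = c(3, 2, -5), c = c(-1, 4, -2))

# Row-wise call: fun() is invoked once per row (simple, but slow on large data)
df$d <- mapply(fun, df$a, df$b, df$c)

# Vectorised equivalent: works on whole columns at once
df$d_vec <- ifelse(df$a > 0 & df$b > 0, "both_positive",
                   ifelse(df$c > 0, "c_positive", "other"))
```

Both columns hold the same values here; the vectorised version avoids the per-row function call overhead.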
For future posts, please take a moment to read how to ask a question on SO, and then provide a minimal reproducible example/attempt, including sample data.
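As for the lookup in the question, my guess is that the intent is to flag a row as "HYBRID" when its year and name match a lookup row and the character of its id at the lookup's position equals the lookup's letter. Under that assumption, a join may scale better than a loop. The sketch below uses base R `merge`; the column names follow the call in the question, but the values and the helper column `row_id` are made up for illustration.

```r
# Toy data; column names follow the call in the question, values are invented
data <- data.frame(
  data_year = c(2010, 2010, 2012, 2012),
  data_name = c("Prius", "Civic", "Civic", "Civic"),
  data_id   = c("AB3H", "XY1H", "QZ9G", "QZHG"),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  lookup_year      = c(2010, 2012),
  lookup_name      = c("Prius", "Civic"),
  lookup_id_digit  = c(4, 3),
  lookup_id_letter = c("H", "H"),
  stringsAsFactors = FALSE
)

# Hypothetical helper column so matches can be mapped back after the join
data$row_id <- seq_len(nrow(data))

# Inner join on year and name: only candidate pairs survive
m <- merge(data, lookup,
           by.x = c("data_year", "data_name"),
           by.y = c("lookup_year", "lookup_name"))

# substring() is vectorised, so the id check runs on all candidates at once
hit <- substring(m$data_id, m$lookup_id_digit, m$lookup_id_digit) ==
  m$lookup_id_letter

# Rows with at least one matching lookup entry become "HYBRID"
data$new <- ifelse(data$row_id %in% m$row_id[hit], "HYBRID", "NOT_SURE")
```

For 4 million records, the same join could probably be written with a package such as data.table or dplyr, but that is a separate choice from the approach itself.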