Weighted average in R

Asked: 2018-06-14 17:03:14

Tags: r machine-learning

I am completely new to R; to be honest, I only need it to add one calculation to my Azure Machine Learning model.

I have the following data:

ApplicationID,ApplicationDate,ClientAgeToApplicationDate,RequiredLoanDuration,WeekOfYear,JobName,DistrictName,RiskScore
68679,16.02.2012 0:00:00,55.00000000,12,8,Unknown,Česká Lípa,0
68681,15.02.2012 0:00:00,38.00000000,48,8,Unknown,Olomouc,0
68682,08.02.2012 0:00:00,29.00000000,36,7,Unknown,Třebíč,0
68684,18.02.2012 0:00:00,24.00000000,30,8,Unknown,Uherské Hradiště,4
68687,17.02.2012 0:00:00,32.00000000,24,8,Unknown,Blansko,4

I want the most recent ApplicationDates to be more relevant.

I am using a multiclass neural network and have tried:

# Sample operation
data.set = rbind(dataset1);

# Take last 1/3 of a year as the most relevant
library(dplyr)
grouped <- data.set %>% group_by(ClientAgeToApplicationDate, RequiredLoanDuration, WeekOfYear, JobName, DistrictName) %>% filter(as.numeric(Sys.Date() - 'ApplicationDate', units="days") <= 120) %>% summarise(mean(RiskScore))
data.set <- data.set %>% left_join(grouped)

# You'll see this output in the R Device port.
# It'll have your stdout, stderr and PNG graphics device(s).
plot(data.set);

Any help would be appreciated. Thanks in advance.

1 answer:

Answer 0: (score: 0)

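You are subtracting a character string from Sys.Date(). The quotes around 'ApplicationDate' make it a string literal rather than a reference to the column, and the column itself is stored as text (for example "16.02.2012 0:00:00"), so it has to be converted to a Date before the subtraction.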
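
A minimal sketch of one way to apply this, not necessarily what the asker ended up using. It assumes the dates are in day.month.year order, as the sample rows suggest, and it uses only a few columns from the question's data. The fixed `reference_date` and the column name `RecentMeanRiskScore` are hypothetical, introduced here so the example is reproducible; the question uses Sys.Date() instead.

```r
library(dplyr)

# Rows copied from the question, reduced to the columns needed here.
data.set <- data.frame(
  ApplicationID   = c(68679, 68681, 68682, 68684, 68687),
  ApplicationDate = c("16.02.2012 0:00:00", "15.02.2012 0:00:00",
                      "08.02.2012 0:00:00", "18.02.2012 0:00:00",
                      "17.02.2012 0:00:00"),
  WeekOfYear      = c(8, 8, 7, 8, 8),
  RiskScore       = c(0, 0, 0, 4, 4),
  stringsAsFactors = FALSE
)

# Hypothetical fixed reference date (assumption); replace with Sys.Date()
# to measure recency against today, as in the question.
reference_date <- as.Date("2012-06-10")

# Parse the text column into Date values. Assumes day.month.year order;
# as.Date() ignores the trailing " 0:00:00" once the format is matched.
data.set <- data.set %>%
  mutate(ApplicationDate = as.Date(ApplicationDate, format = "%d.%m.%Y"))

# Refer to the column without quotes so the subtraction is Date - Date.
# Filtering before grouping keeps only applications from the last 120 days.
grouped <- data.set %>%
  filter(as.numeric(reference_date - ApplicationDate, units = "days") <= 120) %>%
  group_by(WeekOfYear) %>%
  summarise(RecentMeanRiskScore = mean(RiskScore))

# Name the join key explicitly; rows whose group has no recent
# applications get NA in RecentMeanRiskScore.
data.set <- data.set %>% left_join(grouped, by = "WeekOfYear")
```

Giving the summary column an explicit name avoids the awkward default name `mean(RiskScore)`, and passing `by =` to left_join keeps the join from silently matching on every shared column.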