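With a large dataset, this code takes a very long time to run. Does anyone know a simpler or faster way to do this?
Running the code locks up my machine for a while.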
SID_Scores <- filtered %>%
  group_by(SalesPerson_SID) %>%
  summarise(
    Brand_Advocacy = mean(Q1, na.rm = TRUE),
    Vehicle_Satisfaction = mean(Q2, na.rm = TRUE),
    Dealer_Satisfaction = mean(Q3, na.rm = TRUE),
    Sales_Advocacy = mean(Q6N_srvsls_Recommend_10Pt, na.rm = TRUE),
    Overall_SalesCon = mean(Q5N1_ovrsls, na.rm = TRUE),
    Understanding_Needs = mean(Q7N1_slsneeds, na.rm = TRUE),
    Product_Features = mean(Q7N2_slsfeat, na.rm = TRUE),
    Professional_Court = mean(Q7N3_slsprof, na.rm = TRUE),
    Feel_Valued = mean(SlsValued, na.rm = TRUE),
    Trustworthy = mean(SlsTrustworthy, na.rm = TRUE),
    Financial_Arrang = mean(Q5N2_ovrfin, na.rm = TRUE),
    Financial_Agreement = mean(Q8N2_finease, na.rm = TRUE),
    Respect_Time = mean(Q8N3_fintime, na.rm = TRUE),
    Honesty = mean(Q8N4_finhon, na.rm = TRUE),
    Delivery = mean(Q5N3_ovrdlv, na.rm = TRUE),
    U_Pairing = (sum(filtered$Q_UCPairing == '1', na.rm = TRUE)) / (
      sum(filtered$Q_UCPairing == '1', na.rm = TRUE) +
        sum(filtered$Q_UCPairing == '2', na.rm = TRUE)
    ),
    U_Demonstrate = (sum(filtered$Q_UCDemonstrate == '1', na.rm = TRUE)) /
      (
        sum(filtered$Q_UCDemonstrate == '1', na.rm = TRUE) +
          sum(filtered$Q_UCDemonstrate == '2', na.rm = TRUE)
      ),
    U_FreeTrials = (sum(filtered$Q_UCFreeTrials == '1', na.rm = TRUE)) /
      (
        sum(filtered$Q_UCFreeTrials == '1', na.rm = TRUE) +
          sum(filtered$Q_UCFreeTrials == '2', na.rm = TRUE)
      ),
    U_Presets = (sum(filtered$Q_UCRadioPreset == '1', na.rm = TRUE)) /
      (
        sum(filtered$Q_UCRadioPreset == '1', na.rm = TRUE) +
          sum(filtered$Q_UCRadioPreset == '2', na.rm = TRUE)
      )
  ) %>%
  group_by(SalesPerson_SID)
It has been running for a couple of hours. The filtered data frame has 540,000 rows and 35 variables.
Here is code to reproduce some sample data:
structure(list(EventType = c("001", "001", "001", "001", "001",
"001"), `Survey Type` = c("Sales", "Sales", "Sales", "Sales",
"Sales", "Sales"), ModelYear = c(2018, 2019, 2018, 2018, 2018,
2018), PurchaseDate = c(20181209, 20181216, 20181209, 20181215,
20181218, 20181218), `ZoneCode (BC)` = c("32", "71", "71", "51",
"63", "74"), SalesDistrict = c("G", "D", "G", "C", "T", "G"),
SalesGroupSize = c("E", "E", "B", "D", "D", "B"), DealerCode = c("60698",
"45622", "69319", "36277", "44107", "26922"), Q1 = c(9, 8,
10, 10, 10, 9), Q2 = c(9, 10, 10, 10, 10, 9), Q3 = c(8, 10,
10, 10, 10, 9), Q6N_srvsls_Recommend_10Pt = c(9, 10, 10,
10, 10, 9), Q5N1_ovrsls = c(8, 10, 10, 10, 10, 8), Q5N2_ovrfin = c(9,
10, 10, 10, 10, 7), Q5N3_ovrdlv = c(8, NA, 10, 10, 10, 6),
Q5N4_srvsls_facility = c(9, 10, 10, 10, 10, 10), Q7N1_slsneeds = c(9,
10, 10, 10, 10, 9), Q7N2_slsfeat = c(9, 10, 10, 10, 10, 9
), Q7N3_slsprof = c(10, 10, 10, 10, 10, 9), Q8N1_finneg = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), Q8N2_finease = c(9,
10, 10, 9, 10, 7), Q8N3_fintime = c(9, 10, 10, 10, 10, 10
), Q8N4_finhon = c(9, 10, 10, 10, 10, 9), Q9 = c(0, 0, 0,
0, 0, 0), SlsValued = c(9, 10, 10, 10, 10, 8), SlsTrustworthy = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), SlsPaperwork = c(NA,
3, 2, 2, 2, NA), `SlsF&ITransaction` = c(3, 2, 2, 3, 1, 4
), SalesPerson_SID = c("S39547M", "S56830O", "S35478Q", "S61788P",
"S35680B", "S75254K"), Q_UCPairing = c(1, 1, 1, 1, 1, 1),
Q_UCDemonstrate = c(1, 1, 1, 1, NA, 1), Q_UCFreeTrials = c(1,
1, 1, 1, 1, 1), Q_UCRadioPreset = c(1, 1, 1, 2, 1, 1)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L), .Names = c("EventType",
"Survey Type", "ModelYear", "PurchaseDate", "ZoneCode (BC)",
"SalesDistrict", "SalesGroupSize", "DealerCode", "Q1", "Q2",
"Q3", "Q6N_srvsls_Recommend_10Pt", "Q5N1_ovrsls", "Q5N2_ovrfin",
"Q5N3_ovrdlv", "Q5N4_srvsls_facility", "Q7N1_slsneeds", "Q7N2_slsfeat",
"Q7N3_slsprof", "Q8N1_finneg", "Q8N2_finease", "Q8N3_fintime",
"Q8N4_finhon", "Q9", "SlsValued", "SlsTrustworthy", "SlsPaperwork",
"SlsF&ITransaction", "SalesPerson_SID", "Q_UCPairing", "Q_UCDemonstrate",
"Q_UCFreeTrials", "Q_UCRadioPreset"))
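One thing that may explain the run time: inside summarise(), an expression such as filtered$Q_UCPairing refers to the whole column of the ungrouped data frame, not to the rows of the current group. Each group then scans all 540,000 rows several times, and every salesperson ends up with the same overall proportion. A minimal sketch of the per-group version follows. It assumes dplyr is installed; the toy data frame and the share_yes() helper are hypothetical names made up for this illustration, not part of the original code.

```r
library(dplyr)

# Hypothetical toy data (not the real survey data): two salespeople,
# one score column and one 1/2-coded column with a missing value.
toy <- tibble(
  SalesPerson_SID = c("A", "A", "A", "B", "B"),
  Q1              = c(9, 10, NA, 8, 6),
  Q_UCPairing     = c(1, 2, 1, 1, NA)
)

# Hypothetical helper: share of "1" answers among the "1"/"2" answers.
# NA and any other codes are left out of both numerator and denominator.
share_yes <- function(x) {
  sum(x == 1, na.rm = TRUE) / sum(x %in% c(1, 2))
}

toy_scores <- toy %>%
  group_by(SalesPerson_SID) %>%
  summarise(
    Brand_Advocacy = mean(Q1, na.rm = TRUE),
    # Bare column name: evaluated on the current group's rows only,
    # unlike toy$Q_UCPairing, which would be the full column.
    U_Pairing      = share_yes(Q_UCPairing)
  )

print(toy_scores)
```

Applied to the original code, the same idea would mean dropping the filtered$ prefix in the U_Pairing, U_Demonstrate, U_FreeTrials and U_Presets expressions. The trailing group_by(SalesPerson_SID) is probably not needed either, since the result already has one row per salesperson.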