Consider a simple data frame like the following:
id area1feature1 area1feature2 area2feature1 area2feature2
1 1 2 3 4
2 3 6 1 5
Now I want to sum feature1 across all areas, feature2 across all areas, and so on, and then create new columns sumOfFeature1, sumOfFeature2, etc.
So the expected output looks like this:
id area1feature1 area1feature2 area2feature1 area2feature2 sumOfFeature1 sumOfFeature2
1 1 2 3 4 4 6
2 3 6 1 5 4 11
How can I match columns by substring and then combine them to create new columns in the data frame?
Answer 0 (score: 0)
The way I did it is as follows.
Let input be the data frame.
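features_to_be_combined <- c('feature1', 'feature2')
# simplify = FALSE keeps a named list even if the features match different numbers of columns
locations <- sapply(features_to_be_combined, grep, colnames(input), simplify = FALSE)
feature1_locations <- locations[['feature1']]
# Start from zero and accumulate the row-wise sum over every matching column
sumOfFeature1 <- rep(0, nrow(input))
for (i in seq_along(feature1_locations)) {
  sumOfFeature1 <- sumOfFeature1 + input[, feature1_locations[i]]
}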
Now all that remains is to repeat the same procedure for feature2 and then add the newly created features, namely sumOfFeature1 and sumOfFeature2, to the input data frame.
I am sure there is a better way to do this (maybe using apply again on the combined features), but this worked for me as expected.
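One possible shorter variant, offered as a sketch rather than a definitive answer: loop over the feature names and let rowSums do the per-row addition. It assumes every column name ends with its feature name (as in the example data) and that the matched columns are numeric. The data frame is rebuilt here from the question, and original_cols, matching, and new_name are illustrative names that do not appear in the original code.

```r
# Example data copied from the question.
input <- data.frame(
  id = c(1, 2),
  area1feature1 = c(1, 3),
  area1feature2 = c(2, 6),
  area2feature1 = c(3, 1),
  area2feature2 = c(4, 5)
)

features_to_be_combined <- c("feature1", "feature2")

# Remember the original names so newly added sum columns are never matched.
original_cols <- colnames(input)

for (feature in features_to_be_combined) {
  # Assumption: column names end with the feature name, e.g. "area1feature1".
  matching <- grep(paste0(feature, "$"), original_cols, value = TRUE)

  # Build the new column name: "feature1" -> "sumOfFeature1".
  new_name <- paste0("sumOf", toupper(substring(feature, 1, 1)),
                     substring(feature, 2))

  # drop = FALSE keeps a data frame even when only one column matches.
  input[[new_name]] <- rowSums(input[, matching, drop = FALSE])
}

print(input)
```

Anchoring the pattern with "$" keeps feature1 from also matching a column such as area1feature10, which a plain substring match would pick up.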