How do I sum columns in a data frame to create new columns in R, based on a substring of the column names?

Asked: 2015-04-23 03:35:10

Tags: r dataframe

Consider a simple data frame like this:

id area1feature1 area1feature2 area2feature1 area2feature2
1  1             2             3             4
2  3             6             1             5

Now I want to sum feature1 across all areas, feature2 across all areas, and so on, and store the results in new columns sumOfFeature1, sumOfFeature2, etc.

So the expected output is:

id area1feature1 area1feature2 area2feature1 area2feature2 sumOfFeature1 sumOfFeature2
1  1             2             3             4             4             6
2  3             6             1             5             4             11

How can I match columns by a substring of their names and then combine them to create new columns in the data frame?

1 Answer:

Answer 0 (score: 0)

Here is how I did it. Let input be the data frame.

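features_to_be_combined <- c('feature1', 'feature2')
# For each feature name, find the indices of the columns whose names contain it.
# sapply() returns a matrix here only because every feature matches the same
# number of columns; with unequal counts it would return a list instead.
locations <- sapply(features_to_be_combined, grep, colnames(input))
feature1_locations <- locations[, 'feature1']
# Accumulate the row-wise sum over the matched columns.
sumOfFeature1 <- rep(0, nrow(input))
for (i in seq_along(feature1_locations)) {
    sumOfFeature1 <- sumOfFeature1 + input[, feature1_locations[i]]
}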

Now all that remains is to repeat the same procedure for feature2 and then add the newly created columns, sumOfFeature1 and sumOfFeature2, to the input data frame. There is probably a better way to do this (perhaps by using apply again over the combined features), but this worked for me as expected.
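One possible way to fold the remaining steps into a single loop is sketched below. It is not the method used above: it swaps the manual accumulation for base R's rowSums(), and it assumes that every column to be summed ends with the feature name (hence the `$` anchor in the pattern). The sample data frame is rebuilt from the question, and the new_name helper variable is only illustrative.

```r
# Example data frame from the question.
input <- data.frame(
  id = c(1, 2),
  area1feature1 = c(1, 3),
  area1feature2 = c(2, 6),
  area2feature1 = c(3, 1),
  area2feature2 = c(4, 5)
)

features_to_be_combined <- c("feature1", "feature2")

for (feature in features_to_be_combined) {
  # Assumption: relevant column names end with the feature name,
  # e.g. "area1feature1" and "area2feature1" for "feature1".
  matched <- grep(paste0(feature, "$"), colnames(input), value = TRUE)
  # Build the new column name: "feature1" -> "sumOfFeature1".
  new_name <- paste0("sumOf", toupper(substring(feature, 1, 1)),
                     substring(feature, 2))
  # drop = FALSE keeps a data frame even if only one column matches.
  input[[new_name]] <- rowSums(input[, matched, drop = FALSE])
}

print(input)
```

Because grep() is case-sensitive by default, the newly added sumOfFeature1 column is not picked up when the loop moves on to feature2. If the feature names could appear elsewhere in a column name, the pattern would need to be tightened accordingly.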