Question

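I have looked around for an answer and have not found a solution.

I am trying to divide multiple (~60) columns of a data frame (species counts) by a single column in the same data frame (units of sampling effort).

I was able to come up with the solution below, but it is messier than I would like. As written, I could accidentally run the last line of code twice and mess up my values by dividing twice.

Below is a short example demonstrating the solution I am using. Any suggestions for something cleaner?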

#short data.frame with some count data
#Hours is the sampling effort


counts=data.frame(sp1=sample(1:10,10),sp2=sample(1:10,10),
         sp3=sample(1:10,10),sp4=sample(1:10,10),
         Hours=rnorm(10,4,1))


#get my 'species' names
names=colnames(counts)[1:4]

#This seems messy: and if I run the second line twice, I will screw up my values. I want to divide all 'sp' columns by the single 'Hours' column

rates=counts
rates[names]=rates[,names]/rates[,'Hours']

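P.S. I have been using %>% quite a bit, so if anyone has a solution where I can just transform the 'counts' data.frame without creating a new data.frame, that would be swell!

P.P.S. I suspect one of Hadley's functions may have what I need (e.g. mutate_each?), but I haven't been able to figure it out.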

Answer 1

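I really don't see anything wrong with your base R approach; it's very clean. If you're worried about accidentally running the second line multiple times without rerunning the first, just reference the original counts columns, as below. I would make one small adjustment, like this: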

rates = counts
rates[names] = counts[names] / counts[["Hours"]]

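Using [ and [[ guarantees the data types regardless of the length of names: counts[names] is always a data frame and counts[["Hours"]] is always a vector, whereas counts[, names] drops to a plain vector when names has length one.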
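A minimal sketch of both points, using a smaller toy data frame than the one in the question (the values and the single-name vector `one` are made up for illustration):

```r
# Toy data with the same shape as the question's example (values are arbitrary)
set.seed(1)
counts <- data.frame(sp1 = sample(1:10, 10), sp2 = sample(1:10, 10),
                     Hours = rnorm(10, 4, 1))

# With a single column name, [, name] drops to a plain vector...
one <- "sp1"
class(counts[, one])       # "integer"
# ...while [name] always stays a data.frame and [[name]] is always a vector
class(counts[one])         # "data.frame"
class(counts[["Hours"]])   # "numeric"

# Reading from `counts` rather than `rates` makes the line safe to rerun
rates <- counts
rates[one] <- counts[one] / counts[["Hours"]]
first_run <- rates
rates[one] <- counts[one] / counts[["Hours"]]   # accidental second run
identical(first_run, rates)                     # TRUE
```

Because the right-hand side only ever reads from `counts`, rerunning the assignment recomputes the same rates instead of dividing a second time.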

I like dplyr, but this seems messier:

# This works if you want everything except the Hours column
rates = counts %>% mutate_each(funs(./Hours), vars = -Hours)

# This sort of works if you want to use the names vector
rates = counts %>% mutate_at(funs(./Hours), .cols = names)
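For readers on a newer dplyr: mutate_each() and funs() have since been deprecated. A rough equivalent, assuming dplyr 1.0.0 or later where across() is available, might look like the sketch below. The vector name `sp_cols` is hypothetical, chosen here only to avoid shadowing base::names, and the toy data is made up.

```r
library(dplyr)  # assumes dplyr >= 1.0.0, where across() is available

set.seed(1)
counts <- data.frame(sp1 = sample(1:10, 10), sp2 = sample(1:10, 10),
                     Hours = rnorm(10, 4, 1))
sp_cols <- c("sp1", "sp2")  # hypothetical name for the species columns

# Divide every selected species column by Hours, row by row
rates <- counts %>%
  mutate(across(all_of(sp_cols), ~ .x / Hours))

# Compare against the base R version from earlier in this answer
base_rates <- counts
base_rates[sp_cols] <- counts[sp_cols] / counts[["Hours"]]
all.equal(rates, base_rates)
```

Since the pipeline starts from `counts` each time, it can be rerun safely, and assigning the result back to `counts` itself would transform the data frame in place if that is preferred.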
