Question

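I want to standardize the values in one column of a data frame according to the groups defined by another column of the same data frame.

For example: I have a data frame with some abundance measurements ("abundance"), and I want to standardize them within their class ("a", "b" and "c").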

 class <- c("a","a","a","a","a","b","b","b","b","c","c","c","c")
 abundance <- c(5,54,6,7,876,0,43,54,1,1,34,54,90)

 # stand <- a new column holding the standardized values (the desired result)

Do you know of a function that lets me do this?

Thank you for your time.

Answer 1

Try:

library(data.table)
ddt = data.table(class, abundance)
ddt[,stdz:=(abundance-mean(abundance))/sd(abundance),by=class]
ddt
    class abundance       stdz
 1:     a         5 -0.4803884
 2:     a        54 -0.3528747
 3:     a         6 -0.4777860
 4:     a         7 -0.4751837
 5:     a       876  1.7862328
 6:     b         0 -0.8725918
 7:     b        43  0.6588959
 8:     b        54  1.0506718
 9:     b         1 -0.8369758
10:     c         1 -1.1744878
11:     c        34 -0.2885884
12:     c        54  0.2483203
13:     c        90  1.2147560

You can also use scale, as suggested by @akrun and @BondedDust:

ddt[,stdz:=scale(abundance),by=class]
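The two forms above should give the same numbers, because scale() with its default arguments centers on the mean and divides by the sample standard deviation. A minimal base R sketch of that equivalence, using the class "b" values from the question (the variable names x, z and manual are only illustrative):

```r
# Abundance values for class "b", taken from the question
x <- c(0, 43, 54, 1)

# scale() returns a one-column matrix with centering/scaling attributes
z <- scale(x)
is.matrix(z)

# The explicit z-score formula used in the first data.table call
manual <- (x - mean(x)) / sd(x)

# Dropping the matrix structure shows both give the same values
all.equal(as.numeric(z), manual)
```

Because scale() returns a matrix rather than a plain vector, wrapping it in as.numeric() can be a reasonable precaution when a plain numeric column is wanted.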

Answer 2

You can use stdz from the weights package, which also lets you specify a weight:

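library(weights)
# dat is assumed to be a data frame built from the question's two vectors
dat <- data.frame(class, abundance)
with(dat, ave(abundance, class, FUN=stdz))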
#[1] -0.4803884 -0.3528747 -0.4777860 -0.4751837  1.7862328 -0.8725918
#[7]  0.6588959  1.0506718 -0.8369758 -1.1744878 -0.2885884  0.2483203
#[13]  1.2147560

Or use scale from base R:

with(dat, ave(abundance, class, FUN=scale))
#[1] -0.4803884 -0.3528747 -0.4777860 -0.4751837  1.7862328 -0.8725918
#[7]  0.6588959  1.0506718 -0.8369758 -1.1744878 -0.2885884  0.2483203
#[13]  1.2147560
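To get the stand column the question asks for without any extra packages, something along these lines may work. It is only a sketch: zscore is a hypothetical helper name, and dat is assumed to be built from the question's two vectors.

```r
# Data from the question
class <- c("a","a","a","a","a","b","b","b","b","c","c","c","c")
abundance <- c(5,54,6,7,876,0,43,54,1,1,34,54,90)
dat <- data.frame(class, abundance)

# Hypothetical helper: plain z-score using the sample standard deviation
zscore <- function(x) (x - mean(x)) / sd(x)

# ave() applies the helper within each class and keeps the original row order
dat$stand <- ave(dat$abundance, dat$class, FUN = zscore)

# Sanity check: each class should now have mean ~0 and sd 1
group_means <- tapply(dat$stand, dat$class, mean)
group_sds <- tapply(dat$stand, dat$class, sd)
print(dat)
```

A plain function is used here instead of scale so that the new column is an ordinary numeric vector.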

Standardize a column in a data frame based on a condition in R
