Calculating and writing column means and standard deviations in R

Time: 2016-07-11 16:04:58

Tags: r csv mean

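I am currently working with data in a *.csv file. I already have a working script that plots my data, but I am stuck on what looks like the simplest of tasks. I am trying to write a script that takes my data (arranged in columns), calculates the mean of each column, and writes the result to a new document (./testAVG).

In addition, I want to take the same data, calculate the SD (per column), and append it to the end of the original document (ideally repeated once for each data row I have).

Here is my script so far: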

#Number of lines with data 
Nlines = 5
#Number of lines to skip
Nskip = 0

chem <- read.table("./test.csv", skip=Nskip, sep=",", col.names = c("Sample", "SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O", "Total", "eSiO2", "eAl2O3", "eFeO", "eMgO", "eCaO", "eNa2O", "eK2O"), fill=TRUE, header = TRUE, nrow=Nlines)

sd1 <- sd(chem$SiO2)
sd2 <- sd(chem$Al2O3)
sd3 <- sd(chem$FeO)
sd4 <- sd(chem$MgO)
sd5 <- sd(chem$CaO)
sd6 <- sd(chem$Na2O)
sd7 <- sd(chem$K2O)

avg1 <- colMeans(chem$SiO2, na.rm = FALSE, dims=1)
avg2 <- colMeans(chem$Al2O3, na.rm = FALSE, dims=1)
avg3 <- colMeans(chem$FeO, na.rm = FALSE, dims=1)
avg4 <- colMeans(chem$MgO, na.rm = FALSE, dims=1)
avg5 <- colMeans(chem$CaO, na.rm = FALSE, dims=1)
avg6 <- colMeans(chem$Na2O, na.rm = FALSE, dims=1)
avg7 <- colMeans(chem$K2O, na.rm = FALSE, dims=1)

write <- write.table(sd1,sd2,sd3,sd4,sd5,sd6,sd7, file="./test.csv", append=TRUE, sep=",", dec=".", col.names = c("eSiO2", "eAl2O3", "eFeO", "eMgO", "eCaO", "eNa2O", "eK2O"))

write <- write.table(avg1, avg2, avg3, avg4, avg5, avg6, avg7, file="./testAVG.csv", append=FALSE, sep=",", dec=".", col.names = c("Sample", "SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O", "Total"))

The data I am working with is:

Sample, SiO2, Al2O3, FeO, MgO, CaO, Na2O, K2O, Total,eSiO2,eAl2O3,eFeO,eMgO,eCaO,eNa2O,eK2O
01,65.01,14.77,0.34,1.31,17.27,1.14,0.2,100,,,,,,,
02,72.6,16.27,0.53,0.06,1.27,5.55,3.71,100,,,,,,,
03,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,,
04,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,,

I get this error:

Error in colMeans(chem$SiO2, na.rm = FALSE, dims = 1) : 
  'x' must be an array of at least two dimensions

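Any suggestions? Thanks.

1 answer:

Answer 0 (score: 1)

The comments already hint at how to do this, but you seem to be new to R, so let me show you explicitly how to do it better, using the mtcars dataset: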

df <- mtcars

df_sd <- apply(df, 2, sd) # this is how to use apply. See ?apply
df_avg <- colMeans(df)    # this is how to use colMeans. See ?colMeans

write.table(df_sd, file="test.csv")     # no assignment necessary.
write.table(df_avg, file="testAVG.csv") # writing the file is a desired side effect...
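
Both apply(df, 2, sd) and colMeans(df) return a named numeric vector with one entry per column. Written as-is, write.table puts each value on its own row, with the column name as the row label. If a single row with the column names as the header is preferred, a sketch along these lines may help. The output path is a temporary placeholder chosen for illustration, not something from the question.

```r
df <- mtcars

# Named numeric vectors, one entry per column of df
df_sd  <- sapply(df, sd)
df_avg <- colMeans(df)

# t() turns the named vector into a 1-row matrix, so the names become
# the CSV header instead of row labels
out_file <- tempfile(fileext = ".csv")  # placeholder path (assumption)
write.table(t(df_avg), file = out_file, sep = ",",
            row.names = FALSE, col.names = TRUE)

# Read the file back to inspect the layout
check <- read.csv(out_file)
```

The same t() step would apply to df_sd if the standard deviations should also be stored as one row.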

Also, consider the following line:

avg1 <- colMeans(chem$SiO2, na.rm = FALSE, dims=1)

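A nice thing about colMeans is that it calculates the means of many columns at once. Here you only supply a single vector, chem$SiO2, which has no columns; that is why R reports that 'x' must be an array of at least two dimensions. If a single mean is really what you want, you only need to write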

avg1 <- mean(chem$SiO2)
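
Applied to the data from the question, the same idea could look like the sketch below. It is one possible reading of the question, not a definitive solution: it assumes the empty e* columns are meant to hold the per-column SD repeated on every data row. The data is inlined so the example runs on its own, and the vector name oxides and the output file test_with_sd.csv are made up for illustration.

```r
# The question's data, inlined so the sketch is self-contained
csv_text <- "Sample, SiO2, Al2O3, FeO, MgO, CaO, Na2O, K2O, Total,eSiO2,eAl2O3,eFeO,eMgO,eCaO,eNa2O,eK2O
01,65.01,14.77,0.34,1.31,17.27,1.14,0.2,100,,,,,,,
02,72.6,16.27,0.53,0.06,1.27,5.55,3.71,100,,,,,,,
03,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,,
04,64.95,14.65,0.18,1.29,17.48,1.21,0.23,100,,,,,,,"

# strip.white drops the spaces after the commas;
# Sample is read as text so that "01" keeps its leading zero
chem <- read.csv(text = csv_text, strip.white = TRUE,
                 colClasses = c(Sample = "character"))

# Columns to summarise (hypothetical helper vector)
oxides <- c("SiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O")

avg <- colMeans(chem[oxides])    # one mean per column
sds <- sapply(chem[oxides], sd)  # one SD per column

# Fill each e<oxide> column with that oxide's SD, repeated for every row
for (ox in oxides) {
  chem[[paste0("e", ox)]] <- rep(sds[[ox]], nrow(chem))
}

# Write to a temporary directory rather than overwriting the input file
out_dir <- tempdir()
write.csv(t(avg), file.path(out_dir, "testAVG.csv"), row.names = FALSE)
write.csv(chem, file.path(out_dir, "test_with_sd.csv"), row.names = FALSE)
```

Passing the data frame subset chem[oxides] instead of a single column is what lets colMeans work here, since the subset still has two dimensions.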