I have many CSV files stored in one folder: file1.csv, file2.csv, file3.csv, and so on. Each CSV file contains the same measurements for one object. A file looks like this:
ID time measurement1 measurement2 measurement3
1 5 12 324 123
1 6 123 654 45
1 3 346 556 548
Another looks like this:
ID time measurement1 measurement2 measurement3
2 2 234 345 253
2 8 35 998 316
2 17 515 1005 323
2 50 156 155 616
and so on. In addition, I have a data frame in which I want to perform several calculations for each object (file), like this:
calc<- data.frame(mean1 = mean(measurement1), var1 = var(measurement1),
sd1 = sd(measurement1), mean2 = mean(measurement2), var2 = var(measurement2),
sd2 = sd(measurement2))
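and so on. What I want is a way to iteratively read each CSV file and perform these calculations for each object. In the end, I would like to export the results to a single CSV file (so that I have the information I need in one place), or print them in the R console and copy them from there into a text or Excel file. I am working in R. Can anyone offer any help? Thanks!
Answer 0 (score: 2)
Something like this: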
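library(dplyr)

# Read every CSV in the working directory into a named list of data frames
dat = sapply(list.files(pattern="csv$"), function(file) {
  df = read.csv(file, stringsAsFactors=FALSE, header=TRUE)
  df$source = file   # record which file each row came from
  df
}, simplify=FALSE)

# Stack the list into a single data frame
dat = bind_rows(dat)

With this approach, all of the data files are loaded into an R list and then combined into one data frame.

Summarise by ID:

# -source keeps the non-numeric file-name column out of the summary
dat.summary = dat %>% group_by(ID) %>%
  summarise_each(funs(mean(., na.rm=TRUE), var(., na.rm=TRUE), sd(., na.rm=TRUE)), -time, -source)

Or, in the newer dplyr idiom:

dat.summary = dat %>% group_by(ID) %>%
  summarise_at(vars(matches("measurement")),
               funs(mean(., na.rm=TRUE), var(., na.rm=TRUE), sd(., na.rm=TRUE)))

Alternatively, summarise each file as it is read. This way, only one data file is loaded into memory at a time:

# simplify=FALSE keeps the result a list of data frames so bind_rows can stack it
dat.summary = sapply(list.files(pattern="csv$"), function(file) {
  df = read.csv(file, stringsAsFactors=FALSE, header=TRUE)
  # Summarise by ID
  df %>% group_by(ID) %>%
    summarise_at(vars(matches("measurement")),
                 funs(mean(., na.rm=TRUE), var(., na.rm=TRUE), sd(., na.rm=TRUE)))
}, simplify=FALSE)
dat.summary = bind_rows(dat.summary)

Either way, the summary can then be written to a single CSV file:

write.csv(dat.summary, "my_summary.csv", row.names=FALSE)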
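
The funs() helper used above has since been deprecated in dplyr, and summarise_at() has been superseded by across(). As a rough sketch only, assuming dplyr 1.0.0 or later, the one-file-at-a-time approach might look like the following. The function name summarise_folder and its dir argument are made up for illustration and are not part of the original answer.

```r
library(dplyr)

# Hypothetical helper: summarise every CSV in `dir` by ID, one file at a time
summarise_folder <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  per_file <- lapply(files, function(file) {
    read.csv(file, stringsAsFactors = FALSE) %>%
      group_by(ID) %>%
      summarise(
        # one mean/var/sd column per measurement column,
        # named e.g. measurement1_mean, measurement1_var, measurement1_sd
        across(matches("measurement"),
               list(mean = ~mean(.x, na.rm = TRUE),
                    var  = ~var(.x, na.rm = TRUE),
                    sd   = ~sd(.x, na.rm = TRUE))),
        .groups = "drop"
      )
  })
  bind_rows(per_file)
}

# Usage (assumes the data files sit in a folder named "data"):
# dat.summary <- summarise_folder("data")
# write.csv(dat.summary, "my_summary.csv", row.names = FALSE)
```

Writing the summary outside the data folder avoids it being picked up as an input file on the next run.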
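Answer 1 (score: 2)
Alex, this is a multi-step process.
Here is what I would do:
Step 1: Read all the files using the read.csv function.
csv1<-read.csv("1.csv")
csv2<-read.csv("2.csv")
csv3<-read.csv("3.csv")
Step 2: You need to combine them into a single data frame.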
csv1$type<-"1"
csv2$type<-"2"
csv3$type<-"3"
csv<-rbind(csv1, csv2,csv3)
Make sure the columns match; otherwise the last step above (rbind) will throw an error.
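
One way to make that column check explicit is sketched below in base R. The helper name rbind_checked is hypothetical and not part of the original answer; it simply compares each data frame's column names with the first one before calling rbind.

```r
# Hypothetical helper: rbind a list of data frames only if their columns match
rbind_checked <- function(dfs) {
  ref <- names(dfs[[1]])
  # TRUE for each data frame whose column names equal those of the first
  same <- vapply(dfs, function(d) identical(names(d), ref), logical(1))
  if (!all(same)) {
    stop("Column names differ in element(s): ",
         paste(which(!same), collapse = ", "))
  }
  do.call(rbind, dfs)
}

# Usage with the data frames from Step 2:
# csv <- rbind_checked(list(csv1, csv2, csv3))
```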
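Step 3:
Look into how to compute summary statistics with dplyr. There are plenty of examples on SO. I can only help further once I see that you have tried it yourself.
Hope this helps.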