I have a set of csv files that share the same headers but contain records for different dates, for example:
arrtime   interaatime   serTime   DepTime   ActSerTime
8.37      0             8.39      8.40      0.01
8.39      2             8.40      8.42      0.02
8.40      1             8.42      8.46      0.04
I import the files with this code:
# Scripts to load all the csv files of the queue data from day 1 to 13
path <- "/home/ilanre/Documents/Queue Research Work/"
files <- list.files(path=path, pattern="*.csv")
for(file in files)
{
  perpos <- which(strsplit(file, "")[[1]]==".")
  assign(
    gsub(" ","",substr(file, 1, perpos-1)),
    read.csv(paste(path,file,sep="")))
}
I would like to use the lapply() function to find the sum of columns 2 and 5 for each day and store everything in a list or a data frame. Thanks.
Answer 0 (score: 0)
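Using the files object you have already created, the following builds a list that holds, for each day, the sum of the second column and the sum of the fifth column.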
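list_of_sums <- lapply(files, function(file){
  # files holds bare file names, so prepend the folder defined in path
  df <- read.csv(paste0(path, file))
  # index the columns first, then sum each one
  c(sum(df[, 2]), sum(df[, 5]))
})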
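To try the idea without the real queue data, here is a minimal self-contained sketch. The temporary folder, the file names day1.csv and day2.csv, the second day's values, and the labels col2 and col5 are all made up for illustration; only the first data frame mirrors the sample table in the question.

```r
# Hypothetical setup: write two small CSV files into a temporary folder
# so the example does not depend on the real queue data.
demo_dir <- file.path(tempdir(), "queue_demo")
dir.create(demo_dir, showWarnings = FALSE)

day1 <- data.frame(arrtime     = c(8.37, 8.39, 8.40),
                   interaatime = c(0, 2, 1),
                   serTime     = c(8.39, 8.40, 8.42),
                   DepTime     = c(8.40, 8.42, 8.46),
                   ActSerTime  = c(0.01, 0.02, 0.04))
# Made-up second day: same rows with interaatime shifted by 1
day2 <- transform(day1, interaatime = interaatime + 1)

write.csv(day1, file.path(demo_dir, "day1.csv"), row.names = FALSE)
write.csv(day2, file.path(demo_dir, "day2.csv"), row.names = FALSE)

# full.names = TRUE returns complete paths, so read.csv() can open them directly
demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# One named numeric vector per file: sums of columns 2 and 5
demo_sums <- lapply(demo_files, function(f) {
  df <- read.csv(f)
  c(col2 = sum(df[, 2]), col5 = sum(df[, 5]))
})

# Label each list element with its file name minus the extension
names(demo_sums) <- sub("\\.csv$", "", basename(demo_files))
```

Naming the list elements after the files makes it possible to look up a day directly, for example demo_sums$day1.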
Answer 1 (score: 0)
Consider calling data.frame() inside lapply to bind the sums of the two columns:
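path <- "/home/ilanre/Documents/Queue Research Work/"
# pattern is a regular expression, so match a literal ".csv" at the end of the name
files <- list.files(path=path, pattern="\\.csv$")
dfList <- lapply(files, function(f){
  df <- read.csv(paste0(path, f))
  data.frame(SumOfInteraatime = sum(df$interaatime), SumOfActSerTime = sum(df$ActSerTime))
})
# name each element after its file, dropping only the trailing ".csv"
dfList <- setNames(dfList, sub("\\.csv$", "", files))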
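If a single data frame is preferred over a list, the one-row data frames can be stacked with rbind. The sketch below uses a small stand-in dfList with made-up names and values (day1, day2) so that it runs on its own; sumsDF is a hypothetical name.

```r
# Hypothetical stand-in for dfList: one single-row data frame per day
dfList <- list(
  day1 = data.frame(SumOfInteraatime = 3, SumOfActSerTime = 0.07),
  day2 = data.frame(SumOfInteraatime = 5, SumOfActSerTime = 0.10)
)

# Stack the rows into one data frame; the list names become the row names
sumsDF <- do.call(rbind, dfList)
print(sumsDF)
```

With one row per day, a single day's totals can be picked out by row name, for example sumsDF["day2", ].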