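How to apply a function to specific data frame columns stored in a list

Time: 2017-01-13 22:35:51

Tags: r

I have a set of CSV files that share the same header but contain records for different days, like this: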

    arrtime  interaatime  serTime  DepTime  ActSerTime
    8.37     0            8.39     8.40     0.01
    8.39     2            8.40     8.42     0.02
    8.40     1            8.42     8.46     0.04

I import the files with this code:

    #Scripts to load all the csv files of the queue data from day 1 to 13
    path <- "/home/ilanre/Documents/Queue Research Work/"
    files <- list.files(path=path, pattern="*.csv")
    for(file in files)
    {
      perpos <- which(strsplit(file, "")[[1]]==".")
      assign(
      gsub(" ","",substr(file, 1, perpos-1)), 
      read.csv(paste(path,file,sep="")))
    }

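I want to use the lapply() function to find the sums of columns 2 and 5 for each day and store everything in a list or a data frame. Thanks.

2 Answers:

Answer 0: (score: 0)

Using the path and files objects you have already created, the following builds a list that holds, for each day, the sum of the second column and the sum of the fifth column.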

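    list_of_sums <- lapply(files, function(file) {
      # files holds bare file names, so prepend path before reading
      df <- read.csv(paste0(path, file))
      # sums of the second and fifth columns for this day
      c(sum(df[, 2]), sum(df[, 5]))
    })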
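
For anyone who wants to try this without the original files, here is a minimal self-contained sketch of the same lapply() pattern. The temporary directory, the file names day1.csv and day2.csv, and the second day's values are all made up for illustration; only the column names and the first day's rows come from the question.

```r
# Hypothetical demo data: two small CSV files written to a temporary directory,
# using the column names from the question.
demo_dir <- file.path(tempdir(), "queue_demo")
dir.create(demo_dir, showWarnings = FALSE)

day1 <- data.frame(arrtime     = c(8.37, 8.39, 8.40),
                   interaatime = c(0, 2, 1),
                   serTime     = c(8.39, 8.40, 8.42),
                   DepTime     = c(8.40, 8.42, 8.46),
                   ActSerTime  = c(0.01, 0.02, 0.04))
day2 <- transform(day1, interaatime = c(0, 3, 4))  # made-up second day

write.csv(day1, file.path(demo_dir, "day1.csv"), row.names = FALSE)
write.csv(day2, file.path(demo_dir, "day2.csv"), row.names = FALSE)

# full.names = TRUE returns complete paths, so no paste0() is needed later
demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# One named numeric vector per file: sums of columns 2 and 5
sums <- lapply(demo_files, function(f) {
  df <- read.csv(f)
  c(col2 = sum(df[, 2]), col5 = sum(df[, 5]))
})

# Label each list element with its file name minus the extension
names(sums) <- tools::file_path_sans_ext(basename(demo_files))
```

Indexing by position (df[, 2], df[, 5]) only works if every file has the same column order; indexing by name, as in the next answer, is more robust if that is not guaranteed.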

Answer 1: (score: 0)

Consider calling data.frame() inside lapply() to bind the sums of the two columns:

    path <- "/home/ilanre/Documents/Queue Research Work/"
    # pattern is a regular expression, not a glob; "\\.csv$" matches names ending in .csv
    files <- list.files(path=path, pattern="\\.csv$")

    dfList <- lapply(files, function(f){
         df <- read.csv(paste0(path, f))
         data.frame(SumOfInteraatime = sum(df$interaatime), SumOfActSerTime = sum(df$ActSerTime))
    })

    # name each element after its file, dropping the .csv extension
    dfList <- setNames(dfList, sub("\\.csv$", "", files))
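
If a single data frame is preferred over a list, the one-row data frames can be stacked with do.call(rbind, ...). The sketch below uses a made-up stand-in for dfList so that it runs on its own; the names day1 and day2 and the values are assumptions, not taken from the real files.

```r
# Stand-in for the dfList built above: a named list of one-row data frames
# (names and values here are made up for illustration).
dfList <- list(
  day1 = data.frame(SumOfInteraatime = 3, SumOfActSerTime = 0.07),
  day2 = data.frame(SumOfInteraatime = 7, SumOfActSerTime = 0.07)
)

# Stack the one-row data frames; the list names become the row names.
sumsDF <- do.call(rbind, dfList)

# Optionally move the row names into a regular column.
sumsDF <- data.frame(day = rownames(sumsDF), sumsDF, row.names = NULL)
```

This gives one row per day, which is usually easier to plot or summarise than a list of separate data frames.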