Using a string variable in a double loop in R

Date: 2016-02-01 15:31:01

Tags: r loops

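I need to loop over some data that involves two variables: season and school. If I hold the school fixed (below), I can get it to loop over the seasons I specify: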

library(XML)

# parameters
first_season <- 2014
last_season <- 2015

# seasons 
num_seasons <- as.numeric(last_season - first_season + 1)
seasons <- seq(first_season, last_season, by=1)

# defense
defense <- data.frame()
for (i in 1:num_seasons) {
  url <- paste("http://www.sports-reference.com/cfb/schools/wisconsin/", seasons[i], ".html", sep = "") 
  df <- readHTMLTable(url,which=4, header=FALSE, stringsAsFactors=F)
  df$season = seasons[i]
  defense <- rbind(defense, df)
  rm(df)
  print(seasons[i])
}

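My problem is that I don't know how to a) add a second parameter to loop over, and b) handle that parameter if it is non-numeric.

My list of schools is in the table/column colleges$school ^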

> head(colleges$school)
[1] "Air Force"           "Akron"              
[3] "Alabama"             "Alabama-Birmingham" 
[5] "Alameda Coast Guard" "Alcorn State" 

^ The URL will always use lower(colleges$school) with hyphens substituted in, but I can handle that.

Thanks in advance!

1 Answer:

Answer 0: (score: 0)

I'm not sure I understand (b). Do you mean passing the parameter (e.g. school[j]) or storing the data (e.g. seasons[i])?
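If (b) is about looping over values that are not numbers, here is a minimal sketch of that part on its own. A for loop can index a character vector the same way it indexes seasons, and paste0() accepts the string directly. The to_slug helper and its space-to-hyphen rule are assumptions for illustration, based on the note in the question about the URL form; the school names come from the head() output above.

```r
# School names taken from the head(colleges$school) output in the question
schools <- c("Air Force", "Akron", "Alabama")

# Hypothetical helper: lowercase, then swap spaces for hyphens (assumed URL rule)
to_slug <- function(x) gsub(" ", "-", tolower(x), fixed = TRUE)

slugs <- to_slug(schools)
urls <- character(length(slugs))
for (j in seq_along(slugs)) {
  # slugs[j] is a string; it is pasted into the URL like any other value
  urls[j] <- paste0("http://www.sports-reference.com/cfb/schools/", slugs[j], "/2014.html")
}
print(urls)
```

The index j is still numeric; only the value it picks out is a string, so nothing special is needed in the loop itself.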

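What I did was add an outer loop that iterates over the colleges. I store the results in a new df called school_defense. I don't have your list of schools, so I couldn't test it.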

library(XML)

# parameters
first_season <- 2014
last_season <- 2015

# seasons 
num_seasons <- as.numeric(last_season - first_season + 1)
seasons <- seq(first_season, last_season, by=1)

# schools
# tolower() is the base R function (there is no lower()); per the question,
# the URL form may also need hyphens substituted
schools <- unique(tolower(colleges$school))

# defense
school_defense <- data.frame()
for (j in seq_along(schools)) {
  defense <- data.frame()
  for (i in 1:num_seasons) {
    url <- paste("http://www.sports-reference.com/cfb/schools/", schools[j], "/", seasons[i], ".html", sep = "")
    df <- readHTMLTable(url, which = 4, header = FALSE, stringsAsFactors = FALSE)
    df$season <- seasons[i]
    defense <- rbind(defense, df)
    rm(df)
    print(seasons[i])
  }
  # tag every row with its school, then append to the running result
  defense <- data.frame(school = rep(schools[j], nrow(defense)), defense)
  school_defense <- rbind(school_defense, defense)
}
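
As an alternative to nesting the loops, here is a rough sketch that builds every (school, season) pair up front with expand.grid() and collects the pieces with lapply(). The fetch_defense function is a hypothetical stand-in for the readHTMLTable() call so the sketch runs without hitting the site, and the school slugs and the player/tackles columns are made up for illustration.

```r
# Made-up inputs for illustration; in the question these come from colleges$school
schools <- c("wisconsin", "air-force")
seasons <- 2014:2015

# Hypothetical stand-in for readHTMLTable(url, which = 4, ...) so this runs offline
fetch_defense <- function(school, season) {
  data.frame(player = c("A", "B"), tackles = c(1, 2), stringsAsFactors = FALSE)
}

# One row per (school, season) combination
grid <- expand.grid(school = schools, season = seasons, stringsAsFactors = FALSE)

pieces <- lapply(seq_len(nrow(grid)), function(k) {
  df <- fetch_defense(grid$school[k], grid$season[k])
  df$school <- grid$school[k]   # tag rows with the school
  df$season <- grid$season[k]   # tag rows with the season
  df
})

# Stack all pieces into a single data frame
school_defense <- do.call(rbind, pieces)
print(dim(school_defense))
```

Binding once at the end avoids growing a data frame inside the loop, which can get slow with many schools. With the real site, wrapping the fetch in tryCatch() would probably be wise, since not every school will have a page for every season.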