How to run a for loop over a two-column range (or use another approach)

Asked: 2019-02-01 19:54:07

Tags: r for-loop matrix

I have a for loop that I use for some web scraping. For example, suppose it is collecting historical stock data.

start <- 1533103200
end <- 1549004400

company <- c("fb","amzn","f")

for (i in company){
    print(paste('https://finance.yahoo.com/quote/',i, '/history?period1=',start,'&period2=',end,'&interval=1d&filter=history&frequency=1d',sep=""))
}

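start and end are date codes. Now I have a data.frame of start and end date codes (100 days apart) that I also want to feed into the list of printed links. That means I need three links for each row of the data.frame below, rather than three links in total. In this example, that would be 6 links...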

start <- c(1533193200,1541833200)
end <- c(1541746800,1549004400)
dates <- as.data.frame(cbind(start,end))

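The list is dynamic and long, so I may have to nest the for loop inside another for loop, but I don't have much experience doing this with two variables. Any help would be great!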

The expected result would be...

[1] "https://finance.yahoo.com/quote/fb/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/amzn/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/f/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/fb/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/amzn/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/f/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"

...rather than the result of the first loop...

[1] "https://finance.yahoo.com/quote/fb/history?period1=1533103200&period2=1548918000&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/amzn/history?period1=1533103200&period2=1548918000&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/f/history?period1=1533103200&period2=1548918000&interval=1d&filter=history&frequency=1d"

2 Answers:

Answer 0: (score: 0)

You need to iterate over both the companies AND the dates.

start <- c(1533193200,1541833200)
end <- c(1541746800,1549004400)
dates <- as.data.frame(cbind(start,end))

companies <- c("fb","amzn","f")

string <- 'https://finance.yahoo.com/quote/%s/history?period1=%s&period2=%s&interval=1d&filter=history&frequency=1d'

for (company in companies) {
  for (i in seq_len(nrow(dates))) {
    # use a separate row index so the loop variable is not overwritten
    print(sprintf(string, company, dates$start[i], dates$end[i]))
  }
}

[1] "https://finance.yahoo.com/quote/fb/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/fb/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/amzn/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/amzn/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/f/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[1] "https://finance.yahoo.com/quote/f/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
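
If the links are needed later (for example, to hand them to the scraper), it may be more convenient to collect them in a character vector than to print them. The following is only a sketch of that idea: `build_urls` and `template` are hypothetical names, not part of the original code. It relies on `sprintf` being vectorised, and it repeats the date rows so that they vary slowest, which should reproduce the order shown in the question's expected output.

```r
start <- c(1533193200, 1541833200)
end <- c(1541746800, 1549004400)
dates <- data.frame(start = start, end = end)
companies <- c("fb", "amzn", "f")

template <- "https://finance.yahoo.com/quote/%s/history?period1=%.0f&period2=%.0f&interval=1d&filter=history&frequency=1d"

# Hypothetical helper: one URL per (date row, company) pair.
# Date rows vary slowest, companies vary fastest.
build_urls <- function(companies, dates) {
  row_idx <- rep(seq_len(nrow(dates)), each = length(companies))
  company <- rep(companies, times = nrow(dates))
  sprintf(template, company, dates$start[row_idx], dates$end[row_idx])
}

urls <- build_urls(companies, dates)
print(urls)
```

Because the helper takes the companies and the date table as arguments, the same call should work unchanged when the list of date ranges grows.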

Answer 1: (score: 0)

I simplified the structure of your data.frame:

df <- data.frame(
  start = c(1533193200, 1541833200),
  end = c(1541746800, 1549004400)
)

Then I would add a new column to the data.frame for each company:

companies <- c("fb", "amzn", "f")
df[, companies] <- ""

Now you can loop over the new columns and fill them with links:

for (i in companies) {
  # df$start and df$end are vectors, so each new column gets one link per row
  df[, i] <- paste0(
    'https://finance.yahoo.com/quote/',
    i, '/history?period1=',
    df$start,
    '&period2=',
    df$end,
    '&interval=1d&filter=history&frequency=1d')
}

This gives you the links for each company in a separate column of the data.frame:

> df
       start        end
1 1533193200 1541746800
2 1541833200 1549004400
                                                                                                                        fb
1 https://finance.yahoo.com/quote/fb/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d
2 https://finance.yahoo.com/quote/fb/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d
                                                                                                                        amzn
1 https://finance.yahoo.com/quote/amzn/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d
2 https://finance.yahoo.com/quote/amzn/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d
                                                                                                                        f
1 https://finance.yahoo.com/quote/f/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d
2 https://finance.yahoo.com/quote/f/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d

If you like, you can "tidy" this so that the links are in one column and the meta-information about each link (start, end, company) is in the other columns:

df_tidy <- tidyr::gather(df, company, url, -start, -end)

> df_tidy$url
[1] "https://finance.yahoo.com/quote/fb/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[2] "https://finance.yahoo.com/quote/fb/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
[3] "https://finance.yahoo.com/quote/amzn/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[4] "https://finance.yahoo.com/quote/amzn/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
[5] "https://finance.yahoo.com/quote/f/history?period1=1533193200&period2=1541746800&interval=1d&filter=history&frequency=1d"
[6] "https://finance.yahoo.com/quote/f/history?period1=1541833200&period2=1549004400&interval=1d&filter=history&frequency=1d"
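
A long table with one row per company and date range can also be built directly in base R, without the intermediate wide columns. The sketch below assumes a cross join is acceptable here: `merge()` with `by = NULL` pairs every company with every date row. The names `links`, `company` and `url` are illustrative and do not come from the original answer.

```r
df <- data.frame(
  start = c(1533193200, 1541833200),
  end = c(1541746800, 1549004400)
)
companies <- c("fb", "amzn", "f")

# Cross join: with by = NULL, merge() returns the Cartesian product,
# i.e. every company combined with every (start, end) row.
links <- merge(
  data.frame(company = companies, stringsAsFactors = FALSE),
  df,
  by = NULL
)

# paste0() is vectorised, so this fills one URL per row of the long table.
links$url <- paste0(
  "https://finance.yahoo.com/quote/", links$company,
  "/history?period1=", links$start,
  "&period2=", links$end,
  "&interval=1d&filter=history&frequency=1d"
)

print(links[, c("company", "start", "end")])
```

Keeping `start`, `end` and `company` next to each `url` makes it easier to label the scraped data afterwards.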