Question

In a previous question, I asked whether there was a solution for repetitive processes similar to SAS macro variables. The link is below:

R macros to enable user defined input similar to %let in SAS

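However, I would like to go one step further and explore the possibility of specifying a character list instead of having the user enter the value of each macro variable (I call them macro variables for convenience).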

For example, here is a short excerpt of the code I am working with, which uses the paste0 function together with the macro variables:

### Change metric between MSP, RSP, Val, Price MSP, Price RSP, Margin
### Change level between product and channel
### Change histyr, baseyr, futeyr according to year value
metric <- "RSP"
level <- "channel"
histyr <- "2009"
baseyr <- "2014"
futeyr <- "2019"
macro <- "gni"
macro1 <- "pce"

inputpath <- "C:/Projects/Consumption curves/UKPOV/excel files/"
outpath <- "C:/Projects/Consumption curves/UKPOV/output/gen_output/"


infile <- paste0(inputpath, metric, "_", level, "_CC_", histyr, "_salespercap.csv")
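Category_sales <- read.csv(infile)
# infile2 was not defined in this excerpt; this definition is taken from the loop version below
infile2 <- paste0(inputpath, "macro_", histyr, ".csv")
macroeco_data <- read.csv(infile2)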

macroeco_data$Country <- str_trim(macroeco_data$Country)

sales_nd_macroeco <- sqldf("SELECT  L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
                           FROM Category_sales L LEFT JOIN macroeco_data R
                           ON (L.Country = R.Country) order by GNI_PPP DESC")

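Now, instead of specifying each metric by hand every time, I would like to create a character list or character vector and use a loop to run the code for each metric without manual intervention.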

I tried the following, but it does not seem to work. I am not sure whether I am doing this correctly.

metric <- c("MSP", "RSP", "Vol", "PriceMSP", "PriceRSP", "Margin")

for (i in metric) {
  level <- "channel"
  histyr <- "2009"
  baseyr <- "2014"
  futeyr <- "2019"
  macro <- "gni"
  macro1 <- "pce"
  inputpath <- "C:/Projects/Consumption curves/UKPOV/excel files/"
  outpath <- "C:/Projects/Consumption curves/UKPOV/output/gen_output/"

  infile <- paste0(inputpath, metric[i], "_", level, "_CC_", histyr, "_salespercap.csv")
  Category_sales <- read.csv(infile)


  infile2 <- paste0(inputpath,"macro_",histyr,".csv")
  macroeco_data<- read.csv(infile2)

  macroeco_data$Country<-str_trim(macroeco_data$Country)

  sales_nd_macroeco <- sqldf("SELECT  L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
                           FROM Category_sales L LEFT JOIN macroeco_data R
                           ON (L.Country = R.Country) order by GNI_PPP DESC")
}

The error is as follows:

> metric<-c("MSP","RSP", "Vol","PriceMSP" ,"PriceRSP", "Margin")
> metric
[1] "MSP"      "RSP"      "Vol"      "PriceMSP" "PriceRSP" "Margin"  
> for(i in metric){
+ level<-"channel"
+ histyr<-"2009"
+ baseyr<-"2014"
+ futeyr<-"2019"
+ macro<-"gni"
+ macro1<-"pce"
+ inputpath<-"C:/Projects/Consumption curves/UKPOV/excel files/"
+ outpath<-"C:/Projects/Consumption curves/UKPOV/output/gen_output/"
+ infile <- paste0(inputpath,metric[i],"_",level,"_CC_",histyr,"_salespercap.csv")
+ Category_sales <- read.csv(infile)
+ infile2 <- paste0(inputpath,"macro_",histyr,".csv")
+ macroeco_data<- read.csv(infile2)
+ macroeco_data$Country<-str_trim(macroeco_data$Country)
+ sales_nd_macroeco <- sqldf("SELECT  L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
+ FROM Category_sales L LEFT JOIN macroeco_data R
+ ON (L.Country = R.Country) order by GNI_PPP DESC")
+ }
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'C:/Projects/Consumption curves/UKPOV/excel files/NA_channel_CC_2009_salespercap.csv': No such file or directory
>

Answer 1

Consider the following for loop:

for (x in y) {
  # do things
}

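It iterates over the elements of y, assigning each element in turn to the object x and evaluating the expressions contained in the loop.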

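In your example, for (i in metric), metric is a character vector, and the object i takes the value of each element of metric in turn. That is, the first time through the loop i is "MSP", the second time i is "RSP", and so on. So later, when you refer to metric[i], on the first pass through the loop this is equivalent to metric["MSP"], which is of course NA (because your vector is unnamed). This in turn produces the file name "C:/Projects/Consumption curves/UKPOV/excel files/NA_channel_CC_2009_salespercap.csv".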
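A minimal sketch of this behavior, using only base R and the values from the question. Nothing is read from disk, and the file name is shortened to the part that matters:

```r
metric <- c("MSP", "RSP", "Vol", "PriceMSP", "PriceRSP", "Margin")

# Looping over the vector itself yields the elements, not their positions.
seen <- character(0)
for (i in metric) {
  seen <- c(seen, i)
}

# Indexing an unnamed vector by a character string finds no matching name,
# so the result is NA.
by_name <- metric["MSP"]

# paste0 converts NA to the string "NA", which is how it ends up in the file name.
bad_file <- paste0(by_name, "_channel_CC_2009_salespercap.csv")
print(bad_file)
```
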

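The fact that you refer to metric[i] suggests that you expected the value of i to be an index into metric, that is, a number. To get this behavior, you would typically use the following loop: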

for (i in 1:length(metric)) {
  # do things
}

or, equivalently:

for (i in seq_along(metric)) {
  # do things
}
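As a small illustration of the index-based form, the sketch below builds the file names by position. It also shows one case where the two forms differ: a zero-length vector. The shortened file names and the variable names files, empty, idx_colon and idx_along are made up for this example.

```r
metric <- c("MSP", "RSP", "Vol")

# Iterate over positions so that metric[i] is a valid index lookup.
files <- character(length(metric))
for (i in seq_along(metric)) {
  files[i] <- paste0(metric[i], "_channel_CC_2009_salespercap.csv")
}
print(files)

# With a zero-length vector, 1:length(x) counts down from 1 to 0,
# whereas seq_along(x) is empty and a loop over it never runs.
empty <- character(0)
idx_colon <- 1:length(empty)   # the integers 1 and 0
idx_along <- seq_along(empty)  # integer(0)
```

For a non-empty vector such as metric, both forms produce the same indices.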

The following should do the trick:

library(stringr)  # provides str_trim
library(sqldf)    # provides sqldf

metric <- c('MSP', 'RSP', 'Vol', 'PriceMSP', 'PriceRSP', 'Margin')
inputpath <- 'C:/Projects/Consumption curves/UKPOV/excel files/'
level <- 'channel'
histyr <- '2009'

macroeco_data <- read.csv(paste0(inputpath, 'macro_', histyr, '.csv'))
macroeco_data$Country <- str_trim(macroeco_data$Country)

for (x in metric) {
  f <- paste0(inputpath, x, '_', level, '_CC_', histyr, '_salespercap.csv')
  Category_sales <- read.csv(f)
  sales_nd_macroeco <- sqldf('SELECT  L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
                             FROM Category_sales L LEFT JOIN macroeco_data R
                             ON (L.Country = R.Country) order by GNI_PPP DESC')
}

Note that I have pulled everything out of the loop that does not need to be there.

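Also note that sales_nd_macroeco is overwritten on each pass through the loop, so the final value of this object will correspond to the metric "Margin". To return a list of objects instead, you can either use for (i in seq_along(metric)) to iterate over the indices 1 to 6, assigning the result of sqldf to sales_nd_macroeco[[i]], where sales_nd_macroeco is now an empty list of length 6 that you define beforehand, or you can use lapply: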

sales_nd_macroeco <- lapply(metric, function(x) {
  f <- paste0(inputpath, x, '_', level, '_CC_', histyr, '_salespercap.csv')
  Category_sales <- read.csv(f)
  sqldf('SELECT  L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
         FROM Category_sales L LEFT JOIN macroeco_data R
         ON (L.Country = R.Country) order by GNI_PPP DESC')
})
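The index-based alternative described before the lapply call can be sketched roughly as follows. To keep the example runnable anywhere, it writes small made-up CSV files to a temporary directory and uses base R's merge as a stand-in for the sqldf left join. The demo data, the column names sales and GNI_PPP, and the two-metric vector are assumptions for illustration only, and the ORDER BY step is omitted.

```r
# Hypothetical stand-in data written to a temporary directory.
inputpath <- paste0(tempdir(), "/")
level <- "channel"
histyr <- "2009"
metric <- c("MSP", "RSP")

macroeco_data <- data.frame(Country = c("UK", "France"), GNI_PPP = c(40, 38))
for (x in metric) {
  demo <- data.frame(Country = c("UK", "France", "Spain"), sales = c(1, 2, 3))
  write.csv(demo,
            paste0(inputpath, x, "_", level, "_CC_", histyr, "_salespercap.csv"),
            row.names = FALSE)
}

# Pre-allocate a list with one slot per metric, then fill it by index.
sales_nd_macroeco <- vector("list", length(metric))
names(sales_nd_macroeco) <- metric
for (i in seq_along(metric)) {
  f <- paste0(inputpath, metric[i], "_", level, "_CC_", histyr, "_salespercap.csv")
  Category_sales <- read.csv(f)
  # Base R left join, used here in place of the sqldf query.
  sales_nd_macroeco[[i]] <- merge(Category_sales, macroeco_data,
                                  by = "Country", all.x = TRUE)
}

str(sales_nd_macroeco$RSP)
```

Naming the list elements after the metrics means each result can be retrieved by name, for example sales_nd_macroeco$RSP, rather than by position.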
