In a previous question, I asked whether R offers something similar to SAS macro variables for repetitive processes. The link is below:
R macros to enable user defined input similar to %let in SAS
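However, I would like to go one step further: instead of having the user type in the value of each macro variable (I call them macro variables for convenience), I want to supply a list of character values.
For example, here is a short excerpt of the code I am working with, which uses the macro variables through the paste0 function: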
### Change metric between MSP, RSP, Val, Price MSP, Price RSP, Margin
### Change level between product and channel
### Change histyr, baseyr, futeyr according to year value
metric <- "RSP"
level <- "channel"
histyr <- "2009"
baseyr <- "2014"
futeyr <- "2019"
macro <- "gni"
macro1 <- "pce"
inputpath <- "C:/Projects/Consumption curves/UKPOV/excel files/"
outpath <- "C:/Projects/Consumption curves/UKPOV/output/gen_output/"
infile <- paste0(inputpath, metric, "_", level, "_CC_", histyr, "_salespercap.csv")
Category_sales <- read.csv(infile)
infile2 <- paste0(inputpath, "macro_", histyr, ".csv")
macroeco_data <- read.csv(infile2)
macroeco_data$Country <- str_trim(macroeco_data$Country)
sales_nd_macroeco <- sqldf("SELECT L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
FROM Category_sales L LEFT JOIN macroeco_data R
ON (L.Country = R.Country) order by GNI_PPP DESC")
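Now, instead of specifying each metric by hand every time, I would like to create a character list or character vector and use a loop to run the code for every metric without manual intervention.
I tried the following, but it does not seem to work. I am not sure whether I am doing this correctly.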
metric <- c("MSP", "RSP", "Vol", "PriceMSP", "PriceRSP", "Margin")
for (i in metric) {
level <- "channel"
histyr <- "2009"
baseyr <- "2014"
futeyr <- "2019"
macro <- "gni"
macro1 <- "pce"
inputpath <- "C:/Projects/Consumption curves/UKPOV/excel files/"
outpath <- "C:/Projects/Consumption curves/UKPOV/output/gen_output/"
infile <- paste0(inputpath, metric[i], "_", level, "_CC_", histyr, "_salespercap.csv")
Category_sales <- read.csv(infile)
infile2 <- paste0(inputpath,"macro_",histyr,".csv")
macroeco_data<- read.csv(infile2)
macroeco_data$Country<-str_trim(macroeco_data$Country)
sales_nd_macroeco <- sqldf("SELECT L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
FROM Category_sales L LEFT JOIN macroeco_data R
ON (L.Country = R.Country) order by GNI_PPP DESC")
}
The error is as follows:
> metric<-c("MSP","RSP", "Vol","PriceMSP" ,"PriceRSP", "Margin")
> metric
[1] "MSP" "RSP" "Vol" "PriceMSP" "PriceRSP" "Margin"
> for(i in metric){
+ level<-"channel"
+ histyr<-"2009"
+ baseyr<-"2014"
+ futeyr<-"2019"
+ macro<-"gni"
+ macro1<-"pce"
+ inputpath<-"C:/Projects/Consumption curves/UKPOV/excel files/"
+ outpath<-"C:/Projects/Consumption curves/UKPOV/output/gen_output/"
+ infile <- paste0(inputpath,metric[i],"_",level,"_CC_",histyr,"_salespercap.csv")
+ Category_sales <- read.csv(infile)
+ infile2 <- paste0(inputpath,"macro_",histyr,".csv")
+ macroeco_data<- read.csv(infile2)
+ macroeco_data$Country<-str_trim(macroeco_data$Country)
+ sales_nd_macroeco <- sqldf("SELECT L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
+ FROM Category_sales L LEFT JOIN macroeco_data R
+ ON (L.Country = R.Country) order by GNI_PPP DESC")
+ }
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'C:/Projects/Consumption curves/UKPOV/excel files/NA_channel_CC_2009_salespercap.csv': No such file or directory
>
Answer 0 (score: 2)
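The following for loop:
for (x in y) {
  # do things
}
iterates over the elements of y, assigning each element in turn to the object x and evaluating the expressions contained in the loop.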
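In your example, for (i in metric), metric is a character vector and the object i takes on the value of each element of metric in turn. That is, the first time through the loop i is "MSP", the second time i is "RSP", and so on. So later, when you refer to metric[i], on the first pass this is equivalent to metric["MSP"], which is of course NA (because your vector has no names). This in turn produces the file name "C:/Projects/Consumption curves/UKPOV/excel files/NA_channel_CC_2009_salespercap.csv".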
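To make the NA concrete, here is a minimal sketch using a shortened, illustrative copy of metric. The names seen, by_name and bad_file are hypothetical helpers introduced only for this demonstration.

```r
# Illustrative three-element vector (a shortened copy of metric from the question).
metric <- c("MSP", "RSP", "Vol")

# for (i in metric) hands each element to i, not its position.
seen <- character(0)
for (i in metric) {
  seen <- c(seen, i)
}

# Indexing an unnamed vector by a character value looks for a matching name;
# there is none, so the result is NA.
by_name <- metric["MSP"]

# paste0() turns NA into the text "NA", which is how it ends up in the file name.
bad_file <- paste0(by_name, "_channel_CC_2009_salespercap.csv")
```

Printing bad_file shows the same NA_ prefix that appears in the path from the error message.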
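The fact that you refer to metric[i] suggests that you expected i to take the values of the indices of the elements of metric, i.e. numbers. To get that behaviour, you would typically use the following loop: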
for (i in 1:length(metric)) {
# do things
}
or, equivalently for a non-empty vector,
for (i in seq_along(metric)) {
# do things
}
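The two forms only differ when the vector is empty, which is one reason seq_along is often preferred. A small sketch of that edge case follows; empty, n_colon and n_along are hypothetical names used purely for illustration.

```r
# An empty character vector, e.g. what you might get if no metrics were supplied.
empty <- character(0)

colon_idx <- 1:length(empty)   # 1:0, which counts down: c(1L, 0L)
along_idx <- seq_along(empty)  # integer(0), nothing to iterate over

# Count how many times each loop body would run.
n_colon <- 0
for (i in colon_idx) n_colon <- n_colon + 1

n_along <- 0
for (i in along_idx) n_along <- n_along + 1
```

With 1:length(empty) the body runs twice on indices that do not exist, while the seq_along loop is skipped entirely.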
Something like the following might do the trick:
library(stringr)  # provides str_trim()
library(sqldf)    # provides sqldf()

metric <- c('MSP', 'RSP', 'Vol', 'PriceMSP', 'PriceRSP', 'Margin')
inputpath <- 'C:/Projects/Consumption curves/UKPOV/excel files/'
level <- 'channel'
histyr <- '2009'
# The macro data does not depend on the metric, so read it once, outside the loop.
macroeco_data <- read.csv(paste0(inputpath, 'macro_', histyr, '.csv'))
macroeco_data$Country <- str_trim(macroeco_data$Country)
for (x in metric) {
  # x is the metric name itself, so it goes straight into the file name.
  f <- paste0(inputpath, x, '_', level, '_CC_', histyr, '_salespercap.csv')
  Category_sales <- read.csv(f)
  sales_nd_macroeco <- sqldf('SELECT L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
    FROM Category_sales L LEFT JOIN macroeco_data R
    ON (L.Country = R.Country) order by GNI_PPP DESC')
}
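Note that I've pulled everything out of the loop that doesn't need to be there.
Also note that the value of sales_nd_macroeco is overwritten on each pass through the loop, so the final value of that object will correspond to the metric "Margin". To return a list of objects instead, you can use for (i in seq_along(metric)) to iterate over the indices 1:6 and assign the result of sqldf to sales_nd_macroeco[[i]], where sales_nd_macroeco is now an empty list of length 6 that you define beforehand, or you can use lapply: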
sales_nd_macroeco <- lapply(metric, function(x) {
f <- paste0(inputpath, x, '_', level, '_CC_', histyr, '_salespercap.csv')
Category_sales <- read.csv(f)
sqldf('SELECT L.*, R.gnippp_histyr as GNI_PPP, R.SR_histyr as SR
FROM Category_sales L LEFT JOIN macroeco_data R
ON (L.Country = R.Country) order by GNI_PPP DESC')
})
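For comparison, the preallocated-list variant of the for loop might look roughly like the sketch below. To keep it runnable without the original CSV files or sqldf, it writes tiny made-up files to a temporary directory and just reads them back; the _demo.csv file names, the two-row data and the results object are all assumptions for illustration, and in the real code the sqldf call would replace the plain read.csv result.

```r
# Hypothetical stand-in data: one tiny CSV per metric in a temporary directory.
metric <- c("MSP", "RSP", "Vol")
for (x in metric) {
  write.csv(data.frame(Country = c("UK", "FR"), value = c(1, 2)),
            file.path(tempdir(), paste0(x, "_demo.csv")),
            row.names = FALSE)
}

# Preallocate an empty list with one slot per metric, then fill it by index.
results <- vector("list", length(metric))
for (i in seq_along(metric)) {
  f <- file.path(tempdir(), paste0(metric[i], "_demo.csv"))
  results[[i]] <- read.csv(f)  # in the real code: the sqldf() join result
}

# Naming the slots lets you pull out a result by metric, e.g. results[["RSP"]].
names(results) <- metric
```

The same naming step works on the lapply result, since lapply over an unnamed character vector returns an unnamed list.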