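I suspect this is a silly question, but I still need some syntax help because I am new to R.
I have an empty data frame with 5 columns. On each iteration of a loop, a pgsql query returns one row of 5 values, and that row needs to be appended to the data frame.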
Existing_df:
Mat CrA Cur Dil Ccl
NA NA NA NA NA
Each iteration of the loop brings back a new one-row data frame of values from the pgsql query, like this:
Mat CrA Cur Dil Ccl
5 13 9 44 2
This row is appended to Existing_df as follows:
at i = 1:
Mat CrA Cur Dil Ccl
5 13 9 44 2
at i = 2:
Mat CrA Cur Dil Ccl
5 13 9 44 2
11 1 113 41 11
at i = 3:
Mat CrA Cur Dil Ccl
5 13 9 44 2
11 1 113 41 11
14 22 79 54 12
and so on...
Here is the for loop:
for (i in 1:nrow(month_stats))
{
status_counts <- tbl(con_cg_db,sql(
paste("select distinct stagename,
sum(case when stagename is not null then 1 else 0 end)
from current_oppty_sf
where extract(month from cast(loan_agreement_date__c as date)) = ",i,"
group by stagename",sep="")
))
status_counts<- t(as.data.frame(status_counts))
### something needs to go here to appropriately combine the data
###frames as I've described in my question
}
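Sometimes, depending on the data, the data frame returned by the pgsql query has only 3 or 4 columns. In that case, the missing columns should automatically be filled with 0 in the corresponding row of the main data frame.
How can I do this?
Answer 0 (score: 0)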
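library(dplyr)
df <- NULL
for (i in seq_len(nrow(month_stats)))
{
status_counts <- tbl(con_cg_db,sql(
paste("select distinct stagename,
sum(case when stagename is not null then 1 else 0 end)
from current_oppty_sf
where extract(month from cast(loan_agreement_date__c as date)) = ",i,"
group by stagename",sep="")
))
# Pull the result into R: one row per stage (column 1 = stage name, column 2 = count)
status_counts <- as.data.frame(status_counts)
# Reshape into a one-row data frame with one named column per stage;
# t() would return an unnamed character matrix, which bind_rows() cannot match by column
status_counts <- data.frame(as.list(setNames(status_counts[[2]], status_counts[[1]])),
                            check.names = FALSE)
# bind_rows() matches columns by name and fills columns absent from a row with NA
df <- bind_rows(df, status_counts)
}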
Where some values are missing, this code gives you NA rather than zero. If you really want zeros, you can insert them after the loop:
df[is.na(df)] <- 0
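If the full set of columns is known in advance, the zeros can also be filled in as each row arrives, without dplyr. The following is only a minimal base R sketch under that assumption: the column names are taken from the example in the question, `align_row` is a hypothetical helper, and the `results` list stands in for the per-iteration query results.

```r
# Column names from the question's example; replace with the real stage names.
all_cols <- c("Mat", "CrA", "Cur", "Dil", "Ccl")

# Hypothetical helper: pad a one-row data frame to the full column set with 0.
align_row <- function(row, cols = all_cols) {
  for (m in setdiff(cols, names(row))) {
    row[[m]] <- 0               # add each missing column as 0
  }
  row[, cols, drop = FALSE]     # enforce a consistent column order
}

# Stand-ins for the per-iteration query results (illustrative values only).
results <- list(
  data.frame(Mat = 5,  CrA = 13, Cur = 9,   Dil = 44, Ccl = 2),
  data.frame(Mat = 11, Cur = 113, Ccl = 11),            # CrA and Dil missing
  data.frame(Mat = 14, CrA = 22, Cur = 79,  Dil = 54)   # Ccl missing
)

# Align every row first, then bind once instead of growing the data frame in the loop.
existing_df <- do.call(rbind, lapply(results, align_row))
print(existing_df)
```

Collecting the rows in a list and binding once at the end also avoids repeatedly copying the data frame, which can matter when the loop runs many times.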