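I am using the replace-table command to send the following data to Redshift. Is there a command that adds new rows to the table instead of replacing its entire contents?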
PipelineSimulation<-matrix(,42,7)
PipelineSimulation<-as.data.frame(PipelineSimulation)
PipelineSimulation[1,1]<-"APAC"
PipelineSimulation[1,2]<-"Enterprise"
and so on through
PipelineSimulation[42,3]<-"Commit"
PipelineSimulation[42,4]<-"Upsell"
PipelineSimulation[42,5]<-NAMEFURate
PipelineSimulation[42,6]<-mean(NFUEntTotals)
PipelineSimulation[,7]<-Sys.time()
Then, to get it into Redshift, I use:
library(RPostgres)
library(redshiftTools)
library(RPostgreSQL)
library("aws.s3")
library("DBI")
drv<-dbDriver('PostgreSQL')
con <- dbConnect(RPostgres::Postgres(),
  host='bi-prod-dw-instance.cceimtxgnc4w.us-west-2.redshift.amazonaws.com',
  port='5439', dbname='***', user="***", password="***", sslmode='require')
query="select * from everyonesdb.jet_pipelinesimulation_historic;"
result<-dbGetQuery(con,query)
print (nrow(result))
Sys.setenv("AWS_ACCESS_KEY_ID" = "***",
"AWS_SECRET_ACCESS_KEY" = "***",
"AWS_DEFAULT_REGION" = "us-west-2")
b=get_bucket(bucket = 'bjnbi-bjnrd/jetPipelineSimulation')
rs_replace_table(PipelineSimulation, con,
  tableName='everyonesdb.jet_pipelinesimulation_historic',
  bucket='bjnbi-bjnrd/jetPipelineSimulation', split_files=2)
So I would like to keep the old data: rather than using rs_replace_table, I just want to append the new rows to the existing table, if possible.
Answer 0 (score: 1)
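From How to bulk upload your data from R into Redshift:
rs_replace_table truncates the target table and then loads it entirely from the data frame; only do this if you don't care about the current data it holds. On the other hand, rs_upsert_table replaces rows that have coinciding keys and inserts those that do not exist in the table.
Would using rs_upsert_table instead of rs_replace_table solve your problem?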
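To make the difference concrete, here is a minimal local sketch in base R that mimics the replace and upsert behaviour quoted above on small in-memory data frames. It does not touch Redshift. The helper simulate_upsert and the column names region, stage, value and loaded_at are hypothetical and exist only for this illustration. The commented-out call at the end uses the tableName argument as spelled in the question; other versions of redshiftTools may name it differently, so check the installed version.

```r
# Local illustration only: no Redshift connection is involved.

# Hypothetical helper: drop rows of `existing` whose key columns coincide
# with a row of `incoming`, then add every row of `incoming`.
simulate_upsert <- function(existing, incoming, keys) {
  # One comparable key string per row, built from the key columns
  row_key <- function(df) do.call(paste, c(unname(as.list(df[keys])), sep = "\r"))
  kept <- existing[!(row_key(existing) %in% row_key(incoming)), , drop = FALSE]
  out <- rbind(kept, incoming)
  rownames(out) <- NULL
  out
}

existing <- data.frame(
  region = c("APAC", "EMEA"),
  stage = c("Commit", "Commit"),
  value = c(10, 20),
  loaded_at = c("2019-01-01", "2019-01-01"),
  stringsAsFactors = FALSE
)
incoming <- data.frame(
  region = c("APAC", "AMER"),
  stage = c("Commit", "Commit"),
  value = c(99, 30),
  loaded_at = c("2019-01-02", "2019-01-02"),
  stringsAsFactors = FALSE
)

# Replace: the old contents are discarded, only the new rows remain.
replaced <- incoming

# Upsert on (region, stage): the old APAC row is replaced, AMER is inserted.
upserted <- simulate_upsert(existing, incoming, keys = c("region", "stage"))

# Upsert with the timestamp in the keys: nothing coincides, so all rows are added.
appended <- simulate_upsert(existing, incoming,
                            keys = c("region", "stage", "loaded_at"))

print(nrow(replaced))  # 2
print(nrow(upserted))  # 3
print(nrow(appended))  # 4

# Rough shape of the real call (needs a live `con` and S3 credentials, so it is
# left commented out; the key column names are hypothetical):
# rs_upsert_table(PipelineSimulation, con,
#   tableName = 'everyonesdb.jet_pipelinesimulation_historic',
#   bucket = 'bjnbi-bjnrd/jetPipelineSimulation',
#   split_files = 2, keys = c('region', 'stage', 'loaded_at'))
```

For a historic table like the one in the question, where every load carries a fresh Sys.time() value, including the timestamp column in keys should mean that no existing row coincides, so old rows stay and new ones are added. That is an inference from the description quoted above, not something verified against the package.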