I am trying to save a data frame to an AWS Redshift database over a Postgres connection created with dplyr's src_postgres function. As shown below, the data frame has a column whose values exceed 256 characters (some are much longer). When I try to save this data frame to Redshift with dplyr's copy_to function, I get the error below. Is there any way to raise the character limit so I can save this data frame to AWS Redshift, or does anyone have another suggestion for getting my data frame into Redshift? Thanks.
> nchar(df$text)
[1] 598
> copy_to(conn_dplyr, df, TableName, temporary = FALSE)
Error in postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver: (could not Retrieve the result : ERROR: value too long for type character varying(256)
)
Answer 0 (score: 0)
This happens because Redshift does not support the Text data type. When you declare a column as Text, Redshift stores it internally as VARCHAR(256). Instead, change the column/variable to a wider type such as VARCHAR(1000), sized up to the maximum length you expect in the incoming values.
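For example, you can pass explicit column types when writing the table. This is a minimal sketch, assuming the conn_dplyr connection, df, and TableName from the question, and assuming df has exactly two columns (id is invented for illustration); depending on your dplyr version, types may be matched to columns by position, so it is safest to supply one type per column in column order:

copy_to(
  conn_dplyr, df, TableName,
  temporary = FALSE,
  # one type per column, in column order; widen the long text column
  types = c("int", "varchar(1000)")
)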
Answer 1 (score: 0)
I ran into a very similar issue recently and found a workaround. It is not very elegant, but it worked.
First, I added a simple lookup function that returns the class of each column:
getColumnClasses <- function(df) {
  # Take the first row and report each column's class
  return(data.frame(lapply(df[1, ], class)))
}
Then a mapping from R classes to Redshift column types; any class the switch does not match falls through to timestamp:

rClassToRedshiftType <- function(class) {
  switch(class,
    factor = {
      return('VARCHAR(256)')
    },
    character = {
      return('VARCHAR(65535)')
    },
    logical = {
      return('boolean')
    },
    numeric = {
      return('float')
    },
    integer = {
      return('int')
    }
  )
  # Fall-through for classes not handled above (e.g. dates/times)
  return('timestamp')
}
And a helper that applies the mapping across all columns of a data frame:

getRedshiftTypesForDataFrame <- function(df) {
  return(
    apply(
      getColumnClasses(df), 2,
      FUN = rClassToRedshiftType
    )
  )
}
Finally, you can call dplyr::copy_to using the parameter types:

dplyr::copy_to(
  connection,
  df, table.name,
  temporary = FALSE, types = getRedshiftTypesForDataFrame(df)
)

Obviously, if you know the columns in advance you can define the types vector manually.
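For instance, a manually defined vector (one entry per column of df; the column layout here is invented for illustration) might be:

types <- c(
  'int',            # id
  'timestamp',      # created_at
  'VARCHAR(65535)'  # text: wide enough for long strings
)
dplyr::copy_to(connection, df, table.name, temporary = FALSE, types = types)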