How do I save an R data frame to AWS Redshift when a column exceeds 256 characters?

Asked: 2015-12-15 21:00:42

Tags: r postgresql dplyr amazon-redshift

I am trying to save a data frame to an AWS Redshift database over a Postgres connection established with dplyr's src_postgres function. As shown below, one column of the data frame exceeds 256 characters (some values are even longer). When I try to save this data frame to Redshift with dplyr's copy_to function, I get the error below. Is there any way to increase the character limit so that I can save this data frame to AWS Redshift, or does anyone have other suggestions for getting it there? Thanks.

> nchar(df$text)
[1] 598

> copy_to(conn_dplyr, df, TableName, temporary = FALSE)
Error in postgresqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not Retrieve the result : ERROR:  value too long for    type character varying(256)
)

2 Answers:

Answer 0 (score: 0)

This happens because Redshift does not support the TEXT data type. When you declare a column as TEXT, Redshift stores it internally as VARCHAR(256). Instead, declare the column as a wider varchar, e.g. VARCHAR(1000), sized to the maximum length you expect in the incoming values.
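From R, this advice can be applied through copy_to's types parameter, overriding the default varchar(256) mapping for the offending column. A minimal sketch, reusing conn_dplyr, df and TableName from the question and assuming the long column is named text and that 1000 characters is a sufficient maximum; depending on your dplyr version, the types vector may need to be unnamed and given in column order:

library(dplyr)

copy_to(
  conn_dplyr, df, TableName,
  temporary = FALSE,
  # widen the text column instead of the implicit varchar(256)
  types = c(text = 'varchar(1000)')
)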

Answer 1 (score: 0)

I had a very similar issue recently and found a sort of workaround. It is not very elegant, but it worked.

rClassToRedshiftType <- function(class) {
  switch(class,
    factor = { return('VARCHAR(256)') },
    character = { return('VARCHAR(65535)') },
    logical = { return('boolean') },
    numeric = { return('float') },
    integer = { return('int') }
  )
  return('timestamp')
}

Then added a simple lookup function:

getColumnClasses <- function(df) {
  return(data.frame(lapply(df[1, ], class)))
}

getRedshiftTypesForDataFrame <- function(df) {
  return(
    apply(
      getColumnClasses(df), 2,
      FUN = rClassToRedshiftType
    )
  )
}

Finally, you can call copy_to using the parameter types:

dplyr::copy_to(
  connection, df, table.name,
  temporary = FALSE,
  types = getRedshiftTypesForDataFrame(df)
)

Obviously, if you know the columns in advance, you can define the types vector manually.
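For instance, a hypothetical data frame with an integer id, a long character column, and a timestamp could be written with a hand-built vector (the column order and widths here are illustrative assumptions, not taken from the question):

dplyr::copy_to(
  connection, df, table.name,
  temporary = FALSE,
  # one type per column, in the data frame's column order
  types = c('int', 'varchar(65535)', 'timestamp')
)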