Question

I'm trying to do something fairly simple, but either odo is broken or I don't understand how datashapes work in the context of this package.

The CSV file:

email,dob
tony@gmail.com,1982-07-13
blah@haha.com,1997-01-01
...

The code:

from odo import odo
import pandas as pd

df = pd.read_csv("...")
connection_str = "postgresql+psycopg2:// ... "

t = odo('path/to/data.csv', connection_str, dshape='var * {email: string, dob: datetime}')

The error:

AssertionError: datashape must be Record type, got 0 * {email: string, dob: datetime}

I get the same error if I try to go directly from a DataFrame -> Postgres as well:

t = odo(df, connection_str, dshape='var * {email: string, dob: datetime}')

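Some other things that do not fix the problem: 1) removing the header line from the CSV file, 2) changing var to the actual number of rows in the DataFrame.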

What am I doing wrong here?

Answer 1

Does connection_str include a table name? I ran into a similar problem with a sqlite database, and adding the table name fixed it.

It should look something like this:

connection_str = "postgresql+psycopg2://your_database_name::data"
t = odo(df, connection_str, dshape='var * {email: string, dob: datetime}')

where 'data' in 'connection_str' is the name of your new table.
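As a rough sketch of the fix above: the example relies on `::` separating the database URI from the table name. The helper `with_table` below is hypothetical (it is not part of odo), and `your_database_name` is still a placeholder for the real connection details. It only appends a table name when the URI does not already carry one:

```python
def with_table(uri: str, table: str) -> str:
    """Return `uri` with '::table' appended unless it already names a table.

    Hypothetical helper, not part of odo. It assumes the 'uri::table'
    convention shown in the answer's connection string.
    """
    if "::" in uri:
        # A table name is already present; leave the URI untouched.
        return uri
    return f"{uri}::{table}"


# Placeholder URI: substitute your real user, host, and database.
connection_str = with_table("postgresql+psycopg2://your_database_name", "data")
print(connection_str)
```

The resulting string can then be passed to `odo(df, connection_str, dshape=...)` as in the answer's example.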

See also:

python odo sql AssertionError: datashape must be Record type, got 0 * {...}

https://github.com/blaze/odo/issues/580

Using odo to load CSV -> postgres on AWS
