Concatenating pandas DataFrames - All objects passed were None

Time: 2013-07-12 16:20:00

Tags: python pandas

I am trying to read and filter a CSV file in chunks, and then put the results into a DataFrame.

Here is what I use to read and filter the CSV:

csv_chunks = pandas.read_csv(filepath, sep = DELIMITER,skiprows = 2, chunksize = 1000, converters = {"A": str, "B": str})
for chunk in csv_chunks:
    chunk = chunk[(chunk["B"] + chunk["A"]).isin(acids.tolist())]

When I go to concatenate the chunks

df = pandas.concat(chunk for chunk in csv_chunks)

I get an error saying

  File "C:\Program Files\Python\Anaconda\lib\site-packages\pandas\tools\merge.py", line 872, in concat
    verify_integrity=verify_integrity)
  File "C:\Program Files\Python\Anaconda\lib\site-packages\pandas\tools\merge.py", line 913, in __init__
    raise Exception('All objects passed were None')
Exception: All objects passed were None

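A few of the chunks are empty, but there are non-empty ones too, so I am not sure which objects are being treated as None. Any ideas are welcome!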

Thanks, Ann

1 answer:

Answer 0: (score: 1)

Try:

csv_chunks = [chunk[(chunk["B"] + chunk["A"]).isin(acids.tolist())]
              for chunk in csv_chunks]
df = pandas.concat(csv_chunks)

The code

for chunk in csv_chunks:
    chunk = chunk[(chunk["B"] + chunk["A"]).isin(acids.tolist())]

probably does not do what you intend. On each iteration of the for-loop, for chunk in csv_chunks assigns an item from csv_chunks to chunk. Then,

chunk = chunk[(chunk["B"] + chunk["A"]).isin(acids.tolist())]

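immediately rebinds chunk to a new value. That is fine, but it does not change the items in csv_chunks. You are only rebinding the value held by an independent variable, chunk.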
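The same behaviour can be reproduced without pandas. The following is a minimal sketch on made-up lists (the names items, filtered and leftover are illustrative, not from the question). It shows that rebinding the loop variable leaves the container untouched, and that an iterator which has already been looped over yields nothing afterwards. The second effect is most likely why pandas.concat received no objects in the question, since read_csv with chunksize returns an iterator rather than a list.

```python
# Made-up data, used only to illustrate the two effects.
items = [[1, 2, 3], [4, 5, 6]]

for item in items:
    # Rebinds the name `item` only; `items` still holds the original lists.
    item = [x for x in item if x % 2 == 0]

# A list comprehension keeps the filtered values in a new list.
filtered = [[x for x in item if x % 2 == 0] for item in items]

# An iterator can be consumed only once.
it = iter(items)
for item in it:
    pass
leftover = list(it)  # nothing is left to yield
```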

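To modify the values in csv_chunks, you can use a list comprehension to build a new list and then reassign it to the variable csv_chunks, as in the snippet at the top of this answer.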
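For an end-to-end check, the approach can be sketched on a small in-memory CSV. Everything about the data here is an assumption made for illustration: csv_text stands in for the real file, acids is a made-up Series of concatenated "B" + "A" keys, and the sep and skiprows arguments from the question are left out for brevity.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the CSV file and for `acids` from the question.
csv_text = "A,B\n1,x\n2,y\n3,x\n4,z\n5,y\n"
acids = pd.Series(["x1", "y5"])

# With chunksize set, read_csv returns an iterator of DataFrames.
reader = pd.read_csv(
    io.StringIO(csv_text),
    chunksize=2,
    converters={"A": str, "B": str},
)

# Build the lookup once instead of calling acids.tolist() for every chunk.
wanted = set(acids)

# Filter each chunk as it is read and keep the filtered result in a list.
filtered_chunks = [
    chunk[(chunk["B"] + chunk["A"]).isin(wanted)] for chunk in reader
]

# Chunks with no matching rows are empty DataFrames and add no rows.
df = pd.concat(filtered_chunks, ignore_index=True)
```

Because the filtering happens inside the comprehension, the reader is consumed exactly once and every filtered chunk is retained, so only the filtered rows are held in memory at the end rather than the whole file.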