Question

I have a DataFrame that is built from several lists.

Here is the code for the DataFrame:

import csv
import os

import pandas as pd

Cafe_dataframe = pd.DataFrame({'카페 URL' : URL_LIST,
                            '카페명' : CafeTitle,
                            '카테고리명' : Category_Name,
                            '글쓴이ID' : NaverID,
                            '포스트 제목' : PostTitle,
                            '포스트일' : Posting_Date,
                            '조회수' : The_Number_of_Views,
                            '비디오수' : The_Number_of_Video,
                            '그림수' : The_Number_of_Picture,
                            '댓글수' : The_Number_of_Comment,
                            '글자수' : The_Number_of_Characters,
                            '키워드수' : The_Number_of_Keyword
                           })

Cafe_dataframe.to_csv('cafe_data.csv', index=True, encoding='UTF8')

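path = "./cafe_data.csv"
# Re-encode the UTF-8 CSV as EUC-KR, replacing characters that cannot be encoded.
# newline='' lets the csv module handle line endings itself.
with open(path, 'r', encoding='UTF8', errors='replace', newline='') as infile, \
     open('cafe_data_.csv', 'w', encoding='euc-kr', errors='replace', newline='') as outfile:
    inputs = csv.reader(infile)
    output = csv.writer(outfile)

    for row in inputs:
        output.writerow(row)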

os.remove('cafe_data.csv')

It raises this error:

ValueError: arrays must all be same length

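I know that a DataFrame can't be built from lists of different lengths. I checked the length of each list, and it turned out that URL_LIST has 755 elements while the others have 1000.

But what I need is a way to create the CSV file from the lists regardless of their length.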
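The length check described above could look something like the following sketch. The lists here are small hypothetical stand-ins, not the real scraped data; in the actual script the dictionary would hold URL_LIST, CafeTitle and the rest.

```python
# Hypothetical stand-in lists; the real ones come from the scraper.
columns = {
    "URL_LIST": ["u1", "u2"],
    "CafeTitle": ["a", "b", "c"],
    "NaverID": ["x", "y", "z"],
}

# Map each column name to the length of its list.
lengths = {name: len(values) for name, values in columns.items()}

# Report every list that is shorter than the longest one.
longest = max(lengths.values())
mismatched = {name: n for name, n in lengths.items() if n != longest}

print(lengths)
print(mismatched)
```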

Is there another way to create a CSV file from the lists?

Or is it possible to ignore the ValueError and still create the CSV?

Answer 1

Use collections.OrderedDict and itertools.zip_longest:

from collections import OrderedDict
from itertools import zip_longest

import pandas as pd

d = OrderedDict({"A": [0,1], "C": [0,1,2,4], "B": [0,1,2]})

df = pd.DataFrame(list(zip_longest(*d.values())), columns = d.keys())
print(df)
     A  C    B
0  0.0  0  0.0
1  1.0  1  1.0
2  NaN  2  2.0
3  NaN  4  NaN

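Note: OrderedDict is used to ensure that d.values() and d.keys() come out in the same, predictable order. On Python 3.6 or later you can use a plain dict instead (insertion order is an implementation detail of CPython 3.6 and is guaranteed by the language from 3.7).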
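As a possible alternative to zip_longest, each list can be wrapped in a pd.Series so that pandas aligns the columns on the index and pads the shorter ones with NaN. This is only a sketch using the same toy data as above, not the asker's real lists; the variable names are made up for illustration.

```python
import pandas as pd

# Toy lists of unequal length (same values as the example above).
columns = {"A": [0, 1], "C": [0, 1, 2, 4], "B": [0, 1, 2]}

# Building from Series instead of raw lists avoids the ValueError:
# pandas aligns on the index and fills the missing rows with NaN.
df = pd.DataFrame({name: pd.Series(values) for name, values in columns.items()})

# With no path given, to_csv returns the CSV text; NaN cells become empty fields.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file path and an encoding to to_csv would write the file directly, which may make the separate re-encoding loop in the question unnecessary.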
