This is the approach I've been trying, but it returns ["NOTES1", "NOTES2", "NOTES3"] instead of the contents of the dataframe columns:
df_word_list = []
df_notes = df[["NOTES1" , "NOTES2", "NOTES3"]]
one_list = list(flatten(df_notes.values.tolist()))
for word in df_notes:
    df_word_list.append(word)
print(df_word_list)
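Does this mean the dataframe isn't being read correctly? Thanks
Answer 0: (score: 1)
It looks like you're trying to put the words from the dataframe into a single list in two different ways.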
import pandas as pd
data = [{"NOTES1": "annual report",
         "NOTES2": "all of these",
         "NOTES3": "we urge and"},
        {"NOTES1": "business 10-k",
         "NOTES2": "state are",
         "NOTES3": "we urge you to"},
        {"NOTES1": "business annual ",
         "NOTES2": "all of these",
         "NOTES3": "we various"}]
df = pd.DataFrame(data)
# should probably call this word_list
df_word_list = []
# I'm assuming your data looks like above
df_notes = df[["NOTES1" , "NOTES2", "NOTES3"]]
Where are you getting flatten from?
# one_list = list(flatten(df_notes.values.tolist()))
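flatten is not a Python built-in, so that line would raise a NameError unless flatten was defined or imported somewhere else. If the goal was only to collapse the nested list by one level, a minimal sketch with the standard library's itertools.chain.from_iterable might look like this (the small df_notes frame here is made-up sample data, and one_list simply reuses the name from the question):

```python
from itertools import chain

import pandas as pd

# Made-up sample data standing in for the real dataframe
df_notes = pd.DataFrame({
    "NOTES1": ["annual report", "business 10-k"],
    "NOTES2": ["all of these", "state are"],
    "NOTES3": ["we urge and", "we urge you to"],
})

# values.tolist() yields one inner list per row;
# chain.from_iterable joins those inner lists end to end
one_list = list(chain.from_iterable(df_notes.values.tolist()))
print(one_list)
```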
1) I think you're trying to flatten the list? You can do that with a list comprehension:
flat_list1 = [item for sublist in df_notes.values.tolist() for item in sublist]
print(flat_list1)
# ['annual report', 'all of these', 'we urge and', 'business 10-k', 'state are', 'we urge you to', 'business annual ', 'all of these', 'we various']
Or with two for loops:
flat_list2 = []
for sublist in df_notes.values.tolist():
    print(sublist)
    for item in sublist:
        print(item)
        flat_list2.append(item)
print(flat_list2)
# ['annual report', 'all of these', 'we urge and', 'business 10-k', 'state are', 'we urge you to', 'business annual ', 'all of these', 'we various']
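As a possible third variant, the same row-by-row flattening can probably be done without an explicit loop by going through NumPy. This is only a sketch on made-up sample data; flat_list3 is a name introduced here for illustration:

```python
import pandas as pd

# Made-up sample data standing in for the real dataframe
df_notes = pd.DataFrame({
    "NOTES1": ["annual report", "business 10-k"],
    "NOTES2": ["all of these", "state are"],
    "NOTES3": ["we urge and", "we urge you to"],
})

# to_numpy() gives a 2-D array (rows x columns);
# ravel() flattens it in row-major order, tolist() converts back to a list
flat_list3 = df_notes.to_numpy().ravel().tolist()
print(flat_list3)
```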
2) I think you're trying to iterate over each row? Another way to do that is with iterrows:
word_list = []
for row_num, row_series in df_notes.iterrows():
    print("Row Number:\t", row_num)
    row_list = row_series.tolist()
    print("Row Data:\t", row_list)
    # add this row's values to the end so the original row order is kept
    word_list = word_list + row_list
print(word_list)
# ['annual report', 'all of these', 'we urge and', 'business 10-k', 'state are', 'we urge you to', 'business annual ', 'all of these', 'we various']
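As for why the loop in the question printed the column names: iterating over a DataFrame directly walks its column labels rather than its cells, so the dataframe was most likely read correctly. A small sketch on made-up sample data (labels and contents are names introduced here for illustration):

```python
import pandas as pd

# Made-up sample data standing in for the real dataframe
df_notes = pd.DataFrame({
    "NOTES1": ["annual report", "business 10-k"],
    "NOTES2": ["all of these", "state are"],
    "NOTES3": ["we urge and", "we urge you to"],
})

# A plain for loop over a DataFrame yields the column labels
labels = [word for word in df_notes]
print(labels)

# To reach the cell contents, index each column and take its values
contents = []
for col in df_notes:
    contents.extend(df_notes[col].tolist())
print(contents)
```

Note that this last loop collects the values column by column, so the order differs from the row-by-row lists above.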