I am using Python 3 to read a column from an Excel spreadsheet:
import pandas as pd
from pandas import ExcelFile
df = pd.read_excel('MWE.xlsx', sheet_name='Sheet1')
print(df)
                   col1                        col2
0         starts normal                  egg, bacon
1  still none the wiser         egg, sausage, bacon
2      maybe odd tastes                   egg, spam
3     or maybe post-war            egg, bacon, spam
4  maybe for the hungry   egg, bacon, sausage, spam
5                 bingo  spam, bacon, sausage, spam
I want to reduce col2 to a list of the individual words in col2 (e.g. egg, bacon, ...).
df.col2.ravel()
seems to reduce col2 to a list of strings.
df.col2.flatten()
produces
AttributeError: 'Series' object has no attribute 'flatten'
Answer 0 (score: 2)
If you want col2 to be a Series of lists, you can achieve that like this:
df = pd.DataFrame({'col1': ['starts normal','still none the wiser'], 'col2': ['egg, bacon','egg, sausage, bacon']})
# Split each cell on commas and strip the surrounding whitespace from every word
df['col2'] = df['col2'].map(lambda x: [i.strip() for i in x.split(',')])
print(df)
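If the goal is one flat list of words rather than a column of lists, the same split can be followed by pandas' `Series.explode`. This is only a sketch, assuming pandas 0.25 or newer (where `explode` is available); the names `words` and `flat` are illustrative and not part of the original answer.

```python
import pandas as pd

# Same two-row sample frame as in the answer above
df = pd.DataFrame({
    'col1': ['starts normal', 'still none the wiser'],
    'col2': ['egg, bacon', 'egg, sausage, bacon'],
})

# Split each cell on commas, put every word on its own row,
# then strip the whitespace left over around each word
words = df['col2'].str.split(',').explode().str.strip()

# Plain Python list of all words, duplicates kept, in row order
flat = words.tolist()
print(flat)
```

Duplicates are kept here on purpose; deduplication can be applied afterwards if only the distinct words are needed.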
Answer 1 (score: 1)
Try a simple approach, for example:
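One simple approach, sketched here as an assumption and not as this answer's original code, is a plain nested list comprehension over the cell strings. The list `col2_values` is a hypothetical stand-in for `df['col2'].tolist()`.

```python
# Hypothetical stand-in for df['col2'].tolist()
col2_values = ['egg, bacon', 'egg, sausage, bacon', 'egg, spam']

# Walk every cell, split it on commas, and strip each piece
words = [word.strip() for cell in col2_values for word in cell.split(',')]
print(words)
```

This needs nothing beyond built-in string methods, which may be enough when the column is small.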
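Answer 2 (score: 1)
Maybe this is what you need:
Convert the Series of comma-separated strings into a list of lists:
arrs = df.col2.map(lambda x: [i.strip() for i in x.split(',')]).tolist()
# [['egg', 'bacon'], ['egg', 'sausage', 'bacon'], ...]
Get a list of the unique items:
unique = list({elem for arr in arrs for elem in arr})
# ['spam', 'sausage', 'egg', 'bacon'] (sets are unordered, so the order may differ)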
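Because a set does not preserve order, the unique list above can come back in any order. If first-seen order matters, `dict.fromkeys` is one alternative; this is a minimal sketch, assuming Python 3.7+ (where dicts keep insertion order), with a small hand-written `arrs` standing in for the list of lists built above.

```python
# Hand-written stand-in for the list of lists built from col2
arrs = [['egg', 'bacon'], ['egg', 'sausage', 'bacon'], ['egg', 'spam']]

# dict keys are unique and keep insertion order, so this
# deduplicates while preserving the first occurrence of each word
unique_ordered = list(dict.fromkeys(elem for arr in arrs for elem in arr))
print(unique_ordered)  # ['egg', 'bacon', 'sausage', 'spam']
```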