I wrote a function that outputs the count of each product for each supplier. The output looks like this:
[FreqDist({'scanning and digitization of 100 laboratory notebooks': 1})]
[FreqDist({'2018 immuno-oncology podium presentation': 1, '2019 ash review podium presentation': 1, '2019 immuno-oncology podium presentation': 1, '2018 meeting attendance - immuno-oncology': 1, 'ipn ion wo7 - 2019 yescarta in-practice programs': 1, 'lpp select attendance': 1, '2018 lpp national clinical track & attendance': 1, 'ipn ion wo8 - 2019 exhibits': 1})]
[FreqDist({'av for 2400 broadway 2nd floor training room - shipping': 1, 'av for 1552 18th street stand-up meeting space - e-waste': 1, 'av for 1552 18th street conference room - equipment': 1, 'construction for tcf03 expansion': 1, 'av for 2400 broadway 2nd floor training room - labor': 1, 'small conf room 306': 1, 'av for 2400 broadway 2nd floor conference room d - e-waste fees': 1, 'av for 1552 18th street stand-up meeting space - labor': 1, 'av for 1552 18th street stand-up meeting space - shipping': 1, 'av for torrance conf room at water garden': 1, ...})]
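I am a beginner in Python. I want to convert this output into a DataFrame so that I can add other columns or other information based on the function's output.
I tried the following code, but it returns an empty DataFrame: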
def freq(word):
    word = word.apply(lambda x: x.astype(str).str.lower())
    frequency = pd.DataFrame()
    for i in word:
        dist = nlp.FreqDist(word[i])
        dis = dict(dist)
        df = pd.DataFrame([dis], columns=dis.keys())
        frequency.append(df)
    print(frequency)

all_suppliers = []
for i in set(all_data['SUPPLIER']):
    subset = all_data[all_data.SUPPLIER == i]
    subset = subset['LINE_DESCRIPTION']
    subset = pd.DataFrame(subset)
    all_suppliers.append(freq(subset))
I am using Jupyter Notebook and Python 3. Thank you for your help.
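For reference, two things in the code above probably explain the empty result: `DataFrame.append` returns a new DataFrame instead of modifying `frequency` in place (and the method was removed in pandas 2.0), and `freq` has no `return` statement, so `all_suppliers` only collects `None`. Below is a minimal sketch of one way to get a per-supplier count table directly with `groupby`. It swaps `FreqDist` for plain pandas counting. The sample `all_data` frame, the function name `freq_table`, and the `COUNT` column are hypothetical; only the `SUPPLIER` and `LINE_DESCRIPTION` column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for all_data; only the two column names
# are taken from the question.
all_data = pd.DataFrame({
    "SUPPLIER": ["A", "A", "A", "B"],
    "LINE_DESCRIPTION": [
        "Small Conf Room 306",
        "small conf room 306",
        "Construction for TCF03 expansion",
        "2019 Exhibits",
    ],
})

def freq_table(data):
    """Return one row per (supplier, lower-cased description) with its count."""
    # Normalize the descriptions the same way the original function does.
    normalized = data.assign(
        LINE_DESCRIPTION=data["LINE_DESCRIPTION"].astype(str).str.lower()
    )
    # Count rows in each (supplier, description) group and turn the
    # result back into an ordinary DataFrame with a COUNT column.
    return (
        normalized.groupby(["SUPPLIER", "LINE_DESCRIPTION"])
        .size()
        .reset_index(name="COUNT")
    )

frequency = freq_table(all_data)
print(frequency)
```

Because the result is an ordinary DataFrame in long format, further columns can be added with normal assignment, for example `frequency["SHARE"] = frequency["COUNT"] / frequency["COUNT"].sum()`.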