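I have a file in the following format:
I want to put each column into a list of lists, since I have multiple sentences.
So the output should look like this: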
[['Learning centre of The University of Lahore is established for professional development.'],
['These events, destroyed the bond between them.']]
The same goes for the verb column. This is what I tried, but it puts everything into a single list instead of a list of lists:
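import numpy as np
import pandas

train_fn = "/content/data/wiki/wiki1.train.oie"
dfE = pandas.read_csv(train_fn, sep="\t",
                      header=0,
                      keep_default_na=False)
# One entry per row of the file, i.e. one entry per word
train_textEI = dfE['word'].tolist()
# Normalize whitespace inside each entry
train_textEI = [' '.join(t.split()) for t in train_textEI]
# Add a second axis, so every word becomes its own one-element row
train_textEI = np.array(train_textEI, dtype=object)[:, np.newaxis]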
It outputs each word in its own list:
[['Learning'],['Center'],['of'],['The'],['University'],['of'],
['Lahore'],['is'],['established'],['for'],['the'],
['professional'],['development'],['.'],['These'],['events'],[','],
['destroyed'],['the'],['bond'],['between'],['them'],['.']]
Answer 0 (score: 1)
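Create a helper Series by comparing the word_id column with 0 using Series.eq and then taking Series.cumsum, so that every row of the same sentence gets the same group number. Pass that helper to groupby, join the words of each group and wrap the result in a list, and finally convert the output to a list:
import pandas as pd

df = pd.DataFrame({'word_id':[0,1,2,0,1],
                   'word':['a s','ds d','sss dd','d','sd ds']})
# A new group starts wherever word_id is 0
L = df.groupby(df['word_id'].eq(0).cumsum())['word'].apply(lambda x: [' '.join(x)]).tolist()
print (L)
[['a s ds d sss dd'], ['d sd ds']]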
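As a rough sketch of how the same grouping might be applied to both the word and verb columns, the example below reads a small tab-separated sample. The sample rows, the verb labels and the helper name column_to_sentences are made up for illustration. It also assumes, as the answer does, that the file has a word_id column that restarts at 0 at the beginning of every sentence.

```python
import io

import pandas as pd

# Hypothetical sample data: the real file layout is not shown in the question,
# so the column names word_id, word and verb are assumptions.
raw = (
    "word_id\tword\tverb\n"
    "0\tThese\tO\n"
    "1\tevents\tO\n"
    "2\tdestroyed\tV\n"
    "0\tThe\tO\n"
    "1\tbond\tO\n"
    "2\tbroke\tV\n"
)
df = pd.read_csv(io.StringIO(raw), sep="\t", header=0, keep_default_na=False)

# word_id == 0 marks the first word of a sentence; the running sum of those
# marks gives every row the number of the sentence it belongs to.
sentence_id = df["word_id"].eq(0).cumsum()


def column_to_sentences(frame, column, key):
    """Join the values of one column per sentence, one single-item list each."""
    return frame.groupby(key)[column].apply(lambda s: [" ".join(s)]).tolist()


words = column_to_sentences(df, "word", sentence_id)
verbs = column_to_sentences(df, "verb", sentence_id)
print(words)
print(verbs)
```

If a column holds non-string values, cast it first with astype(str), since str.join only accepts strings.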