I have a table in a pandas DataFrame consisting of two columns.
|product_id |Bigram
---------------------------------------------------------------------
|111 |[('111','987'),('987','741'),('12','111')]
|987 |[('987','1232'),('1232','987')]
|654 |[('654,12'),('12,324'),('24,465')]
|321 |[('321','741')]
|324 |[('324','654'),('654','862'),('862','324')]
|123 |[('123','98'),('12','123')]
I want to create a list L from the Bigram column, so that every value from every row is appended to that one list.
For example, my output should be:
L = [(['987','1232'],['1232','987'],['654,12'],['12,324'],['24,465'],
['321','741'],............['123','98'],['12','123'])]
Is there a way to do this? Maybe with a for loop?
Answer 0: (score: 0)
I think you need tolist:
L = df.Bigram.tolist()
Or:
L = list(df.Bigram)
EDIT:
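The problem is that the values in the Bigram column are strings, so you first need to convert them to lists of tuples with literal_eval from ast: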
from ast import literal_eval
from itertools import chain
df.Bigram = df.Bigram.apply(literal_eval)
print (df)
product_id Bigram
0 111 [(111, 987), (987, 741), (12, 111)]
1 987 [(987, 1232), (1232, 987)]
2 654 [(654, 12), (12, 324), (24, 465)]
3 321 [(321, 741)]
4 324 [(324, 654), (654, 862), (862, 324)]
5 123 [(123, 98), (12, 123)]
L = [tuple([list(x) for x in chain.from_iterable(df.Bigram)])]
print (L)
[(['111', '987'], ['987', '741'], ['12', '111'],
['987', '1232'], ['1232', '987'], ['654', '12'],
['12', '324'], ['24', '465'], ['321', '741'],
['324', '654'], ['654', '862'], ['862', '324'],
['123', '98'], ['12', '123'])]
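Since the question asks about a for loop, here is a minimal sketch of the same flattening written with explicit loops. The `rows` list is a hypothetical stand-in for the string values of `df.Bigram` (only the first two rows), and `flatten_bigrams` is a name made up for illustration, not a pandas function.

```python
from ast import literal_eval

# Hypothetical stand-in for the string values in df.Bigram (first two rows only)
rows = [
    "[('111','987'),('987','741'),('12','111')]",
    "[('987','1232'),('1232','987')]",
]


def flatten_bigrams(values):
    """Parse each string into a list of tuples and collect every pair as a list."""
    pairs = []
    for value in values:                    # one string per DataFrame row
        for bigram in literal_eval(value):  # one tuple per bigram in that row
            pairs.append(list(bigram))
    return pairs


flat = flatten_bigrams(rows)
L = [tuple(flat)]  # same shape as the requested output: one tuple inside a list
print(L)
```

With a real DataFrame, passing `df.Bigram` instead of `rows` should work the same way, provided the column still holds the raw strings. If a plain flat list of pairs is enough, `flat` can be used directly without the outer tuple wrapper.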