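Extract a list from a tuple and transpose it in Python

Time: 2018-11-04 04:18:29

Tags: python list tuples transpose

I have the data frame given below. Each cell of list_of_topics holds a tuple of three lists. I want to extract the first list from each tuple and then transpose the extracted lists so that every topic id becomes a column.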

import pandas as pd

data = {'Document_No':[0.0,1.0], 'list_of_topics': [
([(0, 0.14572892),
  (1, 0.014889247),
  (11, 0.44593897)],
 [(4, [0]), (5, [4]), (6, [11]), (7, [11]), (8, [11, 4]), (9, [11, 4])],
 [(4, [(0, 0.9999998)]),
  (7, [(11, 0.9999998)]),
  (9, [(4, 0.05520946), (11, 0.93936676)])]),
([(0, 0.2453892),
  (11, 0.78657897)],
 [(4, [0]), (5, [4]), (6, [11]), (7, [11]), (8, [11, 4]), (9, [11, 4])],
 [(4, [(0, 0.9999998)]),
  (7, [(11, 0.9999998)]),
  (9, [(4, 0.05520946), (11, 0.93936676)])])
]}

df = pd.DataFrame(data)

Desired result:

  Document_No     0            1                 11
0          0.0  0.14572892  0.014889247     0.44593897
1          1.0  0.2453892   0               0.78657897

My solution:

pd.DataFrame([[j[0] for j in i] for i in df['list_of_topics']], index=df['Document_No']).transpose()
Out[245]: 
Document_No                    0.0                    1.0
0                  (0, 0.14572892)        (0, 0.14572892)
1                         (4, [0])               (4, [0])
2            (4, [(0, 0.9999998)])  (4, [(0, 0.9999998)])

This does not give the desired result. Can anyone help me find what I am doing wrong?

1 answer:

Answer 0: (score: 1)

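You can select the required list of tuples from each cell of the column, build a frame from it with DataFrame.from_records, and outer-merge the per-document frames on the topic id: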

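# Each cell is a tuple of three lists; [0] picks the first list of (topic, weight) pairs.
df1 = pd.DataFrame.from_records(df.list_of_topics[0][0])
for tup in df.list_of_topics[1:]:
    # Outer merge on column 0 (the topic id), so topics missing from a document become NaN.
    df1 = df1.merge(pd.DataFrame.from_records(tup[0]),on=0,how='outer')

df1.set_index(0,inplace=True)
df1.T.reset_index(drop=True)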

Output:

          0         1        11
0  0.145729  0.014889  0.445939
1  0.245389       NaN  0.786579
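
The result above still lacks the Document_No column and shows NaN where the question asks for 0. As a possible alternative, here is a minimal sketch that converts the first list of each cell into a dict and lets pandas line the topics up as columns. It is not part of the original answer. The names rows, topics and result are only illustrative, and the second and third lists in each tuple are stubbed as empty lists here because only the first list is used.

```python
import pandas as pd

# Shortened copy of the question's data: only the first list of each tuple
# matters, so the other two lists are replaced by empty placeholders (assumption).
data = {
    'Document_No': [0.0, 1.0],
    'list_of_topics': [
        ([(0, 0.14572892), (1, 0.014889247), (11, 0.44593897)], [], []),
        ([(0, 0.2453892), (11, 0.78657897)], [], []),
    ],
}
df = pd.DataFrame(data)

# cell[0] is the first list of (topic, weight) pairs; dict() maps topic -> weight.
rows = [dict(cell[0]) for cell in df['list_of_topics']]

# A list of dicts yields one row per document and one column per topic id.
# Topics absent from a document come out as NaN, which is replaced by 0 here.
topics = pd.DataFrame(rows, index=df.index).fillna(0).sort_index(axis=1)

# Put Document_No back in front of the topic columns.
result = pd.concat([df[['Document_No']], topics], axis=1)
print(result)
```

Building from dicts avoids the repeated merge, so it should also cope with more than two documents without running into suffixed duplicate column names.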