How do I convert dictionaries in a list to a DataFrame in Python?

Time: 2017-03-30 04:13:44

Tags: python list pandas dictionary dataframe

I'm a Python beginner, and I'm struggling to get a list of dicts into a pandas.DataFrame in the right shape. My data has the following structure:

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

The columns of my pandas.DataFrame should be the subjects 'Math', 'Science' and 'English', and the rows should be the scores. The columns are created dynamically, so I cannot refer to them explicitly by name. All I need are the values of the keys s1 ... sn.

This is what I have tried so far:

import pandas as pd

b = a.pop('Scores')
c = list(b.values())
df = pd.DataFrame(c)

This displays my DataFrame as:

                                               0  \
0  {'Math': '95', 'Science': '74.5', 'English': '...
1  {'Math': '67.2', 'Science': '74.2', 'English':...

                                               1
0  {'Math': '87.9', 'Science': '97.3', 'English':...
1                                               None

Instead, I am looking for:

Math  Science  English
95    74.5     60.5
87.9  97.3     78.3
67.2  74.2     89

I'd appreciate any help I can get.

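3 Answers:

Answer 0: (score: 3)

You can iterate over the dict's values and use sum to concatenate the per-key lists into a single list of rows.

Code: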

import pandas as pd

data = sum([x for x in a['Scores'].values()], [])
print(pd.DataFrame(data, columns=['Math', 'Science', 'English']))

Test data:

a = {'Scores': {'s1': [{'Math': '95',
                        'Science': '74.5',
                        'English': '60.5'},
                       {'Math': '87.9',
                        'Science': '97.3',
                        'English': '78.3'}],
                's2': [{'Math': '67.2',
                        'Science': '74.2',
                        'English': '89'}]}}

Result:

   Math Science English
0  67.2    74.2      89
1    95    74.5    60.5
2  87.9    97.3    78.3
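Since the question says the column names are not known in advance, the same flattening idea can be sketched without the hard-coded columns list. This variant swaps sum for itertools.chain.from_iterable, which avoids repeatedly copying lists when there are many keys, and lets pandas take the column names from the dict keys. The variable names rows and df are only illustrative, and the row order follows the dict's iteration order.

```python
from itertools import chain

import pandas as pd

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

# Flatten the per-key lists (s1, s2, ...) into one list of row dicts.
rows = list(chain.from_iterable(a['Scores'].values()))

# No columns argument: pandas derives the column names from the dict keys.
df = pd.DataFrame(rows)
print(df)
```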

Answer 1: (score: 1)

You can use a comprehension/generator to extract all the scores:

>>> pd.DataFrame(s for k, v in a['Scores'].items() for s in v)
  English  Math Science
0    60.5    95    74.5
1    78.3  87.9    97.3
2      89  67.2    74.2
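Note that the scores in the sample data are strings, so the resulting columns hold text rather than numbers. If numeric scores are wanted (an assumption, since the question does not say so), one possible follow-up is to pass each column through pd.to_numeric. A minimal sketch, with rows and df as illustrative names:

```python
import pandas as pd

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

# Same comprehension as above, materialised as a list of row dicts.
rows = [s for v in a['Scores'].values() for s in v]

# apply runs pd.to_numeric on each column, turning the strings into floats.
df = pd.DataFrame(rows).apply(pd.to_numeric)

print(df.dtypes)
print(df.mean())
```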

Answer 2: (score: 0)

You have to call apply yourself:

pd.Series(a['Scores']).apply(pd.Series).stack().apply(pd.Series)

     English  Math Science
s1 0    60.5    95    74.5
   1    78.3  87.9    97.3
s2 0      89  67.2    74.2
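A similar two-level index (outer key s1/s2, inner row number) can also be built by making one DataFrame per key and passing the resulting dict to pd.concat, which uses the dict keys as the outer index level. This is an alternative sketch rather than the approach used in this answer; frames and df are illustrative names.

```python
import pandas as pd

a = {'Scores': {'s1': [{'Math': '95', 'Science': '74.5', 'English': '60.5'},
                       {'Math': '87.9', 'Science': '97.3', 'English': '78.3'}],
                's2': [{'Math': '67.2', 'Science': '74.2', 'English': '89'}]}}

# One small DataFrame per key, each built from that key's list of dicts.
frames = {key: pd.DataFrame(rows) for key, rows in a['Scores'].items()}

# Concatenating a dict of frames puts the dict keys in the outer index level.
df = pd.concat(frames)
print(df)

# Rows for a single key can then be selected by the outer label.
print(df.loc['s1'])
```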