Question

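Hi, I have a student dataset consisting of student names, subjects, and scores.

Each student is supposed to take 5 subjects. However, the subject and score data is missing for some students in the table below: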


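I want to find the names that have scores for fewer than 5 subjects and append additional rows, so that every student has an entry for each of the 5 subjects. The expected output is as follows: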

Name    Subject Score
Harry   Math    4
Harry   Science 5
Harry   Social  3
Harry   French  5
Harry   Spanish 4
Steve   Math    5
Steve   Science 3
Steve   Social  5
Steve   French  4
Tom     Math    5
Tom     Science 4
Tom     Social  5

Here you can see that Steve, Harry, and Tom each have a score for all 5 subjects.

Answer 1

This seems like a perfect application of reindex.

Setup:

import io
import pandas as pd

z = io.StringIO("""Name    Subject Score
Harry   Math    4
Harry   Science 5
Harry   Social  3
Harry   French  5
Harry   Spanish 4
Steve   Math    5
Steve   Science 3
Steve   Social  5
Steve   French  4
Tom     Math    5
Tom     Science 4
Tom     Social  5""")

# Split columns on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(z, sep=r"\s+")

Then:

new_index = pd.MultiIndex.from_product([df['Name'].unique(), df['Subject'].unique()], names=['Name', 'Subject'])
df.set_index(['Name', 'Subject']).reindex(new_index)

                        Score
Name    Subject 
Harry   Math            4.0
        Science         5.0
        Social          3.0
        French          5.0
        Spanish         4.0
Steve   Math            5.0
        Science         3.0
        Social          5.0
        French          4.0
        Spanish         NaN
Tom     Math            5.0
        Science         4.0
        Social          5.0
        French          NaN
        Spanish         NaN
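
As a follow-up sketch (not part of the original answer), the same reindex idea can also report which students are missing subjects and return the result as flat rows, which is closer to the "append additional rows" wording in the question. The names incomplete, full_index, and completed are illustrative, and leaving the missing scores as NaN is an assumption about what the asker wants.

```python
import io

import pandas as pd

data = io.StringIO("""Name    Subject Score
Harry   Math    4
Harry   Science 5
Harry   Social  3
Harry   French  5
Harry   Spanish 4
Steve   Math    5
Steve   Science 3
Steve   Social  5
Steve   French  4
Tom     Math    5
Tom     Science 4
Tom     Social  5""")
df = pd.read_csv(data, sep=r"\s+")

# The full set of subjects is taken from the data itself (assumption:
# every subject appears for at least one student).
subjects = df["Subject"].unique()

# Students who have fewer distinct subjects than the full set.
counts = df.groupby("Name")["Subject"].nunique()
incomplete = counts[counts < len(subjects)].index.tolist()
print(incomplete)

# Every (Name, Subject) combination, in order of first appearance.
full_index = pd.MultiIndex.from_product(
    [df["Name"].unique(), subjects], names=["Name", "Subject"]
)

# Reindex to add the missing combinations, then flatten back to plain rows.
# Missing scores stay NaN; reindex(full_index, fill_value=0) is an option
# if a placeholder value is preferred.
completed = df.set_index(["Name", "Subject"]).reindex(full_index).reset_index()
print(completed)
```

Calling reset_index() at the end turns the Name and Subject index levels back into ordinary columns, so the result has the same three-column shape as the input table.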
