Question

I have a Pandas DataFrame that looks like this (it currently has no index other than the built-in row index, but if adding an index on "Person" and "Car" would make this easier, that is fine too):

import pandas as pd

before = pd.DataFrame({
  'Email': ['john@example.com','mary@example.com','jane@example.com','john@example.com','mary@example.com'],
  'Person': ['John','Mary','Jane','John','Mary'],
  'Car': ['Ford','Toyota','Nissan','Nissan','Ford']
})

I want to reshape it to look like this:

after = pd.DataFrame({
  'Person': ['John','Mary','Jane'],
  'Email': ['john@example.com','mary@example.com','jane@example.com'],
  'Ford': [True,True,False],
  'Nissan': [True,False,True],
  'Toyota': [False,True,False]
})

Note that John owns both a Ford and a Nissan, Mary owns a Ford and a Toyota, and Jane sticks with her trusty Nissan.

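I have tried various permutations of stacking a multi-indexed DataFrame, grouping, and pivoting, but I cannot figure out how to take the values from the "Car" column and transpose them into new columns holding True, while merging the rows for each person together by, say, their name.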

Answer 1

Not sure whether this is the best way, but one approach is:

In [26]: before.pivot_table(index=['Email','Person'],columns=['Car'], aggfunc=lambda x: True).fillna(False).reset_index()
Out[26]:
Car             Email Person   Ford Nissan Toyota
0    jane@example.com   Jane  False   True  False
1    john@example.com   John   True   True  False
2    mary@example.com   Mary   True  False   True
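
The pivot above has no value column to aggregate, so its behavior may depend on the pandas version. As an alternative sketch (not part of the original answer), `pd.crosstab` counts each (Person, Email) and Car combination, and the counts can then be cast to booleans. The name `flat` is just an illustrative variable.

```python
import pandas as pd

before = pd.DataFrame({
    'Email': ['john@example.com', 'mary@example.com', 'jane@example.com',
              'john@example.com', 'mary@example.com'],
    'Person': ['John', 'Mary', 'Jane', 'John', 'Mary'],
    'Car': ['Ford', 'Toyota', 'Nissan', 'Nissan', 'Ford'],
})

# Count how often each (Person, Email) pair appears with each Car,
# then turn the counts into booleans: a nonzero count means "owns this car".
flat = (
    pd.crosstab([before['Person'], before['Email']], before['Car'])
    .astype(bool)
    .reset_index()
)

# Drop the leftover "Car" label on the column axis so the frame looks flat.
flat.columns.name = None

print(flat)
```

Because the counts are cast to booleans, a person listed twice with the same car would still come out as True rather than 2.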

Answer 2

before['has_car'] = True

Out[93]:
car                Email    Person  has_car
Ford    john@example.com    John    True
Toyota  mary@example.com    Mary    True
Nissan  jane@example.com    Jane    True
Nissan  john@example.com    John    True
Ford    mary@example.com    Mary    True

df = before.pivot_table(index = ['Person' , 'Email'], columns= 'Car' , values='has_car')


Out[89]:
                            Ford    Nissan  Toyota
Person  Email           
Jane    jane@example.com    NaN     True    NaN
John    john@example.com    True    True    NaN
Mary    mary@example.com    True    NaN     True

df.fillna(False).reset_index()

Out[102]:
Car Person  Email               Ford    Nissan  Toyota
0   Jane    jane@example.com    False   True    False
1   John    john@example.com    True    True    False
2   Mary    mary@example.com    True    False   True
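
The same idea (mark every row with a True flag, then spread "Car" across the columns) can also be sketched without any aggregation by using `set_index` and `unstack`. This is a variant under the assumption that each (Person, Email, Car) combination appears at most once, since `unstack` raises an error on duplicate index entries. The name `flat` is illustrative.

```python
import pandas as pd

before = pd.DataFrame({
    'Email': ['john@example.com', 'mary@example.com', 'jane@example.com',
              'john@example.com', 'mary@example.com'],
    'Person': ['John', 'Mary', 'Jane', 'John', 'Mary'],
    'Car': ['Ford', 'Toyota', 'Nissan', 'Nissan', 'Ford'],
})

flat = (
    before.assign(has_car=True)                  # flag every existing row
    .set_index(['Person', 'Email', 'Car'])['has_car']
    .unstack('Car', fill_value=False)            # missing combinations become False
    .reset_index()                               # Person and Email back to columns
)

# Remove the "Car" label that unstack leaves on the column axis.
flat.columns.name = None

print(flat)
```

Passing `fill_value=False` to `unstack` avoids the separate `fillna(False)` step, since missing combinations are filled as they are created.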

Flattening a Pandas DataFrame
