I have a pandas DataFrame with three columns. I want to create multiple lists of tuples, one list per value in the Project column.
print (df)
    Project  Resource  Time
0        P1         0     4
1        P1         2     4
2        P1         1    10
3        P1         3     3
4        P2         1     3
5        P2         3    10
6        P2         0    11
7        P2         2     3
8        P2         0    12
9        P2         3    11
10       P2         1     3
11       P2         2     3
12       P3         0    12
The lists of tuples I want to create look like this: [[(0,4),(2,4),(1,10),(3,3)],[(1,3),(3,10),(0,11),(2,3),(0,12),(3,11),(1,3),(2,3)],[(0,12)]]
I tried the following code:
tuples = [tuple(x) for x in data.values]
Answer 0 (score: 2)
Use DataFrame.groupby with a lambda function and zip, then convert the resulting Series to a list:
t = df.groupby('Project').apply(lambda x: list(zip(x['Resource'], x['Time']))).tolist()
print (t)
[[(0, 4), (2, 4), (1, 10), (3, 3)],
[(1, 3), (3, 10), (0, 11), (2, 3), (0, 12), (3, 11), (1, 3), (2, 3)],
[(0, 12)]]
Another solution:
# Select several columns after groupby with a list (double brackets)
t = (df.groupby('Project')[['Resource', 'Time']]
       .apply(lambda x: [tuple(y) for y in x.values])
       .tolist())
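For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the question's data and applies the groupby approach. The `sort=False` argument is my addition, not part of the answer: by default `groupby` sorts the group keys, while `sort=False` keeps them in order of first appearance, which matters if the projects are not already sorted.

```python
import pandas as pd

# Rebuild the DataFrame shown in the question
df = pd.DataFrame({
    "Project": ["P1"] * 4 + ["P2"] * 8 + ["P3"],
    "Resource": [0, 2, 1, 3, 1, 3, 0, 2, 0, 3, 1, 2, 0],
    "Time": [4, 4, 10, 3, 3, 10, 11, 3, 12, 11, 3, 3, 12],
})

# One list of (Resource, Time) tuples per project
t = (df.groupby("Project", sort=False)[["Resource", "Time"]]
       .apply(lambda g: list(zip(g["Resource"], g["Time"])))
       .tolist())
print(t)
```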
Answer 1 (score: 1)
You can use the zip function to iterate over several columns of a pandas DataFrame at once:
import pandas as pd

df = pd.DataFrame({"ressource": [0, 2, 1, 3], "time": [4, 4, 10, 3]})
tuples = [(x, y) for x, y in zip(df['ressource'], df['time'])]
Output:
[(0, 4), (2, 4), (1, 10), (3, 3)]
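Note that this gives one flat list rather than one list per project. To split it with the same zip idea, one option is to zip all three columns and collect the pairs in a dict keyed by project. This is only a sketch: the helper name `group_pairs` is hypothetical, and plain lists stand in for the columns so it runs without pandas (passing `df['Project']`, `df['Resource']` and `df['Time']` should work the same way).

```python
def group_pairs(projects, resources, times):
    """Group (resource, time) pairs by project, keeping first-seen order."""
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for project, resource, time in zip(projects, resources, times):
        groups.setdefault(project, []).append((resource, time))
    return list(groups.values())


# Made-up sample data with the same shape as the question's columns
projects = ["P1", "P1", "P2", "P2", "P3"]
resources = [0, 2, 1, 3, 0]
times = [4, 4, 3, 10, 12]

grouped = group_pairs(projects, resources, times)
print(grouped)
```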
Answer 2 (score: 1)
Try this:
>>> df['zip'] = tuple(zip(df.Resource, df.Time))
>>> df.groupby('Project')['zip'].agg(list).tolist()
[[(0, 4), (2, 4), (1, 10), (3, 3)],
[(1, 3), (3, 10), (0, 11), (2, 3), (0, 12), (3, 11), (1, 3), (2, 3)],
[(0, 12)]]
Answer 3 (score: 0)
How about this?
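# tmpa is the DataFrame (called df in the question)
listExample = []
for code in tmpa.loc[:, 'Project'].unique():
    # Keep only the rows of this project, then turn each (Resource, Time) row into a tuple
    listExample.append([(a, b) for a, b in tmpa[tmpa.loc[:, 'Project'] == code].loc[:, ['Resource', 'Time']].values])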
It's not very pretty, but I think it should work.
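A somewhat tidier variant of the same loop, as a sketch: iterating over a groupby object yields (key, sub-frame) pairs, so the boolean mask per project is not needed. The small sample frame and the name `list_example` are made up for illustration, and `sort=False` keeps the projects in order of first appearance, like `unique()` does.

```python
import pandas as pd

# Made-up sample data with the same columns as the question
df = pd.DataFrame({
    "Project": ["P1", "P1", "P2", "P3"],
    "Resource": [0, 2, 1, 0],
    "Time": [4, 4, 3, 12],
})

# Each iteration gives the project key and the rows belonging to it
list_example = [
    list(zip(group["Resource"], group["Time"]))
    for _, group in df.groupby("Project", sort=False)
]
print(list_example)
```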
Answer 4 (score: 0)
If you want the tuples split by project, do the following:
# Create tuples column
df['Tuples'] = df.apply(lambda r: (r['Resource'], r['Time']), axis=1)
# Concatenate tuples grouped by 'Project'
result_df = df[['Project', 'Tuples']].groupby('Project').agg(list)
The result is:
                                                    Tuples
Project
P1                       [(0, 4), (2, 4), (1, 10), (3, 3)]
P2       [(1, 3), (3, 10), (0, 11), (2, 3), (0, 12), (3...
P3                                               [(0, 12)]
You can then reset the index to get the 'Project' column back:
result_df.reset_index(inplace=True)
result_df
  Project                                             Tuples
0      P1                  [(0, 4), (2, 4), (1, 10), (3, 3)]
1      P2  [(1, 3), (3, 10), (0, 11), (2, 3), (0, 12), (3...
2      P3                                          [(0, 12)]
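If a lookup by project is more convenient than a DataFrame, the grouped column can also be turned into a dict. A minimal sketch on made-up sample data; the name `by_project` is mine, and the tuples are built with zip instead of a row-wise apply, which is usually faster on large frames.

```python
import pandas as pd

# Made-up sample data with the same columns as the question
df = pd.DataFrame({
    "Project": ["P1", "P1", "P2", "P3"],
    "Resource": [0, 2, 1, 0],
    "Time": [4, 4, 3, 12],
})

# Build the tuples column, then collect it per project into a dict
df["Tuples"] = list(zip(df["Resource"], df["Time"]))
by_project = df.groupby("Project")["Tuples"].agg(list).to_dict()
print(by_project["P1"])
```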