My pandas DataFrame df looks like this:
Column
0 0 [
{ "weight": "40", "height": 4, "age": 13 },
{ "weight": "50", "height": 10, "age": 15 },
{ "weight": "30", "height": 5, "age": 25 },
{ "weight": "25", "height": 5, "age": 35 }
]
1 1 [
{ "weight": "60", "height": 6, "age": 45 },
{ "weight": "80", "height": 8, "age": 30 },
{ "weight": "90", "height": 9, "age": 20 },
{ "weight": "70", "height": 7, "age": 50 }
]
Desired output:
weight height New_column (compute Weight/Height )
0 (40,50,30,25) (4,10,5,5) (10,5,6,5)
1 (60,80,90,70) (6,8,9,7) (10,10,10,10)
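Can someone write pseudocode or an algorithm for this? I want to do it in pandas, and I can't figure out how.
Answer 0 (score: 0)
Simplify (flatten the lists into one row per record):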
df # original
Column
0 [{'weight': '40', 'height': 4, 'age': 13}, {'w...
1 [{'weight': '60', 'height': 6, 'age': 45}, {'w...
import numpy as np
import pandas as pd
df = pd.DataFrame(np.concatenate(df.Column).tolist()).astype(int)
df
age height weight
0 13 4 40
1 15 10 50
2 25 5 30
3 35 5 25
4 45 6 60
5 30 8 80
6 20 9 90
7 50 7 70
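For anyone who wants to reproduce the flattening step, here is a minimal, self-contained sketch. The sample frame is rebuilt by hand from the question, and the name flat is my own choice, not something from the original post. On recent pandas versions the columns come out in dict order (weight, height, age) rather than alphabetically.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; note that the weights are strings there.
df = pd.DataFrame({
    "Column": [
        [
            {"weight": "40", "height": 4, "age": 13},
            {"weight": "50", "height": 10, "age": 15},
            {"weight": "30", "height": 5, "age": 25},
            {"weight": "25", "height": 5, "age": 35},
        ],
        [
            {"weight": "60", "height": 6, "age": 45},
            {"weight": "80", "height": 8, "age": 30},
            {"weight": "90", "height": 9, "age": 20},
            {"weight": "70", "height": 7, "age": 50},
        ],
    ]
})

# Join the per-row lists into one flat list of dicts, build a frame with
# one row per record, and cast every column (including the string weights) to int.
records = np.concatenate(df["Column"].tolist()).tolist()
flat = pd.DataFrame(records).astype(int)  # "flat" is a hypothetical name
print(flat)
```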
Create the new column, then group the rows in blocks of 4 and aggregate:
df['New_column'] = df.weight / df.height
g = (df.groupby(df.index // 4 * 4)[['weight', 'height', 'New_column']]
       .agg(lambda x: tuple(x.values)))
g
weight height New_column
0 (40, 50, 30, 25) (4, 10, 5, 5) (10.0, 5.0, 6.0, 5.0)
4 (60, 80, 90, 70) (6, 8, 9, 7) (10.0, 10.0, 10.0, 10.0)
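Grouping on df.index // 4 * 4 relies on every list holding exactly four records. If that is not guaranteed, one possible variant is to flatten with Series.explode, which repeats the original row label for each record, and then group on that label. This is only a sketch under that assumption; the names records, flat and out are mine, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question (weights are strings there).
df = pd.DataFrame({
    "Column": [
        [{"weight": "40", "height": 4, "age": 13},
         {"weight": "50", "height": 10, "age": 15},
         {"weight": "30", "height": 5, "age": 25},
         {"weight": "25", "height": 5, "age": 35}],
        [{"weight": "60", "height": 6, "age": 45},
         {"weight": "80", "height": 8, "age": 30},
         {"weight": "90", "height": 9, "age": 20},
         {"weight": "70", "height": 7, "age": 50}],
    ]
})

# explode() gives one dict per row and keeps the original row label (0 or 1).
records = df["Column"].explode()
flat = pd.DataFrame(records.tolist(), index=records.index).astype(int)
flat["New_column"] = flat["weight"] / flat["height"]

# Group on the original row label instead of on fixed-size blocks.
out = (flat.groupby(level=0)[["weight", "height", "New_column"]]
           .agg(lambda s: tuple(s.tolist())))
print(out)
```

With this approach the result is labelled 0 and 1, like the desired output in the question, rather than 0 and 4.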
Answer 1 (score: 0)
You can keep the data in wide format and still get the desired weight:height ratio:
orig
Columns
0 [{'weight': '40', 'height': 4, 'age': 13}, {'w...
1 [{'weight': '60', 'height': 6, 'age': 45}, {'w...
def extract(row, field):
    # Pull one field out of every dict in this row's list, cast to int
    return [int(x[field]) for x in row.Columns]

df = orig.assign(weight=orig.apply(extract, args=("weight",), axis=1).values,
                 height=orig.apply(extract, args=("height",), axis=1).values)
# Element-wise weight / height for each row, stored back as a list
df['ratio'] = df.apply(lambda x: pd.Series(x.weight) / pd.Series(x.height),
                       axis=1).values.tolist()
df
height weight ratio
0 [4, 10, 5, 5] [40, 50, 30, 25] [10.0, 5.0, 6.0, 5.0]
1 [6, 8, 9, 7] [60, 80, 90, 70] [10.0, 10.0, 10.0, 10.0]
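The same wide-format idea can also be sketched with Series.map and plain list comprehensions, which avoids building a temporary pd.Series for every row. This is an alternative I am suggesting, not the code above; the helper here takes the list of records directly rather than the whole row, and the sample data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question; this answer calls the column "Columns".
orig = pd.DataFrame({
    "Columns": [
        [{"weight": "40", "height": 4, "age": 13},
         {"weight": "50", "height": 10, "age": 15},
         {"weight": "30", "height": 5, "age": 25},
         {"weight": "25", "height": 5, "age": 35}],
        [{"weight": "60", "height": 6, "age": 45},
         {"weight": "80", "height": 8, "age": 30},
         {"weight": "90", "height": 9, "age": 20},
         {"weight": "70", "height": 7, "age": 50}],
    ]
})

def extract(records, field):
    """Return one field from every dict in a list of records, cast to int."""
    return [int(rec[field]) for rec in records]

# One list-valued column per field; each cell holds that row's values.
df = orig.assign(
    weight=orig["Columns"].map(lambda recs: extract(recs, "weight")),
    height=orig["Columns"].map(lambda recs: extract(recs, "height")),
)

# Pair up the weight and height lists row by row and divide element-wise.
df["ratio"] = [
    [w / h for w, h in zip(ws, hs)]
    for ws, hs in zip(df["weight"], df["height"])
]
print(df[["weight", "height", "ratio"]])
```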