Pandas DataFrame: repeat each row a given number of times

Date: 2019-09-27 08:24:55

Tags: python pandas

I have a pandas DataFrame with the following values:

   Minutiae         LR
0         1   1.975476
1         2   1.082983
2         3   0.269608
3         4   0.878350
4         5   2.820141
5         6   8.686183
6         7  24.340116
7         8  46.475523
8         9  66.139377

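What I am trying to do is repeat each row a given number of times, keeping both values in the row unchanged. For example, 1129 copies of the Minutiae 3 row, 1085 copies of the Minutiae 4 row, and so on.

So far I have only found ways to repeat every row the same number of times, not a different number of times for each row.

Any insight on this would be appreciated, thanks.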

1 Answer:

Answer 0: (score: 1)

Create a dictionary that gives the number of repeats for each Minutiae value and apply it with Series.map, then repeat the index with Index.repeat, and finally use DataFrame.loc to select the repeated rows:

print (df)
   Minutiae        LR
0         1  1.975476
1         2  1.082983
2         3  0.269608
3         4  0.878350

d = {1:2, 2:1, 3:5, 4:3}

df1 = df.loc[df.index.repeat(df['Minutiae'].map(d))]
print (df1)
   Minutiae        LR
0         1  1.975476
0         1  1.975476
1         2  1.082983
2         3  0.269608
2         3  0.269608
2         3  0.269608
2         3  0.269608
2         3  0.269608
3         4  0.878350
3         4  0.878350
3         4  0.878350
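
The steps above can be put together as a self-contained script. This is a minimal sketch that rebuilds the four-row sample frame by hand; the final `reset_index(drop=True)` call and the names `counts` and `df1_reset` are additions for illustration and are not part of the answer above.

```python
import pandas as pd

# Sample frame rebuilt from the first four rows shown above.
df = pd.DataFrame({
    "Minutiae": [1, 2, 3, 4],
    "LR": [1.975476, 1.082983, 0.269608, 0.878350],
})

# Number of repeats for each Minutiae value (same illustrative counts as above).
d = {1: 2, 2: 1, 3: 5, 4: 3}

# Look up the repeat count for every row.
counts = df["Minutiae"].map(d)

# Repeat each index label by its count, then select the rows by label.
df1 = df.loc[df.index.repeat(counts)]

# Assumption: a fresh 0..n-1 index is wanted; skip this to keep the
# original labels so repeated rows can be traced back to their source.
df1_reset = df1.reset_index(drop=True)

print(df1_reset)
```

Every Minutiae value needs an entry in `d`; a value without one maps to NaN, which is not a valid repeat count.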

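Details. The mapped repeat counts (which could also be stored in a new column) and the repeated index look like this: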

print (df['Minutiae'].map(d))
0    2
1    1
2    5
3    3
Name: Minutiae, dtype: int64

print (df.index.repeat(df['Minutiae'].map(d)))
Int64Index([0, 0, 1, 2, 2, 2, 2, 2, 3, 3, 3], dtype='int64')
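
A variant of the same idea stores the repeat counts in a new column first, which can be handy when the counts need to be inspected or edited before the rows are repeated. The sketch below is one way to do that, not necessarily what the answer had in mind; the column name `repeat` and the variable `df2` are hypothetical.

```python
import pandas as pd

# Sample frame rebuilt from the first four rows shown above.
df = pd.DataFrame({
    "Minutiae": [1, 2, 3, 4],
    "LR": [1.975476, 1.082983, 0.269608, 0.878350],
})

# Number of repeats for each Minutiae value.
d = {1: 2, 2: 1, 3: 5, 4: 3}

# Store the counts in a helper column ("repeat" is a hypothetical name).
df["repeat"] = df["Minutiae"].map(d)

# Repeat the index by that column, select the rows, then drop the helper.
df2 = df.loc[df.index.repeat(df["repeat"])].drop(columns="repeat")

print(df2)
```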