pandas merge and groupby

Date: 2018-05-30 01:32:07

Tags: python pandas

I have two pandas dataframes, shown below.

Dataframe 1:

Section    chainage_from     chainage_to     Frame  
R125R002    10.133            10.138          1  
R125R002    10.138            10.143          2  
R125R002    10.143            10.148          3  
R125R002    10.148            10.153          4  
R125R002    10.153            10.158          5

Dataframe 2:

Section Chainage    1   2   3   4   5   6   7   8   
R125R002    10.133  0   0   1   0   0   0   0   0     
R125R002    10.134  1   0   1   0   0   0   0   0     
R125R002    10.135  0   0   1   0   0   0   0   0     
R125R002    10.136  0   0   1   0   0   0   0   0     
R125R002    10.137  0   0   1   0   0   0   0   0     
R125R002    10.138  0   0   1   0   0   0   0   0     
R125R002    10.139  0   0   1   0   0   0   0   0     
R125R002    10.14   5   0   1   0   0   0   0   0     
R125R002    10.141  1   0   1   0   0   0   0   0     
R125R002    10.142  0   0   1   0   0   0   0   0     
R125R002    10.143  0   0   1   0   0   0   0   0     
R125R002    10.144  0   0   1   0   0   0   0   0     
R125R002    10.145  0   0   1   0   0   0   0   0     
R125R002    10.146  0   0   1   0   0   0   0   0     
R125R002    10.147  0   0   1   0   0   0   0   0     
R125R002    10.148  0   0   1   0   0   0   0   0     
R125R002    10.149  0   0   1   0   0   0   0   0     
R125R002    10.15   0   0   1   0   0   0   0   0     
R125R002    10.151  0   0   1   0   0   0   0   0     
R125R002    10.152  0   0   1   0   0   0   0   0     
R125R002    10.153  0   0   1   0   0   0   0   0  

Required output dataframe:

Section Chainage Frame  1   2   3   4   5   6   7   8   
R125R002    10.133  1   1   0   1   0   0   0   0   0     
R125R002    10.138  2   0   0   1   0   0   0   0   0     
R125R002    10.143  3   6   0   1   0   0   0   0   0     
R125R002    10.148  4   0   0   1   0   0   0   0   0     
R125R002    10.153  5   0   0   1   0   0   0   0   0   

Dataframe 2 is in 1 m increments, while dataframe 1 is in 5 m increments. I want to merge dataframe 2 into dataframe 1 where Chainage falls between chainage_from and chainage_to, and then apply a group by: column 1 aggregated with sum, column 2 with max, and columns 3 through 8 with mean.

In SQL I would join the two frames on Section with a BETWEEN condition in the join, and then add a GROUP BY. Is there a way to achieve this in pandas?
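
For anyone who wants to run the answers below, here is a minimal reconstruction of the two frames; the names df1 and df2 are assumptions (both answers use them), and only the nonzero cells shown above are filled in:

import pandas as pd

# dataframe 1: one row per frame, with its chainage interval
df1 = pd.DataFrame({
    'Section': ['R125R002'] * 5,
    'chainage_from': [10.133, 10.138, 10.143, 10.148, 10.153],
    'chainage_to': [10.138, 10.143, 10.148, 10.153, 10.158],
    'Frame': [1, 2, 3, 4, 5],
})

# dataframe 2: one row per 1 m chainage step (10.133 .. 10.153),
# rounded to 3 decimals so float comparisons below stay exact
df2 = pd.DataFrame({
    'Section': ['R125R002'] * 21,
    'Chainage': [round(10.133 + i / 1000, 3) for i in range(21)],
})
for col in '12345678':
    df2[col] = 0
df2['3'] = 1                                   # column 3 is 1 on every row
df2.loc[df2['Chainage'].isin([10.134, 10.141]), '1'] = 1
df2.loc[df2['Chainage'] == 10.14, '1'] = 5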

2 answers:

Answer 0 (score: 1)

Merge the dataframes on Section so that each Chainage is matched into [chainage_from, chainage_to). Then groupby & aggregate, passing a dict that maps column names to the aggregation functions to use.

# match each df2 row to the df1 row with the nearest chainage_from
# at or below its Chainage, within the same Section
merged = pd.merge_asof(df2, df1, by='Section', left_on='Chainage', right_on='chainage_from')

Then group by frame and aggregate, which gives the required output:

merged.groupby(['Section', 'chainage_from', 'Frame'], as_index=False).agg(
    {'1': 'sum', '2': 'max', '3': 'mean', '4': 'mean',
     '5': 'mean', '6': 'mean', '7': 'mean', '8': 'mean'}
)
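
Note that merge_asof only matches on the nearest chainage_from at or below each Chainage; it does not enforce the upper bound. If the frames in df1 are not guaranteed to be contiguous, an explicit filter keeps the [chainage_from, chainage_to) semantics. A small sketch, assuming the merged frame from above:

# drop any row whose Chainage falls past the end of its matched frame
merged = merged[merged['Chainage'] < merged['chainage_to']]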

Answer 1 (score: 0)

We can create intervals with IntervalIndex, then use .loc to look up the df1 value (its Frame column) at each of df2's Chainage positions and assign it as a Frame column. Then we build a dict mapping column names to their different functions and use agg to get what you need.

# build half-open intervals [chainage_from, chainage_to) and use them as df1's index
idx = pd.IntervalIndex.from_arrays(left=df1.chainage_from, right=df1.chainage_to, closed='left')
df1.index = idx

# interval lookup: each Chainage selects the df1 row whose interval contains it
df2['Frame'] = df1.loc[df2.Chainage].Frame.values

# aggregation map: first Chainage, sum of '1', max of '2', mean of '3'..'8'
d = {'Chainage': 'first', '1': 'sum', '2': 'max'}
d.update(dict(zip(list('345678'), ['mean'] * 6)))

s = df2.groupby(['Section', 'Frame'], as_index=False).agg(d)
s
Out[294]: 
    Section  Frame  6  7  2  1  5  3  8  4  Chainage
0  R125R002      1  0  0  0  1  0  1  0  0    10.133
1  R125R002      2  0  0  0  6  0  1  0  0    10.138
2  R125R002      3  0  0  0  0  0  1  0  0    10.143
3  R125R002      4  0  0  0  0  0  1  0  0    10.148
4  R125R002      5  0  0  0  0  0  1  0  0    10.153
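
One caveat: when agg is given a dict, pandas versions of that era did not preserve column order, which is why the columns above come back scrambled. If the layout of the required output matters, the columns can simply be reselected afterwards:

# put the columns back in the order of the required output
s = s[['Section', 'Chainage', 'Frame'] + list('12345678')]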