I have two pandas DataFrames, shown below.
DataFrame 1:
Section chainage_from chainage_to Frame
R125R002 10.133 10.138 1
R125R002 10.138 10.143 2
R125R002 10.143 10.148 3
R125R002 10.148 10.153 4
R125R002 10.153 10.158 5
DataFrame 2:
Section Chainage 1 2 3 4 5 6 7 8
R125R002 10.133 0 0 1 0 0 0 0 0
R125R002 10.134 1 0 1 0 0 0 0 0
R125R002 10.135 0 0 1 0 0 0 0 0
R125R002 10.136 0 0 1 0 0 0 0 0
R125R002 10.137 0 0 1 0 0 0 0 0
R125R002 10.138 0 0 1 0 0 0 0 0
R125R002 10.139 0 0 1 0 0 0 0 0
R125R002 10.14 5 0 1 0 0 0 0 0
R125R002 10.141 1 0 1 0 0 0 0 0
R125R002 10.142 0 0 1 0 0 0 0 0
R125R002 10.143 0 0 1 0 0 0 0 0
R125R002 10.144 0 0 1 0 0 0 0 0
R125R002 10.145 0 0 1 0 0 0 0 0
R125R002 10.146 0 0 1 0 0 0 0 0
R125R002 10.147 0 0 1 0 0 0 0 0
R125R002 10.148 0 0 1 0 0 0 0 0
R125R002 10.149 0 0 1 0 0 0 0 0
R125R002 10.15 0 0 1 0 0 0 0 0
R125R002 10.151 0 0 1 0 0 0 0 0
R125R002 10.152 0 0 1 0 0 0 0 0
R125R002 10.153 0 0 1 0 0 0 0 0
Required output DataFrame:
Section Chainage Frame 1 2 3 4 5 6 7 8
R125R002 10.133 1 1 0 1 0 0 0 0 0
R125R002 10.138 2 0 0 1 0 0 0 0 0
R125R002 10.143 3 6 0 1 0 0 0 0 0
R125R002 10.148 4 0 0 1 0 0 0 0 0
R125R002 10.153 5 0 0 1 0 0 0 0 0
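DataFrame 2 is in 1 m increments, whereas DataFrame 1 is in 5 m increments. I want to merge DataFrame 2 into DataFrame 1 wherever Chainage falls between chainage_from and chainage_to, and then apply a group by. The aggregation for column 1 is sum, for column 2 it is max, and for columns 3 to 8 it is mean.
In SQL, I would join the two frames on Section, apply a BETWEEN condition in the join, and then add a GROUP BY. Is there a way to achieve this in pandas?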
Answer 0 (score: 1)
Merge the DataFrames on Section and filter so that Chainage lies in the half-open interval [chainage_from, chainage_to). Then group by and aggregate, passing a dictionary that maps each column name to the aggregation function to use.
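# Attach to each df2 row the df1 row with the largest chainage_from <= Chainage
# (merge_asof needs df2 sorted by Chainage and df1 sorted by chainage_from).
merged = pd.merge_asof(df2, df1, by='Section', left_on='Chainage', right_on='chainage_from')
# Keep only rows that actually fall inside [chainage_from, chainage_to).
merged = merged[merged['Chainage'] < merged['chainage_to']]
# Aggregate per frame: sum for '1', max for '2', mean for '3' to '8'.
merged.groupby(['Section', 'chainage_from', 'Frame'], as_index=False).agg(
{'1': 'sum', '2': 'max', '3': 'mean', '4': 'mean',
'5': 'mean', '6': 'mean', '7': 'mean', '8': 'mean'}
)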
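The prose of this answer describes a plain merge on Section followed by a range filter, which is the closest pandas analogue of the SQL BETWEEN join from the question. Below is a minimal sketch of that variant, not the answerer's exact code. It assumes a cut-down copy of the sample data (two frames, columns '1' to '3' only) and assumes the numeric column labels are strings; the names joined, in_range and agg_funcs are made up for illustration.

```python
import pandas as pd

# Cut-down copy of the question's sample data (assumption: labels are strings).
df1 = pd.DataFrame({
    "Section": ["R125R002"] * 2,
    "chainage_from": [10.133, 10.138],
    "chainage_to": [10.138, 10.143],
    "Frame": [1, 2],
})
df2 = pd.DataFrame({
    "Section": ["R125R002"] * 10,
    "Chainage": [10.133, 10.134, 10.135, 10.136, 10.137,
                 10.138, 10.139, 10.140, 10.141, 10.142],
    "1": [0, 1, 0, 0, 0, 0, 0, 5, 1, 0],
    "2": [0] * 10,
    "3": [1] * 10,
})

# SQL-style join: pair every df2 row with every df1 row of the same Section,
# then keep only the pairs where Chainage lies in [chainage_from, chainage_to).
joined = df2.merge(df1, on="Section")
in_range = (joined["Chainage"] >= joined["chainage_from"]) & (
    joined["Chainage"] < joined["chainage_to"]
)
joined = joined[in_range]

# One aggregation function per column, as requested in the question.
agg_funcs = {"1": "sum", "2": "max", "3": "mean"}
result = joined.groupby(
    ["Section", "chainage_from", "Frame"], as_index=False
).agg(agg_funcs)
print(result)
```

Note that the intermediate joined frame holds one row per (df2 row, df1 row) pair within a Section before filtering, so this variant can use a lot of memory on long sections; pd.merge_asof avoids that intermediate product.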
Answer 1 (score: 0)
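We can use IntervalIndex to create the intervals, then use .loc to look up the df1 row for each df2 position and assign the Frame column. After that, we create a dictionary that maps the column names to the different functions and use agg to get what you need.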
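# Build half-open intervals [chainage_from, chainage_to) and use them as df1's index.
idx = pd.IntervalIndex.from_arrays(left=df1.chainage_from, right=df1.chainage_to, closed='left')
df1.index = idx
# Look up the interval containing each Chainage and copy its Frame number onto df2.
df2['Frame'] = df1.loc[df2.Chainage].Frame.values
# Aggregation per column: first Chainage, sum of '1', max of '2', mean of '3' to '8'.
d = {'Chainage': 'first', '1': 'sum', '2': 'max'}
d.update(dict(zip(list('345678'), ['mean'] * 6)))
s = df2.groupby(['Section', 'Frame'], as_index=False).agg(d)
s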
Out[294]:
Section Frame 6 7 2 1 5 3 8 4 Chainage
0 R125R002 1 0 0 0 1 0 1 0 0 10.133
1 R125R002 2 0 0 0 6 0 1 0 0 10.138
2 R125R002 3 0 0 0 0 0 1 0 0 10.143
3 R125R002 4 0 0 0 0 0 1 0 0 10.148
4 R125R002 5 0 0 0 0 0 1 0 0 10.153
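One caveat with the .loc lookup above is that it raises a KeyError as soon as a Chainage value falls outside every interval. A possible variant, sketched here under assumptions rather than taken from the answer, uses IntervalIndex.get_indexer, which returns -1 for values that match no interval (it requires non-overlapping intervals). The sample data is a cut-down, partly invented copy (the 10.150 row is added to show an unmatched value), and the names intervals, pos and matched are hypothetical.

```python
import pandas as pd

# Cut-down sample data; the last df2 row (10.150) is invented to show a miss.
df1 = pd.DataFrame({
    "Section": ["R125R002"] * 2,
    "chainage_from": [10.133, 10.138],
    "chainage_to": [10.138, 10.143],
    "Frame": [1, 2],
})
df2 = pd.DataFrame({
    "Section": ["R125R002"] * 5,
    "Chainage": [10.133, 10.137, 10.138, 10.140, 10.150],
    "1": [0, 1, 0, 5, 2],
    "2": [0, 0, 0, 0, 0],
    "3": [1, 1, 1, 1, 1],
})

# Half-open intervals [chainage_from, chainage_to), one per df1 row.
intervals = pd.IntervalIndex.from_arrays(
    df1["chainage_from"], df1["chainage_to"], closed="left"
)

# Position of the containing interval for each Chainage, or -1 if none matches.
pos = intervals.get_indexer(df2["Chainage"].to_numpy())

# Drop unmatched rows and attach the Frame number of the matching interval.
hit = pos >= 0
matched = df2[hit].assign(Frame=df1["Frame"].to_numpy()[pos[hit]])

# Same style of aggregation dictionary as in the answer.
d = {"Chainage": "first", "1": "sum", "2": "max", "3": "mean"}
result = matched.groupby(["Section", "Frame"], as_index=False).agg(d)
print(result)
```

This sketch treats the whole table as a single Section; with several sections whose chainages overlap, the lookup would have to be done per Section.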