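PySpark: inner join two PySpark DataFrames, selecting all columns from the first and a few columns from the second

Time: 2020-08-23 05:50:53

Tags: pyspark

I have two PySpark DataFrames, A and B. I want to inner join them, then select all columns from the first DataFrame and only a few columns from the second.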

A_df
id  column1  column2  column3  column4
1   A1       A2       A3       A4
2   A1       A2       A3       A4
3   A1       A2       A3       A4
4   A1       A2       A3       A4

B_df
id  column1  column2  column3  column4  column5  column6
1   B1       B2       B3       B4       B5       B6
2   B1       B2       B3       B4       B5       B6
3   B1       B2       B3       B4       B5       B6
4   B1       B2       B3       B4       B5       B6

joined_df
id  column1  column2  column3  column4  column5  column6
1   A1       A2       A3       A4       B5       B6
2   A1       A2       A3       A4       B5       B6
3   A1       A2       A3       A4       B5       B6
4   A1       A2       A3       A4       B5       B6

I am trying the following code:

joined_df = (A_df.alias('A_df').join(B_df.alias('B_df'),
                               on = A_df['id'] == B_df['id'],
                               how = 'inner')
                               .select('A_df.*',B_df.column5,B_df.column6))

But it gives a strange result in which values appear swapped between columns. How can I achieve this? Thanks in advance.

1 answer:

Answer 0: (score: 2)

What goes wrong? Everything works as expected.
