Question

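Suppose I have two DataFrames:
df1: col1 col2 col3
df2: col1 col2 col4

I want to join the two DataFrames on col1 and col2 without defining new alias names for the tables.

I don't want to write

df = df1.join(df2, (df1.col1 == df2.col1) & (df1.col2 == df2.col2))

and then also have to drop the duplicate join columns after the join.

So the final DataFrame should have only col1, col2, col3, and col4.

How can I achieve this?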

Answer 1

For Spark DataFrames, use the following.
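
A minimal PySpark sketch of one way to do this, assuming the join keys have the same names in both DataFrames. Passing the key names as a list to `DataFrame.join` performs an equi-join and keeps a single copy of each key column, so no aliases and no follow-up `drop` are needed. The helper name `join_on_shared_columns`, the `demo` function, and the sample rows are illustrative assumptions, not part of the original answer.

```python
def join_on_shared_columns(left, right, keys, how="inner"):
    """Join two Spark DataFrames on key columns that share the same names.

    Passing column names (rather than a boolean Column expression) to
    DataFrame.join makes Spark emit each key column only once.
    """
    return left.join(right, on=list(keys), how=how)


def demo():
    # Illustrative only: requires pyspark and a local Spark runtime.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("join-demo").getOrCreate()
    )
    # Hypothetical sample rows matching the schemas in the question.
    df1 = spark.createDataFrame([(1, "a", 10)], ["col1", "col2", "col3"])
    df2 = spark.createDataFrame([(1, "a", 20)], ["col1", "col2", "col4"])

    df = join_on_shared_columns(df1, df2, ["col1", "col2"])
    print(df.columns)  # ['col1', 'col2', 'col3', 'col4']
    spark.stop()
```

The direct call `df1.join(df2, ["col1", "col2"])` is equivalent. The helper only makes the join type explicit and accepts any iterable of key names.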

