Python: resampling from two DataFrames

Date: 2019-04-14 19:30:02

Tags: python pandas

I have two DataFrames:

df1 (called df in the code below)

x   y
10  10
47  9
58  8
68  7
75  6
80  5

df2

x   y
45  10
55  9
66  8
69  7
79  6
82  5

They can be created with:

import pandas as pd
df = pd.DataFrame({'x': [10, 47, 58, 68, 75, 80],
                   'y': [10, 9, 8, 7, 6, 5]})
df2 = pd.DataFrame({'x': [45, 55, 66, 69, 79, 82], 'y': [10, 9, 8, 7, 6, 5]})

I want to interpolate between them and generate a new DataFrame with a sampling rate of N. In this example, assume N = 3.

How can I create the desired output using DataFrames?

1 answer:

Answer 0 (score: 0)

If you don't mind using numpy, this solution gives you the desired output:

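import pandas as pd
import numpy as np

N = 3  # number of samples generated for each pair of rows

df = pd.DataFrame({'x': [10, 47, 58, 68, 75, 80],
                   'y': [10, 9, 8, 7, 6, 5]})
df2 = pd.DataFrame({'x': [45, 55, 66, 69, 79, 82], 'y': [10, 9, 8, 7, 6, 5]})

# For each row, take N evenly spaced x values from df['x'] to df2['x'].
new_x = np.array([np.linspace(i, j, N) for i, j in zip(df['x'], df2['x'])]).flatten()
# Repeat each y value N times. Using a plain array (not a Series with
# repeated labels) keeps the result's index as 0, 1, 2, ...
new_y = np.repeat(df['y'].to_numpy(), N)

final_df = pd.DataFrame({'x': new_x, 'y': new_y})

print(final_df)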

Output

       x   y
0   10.0  10
1   27.5  10
2   45.0  10
3   47.0   9
...
15  80.0   5
16  81.0   5
17  82.0   5
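
If the same resampling is needed more than once, the approach above could be wrapped in a small function. The following is only a sketch: the name resample_between is hypothetical, and it assumes both frames have the same number of rows and columns named x and y. It also swaps the list comprehension for a single np.linspace call with array endpoints, which needs NumPy 1.16 or newer.

```python
import numpy as np
import pandas as pd


def resample_between(start_df, end_df, n):
    """Hypothetical helper: n evenly spaced x values per row pair, y taken from start_df."""
    # With array endpoints and axis=1, linspace returns one row of n samples
    # per input row, i.e. an array of shape (len(start_df), n).
    xs = np.linspace(start_df['x'].to_numpy(), end_df['x'].to_numpy(), n, axis=1)
    # Repeat each y value n times so it lines up with the flattened x samples.
    ys = np.repeat(start_df['y'].to_numpy(), n)
    return pd.DataFrame({'x': xs.ravel(), 'y': ys})


df = pd.DataFrame({'x': [10, 47, 58, 68, 75, 80],
                   'y': [10, 9, 8, 7, 6, 5]})
df2 = pd.DataFrame({'x': [45, 55, 66, 69, 79, 82], 'y': [10, 9, 8, 7, 6, 5]})

result = resample_between(df, df2, 3)
print(result)
```

Note that y is simply repeated here, as in the answer; if y differed between the two frames and should be interpolated too, the same np.linspace call could be applied to the y columns.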