上传非时间数据

时间:2017-03-27 14:35:58

标签: python pandas

是否有更好的方法根据每行中包含的启动和停止条件对一系列浮动进行上采样?这是我开始的样本:

borehole   top  bottom  lithology
0     AP-2  94.6    95.1   dolomite
1     AP-2  95.1    96.7  limestone
2     AP-2  96.7    97.0   dolomite
3     AP-2  97.0    97.5  limestone
4     AP-2  97.5    97.8  limestone
5     AP-3  87.4    87.7  limestone
6     AP-3  87.7    88.1  limestone
7     AP-3  88.1    88.5   dolomite
8     AP-3  88.5    89.1  limestone

对于每一行,我想使用选定的增量在顶部和底部值之间添加行,并向前填充钻孔和岩性值。数据帧中的增量是常量。这是我想要的填充数据帧,增量为0.1。 (这仅适用于上述数据帧中的前两行:)

   borehole   Top       Lith
0      AP-2  94.6   dolomite
1      AP-2  94.7   dolomite
2      AP-2  94.8   dolomite
3      AP-2  94.9   dolomite
4      AP-2  95.0   dolomite
5      AP-2  95.1  limestone
6      AP-2  95.2  limestone
7      AP-2  95.3  limestone
8      AP-2  95.4  limestone
9      AP-2  95.5  limestone
10     AP-2  95.6  limestone
11     AP-2  95.7  limestone
12     AP-2  95.8  limestone
13     AP-2  95.9  limestone
14     AP-2  96.0  limestone
15     AP-2  96.1  limestone
16     AP-2  96.2  limestone
17     AP-2  96.3  limestone
18     AP-2  96.4  limestone
19     AP-2  96.5  limestone
20     AP-2  96.6  limestone
21     AP-2  96.7  limestone

这里是我使用的代码并且它有效,但是当我在熊猫中制作一个循环时,我想知道我是否遗漏了一些明显的东西。 pd.DataFrame.resample()非常诱人,但我无法弄清楚如何使其与非时间数据一起使用。

import pandas as pd
import numpy as np

liths = pd.DataFrame(
{'borehole': {0: 'AP-2',
  1: 'AP-2',
  2: 'AP-2',
  3: 'AP-2',
  4: 'AP-2',
  5: 'AP-3',
  6: 'AP-3',
  7: 'AP-3',
  8: 'AP-3'},
 'bottom': {0: 95.099999999999994,
  1: 96.700000000000003,
  2: 97.0,
  3: 97.5,
  4: 97.799999999999997,
  5: 87.700000000000003,
  6: 88.099999999999994,
  7: 88.5,
  8: 89.099999999999994},
 'lithology': {0: 'dolomite',
  1: 'limestone',
  2: 'dolomite',
  3: 'limestone',
  4: 'limestone',
  5: 'limestone',
  6: 'limestone',
  7: 'dolomite',
  8: 'limestone'},
 'top': {0: 94.599999999999994,
  1: 95.099999999999994,
  2: 96.700000000000003,
  3: 97.0,
  4: 97.5,
  5: 87.400000000000006,
  6: 87.700000000000003,
  7: 88.099999999999994,
  8: 88.5}}
)

filled = []
increment = 0.1
for row in liths.itertuples():
    start = row.top
    end = row.bottom
    for i in np.arange(start, end, increment):
        filled.append([row.borehole, i, row.lithology])
filled = pd.DataFrame(filled, columns=['borehole', 'Top', 'Lith']); filled

1 个答案:

答案 0 :(得分:2)

我从每一行构建一个新的数据框并将它们全部连接起来

def expand(r):
    a = np.arange(r.top, r.bottom, .1)
    n = len(a)
    return pd.DataFrame(dict(
            borehole=[r.borehole] * n,
            lithology=[r.lithology] * n,
            top=a
        ))

pd.concat([expand(r) for r in df.itertuples()], ignore_index=True)
   borehole  lithology   top
0      AP-2   dolomite  94.6
1      AP-2   dolomite  94.7
2      AP-2   dolomite  94.8
3      AP-2   dolomite  94.9
4      AP-2   dolomite  95.0
5      AP-2  limestone  95.1
6      AP-2  limestone  95.2
7      AP-2  limestone  95.3
8      AP-2  limestone  95.4
9      AP-2  limestone  95.5
10     AP-2  limestone  95.6
11     AP-2  limestone  95.7
12     AP-2  limestone  95.8
13     AP-2  limestone  95.9
14     AP-2  limestone  96.0
15     AP-2  limestone  96.1
16     AP-2  limestone  96.2
17     AP-2  limestone  96.3
18     AP-2  limestone  96.4
19     AP-2  limestone  96.5
20     AP-2  limestone  96.6
21     AP-2  limestone  96.7
22     AP-2   dolomite  96.7
23     AP-2   dolomite  96.8
24     AP-2   dolomite  96.9
25     AP-2  limestone  97.0
26     AP-2  limestone  97.1
27     AP-2  limestone  97.2
28     AP-2  limestone  97.3
29     AP-2  limestone  97.4
30     AP-2  limestone  97.5
31     AP-2  limestone  97.6
32     AP-2  limestone  97.7
33     AP-3  limestone  87.4
34     AP-3  limestone  87.5
35     AP-3  limestone  87.6
36     AP-3  limestone  87.7
37     AP-3  limestone  87.8
38     AP-3  limestone  87.9
39     AP-3  limestone  88.0
40     AP-3   dolomite  88.1
41     AP-3   dolomite  88.2
42     AP-3   dolomite  88.3
43     AP-3   dolomite  88.4
44     AP-3   dolomite  88.5
45     AP-3  limestone  88.5
46     AP-3  limestone  88.6
47     AP-3  limestone  88.7
48     AP-3  limestone  88.8
49     AP-3  limestone  88.9
50     AP-3  limestone  89.0