NumPy and pandas interpolation also changes the original data

Time: 2015-12-22 15:00:59

Tags: python numpy pandas interpolation

I am trying to interpolate data for missing days. The original data is:

2012-06-27 00:00:00 17
2012-06-27 01:00:00 17
2012-06-27 02:00:00 18
2012-06-27 03:00:00 18
2012-06-27 04:00:00 19
2012-06-27 05:00:00 20
2012-06-27 06:00:00 22
2012-06-27 07:00:00 23
2012-06-27 08:00:00 25
2012-06-27 09:00:00 27
2012-06-27 10:00:00 27
2012-06-27 11:00:00 29
2012-06-27 12:00:00 29
2012-06-27 13:00:00 30
2012-06-27 14:00:00 30
2012-06-27 15:00:00 29
2012-06-27 16:00:00 28
2012-06-27 17:00:00 26
2012-06-27 18:00:00 25
2012-06-27 19:00:00 24
2012-06-27 20:00:00 23
2012-06-27 21:00:00 23
2012-06-27 22:00:00 16
2012-06-27 23:00:00 15
2012-06-29 00:00:00 15
2012-06-29 01:00:00 16
2012-06-29 02:00:00 16
2012-06-29 03:00:00 16
2012-06-29 04:00:00 17
2012-06-29 05:00:00 17
2012-06-29 06:00:00 18
2012-06-29 07:00:00 19
2012-06-29 08:00:00 20
2012-06-29 09:00:00 22
2012-06-29 10:00:00 22
2012-06-29 11:00:00 22
2012-06-29 12:00:00 22
2012-06-29 13:00:00 22
2012-06-29 14:00:00 22
2012-06-29 15:00:00 22
2012-06-29 16:00:00 21
2012-06-29 17:00:00 19
2012-06-29 18:00:00 17
2012-06-29 19:00:00 16
2012-06-29 20:00:00 15
2012-06-29 21:00:00 14
2012-06-29 22:00:00 14
2012-06-29 23:00:00 13

As you can see, 2012-06-28 is missing, so I tried interpolating with NumPy and pandas. For NumPy, the code is:

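import numpy as np

def inter_lin_nan(ts_temp, rule):
    # Put the series on a regular grid; timestamps with no data become NaN.
    # Current pandas needs an explicit aggregation after resample().
    ts_temp = ts_temp.resample(rule).mean()
    mask = np.isnan(ts_temp.to_numpy())
    # Linearly interpolate the missing values by position
    ts_temp[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), ts_temp[~mask])
    return ts_temp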

And with pandas I used:

df_temp=df_temp.asfreq('1h')
df_temp['Temp2'] = df_temp['temp'].interpolate(method='linear')

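The problem is that both methods interpolate the missing day, but they also change the original data for 2012-06-29. Do you know why this happens, or what I am missing?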

1 answer:

Answer 0: (score: 0)

I can't reproduce the problem, but this works for me (assuming your DataFrame is indexed by datetime):

Output:

[Image: plot of the original and interpolated series]

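As you can see, on the days that have data the lines overlap exactly: no original data was changed. The interpolation also looks sensible. In this plot, the missing values in the original series were set to 0 for comparison.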
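
The check described above can also be done numerically rather than by eye. The following is a minimal sketch, not the code from the answer: it rebuilds the hourly series from the question (the variable names are illustrative assumptions), regularizes it with `asfreq`, interpolates once with pandas and once with `np.interp`, and then compares the result against the original values at the original timestamps only.

```python
import numpy as np
import pandas as pd

# Hourly temperatures copied from the question; 2012-06-28 is absent.
temps_27 = [17, 17, 18, 18, 19, 20, 22, 23, 25, 27, 27, 29,
            29, 30, 30, 29, 28, 26, 25, 24, 23, 23, 16, 15]
temps_29 = [15, 16, 16, 16, 17, 17, 18, 19, 20, 22, 22, 22,
            22, 22, 22, 22, 21, 19, 17, 16, 15, 14, 14, 13]

index = pd.date_range("2012-06-27", periods=24, freq="h").append(
    pd.date_range("2012-06-29", periods=24, freq="h"))
original = pd.Series(temps_27 + temps_29, index=index, dtype=float, name="temp")

# Regular hourly grid: the 24 hours of 2012-06-28 appear as NaN.
regular = original.asfreq("h")

# pandas route: linear interpolation fills only the NaN positions.
interpolated = regular.interpolate(method="linear")

# NumPy route: interpolate by position on a copy of the values.
values = regular.to_numpy(copy=True)
mask = np.isnan(values)
values[mask] = np.interp(np.flatnonzero(mask), np.flatnonzero(~mask), values[~mask])
interpolated_np = pd.Series(values, index=regular.index, name="temp")

# Compare against the original at the original timestamps only.
unchanged_pandas = np.array_equal(
    interpolated.loc[original.index].to_numpy(), original.to_numpy())
unchanged_numpy = np.array_equal(
    interpolated_np.loc[original.index].to_numpy(), original.to_numpy())

print("pandas left original values unchanged:", unchanged_pandas)
print("NumPy left original values unchanged:", unchanged_numpy)
```

If either comparison came back false on real data, one thing to look at would be whether the series was aggregated or re-aligned (for example by a resampling step) before the interpolation ran, since interpolation itself only fills the NaN positions.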