Filling in missing values by reindexing

Time: 2015-10-13 10:32:41

Tags: python pandas indexing

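I have a dataset with temperature as one of its columns. Because of the way the heater works, there are many gaps in the data. To make different datasets directly comparable, I want to fill in the missing temperatures and put a corresponding NaN in the other column.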

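I tried to use the answer given here, which seems to be exactly what I want: link. But it doesn't work. I get a DataFrame with the new temperature values I want, but the corresponding data has disappeared: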

import pandas as pd 
import numpy as np           
A1 = pd.read_table('Test data.tsv', encoding='ISO-8859-1', header = 2) 
A1.columns = ['time',2,3,4,5,6,7,'freq',9,10,11,12,13,'temp',15,16,17,18,19] 
A1truncated = A1[A1.temp >= 25]; A1truncated=A1truncated[A1truncated.temp <= 350.1]
A1averaged = A1truncated.groupby(['temp'], as_index=False)['freq'].mean() 
A1averaged = np.around(A1averaged, decimals=1)

A1averaged.set_index('temp') 
new_index = pd.Index(np.arange(25, 350, 0.1), name='temp')
A1indexed = A1averaged.set_index('temp').reindex(new_index).reset_index() 

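This turns my 19 columns into 1, with temperature as the index (A1averaged), and then into 2 columns: the new list of temperatures and a column of empty data (A1indexed). Any idea why this doesn't work? Or is there another way to do it?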

1 Answer:

Answer 0: (score: 1)

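Calling reindex with a float index is problematic. reindex matches labels by exact equality, and the mismatch here is probably caused by floating-point precision: the values generated by np.arange(25, 350, 0.1) are not guaranteed to be identical to the rounded temperatures. So I use a small hack: an Int64Index instead of a Float64Index.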
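To illustrate the precision issue, here is a minimal sketch. The toy frame and its values are made up for illustration and are not taken from the question's file. It counts how many labels of the float grid differ from their one-decimal rounding, then reindexes the toy frame against the float grid and against an integer grid.

```python
import numpy as np
import pandas as pd

# Float grid as built in the question.
float_grid = np.arange(25, 350, 0.1)

# Count labels that differ from their one-decimal rounding, i.e. from
# the kind of value np.around leaves in the temp column.
n_mismatch = int((float_grid != np.round(float_grid, 1)).sum())
print(n_mismatch, "of", len(float_grid), "float labels are not exact tenths")

# Hypothetical toy data: three averaged readings.
toy = pd.DataFrame({'temp': [25.0, 25.2, 300.0], 'freq': [1.0, 2.0, 3.0]})

# reindex looks labels up by exact equality, so inexact labels find nothing.
lost = toy.set_index('temp').reindex(pd.Index(float_grid, name='temp'))
print(lost['freq'].notna().sum(), "of 3 readings survived the float grid")

# Integer grid in tenths of a degree: every label is exact.
int_grid = pd.Index(np.arange(250, 3500), name='temp')
toy_int = toy.assign(temp=(toy['temp'] * 10).round().astype(int))
kept = toy_int.set_index('temp').reindex(int_grid)
print(kept['freq'].notna().sum(), "of 3 readings survived the integer grid")
```

How many float labels mismatch depends on how NumPy accumulates the step, which is why the sketch prints the counts rather than assuming them.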

I tried to write the subset selection more simply:

A1truncated = A1[(A1.temp >= 25) & ( A1.temp <= 350.1)]
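As a side note, the same inclusive range filter could also be written with Series.between, which includes both endpoints by default. A small sketch on made-up data (the frame below is a hypothetical stand-in for A1):

```python
import pandas as pd

# Hypothetical stand-in for A1 with only the two columns used here.
A1 = pd.DataFrame({'temp': [20.0, 25.0, 100.3, 350.1, 360.0],
                   'freq': [1.0, 2.0, 3.0, 4.0, 5.0]})

# Combined boolean mask, as in the answer.
by_mask = A1[(A1.temp >= 25) & (A1.temp <= 350.1)]

# Equivalent filter; between() is inclusive on both ends by default.
by_between = A1[A1.temp.between(25, 350.1)]

print(by_between)
```

Either form avoids the two-step filtering from the question; which one reads better is a matter of taste.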

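Then drop the first set_index call, since the index was being set twice (the result of this standalone call is never assigned, so it has no effect):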

A1averaged.set_index('temp')
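The reason that line can be dropped is that set_index returns a new DataFrame and leaves the original untouched unless the result is assigned. A quick check on a made-up frame:

```python
import pandas as pd

df = pd.DataFrame({'temp': [25.0, 25.1], 'freq': [10.0, 11.0]})

df.set_index('temp')             # result is discarded; df is unchanged
print(list(df.columns))          # 'temp' is still an ordinary column

indexed = df.set_index('temp')   # assign the result to actually use it
print(indexed.index.name)
```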

Make new_index an Int64Index:

new_index = pd.Index(np.arange(250, 3500), name='temp')

To match the Int64Index, multiply the temp column by 10 before reindexing, and divide it by 10 again afterwards:

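# Scale to tenths of a degree; round and cast so the labels are true integers
A1averaged['temp'] = (A1averaged['temp'] * 10).round().astype(int)
# ... set_index('temp').reindex(new_index).reset_index() goes here ...
A1indexed['temp'] = A1indexed['temp'] / 10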
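Put together on a small made-up frame, the scale, reindex, scale-back sequence might look like the sketch below. The data and the narrow 25.0 to 25.5 range are assumptions chosen to keep the output short, and the round().astype(int) step is an addition that guards against products that are not exactly integral.

```python
import numpy as np
import pandas as pd

# Hypothetical averaged data with gaps at 25.1, 25.3 and 25.4.
averaged = pd.DataFrame({'temp': [25.0, 25.2, 25.5],
                         'freq': [100.0, 102.0, 105.0]})

# 1. Express temperature in integer tenths of a degree.
averaged['temp'] = (averaged['temp'] * 10).round().astype(int)

# 2. Reindex against a gap-free integer grid (250 .. 255 inclusive).
new_index = pd.Index(np.arange(250, 256), name='temp')
indexed = averaged.set_index('temp').reindex(new_index).reset_index()

# 3. Convert back to degrees.
indexed['temp'] = indexed['temp'] / 10

print(indexed)
```

The rows for the missing temperatures are present after step 2, with NaN in freq, which is the layout the question asks for.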

All together:

import pandas as pd 
import numpy as np           
A1 = pd.read_table('Test data.tsv', encoding='ISO-8859-1', header = 2) 

A1.columns = ['time',2,3,4,5,6,7,'freq',9,10,11,12,13,'temp',15,16,17,18,19] 

A1truncated = A1[(A1.temp >= 25) & ( A1.temp <= 350.1)]

A1averaged = A1truncated.groupby(['temp'], as_index=False)['freq'].mean() 
A1averaged = np.around(A1averaged, decimals=1)
new_index = pd.Index(np.arange(250, 3500), name='temp')

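# Scale to tenths of a degree; round and cast so the labels are true integers
A1averaged['temp'] = (A1averaged['temp'] * 10).round().astype(int)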
A1indexed = A1averaged.set_index('temp').reindex(new_index).reset_index()
A1indexed['temp'] = A1indexed['temp'] / 10
print(A1indexed.tail())
#       temp       freq
#3245  349.5  5830065.6
#3246  349.6  5830043.5
#3247  349.7  5830046.3
#3248  349.8  5830025.3
#3249  349.9  5830015.6
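
The question also asked whether there is another way. One possible variant, which is not part of the original answer, keeps temp as a float throughout and builds the target grid by dividing integers by 10 instead of stepping by 0.1. Each label is then the closest double to its one-decimal value, which is also what rounding to one decimal produces. A sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical averaged data, rounded to one decimal as in the question.
averaged = pd.DataFrame({'temp': [25.0, 25.2, 349.9],
                         'freq': [100.0, 102.0, 999.0]})
averaged['temp'] = averaged['temp'].round(1)

# Build 25.0, 25.1, ..., 349.9 from integers instead of a 0.1 float step.
new_index = pd.Index(np.arange(250, 3500) / 10, name='temp')

indexed = averaged.set_index('temp').reindex(new_index).reset_index()
print(indexed.head())
print(indexed['freq'].notna().sum(), "readings kept out of", len(averaged))
```

This still relies on exact float equality between the two sides, so it is more fragile than the integer index used in the answer, which sidesteps float comparison altogether.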