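I have a pandas DataFrame representing some measurements. The first column is a continuous variable whose increments are small, 0.1 or 0.2. I need to resample this variable (and the whole DataFrame with it) so that it increases by 1 at each step.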
0 494.84284
1 494.86824
2 494.89364
3 494.91904
4 494.94444
5 494.96984
6 494.99524
7 495.02064
8 495.04604
9 495.07144
10 495.09684
11 495.12224
12 495.14764
13 495.17304
14 495.19844
15 495.22384
16 495.24924
17 495.27464
18 495.30004
19 495.32544
20 495.35084
21 495.37624
22 495.40164
23 495.42704
24 495.45244
25 495.47784
26 495.50324
27 495.52864
28 495.55404
29 495.57944
I tried setting this column as the index and running the code below:
row_init = 0.0
for index, row in df.iterrows():
    if (index - row_init) < 1:
        # print(index)
        df.drop(index, inplace=True)
    row_init = index
    # print(row_init)
Example output:
0 494.84284
1 495.02064
2 496.47784
3 497.50324
4 498.52864
5 499.55404
6 500.57944
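Dropping rows from `df` while `iterrows()` is still walking over it is fragile. A minimal sketch of the same idea that collects the labels to keep and selects them once at the end, assuming the intent is to keep a row only when its index is at least 1 above the last kept index. The helper name `thin_by_step`, the column `signal`, and the sample values are hypothetical:

```python
import pandas as pd


def thin_by_step(df, step=1.0):
    """Keep rows whose index is at least `step` above the last kept index.

    Assumes the index is numeric, unique, and sorted in ascending order.
    """
    keep = []
    last = None
    for value in df.index:
        if last is None or value - last >= step:
            keep.append(value)
            last = value
    # Select all kept labels in one step instead of dropping inside the loop.
    return df.loc[keep]


# Hypothetical sample: the index grows by 0.25, with one measurement column.
sample = pd.DataFrame(
    {"signal": range(12)},
    index=[494.0 + 0.25 * i for i in range(12)],
)
thinned = thin_by_step(sample)
```

Here `thinned` holds the rows at 494.0, 495.0, and 496.0, and `sample` itself is left unchanged.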
Answer 0 (score: 0)
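It looks like you only want the first value for each integer, so you can group by the integer part of the values and take the first row of each group.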
import pandas as pd
df = pd.DataFrame({'data': [494.84284, 494.86824, 494.89364, 494.91904, 494.94444, 494.96984, 494.99524, 495.02064, 495.04604, 495.07144, 495.66072, 496.01247, 497.5000, 497.9777, 500.01354]})
# Group rows by the integer part of 'data' and keep the first row of each group
df.groupby(df['data'].astype(int)).first().reset_index(drop=True)
Output
data
0 494.84284
1 495.02064
2 496.01247
3 497.50000
4 500.01354
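The question mentions that the measured variable was set as the index and that the whole DataFrame should be thinned along with it. A minimal sketch of the same "first row per integer" idea applied to the index, assuming a numeric, non-negative index. The column name `signal` and the sample values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the measured variable is the index, plus one data column.
df = pd.DataFrame(
    {"signal": [10, 11, 12, 13, 14, 15]},
    index=[494.84, 494.97, 495.02, 495.66, 496.01, 497.50],
)

# Integer part of every index value; rows sharing it belong to the same bin.
bins = np.floor(df.index.to_numpy()).astype(int)

# Index.duplicated() flags every repeat of a bin after its first occurrence,
# so negating it keeps the first row of each bin with its original index value.
first_per_unit = df[~pd.Index(bins).duplicated()]
```

Unlike `groupby(...).first()`, which returns the first non-null entry of each column within a group, this boolean mask keeps whole rows untouched and preserves the original index values.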