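How do I add a value to every other row of a DataFrame's float index?

Asked: 2019-04-24 23:02:11

Tags: python pandas numpy

I am recording data at 2000 Hz, which means I get a new data point every 0.5 ms. However, my recording software only records timestamps with 1 ms precision, so the float index of my DataFrame contains duplicate values.

To fix the duplicates, I want to add 0.0005 to every other row of the index. I tried the following, but so far it doesn't work: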

c = df.iloc[:,0] # select the first column of the dataframe
c = c.iloc[::-1]  # reverse order so that time is increasing not decreasing
pd.set_option('float_format', '{:f}'.format) # change the print output to show the decimals (instead of 15.55567E9)
i = c.index # get the index of c - the length is 20000
rp = np.matlib.repmat([0, 0.0005], 1, 10000) # create an array repeating [0, 0.0005] so that we can add 0.0005 to every other row
df.set_index(c, i+rp).astype(float).applymap('{:,.4f}'.format) # set the index of c to i+rp - attempt to format to 4 decimals
print(c) # see if it worked

Expected output (truncated to save space; not all 20,000 rows are shown):

1555677243.401000   4.569000
1555677243.401500   4.569000
1555677243.402000   4.571000
1555677243.402500   4.574000
1555677243.403000   4.574000
1555677243.403500   4.576000
1555677243.404000   4.577000
1555677243.404500   4.577000
1555677243.405000   4.577000
1555677243.405500   4.581000
1555677243.406000   4.581000
1555677243.406500   4.582000
1555677243.407000   4.581000
1555677243.407500   4.582000
1555677243.408000   4.580000
1555677243.408500   4.580000
1555677243.409000   4.582000
1555677243.409500   4.585000
1555677243.410000   4.585000
1555677243.410500   4.585000

Actual output (note the duplicates in the index):

1555677243.401000   4.569000
1555677243.401000   4.569000
1555677243.402000   4.571000
1555677243.402000   4.574000
1555677243.403000   4.574000
1555677243.403000   4.576000
1555677243.404000   4.577000
1555677243.404000   4.577000
1555677243.405000   4.577000
1555677243.405000   4.581000
1555677243.406000   4.581000
1555677243.406000   4.582000
1555677243.407000   4.581000
1555677243.407000   4.582000
1555677243.408000   4.580000
1555677243.408000   4.580000
1555677243.409000   4.582000
1555677243.409000   4.585000
1555677243.410000   4.585000
1555677243.410000   4.585000

4 Answers:

Answer 0 (score: 2)

import pandas as pd

df = pd.DataFrame({'A': [1,2,3,4,5,6,7,8,9],
                   'B': [1,2,3,4,5,6,7,8,9]})

# add 0.005 to column B in every second row (positions 1, 3, 5, ...)
df.iloc[1::2, 1] = df.iloc[1::2, :].eval('B + 0.005')

    A     B
0   1   1.000
1   2   2.005
2   3   3.000
3   4   4.005
4   5   5.000
5   6   6.005
6   7   7.000
7   8   8.005
8   9   9.000

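Just make sure you select the right column with the initial iloc. [1::2] selects every second row starting from position 1 (so 1, 3, and so on). You need to select all columns in the second iloc because eval only works on a DataFrame, not on a Series. You can then set that column as the index, as in your code.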
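Applied to timestamps, the same idea might look like the sketch below. The column names `t` and `value` and the sample numbers are made up for illustration, the offset is 0.0005 s (0.5 ms) to match the question, and a plain in-place addition is used instead of `eval`.

```python
import pandas as pd

# Hypothetical sample: column "t" holds timestamps in seconds with 1 ms
# resolution, so each value appears twice at a 2000 Hz sample rate.
df = pd.DataFrame({
    "t": [0.401, 0.401, 0.402, 0.402, 0.403, 0.403],
    "value": [4.569, 4.569, 4.571, 4.574, 4.574, 4.576],
})

# Shift every second row (positions 1, 3, 5, ...) of column 0 by 0.5 ms.
df.iloc[1::2, 0] += 0.0005

# Use the corrected timestamps as the index.
result = df.set_index("t")
print(result)
```

This assumes the timestamps are still an ordinary column; if they are already the index, `df.reset_index()` would turn them into one first.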

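Answer 1 (score: 1)

I don't have your DataFrame, but you could consider a loop that treats even and odd row positions differently. Could you show us the original DF?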

import pandas as pd

# read the two-column CSV (timestamp, value)
df = pd.read_csv('C:/random/d2', sep=',', header=None, names=['W1', 'W2'])
dfNew = pd.DataFrame(columns=['W1', 'W2'])
rows, columns = df.shape
for index in range(rows):
    if index % 2 == 0:
        # even positions keep their timestamp
        tempRow = ['{0:.6f}'.format(df.iat[index, 0]), df.iat[index, 1]]
    else:
        # odd positions are shifted by 0.5 ms
        tempRow = ['{0:.6f}'.format(df.iat[index, 0] + 0.0005), df.iat[index, 1]]
    dfNew.loc[len(dfNew)] = tempRow

print(df)
print('#############')
print(dfNew)

Data

1555677243.401000,4.569000
1555677243.401000,4.569000
1555677243.402000,4.571000
1555677243.402000,4.574000
1555677243.403000,4.574000
1555677243.403000,4.576000
1555677243.404000,4.577000
1555677243.404000,4.577000
1555677243.405000,4.577000
1555677243.405000,4.581000
1555677243.406000,4.581000
1555677243.406000,4.582000
1555677243.407000,4.581000
1555677243.407000,4.582000
1555677243.408000,4.580000
1555677243.408000,4.580000
1555677243.409000,4.582000
1555677243.409000,4.585000
1555677243.410000,4.585000
1555677243.410000,4.585000

Result

                   W1     W2
0   1555677243.401000  4.569
1   1555677243.401500  4.569
2   1555677243.402000  4.571
3   1555677243.402500  4.574
4   1555677243.403000  4.574
5   1555677243.403500  4.576
6   1555677243.404000  4.577
7   1555677243.404500  4.577
8   1555677243.405000  4.577
9   1555677243.405500  4.581
10  1555677243.406000  4.581
11  1555677243.406500  4.582
12  1555677243.407000  4.581
13  1555677243.407500  4.582
14  1555677243.408000  4.580
15  1555677243.408500  4.580
16  1555677243.409000  4.582
17  1555677243.409500  4.585
18  1555677243.410000  4.585
19  1555677243.410500  4.585

Answer 2 (score: 1)

You can pull out the index, convert it to a Series, modify it, and then set it back as the index (Index objects are immutable):

import pandas as pd

df = pd.DataFrame(list(range(10)), index=[x/ 1000 for x in range(10)])

new_index = df.index.to_series()
new_index[::2] += 0.0005
result = df.set_index(new_index)
print(result)

Output:

        0
0.0005  0
0.0010  1
0.0025  2
0.0030  3
0.0045  4
0.0050  5
0.0065  6
0.0070  7
0.0085  8
0.0090  9
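Note that the example above shifts the even positions (`[::2]`). For data shaped like the question's, where each timestamp appears twice in a row, it is the second row of each pair (the odd positions) that would need the offset. A minimal sketch under that assumption, with made-up sample values and `.iloc` used so the slice is explicitly positional:

```python
import pandas as pd

# Hypothetical sample with the same shape as the question's data:
# timestamps in seconds, each repeated twice.
df = pd.DataFrame(
    {"value": [4.569, 4.569, 4.571, 4.574]},
    index=[0.401, 0.401, 0.402, 0.402],
)

new_index = df.index.to_series()
# Shift positions 1, 3, ... by 0.5 ms; to_numpy() avoids any label alignment.
new_index.iloc[1::2] = new_index.iloc[1::2].to_numpy() + 0.0005

result = df.set_index(new_index)
print(result)
```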

Answer 3 (score: 1)

IIUC, using the data from gmds:

import numpy as np

df.index += np.arange(len(df)) % 2 * 0.0005
df
        0
0.0000  0
0.0015  1
0.0020  2
0.0035  3
0.0040  4
0.0055  5
0.0060  6
0.0075  7
0.0080  8
0.0095  9
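The modulo approach assumes the rows alternate strictly, so a single dropped sample would shift the wrong rows from that point on. One possible variant, which is not part of the original answer, derives the offset from how many times each timestamp has already occurred. The sample data below is made up, and the timestamp 0.403 deliberately appears only once:

```python
import pandas as pd

# Hypothetical sample; 0.403 appears only once to mimic a dropped sample.
df = pd.DataFrame(
    {"value": [4.569, 4.569, 4.571, 4.574, 4.574]},
    index=[0.401, 0.401, 0.402, 0.402, 0.403],
)

# 0 for the first occurrence of each timestamp, 1 for the second, and so on.
occurrence = df.groupby(level=0).cumcount().to_numpy()

# Offset each repeated timestamp by 0.5 ms per earlier occurrence.
df.index = df.index + occurrence * 0.0005

print(df)
print(df.index.is_unique)
```

This still assumes that at most two samples share a 1 ms timestamp; with three or more, the 0.5 ms step would collide with the next millisecond.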