Question

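I have a DataFrame (220 rows × 2 columns) and a list of values [41, 84, 129, 174, 219, 45]. I want to insert a new row (containing -999, -999) into the DataFrame below each index position given in my list. For example: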

40   400  -47.595322  
41   410   13.159509  
42     0 -235.865433  
43     8 -102.183365

would become:

40   400  -47.595322  
41   410   13.159509  
42   -999  -999  
43     0 -235.865433  
44     8 -102.183365

And so on. Thanks :)

What I have so far:

import pandas as pd
import numpy as np
import glob
path =r'MyPath'
allFiles = glob.glob(path + "/*.dat")

frame = pd.DataFrame()
list_ = []
for file_ in allFiles:
    df = pd.read_csv(file_, delim_whitespace=True, index_col=None, header=None)
    list_.append(df)

frame = pd.concat(list_)
frame.columns = ['age', 'dt']
frame = frame.reset_index(drop=True)
idx = [] + list(frame['age'][frame['age'] == 410].index) + [df.index[-1]+1]
idx = np.array(idx)

df = pd.DataFrame(
np.insert(frame.values, idx + 1, -999, axis=0), columns=frame.columns)


print(df.to_string())

Answer 1

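If your DataFrame has a monotonically increasing integer index starting at 0 (as it does after reset_index(drop=True)), so that index labels coincide with row positions, you can do this with np.insert: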

idx = np.array([41, 84, 129, 174, 219, 45])
df = pd.DataFrame(
    np.insert(df.values, idx + 1, -999, axis=0), columns=df.columns
)

If not, you need to call index.get_loc to translate each index label into its position in the array:

idx = np.array([df.index.get_loc(i) for i in idx])

and then call the insertion code as before.
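
As a minimal sketch of the get_loc variant, assuming a small made-up frame whose index labels (40 to 43) do not match row positions; the names labels, pos and result are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: index labels 40..43 differ from row positions 0..3.
df = pd.DataFrame(
    {
        "age": [400, 410, 0, 8],
        "dt": [-47.595322, 13.159509, -235.865433, -102.183365],
    },
    index=[40, 41, 42, 43],
)

labels = [41, 43]  # insert a sentinel row below each of these index labels

# Translate index labels into positions within df.values.
# (get_loc returns a plain integer only when the index is unique.)
pos = np.array([df.index.get_loc(label) for label in labels])

# np.insert places each new row *before* the given position,
# so pos + 1 puts it directly below the matched row.
result = pd.DataFrame(
    np.insert(df.values, pos + 1, -999, axis=0), columns=df.columns
)
print(result)
```

The result gets a fresh 0-based index; the original labels are not carried over by this approach.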

Demo:

df
    A    B           C
0  40  400  -47.595322
1  41  410   13.159509
2  42    0 -235.865433
3  43    8 -102.183365

idx = np.array([1, 3])
pd.DataFrame(
    np.insert(df.values, idx + 1, -999, axis=0), columns=df.columns
)

       A      B           C
0   40.0  400.0  -47.595322
1   41.0  410.0   13.159509
2 -999.0 -999.0 -999.000000
3   42.0    0.0 -235.865433
4   43.0    8.0 -102.183365
5 -999.0 -999.0 -999.000000
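
The integer columns come back as floats because df.values converts the mixed int/float frame into a single float array. If that matters, one possible alternative (not the np.insert method above, just a sketch of a different technique) is to give the sentinel rows fractional index labels and sort; the names filler and out are illustrative only:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "A": [40, 41, 42, 43],
        "B": [400, 410, 0, 8],
        "C": [-47.595322, 13.159509, -235.865433, -102.183365],
    }
)
idx = [1, 3]  # positions below which a sentinel row should appear

# Sentinel rows get labels halfway between a row and its successor,
# e.g. 1.5 sorts between 1 and 2. Assumes a default 0..n-1 index.
filler = pd.DataFrame(-999, index=[i + 0.5 for i in idx], columns=df.columns)

# Concatenate, sort by label to interleave, then renumber the rows.
out = pd.concat([df, filler]).sort_index().reset_index(drop=True)
print(out)
print(out.dtypes)
```

Here each column is concatenated on its own, so A and B stay integer while C stays float.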

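Watch out for invalid index accesses: np.insert raises an IndexError if any insertion position is out of bounds for the array.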
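
One way to guard against this, sketched under the assumption that out-of-range entries should simply be dropped (raising an error instead may suit your case better), is to filter the positions first; the name valid is illustrative only:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [40, 41, 42, 43], "B": [400, 410, 0, 8]})
idx = np.array([1, 3, 7])  # 7 is deliberately out of range for 4 rows

# Keep only positions that refer to an existing row (0 .. len(df) - 1).
valid = idx[(idx >= 0) & (idx < len(df))]

out = pd.DataFrame(
    np.insert(df.values, valid + 1, -999, axis=0), columns=df.columns
)
print(out)
```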

Insert new rows at index positions specified by a list of values
