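I have a pandas DataFrame in which one of the columns is the one I want to use as the index.
However, the index values are strings made up of a letter followed by a number.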
Index_H
['N0000', 'N0002', 'N0003', 'N0007', 'N0011', 'N0017', 'N0018', 'N0020', 'N0021', 'N0023', 'N0026', 'N0027', 'N0028', 'N0030', 'N0033', 'N0034', 'N0045', 'N0050', 'N0052', 'N0055', 'N0056', 'N0057', 'N0059']
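Each of these index values has an associated row of values, and each N**** row spans 344 columns (1 × 344). I want to convert this into a continuous index, with zeros/NaN filled in for the rows whose index label is missing.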
Answer 0 (score: 2)
import pandas as pd

df2 = pd.DataFrame({'foo': ['one', 'one', 'two', 'two', 'two'],
                    'bar': ['B', 'C', 'A', 'B', 'C'],
                    'baz': [2, 3, 4, 5, 6]},
                   index=['N0001', 'N0004', 'N0005', 'N0006', 'N0009'])
# Build every label from N0001 to N0010; labels absent from df2 become NaN rows.
idx = 'N' + pd.Series(range(1, 11)).astype(str).str.zfill(4)
df2.reindex(idx)
Out[365]:
       bar  baz  foo
N0001    B  2.0  one
N0002  NaN  NaN  NaN
N0003  NaN  NaN  NaN
N0004    C  3.0  one
N0005    A  4.0  two
N0006    B  5.0  two
N0007  NaN  NaN  NaN
N0008  NaN  NaN  NaN
N0009    C  6.0  two
N0010  NaN  NaN  NaN
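Applied to labels like those in the question, the same idea might look like the sketch below. The single `value` column and the names `index_h`, `full_idx`, `filled_nan`, and `filled_zero` are placeholders for illustration (the real frame has 344 columns), and only the first few labels from the question are used. Passing `fill_value=0` to `reindex` is one way to get zeros instead of NaN.

```python
import pandas as pd

# First few labels from the question; the data column is a made-up placeholder.
index_h = ['N0000', 'N0002', 'N0003', 'N0007', 'N0011']
df = pd.DataFrame({'value': [1, 2, 3, 4, 5]}, index=index_h)

# Strip the leading 'N' and read the remainder as integers.
nums = df.index.str[1:].astype(int)

# Every label between the smallest and largest number, zero-padded to 4 digits.
full_idx = ['N%04d' % n for n in range(int(nums.min()), int(nums.max()) + 1)]

filled_nan = df.reindex(full_idx)                 # missing rows are filled with NaN
filled_zero = df.reindex(full_idx, fill_value=0)  # missing rows are filled with 0
```

Note that filling with NaN converts integer columns to float, while `fill_value=0` keeps them as integers.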
Answer 1 (score: 1)
Use reindex:
In [350]: df
Out[350]:
       a
N0000  1
N0002  2
N0003  3
N0007  4

In [358]: df.reindex(['N%04d' % x for x in range(int(df.index[-1][1:])+1)])
Out[358]:
         a
N0000  1.0
N0001  NaN
N0002  2.0
N0003  3.0
N0004  NaN
N0005  NaN
N0006  NaN
N0007  4.0
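The one-liner above reads the upper bound from `df.index[-1]`, so it appears to rely on the last label being the largest. If the index might not be sorted, a variant that takes the maximum of the parsed numbers could be sketched as follows. The helper name `reindex_contiguous` and its `prefix` and `width` parameters are hypothetical, not part of pandas.

```python
import pandas as pd

def reindex_contiguous(df, prefix='N', width=4):
    """Hypothetical helper: reindex df onto prefix+0 .. prefix+max, zero-padded."""
    # Parse the numeric part of every label, regardless of row order.
    nums = df.index.str[len(prefix):].astype(int)
    upper = int(nums.max())
    labels = [f'{prefix}{n:0{width}d}' for n in range(upper + 1)]
    # Labels not present in df come back as NaN rows.
    return df.reindex(labels)

# Same data as above, but with the rows deliberately out of order.
df = pd.DataFrame({'a': [4, 1, 3, 2]},
                  index=['N0007', 'N0000', 'N0003', 'N0002'])
result = reindex_contiguous(df)
```

Because `reindex` returns the rows in the order of the requested labels, `result` also comes back sorted by label.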