Filling missing values in pandas when the data is in a sequence

Time: 2019-03-30 17:48:28

Tags: python pandas

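I am having a hard time finding resources on how to fill missing values in pandas when the data follows a certain order.

For example, in the sample below I want to fill col2 with B9 for HHC_2019_02_03_53.png and with B10 for HHC_2019_02_03_54.png. Is there anything in the fillna method that lets me increment a pandas Series, or is there a way to fill the missing values using the previous and next values of the same column?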

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['HHC_2019_02_03_0.png', 'A1'],
       ['HHC_2019_02_03_1.png', 'A2'],
       ['HHC_2019_02_03_2.png', 'A3'],
       ['HHC_2019_02_03_3.png', 'A4'],
       ['HHC_2019_02_03_4.png', 'A5'],
       ['HHC_2019_02_03_5.png', 'A6'],
       ['HHC_2019_02_03_6.png', 'A7'],
       ['HHC_2019_02_03_7.png', 'A8'],
       ['HHC_2019_02_03_8.png', 'A9'],
       ['HHC_2019_02_03_9.png', 'A10'],
       ['HHC_2019_02_03_10.png', 'A11'],
       ['HHC_2019_02_03_11.png', 'A12'],
       ['HHC_2019_02_03_12.png', 'A13'],
       ['HHC_2019_02_03_13.png', 'A14'],
       ['HHC_2019_02_03_14.png', 'A15'],
       ['HHC_2019_02_03_15.png', 'A16'],
       ['HHC_2019_02_03_16.png', 'A17'],
       ['HHC_2019_02_03_17.png', 'A18'],
       ['HHC_2019_02_03_18.png', 'A19'],
       ['HHC_2019_02_03_19.png', 'A20'],
       ['HHC_2019_02_03_20.png', 'A21'],
       ['HHC_2019_02_03_21.png', 'A22'],
       ['HHC_2019_02_03_22.png', 'A23'],
       ['HHC_2019_02_03_23.png', 'A24'],
       ['HHC_2019_02_03_24.png', 'A25'],
       ['HHC_2019_02_03_25.png', 'A26'],
       ['HHC_2019_02_03_26.png', 'A27'],
       ['HHC_2019_02_03_27.png', 'A28'],
       ['HHC_2019_02_03_28.png', 'A29'],
       ['HHC_2019_02_03_29.png', 'A30'],
       ['HHC_2019_02_03_30.png', 'A31'],
       ['HHC_2019_02_03_31.png', 'A32'],
       ['HHC_2019_02_03_32.png', 'A33'],
       ['HHC_2019_02_03_33.png', 'A34'],
       ['HHC_2019_02_03_34.png', 'A35'],
       ['HHC_2019_02_03_35.png', 'A36'],
       ['HHC_2019_02_03_36.png', 'Z3'],
       ['HHC_2019_02_03_37.png', 'Z2'],
       ['HHC_2019_02_03_38.png', 'Z1'],
       ['HHC_2019_02_03_39.png', 'Z4'],
       ['HHC_2019_02_03_40.png', 'Z5'],
       ['HHC_2019_02_03_41.png', 'Z6'],
       ['HHC_2019_02_03_42.png', 'Z7'],
       ['HHC_2019_02_03_43.png', 'Z8'],
       ['HHC_2019_02_03_44.png', 'Z9'],
       ['HHC_2019_02_03_45.png', 'B5'],
       ['HHC_2019_02_03_46.png', 'B2'],
       ['HHC_2019_02_03_47.png', 'B4'],
       ['HHC_2019_02_03_48.png', 'B4'],
       ['HHC_2019_02_03_49.png', 'B5'],
       ['HHC_2019_02_03_50.png', 'B6'],
       ['HHC_2019_02_03_51.png', 'B7'],
       ['HHC_2019_02_03_52.png', 'B8'],
       ['HHC_2019_02_03_53.png', np.nan],
       ['HHC_2019_02_03_54.png', np.nan],
       ['HHC_2019_02_03_55.png', 'C1'],
       ['HHC_2019_02_03_56.png', 'C2'],
       ['HHC_2019_02_03_57.png', 'C3'],
       ['HHC_2019_02_03_58.png', 'C4'],
       ['HHC_2019_02_03_59.png', 'C5'],
       ['HHC_2019_02_03_60.png', 'C6'],
       ['HHC_2019_02_03_61.png', 'C7'],
       ['HHC_2019_02_03_62.png', 'C8'],
       ['HHC_2019_02_03_63.png', 'C9'],
       ['HHC_2019_02_03_64.png', 'C10'],
       ['HHC_2019_02_03_65.png', 'C11'],
       ['HHC_2019_02_03_66.png', 'C12'],
       ['HHC_2019_02_03_67.png', 'C13'],
       ['HHC_2019_02_03_68.png', 'C14'],
       ['HHC_2019_02_03_69.png', 'C15'],
       ['HHC_2019_02_03_70.png', 'C16'],
       ['HHC_2019_02_03_71.png', 'C17'],
       ['HHC_2019_02_03_72.png', 'C18'],
       ['HHC_2019_02_03_73.png', 'EE1'],
       ['HHC_2019_02_03_74.png', 'EE2'],
       ['HHC_2019_02_03_75.png', 'EE3'],
       ['HHC_2019_02_03_76.png', 'EE4'],
       ['HHC_2019_02_03_77.png', 'EE5'],
       ['HHC_2019_02_03_78.png', 'EE6'],
       ['HHC_2019_02_03_79.png', 'F1'],
       ['HHC_2019_02_03_80.png', 'F2'],
       ['HHC_2019_02_03_81.png', 'F3'],
       ['HHC_2019_02_03_82.png', 'F4'],
       ['HHC_2019_02_03_83.png', 'F5'],
       ['HHC_2019_02_03_84.png', 'F6'],
       ['HHC_2019_02_03_85.png', 'F7'],
       ['HHC_2019_02_03_86.png', 'F8'],
       ['HHC_2019_02_03_87.png', 'G1'],
       ['HHC_2019_02_03_88.png', 'G2'],
       ['HHC_2019_02_03_89.png', 'G3'],
       ['HHC_2019_02_03_90.png', 'G4'],
       ['HHC_2019_02_03_91.png', 'G5'],
       ['HHC_2019_02_03_92.png', 'G6'],
       ['HHC_2019_02_03_93.png', 'G7'],
       ['HHC_2019_02_03_94.png', 'G8'],
       ['HHC_2019_02_03_95.png', 'G9'],
       ['HHC_2019_02_03_96.png', 'G10'],
       ['HHC_2019_02_03_97.png', 'G11'],
       ['HHC_2019_02_03_98.png', 'G12'],
       ['HHC_2019_02_03_99.png', 'G13'],
       ['HHC_2019_02_03_100.png', 'G14'],
       ['HHC_2019_02_03_101.png', 'G15'],
       ['HHC_2019_02_03_102.png', 'G16'],
       ['HHC_2019_02_03_103.png', '1'],
       ['HHC_2019_02_03_104.png', '2'],
       ['HHC_2019_02_03_105.png', 3],
       ['HHC_2019_02_03_106.png', 4],
       ['HHC_2019_02_03_107.png', 5],
       ['HHC_2019_02_03_108.png', 6],
       ['HHC_2019_02_03_109.png', 7],
       ['HHC_2019_02_03_110.png', 'R0'],
       ['HHC_2019_02_03_111.png', 'R1'],
       ['HHC_2019_02_03_112.png', 'R2'],
       ['HHC_2019_02_03_113.png', 'R3'],
       ['HHC_2019_02_03_114.png', 'R4'],
       ['HHC_2019_02_03_115.png', 'R5'],
       ['HHC_2019_02_03_116.png', 'R6'],
       ['HHC_2019_02_03_117.png', 'R7'],
       ['HHC_2019_02_03_118.png', 'R8'],
       ['HHC_2019_02_03_119.png', 'R9'],
       ['HHC_2019_02_03_120.png', 'R10'],
       ['HHC_2019_02_03_121.png', 'R11'],
       ['HHC_2019_02_03_122.png', 'R12'],
       ['HHC_2019_02_03_123.png', 'R13'],
       ['HHC_2019_02_03_124.png', 'R14'],
       ['HHC_2019_02_03_125.png', 'R15'],
       ['HHC_2019_02_03_126.png', 'R16'],
       ['HHC_2019_02_03_127.png', 'R17'],
       ['HHC_2019_02_03_128.png', 'R18'],
       ['HHC_2019_02_03_129.png', 'R19'],
       ['HHC_2019_02_03_130.png', 'R20'],
       ['HHC_2019_02_03_131.png', 'R21'],
       ['HHC_2019_02_03_132.png', 'R22'],
       ['HHC_2019_02_03_133.png', 'U1'],
       ['HHC_2019_02_03_134.png', 'U2'],
       ['HHC_2019_02_03_135.png', 'U3'],
       ['HHC_2019_02_03_136.png', 'U4'],
       ['HHC_2019_02_03_137.png', 'U5'],
       ['HHC_2019_02_03_138.png', 'U5'],
       ['HHC_2019_02_03_139.png', 'U6'],
       ['HHC_2019_02_03_140.png', 'U7']],
      dtype=object),  # keep np.nan as a real NaN; without this NumPy stores the string 'nan'
    columns=['col1', 'col2'])

1 Answer:

Answer 0 (score: 0)

Since you have multiple conditions, I would use np.select.

# Show dataframe used.
print(df.head())
                   col1 col2
0  HHC_2019_02_03_0.png   A1
1  HHC_2019_02_03_1.png   A2
2  HHC_2019_02_03_2.png   A3
3  HHC_2019_02_03_3.png   A4
4  HHC_2019_02_03_4.png   A5

Apply np.select:

conditions = [df['col1'] == 'HHC_2019_02_03_53.png', 
              df['col1'] == 'HHC_2019_02_03_54.png']
choices = ['B9', 'B10']

df['col2'] = np.select(conditions, choices, default=df['col2'])

print(df[(df['col1'] == 'HHC_2019_02_03_53.png') | (df['col1'] == 'HHC_2019_02_03_54.png')])
                     col1 col2
53  HHC_2019_02_03_53.png   B9
54  HHC_2019_02_03_54.png  B10
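
The np.select approach above hard-codes each filename and its replacement. For the more general question of continuing the sequence from the previous value, one possible sketch is below. It is not part of the original answer: the helper name fill_sequential_labels is hypothetical, and it assumes every label looks like a text prefix followed by an integer (B8, EE6, 7) and that missing entries are real NaN values rather than the string 'nan'. It only looks at the previous value, so it does not check that the filled labels agree with the next known one.

```python
import re

import numpy as np
import pandas as pd


def fill_sequential_labels(labels: pd.Series) -> pd.Series:
    """Fill NaN gaps by continuing the previous '<prefix><number>' label."""
    pattern = re.compile(r"^(\D*)(\d+)$")  # e.g. 'B8' -> ('B', '8')
    filled = []
    previous = None  # last label seen (or generated), as a string
    for value in labels:
        if pd.isna(value):
            match = pattern.match(previous) if previous is not None else None
            if match is None:
                # Nothing usable to continue from; leave the value missing.
                filled.append(value)
                continue
            prefix, number = match.groups()
            value = f"{prefix}{int(number) + 1}"
        previous = str(value)
        filled.append(value)
    return pd.Series(filled, index=labels.index, name=labels.name)


demo = pd.Series(["B7", "B8", np.nan, np.nan, "C1"], name="col2")
print(fill_sequential_labels(demo).tolist())
# ['B7', 'B8', 'B9', 'B10', 'C1']
```

On the question's data this would be applied as df['col2'] = fill_sequential_labels(df['col2']). A plain loop is used here because each filled value depends on the one before it, which keeps the logic easy to follow at the cost of speed on very large frames.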