import numpy as np
import pandas as pd

df = pd.DataFrame({'From_To': ['LoNDon_paris', 'MAdrid_miLAN', 'londON_StockhOlm', 'Budapest_PaRis', 'Brussels_londOn'],
                   'FlightNumber': [10045, np.nan, 10065, np.nan, 10085],
                   'RecentDelays': [[23, 47], [], [24, 43, 87], [13], [67, 32]],
                   'Airline': ['KLM(!)', '<Air France> (12)', '(British Airways. )', '12. Air France', '"Swiss Air"']})
df
               Airline  FlightNumber           From_To  RecentDelays
0               KLM(!)       10045.0      LoNDon_paris      [23, 47]
1    <Air France> (12)           NaN      MAdrid_miLAN            []
2  (British Airways. )       10065.0  londON_StockhOlm  [24, 43, 87]
3       12. Air France           NaN    Budapest_PaRis          [13]
4          "Swiss Air"       10085.0   Brussels_londOn      [67, 32]
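Some values in the FlightNumber column are missing. The numbers are meant to increase by 10 with each row, so 10055 and 10075 need to be filled in. Fill in these missing numbers and make the column an integer column (instead of a float column).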
Answer 0: (score: 1)
This seems like a good use case for pd.Series.interpolate:
df['FlightNumber'] = df['FlightNumber'].interpolate().astype(int)
df
               Airline  FlightNumber           From_To  RecentDelays
0               KLM(!)         10045      LoNDon_paris      [23, 47]
1    <Air France> (12)         10055      MAdrid_miLAN            []
2  (British Airways. )         10065  londON_StockhOlm  [24, 43, 87]
3       12. Air France         10075    Budapest_PaRis          [13]
4          "Swiss Air"         10085   Brussels_londOn      [67, 32]
The default method is 'linear', which is exactly what is needed here as long as FlightNumber increases linearly.
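To make the linear assumption concrete, here is a minimal sketch on a standalone Series (the names `flights` and `leading_gap` are made up for illustration and are not part of the question). It also shows one caveat: with default settings, `interpolate()` leaves a NaN at the very start of the column unfilled, so `astype(int)` would fail on such data.

```python
import numpy as np
import pandas as pd

# Hypothetical standalone copy of the FlightNumber column from the question.
flights = pd.Series([10045, np.nan, 10065, np.nan, 10085])

# Linear interpolation places each gap midway between its two neighbours.
filled = flights.interpolate(method='linear').astype(int)
print(filled.tolist())

# Caveat: a leading NaN has no earlier value to interpolate from,
# so with the default settings it stays NaN.
leading_gap = pd.Series([np.nan, 10055, np.nan, 10075]).interpolate()
print(leading_gap.isna().tolist())
```

If the first flight number could be missing, it would have to be filled some other way before the cast to int.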
Answer 1: (score: 0)
Hope this works.
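# Walk every row after the first; len(df) covers all rows, whereas count() ignores NaNs.
for i in range(1, len(df)):
    if pd.isnull(df.loc[i, 'FlightNumber']):
        df.loc[i, 'FlightNumber'] = df.loc[i - 1, 'FlightNumber'] + 10
df['FlightNumber'] = df['FlightNumber'].astype(int)  # the question asks for an integer column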
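If the step really is a constant 10 and the first value is present, the same result can probably be reached without a loop. The sketch below is a different technique from the loop in this answer, not a rewrite of it: it computes the value each row should have from its position and uses that only where the original is missing. The names `flights`, `step`, and `expected` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['FlightNumber'].
flights = pd.Series([10045, np.nan, 10065, np.nan, 10085])
step = 10  # assumed constant increment, as stated in the question

# Value each row would have if the sequence started at the first entry.
expected = pd.Series(flights.iloc[0] + step * np.arange(len(flights)),
                     index=flights.index)

# Keep existing values, fill only the gaps, then cast to int.
rebuilt = flights.fillna(expected).astype(int)
print(rebuilt.tolist())
```

This relies on the first entry being present; otherwise `expected` would be all NaN and the cast would fail.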
Answer 2: (score: 0)
Try the following code:
df['FlightNumber'] = df['FlightNumber'].interpolate().astype(int)