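I have a pandas DataFrame whose index is a series of dates. I want to know how many rows I need to take from the end so that I am left with exactly X points after calling dropna(). I want the most recent points. For example: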
import pandas as pd

window = 504
s1 = pd.DataFrame(stuff)  # `stuff` is a placeholder for my data
len(s1.index)  # --> 600
dropped_series = s1.dropna()
len(dropped_series.index)  # --> 480
diff_points_count = len(s1.index) - len(dropped_series.index)
final_series = s1.tail(window + diff_points_count).dropna()
--> len(final_series.index) is not necessarily equal to window. It depends on where the NaNs are located.
I need this to work whether s1 is a pandas.Series or a pandas.DataFrame.
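To make the problem concrete, here is a small reproduction with made-up data (the values and the window of 3 are assumptions chosen only for illustration). When the NaNs sit outside the tail being taken, the approach above keeps more rows than the window:

```python
import numpy as np
import pandas as pd

window = 3
# Made-up data: both NaNs are at the start, outside the tail we end up taking.
s1 = pd.Series([np.nan, np.nan, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

diff_points_count = len(s1.index) - len(s1.dropna().index)  # 2 rows dropped overall
final_series = s1.tail(window + diff_points_count).dropna()

# The tail of 5 rows contains no NaN, so all 5 survive instead of 3.
print(len(final_series.index))
```

If the two NaNs were at the end of the series instead, the same code would return exactly 3 rows, which is why the result depends on the NaN positions.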
Answer 0 (score: 0)
Here is my solution, though I am sure there is a more elegant way:
# Excerpt from the body of a function; harmonized_series_set, series_indices,
# window, output_length, args and harmonized_args are defined earlier in it.
all_series_df = pd.concat([harmonized_series_set[i] for i in series_indices], axis=1)
# Flag each row: 1 if it has no NaN in any column, 0 otherwise.
all_series_df['is_valid'] = all_series_df.apply(lambda x: 0 if np.any(np.isnan(x)) else 1, raw=True, axis=1)
valid_point_count = all_series_df['is_valid'].sum()
# On a valid row, count_valid is the number of valid rows from that row to the end.
all_series_df['count_valid'] = valid_point_count - all_series_df['is_valid'].cumsum() + 1
matching_row_array = all_series_df.loc[all_series_df['count_valid'] == (window + output_length - 1)]
# Fall back to the first row (keep everything) if no row matches.
matching_row_index = 0
if isinstance(matching_row_array, pd.DataFrame) and len(matching_row_array.index) > 0:
    matching_row_index = all_series_df.index.get_loc(matching_row_array.index[0])
tail_amount = len(all_series_df.index) - matching_row_index
# Trim only the series arguments; pass every other argument through unchanged.
for i, arg in enumerate(args):
    if i in series_indices:
        tailed_series = harmonized_series_set[i].tail(tail_amount)
        harmonized_args.append(tailed_series)
    else:
        harmonized_args.append(arg)
return tuple(harmonized_args)
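A shorter route might be to work with the positions of the valid rows directly. The following is only a sketch, not a tested replacement for the code above. The function name tail_with_n_valid is made up for illustration, and it assumes a row counts as valid only when none of its columns is NaN, which matches the default behaviour of dropna(). It accepts either a Series or a DataFrame:

```python
import numpy as np
import pandas as pd

def tail_with_n_valid(obj, n):
    """Return the shortest tail of obj that keeps n rows after dropna().

    Hypothetical helper. If obj has fewer than n valid rows, all of obj is returned.
    """
    if n <= 0:
        return obj.iloc[0:0]
    if isinstance(obj, pd.DataFrame):
        # A DataFrame row is valid only if every column is non-NaN.
        valid = obj.notna().all(axis=1).to_numpy()
    else:
        valid = obj.notna().to_numpy()
    positions = np.flatnonzero(valid)  # integer positions of the valid rows
    if len(positions) < n:
        return obj
    start = int(positions[-n])  # position of the n-th valid row counted from the end
    return obj.iloc[start:]

# Made-up example data.
s = pd.Series([1.0, np.nan, 2.0, 3.0, np.nan, 4.0, np.nan])
tailed = tail_with_n_valid(s, 3)
print(len(tailed), len(tailed.dropna()))
```

The length of the returned tail plays the role of tail_amount above, so it could be reused to trim several aligned series to the same number of rows.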