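Hi, and thanks for looking at my question! I am trying to change one column of a DataFrame based on a condition on another column.
I have two DataFrames. The first, df_Ckt, is a lookup table that gives the year_value for a specific circuit and a specific year. It looks like this: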
df_Ckt.head(5)
Circuit Key 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028
0 CKT_4340_00865 9.256492 9.320154 9.658590 9.674177 9.674177 9.674177 9.674177 9.674177 9.674177 9.674177
1 CKT_14438_00891 1.078450 1.102765 1.227634 1.412518 1.723032 1.929562 2.140825 2.339290 2.555398 2.752190
2 CKT_37_01894 6.214399 6.372979 6.549099 6.822940 7.258766 7.554228 7.865580 8.155443 8.469345 8.737263
3 CKT_3543_03099 7.658913 7.759223 7.872652 7.889068 7.915327 7.930130 8.965180 8.981075 8.998183 9.013649
4 CKT_4380_03370 8.616798 8.633209 8.830170 9.123515 9.581061 9.885816 10.192292 10.476004 9.872779 10.153234
The other DataFrame, df, looks like this:
df.head(5)
circuit_key year calculated
0 CKT_5670_00020 2019 NA
1 CKT_5670_00020 2019 NA
2 CKT_5670_00020 2019 NA
3 CKT_5670_00020 2019 NA
4 CKT_5670_00020 2019 NA
The years in df range from 2019 to 2028. I added a column named calculated to hold the year_value from df_Ckt. It should look like this:
circuit_key year calculated
0 CKT_5670_00020 2019 8.241063
1 CKT_5670_00020 2019 8.241063
2 CKT_5670_00020 2019 8.241063
3 CKT_5670_00020 2019 8.241063
4 CKT_5670_00020 2019 8.241063
My code is as follows:
df["calculated"] = "NA"
for year in range(2019, 2029):
    year_value = df_Ckt.loc[df_Ckt['Circuit Key'] == "circuit", year].reset_index(drop=True)
    df.loc[np.logical_and(df.year == year, df.calculated == "NA"), ['calculated']] = year_value
    print(year, year_value)
The output is:
2019 0 8.241063
Name: 2019, dtype: float64
2020 0 8.252401
Name: 2020, dtype: float64
2021 0 8.309021
Name: 2021, dtype: float64
2022 0 8.403156
Name: 2022, dtype: float64
2023 0 8.55595
Name: 2023, dtype: float64
2024 0 8.656351
Name: 2024, dtype: float64
2025 0 8.759824
Name: 2025, dtype: float64
2026 0 8.856902
Name: 2026, dtype: float64
2027 0 8.940435
Name: 2027, dtype: float64
2028 0 9.008744
Name: 2028, dtype: float64
When I check the column I modified, every value is NaN. It seems that loc is unable to assign the value.
df['calculated']
...
96440 NaN
96441 NaN
96442 NaN
Name: calculated, Length: 96443, dtype: object
Then I tried assigning a constant to the column instead, as a test:
df["calculated"] = "NA"
for year in range(2019, 2029):
    year_value = df_Ckt.loc[df_Ckt['Circuit Key'] == "circuit", year].reset_index(drop=True)
    df.loc[np.logical_and(df.year == year, df.calculated == "NA"), ['calculated']] = 1
In this case the output looks correct:
0 1
1 1
2 1
..
Name: calculated1, Length: 96443, dtype: object
It seems that something about my year_value prevents it from being assigned to the DataFrame. Does anyone know how to make this work?
Answer 0 (score: 0)
The reason you get NaN is that year_value is a Series, not a single float value. When a Series is assigned through loc, pandas aligns it with the target rows by index label, so any row whose label is not in the index of year_value receives NaN. To assign the calculated value, extract the single value from the year_value Series and then assign that.
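The original code for this answer did not survive, so what follows is only a minimal sketch of the idea, not the answerer's exact code. It assumes the year columns of df_Ckt are integer labels (as the loop in the question implies) and uses small made-up frames. The variable circuit and the names long_ckt and merged are hypothetical. The first option keeps the loop and pulls a scalar out of the one-element Series with .iloc[0]. The second reshapes the lookup table with melt and joins it with merge, which avoids the loop altogether.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the frames in the question (values are made up).
df_Ckt = pd.DataFrame({
    "Circuit Key": ["CKT_A", "CKT_B"],
    2019: [8.0, 1.0],
    2020: [8.5, 1.5],
})
df = pd.DataFrame({
    "circuit_key": ["CKT_A", "CKT_A", "CKT_A", "CKT_B"],
    "year": [2019, 2019, 2020, 2020],
})

# Option 1: keep the loop, but assign a scalar instead of a Series.
circuit = "CKT_A"  # hypothetical: the circuit being looked up
df["calculated"] = np.nan
for year in range(2019, 2021):
    year_value = df_Ckt.loc[df_Ckt["Circuit Key"] == circuit, year]
    mask = (df["circuit_key"] == circuit) & (df["year"] == year)
    # .iloc[0] takes the first element, so no index alignment happens.
    df.loc[mask, "calculated"] = year_value.iloc[0]

# Option 2: reshape the lookup table to long form and merge on both keys.
long_ckt = df_Ckt.melt(id_vars="Circuit Key", var_name="year",
                       value_name="calculated")
long_ckt["year"] = long_ckt["year"].astype(int)
long_ckt = long_ckt.rename(columns={"Circuit Key": "circuit_key"})
merged = df[["circuit_key", "year"]].merge(
    long_ckt, on=["circuit_key", "year"], how="left"
)

print(df)
print(merged)
```

Note that .iloc[0] raises an IndexError when the circuit is missing from df_Ckt, so a check such as `if not year_value.empty` may be worth adding. With the merge approach, unmatched rows are simply left as NaN.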