Here is my situation. My df looks like this:
     A    B
0  0.0  2.0
1  3.0  4.0
2  NaN  1.0
3  2.0  NaN
4  NaN  1.0
5  4.8  NaN
6  NaN  1.0
I want to apply this line of code:
df['A'] = df['B'].fillna(df['A'])
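I expect this workflow. First, A takes the values of B:
     A    B
0  2.0  2.0
1  4.0  4.0
2  1.0  1.0
3  NaN  NaN
4  1.0  1.0
5  NaN  NaN
6  1.0  1.0
Then the remaining NaN values in A are filled from the original A, giving the final output:
     A    B
0  2.0  2.0
1  4.0  4.0
2  1.0  1.0
3  2.0  NaN
4  1.0  1.0
5  4.8  NaN
6  1.0  1.0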
But I get this error:
TypeError: Unsupported type Series
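Perhaps this is because, every time an NA appears, it tries to fill it with the entire Series rather than with the single element that has the same index as the row in column B.
I get the same error with the following syntax: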
df['C'] = df['B'].fillna(df['A'])
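So the problem does not seem to be that I first overwrite the values of A with those of B and then try to fill the NAs in B with values from a column that is technically identical to B.
I am in a Databricks environment and I am working with Koalas DataFrames, which as far as I can tell behave like pandas DataFrames. Can you help me?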
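For reference, here is a minimal sketch of how the line in question behaves in plain pandas (not Koalas), using the data from the question. In pandas, `fillna` with a Series aligns on the index, so each NaN in B is filled with the element of A at the same position, not with the whole Series. The error message suggests that Koalas' `fillna` does not accept a Series as the fill value. The `where` variant shown below is only a candidate workaround; it is an assumption that it behaves the same way on Koalas, and that should be checked in the Databricks environment.

```python
import numpy as np
import pandas as pd

# Data from the question
df = pd.DataFrame({
    "A": [0.0, 3.0, np.nan, 2.0, np.nan, 4.8, np.nan],
    "B": [2.0, 4.0, 1.0, np.nan, 1.0, np.nan, 1.0],
})

# Plain pandas: fillna with a Series fills each NaN in B
# with the value of A at the same index label.
filled = df["B"].fillna(df["A"])

# Equivalent without passing a Series to fillna:
# keep B where it is present, otherwise take A.
alt = df["B"].where(df["B"].notna(), df["A"])

df["A"] = filled
print(df)
```

Both `filled` and `alt` hold the values 2.0, 4.0, 1.0, 2.0, 1.0, 4.8, 1.0, which matches the expected final output.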
Answer 0 (score: 0)
IIUC:
Try using max():
df['A']=df[['A','B']].max(axis=1)
Output of df:
     A    B
0  2.0  2.0
1  4.0  4.0
2  1.0  1.0
3  2.0  NaN
4  1.0  1.0
5  4.8  NaN
6  1.0  1.0
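One caveat worth illustrating: the row-wise `max()` reproduces "B, falling back to A" only when at most one of the two values is present in a row, or when B is the larger one. The sketch below uses small hypothetical data (not from the question) in which row 0 has both values and A is larger, and compares `max(axis=1)` with pandas' `combine_first`. It runs on plain pandas; whether the same calls behave identically on Koalas is an assumption to verify.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has both values present and A > B
df = pd.DataFrame({
    "A": [5.0, np.nan, 2.0],
    "B": [2.0, 1.0, np.nan],
})

# Row-wise maximum; NaN is skipped by default
via_max = df[["A", "B"]].max(axis=1)

# B where it is present, otherwise A
via_fill = df["B"].combine_first(df["A"])

print(via_max.tolist())   # [5.0, 1.0, 2.0]
print(via_fill.tolist())  # [2.0, 1.0, 2.0]
```

The two results differ in row 0, so `max()` is a safe substitute only if the data guarantees that B is never smaller than A when both are present, as happens to be the case in the question's sample.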
Answer 1 (score: 0)
Another option.
Assume the following dataset:
import pandas as pd
import numpy as np

# Sample dataset with two missing values in the 'Apr-21' column.
# np.nan is used instead of np.NaN, which was removed in NumPy 2.0.
df = pd.DataFrame(data={
    'State': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Sno Center': ["Guntur", "Nellore", "Visakhapatnam", "Biswanath", "Doom-Dooma", "Guntur", "Labac-Silchar", "Numaligarh", "Sibsagar", "Munger-Jamalpu"],
    'Mar-21': [121, 118.8, 131.6, 123.7, 127.8, 125.9, 114.2, 114.2, 117.7, 117.7],
    'Apr-21': [121.1, 118.3, 131.5, np.nan, 128.2, 128.2, 115.4, 115.1, np.nan, 118.3]})
df
   State      Sno Center  Mar-21  Apr-21
0      1          Guntur   121.0   121.1
1      2         Nellore   118.8   118.3
2      3   Visakhapatnam   131.6   131.5
3      4       Biswanath   123.7     NaN
4      5      Doom-Dooma   127.8   128.2
5      6          Guntur   125.9   128.2
6      7   Labac-Silchar   114.2   115.4
7      8      Numaligarh   114.2   115.1
8      9        Sibsagar   117.7     NaN
9     10  Munger-Jamalpu   117.7   118.3