Fill NAs in one column with elements from another column

Time: 2021-07-29 14:00:41

Tags: python pandas dataframe spark-koalas

This is my situation; my df looks like this:

    A   B   
0   0.0 2.0 
1   3.0 4.0 
2   NaN 1.0 
3   2.0 NaN 
4   NaN 1.0 
5   4.8 NaN 
6   NaN 1.0 

I want to apply this line of code: df['A'] = df['B'].fillna(df['A'])

I expect the following intermediate step and final output:

    A   B   
0   2.0 2.0 
1   4.0 4.0 
2   1.0 1.0 
3   NaN NaN 
4   1.0 1.0 
5   NaN NaN 
6   1.0 1.0 

    A   B   
0   2.0 2.0 
1   4.0 4.0 
2   1.0 1.0 
3   2.0 NaN 
4   1.0 1.0 
5   4.8 NaN 
6   1.0 1.0 
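
For comparison, in plain pandas (not Koalas) `fillna` accepts a Series and aligns it on the index, so the line above produces the final table directly. A minimal sketch, assuming pandas and NumPy are installed and rebuilding the sample data by hand:

```python
import numpy as np
import pandas as pd

# Sample data copied from the tables in the question.
df = pd.DataFrame({
    "A": [0.0, 3.0, np.nan, 2.0, np.nan, 4.8, np.nan],
    "B": [2.0, 4.0, 1.0, np.nan, 1.0, np.nan, 1.0],
})

# Take B where it is present; fall back to the value of A on the same row.
df["A"] = df["B"].fillna(df["A"])
print(df)
```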

But I get this error:

TypeError: Unsupported type Series

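Perhaps this is because, for every NA, it tries to fill with the whole Series instead of the single element that has the same index in column B.

I get the same error with the following syntax: df['C'] = df['B'].fillna(df['A']). So the problem does not seem to be that I first overwrite A with the values of B and then try to fill the NAs of B with values from a column that is technically identical to B.

I am in a Databricks environment and working with Koalas dataframes, but they work like pandas dataframes. Can you help me?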

2 answers:

Answer 0: (score: 0)

IIUC:

Try using max()

df['A'] = df[['A','B']].max(axis=1)

Output of df:

    A       B
0   2.0     2.0
1   4.0     4.0
2   1.0     1.0
3   2.0     NaN
4   1.0     1.0
5   4.8     NaN
6   1.0     1.0
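
Note that max() matches the expected output here only because B is never smaller than A on the rows where both are present. The sketch below, in plain pandas with made-up values (hypothetical, not from the question), shows where the two approaches differ. `Series.where` is used as an alternative that avoids passing a Series to `fillna`; whether it behaves the same on Koalas is an assumption that is not tested here.

```python
import numpy as np
import pandas as pd

# Hypothetical data: on row 0, A is larger than B.
df = pd.DataFrame({"A": [5.0, np.nan, 2.0], "B": [1.0, 3.0, np.nan]})

# Row-wise maximum, ignoring NaN: this picks A on row 0.
by_max = df[["A", "B"]].max(axis=1)

# Prefer B, and fall back to A only where B is missing.
by_where = df["B"].where(df["B"].notna(), df["A"])

print(by_max.tolist())    # [5.0, 3.0, 2.0]
print(by_where.tolist())  # [1.0, 3.0, 2.0]
```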

Answer 1: (score: 0)

Another option

Assume the following dataset:

Then:

import pandas as pd
import numpy as np

df = pd.DataFrame(data={'State':[1,2,3,4,5,6, 7, 8, 9, 10], 
                         'Sno Center': ["Guntur", "Nellore", "Visakhapatnam", "Biswanath", "Doom-Dooma", "Guntur", "Labac-Silchar", "Numaligarh", "Sibsagar", "Munger-Jamalpu"], 
                         'Mar-21': [121, 118.8, 131.6, 123.7, 127.8, 125.9, 114.2, 114.2, 117.7, 117.7],
                         'Apr-21': [121.1, 118.3, 131.5, np.nan, 128.2, 128.2, 115.4, 115.1, np.nan, 118.3]})

df
State   Sno Center      Mar-21  Apr-21
0   1   Guntur          121.0   121.1
1   2   Nellore         118.8   118.3
2   3   Visakhapatnam   131.6   131.5
3   4   Biswanath       123.7   NaN
4   5   Doom-Dooma      127.8   128.2
5   6   Guntur          125.9   128.2
6   7   Labac-Silchar   114.2   115.4
7   8   Numaligarh      114.2   115.1
8   9   Sibsagar        117.7   NaN
9   10  Munger-Jamalpu  117.7   118.3
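
A possible next step for this dataset, assuming the goal is to fill the missing `Apr-21` values from `Mar-21` on the same row, is pandas' `combine_first`. This is only a sketch in plain pandas on a shortened copy of the data, not code from the answer, and it is not tested on Koalas.

```python
import numpy as np
import pandas as pd

# Shortened copy of the dataset above (rows 0, 3 and 8).
df = pd.DataFrame({
    "Sno Center": ["Guntur", "Biswanath", "Sibsagar"],
    "Mar-21": [121.0, 123.7, 117.7],
    "Apr-21": [121.1, np.nan, np.nan],
})

# Keep Apr-21 where present; otherwise take Mar-21 from the same row.
df["Apr-21"] = df["Apr-21"].combine_first(df["Mar-21"])
print(df)
```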