Question

I'm very new to Python, so apologies in advance; I've been trying to solve this for a few hours.

I have a dataset like this, but much larger:

  A   B   C   D   
1 23  16  NaN 14
2 26  17  23  23
3 23  NaN 22  25
4 24  34  28  28

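I need to create another column (E) that holds the mean score of specific columns (e.g. B, C and D) for that particular row.

If there are any missing values (NaN) in that row, I need "missing data" to be shown in column E for that row, in place of the mean score.

I tried changing the NaN values to 0 (which worked) and then running code similar to the following (my code has become a mess, and I've lost track of where I started and what I've tried):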

composite = []
for df in column ["A","B","C"]:
    if value > 0:
        composite.append(df[:, ["A","B","C"]].mean(axis=1))
    else:
        composite.append("missing value(s)")

df["composite"] = composite
print(df)

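I know there are probably many errors in this code, but it's only a rough structure of what I'm trying to do.

I have also tried every approach I could find on Google, including other techniques such as .loc. I don't like asking for help, and I can usually find a solution on my own from previously posted questions, but in this case I haven't been able to get anything to work despite spending hours combing through Google.

Any help would be greatly appreciated. I have also been told that I must use a for loop. In addition, if this can be done without changing the NaN values to 0, that would be preferable.

Thank you for your input.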

Answer 1

I believe you could try something like this:

# Create column E, pre-filled with the "missing values" marker
df["E"] = "missing values"

for idx, row in df.iterrows():
    # float() does not raise on NaN, so check for missing values explicitly
    values = row[["B", "C", "D"]].astype(float)
    if values.isna().any():
        continue  # leave the "missing values" marker in place
    # Calculate the mean and store it in E; .loc selects by label
    # (.iloc only accepts integer positions, so .iloc[idx, "E"] would fail)
    df.loc[idx, "E"] = values.mean()
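
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, not a definitive implementation. It assumes the sample data from the question (with the index 1 to 4 guessed from the printed table), and it builds the new column from a list inside a for loop, much like the `composite` list in the question. The name `cols` is only illustrative. The NaN values are left untouched rather than replaced with 0.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the 1..4 index is an assumption
# based on how the table was printed.
df = pd.DataFrame(
    {
        "A": [23, 26, 23, 24],
        "B": [16, 17, np.nan, 34],
        "C": [np.nan, 23, 22, 28],
        "D": [14, 23, 25, 28],
    },
    index=[1, 2, 3, 4],
)

cols = ["B", "C", "D"]  # columns to average (illustrative name)
composite = []

for _, row in df.iterrows():
    values = row[cols]
    if values.isna().any():
        # At least one of the selected columns is NaN in this row
        composite.append("missing data")
    else:
        composite.append(values.mean())

df["E"] = composite
print(df)
```

Because E mixes strings and numbers, pandas stores it as an object column; if you later need to do arithmetic on E, keeping NaN there instead of a text marker may be easier to work with.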

Let me know if that helps! Cheers!

for loop creating a new column containing the mean of specific columns, and producing a "missing value" message where NaN
