Question

I have a CSV file that looks like this:

         Landform              Number         Name   Class
0        Deltaic Plain         912            Lx     NaN
1    Hummock and Swale         912            Lx     NaN
2           Sand Dunes         912            Lx     NaN
3    Hummock and Swale         939       Woodbury    NaN
4           Sand Dunes         939       Woodbury    NaN

For a particular Name, when Landform contains Deltaic Plain, Hummock and Swale, and Sand Dunes, I want to assign a value of 1 to Class.

When Landform contains Hummock and Swale and Sand Dunes, I want to assign a value of 2 to Class.

My desired output is:

         Landform              Number         Name   Class
0        Deltaic Plain         912            Lx     1
1    Hummock and Swale         912            Lx     1
2           Sand Dunes         912            Lx     1
3    Hummock and Swale         939       Woodbury    2
4           Sand Dunes         939       Woodbury    2

I know how to do this for a single row:

def f(x):
    if x['Landform'] == 'Hummock and Swale':
        return '1'
    else:
        return '2'

df['Class'] = df.apply(f, axis=1)

But I'm not sure how to group the rows and then write a conditional function based on multiple rows.

Answer 1

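The idea is to group by your Number column and apply a function that looks at all the landforms in that group and returns an appropriate class. Here is an example: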

def determineClass(landforms):
    if all(form in landforms.values for form in ('Deltaic Plain', 'Hummock and Swale', 'Sand Dunes')):
        return 1
    elif all(form in landforms.values for form in ('Hummock and Swale', 'Sand Dunes')):
        return 2
    # etc.
    else:
        # return "default" class
        return 0

>>> df.groupby('Number').Landform.apply(determineClass)
Number
912    1
939    2
Name: Landform, dtype: int64
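
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same grouping idea. The DataFrame is rebuilt by hand from the sample in the question, and the RULES table and the classify name are my own illustrative choices, not part of the original answer. It uses set containment instead of the chained all(...) checks, with rules ordered from most to least specific so that the first match wins.

```python
import pandas as pd

# Sample data copied from the question (Class column omitted; it is what we compute).
df = pd.DataFrame({
    "Landform": ["Deltaic Plain", "Hummock and Swale", "Sand Dunes",
                 "Hummock and Swale", "Sand Dunes"],
    "Number": [912, 912, 912, 939, 939],
    "Name": ["Lx", "Lx", "Lx", "Woodbury", "Woodbury"],
})

# Hypothetical rule table: (required landforms, class label).
# Order matters: the more specific rule must come first.
RULES = [
    ({"Deltaic Plain", "Hummock and Swale", "Sand Dunes"}, 1),
    ({"Hummock and Swale", "Sand Dunes"}, 2),
]

def classify(landforms):
    """Return the label of the first rule whose landforms all occur in the group."""
    present = set(landforms)
    for required, label in RULES:
        if required <= present:  # subset test
            return label
    return 0  # default class when no rule matches

classes = df.groupby("Number")["Landform"].apply(classify)
print(classes)
```

Keeping the rules in a list means that adding another class is a one-line change rather than another elif branch.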

If you want to assign these values back to the Class column, use map, as described in this question from 20 minutes ago:

>>> classes = df.groupby('Number').Landform.apply(determineClass)
>>> df['Class'] = df.Number.map(classes)
>>> df
            Landform  Number      Name  Class
0      Deltaic Plain     912        Lx      1
1  Hummock and Swale     912        Lx      1
2         Sand Dunes     912        Lx      1
3  Hummock and Swale     939  Woodbury      2
4         Sand Dunes     939  Woodbury      2
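
As a possible alternative to the apply-then-map steps above, groupby(...).transform should broadcast a per-group scalar back to every row of that group, so the column can be filled in one call. This is only a sketch under the same assumptions as the sample data; determine_class is a renamed copy of the function above, and the DataFrame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "Landform": ["Deltaic Plain", "Hummock and Swale", "Sand Dunes",
                 "Hummock and Swale", "Sand Dunes"],
    "Number": [912, 912, 912, 939, 939],
    "Name": ["Lx", "Lx", "Lx", "Woodbury", "Woodbury"],
})

def determine_class(landforms):
    """Same logic as the answer's function: inspect all landforms in one group."""
    values = set(landforms)
    if {"Deltaic Plain", "Hummock and Swale", "Sand Dunes"} <= values:
        return 1
    if {"Hummock and Swale", "Sand Dunes"} <= values:
        return 2
    return 0  # default class

# transform calls the function once per group and repeats the returned
# scalar for every row in that group, so no separate map step is needed.
df["Class"] = df.groupby("Number")["Landform"].transform(determine_class)
print(df)
```

The map version is still handy when you also want the per-group result as its own Series.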
