Multiple rows

Time: 2016-01-17 19:19:46

Tags: python pandas

I have a CSV file that looks like this:

         Landform              Number         Name   Class
0        Deltaic Plain         912            Lx     NaN
1    Hummock and Swale         912            Lx     NaN
2           Sand Dunes         912            Lx     NaN
3    Hummock and Swale         939       Woodbury    NaN
4           Sand Dunes         939       Woodbury    NaN

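I would like to assign the value 1 to Class when a Name contains the specific Landform values Deltaic Plain, Hummock and Swale, and Sand Dunes.

I would like to assign the value 2 to Class when a Name contains the Landform values Hummock and Swale and Sand Dunes.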

My desired output is:

         Landform              Number         Name   Class
0        Deltaic Plain         912            Lx     1
1    Hummock and Swale         912            Lx     1
2           Sand Dunes         912            Lx     1
3    Hummock and Swale         939       Woodbury    2
4           Sand Dunes         939       Woodbury    2

I know how to do this for one row at a time:

def f(x):
    if x['Landform'] == 'Hummock and Swale':
        return '1'
    else:
        return '2'

df['Class'] = df.apply(f, axis=1)

But I'm not sure how to group the rows and then build a conditional function based on multiple rows.

1 answer:

Answer 0 (score: 1)

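The idea is to group by your Number column and apply a function that looks at all the landforms in each group and returns an appropriate class. Here is an example: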

def determineClass(landforms):
    if all(form in landforms.values for form in ('Deltaic Plain', 'Hummock and Swale', 'Sand Dunes')):
        return 1
    elif all(form in landforms.values for form in ('Hummock and Swale', 'Sand Dunes')):
        return 2
    # etc.
    else:
        # return "default" class
        return 0

>>> df.groupby('Number').Landform.apply(determineClass)
Number
912    1
939    2
Name: Landform, dtype: int64
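
For anyone who wants to run the grouping step end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question by hand and swaps the all(...) checks for a set-based subset test. The names RULES and determine_class are made up for this illustration and are not part of pandas; the rule order (most specific first) is an assumption carried over from the function above.

```python
import pandas as pd

# Sample data copied from the question (the Class column is added later).
df = pd.DataFrame({
    'Landform': ['Deltaic Plain', 'Hummock and Swale', 'Sand Dunes',
                 'Hummock and Swale', 'Sand Dunes'],
    'Number': [912, 912, 912, 939, 939],
    'Name': ['Lx', 'Lx', 'Lx', 'Woodbury', 'Woodbury'],
})

# Hypothetical rule table: rules are checked in order, so the most
# specific set of landforms has to come first.
RULES = [
    (1, {'Deltaic Plain', 'Hummock and Swale', 'Sand Dunes'}),
    (2, {'Hummock and Swale', 'Sand Dunes'}),
]

def determine_class(landforms):
    """Return the class of the first rule whose landforms all occur in the group."""
    present = set(landforms)
    for cls, required in RULES:
        if required <= present:  # subset test: every required landform is present
            return cls
    return 0  # assumed "default" class when no rule matches

# One class per Number group, indexed by Number.
classes = df.groupby('Number')['Landform'].apply(determine_class)
print(classes)
```

Keeping the rules in a list means a new class only needs a new entry, as long as more specific rules stay ahead of the ones they contain.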

If you want to assign these values back to the Class column, use map, as described in this question from 20 minutes ago:

>>> classes = df.groupby('Number').Landform.apply(determineClass)
>>> df['Class'] = df.Number.map(classes)
>>> df
            Landform  Number      Name  Class
0      Deltaic Plain     912        Lx      1
1  Hummock and Swale     912        Lx      1
2         Sand Dunes     912        Lx      1
3  Hummock and Swale     939  Woodbury      2
4         Sand Dunes     939  Woodbury      2
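
A possible alternative to the apply-then-map pair is groupby(...).transform, which broadcasts one value per group back onto the original rows in a single step. The sketch below is self-contained and assumes the same sample data; classify is a hypothetical stand-in for determineClass, not something taken from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Landform': ['Deltaic Plain', 'Hummock and Swale', 'Sand Dunes',
                 'Hummock and Swale', 'Sand Dunes'],
    'Number': [912, 912, 912, 939, 939],
    'Name': ['Lx', 'Lx', 'Lx', 'Woodbury', 'Woodbury'],
})

def classify(landforms):
    """Return one class for the whole group, based on which landforms it contains."""
    present = set(landforms)
    if {'Deltaic Plain', 'Hummock and Swale', 'Sand Dunes'} <= present:
        return 1
    if {'Hummock and Swale', 'Sand Dunes'} <= present:
        return 2
    return 0  # assumed default class

# transform calls classify once per Number group and repeats the scalar
# result for every row of that group, aligned to df's index.
df['Class'] = df.groupby('Number')['Landform'].transform(classify)
print(df)
```

Whether this reads better than map is a matter of taste; map keeps the per-group result around as its own Series, which can be handy if it is needed elsewhere.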