Change the contents of a pandas column based on another column

Time: 2020-07-08 21:17:55

Tags: python python-3.x pandas dataframe mapping

I have a pandas DataFrame that looks like the following:

Neighborhood      High School      ...
WOODLEY           LIBERTY
WOODLEY 
COUNTRY CLUB  
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      TUSCORORA
...

As you can see, some entries are blank or incorrect, so I am trying to fix them. I started by creating the following function.

def cleanHS(dat):
    if dat.Neighborhood == "WOODLEY":
        dat["High School"] == "LIBERTY"
    elif dat.Neighborhood == "COUNTRY CLUB":
        dat["High School"] == "HERITAGE"
    ...

    return dat

Then I call the function.

dirty["High School"] = dirty["High School"].map(cleanHS)

This is where I get an attribute error: AttributeError: 'str' object has no attribute 'Neighborhood'

How can I fix this?

2 answers:

Answer 0 (score: 0)

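There is no need for a loop here. You can create a dictionary of key-value pairs that maps each Neighborhood to its corrected High School value, then apply it to the Neighborhood column.
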
d = {"WOODLEY": "LIBERTY", "COUNTRY CLUB": "HERITAGE"}
dirty['High School'] = dirty['Neighborhood'].map(d)

Output

Neighborhood      High School
WOODLEY           LIBERTY
WOODLEY           LIBERTY
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      HERITAGE
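
For reference, here is a self-contained sketch of this approach. The sample frame is reconstructed from the question; the extra OTHER neighborhood, its SOMEWHERE school, and the fillna fallback are assumptions added for illustration. Series.map with a dictionary yields NaN for any key missing from the dictionary, so the fallback keeps the existing value for neighborhoods that have no entry.

```python
import pandas as pd

# Sample data modeled on the question. The "OTHER" row is a hypothetical
# neighborhood that has no entry in the mapping.
dirty = pd.DataFrame({
    "Neighborhood": ["WOODLEY", "WOODLEY", "COUNTRY CLUB", "COUNTRY CLUB", "OTHER"],
    "High School": ["LIBERTY", None, None, "TUSCORORA", "SOMEWHERE"],
})

d = {"WOODLEY": "LIBERTY", "COUNTRY CLUB": "HERITAGE"}

# Look up each neighborhood; neighborhoods missing from d come back as NaN.
mapped = dirty["Neighborhood"].map(d)

# Keep the original value wherever the mapping had no entry.
dirty["High School"] = mapped.fillna(dirty["High School"])

print(dirty)
```

Without the fillna step, rows whose neighborhood is not in the dictionary would end up with NaN in High School, which may or may not be what you want.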

Answer 1 (score: -1)

Here is the correct answer. Mapping with a dictionary is easy (as shown in the other answer).

cleanHS = {"WOODLEY": "LIBERTY", "COUNTRY CLUB": "HERITAGE", ...}

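However, to map between the two columns correctly, you have to include the Neighborhood column. Your code maps the values in "High School" to other values, but the column the mapping starts from should be "Neighborhood".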

dirty["High School"] = dirty["Neighborhood"].map(cleanHS)
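
The AttributeError in the question arises because Series.map calls the function once per element, so the function received each "High School" string rather than a whole row. If a function is preferred over a dictionary, a row-wise DataFrame.apply with axis=1 is one alternative. The sketch below assumes a small sample frame and a hypothetical helper named clean_hs, and it returns the corrected value instead of using the == comparisons from the question, which do not assign anything.

```python
import pandas as pd

# Sample data modeled on the question (an assumption for illustration).
dirty = pd.DataFrame({
    "Neighborhood": ["WOODLEY", "WOODLEY", "COUNTRY CLUB", "COUNTRY CLUB"],
    "High School": ["LIBERTY", None, None, "TUSCORORA"],
})

def clean_hs(row):
    # With axis=1, row is a Series holding one row of the frame,
    # so both columns are available here.
    if row["Neighborhood"] == "WOODLEY":
        return "LIBERTY"
    if row["Neighborhood"] == "COUNTRY CLUB":
        return "HERITAGE"
    # Leave the existing value alone for any other neighborhood.
    return row["High School"]

dirty["High School"] = dirty.apply(clean_hs, axis=1)

print(dirty)
```

Row-wise apply is generally slower than a dictionary map on large frames, so the dictionary form is usually the better choice when the rule is a plain lookup.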