Question

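I am parsing data row by row. How do I update a dataframe cell's value inside a loop (read a value, parse it, and write the result to another column)?

I tried the following code: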

data = pd.read_csv("MyNames.csv")

data["title"] = ""
i = 0
for row in data.iterrows():
    name = (HumanName(data.iat[i,1]))
    print(name)
    data.ix['title',i] = name["title"]
    i = i + 1
data.to_csv('out.csv')

I expect the following:

name = "Mr John Smith"
              | Title
Mr John Smith | Mr

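Thanks for any help!

Edit: I realize I may not need to iterate at all. It would be easier if I could call the function for every row in the column and dump the results into another column, like a SQL UPDATE statement. Thanks.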

Answer 1

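This assumes HumanName is a function (or any callable) that takes a string and returns the dictionary you want. I can't test this code from here, but you get the idea: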

data['title'] = data['name'].apply(lambda name: HumanName(name)['title'])

Edit: I used row[1] because your data.iat[i,1] may actually need the index to be 0 rather than 1; I'm not sure.
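
If it is unclear whether the names sit in column position 0 or 1, the column can be selected by position with iloc and passed to .apply in the same way. The following is a minimal sketch, not tested against the real data: parse_title is a hypothetical stand-in for HumanName(name)['title'], and the "id" and "name" columns are made up for illustration.

```python
import pandas as pd

def parse_title(full_name):
    # Hypothetical stand-in for HumanName(full_name)['title']:
    # the first word counts as a title only if it is in a small known set.
    known_titles = {"Mr", "Mrs", "Ms", "Dr"}
    parts = str(full_name).split()
    first = parts[0].rstrip(".") if parts else ""
    return first if first in known_titles else ""

# Assumed layout: names are in the column at position 1
data = pd.DataFrame({"id": [1, 2], "name": ["Mr John Smith", "Jane Doe"]})

# data.iat[i, 1] reads one cell from the column at position 1;
# data.iloc[:, 1] selects that whole column at once
by_position = data.iloc[:, 1].apply(parse_title)

# Selecting by label gives the same result when 'name' is that column
by_label = data["name"].apply(parse_title)

data["title"] = by_position
```

If the names turn out to be in the first column, data.iloc[:, 0] would be the selection to use instead.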

Answer 2

You can try .apply:

def name_parsing(name):
    """This function parses the name any way you want"""
    return HumanName(name)['title']

# with .apply, the function will be applied to every item in the column
# the return will be a series. In this case, the series will be attributed to 'title' column
data['title'] = data['name'].apply(name_parsing)
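
Because .apply returns a Series that carries the same index as the input column, the assignment lines up row by row even when the dataframe does not use the default 0..n-1 index. A small sketch of that behaviour, where first_word is a hypothetical stand-in for name_parsing and the index values are arbitrary:

```python
import pandas as pd

def first_word(name):
    # Hypothetical stand-in for name_parsing: returns the first word of the name
    return name.split()[0]

# Deliberately non-default index labels
data = pd.DataFrame({"name": ["Mr John Smith", "Dr Ann Lee"]}, index=[10, 20])

# The result is a Series with the same index labels as data["name"]
result = data["name"].apply(first_word)

# Assignment aligns on the index, so each title lands on its own row
data["title"] = result
```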

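Also, as we discussed below, another option is to keep the HumanName instances in the dataframe, so that if you need other information from them later you don't have to instantiate and parse the names again (string operations can be very slow on large dataframes).
If so, part of the solution is to create a new column, and then get the ['title'] item from it: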

# this line creates a HumanName instance column
data['HumanName'] = data['name'].apply(lambda x: HumanName(x))
# this line gets the 'title' from the HumanName object and assigns it to a 'title' column
data['title'] = data['HumanName'].apply(lambda x: x['title'])
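
The benefit of keeping the parsed objects shows up when more than one field is needed: each name is parsed once and every field is read from the stored result. A self-contained sketch of the pattern, assuming a hypothetical parse_name function that returns a plain dict in place of a HumanName instance (the 'first' and 'last' keys are illustrative, not taken from the question):

```python
import pandas as pd

def parse_name(full_name):
    # Hypothetical stand-in for HumanName: splits a name into a dict of parts
    known_titles = {"Mr", "Mrs", "Ms", "Dr"}
    parts = full_name.split()
    title = parts[0] if parts and parts[0] in known_titles else ""
    rest = parts[1:] if title else parts
    return {
        "title": title,
        "first": rest[0] if rest else "",
        "last": rest[-1] if len(rest) > 1 else "",
    }

data = pd.DataFrame({"name": ["Mr John Smith", "Jane Doe"]})

# Parse each name once and keep the parsed object in its own column
data["parsed"] = data["name"].apply(parse_name)

# Pull several fields out of the stored objects without parsing again
for field in ("title", "first", "last"):
    data[field] = data["parsed"].apply(lambda parsed, key=field: parsed[key])
```

The helper column can be dropped with data.drop(columns="parsed") before writing the result to CSV.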

Updating cell values in a dataframe

2 answers: