Problem with pandas replace

Time: 2021-07-18 19:25:09

Tags: pandas dataframe replace

I can't replace the special characters used in Portuguese (such as é) with their percent-encoded UTF-8 codes in a column. For example:

Before: https://www.linkedin.com/in/andré-pieroni-mesa..
After: https://www.linkedin.com/in/andr%c3%a9-pieroni-mesa..

import numpy as np
import pandas as pd
import json

dfgetprospect = pd.read_excel(r'C:\Users\PICHAU\Desktop\Cargo Sapiens\Inteligencia Comercial\Upload empresas\Get Prospect.xlsx')
df = pd.read_csv(r'C:\Users\PICHAU\Desktop\Curso Python\Encoding links.csv', delimiter=';')
df = df[['Character', 'UTF-8']]
df.set_index(keys=['Character'], inplace=True)

lista = df.to_dict()
lista = lista['UTF-8']
lista
#lista = json.dumps(lista)
#lista=str(lista).replace("{","").replace("}","")

dfgetprospect['Linkedin Url'].str.replace({lista})

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-41-fd99a3f16c6e> in <module>
----> 1 dfgetprospect['Linkedin Url'].str.replace({lista})

TypeError: unhashable type: 'dict'

Example from lista: {'space': '%20', '!': '%21', '"': '%22', '#': '%23', '$': '%24', '%': '%25', '&': '%26', "'": '%27', '(': '%28', ')': '%29', '*': '%2a',

Example from df: û %c3%bb ü %c3%bc ý %c3%bd þ %c3%be ÿ %c3%bf
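For context on the error: `{lista}` is a set literal wrapping the dict, and a dict cannot be a set element, so Python raises the TypeError before pandas is called at all. If a lookup table is still preferred, one possible route is `Series.str.translate`. The following is only a sketch under assumptions: `mapping` and `urls` are hypothetical stand-ins for the CSV table and the 'Linkedin Url' column, and every key must be a single character (a key such as 'space' would have to become ' ').

```python
import pandas as pd

# Hypothetical single-character mapping, standing in for the CSV lookup table.
mapping = {'é': '%c3%a9', 'û': '%c3%bb', ' ': '%20'}

# Reproduce the cause of the error: a dict cannot be placed inside a set.
raised = False
try:
    {mapping}
except TypeError:
    raised = True

# Hypothetical sample data in place of dfgetprospect['Linkedin Url'].
urls = pd.Series(['https://example.org/in/andré-pieroni',
                  'https://example.org/in/a û'])

# str.maketrans builds a translation table from the single-character keys;
# Series.str.translate applies it to every element in one pass.
encoded = urls.str.translate(str.maketrans(mapping))
```

Because the translation is a single pass over each character, a '%' introduced by one replacement is not encoded again by another entry in the table.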

1 answer:

Answer 0 (score: 0)

You can use the quote function from the urllib.parse module.

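import urllib.parse  # a bare "import urllib" does not reliably load the parse submodule
import pandas as pd

df = pd.DataFrame({'urls': ['http://example.org/andré-pieroni-mesa',
                            'http://example.org/!"#$%&\'()*ûüýþÿ']})

# Percent-encode each URL; by default only '/' is left unescaped.
df['urls_quoted'] = df['urls'].apply(urllib.parse.quote)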

Input:

                                    urls
0  http://example.org/andré-pieroni-mesa
1     http://example.org/!"#$%&'()*ûüýþÿ

Output:

                                    urls                                                                        urls_quoted
0  http://example.org/andré-pieroni-mesa                                       http%3A//example.org/andr%C3%A9-pieroni-mesa
1     http://example.org/!"#$%&'()*ûüýþÿ  http%3A//example.org/%21%22%23%24%25%26%27%28%29%2A%C3%BB%C3%BC%C3%BD%C3%BE%C3%BF
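Note that in the output above the ':' after the scheme is also encoded (http%3A//) and the hex digits are uppercase, while the question asks for lowercase codes such as %c3%a9. A minimal sketch of one way to adjust this, assuming the URLs are not already percent-encoded: pass `safe=':/'` to `quote` and lowercase the escapes afterwards. The helper name `quote_url` is hypothetical.

```python
import re
import urllib.parse

import pandas as pd


def quote_url(url: str) -> str:
    """Hypothetical helper: percent-encode a URL, keeping ':' and '/' intact."""
    quoted = urllib.parse.quote(url, safe=':/')
    # quote() emits uppercase hex digits; lowercase each %XX escape.
    return re.sub(r'%[0-9A-F]{2}', lambda m: m.group(0).lower(), quoted)


df = pd.DataFrame({'urls': ['http://example.org/andré-pieroni-mesa']})
df['urls_quoted'] = df['urls'].apply(quote_url)
```

Percent escapes are case-insensitive, so the lowercasing step only matters if the result has to match an existing lowercase table exactly.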