Question

I'm doing some web scraping and want to remove part of a string.

PlayerDataHeadings = soup.select(".auflistung th")
PlayerDataItems = soup.select(".auflistung td")

PlayerData = pd.DataFrame(
    {'PlayerDataHeadings': PlayerDataHeadings,
     'PlayerDataItems': PlayerDataItems
    })

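The code above creates a DataFrame and works as expected. In the 'PlayerDataHeadings' column, every value has an unwanted <th> at the start and </th> at the end, and I want to remove both.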

The code I'm using is:

PlayerData['PlayerDataHeadings'].replace(
    to_replace['<th>', ':</th>'],
    value='',
    inplace=True
    )

This returns "NameError: name 'to_replace' is not defined" as an error.

Any ideas on how to fix this, or on a better alternative, would be great.

Answer 1

It looks like you're missing an = after to_replace:

to_replace=

Or omit the keyword and add regex=True:

PlayerData['PlayerDataHeadings'].replace(['<th>', ':</th>'], '', inplace=True, regex=True)
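The regex=True part matters because, without it, Series.replace only swaps cells whose entire value equals one of the listed strings and leaves substrings alone. A minimal sketch of the difference, using a made-up two-row Series (the names demo, whole_cell and substring are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical data: one cell with text between the tags, one cell that is only a tag.
demo = pd.Series(['<th>a:</th>', '<th>'])

# Without regex=True, only cells exactly equal to '<th>' or ':</th>' are replaced.
whole_cell = demo.replace(['<th>', ':</th>'], '')

# With regex=True, each pattern is replaced wherever it occurs inside a cell.
substring = demo.replace(['<th>', ':</th>'], '', regex=True)

print(whole_cell.tolist())
print(substring.tolist())
```

In the first result the cell '<th>a:</th>' is unchanged, which is why the replacement can appear to do nothing once the NameError is fixed.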

Sample:

PlayerData = pd.DataFrame({'PlayerDataHeadings':['<th>a:</th>','g']})
print (PlayerData)
  PlayerDataHeadings
0        <th>a:</th>
1                  g

PlayerData['PlayerDataHeadings'].replace(['<th>', ':</th>'], '', inplace=True, regex=True)
print (PlayerData)
  PlayerDataHeadings
0                  a
1                  g
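One caveat, stated as an assumption about your pandas version: calling a method with inplace=True on a column selection such as PlayerData['PlayerDataHeadings'] may emit a chained-assignment warning in recent releases, and under Copy-on-Write it may not update the parent frame. A sketch that assigns the result back instead, using the same sample data:

```python
import pandas as pd

PlayerData = pd.DataFrame({'PlayerDataHeadings': ['<th>a:</th>', 'g']})

# Assign the cleaned column back instead of relying on inplace=True.
PlayerData['PlayerDataHeadings'] = PlayerData['PlayerDataHeadings'].replace(
    ['<th>', ':</th>'], '', regex=True
)
print(PlayerData)
```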

Using all keywords:

PlayerData['PlayerDataHeadings'].replace(to_replace=['<th>', ':</th>'],
                                         value='', 
                                         inplace=True, 
                                         regex=True)
print (PlayerData)
  PlayerDataHeadings
0                  a
1                  g
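As for a better alternative: if the cells come from BeautifulSoup tags as in the question, it may be simpler to extract the text before building the frame, so there is no markup to strip afterwards. This is only a sketch; the HTML snippet and the names html, headings and items are invented for illustration, and the real page will differ.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the scraped page.
html = """
<table class="auflistung">
  <tr><th>Name:</th><td>Example Player</td></tr>
  <tr><th>Age:</th><td>25</td></tr>
</table>
"""
soup = BeautifulSoup(html, 'html.parser')

# get_text() returns only the text inside each tag; rstrip(':') drops the trailing colon.
headings = [th.get_text(strip=True).rstrip(':') for th in soup.select('.auflistung th')]
items = [td.get_text(strip=True) for td in soup.select('.auflistung td')]

PlayerData = pd.DataFrame({'PlayerDataHeadings': headings,
                           'PlayerDataItems': items})
print(PlayerData)
```

With this approach the column holds plain strings from the start, so no replace call is needed.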
