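I wrote a function to remove hashtags from the tweets in a Twitter dataset. I am trying to run it over the tweets in a pandas DataFrame using map, but I keep getting this error: "TypeError: expected string or bytes-like object"
I have searched for the error message and looked at a lot of similar questions here, but nothing has worked so far. Do I need to convert the tweet objects to a different type?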
import re

def remove_hashtags(tweet):
    no_hashtags = []
    if len(re.findall("(#[^#\s]+)", tweet)) > 0:
        tweet = re.sub("(#[^#\s]+)", "", tweet)
    no_hashtags.append(tweet)
    return no_hashtags[0]

text_all_removed = text_no_links.map(remove_hashtags)
TypeError                                 Traceback (most recent call last)
<ipython-input-143-84c94835f61a> in <module>
----> 1 text_all_removed = text_no_links.map(remove_hashtags)

~/venv/lib/python3.6/site-packages/pandas/core/series.py in map(self, arg, na_action)
   3380         """
   3381         new_values = super(Series, self)._map_values(
-> 3382             arg, na_action=na_action)
   3383         return self._constructor(new_values,
   3384                                  index=self.index).__finalize__(self)

~/venv/lib/python3.6/site-packages/pandas/core/base.py in _map_values(self, mapper, na_action)
   1216
   1217         # mapper is a function
-> 1218         new_values = map_f(values, mapper)
   1219
   1220         return new_values

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

<ipython-input-142-5e6df146cb08> in remove_hashtags(tweet)
      3 def remove_hashtags(tweet):
      4     no_hashtags = []
----> 5     if len(re.findall("(#[^#\s]+)", tweet)) > 0:
      6         tweet = re.sub("(#[^#\s]+)", "", tweet)
      7     no_hashtags.append(tweet)

~/venv/lib64/python3.6/re.py in findall(pattern, string, flags)
    220
    221     Empty matches are included in the result."""
--> 222     return _compile(pattern, flags).findall(string)
    223
    224 def finditer(pattern, string, flags=0):

TypeError: expected string or bytes-like object
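
The last frame of the traceback shows the error coming from re.findall itself, which raises this TypeError whenever its second argument is neither str nor bytes. A quick way to check what an object-dtype Series really holds is to count the Python types of its elements. The sketch below uses a small made-up Series named sample as a stand-in for text_no_links (an assumption, since the real data is not shown in full):

```python
import re
import pandas as pd

# Hypothetical stand-in for text_no_links: each cell holds a list, not a str.
sample = pd.Series([[" #bbcqt some text "], ["@user more text"]], name="text")

# Count the Python types stored in the object-dtype column.
type_counts = sample.map(lambda value: type(value).__name__).value_counts()
print(type_counts)

# Passing a non-string element to re.findall reproduces the TypeError.
error_message = None
try:
    re.findall(r"(#[^#\s]+)", sample.iloc[0])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If the count shows anything other than str (for example list or float for NaN), those elements are what re is rejecting.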
Output of text_no_links.head():
0 [ #bbcqt Remoaners on about post Brexit racial...
1 [@sarahwollaston Shut up, you like all remoane...
2 [ what have the Brextremists ever done for us ...
3 [ Remoaner in bizarre outburst ]
4 [ Anyone who disagrees with brexit is called n...
Name: text, dtype: object
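
The square brackets in the head() output suggest that each cell may hold a list containing one string rather than a plain string, which would explain why re refuses it. A minimal sketch of one possible fix, assuming that reading is correct: join the list into a single string before applying the regex. The sample Series and the HASHTAG constant are hypothetical names introduced for illustration.

```python
import re
import pandas as pd

# Same hashtag pattern as in the question, written as a raw string.
HASHTAG = re.compile(r"#[^#\s]+")

def remove_hashtags(tweet):
    """Strip hashtags from a tweet given as a str or a list of str fragments."""
    if isinstance(tweet, (list, tuple)):
        # Assumption: list cells contain text fragments of one tweet.
        tweet = " ".join(str(part) for part in tweet)
    if not isinstance(tweet, str):
        return tweet  # leave NaN / None untouched
    return HASHTAG.sub("", tweet)

# Hypothetical data shaped like the head() output above.
sample = pd.Series(
    [[" #bbcqt Remoaners on air "], ["@user Shut up"], [" plain text "]],
    name="text",
)
cleaned = sample.map(remove_hashtags)
print(cleaned)
```

If every cell is known to be a one-element list, text_no_links.str[0] would also unwrap the strings, after which the original function should work unchanged.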