I have a pandas Series (terms3, containing thousands of similar records) from which I need to remove backslashes:
terms3[1] = 'blue-eyed soul\' \'pop rock\' \'blues-rock\' \'beach music\' \'soft rock\' \'soul\' \'classic rock\' \'oldies\' \'power pop\' \'psychedelic rock\' \'rock\' \'sunshine pop\' \'blues\' \'singer-songwriter\' \'pop\' \'united states\' \'male vocalist\' "rock \'n roll" \'60s\' \'am pop\' \'r&b\' \'american\' \'male\' \'psychedelic\' \'classic\' \'vocal\' \'americana\' \'game music\' \'mod\' \'trippy\' \'french\' \'germany\' \'canada\' \'70s\' \'belgium\' \'cover\' \'nederland\' \'confident'
If I run type(terms3[1]), I get str.
This code works:
import re
regex = r"\\"
test_str = "'blue-eyed soul\\' \\'pop rock\\' \\'blues-rock\\' \\'beach music\\' \\'soft rock\\' \\'soul\\' \\'classic rock\\' \\'oldies\\' \\'power pop\\' \\'psychedelic rock\\' \\'rock\\' \\'sunshine pop\\' \\'blues\\' \\'singer-songwriter\\' \\'pop\\' \\'united states\\' \\'male vocalist\\' \"rock \\'n roll\" \\'60s\\' \\'am pop\\' \\'r&b\\' \\'american\\' \\'male\\' \\'psychedelic\\' \\'classic\\' \\'vocal\\' \\'americana\\' \\'game music\\' \\'mod\\' \\'trippy\\' \\'french\\' \\'germany\\' \\'canada\\' \\'70s\\' \\'belgium\\' \\'cover\\' \\'nederland\\' \\'confident'"
#test_str = terms3[1]
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
However, if I uncomment the line #test_str = terms3[1]
and run the code, it returns nothing, even though test_str is meant to be a copy of terms3[1].
Is there any way to fix this?
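One possible explanation, sketched here under an assumption I have not verified against the real data: the backslashes shown for terms3[1] may come only from how Python displays the string. When a string contains both single and double quotes, repr() escapes the single quotes, so the displayed form shows backslashes that are not in the stored text. The variable `sample` below is a hypothetical, shortened stand-in for terms3[1], not the actual record.

```python
import re

# Hypothetical, shortened stand-in for terms3[1] (an assumption, not the real
# record): it contains single and double quotes but no backslash characters.
sample = "'pop rock' \"rock 'n roll\" 'soul'"

# repr() escapes the single quotes because the string holds both quote kinds,
# so the displayed form contains backslashes that the data itself lacks.
shown = repr(sample)
print(shown)

# Count backslashes in the stored string versus in its displayed form.
real_backslashes = len(re.findall(r"\\", sample))
shown_backslashes = len(re.findall(r"\\", shown))
print(real_backslashes, shown_backslashes)
```

If the same thing holds for the real Series, print(terms3[1]) would show the text without any backslashes, and there would be nothing for the regex to match.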