Question

I'm trying to run a Python script that parses some text and saves every word that starts with "$".

The code I'm running is:

for post in posts:
   ticker = re.findall(r'\b[$]\w+', post.title)
   tickers.append([ticker])

When I run this script, the output is:

[[[]], [[]], [[]], [[]], [[]], [[]], [[]], [[]], [[]]]

I assume there is just a mistake in my regular expression, but I can't seem to find a solution anywhere.

Answer 1

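Take a look at the documentation for re. "$" is a special character in regular expressions: it anchors the match at the end of the string, so you have to escape it (or put it in a character class) to match a literal dollar sign.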

This is probably what you are looking for:

ticker = [re.findall(r'\$\w+', post.title) for post in posts]
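A side note on why the pattern in the question returned nothing: inside a character class, [$] already matches a literal dollar sign, so the likely culprit is the leading \b. A word boundary only exists between a word character and a non-word character, and "$" is a non-word character, so \b[$] can only match when the "$" directly follows a letter, digit, or underscore. A minimal sketch, using a made-up title string rather than real post data:

```python
import re

# Hypothetical post title, used only for illustration.
title = "Buying $AAPL and $TSLA today, not abc$XYZ"

# \b before "$" needs a word character immediately before the "$",
# so only the "$XYZ" glued to "abc" can match.
with_boundary = re.findall(r'\b[$]\w+', title)

# Without \b, every "$" followed by word characters matches.
without_boundary = re.findall(r'\$\w+', title)

print(with_boundary)
print(without_boundary)
```

Dropping the \b, as the line above does, is what makes tickers preceded by a space or at the start of the title match.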

If you want to exclude the empty lists:

# bool([]) == False
# bool([...]) == True
# The generator expression needs its own parentheses because it is not the only argument.
ticker = list(filter(bool, (re.findall(r'\$\w+', post.title) for post in posts)))
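If a single flat list of tickers is more convenient than a list of lists, a nested comprehension is one possible approach. The sketch below is self-contained. The Post namedtuple and the sample titles are hypothetical stand-ins for whatever objects posts actually holds, and the only assumption is that each one has a title attribute.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for the real post objects; only .title is assumed.
Post = namedtuple("Post", ["title"])
posts = [
    Post("Why $GME is moving"),
    Post("No tickers here"),
    Post("$AAPL vs $MSFT earnings"),
]

# Compile once, since the same pattern is applied to every title.
pattern = re.compile(r'\$\w+')

# One flat list: posts without a match simply contribute nothing.
tickers = [t for post in posts for t in pattern.findall(post.title)]

print(tickers)
```

With this shape there are no empty lists to filter out, at the cost of losing track of which post each ticker came from.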