Question

What is the process for removing citations from text?

For example, a text file might contain the following variants:

1.Smith & Smith 2016 stressed "Johnny Johnny Yes Papa" p(5).

Expected output: stressed.

2. Smith, Smith & Smith 2016

Expected output: none, the whole string is deleted.

3. Smith et al. 2015

Expected output: none, the whole string is deleted.

4. [18]

Expected output: none, the whole string is deleted.

5  (Smith & Smith 2016: 326)

Expected output: none, the whole string is deleted.

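Simply wiping the citations out would be fine, because they skew the results slightly when you run a frequency analysis.

Thank you very much for your input.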

Answer 1

>>> import re
>>> a = """1.Smith & Smith 2016 stressed "Johnny Johnny Yes Papa" p(5). 2. Smith, Smith & Smith 2016 3. Smith et al. 2015 4. [18] 5 (Smith & Smith 2016: 326)"""
>>> re.sub(r'"[^"]*"','***', a)
'1.Smith & Smith 2016 stressed *** p(5). 2. Smith, Smith & Smith 2016 3. Smith et al. 2015 4. [18] 5 (Smith & Smith 2016: 326)'

It also works when the string contains more than one quoted passage:

>>> b = """Hallo "H.urz" Welt "hurz" Wie geht's?"""
>>> re.sub(r'"[^"]*"','***', b)
"Hallo *** Welt *** Wie geht's?"

But maybe I misunderstood the question. Perhaps you could specify your requirements in more detail in the question.
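If the goal is instead to drop the citations themselves, the five variants from the question could be handled roughly as follows. This is only a sketch under assumptions: the helper name `strip_citations` and the three patterns are made up for illustration, author names are assumed to be capitalized ASCII words, and years are assumed to fall in 1900-2099. It leaves the quoted passage and the `p(5)` marker alone, and it will also hit false positives such as "May 2016", so check it against your real data.

```python
import re

# Hypothetical building blocks: a capitalized surname and a four-digit year.
NAME = r"[A-Z][A-Za-z'\-]+"
YEAR = r"(?:19|20)\d{2}[a-z]?"

# (Smith & Smith 2016: 326) -> any parenthesized group that contains a year.
PAREN = re.compile(rf"\([^()]*\b{YEAR}\b[^()]*\)")
# [18] or [3, 7] or [2-5] -> bracketed numeric references.
NUMERIC = re.compile(r"\[\d+(?:\s*[,-]\s*\d+)*\]")
# Smith & Smith 2016, Smith, Smith & Smith 2016, Smith et al. 2015
NARRATIVE = re.compile(
    rf"\b{NAME}(?:,\s+{NAME})*(?:\s+(?:&|and)\s+{NAME}|\s+et\s+al\.)?\s+{YEAR}\b"
)


def strip_citations(text):
    """Remove citation-like spans, then tidy the leftover whitespace."""
    # Parenthesized citations go first so they are removed as a whole.
    for pattern in (PAREN, NUMERIC, NARRATIVE):
        text = pattern.sub("", text)
    # Drop spaces left in front of punctuation, then collapse runs of spaces.
    text = re.sub(r"\s+([.,;:])", r"\1", text)
    return re.sub(r"\s{2,}", " ", text).strip()


samples = [
    'Smith & Smith 2016 stressed "Johnny Johnny Yes Papa" p(5).',
    "Smith, Smith & Smith 2016",
    "Smith et al. 2015",
    "[18]",
    "(Smith & Smith 2016: 326)",
]

for sample in samples:
    print(repr(strip_citations(sample)))
```

With these patterns the first sample is reduced to the text after the citation and the other four come back as empty strings. To also remove the quoted title, the `re.sub(r'"[^"]*"', ...)` call shown above can be applied to the result.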
