Question

I have a text file with several thousand entries like this:

@INBOOK{Abu-Lughod1991,
  chapter = {Writing against culture},
  pages = {137-162},
  title = {Recapturing anthropology},
  publisher = {School of American Research Press},
  year = {1991},
  editor = {Richard Fox},
  author = {Abu-Lughod, Lila},
  address = {Santa Fe /NM},
  abstract = {Im Zusammenhang mit der Debatte um die writing culture fomuliert AL
        eine feministische Kritik und zeigt, wie von dort doch Anregungen
        für die Reflektion der Schreibweise und Repräsentation gekommen sind.*},
  crossref = {Rabinow1986},
  keywords = {Frauen; Feminismus; Erzählung als EG; Repräsentation; Roman; Schreibtechnik;
        James Clifford; writing culture; Dialog;},
  owner = {xko},
  systematik1 = {Anth\theor\Ethnographie},
  systematik2 = {Anth\theor\Text & Ges},
  timestamp = {1995-12-02}
}

I want to replace all semicolons in the keywords field with commas, but only in the keywords field; the other fields should be left untouched:

keywords = {Frauen, Feminismus, Erzählung als EG, Repräsentation, Roman, Schreibtechnik, James Clifford, writing culture, Dialog,},

I'm not a programmer. Maybe the following snippet is a good starting point; I would really appreciate it if someone could complete it.

outfile = open("literatur_comma.txt", "w") 
for line in open("literatur_semicolon.txt", "r"): 
    if line  # starts with "keywords" replace all semicolon with comma
        outfile.write(line) # write in new file
outfile.close()

Thanks a lot!

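Edit: Thanks for all the answers and code, that's awesome! I made a mistake in my thinking: if I use my code wrapper (with outfile), it creates a new file containing the keywords. How can I work on the same file and replace semicolons with commas only in the keywords line?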

Answer 1

Something like this works only when the keywords field fits on a single line.

if line.strip().startswith('keywords'):
    line = line.replace(';',',')
outfile.write(line)

It won't get the job done if the keywords span multiple lines in the actual text file.
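If the keywords field does wrap across lines, as in the sample entry in the question, one possible approach is to track the brace depth and keep replacing until the field closes. The following is only a sketch: fix_keywords is a made-up helper name, and it assumes the opening brace sits on the same line as "keywords" and that the field contains no escaped braces.

```python
def fix_keywords(lines):
    """Yield lines, replacing ';' with ',' only inside a keywords field."""
    in_keywords = False
    depth = 0  # brace nesting depth within the current keywords field
    for line in lines:
        if not in_keywords and line.strip().startswith('keywords'):
            in_keywords = True
            depth = 0
        if in_keywords:
            line = line.replace(';', ',')
            # Assumption: the field ends once its braces are balanced again.
            depth += line.count('{') - line.count('}')
            if depth <= 0:
                in_keywords = False
        yield line


sample = [
    "  abstract = {a; b},\n",
    "  keywords = {Frauen; Feminismus;\n",
    "        James Clifford; Dialog;},\n",
    "  owner = {x; y},\n",
]
result = list(fix_keywords(sample))
```

Because fix_keywords accepts any iterable of lines, the open input file can be passed in directly and the result handed to outfile.writelines(...).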

Answer 2

outfile = open("literatur_comma.txt", "w")
for line in open("literatur_semicolon.txt", "r"):
    # field lines are indented, so strip before testing the prefix
    if line.strip().startswith('keywords'):
        line = line.replace(';', ',')  # replace semicolons only in the keywords line
    outfile.write(line)  # write every line to the new file
outfile.close()
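For the follow-up in the question's edit (changing the same file instead of writing a new one), one option is the standard library's fileinput module with inplace=True. This is a sketch rather than a finished tool: replace_keywords_in_place is a made-up name, it only handles a keywords field that fits on one line, and it relies on the default text encoding.

```python
import fileinput


def replace_keywords_in_place(path):
    """Rewrite the file at path, changing ';' to ',' in keywords lines."""
    # inplace=True redirects print() output into the file being read;
    # backup='.bak' keeps the original content as path + '.bak'.
    with fileinput.input(files=[path], inplace=True, backup='.bak') as f:
        for line in f:
            if line.strip().startswith('keywords'):
                line = line.replace(';', ',')
            print(line, end='')  # the line already ends with a newline
```

Calling replace_keywords_in_place("literatur_semicolon.txt") would then modify that file directly and leave a .bak copy of the original next to it.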

Answer 3

Using pyparsing

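Note: this is one way to do it, but my brain isn't in parsing mode, so treat this as an idea rather than a proper answer... It certainly needs some work, but may be a step in the right direction...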

A somewhat messy example using pyparsing... (it could be better, with some @INBOOK and wotsit checking and parsing, but anyway...)

from pyparsing import *

keywords = originalTextFor(Keyword('keywords') + '=')
values = delimitedList(Regex('[^;}]+'), ';')
values.setParseAction(lambda L: ', '.join(L))

where text is your example:

>>> print(values.transformString(text))
@INBOOK{Abu-Lughod1991,
  chapter = {Writing against culture},
  pages = {137-162},
  title = {Recapturing anthropology},
  publisher = {School of American Research Press},
  year = {1991},
  editor = {Richard Fox},
  author = {Abu-Lughod, Lila},
  address = {Santa Fe /NM},
  abstract = {Im Zusammenhang mit der Debatte um die writing culture fomuliert AL
        eine feministische Kritik und zeigt, wie von dort doch Anregungen
        für die Reflektion der Schreibweise und Repräsentation gekommen sind.*},
  crossref = {Rabinow1986},
  keywords = {Frauen, Feminismus, Erzählung als EG, Repräsentation, Roman, Schreibtechnik, James Clifford, writing culture, Dialog;},
  owner = {xko},
  systematik1 = {Anth   heor\Ethnographie},
  systematik2 = {Anth   heor\Text & Ges},
  timestamp = {1995-12-02}
