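Insert quotation marks into a string in Python using indices

Time: 2014-10-16 19:23:16

Tags: python string replace insert

I want to insert quotes ("") around the date and the text in a string (in the file input.txt). Here is my input file: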

created_at : October 9, article :   ISTANBUL — Turkey is playing a risky game of chicken in its negotiations with NATO partners who want it to join combat operations against the Islamic State group — and it’s blowing back with violence in Turkish cities. As the Islamic militants rampage through Kurdish-held Syrian territory on Turkey’s border, Turkey says it won’t join the fight unless the U.S.-led coalition also goes after the government of Syrian President Bashar Assad.
created_at : October 9, article :    President Obama chairs a special meeting of the U.N. Security Council last month. (Timothy A. Clary/AFP/Getty Images)  When it comes to President Obama’s domestic agenda and his maneuvers to (try to) get things done, I get it. I understand what he’s up to, what he’s trying to accomplish, his ultimate endgame. But when it comes to his foreign policy, I have to admit to sometimes thinking “whut?” and agreeing with my colleague Ed Rogers’s assessment on the spate of books criticizing Obama’s foreign policy stewardship.

I want to put quotes around the date and the text, as follows:

created_at : "October 9", article :   "ISTANBUL — Turkey is playing a risky game of chicken in its negotiations with NATO partners who want it to join combat operations against the Islamic State group — and it’s blowing back with violence in Turkish cities. As the Islamic militants rampage through Kurdish-held Syrian territory on Turkey’s border, Turkey says it won’t join the fight unless the U.S.-led coalition also goes after the government of Syrian President Bashar Assad".
created_at : "October 9", article :    "President Obama chairs a special meeting of the U.N. Security Council last month. (Timothy A. Clary/AFP/Getty Images)  When it comes to President Obama’s domestic agenda and his maneuvers to (try to) get things done, I get it. I understand what he’s up to, what he’s trying to accomplish, his ultimate endgame. But when it comes to his foreign policy, I have to admit to sometimes thinking “whut?” and agreeing with my colleague Ed Rogers’s assessment on the spate of books criticizing Obama’s foreign policy stewardship".

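Here is my code. It finds the index of the comma (the one after the date) and the index of article. Using these, I want to insert quotes around the date. I also want to insert quotes around the text, but how do I do that?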

with open("input.txt", "r") as f:
    for line in f:
        # index of the "article" key
        article_pos = line.find("article")
        print(article_pos)
        # index of the first comma (the one after the date)
        comma_pos = line.find(",")
        print(comma_pos)

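2 answers:

Answer 0 (score: 1)

While you can do this with low-level operations like find and slicing, that is not really the simple or idiomatic way to do it.

First, I'll show you how to do it your way: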

comma_pos = line.find(", ")
first_colon_pos = line.find(" : ")
second_colon_pos = line.find(" : ", comma_pos)
line = (line[:first_colon_pos+3] + 
        '"' + line[first_colon_pos+3:comma_pos] + '"' +
        line[comma_pos:second_colon_pos+3] +
        '"' + line[second_colon_pos+3:] + '"')
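
One detail worth watching with the slicing version: a line read from a file still ends in a newline, so the closing quote would land after it. The following is a minimal sketch of the same find-and-slice idea wrapped in a function. The name quote_by_index and the "return the line unchanged if a delimiter is missing" behaviour are assumptions made for illustration, not part of the original answer.

```python
def quote_by_index(line):
    """Quote the date and the article text using find() and slicing.

    quote_by_index is a hypothetical helper name used for illustration.
    """
    # Drop the trailing newline so the closing quote stays on the same line.
    line = line.rstrip("\n")
    comma_pos = line.find(", ")
    first_colon_pos = line.find(" : ")
    second_colon_pos = line.find(" : ", comma_pos)
    # find() returns -1 when the delimiter is missing; leave such lines alone.
    if -1 in (comma_pos, first_colon_pos, second_colon_pos):
        return line
    return (line[:first_colon_pos + 3]
            + '"' + line[first_colon_pos + 3:comma_pos] + '"'
            + line[comma_pos:second_colon_pos + 3]
            + '"' + line[second_colon_pos + 3:] + '"')


print(quote_by_index("created_at : October 9, article : Some text\n"))
```

The caller is responsible for adding the newline back when writing the result to a file.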

But it is easier to split the line into pieces, munge those pieces, and join them back together:

dateline, article = line.split(', ', 1)
key, value = dateline.split(' : ')
dateline = '{} : "{}"'.format(key, value)
key, value = article.split(' : ', 1)  # maxsplit=1: the article text may itself contain ' : '
article = '{} : "{}"'.format(key, value)
line = '{}, {}'.format(dateline, article)

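You can then refactor the repeated part into a simple function so that you don't have to write the same thing twice (which comes in handy if you later need to write it four times).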
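
That refactoring could look roughly like the sketch below. The function names quote_value and quote_line are hypothetical, and the maxsplit of 1 and the newline stripping are assumptions added here so that article text containing ", " or " : " does not break the split.

```python
def quote_value(field):
    """Turn 'key : value' into 'key : "value"'."""
    # maxsplit=1 keeps any later ' : ' inside the value intact.
    key, value = field.split(" : ", 1)
    return '{} : "{}"'.format(key, value)


def quote_line(line):
    """Quote both the date field and the article field of one line."""
    # Split only at the first ', ' so commas in the article text survive.
    dateline, article = line.rstrip("\n").split(", ", 1)
    return "{}, {}".format(quote_value(dateline), quote_value(article))


print(quote_line("created_at : October 9, article : Some text, with a comma\n"))
```

Both split calls still require exactly the separators ", " and " : ", so a line that deviates from that format raises ValueError.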

It is even easier with a regular expression, though that may be harder for a novice to understand:

import re

line = re.sub(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)', r'\1"\2"\3"\4"', line)

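This works by capturing everything up to the first : (and any whitespace after it) in one group, then everything from there up to the first comma in a second group, and so on: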

(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)


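Note that the regex has the advantage that I can say "any whitespace after it" very simply. With find or split, I had to specify explicitly that there is exactly one space on each side of the colon and exactly one space after the comma, because searching for "0 or more spaces" is much harder without some way to express it like \s*.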
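
To illustrate the whitespace point, here is a small sketch that runs the same pattern on a line with irregular spacing and then shows the split-based approach failing on it. The sample line and the name quote_with_regex are made up for this example.

```python
import re

# Same pattern as in the answer, compiled once for reuse.
PATTERN = re.compile(r'(.*?:\s*)(.*?)(\s*,.*?:\s*)(.*)')


def quote_with_regex(line):
    """Wrap the second and fourth captured groups in double quotes."""
    return PATTERN.sub(r'\1"\2"\3"\4"', line, count=1)


# Hypothetical input with inconsistent spacing around ':' and ','.
irregular = "created_at:October 9 ,article :   Some text"
print(quote_with_regex(irregular))

# The split-based version needs the exact separator ', ' and fails here.
try:
    dateline, article = irregular.split(", ", 1)
except ValueError:
    print("split(', ') did not find the separator")
```

Because . does not match a newline, the closing quote is placed before the trailing newline of a line read from a file.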

Answer 1 (score: 0)

You could also look at the regular expression library re. E.g.:

>>> import re
>>> print(re.sub(r'created_at:\s(.*), article:\s(.*)',
...              r'created_at: "\1", article: "\2"',
...              'created_at: October 9, article: ...'))
created_at: "October 9", article: "..."

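The first argument to re.sub is the pattern you want to match. The parentheses () capture the matched text, which can then be referenced in the second argument as \1 and \2. The third argument is the line of text.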
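
The lines in the question have a space before each colon ("created_at : ..."), so a pattern written with "created_at:" would not match them as is. The sketch below is one possible adaptation, not the answerer's code: the pattern, the name quote_fields, and the sample line are assumptions for illustration.

```python
import re

# Anchored on the literal keys; \s* tolerates the spaces around ':' and after ','.
LINE_RE = re.compile(r'created_at\s*:\s*(.*?),\s*article\s*:\s*(.*)')


def quote_fields(lines):
    """Yield each line with the date and the article text wrapped in quotes."""
    for line in lines:
        # '.' does not match '\n', so the trailing newline is left untouched.
        yield LINE_RE.sub(r'created_at : "\1", article : "\2"', line)


sample = ["created_at : October 9, article :   ISTANBUL text\n"]
for out in quote_fields(sample):
    print(out, end="")
```

Since an open file iterates line by line, the file object for input.txt can be passed directly to quote_fields. Note that this version normalizes the spacing around the colons instead of preserving it, and lines that do not match the pattern pass through unchanged.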