Question

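I am trying to use a Python script to open and parse a JSON file, then write its contents to another JSON file after formatting them as needed. My source JSON file contains the \" character, which I want to replace with an empty string. I have no problem parsing the file or creating the new file; the only problem is that the character is not replaced. How can I do this? I have done the same task before, but the documents did not contain such characters then.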

Here is my code:

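import json

doubleQuote = "\""
cnt = 0        # index of the current item in data
my_news = ""   # accumulates the body text of the current item

try:

    destination = open("TodaysHtScrapedItemsOutput.json","w") # open JSON file for output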
except IOError:
    pass

with open('TodaysHtScrapedItems.json') as f: #load json file
    data = json.load(f)
print "file successfully loaded"
for dataobj in data:
    for news in data[cnt]["body"]:
        news = news.encode("utf-8")
        if(news.find(doubleQuote) != -1): # if doublequotes found in first body tag
        #   print "found double quote"
            news.replace(doubleQuote,"")
        if(news !=""):
            my_news = my_news +" "+ news

    destination.write("{\"body\":"+ "\""+my_news+"\"}"+"\n")
    my_news = ""
    cnt= cnt + 1

Here is how the file looks; the quotes near the text marked in red should disappear.

Answer 1

A few things to try:

You should open the JSON files in binary mode, so "w" becomes "wb" for writing, and you need to add "rb" for reading.
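The binary-mode advice fits the Python 2 code in the question. As a hedged aside for Python 3, where `json.dump` writes text rather than bytes, a comparable sketch opens the files in text mode with an explicit encoding. The sample data and the file name `example.json` are made up for illustration.

```python
import json
import os
import tempfile

# Hypothetical sample data and file name, used only for this illustration.
data = [{"body": "caf\u00e9 \u201cquoted\u201d"}]
path = os.path.join(tempfile.mkdtemp(), "example.json")

# Text mode with an explicit encoding, so non-ASCII characters round-trip.
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

with open(path, "r", encoding="utf-8") as f:
    loaded = json.load(f)

print(loaded == data)  # True
```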

You can define the search string as unicode with:

doubleQuote = u'"'

You can use this command to find the integer value of the character:

ord(u'"')

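I get 34 as the result. The inverse function is chr(34). Are you searching for the same double quote character that the JSON actually contains? See here for details.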
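One way to check which quote character the data really contains is to compare code points. This is a small sketch in Python 3 syntax; the sample string is invented, and the curly quotes are only an assumption about what scraped text might contain.

```python
# Compare the ASCII double quote with the curly quotes that often
# appear in scraped text. The sample string is hypothetical.
straight = '"'
left_curly = '\u201c'
right_curly = '\u201d'

print(ord(straight))        # 34
print(ord(left_curly))      # 8220
print(ord(right_curly))     # 8221
print(chr(34) == straight)  # True

sample = 'He said \u201chello\u201d and "bye"'
# Collect each quote character found in the sample with its code point.
quote_like = sorted({(ch, ord(ch)) for ch in sample if ch in '"\u201c\u201d'})
print(quote_like)
```

If the code points differ from 34, replacing `"` alone will leave the other quote characters in place.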

You don't need the if statement to check whether news contains '"'. Calling replace on news is enough.
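A minimal sketch of why the check is unnecessary: `replace` simply returns the string unchanged when there is nothing to replace. The helper name `strip_double_quotes` is hypothetical and not part of the original code.

```python
def strip_double_quotes(text):
    """Return text with every ASCII double quote removed."""
    # replace() returns the input unchanged when nothing matches,
    # so no find() check is needed beforehand.
    return text.replace('"', '')

print(strip_double_quotes('say "hi"'))   # say hi
print(strip_double_quotes('no quotes'))  # no quotes
```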

Try these steps and let me know if it still doesn't work.

Answer 2

str.replace does not change the original string, so you need to assign the result back to news.

    if(news.find(doubleQuote) != -1): # if doublequotes found in first body tag
    #   print "found double quote"
        news = news.replace(doubleQuote,"")
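For context, here is a hedged sketch of the whole loop with the result of `replace` kept, written in Python 3 syntax. The in-memory `data` list stands in for the parsed file and the helper `clean_body` is hypothetical. It also uses `json.dumps` to build each output line instead of concatenating strings by hand, which is a swapped-in technique rather than what the question does, and it joins parts without the leading space the original code produces.

```python
import json

# Hypothetical stand-in for the parsed contents of TodaysHtScrapedItems.json.
data = [
    {"body": ['He said "hello"', "", "second part"]},
    {"body": ["no quotes here"]},
]

def clean_body(parts):
    """Join the non-empty parts of one item, dropping double quotes."""
    cleaned = []
    for news in parts:
        news = news.replace('"', "")  # keep the string that replace() returns
        if news != "":
            cleaned.append(news)
    return " ".join(cleaned)

# json.dumps takes care of quoting and escaping each output line.
lines = [json.dumps({"body": clean_body(item["body"])}) for item in data]
for line in lines:
    print(line)
```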

Some issues with Unicode encoding
