Removing unwanted characters from a JSON file using different threads (Python)

Time: 2017-11-12 17:06:35

Tags: python json multithreading replace python-multithreading

In my Python file I have created a class named Download. The code for the class is:

import requests, json, os, pytube, threading

class Download:


    def __init__(self, url, json=False, get=False, post=False, put=False, unwanted="", wanted="", unwanted2="", wanted2="", unwanted3="", wanted3=""):
        self.url = url
        self.json = json
        self.get = get
        self.post = post
        self.put = put
        self.unwanted = unwanted
        self.wanted = wanted
        self.unwanted2 = unwanted2
        self.wanted2 = wanted2
        self.unwanted3 = unwanted3
        self.wanted3 = wanted3 

    def downloadJson(self):
        if self.get is True:
            downloadJson = requests.get(self.url)
            downloadJson = str(downloadJson.content)
            downloadJsonS = str(downloadJson) # This saves the downloaded JSON file as string

            if self.json is True:
                with open("downloadedJson.json", "w") as writeDownloadedJson:
                    writeDownloadedJson.write(json.dumps(downloadJson))
                    writeDownloadedJson.close()

                with open("downloadedJson.json", "r") as replaceUnwanted:
                    a = replaceUnwanted.read()
                    x = a.replace(self.unwanted, self.wanted)
                    # y = a.replace(self.unwanted2, self.wanted2)
                    # z = a.replace(self.unwanted3, self.wanted3)
                    print(x)

                with open("downloadedJson.json", "w") as writeUnwanted:
                    # writeUnwanted.write(y)
                    # writeUnwanted.write(z)
                    writeUnwanted.write(x)

            else:
                # with open("downloadedJson.json", "w")as j:
                #     j.write(downloadJsonS)
                #     j.close()
                pass

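I wrote this myself and I understand how it works. My goal is to remove all the unwanted characters that appear in the JSON file after it is downloaded, such as \\n, \' and \n. The __init__() function takes many parameters for this, such as __init__(unwanted="", wanted="", unwanted2="") and so on.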

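That way, when a sequence such as \\n is passed to the unwanted parameter, every occurrence of it should be replaced with a blank. This part is done correctly and works. The commented-out lines are the code I am trying to use for the other parameters, but it does not work: only the characters from the first parameter get replaced.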

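Is there a way to use threads so that the unwanted characters from every parameter are handled? If threads cannot be used, is there an alternative?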

By the way, this is the file in which I use the class (main.py):

from downloader import Download

with open("url.txt", "r") as url:
    x = Download(url.read(), get=True, json=True, unwanted="\\n")
    x.downloadJson()

Thanks

1 Answer:

Answer 0: (score: 0)

You can apply the replacements one after another, feeding each result into the next call:

x = a.replace(self.unwanted, self.wanted)
x = x.replace(self.unwanted2, self.wanted2)
x = x.replace(self.unwanted3, self.wanted3)
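
The difference from the commented-out lines in the question can be seen in a small sketch. The sample string below is made up for illustration and only stands in for the contents of the downloaded file. Because str.replace returns a new string and leaves the original untouched, calling it on a each time gives results that each contain only one of the changes:

```python
# Hypothetical sample text standing in for the downloaded file contents.
a = "line1\\nline2\\'s value"

# Each call starts again from the original string `a`,
# so y does not include the change that produced x (and vice versa).
x = a.replace("\\n", " ")
y = a.replace("\\'", "'")

# Feeding each result into the next call accumulates the changes.
chained = a.replace("\\n", " ")
chained = chained.replace("\\'", "'")

print(x)
print(y)
print(chained)
```

Writing only x (or only y) to the file therefore loses the other replacements, which matches the behavior described in the question.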

You can also chain the replacements together, but that quickly becomes hard to read:

x = a.replace(...).replace(...).replace(...)

By the way, instead of multiple unwantedN, wantedN parameters, it would probably be much easier to use a list of (unwanted, wanted) pairs, like this:

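def __init__(self, url, json=False, get=False, post=False, put=False, replacements=None):
    self.url = url
    self.json = json
    self.get = get
    self.post = post
    self.put = put
    # Avoid a shared mutable default: build a fresh list for each instance
    self.replacements = list(replacements) if replacements else []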

Then you can perform the replacements in a loop:

x = a
for unwanted, wanted in self.replacements:
    x = x.replace(unwanted, wanted)
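
As a minimal, self-contained sketch of the loop above: the helper name apply_replacements and the sample pairs are hypothetical and are not part of the original Download class.

```python
def apply_replacements(text, replacements):
    """Apply each (unwanted, wanted) pair to text, in the order given."""
    for unwanted, wanted in replacements:
        text = text.replace(unwanted, wanted)
    return text


# Hypothetical pairs: a literal backslash followed by "n" becomes a space,
# and an escaped single quote becomes a plain one.
pairs = [("\\n", " "), ("\\'", "'")]
cleaned = apply_replacements("first\\nsecond\\'s", pairs)
print(cleaned)
```

Each step works on the result of the previous one, so the replacements run sequentially. Threads would not help here, since every call depends on the output of the call before it.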