Finding and replacing parameters in a folder of HTML .txt files

Time: 2019-01-22 16:18:04

Tags: python html python-3.x replace export-to-csv

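I have a database (I'm not sure you would call it a database, but that is what it has turned into) holding many URLs from a messy set of old company files. I want to find multiple URLs in .txt files and replace them with the company's new banner/HTML information. The script will be used together with another data-handling program written in Python (a CSV parser). Here is my code. Why isn't my .txt file being replaced?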

I tried inspecting the output of the read object, which I know is a string, and I also looked into what the replace() function does.

    import json
    import csv


    class HTML_Parser:
        def __init__(self, data):
            self.data = data


    F = open(r"C:\Users\Ultrarev\Desktop\Emeran-Parser\HTMLtoBeReplaced.txt", "r")
    str1 = F.read()

    str1.replace("http://www.ultrarev.com/processedimages/ebay_layout/banner750x150.jpg", "https://xcart.amcoautoparts.com/ebay_layout/ebay_tmp_top.jpg")
    str1.replace("http://www.ultrarev.com/processedimages/manufacturers/Ultrarev-Footer.jpg", "https://xcart.amcoautoparts.com/manufacturers/Ultrarev-Footer.jpg")
    str1.replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")
    str1.replace("ULTRAREV INC", 'AMCO Auto Parts, LLC')
    str1.replace("Should you have any question  please call 1(877) 858-7272.", "Should you have any question, please message us!")
    str1.replace("120 Central Ave. Farmingdale  NJ 07727", " ")
    str1.replace("CALL FOR CUSTOMER SUPPORT", " ")
    str1.replace("Please Call us toll free 1-877-858-7272!", "Should you have any question, please message us!")
    str1.replace('<a style="color: #000000; font-weight:bold; text-decoration:none" href="tel:732-938-3999">', " ")
    str1.replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:1-877-858-7272">1-877-858-7272</a>', ' ')
    str1.replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:732-938-3999">732-938-3999</a>', ' ')
    str1.replace(' OR ', ' ')
    str1.replace('OEM (Match Case) - Find them in both Title and Description', 'OE')
    str1.replace("http://www.ultrarev.com", "https://xcart.amcoautoparts.com")
    str1.replace('http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png', 'http://amcoautoparts.com/images/P/RalcoRZLogo.png')

    print(str1)

I expected it to return the string with the replaced values, but it returns the previous string unchanged.

2 Answers:

Answer 0: (score: 0)

str.replace() does not modify the original string; it returns a new string with the replacements applied.
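A minimal sketch of that behavior, using one of the strings from the question. The sample text and the variable names are made up for illustration:

```python
original = "ULTRAREV INC. parts catalog"

# Calling replace() without capturing the result leaves the original untouched,
# because Python strings are immutable.
original.replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")
unchanged = original

# Assigning the return value is what keeps the replaced text.
updated = original.replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")

print(unchanged)
print(updated)
```

The first print shows the old text and the second shows the new one, which is the difference the question's code is running into.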

You have to chain the calls, like this:

new_str = original_str.replace(...).replace(...).replace(...)

I would also suggest storing the replacement pairs in a tuple, for example:

replaces = (('from1', 'to1'), ('from2', 'to2'), ('from3', 'to3'))
for src, dest in replaces:
    str1 = str1.replace(src, dest)
print(str1)
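One thing to watch with a list of pairs is ordering: each replacement runs on the output of the previous one, so a broad pattern can consume text that a later, more specific pattern was meant to match. The sketch below wraps the loop in a hypothetical helper, `apply_replacements`, and puts the specific logo URL ahead of the generic domain rule. The URLs come from the question; the sample HTML and the `/about` link are invented for the demonstration.

```python
def apply_replacements(text, pairs):
    """Apply each (old, new) pair in order and return the resulting string."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


# Specific URLs come first. If the generic domain rule ran first, it would
# rewrite the logo URL and the specific pattern would never match.
pairs = (
    ("http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png",
     "http://amcoautoparts.com/images/P/RalcoRZLogo.png"),
    ("http://www.ultrarev.com", "https://xcart.amcoautoparts.com"),
)

# Hypothetical sample input, not taken from the real file.
sample = (
    '<img src="http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png">'
    '<a href="http://www.ultrarev.com/about">About</a>'
)

result = apply_replacements(sample, pairs)
print(result)
```

Keeping the pairs in one ordered structure also makes it easy to review or extend the list without touching the loop.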

Answer 1: (score: 0)

I just chained all the replacements together.

import json
import csv


class HTML_Parser:
    def __init__(self, data):
        self.data = data


with open(r"C:\Users\Ultrarev\Desktop\Emeran-Parser\HTMLtoBeReplaced.txt") as F:
    str1 = F.read()

str2 = (
    str1
    .replace("http://www.ultrarev.com/processedimages/ebay_layout/banner750x150.jpg", "https://xcart.amcoautoparts.com/ebay_layout/ebay_tmp_top.jpg")
    .replace("http://www.ultrarev.com/processedimages/manufacturers/Ultrarev-Footer.jpg", "https://xcart.amcoautoparts.com/manufacturers/Ultrarev-Footer.jpg")
    .replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")
    .replace("ULTRAREV INC", 'AMCO Auto Parts, LLC')
    .replace("Should you have any question  please call 1(877) 858-7272.", "Should you have any question, please message us!")
    .replace("120 Central Ave. Farmingdale  NJ 07727", " ")
    .replace("CALL FOR CUSTOMER SUPPORT", " ")
    .replace("Please Call us toll free 1-877-858-7272!", "Should you have any question, please message us!")
    .replace('<a style="color: #000000; font-weight:bold; text-decoration:none" href="tel:732-938-3999">', " ")
    .replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:1-877-858-7272">1-877-858-7272</a>', ' ')
    .replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:732-938-3999">732-938-3999</a>', ' ')
    .replace(' OR ', ' ')
    .replace('OEM (Match Case) - Find them in both Title and Description', 'OE')
    # The specific logo URL has to be replaced before the generic domain rule, or it never matches.
    .replace('http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png', 'http://amcoautoparts.com/images/P/RalcoRZLogo.png')
    .replace("http://www.ultrarev.com", "https://xcart.amcoautoparts.com")
)

print(str2)

This seems to be the best solution for now.
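Note that printing the result does not change the file on disk; the new text still has to be written back. Since the title mentions a folder of .txt files, here is a rough sketch of one way that could look with `pathlib`. The helper name `replace_in_folder`, the `*.txt` pattern, and the UTF-8 encoding are all assumptions, and the demo runs against a temporary directory instead of the real `Emeran-Parser` folder.

```python
import tempfile
from pathlib import Path


def replace_in_folder(folder, pairs, pattern="*.txt", encoding="utf-8"):
    """Rewrite every matching file in `folder`; return the paths that changed."""
    changed = []
    for path in sorted(Path(folder).glob(pattern)):
        original = path.read_text(encoding=encoding)
        updated = original
        for old, new in pairs:
            updated = updated.replace(old, new)
        # Only touch files whose content actually changed.
        if updated != original:
            path.write_text(updated, encoding=encoding)
            changed.append(path)
    return changed


# Demo on a throwaway directory so no real files are modified.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "listing.txt"
    demo_file.write_text("CALL FOR CUSTOMER SUPPORT at ULTRAREV INC.", encoding="utf-8")

    pairs = [
        ("ULTRAREV INC.", "AMCO Auto Parts, LLC"),
        ("CALL FOR CUSTOMER SUPPORT", " "),
    ]
    changed_files = replace_in_folder(tmp, pairs)
    demo_result = demo_file.read_text(encoding="utf-8")
    print(len(changed_files), repr(demo_result))
```

Because the function overwrites files in place, it is probably worth running it on a copy of the folder first.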