Matching strings against rows in a CSV file

Time: 2018-08-29 15:27:24

Tags: python csv

This script almost works. However, it rarely reports a match, and when it does, the values it prints are incorrect. Example:

no match
Lower 117, $331.50, F, 8, 193
Upper 218, $155.00, AA, 8, 195

match
Floor 6, $273.00, N, 2, 195
SECTION,PRICE,ROW,QTY,DYSLSTED 

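So I'm not sure why it isn't working. After the first run has loaded all the values from the HTML file, the program should print only match for the listings, since they are all already in the CSV. But when I run it in its current configuration, I get the opposite result.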

The HTML file eagles.html is here

Here is my script:

import os
import sys
from bs4 import BeautifulSoup
import lxml.html as lh
import csv


soup = BeautifulSoup(open("eagles.html"), "lxml")
###################################################################
variable = 'test_csv_1' ########DELETE
dir_path = os.path.dirname(os.path.realpath(__file__))
file_path = (dir_path+'\\Sheets')
try:
    os.makedirs(file_path)
except:
    pass
#######################
for mytable in soup.find_all('table'):
    for trs in mytable.find_all('tr'):
        tds = trs.find_all('td')
        row1 = [elem.text.strip() for elem in tds]
        row = str(row1)
        cool = row.replace("[", "")
        coolp = cool.replace("]", "")
        cool2 = coolp.replace("'", "")
        cool3 = cool2.replace(" , ", "")
        row = cool3
        rowtest = (row.split(','))
        if len(rowtest) != 5:
            rowtest = ['NULL', 'NULL', 'NULL', 'NULL', 'NULL']
        ###TABLE STUFF###
        rowtest0 = rowtest[:4] # LISTING WITHOUT DAYS LISTED
        rowtest1 = rowtest[0:1] # SECTION LOCATION
        rowtest2 = rowtest[1:2] # TICKET PRICE
        rowtest3 = rowtest[2:3] # ROW
        rowtest4 = rowtest[3:4] # TICKET QTY  
        rowtest5 = rowtest[4:5] # DAYS LISTED
        ###TABLE STUFF#
        ###CREATE CSV HEADER###
        with open(file_path+'\\'+variable+'.csv', 'a+') as headercsv:
            if os.stat(file_path+'\\'+variable+'.csv').st_size == 0:
                writer = csv.writer(headercsv)
                writer.writerow(["SECTION", "PRICE", "ROW", "QTY", "DYSLSTED"])
                print('CREATED HEADERS FOR NEW FILE')
            else:
                pass
        ###WRITE TO CSV###
        with open(file_path+'\\'+variable+'.csv', 'r') as rowin:
            if rowtest == ['NULL', 'NULL', 'NULL', 'NULL', 'NULL']:
                continue
            else:
                pass
            for boogie in rowin:
                if row in boogie:
                    print(row)
                    print(boogie)
                    print('match')
                    break
                else:
                    print(row)
                    print(boogie)
                    print('no match')
                    with open(file_path+'\\'+variable+'.csv', 'a+') as ruts:
                        writer = csv.writer(ruts)
                        writer.writerow(rowtest)
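
In the script above, the `else` branch runs for every CSV line that does not contain `row`, so `no match` is printed and the row is appended once per non-matching line (starting with the header line) rather than once after the whole file has been checked. What follows is a minimal sketch of one way to make that decision once per scraped row. It is not a confirmed fix for the original data. The helper names `load_existing_rows` and `append_new_rows` are hypothetical, and the sketch assumes that the first line of the CSV is always the header and that each scraped row arrives as a list of cell strings.

```python
import csv
import os


def load_existing_rows(csv_path):
    """Return the data rows already in the CSV as a set of tuples.

    Assumption: the first line of the file is the header and is skipped.
    """
    if not os.path.exists(csv_path):
        return set()
    with open(csv_path, newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header line, if the file has one
        # Strip each cell so ' $331.50' and '$331.50' compare equal.
        return {tuple(cell.strip() for cell in line) for line in reader if line}


def append_new_rows(csv_path, scraped_rows, header):
    """Append rows that are not yet in the CSV.

    Returns (matched, added): rows that were already present and rows
    that were written during this call.
    """
    existing = load_existing_rows(csv_path)
    # Check this before opening in append mode, which creates the file.
    is_new_file = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    matched, added = [], []
    with open(csv_path, 'a', newline='') as f:
        writer = csv.writer(f)
        if is_new_file:
            writer.writerow(header)
        for cells in scraped_rows:
            key = tuple(cell.strip() for cell in cells)
            if len(key) != len(header):
                continue  # skip rows without the expected number of cells
            if key in existing:
                matched.append(key)  # decided once, against the whole file
            else:
                writer.writerow(key)
                existing.add(key)
                added.append(key)
    return matched, added
```

With the original script, `scraped_rows` could be built as `[[td.text for td in tr.find_all('td')] for tr in mytable.find_all('tr')]`. Passing the list of cells straight to `csv.writer`, rather than joining it into a string and splitting on commas, also keeps a cell that itself contains a comma in a single field, and opening the file with `newline=''` is what the `csv` module documentation recommends to avoid extra blank lines on Windows.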

0 answers:

No answers