Only part of the data gets written to the CSV

Time: 2019-04-13 11:51:40

Tags: python csv web-scraping

I want to write all of the data I scrape from the web to a CSV file, but only part of it ends up in the file.

Why does this happen?

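Should I use a MySQL database instead of CSV? I would like to use Flask to serve this CSV file to a web browser. Also, when I add several import statements such as "import re" at the top of the function "input_to_output(site,oc,id,place,pg)", after "import csv", most of them fail to import. Is that because the csv module does something?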

What I most want to know is the first point: why only part of the data is written.

def townwork_page(oc,place,pg):
    import csv
    import re
    from bs4 import BeautifulSoup
    import requests
    name_list=[]
    ac={"":"","北海道":"101","青森":"110","東京":"041","茨城":"047","千葉":"043","埼玉":"044","群馬":"045","神奈川":"042","栃木":"046",\
    "秋田":"110","岩手":"109","山形":"108","宮城":"106","福島":"107","静岡":"600","愛知":"081","岐阜":"085","山梨":"118","新潟":"117","福井":"124",\
    "長野":"116","滋賀":"064","富山":"122","石川":"123","三重":"084","奈良":"065","和歌山":"066","京都":"063","大阪":"061","兵庫":"062","鳥取":"133",\
    "島根":"134","岡山":"132","佐賀":"441","山口":"135","広島":"131","香川":"136","徳島":"138","愛媛":"137","高知":"139","福岡":"440","長崎":"442",\
    "大分":"444","熊本":"443","宮崎":"445","鹿児島":"446","沖縄":"447"}
    jc={"":"","IT/コンピュータ":"005","営業":"009","専門職/その他":"017","レジャー/エンタメ":"011","接客/サービス":"010","物流/配送":"012","建築/土木":"013","教育":"014"\
    ,"医療/介護/福祉":"015","飲食/フード":"001","販売":"002","事務":"003","総務/企画":"004","IT/コンピュータ":"005","軽作業":"018","芸能":"008","マスコミ/出版":"007",\
    "工場/製造":"019"}
    if any([oc not in jc, place not in ac]):
        print("Input correctly!")
        return ""
    if type(oc)==str:
        oc=oc,""
    if type(place)==str:
        place=place,""
    url_form="https://townwork.net/joSrchRsltList/?"
    if oc:
        for i in oc:
            if i:
                url_form+="jc="+jc[i]+"&"
    if place:
        for j in place:
            if j:
                url_form+="ac="+ac[j]+"&"
    if pg:
        for l in range(1,pg+1):
            url_form+="pg="+str(l)+"&"
            if url_form[-1]=="&":
                url_form=url_form[:-1]
                res=requests.get(url_form)
                soup=BeautifulSoup(res.text,"html.parser")
                company_name=soup.findAll(class_="job-lst-main-ttl-txt")
                for i in company_name:
                    if i.get_text().split() not in name_list:
                        name_list.append(i.get_text().split())
    return name_list
def input_to_output(site,oc,id,place,pg):
    import csv
    if site=="タウンワーク":
        with open ("output.csv","w+",newline='',encoding="utf-8") as csvFile:
            try:
                writer=csv.writer(csvFile,lineterminator='\n')
                writer.writerow(["information"])
                for line in townwork_page(oc,place,pg):
                    writer.writerow(line)
            finally:
                csvFile.close()
    elif site=="食べログ":
        with open ("output.csv","w+",newline='') as csvFile:
            try:
                writer=csv.writer(csvFile)
                writer.writerow(("information"))
                writer.writerow((tabelog_page(place,pg)))
            finally:
                csvFile.close()
    else:
        print("write correctly!")

input_to_output("タウンワーク","営業","","北海道",10)
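One possible explanation, offered as a sketch rather than a confirmed diagnosis: inside the page loop of townwork_page, "pg=" is appended to the same url_form string on every iteration, so the query string keeps growing instead of naming one page at a time. The snippet below reproduces only the string handling from the code above, with no network requests. The helper names build_urls_as_written and build_urls_per_page are hypothetical and exist only for this illustration; the BASE value assumes the call shown above ("営業" and "北海道").

```python
# Query prefix produced by the call above: jc["営業"] is "009", ac["北海道"] is "101".
BASE = "https://townwork.net/joSrchRsltList/?jc=009&ac=101&"


def build_urls_as_written(base, pg):
    """Mimic the loop in townwork_page: keep appending to one string."""
    urls = []
    url_form = base
    for page in range(1, pg + 1):
        url_form += "pg=" + str(page) + "&"
        if url_form[-1] == "&":
            url_form = url_form[:-1]
        urls.append(url_form)
    return urls


def build_urls_per_page(base, pg):
    """Build a fresh URL for each page from the unchanged base."""
    return [base + "pg=" + str(page) for page in range(1, pg + 1)]


# Compare the URLs each approach would request for three pages.
for url in build_urls_as_written(BASE, 3):
    print(url)
for url in build_urls_per_page(BASE, 3):
    print(url)
```

With the original loop, the third URL ends in "pg=1pg=2pg=3", while the per-page version ends in "pg=3". How the site responds to the malformed query string is not something this sketch can confirm, but if it falls back to an earlier page, the duplicate check on name_list would discard the repeated rows and only part of the data would reach the CSV.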

The site being scraped is "タウンワーク". Is there a syntax problem?
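On the syntax question, the code above appears to be syntactically valid, but two of the csv calls may not do what was intended. ("information") is a plain string rather than a tuple, and csv.writer.writerow treats a string as a sequence of characters. Likewise, get_text().split() breaks each title on whitespace, so every word lands in its own column. A minimal sketch using an in-memory buffer follows; the helper rows_to_csv_text and the sample strings are made up for illustration.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Write rows to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# ("info") is just the string "info"; each character becomes a column.
print(rows_to_csv_text([("info")]))
# A one-element list keeps the header in a single column.
print(rows_to_csv_text([["info"]]))
# str.split() breaks a title on whitespace, giving one column per word.
print(rows_to_csv_text(["sales staff wanted".split()]))
```

Wrapping a value in a list, as in writer.writerow([title]), is one way to keep it in a single column.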

0 answers:

No answers