Create a Python script that reads a csv file, uses that input to scrape data from finviz.com, and then exports the data to a csv file

Date: 2019-03-16 05:09:00

Tags: python csv

I am trying to pull a list of stock tickers from a csv file, look each ticker up on finviz.com, and then export the data to a csv file. I am new to Python programming, but I know this will help me and others. This is what I have so far.

    import csv
    import urllib.request
    from bs4 import BeautifulSoup

    with open('shortlist.csv', 'r') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        name = None
        for row in reader:
            if row[0]:
                name = row[0]
            print(name)
    write_header = True

    sauce = print(name)
    soup = BeautifulSoup(sauce.text, 'html.parser')

    print(soup.title.text)

    symbols = name
    """"
    print(symbols)
    """
    URL_BASE = "https://finviz.com/quote.ashx?t="

    with open('output.csv', 'w', newline='') as file:
        writer = csv.writer(file)

        for ticker in symbols:
            URL = URL_BASE + ticker
            try:
                fpage = urllib.request.urlopen(URL)
                fsoup = BeautifulSoup(fpage, 'html.parser')

                if write_header:
                    # note the change
                    writer.writerow(['ticker'] + list(map(lambda e: e.text, fsoup.find_all('td', {'class': 'snapshot-td2-cp'}))))
                    write_header = False

                # note the change
                writer.writerow([ticker] + list(map(lambda e: e.text, fsoup.find_all('td', {'class': 'snapshot-td2'}))))
            except urllib.request.HTTPError:
                print("{} - not found".format(URL))

I am getting no output in the csv file "output.csv". I only see the data from the input csv file "shortlist" printed. The two parts are not linked together correctly. I have spent weeks researching how to do this. Any help is much appreciated.
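A likely culprit in the code above: the loop keeps overwriting `name`, so only the last row survives, and `symbols = name` then makes `symbols` a single string, meaning `for ticker in symbols` iterates character by character rather than ticker by ticker. A minimal sketch of collecting every ticker into a list instead (the sample file contents here are a stand-in, assuming one ticker in the first column of each row of `shortlist.csv`):

```python
import csv

# Create a tiny sample shortlist.csv (stand-in for the real input file).
with open('shortlist.csv', 'w', newline='') as f:
    f.write("AAPL\nMSFT\n\nGOOG\n")

# Read every non-empty first-column value into a list.
with open('shortlist.csv', newline='') as f:
    symbols = [row[0].strip() for row in csv.reader(f) if row and row[0].strip()]

print(symbols)
```

Because `symbols` is now a list, each step of the later `for ticker in symbols` loop sees a whole ticker rather than one character.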

1 Answer:

Answer 0: (score: 0)

    import csv
    import urllib.request
    from bs4 import BeautifulSoup

    # Collect every ticker from the first column into a list.
    # (Previously only the last row survived, and iterating a single
    # string yielded one character at a time.)
    symbols = []
    with open('shortlist.csv', 'r') as csvfile:
        reader = csv.reader(csvfile, delimiter=',')
        for row in reader:
            if row and row[0]:
                symbols.append(row[0].strip())
                print(row[0])

    write_header = True
    URL_BASE = "https://finviz.com/quote.ashx?t="

    with open('output.csv', 'w', newline='') as file:
        writer = csv.writer(file)

        for ticker in symbols:
            URL = URL_BASE + ticker
            try:
                fpage = urllib.request.urlopen(URL)
                fsoup = BeautifulSoup(fpage, 'html.parser')

                if write_header:
                    # header row: field names from the snapshot table
                    writer.writerow(['ticker'] + list(map(lambda e: e.text, fsoup.find_all('td', {'class': 'snapshot-td2-cp'}))))
                    write_header = False

                # data row: field values for this ticker
                writer.writerow([ticker] + list(map(lambda e: e.text, fsoup.find_all('td', {'class': 'snapshot-td2'}))))
            except urllib.request.HTTPError:
                print("{} - not found".format(URL))
Here is the output: (screenshot in original post)
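If the requests themselves still fail, note that finviz.com is commonly reported to reject the default Python user agent with HTTP 403. A possible workaround (untested here, and the header string below is just an illustrative value, not a verified requirement) is to send a browser-like User-Agent via `urllib.request.Request`:

```python
import urllib.request

URL = "https://finviz.com/quote.ashx?t=AAPL"
# Browser-like header; the exact string is an illustrative assumption.
req = urllib.request.Request(URL, headers={"User-Agent": "Mozilla/5.0"})
# fpage = urllib.request.urlopen(req)   # then parse with BeautifulSoup as before
print(req.get_header("User-agent"))
```

The `Request` object can be passed to `urllib.request.urlopen` exactly where the plain URL string was used before.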