import urllib
from datetime import date,timedelta
import datetime
import re
list =["infy.ns","grasim.ns","idea.ns","asianpain.ns","bajaj-auto-eq.ns",
"drreddy.ns","boschltd.ns","kotakbank.ns","M&M.ns","ultracemc.ns",
"sunpharma.ns","lt.ns","acc.ns","sbin.ns","bhartiartl.ns",
"lupin.ns","reliance.ns","hdfcbank.ns","zeel.ns","ntpc.ns",
"icicibank.ns","cipla.ns","tcs.ns","bpcl.ns","heromotoc.ns"]
i=0
while i<len(list):
    url="http://finance.yahoo.com/q?s="+list[i]+"&ql=1"
    htmlfile = urllib.urlopen(url)
    htmltext=htmlfile.read()
    regex='<span id="yfs_l84_'+list[i]+'">(.+?)</span>'
    pattern = re.compile(regex)
    price = re.findall(pattern,htmltext)
    print(price)
    i=i+1
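I need to fetch values from finance.yahoo.com. When I run this code from the terminal, all the values are printed there, but I want to save them to a text file on my desktop.
Answer 0: (score: 0)
The simplest way requires no coding at all. Just redirect the script's output to a file, for example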
python yahoo_scraper.py > prices.txt
or
python yahoo_scraper.py >> prices.txt
to append to an existing file.
It is also easy to do in Python. Open a file for writing and write to it:
# Note: in Python 2, print(..., file=...) requires
# "from __future__ import print_function" at the top of the script.
with open('prices.txt', 'w') as price_file:
    i=0
    while i<len(list):
        url="http://finance.yahoo.com/q?s="+list[i]+"&ql=1"
        htmlfile = urllib.urlopen(url)
        htmltext=htmlfile.read()
        regex='<span id="yfs_l84_'+list[i]+'">(.+?)</span>'
        pattern = re.compile(regex)
        price = re.findall(pattern,htmltext)
        print(price, file=price_file)
        i=i+1
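Note that the file is overwritten each time the script runs. If you want to append to the end of the file instead, replace 'w' with 'a' to open it in append mode.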
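To see the difference between the two modes, here is a minimal sketch that writes to a throwaway file in a temporary directory. The file name prices_demo.txt, the helper functions, and the sample values are made up for illustration and do not come from Yahoo.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, so nothing real is overwritten.
demo_path = os.path.join(tempfile.mkdtemp(), "prices_demo.txt")

def write_lines(path, mode, lines):
    """Open `path` in the given mode ('w' or 'a') and write one value per line."""
    with open(path, mode) as f:
        for line in lines:
            f.write(line + "\n")

def read_lines(path):
    """Return the file's contents as a list of lines without newlines."""
    with open(path) as f:
        return f.read().splitlines()

write_lines(demo_path, "w", ["100.0"])   # creates the file
write_lines(demo_path, "w", ["101.5"])   # 'w' truncates: the first value is gone
after_w = read_lines(demo_path)

write_lines(demo_path, "a", ["102.25"])  # 'a' keeps existing content and adds to it
after_a = read_lines(demo_path)

print(after_w)
print(after_a)
```

To write to the desktop as the question asks, a path such as os.path.join(os.path.expanduser("~"), "Desktop", "prices.txt") could be passed to open(), assuming the desktop folder lives at that location on your system.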
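Your while loop would be better written as a for loop. Here is an example, assuming list has been renamed to stocks so that it no longer shadows the built-in list: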
stocks = ["infy.ns","grasim.ns",....]
with open('prices.txt', 'w') as price_file:
    for stock in stocks:
        url = "http://finance.yahoo.com/q?s={}&ql=1".format(stock)
        html = urllib.urlopen(url).read()
        pattern = r'<span id="yfs_l84_{}">(.+?)</span>'.format(stock)
        price = re.findall(pattern, html)
        print(price, file=price_file)
You may want to change the last line so that it prints only the first element of the list returned by re.findall().
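As a rough sketch of that last suggestion, the helper below returns the first match or None when nothing matches. The function name first_price and the sample HTML fragment are hypothetical, and the use of re.escape is an addition not present in the original code.

```python
import re

def first_price(html, stock):
    """Return the first matched price string, or None if the page has no match."""
    # re.escape guards against regex metacharacters in tickers such as "M&M.ns".
    pattern = r'<span id="yfs_l84_{}">(.+?)</span>'.format(re.escape(stock))
    matches = re.findall(pattern, html)
    return matches[0] if matches else None

# Made-up HTML fragment standing in for a downloaded page.
sample_html = '<span id="yfs_l84_infy.ns">1,234.50</span>'

found = first_price(sample_html, "infy.ns")
missing = first_price(sample_html, "tcs.ns")
print(found)
print(missing)
```

Checking for an empty result before indexing avoids an IndexError when the page layout changes or a ticker is not found.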