我在将数据存储到CSV文件时遇到问题,但是存储不正确。
import urllib2
import csv
from bs4 import BeautifulSoup
url='http://contentlinks.dionglobal.in/ib/closeprices.asp?Exchange=NSE&Startname=A'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
for name_box in soup.find_all('tr', class_='alternate'):
name = name_box.text.strip()
print(name)
with open('Book1.csv', 'a') as csv_file:
writer = csv.writer(csv_file)
writer.writerow([name])
for namebox2 in soup.find_all('tr', class_='alternate1'):
name3 = namebox2.text.strip()
with open('Book1.csv', 'a') as csv_file:
writer = csv.writer(csv_file)
writer.writerow([name3])
print(type(name3))
答案 0 :(得分:2)
尝试使用pandas
执行此任务:
import urllib2
from bs4 import BeautifulSoup
import pandas as pd
url='http://contentlinks.dionglobal.in/ib/closeprices.asp?Exchange=NSE&Startname=A'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page, 'html.parser')
df = pd.read_html(str(soup.find_all('table', cellspacing=2)),skiprows=1,header=0)[0].dropna()
df.to_csv('Book1.csv',index=False)
输出是:
Security Name Open Pr High Pr Low Pri Close P Traded Number Traded Value (R 20 Microns 39.4 39.9 38.2 38.35 41131 203 1585478.1 3i Infotech 4.25 4.4 4.15 4.15 3465068 1332 14679467.7 3M India 13849.4 13899.9 13701.1 13732.8 3559 358 48934409.85 A2Z Mainten 45 45.15 42.75 43.15 896729 4280 39108262.2 Aarti Drug 521 533 506.6 530.3 23832 1516 12430577.4 Aarti Inds 940 943.95 916 923.45 184109 3043 172642406 Aarve Denim 64.5 65.95 61.05 61.4 19461 258 1216503.8 Aban Offshor 187.8 191.9 183.3 187.15 1705780 20844 322208420.1 ABB 1502 1506 1424.05 1443.35 108282 3793 156994217.4 Abbott India 4389 4389 4275 4298.35 873 331 3773028.6 ABG Shipyard 12.6 12.6 12.5 12.6 40626 126 511837.6 ABM Intl 113.05 113.05 102.35 106.35 3451 52 374868.15 Abshek Inds 80.8 81.3 77.8 79.1 418308 5783 33137029.05 ACC 1653.9 1655.95 1626.4 1636.7 171707 11603 281527492.6 Accel Front 51.1 51.1 48.3 50 6608 17 334956