Dear Stack Overflow community,
I recently started playing around with Python. I have learned a lot from watching YouTube videos and browsing this platform, but I cannot solve my current problem.
I hope you can help me.
I am trying to scrape information from a website with Python (Anaconda) and write it to a CSV file. I tried to separate the columns by adding "," in my script, but when I open the CSV file, all the data ends up in a single column (A). Instead, I want the data split across separate columns (A and B, and later C, D, E, F, etc. as I add more information).
What do I need to add to this code:
filename = "brands.csv"
f = open(filename, "w")
headers = "brand, shipping\n"
f.write(headers)
for container in containers:
    brand_container = container.findAll("h2",{"class":"product-name"})
    brand = brand_container[0].a.text
    shipping_container = container.findAll("p",{"class":"availability in-stock"})
    shipping = shipping_container[0].text.strip()
    print("brand: " + brand)
    print("shipping: " + shipping)
    f.write(brand + "," + shipping + "," + "\n")
f.close()
Thanks for your help!
Kind regards,
Script updated following Game0ver's suggestion:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = 'https://www.scraped-website.com'
# opening up connection, grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
# html parsing
page_soup = soup(page_html, "html.parser")
# grabs each product
containers = page_soup.findAll("li",{"class":"item last"})
container = containers[0]
import csv
filename = "brands.csv"
with open(filename, 'w') as csvfile:
    fieldnames = ['brand', 'shipping']
    # define your delimiter
    writer = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()
    for container in containers:
        brand_container = container.findAll("h2",{"class":"product-name"})
        brand = brand_container[0].a.text
        shipping_container = container.findAll("p",{"class":"availability in-stock"})
        shipping = shipping_container[0].text.strip()
        print("brand: " + brand)
        print("shipping: " + shipping)
As I mentioned, this code does not work. What am I doing wrong?
Answer 0 (score: 1)
You are better off using Python's csv module for this:
import csv
filename = "brands.csv"
with open(filename, 'w') as csvfile:
    fieldnames = ['brand', 'shipping']
    # define your delimiter
    writer = csv.DictWriter(csvfile, delimiter=',', fieldnames=fieldnames)
    writer.writeheader()
    # write rows...
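Filled in with the writerow call that the "# write rows..." comment points at, a self-contained sketch could look like the following. The sample rows here are made up to stand in for the scraped values; in the real script each dict would be built from one product container:

```python
import csv

filename = "brands.csv"

# Hypothetical rows standing in for the scraped brand/shipping values.
rows = [
    {"brand": "ExampleBrand", "shipping": "In stock, ships in 2 days"},
    {"brand": "OtherBrand", "shipping": "In stock"},
]

with open(filename, "w", newline="") as csvfile:
    fieldnames = ["brand", "shipping"]
    writer = csv.DictWriter(csvfile, delimiter=",", fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)  # one dict per CSV row

# Read the file back to confirm each value landed in its own column.
with open(filename, newline="") as csvfile:
    data = list(csv.reader(csvfile))
print(data)
```

Note the `newline=""` argument when opening the file: the csv module handles line endings itself, and omitting it can produce blank rows on Windows.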
Answer 1 (score: 0)
Try wrapping the values in double quotes, e.g.
f.write('"'+brand + '","' + shipping + '"\n')
That said, there are better ways to handle this common task.
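One such better way: the csv module quotes fields automatically whenever a value contains the delimiter, so no manual quoting is needed. A minimal sketch with made-up values:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# The second field contains a comma, which would break naive string
# concatenation; csv.writer wraps it in quotes automatically.
writer.writerow(["SomeBrand", "In stock, ships tomorrow"])

print(buf.getvalue())
```

The resulting line is `SomeBrand,"In stock, ships tomorrow"`, which spreadsheet software parses back into exactly two columns.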
Answer 2 (score: 0)
You can pick either of the two approaches below. Since the URL in your script is not available, I have used a working one instead.
import csv
import requests
from bs4 import BeautifulSoup
url = "https://yts.am/browse-movies"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')
with open("movieinfo.csv", 'w', newline="") as f:
    writer = csv.DictWriter(f, ['name', 'year'])
    writer.writeheader()
    for row in soup.select(".browse-movie-bottom"):
        d = {}
        d['name'] = row.select_one(".browse-movie-title").text
        d['year'] = row.select_one(".browse-movie-year").text
        writer.writerow(d)
Or you can try the following:
soup = BeautifulSoup(response.content, 'lxml')
with open("movieinfo.csv", 'w', newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['name','year'])
    for row in soup.select(".browse-movie-bottom"):
        name = row.select_one(".browse-movie-title").text
        year = row.select_one(".browse-movie-year").text
        writer.writerow([name,year])