Here is my code:
import requests, re, pandas, csv
from bs4 import BeautifulSoup
r=requests.get("http://www.hltv.org/?pageid=188&statsfilter=2816&offset=0")
c=r.content
table=BeautifulSoup(c,"html.parser")
for row in table.find_all('div', style=re.compile(r'width:606px;height:22px;background-color')):
    data = row.get_text(strip=True, separator=',')
    print(data)
Here is the scraped output:
5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),nuke,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),overpass,IEM Katowice 2017
5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),nuke,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),train,IEM Katowice 2017
4/3 17,Immortals (10),FaZe (16),mirage,IEM Katowice 2017
2/3 17,Virtus.pro (14),Heroic (16),nuke,IEM Katowice 2017
2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017
2/3 17,SK (16),North (8),cbble,IEM Katowice 2017
2/3 17,Cloud9 (12),North (16),cbble,IEM Katowice 2017
2/3 17,Natus Vincere (12),Heroic (16),overpass,IEM Katowice 2017
2/3 17,Virtus.pro (16),SK (14),inferno,IEM Katowice 2017
What is a good way to build a pandas.DataFrame from this output?
Answer 0 (score: 1)
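You can use the pandas.read_csv function. If for some reason you do not want to write the string to an actual file, you can wrap it in a StringIO object so that pandas treats it as a file.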
import pandas as pd
from io import StringIO
csv_string = '''
5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),nuke,IEM Katowice 2017
5/3 17,Astralis (16),FaZe (12),overpass,IEM Katowice 2017
5/3 17,FaZe (16),Astralis (9),cache,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),nuke,IEM Katowice 2017
4/3 17,Astralis (16),Heroic (12),train,IEM Katowice 2017
4/3 17,Immortals (10),FaZe (16),mirage,IEM Katowice 2017
2/3 17,Virtus.pro (14),Heroic (16),nuke,IEM Katowice 2017
2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017
2/3 17,SK (16),North (8),cbble,IEM Katowice 2017
2/3 17,Cloud9 (12),North (16),cbble,IEM Katowice 2017
2/3 17,Natus Vincere (12),Heroic (16),overpass,IEM Katowice 2017
2/3 17,Virtus.pro (16),SK (14),inferno,IEM Katowice 2017
'''
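# Wrap the string in a file-like object that read_csv can consume.
csv_string_io = StringIO(csv_string)
# header=None because the data has no header row; otherwise the first match would become the column names.
frame = pd.read_csv(csv_string_io, header=None)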
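If you would rather skip the intermediate CSV string, the rows can also go straight into a DataFrame. The following is only a sketch under a few assumptions: the column names and the rows_to_frame helper are made up for illustration (the scraped output has no header), no field contains a comma of its own, and every team field looks like "Name (score)". In the scraping loop you would append each data string to a list instead of printing it, then pass that list in.

```python
import pandas as pd

# Hypothetical column names; the scraped output itself has no header row.
COLUMNS = ["date", "team1", "team2", "map", "event"]


def rows_to_frame(lines):
    """Build a DataFrame from comma-separated lines like the scraped output."""
    # Skip blank lines and split each remaining line into its five fields.
    records = [line.split(",") for line in lines if line.strip()]
    frame = pd.DataFrame(records, columns=COLUMNS)
    # Split values such as "Astralis (16)" into a team name and an integer score.
    for col in ("team1", "team2"):
        parts = frame[col].str.extract(r"^(?P<name>.+?)\s*\((?P<score>\d+)\)$")
        frame[col] = parts["name"]
        frame[col + "_score"] = parts["score"].astype(int)
    return frame


sample = [
    "5/3 17,Astralis (16),FaZe (13),inferno,IEM Katowice 2017",
    "2/3 17,Cloud9 (6),Natus Vincere (16),mirage,IEM Katowice 2017",
]
frame = rows_to_frame(sample)
print(frame)
```

Keeping the scores as integers in their own columns makes it easy to filter or aggregate later, for example to compare team1_score against team2_score to find the winner of each map.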