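What Python code can be used to scrape an aspx page?

Time: 2020-03-17 22:46:19

Tags: asp.net python-3.x web-scraping beautifulsoup scrapy

My requirement is to pass an mcode to the aspx web query below and then print the resulting web page as a PDF.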

https://wwww.abcd.com/xyz/subject.aspx?mcode=99999

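In the URL above, the only variable is 99999. So my goal is to pass a different mcode each time and then print the generated aspx page as a PDF. Please excuse my simplified language, as I am a beginner.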

1 answer:

Answer 0: (score: 0)

  1. Use an HTTP client library to fetch the page, using the URL as posted.
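  2. Use https://weasyprint.org/ to print the HTML to PDF.
  3. This is not really scraping. Scraping is the process of parsing and further processing the content of a web page. If you need that, look up the beautifulsoup4 Python package.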
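
The first two steps above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the standard-library `urllib` stands in for whichever HTTP client you prefer, the function names `build_url`, `fetch_html` and `save_pdf` are made up for illustration, and the URL is the placeholder from the question. It also assumes the page is plain server-rendered HTML, since WeasyPrint does not execute JavaScript.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder address copied from the question; replace with the real one.
BASE_URL = "https://wwww.abcd.com/xyz/subject.aspx"


def build_url(mcode):
    """Return the page URL for one mcode value (hypothetical helper)."""
    return f"{BASE_URL}?{urlencode({'mcode': mcode})}"


def fetch_html(url, timeout=30):
    """Download the page and decode it to text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def save_pdf(mcode, out_path):
    """Fetch the page for mcode and render it to a PDF file."""
    # Imported here so the module still loads where WeasyPrint is missing.
    from weasyprint import HTML

    url = build_url(mcode)
    html = fetch_html(url)
    # base_url lets WeasyPrint resolve relative CSS and image links.
    HTML(string=html, base_url=url).write_pdf(out_path)
```

With this in place, something like `save_pdf(99999, "subject_99999.pdf")` would write one PDF, and a loop over a list of mcodes would cover the "pass an mcode each time" part. If the site requires a login or builds its content with JavaScript, this approach would likely need a different tool.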