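I am trying to use cmdline.execute to store a page's HTML in a variable named response, as shown in the code below. It does not work: nothing gets stored, and the program breaks off in the scrapy shell. Can anyone tell me how to store the raw HTML in a variable?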
import scrapy
from scrapy import cmdline

linkedinnurl = "https://stackoverflow.com/users/5597065/adnan-s?tab=profile"
response = cmdline.execute("scrapy shell https://stackoverflow.com/users/5597065/adnan-s?tab=profile".split())
print(response)
Answer 0 (score: 1)
You can get at the raw HTML from inside a spider's parse callback, where it is available as res.body.
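The spider below writes each response body to a file. Here dynamic_file_name_function stands for whatever function you use to build a file name from the URL; if you do not need a dynamic file name, just pass a fixed name to open() instead: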
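import scrapy

class MySpider(scrapy.Spider):
    name = "my_spider"  # assumed name; Scrapy requires every spider to have one
    def parse(self, res):
        # res.body is bytes, so open the file in binary mode ("wb")
        with open(dynamic_file_name_function(res.url), "wb") as f:  # placeholder helper
            f.write(res.body)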
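
The answer leaves dynamic_file_name_function undefined, so here is one possible sketch of it. This is only an assumption about what such a helper might do: it turns the URL's host, path and query into a filesystem-safe name ending in .html. The naming scheme and the "index" fallback are my own choices, not something from the original answer.

```python
import re
from urllib.parse import urlparse


def dynamic_file_name_function(url: str) -> str:
    """Build a filesystem-safe .html file name from a URL (illustrative only)."""
    parsed = urlparse(url)
    # Combine host and path; append the query so different tabs get different files.
    raw = f"{parsed.netloc}{parsed.path}"
    if parsed.query:
        raw += f"_{parsed.query}"
    # Collapse every run of characters outside [A-Za-z0-9.-] into one underscore.
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", raw).strip("_")
    # Fall back to a fixed name when nothing usable is left.
    return f"{safe or 'index'}.html"
```

With a helper like this, each crawled URL maps to its own file, so pages from the same site do not overwrite each other as long as their paths or queries differ.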