Scraping images with BeautifulSoup in Python

Time: 2018-11-21 13:59:59

Tags: python beautifulsoup with-statement

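I am trying to scrape and download images from a website using BeautifulSoup. I have already scraped a list of links, stored in imgVal, and the code then creates a new directory to store the images. My problem is that the code downloads only one image from the list of links. I want to download all of them. How can I do that?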

import java.util.List;
import java.util.Map;
import javax.enterprise.context.ApplicationScoped;
import javax.faces.component.UIComponent;
import javax.faces.component.ValueHolder;
import javax.faces.context.FacesContext;
import javax.inject.Named;
import org.apache.commons.lang3.StringUtils;
import org.primefaces.component.api.UIColumn;
import org.primefaces.component.repeat.UIRepeat;
import org.primefaces.util.ComponentUtils;

@Named
@ApplicationScoped
public class ColumnExporter {

    public String export(final UIColumn column) {
        String value = StringUtils.EMPTY;
        for (final UIComponent child : column.getChildren())
            value += buildValue(child);
        return value;
    }

    private String buildValue(final UIComponent uiComponent) {
        if (uiComponent.isRendered()) {
            if (uiComponent instanceof ValueHolder) {
                return ComponentUtils.getValueToRender(FacesContext.getCurrentInstance(), uiComponent);
            } else if (uiComponent instanceof UIRepeat) {
                UIRepeat repeat = (UIRepeat) uiComponent;
                return buildRepeatValue(repeat);
            } else if (uiComponent.getChildCount() > 0) {
                String value = "";
                for (UIComponent child : uiComponent.getChildren()) {
                    value += buildValue(child);
                }
                return value;
            }
        }
        return "";
    }

    private String buildRepeatValue(UIRepeat repeat) {
        String value = "";
        String variableName = repeat.getVar();
        String statusName = repeat.getVarStatus();
        List<?> values = (List) repeat.getValueExpression("value")
                .getValue(FacesContext.getCurrentInstance().getELContext());
        for (Object valueObj : values) {
            for (UIComponent child : repeat.getChildren()) {
                String childValue = buildValue(child);
                if (StringUtils.isNotEmpty(childValue)) {
                    value += childValue + "\n";
                }
            }
        }
        return value;
    }
}

1 answer:

Answer 0: (score: 1)

Are you trying to pass several URLs to requests at once? You can't; you have to loop over them and download each one individually.

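import os
from os.path import basename

import requests

def writeImages():
    pageNumber = '001'
    filename = pageNumber + '/'
    # Create the target directory if it does not exist yet.
    os.makedirs(os.path.dirname(filename), exist_ok=True)

    # getThumbnailLinks() is assumed to be defined elsewhere and to return the list of image URLs.
    imgVal = getThumbnailLinks() # ['http://a.jpg', 'http://b.jpg']
    for imgBasename in imgVal:
        # Open a separate file for each URL inside the loop, so every image is saved.
        with open(filename + basename(imgBasename), "wb") as f:
            f.write(requests.get(imgBasename).content)

writeImages()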
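
The loop above can also be written as a small reusable function. The following is only a minimal sketch under stated assumptions: the names `local_path_for` and `download_all` and the `fetch` parameter are hypothetical and do not come from the question; `fetch` exists only so the loop can be exercised without network access. When no `fetch` is given, it falls back to `requests.get`, as in the answer.

```python
import os
from urllib.parse import urlparse


def local_path_for(url, directory):
    """Map an image URL to a file path inside `directory` (hypothetical helper)."""
    # Use only the URL's path component so a query string such as ?w=200 is dropped.
    name = os.path.basename(urlparse(url).path) or "unnamed"
    return os.path.join(directory, name)


def download_all(urls, directory, fetch=None):
    """Save every URL in `urls` under `directory` and return the paths written."""
    if fetch is None:
        import requests  # imported lazily so the loop can be tested offline

        def fetch(url):
            response = requests.get(url, timeout=10)
            response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
            return response.content

    os.makedirs(directory, exist_ok=True)
    written = []
    for url in urls:
        path = local_path_for(url, directory)
        # The file is opened inside the loop, so each URL gets its own file.
        with open(path, "wb") as f:
            f.write(fetch(url))
        written.append(path)
    return written
```

With the list from the question, a call might look like `download_all(imgVal, '001')`. Note that two URLs ending in the same file name would overwrite each other in this sketch.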