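Style attribute and its values

Time: 2019-11-06 10:08:43

Tags: c# puppeteer puppeteer-sharp

I am extracting the attribute names and values of each element in the HTML DOM. For the style attribute, I can only get the property names, not their values.

My code is as follows: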

var elementHandles = await page.QuerySelectorAllAsync("a");
foreach (var handle in elementHandles)
{
    var elementStyle = await handle.GetPropertyAsync("style");
    var style = await elementStyle.JsonValueAsync();
    var output = style.ToString();
}

This is my output:

{{
  "0": "font-family",
  "1": "font-size",
  "2": "line-height",
  "3": "color",
  "4": "text-align"
}}

This is what I expect:

font-family:Arial, Helvetica, sans-serif; 
font-size: 12px; 
line-height: 16px; 
color: #999999;
text-align: left

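1 answer:

Answer 0 (score: 0)

The problem lies in how CSSStyleDeclaration is serialized. If that is how Chromium chooses to serialize the object, there is nothing we can do about it on the C# side.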
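To illustrate why the output contains only names: a CSSStyleDeclaration is array-like, and its indexed entries hold the declared property names, which matches the "0", "1", ... keys in the question's output. The values have to be looked up per name with getPropertyValue. Below is a minimal sketch of that shape. It runs outside a browser, so makeMockStyle is a hypothetical stand-in for element.style, and styleToText is an illustrative helper name, not part of Puppeteer or the DOM.

```javascript
// Hypothetical stand-in for element.style; a real CSSStyleDeclaration comes from the browser.
function makeMockStyle(declarations) {
  const names = Object.keys(declarations);
  const style = {
    length: names.length,
    // Mirrors CSSStyleDeclaration.getPropertyValue: returns "" for unset properties.
    getPropertyValue: (name) => declarations[name] ?? "",
  };
  // Indexed entries hold only the property names, not the values.
  names.forEach((name, i) => { style[i] = name; });
  return style;
}

// Illustrative helper: walk the indexed names and look up each value.
function styleToText(style) {
  const parts = [];
  for (let i = 0; i < style.length; i++) {
    const name = style[i];
    parts.push(`${name}: ${style.getPropertyValue(name)}`);
  }
  return parts.join("; ");
}

const mock = makeMockStyle({ "font-size": "12px", "color": "#999999" });
console.log(styleToText(mock));
```

Reading values this way keeps the dashed names (font-size rather than fontSize), which is the form the question expects.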

However, we can try to work around the problem in JavaScript by using EvaluateFunctionAsync.

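foreach (var handle in elementHandles)
{
  // Evaluate in the page and return only the named, non-empty style entries.
  var style = await page.EvaluateFunctionAsync<Dictionary<string, string>>(
    "e => Object.entries(e.style).filter(i => isNaN(i[0]) && i[1]).map(i => { return { [i[0]] : i[1]}}).reduce((acc, cur) => { return {...acc, ...cur}}, {})", handle);
  // Dictionary.ToString() only prints the type name, so join the pairs explicitly (requires using System.Linq).
  var output = string.Join("; ", style.Select(kv => $"{kv.Key}: {kv.Value}"));
}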

Let's walk through the JavaScript expression:

e => // We send the HTML element instead of the style property
  Object.entries(e.style) // We get all the property/value pairs from the CSSStyleDeclaration object
    // We want to exclude numeric properties (like 0, 1, 2) and empty values
    .filter(i => isNaN(i[0]) && i[1])
    // We turn them into objects
    .map(i => { return { [i[0]] : i[1]}})
    // We merge them into one single object that will be returned to C#
    .reduce((acc, cur) => { return {...acc, ...cur}}, {})
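
The filter, map and reduce steps above can be tried outside a browser. The sketch below applies the same pipeline to a plain object that stands in for e.style. The name pickSetStyles and the sample data are hypothetical, and the keys a real browser exposes on e.style may differ from this mock.

```javascript
// Same pipeline as the expression above, as a named function (the name is illustrative only).
const pickSetStyles = (style) =>
  Object.entries(style)
    // Drop numeric keys ("0", "1", ...) and entries with an empty value.
    .filter(([key, value]) => isNaN(key) && value)
    // Turn each remaining pair into a one-property object.
    .map(([key, value]) => ({ [key]: value }))
    // Merge the one-property objects into a single result object.
    .reduce((acc, cur) => ({ ...acc, ...cur }), {});

// Plain object standing in for e.style (assumption: not a real CSSStyleDeclaration).
const sample = {
  0: "font-size",
  1: "color",
  fontSize: "12px",
  color: "#999999",
  margin: "",
};

console.log(pickSetStyles(sample));
```

The map and reduce steps could also be collapsed into a single Object.fromEntries call on the filtered pairs, which produces the same object.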