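I have a function that parses an HTML body to extract Open Graph attributes, shown below.
I'm not sure how to use Stream so that the parsing happens only once. Is that even possible?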
def og(body) do
  image = attribute_content(body, "meta[property=og:image]")
  title = attribute_content(body, "meta[property=og:title]")
  site_name = attribute_content(body, "meta[property=og:site_name]")
  desc = attribute_content(body, "meta[property=og:description]")
  type = attribute_content(body, "meta[property=og:type]")
  url = attribute_content(body, "meta[property=og:url]")
  author = attribute_content(body, "meta[name=author]")
  # site_title was never bound above, so it is left out of the map.
  %{image: image, title: title, type: type,
    url: url, site_name: site_name,
    description: desc, author: author}
end
# Parse the HTML body for the target element and return the value of its
# "content" attribute (nil when there is no match).
defp attribute_content(body, target) do
  Floki.find(body, target) |> Floki.attribute("content") |> List.first()
end
Answer 0 (score: 2)
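From your question, I assume body is a String that you want to parse only once. If so, Floki.parse/1 parses the body into a list. Floki.find/2 accepts that list as its argument instead of the String containing the HTML.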
(...)
parsed = Floki.parse(body)
image = attribute_content(parsed, "meta[property=og:image]")
(...)
You can also create a list containing all the attributes:
attributes = [image: "meta[property=og:image]",
              title: "meta[property=og:title]",
              site_name: "meta[property=og:site_name]",
              description: "meta[property=og:description]",
              type: "meta[property=og:type]",
              url: "meta[property=og:url]",
              author: "meta[name=author]"]
Then map the function attribute_content/2 over it and convert the Keyword list into a Map:
attributes
|> Stream.map(fn {k, v} -> {k, attribute_content(parsed, v)} end)
|> Enum.into(%{})
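Note that Stream.map/2 is lazy and only does its work once Enum.into/2 consumes it, so for a list this small an eager Map.new/2 should give the same result in a single pass. The snippet below is a minimal sketch of that comparison; `lookup` is a hypothetical stand-in for attribute_content/2 so that it runs without Floki, and the shortened attribute list is only for illustration.

```elixir
# Hypothetical stand-in for attribute_content/2 (assumption: it returns a string
# per selector); the real function would query the parsed HTML instead.
lookup = fn _parsed, selector -> "content of " <> selector end

attributes = [image: "meta[property=og:image]", author: "meta[name=author]"]

# Lazy version: Stream.map/2 builds a recipe, Enum.into/2 runs it.
lazy =
  attributes
  |> Stream.map(fn {k, v} -> {k, lookup.(nil, v)} end)
  |> Enum.into(%{})

# Eager version: Map.new/2 transforms and collects in one call.
eager = Map.new(attributes, fn {k, v} -> {k, lookup.(nil, v)} end)

# Both approaches produce the same map.
true = lazy == eager
```

Either form works here; the Stream version only pays off when the pipeline has several steps or the input is large.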
So the complete code is:
def og(html) do
  attributes = [image: "meta[property=og:image]",
                title: "meta[property=og:title]",
                site_name: "meta[property=og:site_name]",
                description: "meta[property=og:description]",
                type: "meta[property=og:type]",
                url: "meta[property=og:url]",
                author: "meta[name=author]"]
  general(html, attributes)
end

defp general(html, attributes) do
  # Parse the HTML once and reuse the parsed tree for every selector.
  parsed = Floki.parse(html)
  attributes
  |> Stream.map(fn {k, v} -> {k, attribute_content(parsed, v)} end)
  |> Enum.into(%{})
end

defp attribute_content(parsed, target) do
  # Use the parsed tree passed in as the first argument.
  Floki.find(parsed, target)
  |> Floki.attribute("content")
  |> List.first()
end
I hope this answers your question.