Question

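Edit: here is a sample of the XML document I am trying to parse: http://us.battle.net/wow/en/forum/1011699/ (view source).

These are the items I want to retrieve:

Title (tbody / tr / td / a)
Author (tbody / tr / td)
Url (also stored in the author node)
Date (tbody / tr / td / div / div)
Replies (tbody / tr / td)
Views (also stored in the node above)

I run a "pre-query" so that I don't have to traverse the whole document for each subsequent query: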

var threads =
    from allThreads in xmlThreadList.Descendants(ns + "tbody")
                                    .Descendants(ns + "tr")
                                    .Descendants(ns + "td")
    select allThreads;

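I have an XML document that represents a list of forum threads. Each thread has several child nodes that hold the different pieces of information I want to retrieve. At the moment I do this by querying the XML document several times. Is there a way to extract this information in a single query and store it in an IEnumerable? The way I'm doing it now seems inefficient.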

    // array of xelements that contain the title and url
    var threadTitles =
        (from allThreads in threads.Descendants(ns + "a")
        where allThreads.Parent.Attribute("class").Value.Equals("post-title")
        select allThreads).ToArray();

    // array of strings of author names
    var threadAuthors =
        (from allThreads in threads
        where allThreads.Attribute("class").Value.Equals("post-author")
        select allThreads.Value.Trim()).ToArray();

    // ...
    // there are several more queries like this
    // ...

    // for loop to populate a list with all the extracted data
    for (int i = 0, j = 0; i < threadTitles.Length; i++, j++)
    {
        ThreadItem threadItem = new ThreadItem();

        threadItem.Title = threadTitles[i].Value.Trim();
        threadItem.Author = threadAuthors[i];
        threadItem.Url = Path.Combine(_url, threadTitles[i].Attribute("href").Value);
        threadItem.Date = threadDates[i];
        threadItem.Replies = threadRepliesAndViews[j++];
        threadItem.Views = threadRepliesAndViews[j];
        _threads.Add(threadItem);
    }

Any suggestions would be greatly appreciated. I'm new to the whole LINQ to XML scene.

Answer 1

Hope this helps:

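// Requires: using System; using System.Linq; using System.Xml.Linq;
XNamespace ns = "http://www.w3.org/1999/xhtml";

// XDocument.Load only succeeds if the page is well-formed XML (XHTML).
var doc = XDocument.Load("http://us.battle.net/wow/en/forum/1011699/");

// One pass over the rows; every field is read from the cells of the current row.
// The (string) cast yields null instead of throwing when a cell has no class attribute.
var threads = from tr in doc.Descendants(ns + "tbody").Elements(ns + "tr")
              let elements = tr.Elements(ns + "td")
              let title = elements.First(a => (string)a.Attribute("class") == "post-title").Element(ns + "a")
              let author = elements.First(a => (string)a.Attribute("class") == "post-author")
              let replies = elements.First(a => (string)a.Attribute("class") == "post-replies")
              let views = elements.First(a => (string)a.Attribute("class") == "post-views")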
              select new
              {
                  Title = title.Value.Trim(),
                  Url = title.Attribute("href").Value.Trim(),
                  Author = author.Value.Trim(),
                  Replies = int.Parse(replies.Value),
                  Views = int.Parse(views.Value)
              };

foreach (var item in threads)
{
    Console.WriteLine(item);
}

Console.ReadLine();
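
If you want to try the single-query idea without hitting the live page, here is a minimal sketch that runs against an in-memory sample. The sample markup, the class names in it, and the helper `CellByClass` are assumptions made up for illustration; the real forum page may be structured differently. It uses `FirstOrDefault` and the nullable casts that `XElement` provides, so a row with a missing cell produces a null field rather than an exception.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample markup; the real forum page may be structured differently.
const string sampleXhtml = @"<html xmlns=""http://www.w3.org/1999/xhtml""><body><table><tbody>
  <tr>
    <td class=""post-title""><a href=""../topic/1""> First thread </a></td>
    <td class=""post-author""> Alice </td>
    <td class=""post-replies"">3</td>
    <td class=""post-views"">120</td>
  </tr>
  <tr>
    <td class=""post-title""><a href=""../topic/2"">Second thread</a></td>
    <td class=""post-author"">Bob</td>
    <td class=""post-replies"">0</td>
    <td class=""post-views"">7</td>
  </tr>
</tbody></table></body></html>";

XNamespace xhtml = "http://www.w3.org/1999/xhtml";
XDocument sampleDoc = XDocument.Parse(sampleXhtml);

// Hypothetical helper: first <td> in the row with the given class, or null if there is none.
XElement CellByClass(XElement row, string cssClass) =>
    row.Elements(xhtml + "td")
       .FirstOrDefault(td => (string)td.Attribute("class") == cssClass);

// One query: each row is visited once and all fields are read from that row.
var sampleThreads =
    (from row in sampleDoc.Descendants(xhtml + "tbody").Elements(xhtml + "tr")
     let link = CellByClass(row, "post-title")?.Element(xhtml + "a")
     where link != null // skip rows that do not look like thread rows
     select (
         Title: link.Value.Trim(),
         Url: (string)link.Attribute("href"),
         Author: CellByClass(row, "post-author")?.Value.Trim(),
         Replies: (int?)CellByClass(row, "post-replies"), // null if the cell is missing
         Views: (int?)CellByClass(row, "post-views")
     )).ToList();

foreach (var t in sampleThreads)
{
    Console.WriteLine($"{t.Title} | {t.Author} | {t.Url} | {t.Replies} replies | {t.Views} views");
}
```

The tuple could just as well be the question's `ThreadItem` class; tuples are used here only to keep the sketch self-contained.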

Answer 2

Try something like:

from thread in threads
select new ThreadItem
{
    // .Value.Trim() turns the matched <a> element into the title string
    Title = thread.Descendants(ns + "a")
                  .First(title => title.Parent.Attribute("class").Value.Equals("post-title"))
                  .Value.Trim(),
    // Date = ... (date query part)
    // etc.
}

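This should gain some speed, because you no longer go over the entire XML block again and again. Instead you look at each smaller thread only a few times, extracting a different piece of information each time.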

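I'd be interested to know which one is faster, because you are effectively making a trade. With the new code you hope that a whole thread element fits in cache, so that all the small queries you run on it get fast access. With your old code you hope that the branch predictor on your CPU tunes itself to each long query as it executes and gives you better speed that way.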
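
One rough way to find out is to time both shapes of query with `Stopwatch` on a synthetic document. This is only a sketch: the document structure, the row count, and the helper names `MultiPass`, `SinglePass` and `Time` are assumptions for illustration, and a quick Stopwatch loop like this is much less reliable than a proper benchmarking harness, so treat the numbers as a hint at best.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Xml.Linq;

XNamespace xhtml = "http://www.w3.org/1999/xhtml";

// Build a synthetic document with many rows (hypothetical structure and size).
const int rowCount = 5000;
var benchDoc = new XDocument(
    new XElement(xhtml + "table",
        new XElement(xhtml + "tbody",
            Enumerable.Range(0, rowCount).Select(i =>
                new XElement(xhtml + "tr",
                    new XElement(xhtml + "td", new XAttribute("class", "post-title"),
                        new XElement(xhtml + "a", new XAttribute("href", "topic/" + i), "Thread " + i)),
                    new XElement(xhtml + "td", new XAttribute("class", "post-author"), "Author " + i))))));

// Approach A: one full pass over all cells per field, as in the question.
int MultiPass()
{
    var cells = benchDoc.Descendants(xhtml + "tbody").Elements(xhtml + "tr").Elements(xhtml + "td");
    var titles = cells.Where(td => (string)td.Attribute("class") == "post-title")
                      .Select(td => td.Value.Trim()).ToArray();
    var authors = cells.Where(td => (string)td.Attribute("class") == "post-author")
                       .Select(td => td.Value.Trim()).ToArray();
    return Math.Min(titles.Length, authors.Length);
}

// Approach B: one pass over the rows, reading every field from the current row.
int SinglePass()
{
    return (from row in benchDoc.Descendants(xhtml + "tbody").Elements(xhtml + "tr")
            let tds = row.Elements(xhtml + "td")
            select (
                Title: tds.First(td => (string)td.Attribute("class") == "post-title").Value.Trim(),
                Author: tds.First(td => (string)td.Attribute("class") == "post-author").Value.Trim()
            )).ToList().Count;
}

// Time `action` over several repetitions, after one warm-up call so JIT time is not measured.
TimeSpan Time(Func<int> action, int repetitions = 20)
{
    action();
    var sw = Stopwatch.StartNew();
    for (int r = 0; r < repetitions; r++) action();
    sw.Stop();
    return sw.Elapsed;
}

Console.WriteLine($"multi-pass:  {Time(MultiPass).TotalMilliseconds:F1} ms");
Console.WriteLine($"single-pass: {Time(SinglePass).TotalMilliseconds:F1} ms");
```

With only two fields the difference may be small; the question runs several more queries, so adding more fields to both functions would make the comparison closer to the real case.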

LINQ to XML Multiple Selects
