Question

Given the following URL:

foo.bar.car.com.au

I need to extract foo.bar.

I came across the following code:

private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;
        if (host.Split('.').Length > 2)
        {
            int lastIndex = host.LastIndexOf(".");
            int index = host.LastIndexOf(".", lastIndex - 1);
            return host.Substring(0, index);
        }
    }         
    return null;     
}

This gives me foo.bar.car. I want foo.bar. Should I just use Split and take elements 0 and 1?

But then there might be a leading www.

Is there an easy way to do this?

Answer 1

Given your requirement (you want the first two levels, not including 'www.'), I'd approach it something like this:

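private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;

        var nodes = host.Split('.');
        // Skip a leading "www" label if present.
        int startNode = 0;
        if (nodes[0] == "www") startNode = 1;

        // Guard against hosts with fewer than two labels after the optional "www".
        if (nodes.Length < startNode + 2) return null;

        return string.Format("{0}.{1}", nodes[startNode], nodes[startNode + 1]);
    }

    return null;
}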

Answer 2

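I had a similar problem and, building on the previous answers, wrote this extension method. Most importantly, it takes a parameter that defines the "root" domain, i.e. whatever the consumer of the method considers to be the root. In the OP's case, the call would be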

var uri = new Uri("http://foo.bar.car.com.au"); // a string cannot be assigned to a Uri directly
uri.DnsSafeHost.GetSubdomain("car.com.au"); // returns foo.bar
uri.DnsSafeHost.GetSubdomain(); // returns foo.bar.car

Here is the extension method:

/// <summary>Gets the subdomain portion of a url, given a known "root" domain</summary>
public static string GetSubdomain(this string url, string domain = null)
{
  var subdomain = url;
  if(subdomain != null)
  {
    if(domain == null)
    {
      // Since we were not provided with a known domain, assume that second-to-last period divides the subdomain from the domain.
      var nodes = url.Split('.');
      var lastNodeIndex = nodes.Length - 1;
      if(lastNodeIndex > 0)
        domain = nodes[lastNodeIndex-1] + "." + nodes[lastNodeIndex];
      else
        return null; // A single-label host has no subdomain.
    }

    // Verify that what we think is the domain is truly the ending of the hostname... otherwise we're hooped.
    if (!subdomain.EndsWith(domain, StringComparison.OrdinalIgnoreCase))
      throw new ArgumentException("Site was not loaded from the expected domain");

    // Cut the domain off the end only (Replace would also strip earlier occurrences).
    // This should leave us with the subdomain and a trailing dot IF there is a subdomain.
    subdomain = subdomain.Substring(0, subdomain.Length - domain.Length);
    // Check if we have anything left.  If we don't, there was no subdomain, the request was directly to the root domain:
    if (string.IsNullOrWhiteSpace(subdomain))
      return null;

    // Quash any trailing periods
    subdomain = subdomain.TrimEnd(new[] {'.'});
  }

  return subdomain;
}

Answer 3

You can use the NuGet package Nager.PublicSuffix.

PM> Install-Package Nager.PublicSuffix

Example

var domainParser = new DomainParser();
var data = await domainParser.LoadDataAsync();
var tldRules = domainParser.ParseRules(data);
domainParser.AddRules(tldRules);

var domainName = domainParser.Get("sub.test.co.uk");
//domainName.Domain = "test";
//domainName.Hostname = "sub.test.co.uk";
//domainName.RegistrableDomain = "test.co.uk";
//domainName.SubDomain = "sub";
//domainName.TLD = "co.uk";

Answer 4

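OK, first things first. Are you looking specifically at 'com.au', or are these general Internet domain names? If it's the latter, there is simply no automatic way to determine how much of the name is the "site" or "zone" and how much is an individual "host" or other record within that zone.

If you need to work this out for arbitrary domain names, you will want to grab the list of TLDs from the Mozilla Public Suffix project (http://publicsuffix.org) and use their algorithm to find the TLD in your domain name. You can then assume that the part you want ends with the last label immediately before the TLD.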
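As a rough illustration of that idea, here is a minimal sketch that matches the longest known suffix and returns whatever sits to the left of the label just before it. The class name SubdomainSketch and the tiny hard-coded KnownSuffixes set are assumptions made for this example only; the real Public Suffix List is far larger and also has wildcard and exception rules that this sketch does not handle.

```csharp
using System;
using System.Collections.Generic;

public static class SubdomainSketch
{
    // Hypothetical, tiny stand-in for the real Public Suffix List.
    private static readonly HashSet<string> KnownSuffixes =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "com", "au", "com.au", "uk", "co.uk"
        };

    // Returns the labels to the left of the registrable domain,
    // or null when there are none or no known suffix matches.
    public static string GetSubdomain(string host)
    {
        string[] labels = host.Split('.');

        // Try candidate suffixes from longest to shortest so "com.au" wins over "au".
        // Starting at 1 keeps at least one label in front of the suffix.
        for (int i = 1; i < labels.Length; i++)
        {
            string candidate = string.Join(".", labels, i, labels.Length - i);
            if (!KnownSuffixes.Contains(candidate)) continue;

            // labels[i - 1] is the registrable label (e.g. "car");
            // everything before it is the subdomain.
            int subdomainLabelCount = i - 1;
            return subdomainLabelCount == 0
                ? null
                : string.Join(".", labels, 0, subdomainLabelCount);
        }

        return null;
    }
}
```

With this suffix set, SubdomainSketch.GetSubdomain("foo.bar.car.com.au") returns "foo.bar", and a bare "car.com.au" returns null.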

Answer 5

private static string GetSubDomain(Uri url)
{
    if (url.HostNameType == UriHostNameType.Dns)
    {
        string host = url.Host;
        string[] subDomains = host.Split('.');
        // Need at least two labels to build "first.second".
        if (subDomains.Length >= 2)
            return subDomains[0] + "." + subDomains[1];
    }
    return null;
}

Answer 6

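I would suggest using a regular expression. The following snippet should extract what you're looking for...

string input = "foo.bar.car.com.au";
// Match only the first two labels; a third \.\w* group would return foo.bar.car.
var match = Regex.Match(input, @"^\w+\.\w+");
var output = match.Value; // foo.bar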
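The question also mentions a possible leading www. A hedged variant of the regular-expression approach that skips it, allows hyphens in labels, and captures just the first two labels might look like the sketch below. The names RegexSketch and FirstTwo are hypothetical, and the pattern performs no validation against real domain rules.

```csharp
using System.Text.RegularExpressions;

public static class RegexSketch
{
    // Skips an optional leading "www." and captures the next two labels.
    private static readonly Regex FirstTwoLabels =
        new Regex(@"^(?:www\.)?([a-z0-9-]+\.[a-z0-9-]+)", RegexOptions.IgnoreCase);

    // Returns the first two labels of the host, or null if the host has fewer than two.
    public static string FirstTwo(string host)
    {
        Match match = FirstTwoLabels.Match(host);
        return match.Success ? match.Groups[1].Value : null;
    }
}
```

Both RegexSketch.FirstTwo("foo.bar.car.com.au") and RegexSketch.FirstTwo("www.foo.bar.car.com.au") return "foo.bar".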

Answer 7

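In addition to the Nager.PublicSuffix NuGet package mentioned in another answer, there is also the Louw.PublicSuffix NuGet package which, according to its GitHub project page, is a .NET Core library that parses the Public Suffix list. It is based on the Nager.PublicSuffix project, with the following changes: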

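Ported to a .NET Core library.
Fixed the library so that it passes all of the comprehensive tests.
Refactored classes to split functionality into smaller, focused classes.
Made classes immutable, so DomainParser can be used as a singleton and is thread safe.
Added WebTldRuleProvider and FileTldRuleProvider.
Added the ability to tell whether a rule is an ICANN or a private domain rule.
Uses the async programming model.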

The page also states that many of the above changes were submitted back to the original Nager.PublicSuffix project.

Get specific subdomain from URL in foo.bar.car.com
