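As the result of log parsing, I have a field that contains hostnames and, occasionally, IP addresses. I need to process that field further to extract the domain from each hostname. For example, if the hostname is googleanalytics.google.com, I want to extract google.com, and I want to do it as efficiently as possible because the system processes thousands of log messages per second.
What I have so far is: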
-- Save hostname into a temporary variable
local tempMetaValue = hostname
local count = 0
local byte_char = string.byte(".")
for i = 1, #tempMetaValue do
    if string.byte(tempMetaValue, i) == byte_char then
        count = count + 1
    end
end
local dotCount = count
-- If there was only one dot do nothing
if dotCount == 1 then
    return 0
-- Check whether there were exactly two dots
elseif dotCount == 2 then
    -- Get the index of the first dot (plain find, no pattern matching)
    local beginIndex = string.find(tempMetaValue, ".", 1, true)
    -- Get the substring starting after the first dot
    local domainMeta = string.sub(tempMetaValue, beginIndex + 1)
    -- Double check that the substring exists
    if domainMeta ~= nil then
        -- Populate the domain meta field
    end
-- If there are more than two dots..
elseif dotCount > 2 then
    -- Test to see if the hostname is actually an IP address
    if tempMetaValue:match("%d%d?%d?%.%d%d?%d?%.%d%d?%d?%.%d%d?%d?") then
        -- Skip the rest if an IP address was found
        return 0
    end
    -- Get the index of the second to last dot
    -- (Lua patterns escape with %, so a literal dot is "%.", not "\.")
    local beginIndex = string.find(tempMetaValue, "%.[^.]*%.[^.]*$")
    -- Get the substring starting after the second to last dot
    local domainMeta = string.sub(tempMetaValue, beginIndex + 1)
    -- Double check that the substring exists
    if domainMeta ~= nil then
        -- Populate the domain meta field
    end
end
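I have a feeling that this may not be the fastest solution. I say "a feeling" because I had no experience with Lua before this, but it seems far too long for such a simple task.
I tried to build a solution around a split-like operation, as you would perform in e.g. Java, that leaves the last token "unsplit" and so leaves me with the part I actually want (the domain), but I got nowhere. For that solution I basically want to create as many tokens as there are dots in the hostname value, i.e. googleanalytics.google.com would be split into "googleanalytics" and "google.com".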
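For reference, the split-like idea described above could be sketched roughly as follows. This is only an illustration under my own assumptions: the helper name split_keep_domain is hypothetical, "the domain" is taken to mean the last two dot-separated labels, and IP addresses are not handled at all.

```lua
-- Hypothetical helper: split a hostname on dots, but leave the final two
-- labels joined ("unsplit") as the last token.
local function split_keep_domain(hostname)
    -- Collect every dot-separated label.
    local labels = {}
    for label in hostname:gmatch("[^.]+") do
        labels[#labels + 1] = label
    end
    local n = #labels
    if n < 2 then
        return labels -- zero or one label: nothing to join
    end
    -- Copy the leading labels, then append the last two joined by a dot.
    local tokens = {}
    for i = 1, n - 2 do
        tokens[i] = labels[i]
    end
    tokens[#tokens + 1] = labels[n - 1] .. "." .. labels[n]
    return tokens
end

local tokens = split_keep_domain("googleanalytics.google.com")
print(tokens[1], tokens[#tokens])
```

Building a table per message costs allocations, so for thousands of messages per second a single pattern match that extracts only the tail is likely to be cheaper than a full split.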
Answer 0 (score: 2)
Does something like this do what you want?
function getdomain(str)
    -- Grab just the last two dotted parts of the string.
    local domain = str:match("%.?([^.]+%.[^.]+)$")
    -- If we have dotted parts and they are all numbers then this is an IP address.
    if domain and tonumber((domain:gsub("%.", ""))) then
        return nil
    end
    return domain
end
print(getdomain("googleanalytics.google.com"))          --> google.com
print(getdomain("foobar.com"))                          --> foobar.com
print(getdomain("1.2.3.4"))                             --> nil
print(getdomain("something.else.longer.than.that.com")) --> that.com
print(getdomain("foobar"))                              --> nil
The "is it an IP address" test here is very naive and should probably be replaced with something more robust, but it serves for a quick demo.
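One way the IP test could be tightened is to check the whole string, not just the last two parts, before extracting anything. The following is a minimal sketch, not a definitive implementation: is_ipv4 and getdomain_strict are hypothetical names of my own, and only dotted-quad IPv4 addresses are considered.

```lua
-- Hypothetical helper: true only for a dotted quad whose four parts are
-- one to three digits each and no larger than 255.
local function is_ipv4(str)
    local a, b, c, d = str:match("^(%d+)%.(%d+)%.(%d+)%.(%d+)$")
    if not a then
        return false
    end
    for _, octet in ipairs({a, b, c, d}) do
        if #octet > 3 or tonumber(octet) > 255 then
            return false
        end
    end
    return true
end

-- Variant of getdomain: reject IPv4 addresses first, then grab the last
-- two dotted parts with the same pattern as above.
local function getdomain_strict(str)
    if is_ipv4(str) then
        return nil
    end
    return str:match("%.?([^.]+%.[^.]+)$")
end

print(getdomain_strict("googleanalytics.google.com"))
print(getdomain_strict("1.2.3.4"))
```

Note one behavioural difference: a name whose last two parts happen to be numeric but which is not a full dotted quad is no longer rejected, so whether that is desirable depends on what appears in your logs.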