Here is a sample of the input log:
122.161.182.200 - Joe [21/Jul/2009:13:14:17 -0700] "GET /rss.pl HTTP/1.1"
200 35942 "-" "IE/4.0 (compatible; MSIE 7.0; Windows NT 6.0;
Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.21022; InfoPath.2;
.NET CLR 3.5.30729; .NET CLR 3.0.30618; OfficeLiveConnector.1.3;
OfficeLivePatch.1.3; MSOffice 12)"
The Pig script is:
raw_logs = LOAD 'apacheLog.log' USING TextLoader AS (line:chararray);
logs_base = FOREACH raw_logs GENERATE FLATTEN (REGEX_EXTRACT_ALL
(line,'^([\\d.]+) (\\S+) (\\S+) \\[([\\w:/]+\\s[+\\-]\\d{4})\\] "(.+?)"
(\\d{3}) (\\d+)"([^"]+)" "([^"]+)"') ) AS (remoteAddr: chararray,
remoteLogname: chararray, user: chararray, time: chararray,
request: chararray, status: int, bytes_string: chararray,
referrer: chararray, browser: chararray);
dump logs_base;
When I dump logs_base, I get empty parentheses as output:
OUTPUT () () () ()
Please help! Thanks in advance.
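
One way to narrow this down is to test the pattern against the sample line outside Pig. The Python sketch below is only a rough check, not a confirmed diagnosis, and it rests on several assumptions: each log record sits on one physical line in apacheLog.log (the wraps in the post are display artifacts), the line break inside the posted pattern stands for a single space, and REGEX_EXTRACT_ALL only returns groups when the whole line matches, which re.fullmatch imitates here. The names LINE, POSTED, WITH_SPACE and extract_all are made up for this illustration.

```python
import re

# Sample record from the post, joined into one physical line
# (assumption: the line wraps in the post are display artifacts).
LINE = (
    '122.161.182.200 - Joe [21/Jul/2009:13:14:17 -0700] '
    '"GET /rss.pl HTTP/1.1" 200 35942 "-" '
    '"IE/4.0 (compatible; MSIE 7.0; Windows NT 6.0; '
    'Trident/4.0; SLCC1; .NET CLR 2.0.50727; .NET CLR 3.5.21022; InfoPath.2; '
    '.NET CLR 3.5.30729; .NET CLR 3.0.30618; OfficeLiveConnector.1.3; '
    'OfficeLivePatch.1.3; MSOffice 12)"'
)

# Pattern as posted, with Pig's doubled backslashes written as single ones
# and the line wrap read as one space (assumption).
POSTED = (
    r'^([\d.]+) (\S+) (\S+) \[([\w:/]+\s[+\-]\d{4})\] "(.+?)" '
    r'(\d{3}) (\d+)"([^"]+)" "([^"]+)"'
)

# Hypothetical variant: one extra space between the byte count
# and the opening quote of the referrer field.
WITH_SPACE = POSTED.replace(r'(\d+)"', r'(\d+) "')


def extract_all(pattern, line):
    """Return all captured groups if the pattern matches the whole line, else None."""
    match = re.fullmatch(pattern, line)
    return match.groups() if match else None


print(extract_all(POSTED, LINE))      # prints None
print(extract_all(WITH_SPACE, LINE))  # prints a tuple of nine fields
```

Under these assumptions the posted pattern does not match the sample line, because there is no space between (\d+) and the quote that opens the referrer field, while the log has one after the byte count. That would be consistent with the empty tuples, but if the records in the file really are split across several lines, TextLoader would hand Pig each fragment separately and the pattern would fail for that reason too.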