Processing a log file using Pig

Time: 2016-09-06 15:15:02

Tags: apache-pig bigdata

I have the following log. Can anyone tell me how to process it using Pig Latin?

**

SYSTEM IP:192.168.68.78 
Distro info:Red Hat Enterprise Linux Server release 6.6 (Santiago)
Kernel:Linux bugzilla-blr-in 2.6.32-504.16.2.el6.x86_64 #1 SMP Tue Mar 10 17:01:00 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
Uptime:12:27:42 up 8 days, 17:57,  0 users,  load average: 0.00, 0.00, 0.00
Memory:Total:1869Mb Memory:Used:1567Mb  Memory:Free:302Mb
Swap:Total:1999Mb   Swap:Used:0Mb   Swap:Free: 1999Mb
Architecture:x86_64
  Processor:0:Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
Date:Wed Jun 29 12:27:42 IST 2016

SCRIPT USER
User:aimsadm (uid:503)
Groups:aimsadm
Working dir:/home/aimsadm
Home dir:/home/aimsadm

NETWORK DETAILS
Hostname:bugzilla-blr-in
IP (    ):127.0.0.1/8
IP (eth0):192.168.68.78/24
Gateway:192.168.68.1
Name Server:8.8.8.8
Name Server:192.168.68.80

LIST OF USERS:sdudam,sudutha,djegathesa,aimsadm,krishnang,

CLAMD STATUS: CLAM AV service is stopped or not installed

NAGIOS STATUS: Nagios service is running

OSSEC STATUS: Ossec service is stopped or not installed

NTPD STATUS: NTP service is running

HARDENING STATUS:Hardening Done

AD INTEGRATION STATUS:AD Integration Not Done

HARDWARE/PLATFORM DETAILS
Hardware Platform:64Bit
Hardware Info :DMI 2.3 present.
DMI: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012

OS DETAILS
Red Hat Enterprise Linux Server release 6.6 (Santiago)
Linux bugzilla-blr-in 2.6.32-504.16.2.el6.x86_64 #1 SMP Tue Mar 10 17:01:00 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

CPU INFO
model name  : Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz

MEMORY INFO
MemTotal:        1914776 kB
RAM:1 GB

HARD DISK DETAILS

MOUNT DETAILS
Filesystem:/dev/mapper/vg_bugzillablrin-LogVol00,Type:ext4,Total Size:22G,Used:2.4G,Avail:19G,Use%:12%,Mounted on:/
Filesystem:tmpfs,Type:tmpfs,Total Size:981M,Used:0,Avail:981M,Use%:0%,Mounted on:/dev/shm
Filesystem:/dev/sda1,Type:ext4,Total Size:297M,Used:95M,Avail:186M,Use%:34%,Mounted on:/boot
Filesystem:/dev/mapper/vg_bugzillablrin-LogVol01,Type:ext4,Total Size:21G,Used:5.8G,Avail:14G,Use%:30%,Mounted on:/var

LSBLK OUTPUT
NAME:sr0,
MAJ:MIN:11:0,RM:1,SIZE:1024M,RO:0,TYPE:rom,MOUNTPOINT::
NAME:sda,
MAJ:MIN:8:0,RM:0,SIZE:60G,RO:0,TYPE:disk,MOUNTPOINT::
NAME:sda1,
MAJ:MIN:8:1,RM:0,SIZE:300M,RO:0,TYPE:part,MOUNTPOINT::/boot
NAME:sda2,
MAJ:MIN:8:2,RM:0,SIZE:59.7G,RO:0,TYPE:part,MOUNTPOINT::

RUNNING SERVICES
auditd running...
crond running...
messagebus running...
nrpe running...
ntpd running...
rhnsd running...
rhsmcertd running...
rpcbind running...
openssh-daemon running...




**

SYSTEM IP:192.168.68.35 
Distro info:CentOS release 6.6 (Final)
Kernel:Linux altifin-ci-app 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Uptime:12:28:06 up 48 days, 20:31,  0 users,  load average: 0.00, 0.00, 0.00
Memory:Total:11903Mb    Memory:Used:1277Mb  Memory:Free:10625Mb
Swap:Total:8191Mb   Swap:Used:0Mb   Swap:Free: 8191Mb
Architecture:x86_64
  Processor:0:Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
  Processor:1:Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Date:Wed Jun 29 12:28:06 IST 2016

SCRIPT USER
User:aimsadm (uid:509)
Groups:aimsadm
Working dir:/home/aimsadm
Home dir:/home/aimsadm

NETWORK DETAILS
Hostname:altifin-ci-app
IP (lo):127.0.0.1/8
IP (eth0):192.168.68.35/24
Gateway:192.168.68.1
Name Server:192.168.68.10
Name Server:192.168.68.4

LIST OF USERS:altipay,aramesh,sdudam,nagios,kpankaj,sudutha,miyappan,skosanam,djegathesa,aimsadm,

CLAMD STATUS: CLAM AV service is stopped or not installed

NAGIOS STATUS: Nagios service is running

OSSEC STATUS: Ossec service is stopped or not installed

NTPD STATUS: NTP service is running

HARDENING STATUS:Hardening Done

AD INTEGRATION STATUS:AD Integration Not Done

HARDWARE/PLATFORM DETAILS
Hardware Platform:64Bit
Hardware Info :DMI 2.3 present.
DMI: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006  05/23/2012

OS DETAILS
CentOS release 6.6 (Final)
Linux altifin-ci-app 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

CPU INFO
model name  : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
model name  : Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz

MEMORY INFO
MemTotal:       12189032 kB
RAM:11 GB

HARD DISK DETAILS

MOUNT DETAILS
Filesystem:/dev/mapper/vg_altifinci-LogVol01,Type:ext4,Total Size:203G,Used:80G,Avail:113G,Use%:42%,Mounted on:/
Filesystem:tmpfs,Type:tmpfs,Total Size:6.3G,Used:0,Avail:6.3G,Use%:0%,Mounted on:/dev/shm
Filesystem:/dev/sda1,Type:ext4,Total Size:500M,Used:64M,Avail:410M,Use%:14%,Mounted on:/boot

LSBLK OUTPUT
NAME:sr0,
MAJ:MIN:11:0,RM:1,SIZE:1024M,RO:0,TYPE:rom,MOUNTPOINT::
NAME:sda,
MAJ:MIN:8:0,RM:0,SIZE:200G,RO:0,TYPE:disk,MOUNTPOINT::
NAME:sda1,
MAJ:MIN:8:1,RM:0,SIZE:500M,RO:0,TYPE:part,MOUNTPOINT::/boot
NAME:sda2,
MAJ:MIN:8:2,RM:0,SIZE:199.5G,RO:0,TYPE:part,MOUNTPOINT::

RUNNING SERVICES
abrtd running...
abrt-dump-oops running...
acpid running...
atd running...
auditd running...
automount running...
crond running...
cupsd running...
hald running...
mcelog running...
messagebus running...
MySQL but
rpc.statd running...
nrpe running...
ntpd running...
rpcbind running...
openssh-daemon running...

1 Answer:

Answer 0: (score: 0)

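Yes, there is a way. Let me explain.
Although the sample data falls into the "unstructured" category, there is always something specific we are looking for in it.
In other words, we are looking for a pattern: say, the line or lines that hold the data you need. To get at them, identify the pattern in the sample data and pull it out with an appropriate regular expression (RegEx).
In addition, Pig comes with the piggybank jar, which supports various predefined file formats, including unstructured ones like the one you describe.
Try RegExLoader, which is part of the following package in Pig's piggybank: org.apache.pig.piggybank.storage. See https://pig.apache.org/docs/r0.15.0/api/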
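
As a rough sketch of that approach, applied to the MOUNT DETAILS lines in the sample: to my knowledge RegExLoader itself is an abstract base class, and MyRegExLoader in the same package is the concrete loader that takes the pattern as a constructor argument. The jar path, the input path, the relation name, and the field names below are assumptions for illustration only.

```pig
-- Sketch only: the jar path and input path are placeholders, not real locations.
REGISTER '/path/to/piggybank.jar';

-- Each capture group in the pattern becomes one field of the loaded tuple.
-- The pattern avoids backslashes and '$' so no extra escaping is needed in the script.
mounts = LOAD 'system_report.log'
    USING org.apache.pig.piggybank.storage.MyRegExLoader(
        'Filesystem:([^,]+),Type:([^,]+),Total Size:([^,]+),Used:([^,]+),Avail:([^,]+),Use%:([^,]+),Mounted on:(.*)')
    AS (filesystem:chararray, fstype:chararray, total_size:chararray,
        used:chararray, avail:chararray, use_pct:chararray, mount_point:chararray);

-- Print the parsed mount rows to the console.
DUMP mounts;
```

Because the file is read line by line, these rows are not tied back to the `SYSTEM IP` of the record they came from. If per-host output is needed, the records would first have to be separated (for example on the `**` lines), which is why the exact target output matters.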

Also, let us know the exact output you are looking for.