Question

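I'm new to Splunk, and I'd like to optimize (losslessly compress) the log data files I'll be adding to Splunk. Since the data has to be text (not binary or any other format), I can't use Huffman encoding and the like, and I don't know where to start.

Any help or ideas would be great.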

Answer 1

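Splunk Enterprise decompresses archive files before indexing them. It can handle these common archive file types: tar, gz, bz2, tar.gz, tgz, tbz, tbz2, zip, and z.

I suggest using any of the compression methods above, then configuring Splunk, via the UI or props.conf, to monitor the files by file name or directory specification. If for some reason you need a different compression algorithm, you can use it and then instruct Splunk to use a special unarchive_cmd in the indexing pipeline. You can learn more by looking at props.conf.spec. Here is the relevant section: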

unarchive_cmd = <string>
* Only called if invalid_cause is set to "archive".
* This field is only valid on [source::<source>] stanzas.
* <string> specifies the shell command to run to extract an archived source.
* Must be a shell command that takes input on stdin and produces output on stdout.
* Use _auto for Splunk's automatic handling of archive files (tar, tar.gz, tgz, tbz, tbz2, zip)
* This setting applies at input time, when data is first read by Splunk. 
  The setting is used on a Splunk system that has configured inputs acquiring the data.
* Defaults to empty.
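
For the simpler route (one of the natively supported formats), here is a minimal Python sketch of compressing a text log to .gz before dropping it into a monitored directory. The function names `gzip_log` and `is_lossless` are made up for illustration and are not part of Splunk. It only uses the standard library and assumes the log fits comfortably in memory for the verification step.

```python
import gzip
import shutil
from pathlib import Path


def gzip_log(src: Path) -> Path:
    """Write a gzip-compressed copy of src next to it and return its path."""
    dst = src.with_name(src.name + ".gz")
    # Stream the bytes through gzip so large logs are not loaded at once.
    with src.open("rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def is_lossless(src: Path, dst: Path) -> bool:
    """Check that decompressing dst gives back exactly the bytes of src."""
    with gzip.open(dst, "rb") as fin:
        return fin.read() == src.read_bytes()
```

Calling `gzip_log(Path("app.log"))` would produce `app.log.gz` alongside the original (the file name is just an example). Since gz is in the list of types Splunk decompresses on its own, no unarchive_cmd should be needed for files produced this way.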
