Using a Bash command from a PHP script to get the lines of a file based on the value of a specific column

Date: 2017-07-08 09:37:15

Tags: regex linux bash shell grep

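From a PHP script, using a Linux Bash command (grep + regex + another command?), I want to get the lines of a file that meet certain conditions, described below.

Example file: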

"id_line1","value_line1_column2","foo blablabla","value_line1_column4" 
"id_line2","value_line2_column2","blablabla foo","value_line2_column4"
"id_line3","value_line3_column2","blabla foo blabla","value_line3_column4"
"id_line4","value_line4_column2","blablabla","value_line4_column4"
"id_line5","value_line5_column2","fooblabla bla","value_line5_column4"
"id_line6","value_line6_column2","blabla blafoo","value_line6_column4"
"id_line7","value_line7_column2","blabla foobla bla","value_line7_column4" 

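I want to search only in column number X of the file (the third column in this example).

Regex

In the third column of every line of my file, I want to find strings that contain the search term (via grep + regex?):

  • at the beginning of the string in the given column (here, the third column)
  • OR at the end of the string in the given column (here, the third column)
  • OR somewhere in the middle of the string in the given column (here, the third column)

Only a word that stands on its own, not attached to another word, should match. For example, with the sample file above, if I search for the word "foo":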

"id_line1","value_line1_column2","foo blablabla","value_line1_column4" // the regex must return true
"id_line2","value_line2_column2","blablabla foo","value_line2_column4" // the regex must return true
"id_line3","value_line3_column2","blabla foo blabla","value_line3_column4" // the regex must return true
"id_line4","value_line4_column2","blablabla","value_line4_column4" // the regex must return false
"id_line5","value_line5_column2","fooblabla bla","value_line5_column4" // the regex must return false
"id_line6","value_line6_column2","blabla blafoo","value_line6_column4" // the regex must return false
"id_line7","value_line7_column2","blabla foobla bla","value_line7_column4" // the regex must return false 

Result

The command must return these lines:

"id_line1","value_line1_column2","foo blablabla","value_line1_column4"
"id_line2","value_line2_column2","blablabla foo","value_line2_column4"
"id_line3","value_line3_column2","blabla foo blabla","value_line3_column4"

How can I do this? If I could get only the ids ("id_line1", "id_line2", "id_line3"), that would be perfect :)

1 answer:

Answer 0: (score: 3)

Awk will do the job:

awk -F, '$3 ~ /"foo / || $3 ~ / foo"/ || $3 ~ /[[:blank:]]foo[[:blank:]]/ { print $0 }' filename

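Here awk splits each line on commas (-F,) and tests the third field against three patterns joined by || (logical OR): /"foo / (the opening quote, foo, then a space, so the word is at the start of the field), / foo"/ (a space, foo, then the closing quote, so the word is at the end), and /[[:blank:]]foo[[:blank:]]/ (foo with a blank on each side, so the word is somewhere in the middle). If any of these patterns matches, the whole line is printed.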
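
The question also asks for only the ids. The following is a minimal sketch of one way to get them, not part of the original answer. It assumes every field is double-quoted and that no field contains an embedded comma or quote, as in the sample file. The function name find_ids, its arguments, and the temporary sample file are hypothetical. Splitting on the three characters "," leaves the third field without its quotes, so padding it with a space on each side lets a single literal search cover the start, middle, and end cases, and it also matches a field that consists of the word alone, which the three patterns above do not.

```shell
#!/bin/sh
# Sketch only. find_ids and its arguments are hypothetical names.
# Assumes every field is double-quoted and contains no embedded
# commas or quotes, as in the sample file.

# find_ids WORD FILE [COLUMN]   (COLUMN defaults to 3)
find_ids() {
    awk -F'","' -v word="$1" -v n="${3:-3}" '
    {
        col = $n
        # Strip the outer quote left on the first and last fields.
        if (n == 1)  sub(/^"/, "", col)
        if (n == NF) sub(/"[ \t]*$/, "", col)
        # Treat tabs like spaces, then pad both ends so that a word at
        # the start or end of the field is also surrounded by spaces.
        gsub(/\t/, " ", col)
        if (index(" " col " ", " " word " ") > 0) {
            id = $1
            sub(/^"/, "", id)   # drop the opening quote of the id
            print id
        }
    }' "$2"
}

# Sample data copied from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
"id_line1","value_line1_column2","foo blablabla","value_line1_column4"
"id_line2","value_line2_column2","blablabla foo","value_line2_column4"
"id_line3","value_line3_column2","blabla foo blabla","value_line3_column4"
"id_line4","value_line4_column2","blablabla","value_line4_column4"
"id_line5","value_line5_column2","fooblabla bla","value_line5_column4"
"id_line6","value_line6_column2","blabla blafoo","value_line6_column4"
"id_line7","value_line7_column2","blabla foobla bla","value_line7_column4"
EOF

find_ids foo "$sample"
# prints:
# id_line1
# id_line2
# id_line3
```

Because index() does a literal substring search, the search word is not interpreted as a regular expression. When building such a command from PHP (for example with shell_exec), the search word and file name would presumably need to be passed through escapeshellarg first.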