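Why doesn't grep find a string that is definitely in the file?

Time: 2015-08-20 14:43:12

Tags: python regex grep

First-time poster and a bit of a newbie, so please let me know if there are any problems with etiquette or formatting.

I'm trying to run grep on a file (code below) to check whether a word exists in the file. When I look at the file, the word is definitely there. It is surrounded by spaces and is the last word on a line.

For some reason, grep can't find the word and the program returns 0. Why?

Thanks!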

import os
import re

word = "aliows"
folder = '/Users/jordanfreedman/Thinkful/Projects/Spam_Filter/enron1/spam/'
email = '4201.2005-04-05.GP.spam.txt'

number = int(os.popen("grep -w -i -l " + word + " " + folder + email + " | wc -l").read())
print number

2 Answers:

Answer 0 (score: 0)

You can use grep's exit status to find out whether there was a match:

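import os
from subprocess import STDOUT, call

# word, folder and email are the variables defined in the question
path = os.path.join(folder, email)
with open(os.devnull, 'wb', 0) as devnull:
    # discard grep's output; only its exit status is needed
    rc = call(['grep', '-w', '-l', '-i', '-F', word, path],
              stdout=devnull, stderr=STDOUT)
if rc == 0:  # at least one line matched
    print('found')
elif rc == 1:  # no line matched
    print('not found')
else:  # grep itself failed, e.g. the file could not be read
    print('error')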
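
The exit-status check above can also be wrapped in a small helper. This is only a minimal sketch, assuming Python 3 and a `grep` executable on `PATH`; the name `grep_has_word` is hypothetical and not part of the original answer. It relies on grep's documented exit codes: 0 when a line matched, 1 when none did, and greater than 1 on error.

```python
import subprocess


def grep_has_word(word, path):
    """Return True if grep finds `word` as a whole word in `path`, else False.

    Hypothetical helper: assumes `grep` is installed and on PATH.
    """
    result = subprocess.run(
        # -q: print nothing, -w: whole word, -i: ignore case, -F: fixed string
        # "--" ends option parsing so a word starting with "-" is not read as a flag
        ["grep", "-q", "-w", "-i", "-F", "--", word, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # grep exits with a status greater than 1 on errors such as a missing file
    raise RuntimeError("grep failed with exit status %d" % result.returncode)
```

Passing the arguments as a list avoids the shell entirely, so spaces or special characters in the word or the path are not reinterpreted the way they can be in a string built for `os.popen`.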

As @stevieb mentioned, you can also check in pure Python whether the word is in a given file:

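import re
from contextlib import closing
from mmap import ACCESS_READ, mmap

# open in binary mode: the memory-mapped file is searched as bytes
with open(path, 'rb') as f, closing(mmap(f.fileno(), 0, access=ACCESS_READ)) as m:
    # encode the word so the bytes pattern also works on Python 3.5+
    if re.search(br"(?i)\b%s\b" % re.escape(word.encode()), m):
        print('found')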
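
If the file is read as text rather than memory-mapped, the same whole-word, case-insensitive test can be sketched line by line. This is an illustration only, assuming Python 3; `file_has_word` and its `encoding` default are hypothetical choices, and `\b` in `re` only approximates what grep's `-w` treats as a word.

```python
import re


def file_has_word(path, word, encoding="utf-8"):
    """Return True if `word` occurs as a whole word (ignoring case) in the file."""
    # re.escape makes the word match literally, similar to grep -F
    pattern = re.compile(r"\b%s\b" % re.escape(word), re.IGNORECASE)
    # errors="replace" keeps undecodable bytes from raising an exception
    with open(path, encoding=encoding, errors="replace") as fh:
        # stop reading as soon as one line matches
        return any(pattern.search(line) for line in fh)
```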

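Answer 1 (score: -1)

You need to post a snippet of the file so that we can test the grep statement. Also, there is no reason to shell out to grep; you can do this directly in Python: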

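import re

word = "aliows"
folder = '/Users/jordanfreedman/Thinkful/Projects/Spam_Filter/enron1/spam/'
email = '4201.2005-04-05.GP.spam.txt'

path = folder + email

# read the whole file and collect every occurrence of the word
# (a case-sensitive substring match, unlike grep -w -i)
with open(path, 'r') as fh:
    contents = re.findall(word, fh.read())

print(len(contents))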
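
When posting a snippet of the file, it can help to show the raw form of the lines involved, since invisible characters do not show up in an editor. The following is a rough sketch, assuming Python 3; `show_lines_containing` is a hypothetical helper, and it deliberately uses a loose, case-insensitive substring test so that near-matches are listed too. It does not assume any particular cause for the failed match.

```python
def show_lines_containing(path, word):
    """Return (line_number, repr_of_line) pairs for lines that contain `word`.

    The file is read as bytes so nothing is decoded or normalized;
    repr() makes otherwise invisible characters visible.
    """
    needle = word.lower().encode("utf-8")
    hits = []
    with open(path, "rb") as fh:
        for number, line in enumerate(fh, start=1):
            # bytes.lower() only lowercases ASCII letters
            if needle in line.lower():
                hits.append((number, repr(line)))
    return hits
```

Printing each returned pair gives output that can be pasted into the question as-is, including line endings.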