Passing a directory to a variable in Python

Time: 2017-07-22 01:24:05

Tags: python tar

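I am trying to modify a script from GitHub that accesses a TAR file and processes it. There is a variable in the code that needs to point to the root directory where the files are located (I think...). Here is the code: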

def make_Dictionary(root_dir):
    emails_dirs = [os.path.join(root_dir,f) for f in os.listdir(root_dir)]    
    all_words = []       
    for emails_dir in emails_dirs:
        emails = [os.path.join(emails_dir,f) for f in os.listdir(emails_dir)]
        for mail in emails:
            with open(mail) as m:
                 for line in m:
                     words = line.split()
                     all_words += words
    dictionary = Counter(all_words)
    list_to_remove = dictionary.keys()

    for item in list_to_remove:
        if item.isalpha() == False: 
            del dictionary[item]
        elif len(item) == 1:
             del dictionary[item]
    dictionary = dictionary.most_common(4000)

    np.save('dict_movie.npy',dictionary) 

    return dictionary

root_dir = sys.path[0]
dictionary = make_Dictionary(root_dir)

With this root_dir, the script throws:

  File "C:\Users\seand\eclipse-workspace\sentiment_project\src\root\nested\movie-polarity.py", line 22, in make_Dictionary
    emails = [os.path.join(emails_dir,f) for f in os.listdir(emails_dir)]
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\seand\\eclipse-workspace\\sentiment_project\\src\\root\\nested\\movie-polarity-tfidf.py'

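The instructions state: "Note: the directory path of the corpus in movie-polarity-tfidf.py and movie-polarity.py needs to be set accordingly." But the path I specified contains the corpus TAR file the script needs. I don't understand why a .py file is being picked up if the script is looking for a directory.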

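3 answers:

Answer 0: (score: 0)

The list comprehension that builds emails_dirs returns some entries that are not directories. You can fix it like this: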

emails_dirs = [os.path.join(root_dir,f) for f in os.listdir(root_dir)
               if os.path.isdir(os.path.join(root_dir,f))]  

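Answer 1: (score: 0)

On the first line of the function you call os.path.join(root_dir, f) for every entry, so emails_dirs is a list of absolute paths, but not all of them are directories. Some are plain files, such as the .py script named in the traceback, and calling os.listdir on one of those raises the exception.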
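
To make that concrete, here is a minimal sketch that builds a throwaway folder containing one sub-directory and one stray script, then compares the raw os.listdir result with the filtered one. The helper name list_subdirectories and the pos folder are hypothetical, chosen only for illustration; they are not part of the original script.

```python
import os
import tempfile


def list_subdirectories(root_dir):
    """Return full paths of the entries in root_dir that are directories."""
    full_paths = [os.path.join(root_dir, name) for name in os.listdir(root_dir)]
    # Test the joined path, so the check does not depend on the working directory.
    return [path for path in full_paths if os.path.isdir(path)]


# Throwaway layout: one sub-directory and one stray script file (hypothetical names).
with tempfile.TemporaryDirectory() as root_dir:
    os.mkdir(os.path.join(root_dir, "pos"))
    with open(os.path.join(root_dir, "movie-polarity.py"), "w") as f:
        f.write("# placeholder\n")

    everything = sorted(os.listdir(root_dir))
    only_dirs = [os.path.basename(p) for p in list_subdirectories(root_dir)]

print(everything)  # files and directories together
print(only_dirs)   # directories only
```

The unfiltered list includes the script file, which is exactly the kind of entry that triggers NotADirectoryError when it is passed to os.listdir.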

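Answer 2: (score: 0)

os.listdir lists everything in a directory, both files and directories. I assume you want only directories the first time (when building the emails_dirs list) and only files the second time (when building the emails list).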

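import os
import sys
from collections import Counter

import numpy as np

def make_Dictionary(root_dir):
    # Keep only directories; test the joined path, not the bare name,
    # otherwise the check is made relative to the current working directory
    emails_dirs = [os.path.join(root_dir, f) for f in os.listdir(root_dir)
                   if os.path.isdir(os.path.join(root_dir, f))]
    all_words = []
    for emails_dir in emails_dirs:
        # Keep only regular files inside each sub-directory
        emails = [os.path.join(emails_dir, f) for f in os.listdir(emails_dir)
                  if os.path.isfile(os.path.join(emails_dir, f))]
        for mail in emails:
            with open(mail) as m:
                for line in m:
                    words = line.split()
                    all_words += words
    dictionary = Counter(all_words)
    # Copy the keys first: deleting entries while iterating over the
    # live keys view raises RuntimeError in Python 3
    list_to_remove = list(dictionary.keys())

    for item in list_to_remove:
        if not item.isalpha():
            del dictionary[item]
        elif len(item) == 1:
            del dictionary[item]
    dictionary = dictionary.most_common(4000)

    np.save('dict_movie.npy', dictionary)

    return dictionary

# sys.path[0] is the folder containing the script itself;
# point root_dir at the corpus folder if it lives elsewhere
root_dir = sys.path[0]
dictionary = make_Dictionary(root_dir)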
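
One more thing that may be relevant, since the question mentions a TAR file: os.listdir cannot look inside an archive, so if the corpus is still packed, it has to be extracted before root_dir can point at it. The following is only a sketch under that assumption, for an archive you trust. The helper extract_corpus, the file name corpus.tar, and the pos/review1.txt layout are hypothetical and stand in for whatever the real corpus contains.

```python
import os
import tarfile
import tempfile


def extract_corpus(archive_path, target_dir):
    """Unpack a trusted TAR archive into target_dir and return target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path) as archive:
        archive.extractall(target_dir)
    return target_dir


with tempfile.TemporaryDirectory() as work_dir:
    # Build a tiny stand-in archive so the sketch is self-contained.
    source_dir = os.path.join(work_dir, "source")
    os.makedirs(os.path.join(source_dir, "pos"))
    with open(os.path.join(source_dir, "pos", "review1.txt"), "w") as f:
        f.write("a fine film\n")
    archive_path = os.path.join(work_dir, "corpus.tar")
    with tarfile.open(archive_path, "w") as archive:
        archive.add(source_dir, arcname="corpus")

    # Extract, then point root_dir at the unpacked folder rather than the archive.
    extracted = extract_corpus(archive_path, os.path.join(work_dir, "extracted"))
    root_dir = os.path.join(extracted, "corpus")
    sub_dirs = sorted(os.listdir(root_dir))
    files = sorted(os.listdir(os.path.join(root_dir, "pos")))

print(sub_dirs)
print(files)
```

After extraction, root_dir holds one sub-directory per category with the text files inside, which is the two-level layout that make_Dictionary walks.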