Opening a .pages file on a Mac with Python

Asked: 2014-06-15 22:52:17

Tags: python macos

I want to open a Pages document like this:

directory = "/Path/to/file/"
with open(directory + "test.pages") as f:
    data = f.readlines()
    for line in data:
        words = line.split()
        print(words)

Then I get this error:

IOError: [Errno 21] Is a directory: '/path/to/file/test.pages'

Why is this a directory? And how do I open it?

2 answers:

Answer 0: (score: 1)

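'/path/to/file/test.pages' is a directory on the filesystem, so it cannot be opened with open() in Python. Your operating system bundles multiple files in that directory and probably presents it to you as a single package. You could conceivably walk the directory and list its contents: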

import os

for root, dirs, files in os.walk('/path/to/file/test.pages'):
    for name in files:
        print(os.path.join(root, name))

But opening those files and trying to read their contents will probably be fruitless.

I'll show you how you could try to find any plain text in them:

import os
import re

# use a pattern that matches any line containing a word character:
# a letter (upper or lower case), a digit 0-9, or _
pattern = re.compile(r'.*\w+.*')

for root, dirs, files in os.walk('/path/to/file/test.pages'):
    for name in files:
        # open each file with the context manager so it's automatically closed
        # even if there's an error. Python 3 text mode already uses universal
        # newlines, so the old 'U' flag is not needed; errors='replace' keeps
        # undecodable bytes from raising UnicodeDecodeError.
        with open(os.path.join(root, name), 'r', errors='replace') as f:
            for line in f:
                if pattern.match(line):
                    print(line)
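
If you are not sure in advance whether a given .pages document is a directory bundle (as in the question) or a single zip archive, a small helper can check before listing the contents. This is only a minimal sketch: list_package_members is a hypothetical name, not part of any library, and the idea that a .pages document may also be stored as a zip archive is an assumption you should verify against your own files.

```python
import os
import zipfile


def list_package_members(path):
    """Return the sorted relative paths of the files inside a document package.

    Hypothetical helper (not from any library). It assumes the document is
    either a directory bundle or a zip archive.
    """
    if os.path.isdir(path):
        members = []
        for root, dirs, files in os.walk(path):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to the package root, using "/" separators
                members.append(os.path.relpath(full, path).replace(os.sep, "/"))
        return sorted(members)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Skip directory entries, whose names end with "/"
            return sorted(n for n in archive.namelist() if not n.endswith("/"))
    raise ValueError("not a directory bundle or zip archive: %r" % (path,))
```

Calling list_package_members('/path/to/file/test.pages') would then give you the member names either way, and raise ValueError for anything that is neither a directory nor a zip archive.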

Answer 1: (score: 0)

I have a MacBook Pro running OS X 10.9.3.

I used your code and did not run into the problem you describe. Since you are opening a .pages file, you will need to decode the file:

File "/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 10: ordinal not in range(128)
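
The traceback above comes from the 'ascii' codec, which cannot decode any byte above 0x7f. One way around it is to pass an explicit encoding and an error policy when opening the file. The sketch below is only an illustration: read_lines_lenient is a hypothetical helper, and UTF-8 is a guess (0xe2 is a common lead byte of three-byte UTF-8 sequences such as curly quotes, but the file may just as well hold binary data).

```python
import io


def read_lines_lenient(path, encoding="utf-8"):
    """Read a file as text, replacing any bytes that do not decode.

    Hypothetical helper; the UTF-8 default is an assumption, not something
    the .pages format guarantees.
    """
    # io.open accepts encoding/errors on both Python 2 and Python 3
    with io.open(path, "r", encoding=encoding, errors="replace") as f:
        return f.read().splitlines()
```

Bytes that are not valid in the chosen encoding come back as the replacement character U+FFFD instead of raising UnicodeDecodeError, so a file that is mostly binary will produce mostly unreadable output.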