Question

Suppose there is a file test.txt that contains the string 'test'.

Now consider the following Python code:

f = open('test.txt', 'r+')
f.read()
f.truncate(0)
f.write('passed')
f.flush()

I would expect test.txt to contain 'passed' now, but there are some strange symbols in it as well!

Update: flushing after truncate does not help.

Answer 1

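This is because truncate does not change the stream position.

When you read() the file, you move the position to the end, so subsequent writes go to the file starting from that position. However, when you call flush(), it seems that it not only tries to write the buffer to the file, but also does some error checking and fixes the current file position. When flush() is called after truncate(0), it writes nothing (the buffer is empty), then checks the file size and places the position at the first applicable place (which is 0).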
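The behaviour described above can be reproduced with a small script. This is only a sketch: the helper names are made up for illustration, it uses binary mode and a temporary file rather than the question's test.txt, and the bytes that fill the gap are platform dependent (zero bytes on most systems).

```python
import os
import tempfile


def make_test_file():
    """Create a temporary file containing b'test' and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(b"test")
    return path


def overwrite_without_seek(path):
    """Reproduce the problem: truncate(0) leaves the position where it was."""
    with open(path, "r+b") as f:
        f.read()                  # position is now at the end of the file
        f.truncate(0)             # file size becomes 0, position is unchanged
        pos_after_truncate = f.tell()
        f.write(b"passed")        # written at the old position, after a gap
    with open(path, "rb") as f:
        return pos_after_truncate, f.read()


def overwrite_with_seek(path):
    """Same steps, but rewind before truncating."""
    with open(path, "r+b") as f:
        f.read()
        f.seek(0)                 # move the position back to the start
        f.truncate()              # no argument: truncate at the current position
        f.write(b"passed")
    with open(path, "rb") as f:
        return f.read()


path = make_test_file()
try:
    pos, broken = overwrite_without_seek(path)
    print(pos, broken)
    with open(path, "wb") as f:
        f.write(b"test")          # reset the file for the second run
    fixed = overwrite_with_seek(path)
    print(fixed)
finally:
    os.remove(path)
```

The first result should show the "strange symbols" from the question in front of 'passed'; the second should contain only 'passed'.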

Update

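Python's file functions are NOT just wrappers around their C standard library equivalents, but knowing the C functions helps to understand more precisely what is happening.

From the ftruncate man page:

The value of the seek pointer is not modified by a call to ftruncate().

From the fflush man page:

If stream points to an input stream or an update stream into which the most recent operation was input, that stream is flushed if it is seekable and is not already at end-of-file. Flushing an input stream discards any buffered input and adjusts the file pointer such that the next input operation accesses the byte after the last one read.

This means that putting flush before truncate has no effect. I checked, and that is indeed the case.

But for putting flush after truncate:

If stream points to an output stream or an update stream in which the most recent operation was not input, fflush() causes any unwritten data for that stream to be written to the file, and the st_ctime and st_mtime fields of the underlying file are marked for update.

The man page does not mention the seek pointer when it describes output streams whose last operation was not input. (Here our last operation is truncate.)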

Update 2

I found something in the Python source code: Python-3.2.2\Modules\_io\fileio.c:837

#ifdef HAVE_FTRUNCATE
static PyObject *
fileio_truncate(fileio *self, PyObject *args)
{
    PyObject *posobj = NULL; /* the new size wanted by the user */
#ifndef MS_WINDOWS
    Py_off_t pos;
#endif

...

#ifdef MS_WINDOWS
    /* MS _chsize doesn't work if newsize doesn't fit in 32 bits,
       so don't even try using it. */
    {
        PyObject *oldposobj, *tempposobj;
        HANDLE hFile;

////// THIS LINE //////////////////////////////////////////////////////////////
        /* we save the file pointer position */
        oldposobj = portable_lseek(fd, NULL, 1);
        if (oldposobj == NULL) {
            Py_DECREF(posobj);
            return NULL;
        }

        /* we then move to the truncation position */
        ...

        /* Truncate.  Note that this may grow the file! */
        ...

////// AND THIS LINE //////////////////////////////////////////////////////////
        /* we restore the file pointer position in any case */
        tempposobj = portable_lseek(fd, oldposobj, 0);
        Py_DECREF(oldposobj);
        if (tempposobj == NULL) {
            Py_DECREF(posobj);
            return NULL;
        }
        Py_DECREF(tempposobj);
    }
#else

...

#endif /* HAVE_FTRUNCATE */

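Look at the two lines I marked (////// THIS LINE ////// and ////// AND THIS LINE //////). If your platform is Windows, the function saves the position and restores it after the truncate.

To my surprise, most of the flush functions in the Python 3.2.2 sources either do nothing or do not call the C fflush function at all. The 3.2.2 truncate code is also very sparsely documented. However, I did find something interesting in the Python 2.7.2 sources. First, I found this in the truncate implementation at Python-2.7.2\Objects\fileobject.c:812: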

 /* Get current file position.  If the file happens to be open for
 * update and the last operation was an input operation, C doesn't
 * define what the later fflush() will do, but we promise truncate()
 * won't change the current position (and fflush() *does* change it
 * then at least on Windows).  The easiest thing is to capture
 * current pos now and seek back to it at the end.
 */

To sum up, I think this is entirely platform dependent. I checked with the default Python 3.2.2 for Windows x64 and got the same results as you. I don't know what happens on *nixes.

Answer 2

Yes, it is true that truncate() does not move the position, but that said, the fix is dead simple:

f.read()
f.seek(0)
f.truncate(0)
f.close()

This works perfectly ;)
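The same seek-then-truncate idea also covers the common case of rewriting a file in place. The following is a minimal sketch, assuming a small text file; rewrite_in_place and its transform argument are hypothetical names used only for illustration. It relies on truncate() with no argument cutting the file at the current position.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Read a text file, apply transform, and write the result back."""
    with open(path, "r+", encoding="utf-8") as f:
        old = f.read()        # leaves the position at the end of the file
        f.seek(0)             # rewind before writing
        f.write(transform(old))
        f.truncate()          # cut off whatever remains of the old content


fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    with open(path, "w", encoding="utf-8") as f:
        f.write("a much longer original line")
    rewrite_in_place(path, lambda text: "passed")
    with open(path, encoding="utf-8") as f:
        result = f.read()
    print(repr(result))
finally:
    os.remove(path)
```

Without the final truncate(), the tail of the longer original line would stay in the file after the new text.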

Answer 3

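Truncate does not change the file position.

Note also that even if the file is opened in read+write mode, you cannot simply switch between the two kinds of operation (a seek operation is required to switch from reading to writing, or vice versa).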

Answer 4

I expect the following is the code you meant to write:

open('test.txt').read()
open('test.txt', 'w').write('passed')

Answer 5

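If anyone is in the same boat as me, here is my problem along with its solution:

I have a program that is always on, i.e. it never stops; it keeps polling data and writing to a log file.
The problem is that I want to split the main file as soon as it reaches the 10 MB mark, so I wrote the program below.
I also found the fix for the issue where truncate ended up writing null values into the file, which caused further problems.

Below is an illustration of how I solved this.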

import os
from random import randint
from shutil import copyfile

f1 = open('client.log', 'wb')
f1.write(os.urandom(1024 * 1024 * 15))  # add 15 MB of random data
f1.flush()  # make sure buffered data reaches the file before checking its size
if int(os.path.getsize('client.log') / 1048576) > 10:  # check whether the file is above 10 MB
    print('File size limit exceeded, needs trimming')
    dst = 'client_' + str(randint(0, 999999)) + '.log'
    copyfile('client.log', dst)  # copy the contents to another file
    print('Copied content to ' + dst)
    print('Erasing current file')
    f1.truncate(0)  # truncating works, but leaves the position at the old end of the file
    f1.seek(0)  # very important after truncate, so that new data begins at offset 0
    print('File truncated successfully')
    f1.write(b'This is fresh content')  # dummy content
f1.close()
print('All Job Processed')
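For comparison, and as a different technique from the copy-and-truncate approach in this answer, the standard library's logging.handlers.RotatingFileHandler can split a log file by size without any manual truncate or seek. The sketch below is only an illustration: the logger name and the temporary directory are made up, and maxBytes is set to a tiny value so that rotation happens quickly (the 10 MB limit from the answer would be 10 * 1024 * 1024).

```python
import logging
import logging.handlers
import os
import shutil
import tempfile

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "client.log")

# Rotate once the file would reach maxBytes; keep up to 3 old files
# named client.log.1, client.log.2 and client.log.3.
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=3
)
logger = logging.getLogger("client_log_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for i in range(200):
    logger.info("polled value %d", i)

logger.removeHandler(handler)
handler.close()

files = sorted(os.listdir(log_dir))
sizes = [os.path.getsize(os.path.join(log_dir, name)) for name in files]
print(files, sizes)

shutil.rmtree(log_dir)  # clean up the demo directory
```

Note that rollover only happens when both maxBytes and backupCount are non-zero.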

Answer 6

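It depends. If you want to keep the file open and access it without closing it, then flush will force the data to be written to the file. If you close the file right after flushing, then you don't need the flush, because close will flush for you. That is my understanding of the docs.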
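A quick way to observe this is to check the file size on disk at each step. This is a minimal sketch using a temporary file; the size seen before flush() depends on buffering and is typically 0 for such a small write, so it is printed only for information.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    f = open(path, "wb")
    f.write(b"passed")                  # small write, normally held in the buffer
    size_before_flush = os.path.getsize(path)
    f.flush()                           # push the buffered data to the file
    size_after_flush = os.path.getsize(path)
    f.write(b" again")                  # buffered again
    f.close()                           # close() flushes before closing
    size_after_close = os.path.getsize(path)
    print(size_before_flush, size_after_flush, size_after_close)
finally:
    os.remove(path)
```

The second write never gets an explicit flush(), yet it is in the file once close() returns.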

Garbage in file after truncate(0) in Python