How do I get printed output from ctypes C functions into a Jupyter / IPython notebook?

Time: 2016-03-02 11:11:16

Tags: python ctypes jupyter jupyter-notebook

Introduction

Suppose I have this C code:

#include <stdio.h>

// Of course, these functions are simplified for the purposes of this question.
// The actual functions are more complex and may receive additional arguments.

void printout() {
    puts("Hello");
}
void printhere(FILE* f) {
    fputs("Hello\n", f);
}

I compile it as a shared object (DLL): gcc -Wall -std=c99 -fPIC -shared example.c -o example.so

Then I load it into Python 3.x inside a Jupyter / IPython notebook:

import ctypes
example = ctypes.cdll.LoadLibrary('./example.so')

printout = example.printout
printout.argtypes = ()
printout.restype = None

printhere = example.printhere
printhere.argtypes = (ctypes.c_void_p,)  # Should have been FILE* instead
printhere.restype = None

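Question

How can I execute the printout() and printhere() C functions (through ctypes) and get their output printed inside the Jupyter / IPython notebook?

If possible, I want to avoid writing more C code. I would prefer a pure-Python solution.

I would also prefer to avoid writing to temporary files. Writing to a pipe or socket, however, might be reasonable.

Expected state, current state

If I type the following code in one notebook cell: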

print("Hi")           # Python-style print
printout()            # C-style print
printhere(something)  # C-style print
print("Bye")          # Python-style print

I want to get this output:

Hi
Hello
Hello
Bye

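Instead, I only get the Python-style output in the notebook. The C-style output is printed to the terminal that started the notebook process.

Research

As far as I know, inside a Jupyter / IPython notebook, sys.stdout is not a wrapper around any file: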

import sys

sys.stdout

# Output in command-line Python/IPython shell:
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
# Output in IPython Notebook:
<IPython.kernel.zmq.iostream.OutStream at 0x7f39c6930438>
# Output in Jupyter:
<ipykernel.iostream.OutStream at 0x7f6dc8f2de80>

sys.stdout.fileno()

# Output in command-line Python/IPython shell:
1
# Output in Jupyter and IPython notebook:
UnsupportedOperation: IOStream has no fileno.
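
To tell the two cases apart from code, a small probe can help. This is only a sketch: real_fileno is a hypothetical helper name, and it assumes a stream without a usable descriptor either lacks fileno() or raises io.UnsupportedOperation / ValueError when it is called.

```python
import io


def real_fileno(stream):
    """Return the OS-level file descriptor behind `stream`, or None.

    Streams that live purely in memory (io.StringIO, or a notebook
    OutStream that reports "no fileno") yield None.
    """
    try:
        return stream.fileno()
    except (AttributeError, io.UnsupportedOperation, ValueError):
        # AttributeError: the object has no fileno() at all.
        # UnsupportedOperation / ValueError: fileno() exists but there is
        # no descriptor to return (or the stream is already closed).
        return None
```

For example, real_fileno(io.StringIO()) returns None, while real_fileno(open(os.devnull)) returns a small integer.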

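Related questions and links:

The following two links use similar solutions that involve creating a temporary file. However, care must be taken when implementing such a solution to ensure that the Python-style output and the C-style output are printed in the correct order.

Is it possible to avoid a temporary file?

I tried to find a solution using C's open_memstream() and assigning the returned FILE* to stdout, but it did not work because stdout cannot be assigned.

Then I tried to get the fileno() of the stream returned by open_memstream(), but I can't because it has no file descriptor.

Then I looked at freopen(), but its API requires passing a filename.

Then I looked at Python's standard library and found tempfile.SpooledTemporaryFile(), which is a temporary file-like object kept in memory. However, it gets written to disk as soon as fileno() is called.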
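
The rollover can be observed directly. The sketch below peeks at _rolled, which is a private CPython attribute and not part of the documented API, so treat it as an illustration only.

```python
import tempfile

with tempfile.SpooledTemporaryFile(max_size=1024) as spooled:
    spooled.write(b"Hello\n")

    # _rolled is a private CPython implementation detail, read here only
    # to show whether the data still lives in memory.
    rolled_before = spooled._rolled

    # Asking for a descriptor forces the contents onto a real on-disk file.
    fd = spooled.fileno()
    rolled_after = spooled._rolled
```

Here rolled_before is False and rolled_after is True, even though only six bytes were written and max_size was never reached.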

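So far, I could not find any memory-only solution. Most likely, we will need to use a temporary file anyway. (This is not a big deal, just some extra overhead and extra cleanup that I would rather avoid.)

It might be possible to use os.pipe(), but that seems hard to do without forking.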
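
A rough sketch of what the os.pipe() idea might look like without forking, using a reader thread in place of a child process. capture_fd_via_pipe is a hypothetical name, and this is untested inside a notebook. It only captures writes that reach the file descriptor, so Python-level print() calls in a notebook (which go to the OutStream) would not be interleaved, and buffered C output would need an fflush() before the block ends.

```python
import os
import threading
from contextlib import contextmanager


@contextmanager
def capture_fd_via_pipe(fd=1):
    """Redirect file descriptor `fd` into a pipe; yield a list of byte chunks.

    The list is only complete after the with-block has exited.
    """
    chunks = []
    read_end, write_end = os.pipe()
    saved_fd = os.dup(fd)  # keep the original target so it can be restored

    def drain():
        # Read until every write end of the pipe has been closed (EOF).
        while True:
            data = os.read(read_end, 4096)
            if not data:
                break
            chunks.append(data)

    reader = threading.Thread(target=drain)
    os.dup2(write_end, fd)  # fd now refers to the pipe's write end
    os.close(write_end)     # fd itself is the only remaining write end
    reader.start()
    try:
        yield chunks
    finally:
        os.dup2(saved_fd, fd)  # restores fd and closes the last write end
        os.close(saved_fd)
        reader.join()          # reader sees EOF and finishes
        os.close(read_end)


# os.write() stands in for unbuffered C output in this example.
with capture_fd_via_pipe(1) as chunks:
    os.write(1, b"Hello\n")
captured = b"".join(chunks)
```

Draining in a separate thread matters because a pipe has a limited kernel buffer: without a concurrent reader, a C function that prints a lot would block once the buffer fills.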

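2 Answers:

Answer 0: (score: 6)

I finally developed a solution. It requires wrapping the entire cell in a context manager (or wrapping only the C code). It also uses a temporary file, because I could not find any solution that works without one.

The full notebook is available as a GitHub Gist: https://gist.github.com/denilsonsa/9c8f5c44bf2038fd000f

Part 1: Preparing the C library in Python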

import ctypes
import ctypes.util  # find_library lives in this submodule; import it explicitly

# use_errno parameter is optional, because I'm not checking errno anyway.
libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)

class FILE(ctypes.Structure):
    pass

FILE_p = ctypes.POINTER(FILE)

# Alternatively, we can just use:
# FILE_p = ctypes.c_void_p

# These variables, defined inside the C library, are readonly.
cstdin = FILE_p.in_dll(libc, 'stdin')
cstdout = FILE_p.in_dll(libc, 'stdout')
cstderr = FILE_p.in_dll(libc, 'stderr')

# C function to disable buffering.
csetbuf = libc.setbuf
csetbuf.argtypes = (FILE_p, ctypes.c_char_p)
csetbuf.restype = None

# C function to flush the C library buffer.
cfflush = libc.fflush
cfflush.argtypes = (FILE_p,)
cfflush.restype = ctypes.c_int

Part 2: Building our own context manager to capture stdout

import io
import os
import sys
import tempfile
from contextlib import contextmanager

@contextmanager
def capture_c_stdout(encoding='utf8'):
    # Flushing, it's a good practice.
    sys.stdout.flush()
    cfflush(cstdout)

    # We need to use an actual file because we need the file descriptor number.
    with tempfile.TemporaryFile(buffering=0) as temp:
        # Saving a copy of the original stdout.
        prev_sys_stdout = sys.stdout
        prev_stdout_fd = os.dup(1)
        os.close(1)

        # Duplicating the temporary file fd into the stdout fd.
        # In other words, replacing the stdout.
        os.dup2(temp.fileno(), 1)

        # Replacing sys.stdout for Python code.
        #
        # IPython Notebook version of sys.stdout is actually an
        # in-memory OutStream, so it does not have a file descriptor.
        # We need to replace sys.stdout so that interleaved Python
        # and C output gets captured in the correct order.
        #
        # We enable line_buffering to force a flush after each line.
        # And write_through to force all data to be passed through the
        # wrapper directly into the binary temporary file.
        temp_wrapper = io.TextIOWrapper(
            temp, encoding=encoding, line_buffering=True, write_through=True)
        sys.stdout = temp_wrapper

        # Disabling buffering of C stdout.
        csetbuf(cstdout, None)

        yield

        # Must flush to clear the C library buffer.
        cfflush(cstdout)

        # Restoring stdout.
        os.dup2(prev_stdout_fd, 1)
        os.close(prev_stdout_fd)
        sys.stdout = prev_sys_stdout

        # Printing the captured output.
        temp_wrapper.seek(0)
        print(temp_wrapper.read(), end='')

The fun part: using it!

libfoo = ctypes.CDLL('./foo.so')

printout = libfoo.printout
printout.argtypes = ()
printout.restype = None

printhere = libfoo.printhere
printhere.argtypes = (FILE_p,)
printhere.restype = None


print('Python Before capturing')
printout()  # Not captured, goes to the terminal

with capture_c_stdout():
    print('Python First')
    printout()
    print('Python Second')
    printhere(cstdout)
    print('Python Third')

print('Python After capturing')
printout()  # Not captured, goes to the terminal

Output:

Python Before capturing
Python First
C printout puts
Python Second
C printhere fputs
Python Third
Python After capturing

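Credits and further work

This solution is the result of reading all the links I referenced in the question, plus a lot of trial and error.

This solution only redirects stdout; it could be interesting to redirect both stdout and stderr. For now, I leave that as an exercise for the reader. ;)

Also, there is no exception handling in this solution (at least not yet).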
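
As a starting point for both items, here is a minimal sketch (not the code from the Gist) that redirects several file descriptors at once and restores them in a finally block. capture_fds is a hypothetical name. Unlike the context manager above, it works at the descriptor level only and does not replace sys.stdout, so the C library buffers would still have to be flushed (for example with cfflush from Part 1) before the block ends.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fds(fds=(1, 2)):
    """Redirect each descriptor in `fds` to its own temporary file.

    Yields a dict that maps fd -> captured bytes; it is filled in once
    the with-block exits, whether normally or through an exception.
    """
    captured = {}
    temps = {fd: tempfile.TemporaryFile(buffering=0) for fd in fds}
    saved = {fd: os.dup(fd) for fd in fds}
    try:
        for fd in fds:
            os.dup2(temps[fd].fileno(), fd)
        yield captured
    finally:
        for fd in fds:
            # Always restore the original descriptor, even on error.
            os.dup2(saved[fd], fd)
            os.close(saved[fd])
            temps[fd].seek(0)
            captured[fd] = temps[fd].read()
            temps[fd].close()


# os.write() stands in for unbuffered C output in this example.
with capture_fds() as out:
    os.write(1, b"to stdout\n")
    os.write(2, b"to stderr\n")
```

After the block, out[1] holds the bytes written to stdout and out[2] those written to stderr. Keeping the streams in separate files loses their relative ordering; pointing both descriptors at one file would preserve it.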

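Answer 1: (score: 0)

I spent a whole afternoon adapting this to Python 2. Damn, it was tricky; the key is to reopen the temp file with io.open. Then I tried a better solution: simply writing a Logger class for Python's stdout.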

# -*- coding: utf-8 -*-

import ctypes
# from ctypes import *
from ctypes import util

# use_errno parameter is optional, because I'm not checking errno anyway.
libraryC = ctypes.util.find_library('c')
libc = ctypes.CDLL(libraryC, use_errno=True)


# libc = cdll.msvcrt


class FILE(ctypes.Structure):
    pass


FILE_p = ctypes.POINTER(FILE)

# Alternatively, we can just use:
# FILE_p = ctypes.c_void_p

# These variables, defined inside the C library, are readonly.
##cstdin = FILE_p.in_dll(libc, 'stdin')
##cstdout = FILE_p.in_dll(libc, 'stdout')
##cstderr = FILE_p.in_dll(libc, 'stderr')

# C function to disable buffering.
csetbuf = libc.setbuf
csetbuf.argtypes = (FILE_p, ctypes.c_char_p)
csetbuf.restype = None

# C function to flush the C library buffer.
cfflush = libc.fflush
cfflush.argtypes = (FILE_p,)
cfflush.restype = ctypes.c_int

import io
import os
import sys
import tempfile
from contextlib import contextmanager
#import cStringIO


def read_as_encoding(fileno, encoding="utf-8"):
    fp = io.open(fileno, mode="r+", encoding=encoding, closefd=False)
    return fp


class Logger(object):
    def __init__(self, file, encoding='utf-8'):
        self.file = file
        self.encoding = encoding

    def write(self, message):
        self.file.flush()  # Need to flush
        # python2 temp file is always binary
        # msg_unicode = message.('utf-8')
        self.file.write(message)


@contextmanager
def capture_c_stdout(on_output, on_error=None, encoding='utf8'):
    # Flushing, it's a good practice.
    sys.stdout.flush()
    sys.stderr.flush()
    ##cfflush(cstdout)
    # cfflush(cstdcerr)

    # We need to use an actual file because we need the file descriptor number.
    with tempfile.NamedTemporaryFile() as temp:
        with tempfile.NamedTemporaryFile() as temp_err:
            # print "TempName:", temp.name
            # print "TempErrName:", temp_err.name

            # Saving a copy of the original stdout.
            prev_sys_stdout = sys.stdout
            prev_stdout_fd = os.dup(1)
            os.close(1)
            # Duplicating the temporary file fd into the stdout fd.
            # In other words, replacing the stdout.
            os.dup2(temp.fileno(), 1)

            if on_error:
                prev_sys_stderr = sys.stderr
                prev_stderr_fd = os.dup(2)
                os.close(2)
                os.dup2(temp_err.fileno(), 2)

            # Replacing sys.stdout for Python code.
            #
            # IPython Notebook version of sys.stdout is actually an
            # in-memory OutStream, so it does not have a file descriptor.
            # We need to replace sys.stdout so that interleaved Python
            # and C output gets captured in the correct order.
            #
            # We enable line_buffering to force a flush after each line.
            # And write_through to force all data to be passed through the
            # wrapper directly into the binary temporary file.
            # No TextIOWrapper needed in Python 2: according to the official docs, the temp file is always binary there.
            ##temp_wrapper = io.TextIOWrapper(
            ##   read_as_encoding(temp.fileno(), encoding=encoding), encoding=encoding, line_buffering=True) ##, write_through=True)

            # temp_wrapper_python = io.TextIOWrapper(
            #    read_as_encoding(temp.fileno(), encoding=encoding), encoding='ascii', line_buffering=True)
            temp_wrapper_python = Logger(temp, encoding=encoding)
            sys.stdout = temp_wrapper_python

            if on_error:
                # temp_wrapper_err = io.TextIOWrapper(
                #   read_as_encoding(temp_err.fileno(), encoding=encoding), encoding=encoding, line_buffering=True) ##, write_through=True)
                temp_wrapper_python_err = Logger(temp_err, encoding=encoding)
                # string_str_err = cStringIO.StringIO()
                sys.stderr = temp_wrapper_python_err

            # Disabling buffering of C stdout.
            ##csetbuf(cstdout, None)

            yield

            # Must flush to clear the C library buffer.
            ##cfflush(cstdout)

            # Restoring stdout.
            os.dup2(prev_stdout_fd, 1)
            os.close(prev_stdout_fd)
            sys.stdout = prev_sys_stdout

            if on_error:
                os.dup2(prev_stderr_fd, 2)
                os.close(prev_stderr_fd)
                sys.stderr = prev_sys_stderr

            # Printing the captured output.
            # temp_wrapper.seek(0)
            # print "Reading: "
            # print temp_wrapper.read()
            if on_output:
                temp.flush()
                temp.seek(0)
                on_output(temp.read())
            temp.close()

            if on_error:
                temp_err.flush()
                temp_err.seek(0)
                on_error(temp_err.read())
                temp_err.close()


import repo_checker_cpp


def on_capture_output(input_stream):
    if input_stream:
        print "Here is captured stdout: \n", input_stream


def on_capture_err(input_stream):
    if input_stream:
        print "Here is captured stderr: \n", input_stream


if __name__ == '__main__':
    with capture_c_stdout(on_capture_output, on_capture_err) as custom_output:  # redirection here
        # repo_checker_cpp is a ctypes.CDLL module
        print >> sys.stderr, "Hello World in python err\n"
        repo_checker_cpp.test_exception()  # throws an exception, captures it inside the cpp module, then outputs to std::cerr
        print "Hello World in python\n"
        repo_checker_cpp.hello_world()  # simple std::cout << "Hello World" << std::endl; std::cerr << "Hello World in cerr" << std::endl;


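I could not get lines like cstdin = FILE_p.in_dll(libc, 'stdin') to work. I commented them out with ## to show that they were originally written by Denilson. Thanks also to Denilson for his work.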
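
When the stdout symbol cannot be looked up with in_dll, one possible workaround is to call fflush(NULL), which standard C defines as flushing every open output stream, so no FILE* is needed. The sketch below is written for Python 3 and a POSIX libc. flush_all_c_streams is a hypothetical name, and it assumes the C runtime loaded here is the same one the extension library writes through, which is not guaranteed on Windows.

```python
import ctypes
import ctypes.util

# Assumption: find_library('c') locates the C runtime that the target
# library also uses. If it returns None, CDLL(None) falls back to the
# symbols already loaded into the process on POSIX systems.
libc = ctypes.CDLL(ctypes.util.find_library('c'))

libc.fflush.argtypes = (ctypes.c_void_p,)
libc.fflush.restype = ctypes.c_int


def flush_all_c_streams():
    """Flush every open C output stream; returns 0 on success."""
    return libc.fflush(None)  # None is passed as a NULL pointer
```

Calling flush_all_c_streams() just before restoring the descriptors would play the role of the commented-out cfflush(cstdout) line.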

It works fine on my Windows 10 + Python 2.7 setup. Output:

Here is captured stdout: 
Hello World in python
Hello World(C++)


Here is captured stderr: 
Hello World in python err
RepoCheckCpp_TestException, Reason: ensure failed : false
xxxxx\repocheckercpp.cpp(38)
context variables:
    error : This is a test exception


Hello World(C++) in cerr

Everything is captured perfectly.