Wrapping regex.h with Python ctypes

Time: 2014-12-15 05:32:51

Tags: python c regex ctypes

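I am trying to use ctypes to write a Python wrapper for the standard C library's regular expression functions (and no, I don't want to just use Python's re module).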

I'm stuck on how to wrap the re_pattern_buffer struct from regex.h in particular. This is the part of the header I'm looking at:

struct re_pattern_buffer
{
/* Space that holds the compiled pattern.  It is declared as
   'unsigned char *' because its elements are sometimes used as
   array indexes.  */
unsigned char *__REPB_PREFIX(buffer);

/* Number of bytes to which `buffer' points.  */
unsigned long int __REPB_PREFIX(allocated);

/* Number of bytes actually used in `buffer'.  */
unsigned long int __REPB_PREFIX(used);

/* Syntax setting with which the pattern was compiled.  */
reg_syntax_t __REPB_PREFIX(syntax);

/* Pointer to a fastmap, if any, otherwise zero.  re_search uses the
   fastmap, if there is one, to skip over impossible starting points
   for matches.  */
char *__REPB_PREFIX(fastmap);

/* Either a translate table to apply to all characters before
   comparing them, or zero for no translation.  The translation is
   applied to a pattern when it is compiled and to a string when it
   is matched.  */
__RE_TRANSLATE_TYPE __REPB_PREFIX(translate);

/* Number of subexpressions found by the compiler.  */
size_t re_nsub; 

So far, this is my attempt at wrapping the struct with ctypes:

import ctypes

reg_syntax_t = ctypes.c_ulong # unsigned long int
class regex_t(ctypes.Structure): # AKA: re_pattern_buffer
    _fields_ = [
        ("a", ctypes.c_void_p), # unsigned char *__REPB_PREFIX(buffer);
        ("b", ctypes.c_ulong),  # unsigned long int __REPB_PREFIX(allocated);
        ("c", ctypes.c_ulong),  # unsigned long int __REPB_PREFIX(used);
        ("d", reg_syntax_t),    # reg_syntax_t __REPB_PREFIX(syntax);
        ("e", ctypes.c_ubyte),  # char *__REPB_PREFIX(fastmap);
        ("re_nsub", ctypes.c_size_t),
    ]

I believe some mistake here is what is causing the segfaults I'm getting.
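One likely cause, offered as a guess rather than a confirmed diagnosis: the attempt above declares fastmap as ctypes.c_ubyte (a single byte) although the header declares it as char * (a pointer), and it leaves out the translate field entirely. Either mistake makes the ctypes structure smaller than the C one, so regcomp would write past the end of the Python-allocated memory. The sketch below mirrors the fields shown in the excerpt. The trailing bit-fields are an assumption based on glibc's regex.h (the excerpt stops at re_nsub), as is treating __RE_TRANSLATE_TYPE as unsigned char *. Other C libraries lay regex_t out differently, so this is only meaningful against glibc.

```python
import ctypes

reg_syntax_t = ctypes.c_ulong  # unsigned long int, as in the header excerpt


class regex_t(ctypes.Structure):  # mirrors struct re_pattern_buffer
    _fields_ = [
        ("buffer", ctypes.c_void_p),     # unsigned char *buffer
        ("allocated", ctypes.c_ulong),   # unsigned long int allocated
        ("used", ctypes.c_ulong),        # unsigned long int used
        ("syntax", reg_syntax_t),        # reg_syntax_t syntax
        ("fastmap", ctypes.c_void_p),    # char *fastmap: a pointer, not one byte
        ("translate", ctypes.c_void_p),  # assumed: __RE_TRANSLATE_TYPE is unsigned char *
        ("re_nsub", ctypes.c_size_t),    # size_t re_nsub
        # Assumed from glibc's regex.h; these are not in the excerpt above.
        ("can_be_null", ctypes.c_uint, 1),
        ("regs_allocated", ctypes.c_uint, 2),
        ("fastmap_accurate", ctypes.c_uint, 1),
        ("no_sub", ctypes.c_uint, 1),
        ("not_bol", ctypes.c_uint, 1),
        ("not_eol", ctypes.c_uint, 1),
        ("newline_anchor", ctypes.c_uint, 1),
    ]


# Compare this with sizeof(regex_t) printed from a small C program on the
# same machine; a mismatch means the layout is still wrong.
print(ctypes.sizeof(regex_t))
```

Checking ctypes.sizeof against the C sizeof is a cheap way to catch layout errors before calling into the library.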

What is __REPB_PREFIX(...) doing in the C code? How should I represent it with Python ctypes?
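On the macro: as far as I can tell from glibc's regex.h, __REPB_PREFIX(name) expands to name when GNU extensions are enabled and to __name otherwise. It only changes what a field is called in C, not its type, size, or offset. ctypes matches fields by position and type, never by name, so the macro needs no representation at all and the Python field names can be anything. If the individual fields are not needed, a common alternative is to skip mirroring the struct and hand regcomp an opaque, oversized buffer. The sketch below takes that route. The helper name posix_match, the 256-byte buffer size, and the constant REG_EXTENDED = 1 are assumptions of this example (the value is 1 in glibc's header; check your own regex.h), and ctypes.CDLL(None) assumes a POSIX system.

```python
import ctypes

# POSIX only: a handle to the symbols already loaded into this process,
# which includes the C library.
libc = ctypes.CDLL(None)

REG_EXTENDED = 1  # assumed value; matches glibc's regex.h

# int regcomp(regex_t *preg, const char *regex, int cflags);
libc.regcomp.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
libc.regcomp.restype = ctypes.c_int
# int regexec(const regex_t *preg, const char *string,
#             size_t nmatch, regmatch_t pmatch[], int eflags);
libc.regexec.argtypes = [ctypes.c_void_p, ctypes.c_char_p,
                         ctypes.c_size_t, ctypes.c_void_p, ctypes.c_int]
libc.regexec.restype = ctypes.c_int
# void regfree(regex_t *preg);
libc.regfree.argtypes = [ctypes.c_void_p]
libc.regfree.restype = None


def posix_match(pattern: bytes, text: bytes) -> bool:
    """Hypothetical helper: True if the POSIX extended regex matches text."""
    # Opaque storage for regex_t, deliberately larger than the real struct
    # so the library never writes outside memory we own.
    preg = ctypes.create_string_buffer(256)
    rc = libc.regcomp(preg, pattern, REG_EXTENDED)
    if rc != 0:
        raise ValueError("regcomp failed with code %d" % rc)
    try:
        # nmatch=0 and pmatch=NULL: only ask whether there is a match.
        return libc.regexec(preg, text, 0, None, 0) == 0
    finally:
        libc.regfree(preg)


print(posix_match(b"^a+b$", b"aaab"))
```

The trade-off is that fields such as re_nsub are not readable this way. Reading them requires a structure whose layout matches the C library actually in use.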

Thanks in advance for any insight!

0 answers:

No answers