Question

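Say I have a pointer to a function _stack_push(stack* stk, void* el). I want to be able to call curry(_stack_push, my_stack) and get back a function that takes only void* el. I could not think of a way to do it, since C does not allow runtime function definition, but I know there are far cleverer people than me here :). Any ideas?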

Answer 1

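I found a paper by Laurent Dami that discusses currying in C/C++/Objective-C:

More Functional Reusability in C/C++/Objective-C with Curried Functions

Of interest is how it is implemented in C:

Our current implementation uses existing C constructs to add the currying mechanism. This was much easier to do than modifying the compiler, and is sufficient to prove the interest of currying. This approach has two drawbacks, however. First, curried functions cannot be type-checked, and therefore require careful use in order to avoid errors. Second, the curry function cannot know the size of its arguments, and counts them as if they were all the size of an integer.

The paper does not contain an implementation of curry(), but you can imagine how it could be implemented using function pointers and variadic functions.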
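To make that idea concrete, here is a minimal sketch of what such a curry() might look like. It is not the paper's implementation: the names curried, curry, call_curried and MAX_ARGS are all hypothetical, and the result is an explicit object that must be invoked through call_curried rather than a plain function pointer. It shares both drawbacks quoted above: nothing is type-checked, and every argument is assumed to be integer-sized (intptr_t here), so callers must cast each argument themselves.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ARGS 3  /* hypothetical limit chosen for this sketch */

typedef intptr_t (*fn1_t)(intptr_t);
typedef intptr_t (*fn2_t)(intptr_t, intptr_t);
typedef intptr_t (*fn3_t)(intptr_t, intptr_t, intptr_t);

typedef struct {
    void (*fn)(void);         /* target, stored as a generic function pointer */
    int arity;                /* total number of arguments the target takes */
    int bound;                /* how many leading arguments are already fixed */
    intptr_t args[MAX_ARGS];  /* every argument is treated as integer-sized */
} curried;

/* Fix the first `bound` arguments of `fn`; they are read from the varargs. */
curried *curry(void (*fn)(void), int arity, int bound, ...) {
    assert(arity >= 1 && arity <= MAX_ARGS && bound >= 0 && bound <= arity);
    curried *c = malloc(sizeof *c);
    if (c == NULL) return NULL;
    c->fn = fn;
    c->arity = arity;
    c->bound = bound;
    va_list ap;
    va_start(ap, bound);
    for (int i = 0; i < bound; i++)
        c->args[i] = va_arg(ap, intptr_t);  /* no type check is possible here */
    va_end(ap);
    return c;
}

/* Supply the remaining arguments and invoke the target. */
intptr_t call_curried(const curried *c, ...) {
    intptr_t a[MAX_ARGS] = {0};
    for (int i = 0; i < c->bound; i++)
        a[i] = c->args[i];
    va_list ap;
    va_start(ap, c);
    for (int i = c->bound; i < c->arity; i++)
        a[i] = va_arg(ap, intptr_t);
    va_end(ap);
    switch (c->arity) {
    case 1:  return ((fn1_t)c->fn)(a[0]);
    case 2:  return ((fn2_t)c->fn)(a[0], a[1]);
    default: return ((fn3_t)c->fn)(a[0], a[1], a[2]);
    }
}

intptr_t add(intptr_t a, intptr_t b) { return a + b; }

intptr_t curry_demo(void) {
    curried *add5 = curry((void (*)(void))add, 2, 1, (intptr_t)5);
    if (add5 == NULL) return -1;
    intptr_t result = call_curried(add5, (intptr_t)37);  /* add(5, 37) */
    free(add5);
    return result;
}
```

The explicit (intptr_t) casts at the call sites are the price of going through varargs: passing a plain int where va_arg expects intptr_t would be undefined behavior on platforms where the two differ in size.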

Answer 2

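GCC provides an extension for defining nested functions. Although this is not ISO standard C, it may be of some interest, since it answers the question quite conveniently. In short, a nested function can access the local variables of its parent function, and the parent function can return a pointer to it.

Here is a short, self-explanatory example: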

#include <stdio.h>

typedef int (*two_var_func) (int, int);
typedef int (*one_var_func) (int);

int add_int (int a, int b) {
    return a+b;
}

one_var_func partial (two_var_func f, int a) {
    int g (int b) {
        return f (a, b);
    }
    return g;
}

int main (void) {
    int a = 1;
    int b = 2;
    printf ("%d\n", add_int (a, b));
    printf ("%d\n", partial (add_int, a) (b));
}

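This construct has limitations, however. If you keep a pointer to the resulting function, as in

one_var_func u = partial (add_int, a);

the function call u(0) may result in unexpected behavior, because the variable a that u reads was destroyed when partial terminated.

See the section on nested functions in GCC's documentation.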

Answer 3

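Here is my first guess off the top of my head (it might not be the best solution).

The curry function could allocate some memory from the heap and put the parameter values into that heap-allocated memory. The trick is then for the returned function to know that it is supposed to read its parameters from that heap-allocated memory. If there is only one instance of the returned function, a pointer to those parameters can be stored in a singleton/global. Otherwise, if there is more than one instance of the returned function, I think curry needs to create each instance of the returned function in heap-allocated memory (by writing opcodes such as "get that pointer to the parameters", "push the parameters" and "invoke that other function" into the heap-allocated memory). In that case you need to beware of whether the allocated memory is executable, and maybe (I don't know) even be afraid of anti-virus programs.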
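The single-instance case described above can be sketched in plain standard C. This is only an illustration under assumptions: the names bound_args, g_bound, call_bound and curry_single are hypothetical, and the target is assumed to have the type int (*)(int, int). The demo also shows the limitation: every call to curry_single returns the same function, so currying a second time silently changes what the first result does.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*two_arg_fn)(int, int);
typedef int (*one_arg_fn)(int);

/* Heap-allocated record holding the target and its preset first argument. */
struct bound_args {
    two_arg_fn fn;
    int first;
};

/* The singleton: only one curried function can be live at a time. */
static struct bound_args *g_bound = NULL;

/* The one function that curry_single ever returns. */
int call_bound(int second) {
    assert(g_bound != NULL);
    return g_bound->fn(g_bound->first, second);
}

one_arg_fn curry_single(two_arg_fn fn, int first) {
    struct bound_args *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->fn = fn;
    b->first = first;
    free(g_bound);  /* drop the previous binding, if any */
    g_bound = b;
    return call_bound;
}

int subtract(int a, int b) { return a - b; }

int singleton_demo(void) {
    one_arg_fn f = curry_single(subtract, 10);
    if (f == NULL) return -1;
    int first_result = f(3);  /* subtract(10, 3) */

    one_arg_fn g = curry_single(subtract, 100);
    if (g == NULL) return -1;
    assert(f == g);  /* both names refer to call_bound */

    /* f now sees the new binding: subtract(100, 3) */
    return first_result + f(3);
}
```

Supporting several independent instances is exactly where this approach stops working, which is why the answer goes on to suggest generating code at runtime.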

Answer 4

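Good news: there is a way to write a program that does this in standard ANSI C, without using any compiler-specific features. (In particular, it does not require gcc's nested function support.)

Bad news: it requires creating small pieces of executable code at runtime to serve as trampoline functions. This means the implementation will depend on:

The processor instruction set
The ABI (in particular, the function calling convention)
The operating system's ability to mark data as executable

Best news: if you just need to do this in real production code... you should use the closure API of libffi. It is permissively licensed and contains careful, flexible implementations for many platforms and ABIs.

If you are still here, you want to nerd out and understand how to implement this "from scratch".

The program demonstrates how to curry a 2-parameter function into a 1-parameter function in C, given...

The x86-64 processor architecture
The System V ABI
The Linux operating system

It is based on the trampolines described in Infectious Executable Stacks, but with the trampoline structure stored on the heap (via malloc) rather than on the stack. This is safer, because it means we do not have to disable the compiler's stack-execution protection (no gcc -Wl,-z,execstack).

It uses the Linux mprotect system call to make the heap object executable.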

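The essence of the program is that it takes a pointer to a two-parameter function (uint32_t (*fp2)(uint32_t a, uint32_t b)) and converts it into a pointer to a one-parameter function (uint32_t (*fp1)(uint32_t a)) that calls fp2 with a preset value for the parameter b. It does this by creating small, 3-instruction trampoline functions:

movl $imm32, %esi  /* $imm32 filled in with the value of 'b' */
movq $imm64, %rax  /* $imm64 filled in with the value of 'fp2' */
jmpq *%rax

With the values of b and fp2 spliced in appropriately, a pointer to a block of memory containing these 3 instructions can be used exactly as the one-parameter function pointer fp1 described above. This works because it obeys the x86-64 System V calling convention, in which a one-parameter function receives its first parameter in the %edi/%rdi register, and a two-parameter function receives its second parameter in the %esi/%rsi register. Here, the one-parameter trampoline function receives its uint32_t parameter in %edi, then fills in the value of the second uint32_t parameter in %esi, then jumps directly to the "real" two-parameter function, which expects its two parameters in exactly those registers.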

The complete working code is also available on GitHub.
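What follows is not the answer's original listing but a minimal sketch of the same trampoline idea, written under stated assumptions. The names curry_b, fp1_t, fp2_t and sub are hypothetical. Unlike the description above, it obtains its memory with mmap instead of malloc (so the block is page-aligned before mprotect is applied), and it never frees the page it allocates. It only builds a trampoline on x86-64 Linux; elsewhere curry_b simply returns NULL.

```c
#define _DEFAULT_SOURCE  /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) && defined(__linux__)
#include <sys/mman.h>
#include <unistd.h>
#define TRAMPOLINE_SUPPORTED 1
#else
#define TRAMPOLINE_SUPPORTED 0
#endif

typedef uint32_t (*fp2_t)(uint32_t a, uint32_t b);
typedef uint32_t (*fp1_t)(uint32_t a);

/* Returns a one-argument function that calls fp2(a, b) with b preset,
 * or NULL if the platform is unsupported or the allocation fails.
 * The page holding the trampoline is never released in this sketch. */
fp1_t curry_b(fp2_t fp2, uint32_t b) {
#if TRAMPOLINE_SUPPORTED
    unsigned char code[17] = {
        0xBE, 0, 0, 0, 0,                    /* movl   $imm32, %esi */
        0x48, 0xB8, 0, 0, 0, 0, 0, 0, 0, 0,  /* movabs $imm64, %rax */
        0xFF, 0xE0                           /* jmpq   *%rax        */
    };
    memcpy(code + 1, &b, sizeof b);      /* splice in the value of b */
    memcpy(code + 7, &fp2, sizeof fp2);  /* splice in the address of fp2 */

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *mem = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return NULL;
    memcpy(mem, code, sizeof code);

    /* Switch the page from writable to executable. */
    if (mprotect(mem, page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, page);
        return NULL;
    }

    fp1_t fp1;
    memcpy(&fp1, &mem, sizeof fp1);  /* data pointer -> function pointer */
    return fp1;
#else
    (void)fp2;
    (void)b;
    return NULL;
#endif
}

uint32_t sub(uint32_t a, uint32_t b) { return a - b; }

int trampoline_demo(void) {
    fp1_t minus2 = curry_b(sub, 2);
    if (minus2 == NULL) return -1;  /* unsupported platform or mmap failure */
    return (int)minus2(44);         /* calls sub(44, 2) */
}
```

Using a non-commutative function such as sub in the demo makes it visible that the preset value lands in the second parameter, as the register description above predicts.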

Answer 5

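Here is one approach to currying in C. Although this sample application uses C++ iostream output for convenience, it is all C-style coding.

The key to this approach is a struct containing an array of unsigned char, which is used to build up the argument list for a function. The function to be called is specified as one of the arguments pushed into the array. The resulting array is then handed to a proxy function, which actually performs the closure of function and arguments.

In this example I provide a couple of type-specific helper functions to push arguments into the closure, as well as a generic pushMem() function to push a struct or other memory region.

This approach does require allocating a memory area that is then used for the closure data. It is best to use the stack for this memory area so that memory management does not become a problem. There is also the question of how large to make the closure storage area: large enough for the necessary arguments, but not so large that memory or stack is wasted on unused space.

I have experimented with a slightly differently defined closure struct that contains an additional field for the currently used size of the array that stores the closure data. This different closure struct is then used with modified helper functions, which removes the need for the user of the helper functions to maintain their own unsigned char * pointer when adding arguments to the closure struct.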
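The variant with a size field is not shown in the sample program, so here is a rough sketch of what it might look like. All names (SizedClosure, closureInit, closurePushMem, closurePushInt, closureCallIntInt, CLOSURE_CAPACITY) are hypothetical. One deliberate difference from the approach in this answer: instead of passing the raw byte array through a variadic call, the sketch uses a typed proxy that unpacks the arguments, which keeps it within portable C at the cost of needing one proxy per function signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CLOSURE_CAPACITY 64  /* hypothetical size chosen for this sketch */

typedef struct {
    void (*fn)(void);                      /* function to call */
    size_t used;                           /* bytes of args currently in use */
    unsigned char args[CLOSURE_CAPACITY];  /* packed argument bytes */
} SizedClosure;

void closureInit(SizedClosure *c, void (*fn)(void)) {
    c->fn = fn;
    c->used = 0;
}

/* Append nBytes from p; returns 0 on success, -1 if they would not fit. */
int closurePushMem(SizedClosure *c, const void *p, size_t nBytes) {
    if (nBytes > sizeof c->args - c->used) return -1;
    memcpy(c->args + c->used, p, nBytes);
    c->used += nBytes;
    return 0;
}

int closurePushInt(SizedClosure *c, int i) {
    return closurePushMem(c, &i, sizeof i);
}

/* Typed proxy for targets of type int (*)(int, int). */
int closureCallIntInt(const SizedClosure *c) {
    int a, b;
    assert(c->used == 2 * sizeof(int));
    memcpy(&a, c->args, sizeof a);
    memcpy(&b, c->args + sizeof a, sizeof b);
    return ((int (*)(int, int))c->fn)(a, b);
}

int add2(int i, int j) { return i + j; }

int sized_closure_demo(void) {
    SizedClosure c;
    closureInit(&c, (void (*)(void))add2);
    if (closurePushInt(&c, 4) != 0 || closurePushInt(&c, 20) != 0) return -1;
    return closureCallIntInt(&c);  /* add2(4, 20) */
}
```

Because the struct tracks its own fill level, the caller no longer threads an unsigned char * cursor through every push, and an overflow of the storage area is reported instead of silently corrupting memory.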

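Notes and caveats

The following example program was compiled and tested with Visual Studio 2013. I am not sure how GCC or Clang would fare with this example, nor what problems might arise with a 64-bit compiler, since I believe my testing was all done with a 32-bit application. This approach also appears to work only with functions that use the standard C declaration, in which the calling function pops the arguments from the stack after the callee returns (__cdecl rather than __stdcall on Windows).

Since we build the argument list at run time and then call a proxy function, this approach does not allow the compiler to check the arguments. Argument type mismatches that the compiler cannot flag could lead to mysterious failures.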

Sample application

In this program's output, the number in parentheses is the line number in main where the function call is made.

// currytest.cpp : Defines the entry point for the console application.
//
// while this is C++ using the standard C++ I/O it is written in
// a C style so as to demonstrate use of currying with C.
//
// this example shows implementing a closure with C function pointers
// along with arguments of various kinds. the closure is then used
// to provide a saved state which is used with other functions.

#include "stdafx.h"
#include <iostream>
#include <string.h>   // memcpy, used by pushMem()

// notation is used in the following defines
//   - tname is used to represent type name for a type
//   - cname is used to represent the closure type name that was defined
//   - fname is used to represent the function name

#define CLOSURE_MEM(tname,size) \
    typedef struct { \
        union { \
            void *p; \
            unsigned char args[size + sizeof(void *)]; \
        }; \
    } tname;

#define CLOSURE_ARGS(x,cname) *(cname *)(((x).args) + sizeof(void *))
#define CLOSURE_FTYPE(tname,m) ((tname((*)(...)))(m).p)

// define a call function that calls specified function, fname,
// that returns a value of type tname using the specified closure
// type of cname.
#define CLOSURE_FUNC(fname, tname, cname) \
    tname fname (cname m) \
    { \
        return ((tname((*)(...)))m.p)(CLOSURE_ARGS(m,cname)); \
    }

// helper functions that are used to build the closure.
unsigned char * pushPtr(unsigned char *pDest, void *ptr) {
    *(void * *)pDest = ptr;
    return pDest + sizeof(void *);
}

unsigned char * pushInt(unsigned char *pDest, int i) {
    *(int *)pDest = i;
    return pDest + sizeof(int);
}

unsigned char * pushFloat(unsigned char *pDest, float f) {
    *(float *)pDest = f;
    return pDest + sizeof(float);
}

unsigned char * pushMem(unsigned char *pDest, void *p, size_t nBytes) {
    memcpy(pDest, p, nBytes);
    return pDest + nBytes;
}


// test functions that show they are called and have arguments.
int func1(int i, int j) {
    std::cout << " func1 " << i << " " << j;
    return i + 2;
}

int func2(int i) {
    std::cout << " func2 " << i;
    return i + 3;
}

float func3(float f) {
    std::cout << " func3 " << f;
    return f + 2.0;
}

float func4(float f) {
    std::cout << " func4 " << f;
    return f + 3.0;
}

typedef struct {
    int i;
    const char *xc;   // assigned from string literals below
} XStruct;

int func21(XStruct m) {
    std::cout << " fun21 " << m.i << " " << m.xc << ";";
    return m.i + 10;
}

int func22(XStruct *m) {
    std::cout << " fun22 " << m->i << " " << m->xc << ";";
    return m->i + 10;
}

void func33(int i, int j) {
    std::cout << " func33 " << i << " " << j;
}

// define my closure memory type along with the function(s) using it.

CLOSURE_MEM(XClosure2, 256)           // closure memory
CLOSURE_FUNC(doit, int, XClosure2)    // closure execution for return int
CLOSURE_FUNC(doitf, float, XClosure2) // closure execution for return float
CLOSURE_FUNC(doitv, void, XClosure2)  // closure execution for void

// a function that accepts a closure, adds additional arguments and
// then calls the function that is saved as part of the closure.
int doitargs(XClosure2 *m, unsigned char *x, int a1, int a2) {
    x = pushInt(x, a1);
    x = pushInt(x, a2);
    return CLOSURE_FTYPE(int, *m)(CLOSURE_ARGS(*m, XClosure2));
}

int _tmain(int argc, _TCHAR* argv[])
{
    int k = func2(func1(3, 23));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    XClosure2 myClosure;
    unsigned char *x;

    x = myClosure.args;
    x = pushPtr(x, func1);
    x = pushInt(x, 4);
    x = pushInt(x, 20);
    k = func2(doit(myClosure));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    x = myClosure.args;
    x = pushPtr(x, func1);
    x = pushInt(x, 4);
    pushInt(x, 24);               // call with second arg 24
    k = func2(doit(myClosure));   // first call with closure
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;
    pushInt(x, 14);              // call with second arg now 14 not 24
    k = func2(doit(myClosure));  // second call with closure, different value
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    k = func2(doitargs(&myClosure, x, 16, 0));  // call via doitargs, which sets the second arg to 16
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    // further explorations of other argument types

    XStruct xs;

    xs.i = 8;
    xs.xc = "take 1";
    x = myClosure.args;
    x = pushPtr(x, func21);
    x = pushMem(x, &xs, sizeof(xs));
    k = func2(doit(myClosure));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    xs.i = 11;
    xs.xc = "take 2";
    x = myClosure.args;
    x = pushPtr(x, func22);
    x = pushPtr(x, &xs);
    k = func2(doit(myClosure));
    std::cout << " main (" << __LINE__ << ") " << k << std::endl;

    x = myClosure.args;
    x = pushPtr(x, func3);
    x = pushFloat(x, 4.0);

    float dof = func4(doitf(myClosure));
    std::cout << " main (" << __LINE__ << ") " << dof << std::endl;

    x = myClosure.args;
    x = pushPtr(x, func33);
    x = pushInt(x, 6);
    x = pushInt(x, 26);
    doitv(myClosure);
    std::cout << " main (" << __LINE__ << ") " << std::endl;

    return 0;
}

Answer 6

"since C doesn't allow runtime function definition"

This is true in principle for standard C. Read n1570 for details.

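In practice, however, it may be false. Consider:

On POSIX systems (e.g. Linux), generating some C code at runtime in a temporary file /tmp/generated1234.c that defines some void genfoo1234(void) function, compiling that file (e.g. with a recent GCC compiler, as gcc -O -fPIC -Wall -shared /tmp/generated1234.c -o /tmp/generated1234.so), then using dlopen(3) on /tmp/generated1234.so, then using dlsym(3) with genfoo1234 on the handle returned by dlopen to get a function pointer. From personal experience, this approach is fast enough today (in 2021, on a Linux laptop) even for interactive use (if each temporary generated C file has a few hundred lines of C code).
On x86, x86-64 and ARM processors, using some machine code generation library such as GNU lightning or libgccjit (or, in C++, asmjit).

In practice, you would generate the code for a closure (grouping function pointers with closed values) and use it as a callback.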
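Setting code generation aside, the "function pointer grouped with closed values" part can be sketched in plain C for the function from the question. This is only an illustration: the stack type, stack_push (a stand-in for the question's _stack_push), push_closure, curry_push, push_closure_call and for_each are all hypothetical names, and the closure here must be invoked through a helper rather than called like an ordinary function pointer.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 8  /* hypothetical fixed capacity for this sketch */

typedef struct stack {
    void *items[STACK_CAPACITY];
    size_t count;
} stack;

/* Hypothetical stand-in for the question's _stack_push. */
int stack_push(stack *stk, void *el) {
    if (stk->count == STACK_CAPACITY) return -1;
    stk->items[stk->count++] = el;
    return 0;
}

/* The closure: a function pointer grouped with its closed value. */
typedef struct {
    int (*fn)(stack *, void *);
    stack *stk;
} push_closure;

push_closure curry_push(int (*fn)(stack *, void *), stack *stk) {
    push_closure c = { fn, stk };
    return c;
}

/* Invoke the closure with the one remaining argument. */
int push_closure_call(const push_closure *c, void *el) {
    return c->fn(c->stk, el);
}

/* A consumer that accepts the closure as a callback. */
int for_each(void **els, size_t n, const push_closure *cb) {
    for (size_t i = 0; i < n; i++)
        if (push_closure_call(cb, els[i]) != 0) return -1;
    return 0;
}

size_t closure_demo(void) {
    static int values[3] = { 1, 2, 3 };
    void *els[3] = { &values[0], &values[1], &values[2] };
    stack s = { {0}, 0 };
    push_closure push_to_s = curry_push(stack_push, &s);
    if (for_each(els, 3, &push_to_s) != 0) return 0;
    return s.count;  /* number of elements pushed through the closure */
}
```

Runtime code generation, as described above, is what lets such a closure be handed to an API that only accepts a bare function pointer; when the consumer can take a struct or an extra context pointer, the plain version is enough.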

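A related issue is garbage collection, so read the garbage collection handbook.

Also consider embedding some existing interpreter in your application, such as Lua, GNU guile, Python, etc.

Study the source code of these interpreters, at least for inspiration.

Queinnec's book Lisp in Small Pieces and the Dragon book are both worth reading. Both explain practical issues and implementation details.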

See also … in recent GCC compilers (2021).

Is there a way to do currying in C?
