Question

[image not preserved]

CAT *p;
...
p->speak();
...

Some books say the compiler will translate p->speak() into:

(*p->vptr[i])(p); //i is the idx of speak in the vtbl

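My question is: at compile time it is impossible to know the actual type of the object p points to, which means there is no way to know which vptr or vtbl to use. So how does the compiler generate the correct code?

[Edited]

For example: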

void foo(CAT* c)
{
    c->speak();
    //if c points to a SmallCat
    // this should translate to (*c->vptr[i])(c); //using the vtbl at 0x1234
    //if c points to a CAT
    // this should translate to (*c->vptr[i])(c); //using the vtbl at 0x5678

    //since ps and pc are both CAT*, how can the compiler generate different code
    //for them at compile time?
}

...
CAT *ps,*pc;
ps = new SmallCat;  //suppose SmallCat's vtbl address is 0x1234;
pc = new CAT;       //suppose CAT's vtbl address is 0x5678;
...
foo(ps);
foo(pc);
...

Any ideas? Thanks.

Answer 1

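What your picture is missing is an arrow from each CAT and SmallCat object to its corresponding vtbl. The compiler embeds a pointer to the vtbl in the object itself; you can think of it as a hidden member variable. That is why adding the first virtual function is said to "cost" one pointer per object in memory footprint. The pointer to the vtbl is set by code in the constructor, so all a compiler-generated virtual call has to do to reach the right vtable at runtime is dereference the this pointer and read the vptr stored there.

Of course, this gets more complicated with virtual and multiple inheritance: the compiler needs to generate slightly different code, but the basic process stays the same.

Here is your example, explained in more detail: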

CAT *p1,*p2;
p1 = new SmallCat;  //suppose its vtbl address is 0x1234;
// The layout of SmallCat object includes a vptr as a hidden member.
// At this point, the value of this vptr is set to 0x1234.
p2 = new CAT;       //suppose its vtbl address is 0x5678;
// The layout of a CAT object also includes a vptr as a hidden member.
// At this point, the value of this vptr is set to 0x5678.
(*p1->vptr[i])(p1); //should use vtbl at 0x1234
// The generated code has enough information to do that, because the constructor
// squirreled away 0x1234 inside the SmallCat object when it was constructed.
(*p2->vptr[i])(p2); //should use vtbl at 0x5678
// Same deal - the constructor saved 0x5678 inside the CAT, so we're good.
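
As a runnable illustration of the point above, here is a minimal sketch in real C++. The class bodies are assumptions (the question never shows them), and speak() returns a string only so the result is easy to check. foo is compiled once, yet it behaves differently for each object because each object carries its own vptr.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the classes in the question.
struct CAT {
    virtual ~CAT() = default;
    virtual std::string speak() const { return "CAT"; }
};

struct SmallCat : CAT {
    std::string speak() const override { return "SmallCat"; }
};

// Compiled once. On typical implementations the call loads the vptr stored
// in *c and jumps through the slot reserved for speak; the language itself
// only specifies the resulting behaviour, not the mechanism.
std::string foo(const CAT* c) {
    assert(c != nullptr);
    return c->speak();
}

// Creates a T dynamically, hands it to foo as a CAT*, and returns the result.
template <typename T>
std::string speak_through_base() {
    std::unique_ptr<CAT> p = std::make_unique<T>();
    return foo(p.get());
}
```

Calling speak_through_base<SmallCat>() and speak_through_base<CAT>() runs the same foo both times; only the object passed in differs.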

Answer 2

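"this means there is no way to know which vptr or vtbl to use"

At the time of the method call, that is correct. But at construction time the type of the object being constructed is known, and the compiler generates code in the ctor that initializes the vptr to point to the vtbl of the corresponding class. All later virtual method calls go through this vptr and so reach the method in the right vtbl.

For more details on how this initialization works together with base objects (multiple ctors called in sequence), see this answer to a similar question.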
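
The ctor-by-ctor initialization can be observed from standard C++: a virtual call made inside a constructor dispatches to the version belonging to the class whose constructor is currently running. The sketch below assumes hypothetical CAT and SmallCat bodies and records what speak() resolves to at each stage.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes; the member strings exist only to record results.
struct CAT {
    std::string seen_in_base_ctor;
    // While this body runs, the object is still "just a CAT".
    CAT() { seen_in_base_ctor = speak(); }
    virtual ~CAT() = default;
    virtual std::string speak() const { return "CAT"; }
};

struct SmallCat : CAT {
    std::string seen_in_derived_ctor;
    // The CAT ctor has finished; the object now dispatches as a SmallCat.
    SmallCat() { seen_in_derived_ctor = speak(); }
    std::string speak() const override { return "SmallCat"; }
};

// Checks the dispatch observed at each construction stage and afterwards.
void check_construction_order() {
    SmallCat s;
    assert(s.seen_in_base_ctor == "CAT");
    assert(s.seen_in_derived_ctor == "SmallCat");
    const CAT& as_base = s;
    assert(as_base.speak() == "SmallCat");
}
```

This is consistent with each ctor in the chain storing its own class's vtbl address, with the most-derived ctor writing last.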

Answer 3

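The compiler implicitly adds a pointer called vptr to every class that has one or more virtual functions.

You can see this by applying sizeof to such a class: the result is larger than you would expect by 4 or 8 bytes, depending on sizeof(void*).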
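
A small sketch of that sizeof check, using two hypothetical structs that differ only in whether they have virtual functions. The exact overhead is implementation-defined; the equality asserted here is what mainstream compilers (GCC, Clang, MSVC) produce for this layout, not something the standard promises.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical types: same data member, with and without virtual functions.
struct Plain {
    void* payload;
};

struct WithVirtual {
    void* payload;
    virtual ~WithVirtual() = default;
    virtual void f() {}
};

// Extra bytes per object attributable to the hidden vptr.
constexpr std::size_t vptr_overhead() {
    return sizeof(WithVirtual) - sizeof(Plain);
}

// On common ABIs the overhead is exactly one pointer (assumption, see above).
void check_overhead() {
    assert(vptr_overhead() == sizeof(void*));
}
```

Adding more virtual functions to WithVirtual does not grow the object further, because they only add entries to the shared table.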

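The compiler also adds an implicit piece of code to each class's constructor, which sets the vptr to point to a table of function pointers (a.k.a. the V-Table).

When an object is instantiated, its type is explicitly "mentioned".

For example: A a(1) or A* p = new B(2).

So inside the constructor, at runtime, the vptr can easily be set to point to the correct V-Table.

In the example above:

The vptr of a is set to point to the V-Table of class A.
The vptr of the object p points to is set to point to the V-Table of class B.

BTW, the constructor differs from all other functions in that you must explicitly name the object's type in order to call it (which is why a constructor can never be declared virtual).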

Here is how the compiler generates the correct code for a call to the virtual function p->speak():

CAT *p;
...
p = new SuperCat("SaberTooth",2); // p->vptr = SuperCat_Vtable
...
p->speak(); // See pseudo assembly code below

Ax = p               // Get the address of the instance
Bx = p->vptr         // Get the address of the instance's V-Table
Cx = Bx + CAT::speak // Add the number of the function in its class
Dx = *Cx             // Get the address of the appropriate function
Push Ax              // Push the address of the instance into the stack
Push Dx              // Push the address of the function into the stack
CallF                // Save some registers and jump to the beginning of the function

The compiler uses the same number (index) for all speak functions in the class CAT hierarchy.

Here is how the compiler generates the correct code for a call to the non-virtual function p->eat():

p->eat(); // See pseudo assembly code below

Ax = p        // Get the address of the instance
Bx = CAT::eat // Get the address of the function
Push Ax       // Push the address of the instance into the stack
Push Bx       // Push the address of the function into the stack
CallF         // Save some registers and jump to the beginning of the function

Since the address of the eat function is known at compile time, the assembly code is more efficient.

Finally, here is how the vptr is set at runtime to point to the correct V-Table:

class SmallCat
{
    void* vptr; // implicitly added by the compiler
    ...         // your explicit variables
    SmallCat()
    {
        vptr = (void*)0x1234; // implicitly added by the compiler
        ...                   // Your explicit code
    }
};

When a SmallCat is instantiated, a new object is created and its vptr is set as shown above.
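
The pseudo assembly can be mimicked in compilable C++ with an explicit array of function pointers. This is only a sketch: the Cat struct, the Slot type, the slot enum, and the table names are all hypothetical, and real compilers are free to lay things out differently.

```cpp
#include <cassert>
#include <string>

struct Cat;                                 // hand-rolled stand-in for a class
using Slot = std::string (*)(const Cat*);   // every entry receives the instance

// One index per virtual function, shared by the whole hierarchy.
enum SlotIndex { kSpeak = 0, kSlotCount };

struct Cat {
    const Slot* vptr;                       // what the compiler would add implicitly
};

std::string cat_speak(const Cat*)       { return "meow"; }
std::string super_cat_speak(const Cat*) { return "roar"; }
std::string cat_eat(const Cat*)         { return "eat"; }   // non-virtual

const Slot kCatVtable[kSlotCount]      = { cat_speak };
const Slot kSuperCatVtable[kSlotCount] = { super_cat_speak };

// Virtual call: two loads (vptr, then slot) followed by an indirect call.
std::string call_speak(const Cat* p) {
    assert(p != nullptr && p->vptr != nullptr);
    const Slot* table = p->vptr;            // Bx = p->vptr
    Slot target = table[kSpeak];            // Dx = *(Bx + index of speak)
    return target(p);                       // push p, jump to Dx
}

// Non-virtual call: the target is fixed at compile time, so no loads.
std::string call_eat(const Cat* p) {
    return cat_eat(p);
}
```

Two Cat values that differ only in their vptr, for example Cat{kCatVtable} and Cat{kSuperCatVtable}, take the same path through call_speak and still reach different functions.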

Answer 4

When you write this (I replaced all user code with lowercase):

class cat {
public:
    virtual void speak() {std::cout << "meow\n";}
    virtual void eat() {std::cout << "eat\n";}
    virtual void destructor() {std::cout << "destructor\n";}
};

the compiler magically generates all of this (all of my example compiler code is uppercase):

class cat;
struct CAT_VTABLE_TYPE { //here's the cat's vtable type
    void(*speak)(cat* this); //contains a pointer for each virtual function
    void(*eat)(cat* this);
    void(*destructor)(cat* this);
};
extern CAT_VTABLE_TYPE CAT_VTABLE; //later is a global shared copy of the vtable
class cat { //here's the class you typed
private:
    CAT_VTABLE_TYPE* vptr; //but the compiler adds this magic member
public:
    cat() :vptr(&CAT_VTABLE) {} //the compiler initializes the vtable ptr
    ~cat() {vptr->destructor(this);} //redirects to the one you coded
    void speak() {vptr->speak(this);} //redirects to the one you coded
    void eat() {vptr->eat(this);} //redirects to the one you coded
};

//Here's the functions you programmed
void DEFAULT_CAT_SPEAK(cat* this) {std::cout << "meow\n";}
void DEFAULT_CAT_EAT(cat* this) {std::cout << "eat\n";}
void DEFAULT_CAT_DESTRUCTOR(cat* this) {std::cout << "destructor\n";}
//and the global cat vtable (shared by all cat objects)
const CAT_VTABLE_TYPE CAT_VTABLE = {
    DEFAULT_CAT_SPEAK, 
    DEFAULT_CAT_EAT, 
    DEFAULT_CAT_DESTRUCTOR};

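Well, that's a lot, isn't it? (I actually cheated slightly, since I took the address of an object before it was defined, but this way there is less code and it is less confusing, even if it technically won't compile.) You can see why they built this into the language. And... here's SmallCat before: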

class smallcat : public cat {
public:
    virtual void speak() {std::cout << "meow2\n";}
    virtual void destructor() {std::cout << "destructor2\n";}
};

and after:

class smallcat;
//here's the smallcat's vtable type
struct SMALLCAT_VTABLE_TYPE : public CAT_VTABLE_TYPE { 
     //contains no additional virtual functions that cat didn't have
};
extern SMALLCAT_VTABLE_TYPE SMALLCAT_VTABLE; //later is a global shared copy of the vtable
class smallcat : public cat { //here's the class you typed
public:
    smallcat() :vptr(&SMALLCAT_VTABLE) {} //the compiler initializes the vtable ptr
    //The other functions already are virtual, nothing additional needed
};
//Here's the functions you programmed
void DEFAULT_SMALLCAT_SPEAK(cat* this) {std::cout << "meow2\n";}
void DEFAULT_SMALLCAT_DESTRUCTOR(cat* this) {std::cout << "destructor2\n";}
//and the global smallcat vtable (shared by all smallcat objects)
const SMALLCAT_VTABLE_TYPE SMALLCAT_VTABLE = {
    DEFAULT_SMALLCAT_SPEAK, 
    DEFAULT_CAT_EAT, //note: eat wasn't overridden
    DEFAULT_SMALLCAT_DESTRUCTOR};

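So, if that's too much to read: the compiler makes a VTABLE object for each type, which points to the member functions for that particular type, and then it sticks a pointer to that VTABLE inside each instance.

When you create a smallcat object, the compiler first constructs the cat parent object, which sets vptr to point to the CAT_VTABLE global. Immediately afterward, the compiler constructs the smallcat derived object, which overwrites the vptr member so that it points to the SMALLCAT_VTABLE global.

When you call c->speak();, the compiler produces a call to its copy of cat::speak (which looks like this->vptr->speak(this);). The vptr member may point to the global CAT_VTABLE or to the global SMALLCAT_VTABLE, so that table's speak pointer points either to DEFAULT_CAT_SPEAK (the code you put in cat::speak) or to DEFAULT_SMALLCAT_SPEAK (the code you put in smallcat::speak). So this->vptr->speak(this); ends up calling the function of the most derived type, whatever that most derived type is.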
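
For readers who want something that actually compiles, the following is a hedged adaptation of the emulation above, not the answer's exact code. It renames the this parameter to an unnamed const Cat*, returns strings instead of printing, leaves out the destructor slot, and assigns vptr in the SmallCat constructor body because a derived class cannot initialize a base member in its initializer list. The history vector is a purely illustrative addition that records each value vptr takes during construction.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Cat;

// One named slot per virtual function, as in the answer's CAT_VTABLE_TYPE.
struct CatVtable {
    std::string (*speak)(const Cat*);
    std::string (*eat)(const Cat*);
};

std::string default_cat_speak(const Cat*)      { return "meow"; }
std::string default_cat_eat(const Cat*)        { return "eat"; }
std::string default_smallcat_speak(const Cat*) { return "meow2"; }

// One shared table per type; eat is not overridden by SmallCat.
const CatVtable kCatVtable      = { default_cat_speak, default_cat_eat };
const CatVtable kSmallCatVtable = { default_smallcat_speak, default_cat_eat };

struct Cat {
    const CatVtable* vptr;                    // the hidden member, made explicit
    std::vector<const CatVtable*> history;    // illustration only

    Cat() : vptr(&kCatVtable) { history.push_back(vptr); }
    std::string speak() const { return vptr->speak(this); }  // redirect via table
    std::string eat() const   { return vptr->eat(this); }
};

struct SmallCat : Cat {
    // Runs after Cat(): overwrites the vptr the base ctor stored.
    SmallCat() { vptr = &kSmallCatVtable; history.push_back(vptr); }
};

// Knows only about Cat, yet reaches the most derived speak.
std::string speak_via_base(const Cat* c) {
    assert(c != nullptr);
    return c->speak();
}
```

After constructing a SmallCat, its history holds the address of kCatVtable followed by the address of kSmallCatVtable, which mirrors the two-step construction described above.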

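All in all, it is admittedly very confusing, since the compiler magically renames functions at compile time. And in reality it is even more complicated than what I have shown here, because of multiple inheritance.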

How does the compiler generate code for a virtual function call?
