How does the compiler generate code for a virtual function call?

Asked: 2014-02-04 20:48:29

Tags: c++

[Image from the original post, not reproduced here]

CAT *p;
...
p->speak();
...

Some books say the compiler translates p->speak() into:

(*p->vptr[i])(p); //i is the idx of speak in the vtbl

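My question is: since the actual type of the object p points to cannot be known at compile time, there is no way to know which vptr or vtbl to use. So how does the compiler generate the correct code?

[Edit]

For example: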

void foo(CAT* c)
{
    c->speak();
    //if c points to a SmallCat
    // should translate to (*c->vptr[i])(c); //use vtbl at 0x1234
    //if c points to a CAT
    // should translate to (*c->vptr[i])(c); //use vtbl at 0x5678

    //since ps and pc are both CAT*, how can the compiler generate different code for them
    //at compile time?
}

...
CAT *ps,*pc;
ps = new SmallCat;  //suppose SmallCat's vtbl address is 0x1234;
pc = new CAT;       //suppose CAT's vtbl address is 0x5678;
...
foo(ps);
foo(pc);
...

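Any ideas? Thanks.

4 Answers:

Answer 0 (score: 20)

What your picture is missing is an arrow from the CAT and SmallCAT objects to their corresponding vtbls. The compiler embeds a pointer to the vtbl in the object itself; you can think of it as a hidden member variable. That is why adding the first virtual function is said to "cost" one pointer per object in memory footprint. The pointer to the vtbl is set by code in the constructor, so all a compiler-generated virtual call has to do at runtime to reach the right vtable is dereference the this pointer and read that hidden member.

Of course, things get more complicated with virtual and multiple inheritance: the compiler has to generate slightly different code, but the basic process stays the same.

Here is your example, explained in more detail: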

CAT *p1,*p2;
p1 = new SmallCat;  //suppose its vtbl address is 0x1234;
// The layout of SmallCat object includes a vptr as a hidden member.
// At this point, the value of this vptr is set to 0x1234.
p2 = new CAT;       //suppose its vtbl address is 0x5678;
// The layout of Cat object also includes a vptr as a hidden member.
// At this point, the value of this vptr is set to 0x5678.
(*p1->vptr[i])(p1); //should use vtbl at 0x1234
// Compiler has enough information to do that, because it squirreled away 0x1234
// inside the SmallCat object at the time it was constructed.
(*p2->vptr[i])(p2); //should use vtbl at 0x5678
// Same deal - the constructor saved 0x5678 inside the Cat, so we're good.
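
To make the point concrete, here is a minimal hand-rolled sketch of the same mechanism. It is not what any particular compiler emits; the names Cat, Method, kSpeak, cat_vtbl, smallcat_vtbl, make_cat and make_smallcat are all made up for illustration, and real vtable layouts are ABI-specific. What it shows is that foo is compiled once, and the only thing that differs between calls is the value stored in the object's vptr.

```cpp
#include <string>

// Hand-rolled stand-in for compiler-generated machinery; every name is hypothetical.
struct Cat;
using Method = std::string (*)(Cat*);   // type of one vtbl slot

struct Cat {
    const Method* vptr;                 // the hidden member, made explicit
};

std::string cat_speak(Cat*)      { return "meow"; }
std::string smallcat_speak(Cat*) { return "mew"; }

constexpr int kSpeak = 0;               // the index i of speak in the vtbl

// One table per class, shared by all objects of that class.
const Method cat_vtbl[]      = { cat_speak };
const Method smallcat_vtbl[] = { smallcat_speak };

// Stand-ins for the constructors: each stores its own class's table address.
Cat make_cat()      { return Cat{cat_vtbl}; }
Cat make_smallcat() { return Cat{smallcat_vtbl}; }

// Compiled once; the same code runs for every argument.
std::string foo(Cat* c) {
    return (*c->vptr[kSpeak])(c);       // load vptr, index into the table, call
}
```

Passing the address of an object built by make_cat() to foo returns "meow", while an object built by make_smallcat() returns "mew", even though foo itself never learns which kind of object it was given.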

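Answer 1 (score: 8)

"this means there is no way to know which vptr or vtbl to use"

At the time of the method call, that is correct. But at construction time the type of the object being constructed is known, and the compiler generates code in the ctor that initializes the vptr to point to the vtbl of the corresponding class. All later virtual method calls go through this vptr and so call the method in the right vtbl.

For more details on how this initialization works with base objects (multiple ctors called in sequence), see this answer to a similar question.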
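
The effect of running several ctors in sequence can be observed from ordinary C++. The sketch below is only an illustration with made-up names (trace, construct_small_cat); the comments about the vptr describe how typical implementations achieve the behavior, while the observable result (a virtual call made inside a base-class constructor runs the base-class version) is guaranteed by the language.

```cpp
#include <string>
#include <vector>

std::vector<std::string> trace;   // records which speak() ran, in order

struct Cat {
    Cat() { speak(); }            // on typical implementations the vptr still refers to Cat's table here
    virtual ~Cat() = default;
    virtual void speak() { trace.push_back("Cat::speak"); }
};

struct SmallCat : Cat {
    SmallCat() { speak(); }       // by now the vptr has been switched to SmallCat's table
    void speak() override { trace.push_back("SmallCat::speak"); }
};

std::vector<std::string> construct_small_cat() {
    trace.clear();
    SmallCat s;                   // runs Cat() first, then SmallCat()
    Cat* p = &s;
    p->speak();                   // ordinary virtual call after construction
    return trace;
}
```

construct_small_cat() returns "Cat::speak", "SmallCat::speak", "SmallCat::speak" in that order: the first entry comes from the base ctor, before the derived ctor has run.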

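Answer 2 (score: 6)

The compiler implicitly adds a pointer called vptr to every class that has one or more virtual functions.

You can tell by applying sizeof to such a class and seeing that the result is larger than you would expect by 4 or 8 bytes, depending on sizeof(void*).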
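
A quick way to check this yourself is sketched below. Plain, WithVirtual and vptr_overhead are hypothetical names, and object layout is implementation-defined, so the exact numbers are an assumption about mainstream compilers rather than something the standard promises.

```cpp
#include <cstddef>

struct Plain {
    int id;
    void speak() {}               // non-virtual: no hidden pointer needed
};

struct WithVirtual {
    int id;
    virtual void speak() {}       // first virtual function: the compiler adds a vptr
};

// Extra bytes attributable to the hidden pointer, plus any alignment padding.
constexpr std::size_t vptr_overhead() {
    return sizeof(WithVirtual) - sizeof(Plain);
}
```

On a typical 64-bit platform the difference is often 12 rather than 8, because the 8-byte pointer also forces the int to be padded out to the pointer's alignment.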

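The compiler also adds an implicit piece of code to the constructor of each class, which sets vptr to point to a table of function pointers (a.k.a. the V-Table).

When you instantiate an object, its type is explicitly "mentioned".

For example: A a(1) or A* p = new B(2)

So inside the constructor, at runtime, vptr can easily be set to point to the correct V-Table.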

In the example above:

  • The vptr of a is set to point to the V-Table of class A.

  • The vptr of p is set to point to the V-Table of class B.

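BTW, calling a constructor differs from calling any other function in that you have to name the object type explicitly in order to call it (which is why a constructor can never be declared virtual).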

Here is how the compiler generates the correct code for the virtual function p->speak():

CAT *p;
...
p = new SuperCat("SaberTooth",2); // p->vptr = SuperCat_Vtable
...
p->speak(); // See pseudo assembly code below

Ax = p               // Get the address of the instance
Bx = p->vptr         // Get the address of the instance's V-Table
Cx = Bx + CAT::speak // Add the number (index) of the function in its class
Dx = *Cx             // Get the address of the appropriate function
Push Ax              // Push the address of the instance onto the stack
Push Dx              // Push the address of the function onto the stack
CallF                // Save some registers and jump to the beginning of the function

The compiler uses the same number (index) for all speak functions in the class CAT hierarchy.

Here is how the compiler generates the correct code for the non-virtual function p->eat():

p->eat(); // See pseudo assembly code below

Ax = p        // Get the address of the instance
Bx = CAT::eat // Get the address of the function
Push Ax       // Push the address of the instance onto the stack
Push Bx       // Push the address of the function onto the stack
CallF         // Save some registers and jump to the beginning of the function

Since the address of the eat function is known at compile time, the assembly code is more efficient.

Finally, here is how 'vptr' is set at runtime to point to the correct V-Table:

class SmallCat
{
    void* vptr; // implicitly added by the compiler
    ...         // your explicit variables
    SmallCat()
    {
        vptr = (void*)0x1234; // implicitly added by the compiler
        ...                   // your explicit code
    }
};

When a SmallCat is instantiated, a new object is created with its vptr set to 0x1234.
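
The difference between the two kinds of call can also be seen from plain C++, without looking at assembly. In the sketch below (the helper names call_speak and call_eat are made up), speak is virtual and eat is not; SmallCat declares its own eat, which only hides the base version rather than overriding it.

```cpp
#include <string>

struct Cat {
    virtual ~Cat() = default;
    virtual std::string speak() { return "Cat::speak"; }  // resolved at run time through the object
    std::string eat() { return "Cat::eat"; }               // bound at compile time from the static type
};

struct SmallCat : Cat {
    std::string speak() override { return "SmallCat::speak"; }
    std::string eat() { return "SmallCat::eat"; }          // hides Cat::eat, does not override it
};

// Both helpers only know the static type Cat*.
std::string call_speak(Cat* p) { return p->speak(); }
std::string call_eat(Cat* p)   { return p->eat(); }
```

Given a SmallCat s, call_speak(&s) yields "SmallCat::speak" but call_eat(&s) yields "Cat::eat", because the target of the non-virtual call was fixed when call_eat was compiled.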

Answer 3 (score: 4)

When you write this (I've made all user code lowercase):

class cat {
public:
    virtual void speak() {std::cout << "meow\n";}
    virtual void eat() {std::cout << "eat\n";}
    virtual void destructor() {std::cout << "destructor\n";}
};

The compiler magically generates all of this (all of my example compiler code is uppercase):

class cat;
struct CAT_VTABLE_TYPE { //here's the cat's vtable type
    void(*speak)(cat* this); //contains a pointer for each virtual function
    void(*eat)(cat* this);
    void(*destructor)(cat* this);
};
extern CAT_VTABLE_TYPE CAT_VTABLE; //later is a global shared copy of the vtable
class cat { //here's the class you typed
private:
    CAT_VTABLE_TYPE* vptr; //but the compiler adds this magic member
public:
    cat() :vptr(&CAT_VTABLE) {} //the compiler initializes the vtable ptr
    ~cat() {vptr->destructor(this);} //redirects to the one you coded
    void speak() {vptr->speak(this);} //redirects to the one you coded
    void eat() {vptr->eat(this);} //redirects to the one you coded
};

//Here's the functions you programmed
void DEFAULT_CAT_SPEAK(cat* this) {std::cout << "meow\n";}
void DEFAULT_CAT_EAT(cat* this) {std::cout << "eat\n";}
void DEFAULT_CAT_DESTRUCTOR(cat* this) {std::cout << "destructor\n";}
//and the global cat vtable (shared by all cat objects)
const CAT_VTABLE_TYPE CAT_VTABLE = {
    DEFAULT_CAT_SPEAK, 
    DEFAULT_CAT_EAT, 
    DEFAULT_CAT_DESTRUCTOR};
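
Whoa, that's a lot, isn't it? (I actually cheated slightly, since I take the address of an object before it is defined, but this way there is less code and it is less confusing, even if it technically won't compile.) You can see why they built this into the language. And... here's SmallCat before: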

class smallcat : public cat {
public:
    virtual void speak() {std::cout << "meow2\n";}
    virtual void destructor() {std::cout << "destructor2\n";}
};

And after:

class smallcat;
//here's the smallcat's vtable type
struct SMALLCAT_VTABLE_TYPE : public CAT_VTABLE_TYPE { 
     //contains no additional virtual functions that cat didn't have
};
extern SMALLCAT_VTABLE_TYPE SMALLCAT_VTABLE; //later is a global shared copy of the vtable
class smallcat : public cat { //here's the class you typed
public:
    smallcat() :vptr(&SMALLCAT_VTABLE) {} //the compiler initializes the vtable ptr
    //The other functions already are virtual, nothing additional needed
};
//Here's the functions you programmed
void DEFAULT_SMALLCAT_SPEAK(cat* this) {std::cout << "meow2\n";}
void DEFAULT_SMALLCAT_DESTRUCTOR(cat* this) {std::cout << "destructor2\n";}
//and the global smallcat vtable (shared by all smallcat objects)
const SMALLCAT_VTABLE_TYPE SMALLCAT_VTABLE = {
    DEFAULT_SMALLCAT_SPEAK, 
    DEFAULT_CAT_EAT, //note: eat wasn't overridden
    DEFAULT_SMALLCAT_DESTRUCTOR};

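So, if that was too much to read: the compiler creates one VTABLE object per type, which points to the member functions for that particular type, and then sticks a pointer to that VTABLE inside each instance.

When you create a smallcat object, the compiler first constructs the cat parent object, which sets vptr to point to the CAT_VTABLE global. Immediately afterwards, the compiler constructs the smallcat derived part, which overwrites the vptr member so that it points to the SMALLCAT_VTABLE global.

When you call c->speak();, the compiler emits a call to its own copy of cat::speak (which looks like this->vptr->speak(this);). The vptr member may point to the global CAT_VTABLE or to the global SMALLCAT_VTABLE, so that table's speak pointer points either to DEFAULT_CAT_SPEAK (the code you put in cat::speak) or to DEFAULT_SMALLCAT_SPEAK (the code you put in smallcat::speak). So this->vptr->speak(this); ends up calling the function of the most derived type, whatever the most derived type is.

All in all, it is admittedly very confusing, because the compiler magically renames functions at compile time. And in reality it is even more complicated than what I've shown here, because of multiple inheritance.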
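
Since the code above is admittedly not compilable as written, here is a cut-down variant that should compile. It is a sketch under several assumptions, not the answer's exact code: the parameter is called self because this cannot be used as a parameter name, the functions return strings instead of printing, the destructor slot is omitted, both classes share one vtable type, and vptr_seen_on_entry is a made-up member that exists only to make the two-step vptr assignment observable.

```cpp
#include <string>

struct cat;
struct cat_vtable {                       // one slot per virtual function
    std::string (*speak)(cat* self);
    std::string (*eat)(cat* self);
};

std::string default_cat_speak(cat*)      { return "meow"; }
std::string default_cat_eat(cat*)        { return "eat"; }
std::string default_smallcat_speak(cat*) { return "meow2"; }

// One global table per type, shared by all instances of that type.
const cat_vtable CAT_VTABLE      = { default_cat_speak,      default_cat_eat };
const cat_vtable SMALLCAT_VTABLE = { default_smallcat_speak, default_cat_eat };  // eat not overridden

struct cat {
    const cat_vtable* vptr;               // the "magic" member
    cat() : vptr(&CAT_VTABLE) {}
    std::string speak() { return vptr->speak(this); }  // redirect through the table
    std::string eat()   { return vptr->eat(this); }
};

struct smallcat : cat {
    const cat_vtable* vptr_seen_on_entry; // demo only: what the cat ctor left behind
    smallcat() : vptr_seen_on_entry(vptr) { vptr = &SMALLCAT_VTABLE; }
};
```

With a smallcat s accessed through a cat* p, p->speak() returns "meow2" and p->eat() returns "eat", while s.vptr_seen_on_entry still holds the address of CAT_VTABLE, showing that the base constructor set the pointer first and the derived constructor then overwrote it.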