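C++ and its type system: how to deal with data of multiple types?

Date: 2010-04-23 08:23:36

Tags: c++ interpreter typing

"Introduction"

I am relatively new to C++. I have worked through all the basics and managed to build 2-3 simple interpreters for my programming languages.

The first thing that gave me a headache: implementing my language's type system in C++

Think about it: Ruby, Python, PHP and co. have a lot of built-in types which are obviously implemented in C. So what I tried first was to let a value in my language have one of three possible types: Int, String and Nil.

I came up with this: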

enum ValueType
{
     Int, String, Nil
};

class Value
{
 public:
  ValueType type;
  int intVal;
  string stringVal;
};
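Yes, wow, I know. Passing this class around was extremely slow, because the string allocator had to be called all the time.

The next time I tried something like this: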

enum ValueType
{
     Int, String, Nil
};

extern string stringTable[255];
class Value
{
 public:
  ValueType type;
  int index;
};

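I would store all strings in stringTable and write their position to index. If the type of the Value was Int, I just stored the integer itself in index; it would make no sense at all to use an int index to access another int, would it?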
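The string-table idea described here is usually called string interning. Below is a minimal sketch of it, assuming a hypothetical StringTable class (the name and its member functions are not from the original code) that grows on demand instead of using a fixed array of 255 entries and that hands out the same index for equal strings:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical interning table: each distinct string is stored once and
// referred to by a small integer index.
class StringTable
{
public:
  // Returns the index of s, adding it to the table on first use.
  std::size_t intern (const std::string &s)
  {
    std::map<std::string, std::size_t>::const_iterator it = index_.find (s);
    if (it != index_.end ())
      return it->second;
    strings_.push_back (s);
    index_[s] = strings_.size () - 1;
    return strings_.size () - 1;
  }

  // Returns the string stored at index i, which must come from intern().
  // The reference is only valid until the next call to intern().
  const std::string &lookup (std::size_t i) const
  {
    assert (i < strings_.size ());
    return strings_[i];
  }

  std::size_t size () const { return strings_.size (); }

private:
  std::vector<std::string> strings_;          // index -> string
  std::map<std::string, std::size_t> index_;  // string -> index
};
```

With a table like this a Value stays two machine words in size and can be copied cheaply, at the cost of one map lookup whenever a new string enters the interpreter.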

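Anyway, the above gave me a headache too. After a while, accessing a string from the table here, referencing it there and copying it somewhere else grew over my head and I lost control. I had to put the interpreter draft down.

Now: OK, so C and C++ are statically typed.

  • How do the main implementations of the languages mentioned above handle the different types in their programs (fixnums, bignums, nums, strings, arrays, resources, ...)?

  • What should I do to get maximum speed with many different available types?

  • How do these solutions compare to my simplified versions above?

5 answers:

Answer 0: (score: 4)

One obvious solution is to define a type hierarchy: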

class Type
{
};

class Int : public Type
{
};

class String : public Type
{
};

And so on. As a complete example, let us write an interpreter for a tiny language. The language allows variables to be declared like this:

var a 10

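This creates an Int object, assigns it the value 10 and stores it in the variable table under the name a. Operations can be invoked on variables. For instance, the addition of two Int values looks like this: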

+ a b

Here is the complete code of the interpreter:

#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <cstdlib>
#include <map>
#include <typeinfo>  // std::bad_cast

// The base Type object from which all data types are derived.
class Type
{
public:
  typedef std::vector<Type*> TypeVector;
  virtual ~Type () { }

  // Some functions that you may want all types of objects to support:

  // Returns the string representation of the object.
  virtual const std::string toString () const = 0;
  // Returns true if other_obj is the same as this.
  virtual bool equals (const Type &other_obj) = 0;
  // Invokes an operation on this object with the objects in args
  // as arguments.
  virtual Type* invoke (const std::string &opr, const TypeVector &args) = 0;
};

// An implementation of Type to represent an integer. The C++ int is
// used to actually store the value.  As a consequence this type is
// machine dependent, which might not be what you want for a real
// high-level language.
class Int : public Type
{
public:
  Int () : value_ (0), ret_ (NULL) { }
  Int (int v) : value_ (v), ret_ (NULL) { }
  Int (const std::string &v) : value_ (atoi (v.c_str ())), ret_ (NULL) { }
  virtual ~Int ()
  {
    delete ret_;
  }
  virtual const std::string toString () const
  {
    std::ostringstream out;
    out << value_;
    return out.str ();
  }
  virtual bool equals (const Type &other_obj)
  {    
    if (&other_obj == this) 
      return true;
    try
      {
        const Int &i = dynamic_cast<const Int&> (other_obj);
        return value_ == i.value_;
      }
    catch (const std::bad_cast &)
      {
        // other_obj is not an Int, so the objects cannot be equal.
        return false;
      }
  }
  // As of now, Int supports only addition, represented by '+'.
  virtual Type* invoke (const std::string &opr, const TypeVector &args)    
  {
    if (opr == "+")
      {
        return add (args);
      }
    return NULL;
  }
private:
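  Type* add (const TypeVector &args)
  {
    // Addition needs an argument, and that argument must be an Int.
    if (args.empty ()) return NULL;
    Int *arg = dynamic_cast<Int*> (args[0]);
    if (arg == NULL) return NULL;
    if (ret_ == NULL) ret_ = new Int;
    Int *i = dynamic_cast<Int*> (ret_);
    i->value_ = value_ + arg->value_;
    return ret_;
  }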
  int value_;
  Type *ret_;
};

// We use std::map as a symbol (or variable) table.
typedef std::map<std::string, Type*> VarsTable;
typedef std::vector<std::string> Tokens;

// A simple tokenizer for our language. Takes a line and
// tokenizes it based on whitespaces.  
static void
tokenize (const std::string &line, Tokens &tokens)
{
  std::istringstream in (line);
  std::string token;
  // Extraction fails at the end of the input, so trailing whitespace
  // does not add an empty token.
  while (in >> token)
    tokens.push_back (token);
}

// Maps varName to an Int object in the symbol table.  To support
// other Types, we need a more complex interpreter that actually infers
// the type of object by looking at the format of value.
static void
setVar (const std::string &varName, const std::string &value,
        VarsTable &vars)
{
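  Type *t = new Int (value);
  // Free any object previously bound to this name before rebinding it.
  Type *&slot = vars[varName];
  delete slot;
  slot = t;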
}

// Returns a previously mapped value from the symbol table.
static Type *
getVar (const std::string &varName, const VarsTable &vars)
{
  VarsTable::const_iterator iter = vars.find (varName);
  if (iter == vars.end ())
    {
      std::cout << "Variable " << varName 
                << " not found." << std::endl;
      return NULL;
    }
  return const_cast<Type*> (iter->second);
}

// Invokes opr on the object mapped to the name var01.
// opr should represent a binary operation. var02 will
// be pushed to the args vector. The string representation of
// the result is printed to the console.
static void
invoke (const std::string &opr, const std::string &var01,
        const std::string &var02, const VarsTable &vars)
{
  Type::TypeVector args;
  Type *arg01 = getVar (var01, vars);
  if (arg01 == NULL) return;
  Type *arg02 = getVar (var02, vars);
  if (arg02 == NULL) return;
  args.push_back (arg02);
  Type *ret = NULL;
  if ((ret = arg01->invoke (opr, args)) != NULL)
    std::cout << "=> " << ret->toString () << std::endl;
  else
    std::cout << "Failed to invoke " << opr << " on " 
              << var01 << std::endl;
}

// A simple REPL for our language. Type 'quit' to exit
// the loop.
int 
main (int argc, char **argv)
{
  VarsTable vars;
  std::string line;
  while (std::getline (std::cin, line))
    {
      if (line == "quit")
        break;
      else
        {
          Tokens tokens;
          tokenize (line, tokens);
          if (tokens.size () != 3)
            {
              std::cout << "Invalid expression." << std::endl;
              continue;
            }
          if (tokens[0] == "var")
            setVar (tokens[1], tokens[2], vars);
          else
            invoke (tokens[0], tokens[1], tokens[2], vars);
        }
    }  
  return 0;
}

Sample interaction with the interpreter:

/home/me $ ./mylang

var a 10
var b 20
+ a b
=> 30
+ a c
Variable c not found.
quit
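The comment on setVar above notes that supporting other types requires inferring the type from the format of the value. The following is only a sketch of that idea, using hypothetical stand-in names (Value, IntValue, StringValue, looksLikeInt, makeValue) that are not part of the answer's code. It assumes a deliberately simple rule: an optional sign followed by digits is an Int, anything else is a String.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Minimal stand-ins for the answer's Type hierarchy (hypothetical names).
class Value
{
public:
  virtual ~Value () { }
  virtual std::string typeName () const = 0;
};

class IntValue : public Value
{
public:
  explicit IntValue (int v) : value (v) { }
  virtual std::string typeName () const { return "Int"; }
  int value;
};

class StringValue : public Value
{
public:
  explicit StringValue (const std::string &v) : value (v) { }
  virtual std::string typeName () const { return "String"; }
  std::string value;
};

// True if text is an optional sign followed by one or more digits.
bool
looksLikeInt (const std::string &text)
{
  std::size_t i = 0;
  if (!text.empty () && (text[0] == '-' || text[0] == '+'))
    i = 1;
  if (i == text.size ())
    return false;
  for (; i < text.size (); ++i)
    if (!std::isdigit (static_cast<unsigned char> (text[i])))
      return false;
  return true;
}

// Creates the object that matches the literal; the caller owns the result.
Value *
makeValue (const std::string &text)
{
  if (looksLikeInt (text))
    return new IntValue (std::atoi (text.c_str ()));
  return new StringValue (text);
}
```

In the interpreter above, setVar could call a function like makeValue instead of always constructing an Int. A real language would move this decision into the tokenizer, which already knows whether it read a number or a quoted string.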

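Answer 1: (score: 4)

There are a few different things you can do here. Different solutions have come up over time, and most of them require dynamic allocation of the actual datum (boost::variant can avoid using dynamically allocated memory for small objects; thanks @MSalters).

Pure C approach:

Store the type information (usually an enum) together with a void pointer to memory that has to be interpreted according to that type information: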

#include <stdlib.h>

typedef enum type {
   integer,
   string,
   null
} type_t;
typedef struct variable {
   type_t type;
   void * datum;
} variable_t;
void init_int_variable( variable_t * var, int value )
{
   var->type = integer;
   var->datum = malloc( sizeof(int) );
   if ( var->datum != NULL )   /* malloc can fail */
      *((int *)var->datum) = value;
}
void fini_variable( variable_t var ) // optionally by pointer
{
   free( var.datum );
}

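In C++ you can improve on this approach by using classes to simplify usage, but more importantly you can go for more complete solutions and use existing libraries such as boost::any or boost::variant, which offer different solutions to the same problem.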

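boost::any stores the value in dynamically allocated memory, usually through a pointer to a virtual class in a hierarchy, and provides operators that reinterpret (down-cast) it to the concrete type. boost::variant, by contrast, normally keeps the value inside the variant object itself, in storage large enough for the biggest of its listed types.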
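To make the variant idea concrete without depending on Boost, here is a sketch that uses std::variant, the C++17 standard-library counterpart of boost::variant. This is a substitution for illustration, not what the answer itself uses, and it assumes a C++17 compiler. The names Value, describe and add are made up for this example.

```cpp
#include <string>
#include <variant>

// Nil is modelled with std::monostate; the alternatives mirror the
// question's Int, String and Nil.
typedef std::variant<std::monostate, int, std::string> Value;

// Returns a printable description of whatever v currently holds.
std::string describe (const Value &v)
{
  if (const int *i = std::get_if<int> (&v))
    return "Int " + std::to_string (*i);
  if (const std::string *s = std::get_if<std::string> (&v))
    return "String " + *s;
  return "Nil";
}

// Adds two values if both hold an Int; otherwise returns Nil.
Value add (const Value &a, const Value &b)
{
  const int *x = std::get_if<int> (&a);
  const int *y = std::get_if<int> (&b);
  if (x && y)
    return Value (*x + *y);
  return Value ();
}
```

The variant records which alternative is active, so the separate enum tag and the void pointer of the C version are no longer needed, and an Int never touches the heap.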

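Answer 2: (score: 1)

Regarding speed, you say:

  It was extremely slow to pass this class around as the string allocator had to be called all the time.

You do know that you should be passing objects by reference the vast majority of the time? Your solution looks workable for a simple interpreter.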
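The difference can be made visible with a small sketch. It reuses the shape of the question's Value class and adds a copy counter, which is an assumption made only for this illustration, as are the two function names.

```cpp
#include <cstddef>
#include <string>

enum ValueType { Int, String, Nil };

// Same members as the question's Value class, plus a counter (added for
// this sketch) so that copies can be observed.
class Value
{
public:
  Value () : type (Nil), intVal (0) { }
  Value (const Value &other)
    : type (other.type), intVal (other.intVal), stringVal (other.stringVal)
  {
    ++copies;
  }
  ValueType type;
  int intVal;
  std::string stringVal;
  static int copies;  // number of copy constructions so far
};

int Value::copies = 0;

// Copies the whole object, including the string, on every call.
std::size_t lengthByValue (Value v) { return v.stringVal.size (); }

// Only binds a reference: no copy and no string allocation.
std::size_t lengthByReference (const Value &v) { return v.stringVal.size (); }
```

Calling lengthByReference leaves Value::copies unchanged, while each call to lengthByValue with a named Value increases it by one and copies the string along with it.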

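Answer 3: (score: 1)

C++ is a strongly typed language. I can see that you are coming from a non-typed language and still think in those terms.

If you really need to store several types in one variable, take a look at boost::any.

However, if you are implementing an interpreter, you should be using inheritance and classes that represent a specific type.

Answer 4: (score: 0)

Going by Vijay's solution, the implementation would be: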

Type** array;
// to initialize the array (an array of pointers to Type)
array = new Type*[size_of_array];
// when you want to add values
array[0] = new Int(42);
// to add another string value
array[1] = new String("forty two");

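What is missing from his code is how to extract those values... Here is my version (actually I learned this from Ogre and modified it to my liking).

Usage is something like: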

Any array[4];
// Automatically understands it's an integer
array[0] = Any(1);
// But let's say you want the number to be thought of as float
array[1] = Any(2.0f);
// What about string? Construct the std::string explicitly.
array[2] = Any(std::string("forty two"));
// Note that a string literal is not a std::string: here it is
// stored as a const char* instead
array[3] = Any(static_cast<const char*>("Sometimes it just turns out to be what you don't want!"));

OK, now to check whether a particular element is a string:

if(array[2].isType<std::string>())
{
   // Extract the string value.
   std::string val = array[2].cast<std::string>();
   // Make the string do your bidding!!!... /evilgrin
   // WAIT! But what if you want to directly manipulate
   // the value in the array?
   std::string& val1 = array[2].cast<std::string>();
   // HOHOHO... now any changes to val1 affects the value
   // in the array ;)
}

The code for the Any class is given below. Feel free to use it however you like :). Hope this helps!

In the header file... say Any.h

    #include <typeinfo>
    #include <exception>

    /*
     *    \class Any
     *    \brief A variant type to hold any type of value.
     *    \detail This class can be used to store values whose types are not 
     *        known before hand, like to store user-data.
     */
    class Any
    {
    public:
        /*!
         *    \brief Default constructor. 
         */

    Any(void);

    /*!
     *    \brief Constructor that accepts a default user-defined value.
     *    \detail This constructor copies that user-defined value into a
     *        place holder. This constructor is explicit so that the compiler
     *        does not call it implicitly when the user did not want
     *        the conversion to happen.
     *    \param val const reference to the value to be stored.
     */
    template <typename ValueType>
    explicit Any(const ValueType& val);

    /*!
     *    \brief Copy constructor.
     *    \param other The \c Any variable to be copied into this.
     */
    Any(const Any& other);

    /*!
     *    \brief Destructor, does nothing other than destroying the place holder.
     */
    ~Any(void);

    /*!
     *    \brief Gets the type of the value stored by this class.
     *    \detail This function uses typeid operator to determine the type
     *        of the value it stores.
     *    \remarks If the place holder is empty it will return Touchscape::VOID_TYPE.
     *        It is wise to check if this is empty by using the function Any::isEmpty().
     */
    const std::type_info& getType() const;

    /*!
     *    \brief Function to verify type of the stored value.
     *    \detail This function can be used to verify the type of the stored value.
     *    Usage:
     *    \code
     *    int i;
     *    Touchscape::Any int_any(i);
     *    // Later in your code...
     *    if (int_any.isType<int>())
     *    {
     *        // Do something with int_any.
     *    }
     *    \endcode
     *    \return \c true if the type matches, false otherwise.
     */
    template <typename T>
    bool isType() const;

    /*!
     *    \brief Checks if the type stored can be converted 'dynamically'
     *        to the requested type.
     *    \detail This is useful when the type stored is a base class
     *        and you would like to verify if it can be converted to the
     *        type the user wants.
     *    Example:
     *    \code
     *    class Base
     *    {
     *        // class implementation.
     *    };
     *    class Derived : public Base
     *    {
     *        // class implementation.
     *    };
     *
     *    // In your implementation function.
     *    {
     *        //...
     *        // Somewhere in your code.
     *        Base* a = new Derived();
     *        Touchscape::Any user_data(a);
     *        my_object.setUserData(user_data);
     *        // Then when you need to know the user-data type
     *        if(my_object.getUserData().isDynamicType<Derived>())
     *        {
     *            // Do something with the user data
     *        }
     *    }
     *    \endcode
     *    \return \c true if the value stored can be dynamically casted to the target type.
     *    \deprecated This function will be removed and/or changed in the future.
     */
    template <typename T>
    bool isDynamicType() const;

    /*!
     *    \brief Convert the value stored to the required type.
     *    \detail This function is used just like a static-cast to retrieve
     *        the stored value.
     *    \return A reference to the stored value.
     *    \warning This function will throw std::bad_cast exception if it
     *        finds the target type to be incorrect.
     */
    template <typename T>
    T& cast();

    /*!
     *    \brief Convert the value stored to the required type (const version).
     *    \detail This function is used just like static_cast to retrieve
     *        the stored value.
     *    \return A \c const reference to the stored value.
     *    \warning This function will throw std::bad_cast exception if it
     *        finds the target type to be incorrect.
     */
    template <typename T>
    const T& cast() const;

    /*!
     *    \brief Dynamically converts the stored value to the target type
     *    \detail This function is just like dynamic_cast to retrieve
     *        the stored value to the target type.
     *    \return A reference to the stored value.
     *    \warning This function will throw std::bad_cast exception if it
     *        finds that the value cannot be dynamically converted to the target type.
     *    \deprecated This function will be removed and/or changed in the future.
     */
    template <typename T>
    T& dynamicCast();

    /*!
     *    \brief Dynamically converts the stored value to the target type (const version)
     *    \detail This function is just like dynamic_cast to retrieve
     *        the stored value to the target type.
     *    \return A const reference to the stored value.
     *    \warning This function will throw std::bad_cast exception if it
     *        finds that the value cannot be dynamically converted to the target type.
     *    \deprecated This function will be removed and/or changed in the future.
     */
    template <typename T>
    const T& dynamicCast() const;

    /*!
     *    \brief Swaps the contents with another \c Any variable.
     *    \return reference to this instance.
     */
    Any& swap(Any& other);

    /*!
     *    \brief Checks if the place holder is empty.
     *    \return \c true if the place holder is empty, \c false otherwise.
     */
    bool isEmpty() const;

    /*!
     *    \brief Checks if the place holder is \b not empty.
     *    \return \c true if the place holder is not empty, \c false otherwise.
     *    \remarks This is just a lazy programmer's attempt to make the code look elegant.
     */
    bool isNotEmpty() const;

    /*!
     *    \brief Assignment operator
     *    \detail Assigns a 'raw' value to this instance.
     *    \return Reference to this instance after assignment.
     */
    template <typename ValueType>
    Any& operator = (const ValueType& rhs);

    /*!
     *    \brief Default assignment operator
     *    \detail Assigns another \c Any type to this one.
     *    \return Reference to this instance after assignment.
     */
    Any& operator = (const Any& rhs);

    /*!
     *    \brief Boolean equality operator
     */
    bool operator == (const Any& other) const;

    /*!
     *    \brief Boolean equality operator that accepts a 'raw' type.
     */
    template<typename ValueType>
    bool operator == (const ValueType& other) const;

    /*!
     *    \brief Boolean inequality operator
     */
    bool operator != (const Any& other) const;

    /*!
     *    \brief Boolean inequality operator that accepts a 'raw' type.
     */
    template<typename ValueType>
    bool operator != (const ValueType& other) const;

    protected:
    /*!
     *    \class PlaceHolder
     *    \brief The place holder base class
     *    \detail The base class for the actual 'type'd class that stores
     *        the value for Touchscape::Any.
     */
    class PlaceHolder
    {
    public:

        /*!
         *    \brief Virtual destructor.
         */
        virtual ~PlaceHolder(){}

        /*!
         *    \brief Gets the \c type_info of the value stored.
         *    \return (const std::type_info&) The typeid of the value stored.
         */
        virtual const std::type_info& getType() const = 0;
        /*!
         *    \brief Clones this instance.
         *    \return (PlaceHolder*) Cloned instance.
         */
        virtual PlaceHolder* clone() const = 0;
    };

    /*!
     *    \class PlaceHolderImpl
     *    \brief The class that ultimately keeps hold of the value stored
     *        in Touchscape::Any.
     */
    template <typename ValueType>
    class PlaceHolderImpl : public PlaceHolder
    {
    public:
        /*!
         *    \brief The only constructor allowed.
         *    \param val The value to store.
         */
        PlaceHolderImpl(const ValueType& val)
            :m_value(val){}
        /*!
         *    \brief The destructor.
         *    \detail Does nothing
         */
        ~PlaceHolderImpl(){}

        /*!
         *    \copydoc Touchscape::PlaceHolder::getType()
         */
        const std::type_info& getType() const
        {
            return typeid(ValueType);
        }

        /*!
         *    \copydoc Touchscape::PlaceHolder::clone()
         */
        PlaceHolder* clone() const
        {
            return new PlaceHolderImpl<ValueType>(m_value);
        }

        ValueType m_value;
    };

    PlaceHolder* m_content;
};

/************************************************************************/
/* Template code implementation section                                 */
/************************************************************************/
template <typename ValueType>
Any::Any(const ValueType& val)
    :m_content(new PlaceHolderImpl<ValueType>(val))
{
}
//---------------------------------------------------------------------
template <typename T>
bool Any::isType() const
{
    bool result = m_content?m_content->getType() == typeid(T):false;
    return result;
}
//---------------------------------------------------------------------
template <typename T>
bool Any::isDynamicType() const
{
    bool result = m_content
        ?dynamic_cast<T>(static_cast<PlaceHolderImpl<T>*>(m_content)->m_value)!=NULL
        :false;
    return result;
}
//---------------------------------------------------------------------
template <typename T>
T& Any::cast()
{
    // isType<T>() is false for an empty Any, so no separate check for
    // the void type is needed here.
    if (isType<T>())
    {
        T& result = static_cast<PlaceHolderImpl<T>*>(m_content)->m_value;
        return result;
    }
    // Standard std::bad_cast has no constructor that takes a message,
    // so the exception is thrown without one.
    throw std::bad_cast();
}
//---------------------------------------------------------------------
template <typename T>
const T& Any::cast() const
{
    Any& _this = const_cast<Any&>(*this);
    return _this.cast<T>();
}
//---------------------------------------------------------------------
template <typename T>
T& Any::dynamicCast()
{
    T* result = dynamic_cast<T>(static_cast<PlaceHolderImpl<T>*>(m_content)->m_value);
    if (result == NULL)
    {
        StringStream ss;
        ss<<"Cannot convert '"<<getType().name()<<"' to '"<<typeid(T)<<"'.";
        throw std::bad_cast(ss.str().c_str());
    }
    return *result;
}
//---------------------------------------------------------------------
template <typename T>
const T& Any::dynamicCast() const
{
    Any& _this = const_cast<Any&>(*this);
    return _this.dynamicCast<T>();
}
//---------------------------------------------------------------------
template <typename ValueType>
Any& Any::operator = (const ValueType& rhs)
{
    Any(rhs).swap(*this);
    return *this;
}
//---------------------------------------------------------------------
template <typename ValueType>
bool Any::operator == (const ValueType& rhs) const
{
    // Equal only when a value of exactly this type is stored and it
    // compares equal to rhs.
    bool result = isType<ValueType>() && cast<ValueType>() == rhs;
    return result;
}
//---------------------------------------------------------------------
template <typename ValueType>
bool Any::operator != (const ValueType& rhs) const
{
    bool result = !(*this == rhs);
    return result;
}

Now in the CPP file... Any.cpp

#include "Any.h"
#include <algorithm>  // std::swap

static const std::type_info& VOID_TYPE(typeid(void));

Any::Any( void )
    :m_content(NULL)
{
}
//---------------------------------------------------------------------
Any::Any( const Any& other )
    :m_content(other.m_content?other.m_content->clone():NULL)
{
}
//---------------------------------------------------------------------
Any::~Any( void )
{
    // Deleting a null pointer is a no-op, so no check is needed.
    delete m_content;
    m_content = NULL;
}
//---------------------------------------------------------------------
const std::type_info& Any::getType() const
{
    return m_content?m_content->getType():VOID_TYPE;
}
//---------------------------------------------------------------------
Any& Any::swap( Any& other )
{
    std::swap(m_content, other.m_content);
    return *this;
}
//---------------------------------------------------------------------
Any& Any::operator=( const Any& rhs )
{
    Any(rhs).swap(*this);
    return *this;
}
//---------------------------------------------------------------------
bool Any::isEmpty() const
{
    bool is_empty = m_content == NULL;
    return is_empty;
}
//---------------------------------------------------------------------
bool Any::isNotEmpty() const
{
    bool is_not_empty = m_content != NULL;
    return is_not_empty;
}
//---------------------------------------------------------------------
bool Any::operator==( const Any& other ) const
{
    bool result = m_content == other.m_content;
    return result;
}
//---------------------------------------------------------------------
bool Any::operator!=( const Any& other ) const
{
    bool result = m_content != other.m_content;
    return result;
}