Why do we need boxing and unboxing in C#?
I know what boxing and unboxing are, but I can't see their practical use. Why and where should I use them?
short s = 25;
object objshort = s; //Boxing
short anothershort = (short)objshort; //Unboxing
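Answer 0 (score: 452)
Why
To have a unified type system and to allow value types to have a completely different representation of their underlying data from the way reference types represent theirs (e.g., an int is just a bucket of thirty-two bits, which is completely different from a reference type).
Think of it like this. You have a variable o of type object. Now you have an int and you want to put it into o. o is a reference to something somewhere, and the int is emphatically not a reference to something somewhere (after all, it's just a number). So what you do is this: you make a new object that can store the int, and then you assign a reference to that object to o. We call this process "boxing."
So, if you don't care about having a unified type system (i.e., reference types and value types have very different representations and you don't want a common way to "represent" the two), then you don't need boxing. If you don't care about having int represent its underlying value directly (i.e., you would instead make int a reference type too and just store a reference to the underlying value), then you don't need boxing.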
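Where should I use it?
For example, the old collection type ArrayList only eats objects. That is, it only stores references to things that live somewhere. Without boxing you cannot put an int into such a collection. But with boxing, you can.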
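A small sketch of the ArrayList case, assuming nothing beyond the standard System.Collections API. The class and method names (ArrayListDemo, SumBoxed) are made up for the example.

```csharp
using System;
using System.Collections;

public static class ArrayListDemo
{
    // Adds ints to a non-generic ArrayList and sums them back out.
    public static int SumBoxed(params int[] values)
    {
        var list = new ArrayList();
        foreach (int v in values)
        {
            list.Add(v);       // Add takes object, so each int is boxed here
        }

        int sum = 0;
        foreach (object item in list)
        {
            sum += (int)item;  // explicit unboxing on the way out
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumBoxed(1, 2, 3)); // 6
    }
}
```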
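Now, in the days of generics you don't really need this and can generally go merrily along without thinking about the issue. But there are a few caveats to be aware of:
This is correct: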
double e = 2.718281828459045;
int ee = (int)e;
This is not:
double e = 2.718281828459045;
object o = e; // box
int ee = (int)o; // runtime exception
Instead, you must do this:
double e = 2.718281828459045;
object o = e; // box
int ee = (int)(double)o;
First, we have to explicitly unbox the double ((double)o) and then cast that to an int.
What is the result of the following:
double e = 2.718281828459045;
double d = e;
object o1 = d;
object o2 = e;
Console.WriteLine(d == e);
Console.WriteLine(o1 == o2);
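Think about it for a second before going on to the next sentence.
If you said True and False, great! Wait, what? That's because == on reference types uses reference equality, which checks whether the references are equal, not whether the underlying values are equal. This is a dangerously easy mistake to make. Perhaps even more subtle: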
double e = 2.718281828459045;
object o1 = e;
object o2 = e;
Console.WriteLine(o1 == o2);
will also print False!
Better to say:
Console.WriteLine(o1.Equals(o2));
which will then, thankfully, print True.
One last subtlety:
[struct|class] Point {
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
Point p = new Point(1, 1);
object o = p;
p.x = 2;
Console.WriteLine(((Point)o).x);
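What is the output? It depends! If Point is a struct then the output is 1, but if Point is a class then the output is 2! A boxing conversion makes a copy of the value being boxed, which explains the difference in behavior.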
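Since [struct|class] is not valid C#, here is a runnable version of the same experiment with both variants side by side. The names PointStruct, PointClass and PointDemo are invented for this sketch.

```csharp
using System;

public struct PointStruct
{
    public int x, y;
    public PointStruct(int x, int y) { this.x = x; this.y = y; }
}

public class PointClass
{
    public int x, y;
    public PointClass(int x, int y) { this.x = x; this.y = y; }
}

public static class PointDemo
{
    public static int ViaStruct()
    {
        PointStruct p = new PointStruct(1, 1);
        object o = p;   // boxing: o refers to a copy of p
        p.x = 2;        // changes p only, not the boxed copy
        return ((PointStruct)o).x;
    }

    public static int ViaClass()
    {
        PointClass p = new PointClass(1, 1);
        object o = p;   // no boxing: o and p refer to the same object
        p.x = 2;
        return ((PointClass)o).x;
    }

    public static void Main()
    {
        Console.WriteLine(ViaStruct()); // 1
        Console.WriteLine(ViaClass());  // 2
    }
}
```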
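Answer 1 (score: 50)
In the .NET framework, there are two species of types: value types and reference types. This is relatively common in OO languages.
One of the important features of object-oriented languages is the ability to handle instances in a type-agnostic manner. This is referred to as polymorphism. Since we want to take advantage of polymorphism, but we have two different species of types, there has to be some way to bring them together so we can handle one or the other the same way.
Now, back in the olden days (1.0 of Microsoft .NET), there wasn't this newfangled generics hullabaloo. You couldn't write a method with a single argument that could service both a value type and a reference type. That's a violation of polymorphism. So boxing was adopted as a means to coerce a value type into an object.
If this weren't possible, the framework would be littered with methods and classes whose only purpose was to accept the other species of type. Not only that, but since value types don't truly share a common type ancestor, you'd have to have a different method overload for each value type (bit, byte, int16, int32, etc.).
Boxing prevented this from happening. And that's why the British celebrate Boxing Day.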
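As a rough illustration of that point, a single method taking object can serve both species of type, with the compiler boxing value-type arguments for you. Describe and DescribeDemo are hypothetical names for this sketch.

```csharp
using System;

public static class DescribeDemo
{
    // One method serves value types and reference types alike,
    // because value-type arguments are boxed into object at the call site.
    public static string Describe(object value) =>
        value.GetType().Name + ": " + value;

    public static void Main()
    {
        Console.WriteLine(Describe(42));      // Int32: 42      (boxed)
        Console.WriteLine(Describe(true));    // Boolean: True  (boxed)
        Console.WriteLine(Describe("hello")); // String: hello  (no boxing needed)
    }
}
```

Without boxing, Describe would need one overload per value type.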
Answer 2 (score: 30)
The best way to understand this is to look at lower-level programming languages C# builds on.
In a simplified picture of the lowest-level languages like C, local variables go one place: The Stack. Each time you declare a variable it goes on the Stack. In this picture they are primitive values, like a bool, a byte, a 32-bit int, a 32-bit uint, etc. The Stack is both simple and fast. As variables are added they just go one on top of another, so the first you declare sits at say, 0x00, the next at 0x01, the next at 0x02 in RAM, etc. In addition, variables are often pre-addressed at compile-time, so their address is known before you even run the program.
In the next level up, like C++, a second memory structure called the Heap is introduced. You still mostly live in the Stack, but special ints called Pointers can be added to the Stack, that store the memory address for the first byte of an Object, and that Object lives in the Heap. The Heap is kind of a mess and somewhat expensive to maintain, because unlike Stack variables they don't pile linearly up and then down as a program executes. They can come and go in no particular sequence, and they can grow and shrink.
Dealing with pointers is hard. They're the cause of memory leaks, buffer overruns, and frustration. C# to the rescue.
At a higher level, C#, you don't need to think about pointers - the .Net framework (written in C++) thinks about these for you and presents them to you as References to Objects, and for performance, lets you store simpler values like bools, bytes and ints as Value Types. Underneath the hood, Objects and stuff that instantiates a Class go on the expensive, Memory-Managed Heap, while Value Types held in local variables typically go in that same Stack you had in low-level C - super-fast. (A Value Type that is a field of a Class instance lives inside that object on the Heap.)
For the sake of keeping the interaction between these 2 fundamentally different concepts of memory (and strategies for storage) simple from a coder's perspective, Value Types can be Boxed at any time. Boxing causes the value to be copied from the Stack, put in an Object, and placed on the Heap - more expensive, but, fluid interaction with the Reference world. As other answers point out, this will occur when you for example say:
bool b = false; // Cheap, on Stack
object o = b; // Legal, easy to code, but complex - Boxing!
bool b2 = (bool)o; // Unboxing!
A strong illustration of the advantage of Boxing is a check for null:
if (b == null) // Always false (compiler warning CS0472) - a bool can't be null
if (o == null) // Compiles without complaint; false here because o refers to the boxed bool
Our object o is technically an address in the Stack that points to a copy of our bool b, which has been copied to the Heap. We can check o for null because the bool's been Boxed and put there.
In general you should avoid Boxing unless you need it, for example to pass an int/bool/whatever as an object to an argument. There are some basic structures in .Net that still demand passing Value Types as object (and so require Boxing), but for the most part you should never need to Box.
A non-exhaustive list of historical C# structures that require Boxing, that you should avoid:
The Event system turns out to have a Race Condition in naive use of it, and it doesn't support async. Add in the Boxing problem and it should probably be avoided. (You could replace it for example with an async event system that uses Generics.)
The old Threading and Timer models forced a Box on their parameters but have been replaced by async/await which are far cleaner and more efficient.
The .Net 1.1 Collections relied entirely on Boxing, because they came before Generics. These are still kicking around in System.Collections. In any new code you should be using the Collections from System.Collections.Generic, which in addition to avoiding Boxing also provide you with stronger type-safety.
You should avoid declaring or passing your Value Types as objects, unless you have to deal with the above historical problems that force Boxing, and you want to avoid the performance hit of Boxing it later when you know it's going to be Boxed anyway.
Per Mikael's suggestion below:
using System.Collections.Generic;
var employeeCount = 5;
var list = new List<int>(10);
This answer originally suggested Int32, Bool etc cause boxing, when in fact they are simple aliases for Value Types. That is, .Net has types like Boolean, Int32, String, and C# aliases them to bool, int, string, without any functional difference.
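If you want to convince yourself that the aliases really are the same types, a quick check along these lines should do it. AliasDemo and its method names are made up for the sketch.

```csharp
using System;

public static class AliasDemo
{
    // The C# keywords and the .NET type names denote the very same types.
    public static bool IntIsInt32() => typeof(int) == typeof(Int32);
    public static bool BoolIsBoolean() => typeof(bool) == typeof(Boolean);
    public static bool StringIsString() => typeof(string) == typeof(String);

    public static void Main()
    {
        Console.WriteLine(IntIsInt32());     // True
        Console.WriteLine(BoolIsBoolean());  // True
        Console.WriteLine(StringIsString()); // True

        Int32 viaName = 5;
        int viaAlias = viaName;              // same type: a plain copy, no boxing
        Console.WriteLine(viaAlias);         // 5
    }
}
```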
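Answer 3 (score: 19)
Boxing isn't really something that you use - it is something the runtime uses so that you can handle reference and value types in the same way when necessary. For example, if you used an ArrayList to hold a list of integers, the integers got boxed to fit into the object-typed slots in the ArrayList.
With generic collections, this pretty much goes away. If you create a List<int>, no boxing is done - the List<int> can hold the integers directly.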
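For contrast with the ArrayList case, a minimal sketch of the generic version. GenericListDemo and SumDirect are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class GenericListDemo
{
    // Stores ints in a List<int> and sums them; no object slots are involved.
    public static int SumDirect(params int[] values)
    {
        var list = new List<int>(values.Length);
        foreach (int v in values)
        {
            list.Add(v);   // Add takes int: the value is stored as an int
        }
        // list.Add("x"); // would not compile: the element type is checked

        int sum = 0;
        foreach (int item in list)
        {
            sum += item;   // no cast and no unboxing needed
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumDirect(1, 2, 3)); // 6
    }
}
```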
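Answer 4 (score: 11)
Boxing and unboxing are used specifically to treat value-type objects as reference types: their actual value is moved to the managed heap and accessed through a reference.
Without boxing and unboxing you could never refer to a value type through an object reference, which means you could not pass value types as instances of Object.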
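Answer 5 (score: 6)
The last place I had to unbox something was when writing some code that retrieved data from a database (I wasn't using LINQ to SQL, just plain old ADO.NET):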
int myIntValue = (int)reader["MyIntValue"];
Basically, if you're working with older APIs from before generics, you'll encounter boxing. Other than that, it isn't that common.
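The same pattern can be tried without a database. In this sketch a Dictionary<string, object> is a hypothetical stand-in for a data reader row, since both hand values back typed as object; RowDemo, MakeRow and ReadInt are invented names, not ADO.NET APIs.

```csharp
using System;
using System.Collections.Generic;

public static class RowDemo
{
    // Hypothetical stand-in for a data reader row: like the reader's indexer,
    // the dictionary returns every value typed as object.
    public static Dictionary<string, object> MakeRow() =>
        new Dictionary<string, object>
        {
            ["MyIntValue"] = 7,       // the int is boxed when stored
            ["Name"] = "example"
        };

    // Unboxes the column value; throws InvalidCastException
    // if the stored object is not a boxed int.
    public static int ReadInt(Dictionary<string, object> row, string column) =>
        (int)row[column];

    public static void Main()
    {
        var row = MakeRow();
        Console.WriteLine(ReadInt(row, "MyIntValue")); // 7
    }
}
```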
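Answer 6 (score: 4)
"Boxing is required when we have a function that needs an object as a parameter, but we have different value types to pass; in that case we first need to convert the value types to the object data type before passing them to the function."
I don't think that is true; try this instead: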
using System;

class Program
{
    static void Main(string[] args)
    {
        int x = 4;
        test(x); // an int is passed where an object parameter is expected
    }

    static void test(object o)
    {
        Console.WriteLine(o.ToString());
    }
}
That runs just fine, and I didn't use boxing/unboxing. (Unless the compiler does that behind the scenes?)
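The compiler does in fact do it behind the scenes: converting an int to object is an implicit boxing conversion, so no cast is needed in the source. One way to observe this, sketched with invented names (ImplicitBoxDemo, ReceivedBoxedInt, SameBox):

```csharp
using System;

public static class ImplicitBoxDemo
{
    // The parameter is a reference, yet it can be handed an int:
    // the argument is boxed implicitly at the call site.
    public static bool ReceivedBoxedInt(object o) => o is int;

    // Each conversion of the same int to object allocates a separate box,
    // so the two references are not equal.
    public static bool SameBox(int x)
    {
        object first = x;
        object second = x;
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        int x = 4;
        Console.WriteLine(ReceivedBoxedInt(x)); // True
        Console.WriteLine(SameBox(x));          // False
    }
}
```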
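Answer 7 (score: 1)
In .NET, every instance of Object, or of any type derived from it, includes a data structure that contains information about its type. "Real" value types in .NET do not contain any such information. To allow data in value types to be manipulated by routines that expect to receive types derived from Object, the system automatically defines for each value type a corresponding class type with the same members and fields. Boxing creates a new instance of this class type, copying the fields from a value type instance. Unboxing copies the fields from an instance of the class type to an instance of the value type. All of the class types created from value types derive from the ironically named class ValueType (which, despite its name, is actually a reference type).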
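A few of these claims can be checked with reflection. This is only a sketch using standard System.Type members; ValueTypeDemo and its method names are hypothetical.

```csharp
using System;

public static class ValueTypeDemo
{
    // int derives directly from System.ValueType.
    public static bool IntDerivesFromValueType() =>
        typeof(int).BaseType == typeof(ValueType);

    // System.ValueType itself is a class, not a value type.
    public static bool ValueTypeIsValueType() =>
        typeof(ValueType).IsValueType;

    // A boxed int carries type information and still reports Int32.
    public static Type BoxedRuntimeType()
    {
        object boxed = 123;
        return boxed.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(IntDerivesFromValueType()); // True
        Console.WriteLine(ValueTypeIsValueType());    // False
        Console.WriteLine(BoxedRuntimeType());        // System.Int32
    }
}
```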
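Answer 8 (score: 0)
When a method only takes a reference type as a parameter (say, a generic method constrained to reference types via the class constraint), you will not be able to pass a value type to it directly and will have to box it.
The same is true for any method that takes object as a parameter - the argument will have to be a reference type.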
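A minimal sketch of the constrained case, assuming a hypothetical generic method TypeNameOf with a class constraint:

```csharp
using System;

public static class ConstraintDemo
{
    // T must be a reference type because of the class constraint.
    public static string TypeNameOf<T>(T item) where T : class =>
        item.GetType().Name;

    public static void Main()
    {
        // TypeNameOf(42); // does not compile: int is not a reference type

        // Choosing T = object makes the call legal; 42 is boxed to object.
        Console.WriteLine(TypeNameOf<object>(42)); // Int32
        Console.WriteLine(TypeNameOf("text"));     // String
    }
}
```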
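Answer 9 (score: 0)
In general, you will want to avoid boxing your value types.
However, there are rare occasions where it is useful. If you need to target the 1.1 framework, for example, you will not have access to the generic collections. Any use of the collections in .NET 1.1 requires treating your value type as a System.Object, which causes boxing/unboxing.
There are still cases where this is useful in .NET 2.0+. Any time you want to take advantage of the fact that all types, including value types, can be treated as an object directly, you may need to use boxing/unboxing. This can be handy at times, since it allows you to store any type in a collection (by using object instead of T in a generic collection), but in general it is better to avoid this, as you lose type safety. The one case where boxing frequently occurs, though, is when you're using reflection - many reflection calls require boxing/unboxing when working with value types, since the type is not known in advance.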
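For instance, reading a value-type property through reflection hands the result back as object. A small sketch using the standard PropertyInfo.GetValue call; ReflectionDemo and ReadLength are invented names.

```csharp
using System;
using System.Reflection;

public static class ReflectionDemo
{
    // Reads string.Length via reflection. GetValue returns object,
    // so the int result comes back boxed.
    public static object ReadLength(string s)
    {
        PropertyInfo length = typeof(string).GetProperty("Length");
        return length.GetValue(s);
    }

    public static void Main()
    {
        object boxed = ReadLength("hello");
        Console.WriteLine(boxed.GetType()); // System.Int32
        Console.WriteLine((int)boxed);      // 5 (explicit unboxing)
    }
}
```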
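Answer 10 (score: 0)
Boxing is the conversion of a value to a reference type, with the data stored at some offset inside an object on the heap.
As for what boxing actually does, here are some examples.
Mono C++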
void* mono_object_unbox (MonoObject *obj)
{
MONO_EXTERNAL_ONLY_GC_UNSAFE (void*, mono_object_unbox_internal (obj));
}
#define MONO_EXTERNAL_ONLY_GC_UNSAFE(t, expr) \
t result; \
MONO_ENTER_GC_UNSAFE; \
result = expr; \
MONO_EXIT_GC_UNSAFE; \
return result;
static inline gpointer
mono_object_get_data (MonoObject *o)
{
return (guint8*)o + MONO_ABI_SIZEOF (MonoObject);
}
#define MONO_ABI_SIZEOF(type) (MONO_STRUCT_SIZE (type))
#define MONO_STRUCT_SIZE(struct) MONO_SIZEOF_ ## struct
#define MONO_SIZEOF_MonoObject (2 * MONO_SIZEOF_gpointer)
typedef struct {
MonoVTable *vtable;
MonoThreadsSync *synchronisation;
} MonoObject;
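Unboxing in Mono amounts to casting a pointer at an offset of 2 gpointers into the object (e.g. 16 bytes on a 64-bit platform). A gpointer is a void*. This makes sense when looking at the definition of MonoObject, as it is clearly just a header for the data.
C++
To box a value in C++ you could do something like: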
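#include <iostream>

// An "Object" here is just an untyped pointer to heap storage.
using Object = void*;

// Box: copy the value into a new heap allocation.
template<class T> Object box(T j) {
    return new T(j);
}

// Unbox: copy the value back out, then free the box.
template<class T> T unbox(Object j) {
    T* typed = static_cast<T*>(j);
    T temp = *typed;
    delete typed; // deleting through a void* is undefined behavior, so delete the typed pointer
    return temp;
}

int main() {
    int j = 2;
    Object o = box(j);
    int k = unbox<int>(o);
    std::cout << k;
}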