My problem is as follows: I am trying to translate x86 assembly source code into C++ source code.
An explanation of what registers are.
Skip this if you know what they are and how they work.
As you may or may not know, assembly language makes use of "general purpose registers".
In x86 assembly these registers can be thought of as 4-byte variables (like an int variable in C++); their names are eax, ebx, ecx and edx.
Each of these registers is in turn broken down into ax, bx, cx and dx, which represent the least significant 2 bytes of the corresponding register.
ax, bx, cx and dx are further broken down into ah, bh, ch and dh (most significant byte) and al, bl, cl and dl (least significant byte).
So, for example:
If I set eax:
EAX = 0xAB12CDEF
that would automatically change ax, al and ah
AX would become 0xCDEF
AH would become 0xCD
AL would become 0xEF
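My question is: how do I achieve this in C++?
int eax, ax, ah, al;
eax = 0xAB12CDEF;
How can I make ax, ah and al change at the same time? Or is it possible to make them point to different parts of eax, and if so, how? Thanks! P.S. Also, how could I use another variable as a char? How can I create a new variable "char chAL" that points to al, which in turn points into eax, so that when I change chAL, the change is automatically reflected in eax, ah and al?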
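Answer 0: (score: 2)
If your goal is to emulate x86 assembly code, then you do need to support the behavior of the x86 registers.
Here is an approach using a union: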
#include <iostream>
#include <cstdint>
using namespace std;
union reg_t {
uint64_t rx;
uint32_t ex;
uint16_t x;
struct {
uint8_t l;
uint8_t h;
};
};
int main(){
reg_t a;
a.rx = 0xdeadbeefcafebabe;
cout << "rax = " << hex << a.rx << endl;
cout << "eax = " << hex << a.ex << endl;
cout << "ax = " << hex << a.x << endl;
cout << "al = " << hex << (uint16_t)a.l << endl;
cout << "ah = " << hex << (uint16_t)a.h << endl;
cout << "ax & 0xFF = " << hex << (a.x & 0xFF) << endl;
cout << "(ah << 8) + al = " << hex << (a.h << 8) + a.l << endl;
}
Output:
rax = deadbeefcafebabe
eax = cafebabe
ax = babe
al = be
ah = ba
ax & 0xFF = be
(ah << 8) + al = babe
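You will get the correct result on the right kind of platform (little-endian). On other platforms you will have to swap bytes and/or add padding.
This is the basic, down-to-earth solution. It will certainly work on many x86 platforms (x86/Linux/g++ at least works fine), but it relies on reading a union member other than the one most recently written, which is undefined behavior in standard C++.
Here is another approach that uses a byte array to store the register contents: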
class x86register {
uint8_t bytes[8];
public:
x86register &operator =(const uint64_t &v){
for (int i = 0; i < 8; i++)
bytes[i] = (v >> (i * 8)) & 0xff;
return *this;
}
x86register &operator =(const uint32_t &v){
for (int i = 0; i < 4; i++)
bytes[i] = (v >> (i * 8)) & 0xff;
return *this;
}
x86register &operator =(const uint16_t &v){
for (int i = 0; i < 2; i++)
bytes[i] = (v >> (i * 8)) & 0xff;
return *this;
}
x86register &operator =(const uint8_t &v){
bytes[0] = v;
return *this;
}
operator uint64_t(){
uint64_t res = 0;
for (int i = 7; i >= 0; i--)
res = (res << 8) + bytes[i];
return res;
}
operator uint32_t(){
uint32_t res = 0;
for (int i = 3; i >= 0; i--)
res = (res << 8) + bytes[i];
return res;
}
operator uint16_t(){
uint16_t res = 0;
for (int i = 1; i >= 0; i--)
res = (res << 8) + bytes[i];
return res;
}
operator uint8_t(){
return bytes[0];
}
};
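This simple class should work regardless of the byte order of the platform it runs on. You may also want to add further accessors/mutators to handle the high byte of the word registers (AH, BH, etc.).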
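The high-byte accessors mentioned above could look roughly like the following. This is only a sketch under assumptions: ByteReg is a simplified stand-in for the byte-array class, and the names set_word, word, high_byte, set_high_byte and demo_high_byte are made up for illustration. It assumes a C++14 or later compiler because of the constexpr mutators.

```cpp
#include <cstdint>

// Illustrative stand-in for the byte-array register class; all names
// here are assumptions, not part of the original answer.
struct ByteReg {
    std::uint8_t bytes[8] = {};  // bytes[0] is the least significant byte

    // Store a 16-bit value (e.g. AX) into bytes[0] and bytes[1].
    constexpr void set_word(std::uint16_t v) {
        bytes[0] = static_cast<std::uint8_t>(v & 0xFF);
        bytes[1] = static_cast<std::uint8_t>(v >> 8);
    }
    // Read the 16-bit value back, independent of host byte order.
    constexpr std::uint16_t word() const {
        return static_cast<std::uint16_t>((bytes[1] << 8) | bytes[0]);
    }
    // AH, BH, ...: the second-lowest byte of the register.
    constexpr std::uint8_t high_byte() const { return bytes[1]; }
    constexpr void set_high_byte(std::uint8_t v) { bytes[1] = v; }
};

// Usage sketch: load 0xCDEF into the low word, then overwrite AH only.
constexpr std::uint16_t demo_high_byte() {
    ByteReg r{};
    r.set_word(0xCDEF);
    r.set_high_byte(0x12);  // the low byte (0xEF) is left untouched
    return r.word();
}
```

Because the bytes are stored explicitly from least to most significant, writing the high byte never depends on the endianness of the host.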
Answer 1: (score: 1)
You can extract parts of eax using bitwise operations, like this:
#include <cstdio>
int main()
{
unsigned int eax, ax, ah, al;
eax = 0xAB12CDEF;
ax = eax & 0x0000FFFF;
ah = (eax & 0x0000FF00) >> 8;
al = eax & 0x000000FF;
printf("ax = eax & 0x0000FFFF = 0x%X\n", ax);
printf("ah = (eax & 0x0000FF00) >> 8 = 0x%X\n", ah);
printf("al = eax & 0x000000FF = 0x%X\n", al);
}
Output:
ax = eax & 0x0000FFFF = 0xCDEF
ah = (eax & 0x0000FF00) >> 8 = 0xCD
al = eax & 0x000000FF = 0xEF
You can also define macros like this:
#include <iostream>
using namespace std;
#define AX(dw) ((dw) & 0x0000FFFF)
#define AH(dw) (((dw) & 0x0000FF00) >> 8)
#define AL(dw) ((dw) & 0x000000FF)
int main()
{
unsigned int eax = 0xAB12CDEF;
cout << "ax = " << hex << AX(eax) << endl; // prints ax = cdef
}
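The masks above only read the sub-registers. Going the other way (writing AL, AH or AX back into EAX) can be done with the same bitwise operations. The following is a minimal sketch, not part of the original answer; the helper names set_ax, set_ah and set_al are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helpers (names are illustrative only). Each one returns
// a copy of eax with a single sub-register replaced.

// Replace the low 16 bits (AX), keeping the upper 16 bits.
constexpr std::uint32_t set_ax(std::uint32_t eax, std::uint16_t v) {
    return (eax & 0xFFFF0000u) | v;
}

// Replace bits 8..15 (AH), keeping everything else.
constexpr std::uint32_t set_ah(std::uint32_t eax, std::uint8_t v) {
    return (eax & 0xFFFF00FFu) | (static_cast<std::uint32_t>(v) << 8);
}

// Replace the low 8 bits (AL), keeping everything else.
constexpr std::uint32_t set_al(std::uint32_t eax, std::uint8_t v) {
    return (eax & 0xFFFFFF00u) | v;
}
```

For example, eax = set_al(eax, 0x34) turns 0xAB12CDEF into 0xAB12CD34, and AX(eax) then reads back 0xCD34.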
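Answer 2: (score: 1)
If you want it to work as simply as in the example you gave, you can get away with reinterpret casts, although this violates the pointer aliasing rules, so the behavior is undefined.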
std::uint32_t eax = 0xAB12CDEF;
std::uint16_t& ax = reinterpret_cast<std::uint16_t*>(&eax)[1];
std::uint8_t& ah = reinterpret_cast<std::uint8_t&>(ax);
std::uint8_t& al = (&ah)[1];
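The second line casts the address of eax to a std::uint16_t*; indexing it with [1] yields the second half of the 32 bits.
The third line is just a cast of ax to uint8_t, since ah sits at the same address as the front part of ax.
Indexing the address of ah by 1 gives the following byte, which is al.
Note that this layout (low half at index [1], ah before al) only holds on a big-endian host; on a little-endian host such as x86, the low 16 bits are at index [0] and al comes before ah in memory.
What you are trying to do looks unsafe and odd. So, to get the most similar behavior in the safest way, you can use a custom type. The results of the class below will be consistent across machines, whereas those of the code above will not, because of differing endianness.
#include <cstdint>
class Reg {
private:
std::uint32_t data_;
public:
Reg(std::uint32_t in) : data_{in} { }
std::uint32_t ex() const {
return data_;
}
std::uint16_t x() const {
return static_cast<std::uint16_t>(data_ & 0xFFFF);
}
std::uint8_t h() const {
return static_cast<std::uint8_t>((data_ & 0xFF00) >> 8);
}
std::uint8_t l() const {
return static_cast<std::uint8_t>(data_ & 0xFF);
}
};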
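The Reg class above is read-only, whereas the question also asks for a variable such as chAL whose writes are reflected back in eax. One way to get that without aliasing tricks is a small proxy object. This is a sketch only: ByteRef and demo_write_al are hypothetical names that do not appear in the answer, and the constexpr mutators assume C++14 or later.

```cpp
#include <cstdint>

// Hypothetical proxy (not from the answer above): behaves like a
// reference to one byte inside a 32-bit register value.
class ByteRef {
    std::uint32_t& reg_;  // the full register being aliased
    unsigned shift_;      // bit offset of the byte: 0 for AL, 8 for AH
public:
    constexpr ByteRef(std::uint32_t& reg, unsigned shift)
        : reg_(reg), shift_(shift) {}

    // Writing through the proxy updates only the selected byte.
    constexpr ByteRef& operator=(std::uint8_t v) {
        reg_ = (reg_ & ~(std::uint32_t{0xFF} << shift_))
             | (static_cast<std::uint32_t>(v) << shift_);
        return *this;
    }
    // Reading through the proxy extracts the selected byte.
    constexpr operator std::uint8_t() const {
        return static_cast<std::uint8_t>((reg_ >> shift_) & 0xFF);
    }
};

// Usage sketch: chAL aliases the low byte of eax, as asked in the question.
constexpr std::uint32_t demo_write_al() {
    std::uint32_t eax = 0xAB12CDEF;
    ByteRef chAL(eax, 0);
    chAL = 0x34;  // eax becomes 0xAB12CD34
    return eax;
}
```

Because the proxy works on the value of eax with shifts and masks rather than on its memory layout, the result does not depend on the endianness of the host.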