z80 emulator flags
I'm writing a TI-83 Plus emulator and was wondering what the best way to handle flags is.
- be2003
"Best" depends on what you want. Do you want it fast? Then do it in assembly language and test if the flags need to be computed at all (check the next X instructions for flag usage). If fast isn't a consideration (and it probably isn't for a calculator emulation), just do the most straightforward C code you can so that it's easy to read, debug, and maintain.
For example, maybe you keep a bool for each flag in the Z80, and then after doing an operation on A that affects the flags, you do something like
z_flag = (A_reg == 0);
Simple, easy to understand, and slow as hell. :)
You can find already written z80 emulation all over the net if you don't care to write your own. Any SEGA GG/MS/Genesis emulation will have a Z80 emulator core in it. The MSX was also a Z80 if I remember correctly. All those Timex/Sinclair computers were Z80 based.
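A minimal sketch of that straightforward style, assuming one bool per flag and an 8-bit ADD to the accumulator. `A_reg` and `z_flag` follow the post; `add_a` and the other flag variables are hypothetical names, and the P/V and N flags are left out to keep it short.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative state: one bool per flag, as suggested above. */
static uint8_t A_reg;
static bool z_flag, s_flag, c_flag, h_flag;

/* Hypothetical helper: ADD A,value with plain, readable flag updates. */
static void add_a(uint8_t value)
{
    unsigned sum = (unsigned)A_reg + value;             /* 9-bit result */

    h_flag = ((A_reg & 0x0F) + (value & 0x0F)) > 0x0F;  /* carry out of bit 3 */
    c_flag = sum > 0xFF;                                /* carry out of bit 7 */
    A_reg  = (uint8_t)sum;
    z_flag = (A_reg == 0);
    s_flag = (A_reg & 0x80) != 0;                       /* sign is bit 7 */
}
```

Each flag is a single comparison, so a wrong flag is easy to spot in a debugger.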
Jim wrote: For many of the flags, you only need a 256-byte lookup table (indexed by the result). For some others you need a 256x256-byte table (indexed by each operand). Lookups are the only way for speed. Basically I'm saying pre-compute all the possible flags for all the possible results. It's a tiny amount of memory.
The TI-83 Plus was clocked at 15 MHz, so it should be no problem to run it in real time without optimization.
But are you sure that the memory access for a 64 KB table is faster than executing some instructions, which may be pipelined and already in the instruction cache?
He's right - with instruction caches and such, doing several opcodes to algorithmically generate flags is much faster than lookup tables. Tables stopped being faster for most emulation purposes almost a decade ago.
Some CPUs are actually capable of executing dozens of opcodes in the same time as one memory fetch (not from L2 cache). Using BIG tables actually hurts your emulation speed as the caches are flooded with mostly useless data.
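To make the two options concrete, here is a rough sketch of both for the flags that depend only on the 8-bit result: sign, zero and parity, at their Z80 positions in F (0x80, 0x40, 0x04). The names `szp_compute`, `szp_table` and `szp_table_init` are made up for this example. Which one is faster depends on the host CPU and its caches, so measure before choosing.

```c
#include <stdint.h>

/* Z80 F register masks for the flags handled here. */
enum { FLAG_S = 0x80, FLAG_Z = 0x40, FLAG_PV = 0x04 };

/* Option 1: compute S, Z and parity directly from the result. */
static uint8_t szp_compute(uint8_t r)
{
    uint8_t p = r;
    p ^= p >> 4;
    p ^= p >> 2;
    p ^= p >> 1;   /* bit 0 of p is 1 when r has an odd number of set bits */
    return (uint8_t)((r & FLAG_S) |
                     (r == 0 ? FLAG_Z : 0) |
                     ((p & 1) ? 0 : FLAG_PV));   /* P/V set on even parity */
}

/* Option 2: the same values, precomputed once into a 256-byte table. */
static uint8_t szp_table[256];

static void szp_table_init(void)
{
    for (int i = 0; i < 256; i++)
        szp_table[i] = szp_compute((uint8_t)i);
}
```

A 256-byte table fits in a few cache lines; the 64 KB operand-pair table discussed above is the one that can evict useful data.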
I know almost nothing about the Z80 (I wrote a 6502 emulator), but since it is an 8-bit CPU, I think the best way is, as you said, to use a plain "int" for each flag and keep a "byte" (unsigned char) for the emulated flag register itself.
Case 1: only one variable (byte).
Suppose you use only a byte to emulate your CPU flags, the "Zero" flag is bit 0 (the least significant bit) of that byte, and you need to set or clear it.
You would do:
Code: Select all
// setting
if (val == 0) // set the zero flag
    Cpu.Flags |= 0x01;
else
    Cpu.Flags &= ~0x01; // clear the zero flag
// testing for "set"
if (Cpu.Flags & 0x01)
    // flag-dependent instruction here
// testing for "clear"
if (!(Cpu.Flags & 0x01))
    // flag-dependent instruction here
As you can see, you have to use an "if" on the value, which, as we know, the compiler will normally turn into a condition and branch on the target CPU (the CPU you want your emulator to run on).
Case 2: one variable for each flag, plus a byte variable for the emulated flag register.
Now look at the code if you use a variable (int) for each flag:
Code: Select all
// setting
Cpu.Zero = !value; // 1 if value is 0, otherwise 0
// testing for "set"
if (Cpu.Zero)
    // flag-dependent instruction here
// testing for "clear"
if (!Cpu.Zero)
    // flag-dependent instruction here
As you can see, this code eliminates the "if" and some logical operators (much faster).
Then, when an instruction pushes or pops the flags to or from the stack, you just have to "reconstruct" your flag register (I don't currently know the Z80 flag positions, but the important thing here is the concept).
Code: Select all
Cpu.Flags = 0;
if (Cpu.Zero)
    Cpu.Flags |= 0x01;
if (Cpu.Carry)
    Cpu.Flags |= 0x02;
if (Cpu.Overflow)
    Cpu.Flags |= 0x04;
You might think that this is slow, but consider that an emulated processor spends maybe 10% of the time pushing/popping flags and 90% setting/clearing them.
Case 3: precalculated flags using only one variable (byte).
You can precalculate your flags by creating an array covering every value that a byte (on an 8-bit processor) can hold, i.e. 256 elements. Each entry holds a "1" or "0" that tells whether, for that value (the array index), the flag should be set or clear.
Let's look at the code:
Code: Select all
// global array
unsigned char flag_zero_sign_lut[256];
// then write a function, called only once in your program, to precalculate the flags
for (int i = 0; i < 256; i++)
{
    flag_zero_sign_lut[i] = (i == 0 ? 1 : 0) | (i & 0x80 ? 1 : 0); // tests whether the value is zero and ORs that with the test for a negative value
}
// and when you need to set or clear the flags:
Cpu.Flags &= ~0x81; // clear first, since we don't know the previous value
Cpu.Flags |= flag_zero_sign_lut[value];
This method is not too bad: it is faster than case 1 but slower than case 2, and it uses a little memory.
So I suggest method 2 (case 2).
I hope this helps.
- Ateneo -
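For reference, a small sketch of case 2 using the real Z80 bit positions rather than the placeholder masks above: in the Z80 F register, carry is bit 0 (0x01), zero is bit 6 (0x40) and sign is bit 7 (0x80). The `struct Cpu` layout and the helper names are assumptions for illustration, and the N, P/V and H flags are left out for brevity.

```c
#include <stdint.h>

/* Z80 F register masks for the three flags used in this sketch. */
enum { F_CARRY = 0x01, F_ZERO = 0x40, F_SIGN = 0x80 };

/* Hypothetical CPU state, modelled on the Cpu.* names used above. */
struct Cpu {
    int Zero, Carry, Sign;   /* any non-zero value means "set" */
    uint8_t Flags;           /* packed F register, rebuilt only on demand */
};

/* Per-instruction flag update: no branches, no masking of Flags. */
static void set_zero_sign(struct Cpu *cpu, uint8_t value)
{
    cpu->Zero = !value;
    cpu->Sign = value & 0x80;
}

/* Rebuild the packed register, e.g. just before emulating PUSH AF. */
static void reconstruct(struct Cpu *cpu)
{
    cpu->Flags = (uint8_t)((cpu->Sign  ? F_SIGN  : 0) |
                           (cpu->Zero  ? F_ZERO  : 0) |
                           (cpu->Carry ? F_CARRY : 0));
}
```

The packing cost is paid only by the few instructions that read F as a whole byte.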
Atenus wrote: As you can see, you have to use an "if" on the value, which, as we know, the compiler will normally turn into a condition and branch on the target CPU (the CPU you want your emulator to run on).
C allows you to define a struct with bit fields (see http://www.google.com/search?q=struct+bit+fields ), with which you could represent all flags in one byte and set a bit with one assignment, without ifs or shifts, but I have never used them and I don't know whether GCC optimizes them into fast MIPS instructions.
Quote: C allows you to define a struct with bit fields (see http://www.google.com/search?q=struct+bit+fields ), with which you could represent all flags in one byte and set a bit with one assignment, without ifs or shifts, but I have never used them and I don't know whether GCC optimizes them into fast MIPS instructions.
Yes, I know, and I have used them, but I can tell you they are a pain in the a*s regarding code generation.
- Ateneo -
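For anyone who has not seen them, a bit-field version might look roughly like the sketch below. The struct and function names are made up, and the order of the fields within the byte is implementation-defined, so this layout cannot be assumed to match the real F register.

```c
#include <stdint.h>

/* Illustrative only: bit order and padding are chosen by the compiler. */
struct FlagBits {
    unsigned carry : 1;
    unsigned zero  : 1;
    unsigned sign  : 1;
};

/* Hypothetical helper: update zero and sign from an 8-bit result. */
static void update_zero_sign(struct FlagBits *f, uint8_t result)
{
    f->zero = (result == 0);
    /* Shift down first: a 1-bit unsigned field keeps only the lowest bit,
       so assigning (result & 0x80) directly would always store 0. */
    f->sign = (result >> 7) & 1;
}
```

What the compiler emits for each assignment is worth checking in the generated assembly before relying on this in a hot loop.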
Bah. I don't know the difference between reply and edit.
Read on ...
(PS: what's with the delay between posts/edits?)
Last edited by cheriff on Wed Nov 22, 2006 8:38 am, edited 1 time in total.
Damn, I need a decent signature!
cheriff wrote: Personally, I avoid bitfields like the plague.
They are compiler and endian dependent, which causes headaches with regard to portability.
I've also read they can cause gcc to generate icky code, although I personally haven't examined this to make sure, I just never use them :)
Instead:
Code: Select all
#define GET_FIELD(x, shift, mask) (((x) >> (shift)) & (mask))
#define SHIFT_Z (1)
#define MASK_Z (1)
#define SHIFT_NEG (2)
#define MASK_NEG (1)
#define GET(field, x) GET_FIELD(x, SHIFT_##field, MASK_##field)
...
if (GET(NEG, flags))
    // do something
It's a little bit more work but works anywhere.
A similar macro could be done for SET field too.
Probably not important if you stick to one compiler on a single platform, since it is at least consistent, but worth a look.
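One possible shape for that SET macro, as a sketch: it clears the field and then ORs in the new value. `SET_FIELD` and `SET` are hypothetical names added here, and the shift/mask values are the example positions from the post, not the Z80's real flag positions.

```c
/* Field positions from the post above (illustrative, not the Z80 layout). */
#define SHIFT_Z   (1)
#define MASK_Z    (1)
#define SHIFT_NEG (2)
#define MASK_NEG  (1)

#define GET_FIELD(x, shift, mask) (((x) >> (shift)) & (mask))

/* Clear the field in x, then insert the low bits of v at the same place. */
#define SET_FIELD(x, shift, mask, v) \
    ((x) = (((x) & ~((mask) << (shift))) | (((v) & (mask)) << (shift))))

#define GET(field, x)    GET_FIELD(x, SHIFT_##field, MASK_##field)
#define SET(field, x, v) SET_FIELD(x, SHIFT_##field, MASK_##field, v)
```

Usage would be `SET(NEG, flags, 1);` to set a field and `SET(NEG, flags, 0);` to clear it, with `flags` being any unsigned integer variable.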
Atenus wrote: Yes, I know, and I have used them, but I can tell you they are a pain in the a*s regarding code generation.
Normally GCC should use the ext/ins instructions, which are designed for this purpose. But here it would more likely be an OR to set or an AND to reset.
Code: Select all
#define GET_FIELD(x, lsb, bits) ({ int res; asm volatile("ext %0, %1, %2, %3" : "=r"(res) : "r"(x), "i"(lsb), "i"(bits)); res; })
#define SHIFT_Z (1)
#define BITS_Z (1)
#define SHIFT_NEG (2)
#define BITS_NEG (1)
#define GET(field, x) GET_FIELD(x, SHIFT_##field, BITS_##field)
...
if (GET(NEG, flags))
// do something
...
#define SET_FIELD(x, y, lsb, bits) ({ asm volatile("ins %0, %1, %2, %3" : "+r"(x) : "r"(y), "i"(lsb), "i"(bits)); })
hlide wrote: If BITS equals 1, then the right flag in X will be set with the value of Y (i.e., bit 0 of Y).
Yeah, I was contributing to the discussion on the potential use of C bitfields, and less on the best way for this particular task.
My asm is a bit rusty, but for setting and clearing individual bits hlide's code would be the way to go.
I need portability and wider fields, so each to his own :)
cheriff wrote: Bah. I don't know the difference between reply and edit.
Read on ...
Sorry, I wrote the post on the fly and tried to change two things in my first reply, so that could explain the collision.
So, taking that into account, I will add to and correct my own post here:
Correction:
The lookup table generation code should be:
Code: Select all
for (int i = 0; i < 256; i++)
{
    flag_zero_sign_lut[i] = (i == 0 ? 1 : 0) | (i & 0x80 ? 0x80 : 0); // tests whether the value is zero and ORs that with the test for a negative value
}
Add:
You should use a reconstruct() function (explained above) to build the flag register whenever it is pushed.
And a deconstruct() function for values that are popped, since the flag register changes.
That would be:
Code: Select all
if (Cpu.Flags & 0x01)
    Cpu.Zero = 1;
else
    Cpu.Zero = 0;
if (Cpu.Flags & 0x02)
    Cpu.Carry = 1;
else
    Cpu.Carry = 0;
// OR, knowing the position of the flags, this code (much faster):
Cpu.Zero = (Cpu.Flags & 0x01);
Cpu.Carry = (Cpu.Flags & 0x02);
// and so on ...
// note: it doesn't matter if an int flag ends up holding 2 or 4 (in decimal) instead of 1, since any value != 0 is evaluated as "true" in a condition
This depends on how you look at it, but don't worry about it.
That's all.
- Ateneo -
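A sketch of that deconstruct() step with the actual Z80 masks (carry 0x01, zero 0x40, sign 0x80), using the fast masked form. The `struct Cpu` layout is an assumption for illustration, and only three of the flags are shown.

```c
#include <stdint.h>

/* Z80 F register masks for the three flags used in this sketch. */
enum { F_CARRY = 0x01, F_ZERO = 0x40, F_SIGN = 0x80 };

/* Hypothetical CPU state, modelled on the Cpu.* names used above. */
struct Cpu {
    int Zero, Carry, Sign;
    uint8_t Flags;
};

/* Called after POP AF, or anything else that loads F as a whole byte. */
static void deconstruct(struct Cpu *cpu)
{
    cpu->Carry = cpu->Flags & F_CARRY;  /* 0 or 0x01 */
    cpu->Zero  = cpu->Flags & F_ZERO;   /* 0 or 0x40 */
    cpu->Sign  = cpu->Flags & F_SIGN;   /* 0 or 0x80 */
}
```

The per-flag ints are then "zero or non-zero" rather than strictly 0 or 1, which is fine for `if (cpu->Zero)` but matters for any code that shifts them back into place.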
will this work:
Code: Select all
byte R; // the flag register
int a,b,c,d,e,f,g,h;
//h being bit 7 and a as bit 0
#define MKFLAG() R = (h<<7|g<<6|f<<5|e<<4|d<<3|c<<2|b<<1|a<<0)
- be2003
be2003 wrote: will this work:
Code: Select all
byte R; // the flag register
int a,b,c,d,e,f,g,h;
//h being bit 7 and a as bit 0
#define MKFLAG() R = (h<<7|g<<6|f<<5|e<<4|d<<3|c<<2|b<<1|a<<0)
Supposing a, b, c, d, e, f, g and h are stored in registers, you will have 15 instructions to execute (1 MOVE, 7 SLL, 7 OR):
Code: Select all
move v0, s0
sll v1, s1, 1
or v0, v0, v1
sll v1, s2, 2
or v0, v0, v1
sll v1, s3, 3
or v0, v0, v1
sll v1, s4, 4
or v0, v0, v1
sll v1, s5, 5
or v0, v0, v1
sll v1, s6, 6
or v0, v0, v1
sll v1, s7, 7
or v0, v0, v1
Code: Select all
andi v0, s0, 1
ins v0, s1, 1, 1
ins v0, s2, 2, 1
ins v0, s3, 3, 1
ins v0, s4, 4, 1
ins v0, s5, 5, 1
ins v0, s6, 6, 1
ins v0, s7, 7, 1
As long as you are not concerned with portable code, of course.
EDIT: 7 AND can be removed indeed since you use int and not a boolean :/
Last edited by hlide on Wed Nov 22, 2006 7:54 pm, edited 2 times in total.
Since v0 is a read/write register, you may need to avoid executing two consecutive 'ins' on the same register (I'm not sure about that, though).
Still only 9 instructions:
Code: Select all
andi v0, s0, 1
ins v1, s4, 0, 1
ins v0, s1, 1, 1
ins v1, s5, 1, 1
ins v0, s2, 2, 1
ins v1, s6, 2, 1
ins v0, s3, 3, 1
ins v1, s7, 3, 1
ins v0, v1, 4, 4
be2003 wrote: will this work:
Code: Select all
byte R; // the flag register
int a,b,c,d,e,f,g,h;
//h being bit 7 and a as bit 0
#define MKFLAG() R = (h<<7|g<<6|f<<5|e<<4|d<<3|c<<2|b<<1|a<<0)
Yep, it will work, but only if you take it as a given that your emulator always stores 1 or 0 in the int flags.
The (a << 0) can be removed since it's already 0 or 1.
- Ateneo -
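If the int flags might hold other non-zero values (for example after the masked deconstruct shown earlier in the thread), one way to keep the packing correct is to normalise each flag with `!!` before shifting. This is only a sketch: `make_flags` is a hypothetical function form of the MKFLAG macro, with `a` as bit 0 and `h` as bit 7.

```c
#include <stdint.h>

typedef uint8_t byte;

/* Pack eight "zero or non-zero" ints into one byte; a is bit 0, h is bit 7. */
static byte make_flags(int a, int b, int c, int d,
                       int e, int f, int g, int h)
{
    /* !!x turns any non-zero x into exactly 1, so each shift lands one bit. */
    return (byte)((!!h << 7) | (!!g << 6) | (!!f << 5) | (!!e << 4) |
                  (!!d << 3) | (!!c << 2) | (!!b << 1) | !!a);
}
```

With plain shifts, a flag holding 0x40 in `a` would land in bit 6 instead of bit 0; with the normalisation it contributes only bit 0.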
Atenus wrote: Yep, it will work, but only if you take it as a given that your emulator always stores 1 or 0 in the int flags. The (a << 0) can be removed since it's already 0 or 1.
Thank god, I was looking for the simplest code without asm... I put the a<<0 in for conceptual completeness.
- be2003
be2003 wrote: I put the a<<0 in for conceptual completeness.
Oops!! I was wrong when I said "the (a << 0) can be removed"; I should have said that you can remove the "<< 0" (shift left by 0) from a and leave only "a".
Talking about asm/C/C++ code: you can use all of C's bitwise operations and you don't need assembly. I don't know much about the MIPS32 (Allegrex) architecture (I'm just a beginner, I'm studying it), but I made a NES emulator for Windows in VC (plain C code), then ported the code to nearly every Win32 compiler I could try, and the best one was MinGW, which is based on GCC, and I can say it's a good compiler (even though Microsoft makes the operating system, the VC compiler is no more than 1% better than GCC at producing fast Windows applications).
Since an emulator (console, computer, etc.) is an application that needs very good performance, the key points are sometimes:
- The emulation itself (CPU, graphics, etc.) should be done in asm or plain (procedural) C. Asm is impractical for portable code, but if you are targeting just one platform asm could be a choice, though you will probably gain no more than 2 or 3%.
- You can use OOP for things that don't play a significant role in the emulation itself, such as a menu system for the emulator (bearing in mind that the PSP wasn't designed for GUIs or user interfaces), and then mix C and C++ code.
- Know the target machine, so you can push it to the limit when running your program (I'm weak here since I'm a beginner on the PSP).
There could be (and surely are) more than these, but this is more of a "general rule".
Also remember that the PSP is capable of running an 8-bit computer emulator, but you will need to optimize and find the best code.
- Ateneo -