Step up to become the VU master!
Want to be a VU master? Well, here's your chance to stand up and become one. We need a beginner's guide to getting into VU coding. I know there's information out there, but we need information in here. So start posting info: either a full guide (ha!) or pieces that I can cobble into a guide. Maybe a general overview of how the process works, a piece on how to compile VU code, example code, or a discussion of what the VU can and can't do. Anything would be good...
I think this is a VERY good idea and hope lots of information gets posted.
I'd like to start by posting everything I know about using the VUs.
If anyone knows anything I haven't posted, please post :)
Shoot Pixels Not People!
Makeshift Development
- Posts: 564
- Joined: Sat Jan 17, 2004 10:22 am
- Location: Sweden
OK, this will be short pointers rather than a full document describing the odds and ends of VU programming.
The easiest path to get something running on the VU is to do what druckluft did in Funslower. Since VU0 and VU1 both have their code and data memory mapped into the EE address space, you can copy the code and data to those areas and kick VU0 or VU1; the code for this can be found in the Funslower source. The copy could be a simple memcpy or your own
lq t0, 0(a0)
sq t0, 0(a1)
styled loop, with a0 pointing at the source of your code/data block and a1 at the VU0/VU1 code/data memory.
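For anyone who would rather start from C than from a hand-written lq/sq loop, here is a minimal sketch of the same copy. The names qword_t, copy_qwords and upload_block are made up for illustration, and the destination is an ordinary buffer standing in for the EE-mapped VU memory, so this runs on any machine rather than only on a PS2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 128-bit quadword, the unit that a single lq/sq pair moves. */
typedef struct { uint32_t w[4]; } qword_t;

/* Copy `count` quadwords, one per iteration, like the lq/sq loop.
 * On the PS2 `dst` would point at the EE-mapped VU code or data
 * memory; in this sketch it is just an ordinary buffer. */
static void copy_qwords(qword_t *dst, const qword_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++) {
        dst[i] = src[i];
    }
}

/* Hypothetical wrapper: refuse blocks that do not fit the target
 * memory (4096 bytes for VU0, 16384 bytes for VU1).
 * Returns 0 on success, -1 if the block is too large. */
static int upload_block(qword_t *vu_mem, size_t capacity_bytes,
                        const qword_t *block, size_t count)
{
    if (count > capacity_bytes / sizeof(qword_t)) {
        return -1;
    }
    copy_qwords(vu_mem, block, count);
    return 0;
}
```

On real hardware you would pass the mapped address as vu_mem. If I remember the memory map right, VU0 code starts at 0x11000000 and VU1 code at 0x11008000, but check that against the Funslower source before trusting it.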
To kick VU0 you use an opcode called vcallms or vcallmsr (call micro subroutine [register]) with the start address:
vcallms 0
vcallmsr zero
For VU1 you need to use the ctc2 opcode, which moves a value into a coprocessor 2 control register,
and the value you want to move is the start address.
So to have VU1 start at position 0 in VU1 code memory, do:
; zero == $0 and cmsar1 == $31
ctc2.i zero, cmsar1
The way of the VIF.
Now, the VIF works much the same way as the GIF: you have a tag which tells the VIF what comes after it. To upload code and have the VU execute it, you will have a sequence that looks like this:
VIF_MPG
code
VIF_MSCAL
In dvp-as language that would end up being:
DMAcnt *
MPG 0, *
maddw.xyzw vf1, vf0, vf0 iaddiu vi1, vi0, 0xff
.EndMPG
MSCAL 0
.EndDmaData
The small code just adds a bit to vf1 and vi1. You could store them to
VU data memory and then check that the result is reasonable by printing it from the EE-mapped memory area.
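If you build the packet by hand in C instead of with dvp-as, the MPG and MSCAL tags are just 32-bit words. Below is a rough sketch of how they could be packed. The field layout (CMD in the top byte, NUM in the next, a 16-bit IMMEDIATE at the bottom) and the command numbers 0x4A and 0x14 are from my memory of the EE manuals, so treat them as assumptions to verify; vif_code, vif_mpg and vif_mscal are names I made up.

```c
#include <assert.h>
#include <stdint.h>

/* VIFcode command numbers, as I recall them (verify before use). */
enum { VIF_CMD_MSCAL = 0x14, VIF_CMD_MPG = 0x4A };

/* Pack one VIFcode: CMD in bits 31..24, NUM in bits 23..16,
 * IMMEDIATE in bits 15..0. */
static uint32_t vif_code(uint32_t cmd, uint32_t num, uint32_t imm)
{
    return ((cmd & 0xFFu) << 24) | ((num & 0xFFu) << 16) | (imm & 0xFFFFu);
}

/* MPG: announce `instr_count` 64-bit instruction pairs (1..256) to be
 * loaded at micro memory address `addr`, counted in 64-bit units.
 * A NUM field of 0 stands for 256. */
static uint32_t vif_mpg(uint32_t addr, uint32_t instr_count)
{
    assert(instr_count >= 1 && instr_count <= 256);
    return vif_code(VIF_CMD_MPG, instr_count & 0xFFu, addr);
}

/* MSCAL: start the micro program at `addr` (64-bit units). */
static uint32_t vif_mscal(uint32_t addr)
{
    return vif_code(VIF_CMD_MSCAL, 0, addr);
}
```

The one-instruction example above would then be vif_mpg(0, 1), followed by the 64-bit instruction pair, followed by vif_mscal(0).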
Uploading data via VIF can be done in various ways since the UNPACK
tag has a lot of attributes. The basic one is called V4_32: the data is expected to be arranged as four 32-bit floats that go directly to x, y, z, w.
A simple dvp-as example:
unpack 4, 4, V4_32, 4, *
fwzyx 0.0, 1.0, 2.0, 3.0
fwzyx 0.0, 1.0, 2.0, 3.0
fwzyx 0.0, 1.0, 2.0, 3.0
fwzyx 0.0, 1.0, 2.0, 3.0
.EndUnpack
Normally you will upload the code without any MSCAL or MSCNT, and put the MSCAL or MSCNT after the UNPACK data instead, since you will upload data a lot more often than code. So upload the code once, then end each data upload with an MSCAL/MSCNT tag to tell the VU to start processing it.
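To make the "data, then kick" pattern concrete, here is a sketch of building one such batch in C: a V4_32 UNPACK tag, the vectors, an MSCAL, and NOP padding up to a whole quadword since DMA moves whole quadwords. The UNPACK V4_32 command number (0x6C) and the 10-bit quadword address in the immediate field are from my memory of the EE manuals, and build_batch is a made-up name, so take this as a starting point only. It also assumes 32-bit IEEE floats and plain write cycles (no skipping), so NUM is simply the vector count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VIF_CMD_NOP = 0x00, VIF_CMD_MSCAL = 0x14, VIF_CMD_UNPACK_V4_32 = 0x6C };

static uint32_t vif_code(uint32_t cmd, uint32_t num, uint32_t imm)
{
    return ((cmd & 0xFFu) << 24) | ((num & 0xFFu) << 16) | (imm & 0xFFFFu);
}

/* Build UNPACK(V4_32) + data + MSCAL into `out`.
 * xyzw:       vec_count * 4 floats, x first, in memory order
 * vu_addr:    destination in VU data memory, in quadwords
 * start_addr: micro program entry point for MSCAL
 * Returns the number of 32-bit words written, or 0 if the batch is
 * empty, too long for one UNPACK in this sketch, or does not fit. */
static size_t build_batch(uint32_t *out, size_t out_words,
                          const float *xyzw, uint32_t vec_count,
                          uint32_t vu_addr, uint32_t start_addr)
{
    assert(sizeof(float) == sizeof(uint32_t));
    size_t needed = 1 + 4 * (size_t)vec_count + 1;
    size_t padded = (needed + 3) & ~(size_t)3;   /* round up to a quadword */
    if (vec_count == 0 || vec_count > 255 || padded > out_words) {
        return 0;
    }
    size_t n = 0;
    out[n++] = vif_code(VIF_CMD_UNPACK_V4_32, vec_count, vu_addr & 0x3FFu);
    memcpy(&out[n], xyzw, 4 * (size_t)vec_count * sizeof(float));
    n += 4 * (size_t)vec_count;
    out[n++] = vif_code(VIF_CMD_MSCAL, 0, start_addr);
    while (n < padded) {
        out[n++] = vif_code(VIF_CMD_NOP, 0, 0);  /* pad with VIF NOPs */
    }
    return n;
}
```

The code upload (MPG) is left out on purpose: it goes in its own packet, sent once, and after that you only send batches like this one.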
For more reading material, check out the source for Funslower or Aura for Laura. Jar has also included source in two of his contributions to www.thethirdcreation.net; be sure to check them out as well.
Kung VU
ooPo wrote: Some questions: What can you see/access from a running VU program? Can it access the main EE memory? Can it access the GS/GIF? I've seen some programs say they use an interrupt-based way of getting a texture uploaded before the geometry is sent to the GS. How would that work?
I believe VU1 can access VU0 RAM, or it may be the other way around, I'm not sure. Remember that VU0 is 4k Code and 4k Data, and VU1 is 16k Code and 16k Data.
The EE can access either of VU0 or VU1 code or data, but not the other way around.
VU1 is directly tied to the GIF.
One thing I think should be mentioned in this regard, which seems obvious once you've done some VU stuff but which I didn't know when I started out, is that you need to use packed mode for your GIF packets rather than the A+D mode that most tutorials use.
I recommend playing around with packed mode and getting transformations/lighting/texturing right on the EE core (i.e. in your normal C code), and then converting it to VU code.
Also, when trying to get VIF packets right it's, imho, best to upload the packet, do some kind of vif_wait(), and then read back the memory to see if it was written as expected. This goes especially for the more obscure unpacking methods.
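The read-back check can be as simple as comparing what you expected against what landed in the mapped memory, quadword by quadword. Here is a small sketch of that comparison. first_mismatch_qword and print_qword are hypothetical helper names, and both buffers are plain arrays here; on the PS2 the read-back pointer would be the EE-mapped VU data memory, read after the upload has finished.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Return the index of the first quadword that differs between the
 * expected image and the memory read back, or -1 if all match. */
static long first_mismatch_qword(const uint32_t *expected,
                                 const uint32_t *readback, size_t qwords)
{
    assert(expected != NULL && readback != NULL);
    for (size_t q = 0; q < qwords; q++) {
        for (size_t w = 0; w < 4; w++) {
            if (expected[4 * q + w] != readback[4 * q + w]) {
                return (long)q;
            }
        }
    }
    return -1;
}

/* Print one quadword as four hex words, lowest word (x) first. */
static void print_qword(const char *label, const uint32_t *mem, size_t q)
{
    printf("%s[%lu]: %08lx %08lx %08lx %08lx\n", label, (unsigned long)q,
           (unsigned long)mem[4 * q + 0], (unsigned long)mem[4 * q + 1],
           (unsigned long)mem[4 * q + 2], (unsigned long)mem[4 * q + 3]);
}
```

When a mismatch turns up, printing the expected and actual quadword side by side usually makes it obvious whether the address, the element order or the unpack format was wrong.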
ooPo wrote: Some questions: What can you see/access from a running VU program? Can it access the main EE memory? Can it access the GS/GIF? I've seen some programs say they use an interrupt-based way of getting a texture uploaded before the geometry is sent to the GS. How would that work?
Neither VU0 nor VU1 in micro mode can access the EE. While running they are standalone and can only be controlled by the VIF (without the VIF you can start, stop and reset them, that's it).
When not running, you can upload code/data to the EE-mapped memory.
You must also stop VU0/VU1 and VIF0/VIF1 before reading, since you can't trust the mapped data while they are in the run state.
Now that being said, VIF and VU can generate an interrupt normally. For example, you have an unpack batch being sent and then an MSCAL with the interrupt bit set (all VIF tags have a bit for interrupt generation); on the R5900 you would have something that checks for it being set and sends off a texture or whatever you want to do.
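As a small illustration of the interrupt bit: as far as I remember it is the top bit of the 32-bit VIF tag, so setting it is a single OR. The sketch below assumes that, plus the MSCAL command number 0x14; the helper names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt bit of a VIF tag: assumed to be bit 31 of the word. */
#define VIF_IRQ_BIT 0x80000000u

/* MSCAL (command 0x14, assumed) with the interrupt bit set, so the
 * VIF raises an interrupt when it reaches this tag. */
static uint32_t vif_mscal_irq(uint32_t addr)
{
    assert(addr <= 0xFFFFu);
    return VIF_IRQ_BIT | (0x14u << 24) | addr;
}

/* Nonzero if the given tag asks for an interrupt. */
static int vif_has_irq(uint32_t code)
{
    return (code & VIF_IRQ_BIT) != 0;
}
```

What the EE does when the interrupt arrives (for example sending the next texture) is up to your handler and is not shown here.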
Last edited by blackdroid on Thu Jan 22, 2004 7:46 am, edited 2 times in total.
Quote: I believe VU1 can access VU0 RAM, or it may be the other way around I'm not sure. Remember that VU0 is 4k Code and 4k Data, and VU1 is 16k Code and 16k Data. The EE can access either of VU0 or VU1 code or data, but not the other way around. VU1 is directly tied to the GIF.
VU1's registers are mapped to VU0 memory after 0x4000;
nothing else is cross-mapped.
Last edited by blackdroid on Thu Jan 22, 2004 7:51 am, edited 1 time in total.
ooPo wrote: I'm using packed mode currently for my gif packets, but I was pondering going to A+D mode for the xyz register so I can remove the drawing kick if the poly is clipped. Is there any reason other than memory size to do packed mode on the VU?
Squeezing xyz into 64 bits is something that the VUs were not designed for.
Therefore packed mode is the logical choice for the VUs, which is a bit of
a dumb name since the packing gets done by the GIF.
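For reference, here is a sketch of what a packed-mode GIF tag and one packed XYZ2 quadword look like when built in C. The bit positions (NLOOP in bits 0..14, EOP at bit 15, FLG at bits 58..59, NREG at bits 60..63, register descriptors in the upper 64 bits) and the register numbers are from my memory of the GS manual, and all the names are made up, so double-check before relying on it. I also recall that a packed XYZ2 quadword carries an ADC bit at bit 111 that suppresses the drawing kick, which would cover the clipping case without going to A+D, but treat that as an assumption too.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t lo, hi; } gif_qword_t;

enum { GIF_FLG_PACKED = 0 };                        /* assumed encoding */
enum { GIF_REG_RGBAQ = 0x1, GIF_REG_XYZ2 = 0x5 };   /* assumed numbers */

/* Build a PACKED-mode GIF tag for `nloop` repetitions of the `nreg`
 * register descriptors in `regs` (4 bits each, first one lowest). */
static gif_qword_t gif_tag_packed(uint32_t nloop, int eop,
                                  const uint8_t *regs, uint32_t nreg)
{
    assert(nloop <= 0x7FFFu && nreg >= 1 && nreg <= 16);
    gif_qword_t t;
    t.lo = (uint64_t)nloop
         | ((uint64_t)(eop ? 1 : 0) << 15)
         | ((uint64_t)GIF_FLG_PACKED << 58)
         | ((uint64_t)(nreg & 0xFu) << 60);         /* 16 is stored as 0 */
    t.hi = 0;
    for (uint32_t i = 0; i < nreg; i++) {
        t.hi |= (uint64_t)(regs[i] & 0xFu) << (4 * i);
    }
    return t;
}

/* One packed XYZ2 quadword: x in bits 0..15, y in bits 32..47, z in
 * bits 64..95, and the assumed ADC (no drawing kick) bit at bit 111. */
static gif_qword_t gif_packed_xyz2(uint16_t x, uint16_t y, uint32_t z,
                                   int no_kick)
{
    gif_qword_t q;
    q.lo = (uint64_t)x | ((uint64_t)y << 32);
    q.hi = (uint64_t)z | ((uint64_t)(no_kick ? 1 : 0) << 47);
    return q;
}
```

Note how each component gets its own 32-bit slot in the quadword, which is exactly the layout a 128-bit VU register stores naturally.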
btw: http://ps2linux.diabolus-ex-machina.com ... or%20Units has some nice links
Since I am a newbie at PS2 programming, let me be the first to ask this question...
What is a Vector Unit?
Now, please stay in your chairs and stop yourself from laughing as much as you can.... please....stop. Anyway, the PS2 seems so complicated compared to the GBA (which I am working on right now). There really is only one set of tutorials out there for a newbie like me.
So just use me as a guinea pig. I'll ask y'all questions and you *will* answer them, then stuff them into a FAQ, hopefully a guide, maybe someday an SDK, so n00bs don't have to suffer.
Oh, yea, hello everyone.
while (your_engine >= my_engine)
my_engine++; :P
The vector unit can be described as a very specialized processor, a kind of cross between MMX and a vertex shader on a graphics card. You can write a small program to get a VU to quickly do math on data, like 3D transformations or complex physics. It has some useful SIMD (single instruction, multiple data) opcodes and a special connection linked directly to the GS. Very useful, but very exotic.
It might be good to mention that there is no "the" vector unit.
The PS2 has two vector units: one (VU0) that is closer to the R5900 CPU (since it can operate in coprocessor mode) and the other (VU1) which has its own bus to the GIF.
VU0, as mentioned before, has 4KB for data and 4KB for code.
VU1, as mentioned before, has 16KB for data and 16KB for code.
What they are really good at is matrix math. Why? Well,
mul.xyzw vf3, vf2, vf1 <- 4 muls in one go, which fits nicely for
4x4 matrices (and smaller), and that's what 3D engines like to do the most.
All the float registers on the vector units are 128 bits wide and divided in
four, so we have four 32-bit single-precision floats in each register. The memory
on the VUs is laid out in 128-bit fashion, so when you do a load/store against it
you do it on 128-bit alignment.
Anything else?
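To show why four multiplies per instruction fit 4x4 matrices so well, here is a plain C sketch of a matrix-times-vector transform written the way it is usually done on the VU: each matrix column is scaled by one component of the vector and accumulated, which maps onto one mulax plus three madd-style instructions. The types and function names are made up for the example, and the matrix is assumed to be stored as four column vectors.

```c
#include <assert.h>
#include <stddef.h>

/* One VU float register: four single-precision floats. */
typedef struct { float x, y, z, w; } vec4_t;

/* All four lanes multiplied by the same scalar, like mulax. */
static vec4_t vec4_scale(vec4_t v, float s)
{
    vec4_t r = { v.x * s, v.y * s, v.z * s, v.w * s };
    return r;
}

/* acc + v * s on all four lanes, like the madda/madd family. */
static vec4_t vec4_madd(vec4_t acc, vec4_t v, float s)
{
    vec4_t r = { acc.x + v.x * s, acc.y + v.y * s,
                 acc.z + v.z * s, acc.w + v.w * s };
    return r;
}

/* Transform v by the matrix whose columns are col[0..3]. */
static vec4_t mat4_transform(const vec4_t *col, vec4_t v)
{
    assert(col != NULL);
    vec4_t acc = vec4_scale(col[0], v.x);   /* mulax  */
    acc = vec4_madd(acc, col[1], v.y);      /* madday */
    acc = vec4_madd(acc, col[2], v.z);      /* maddaz */
    return vec4_madd(acc, col[3], v.w);     /* maddw  */
}
```

Getting this right in C first and then translating it line by line is an easy way into VU code, since each line above becomes roughly one instruction.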