Confusion about GU Texture swizzling

blasty
Posts: 9
Joined: Mon Aug 22, 2005 12:07 am

Post by blasty »

Hi,

I need to get some things straight, I guess. Today I have been busy porting my 2D GFX functions to use the GU instead of writing directly to VRAM (with a double-buffer technique). As an example I took the "blit" demo from the PSPSDK samples directory, which renders a flat 480x272 texture to the screen. I got this all working nicely, with one downside: the animation wasn't 100% smooth. Someone suggested I might need to flush the caches using sceKernelDcacheWritebackAll(), but this didn't change a thing.

Then I read on this forum about texture swizzling. I enabled swizzling in sceGuTexMode() and took the example function from [here] (thanks chp). Now comes my actual problem: I don't know how to use this function properly. It requires an input and an output pointer, both to chars, but my texture is in raw 16-bit format.

My code looks (shortened) like this:

Code:

  static unsigned short __attribute__((aligned(16))) pixels[512*272]; // display buffer
  static unsigned int __attribute__((aligned(16))) list[262144];

  // Setup GU
  sceGuStart(GU_DIRECT,list);
  sceGuDrawBuffer(GU_PSM_5551,(void*)0,512); 
  sceGuDispBuffer(480,272,(void*)0x88000,512);

  sceGuDepthBuffer((void*)0x110000,512);
  sceGuOffset(2048 - (480/2),2048 - (272/2));
  sceGuViewport(2048,2048,480,272);

  sceGuDepthRange(0xc350,0x2710);
  sceGuScissor(0,0,480,272);
  sceGuEnable(GU_SCISSOR_TEST);
  sceGuFrontFace(GU_CW);
  sceGuEnable(GU_TEXTURE_2D);

  sceGuClear(GU_COLOR_BUFFER_BIT|GU_DEPTH_BUFFER_BIT);
  sceGuFinish();
  sceGuSync(0,0);

  sceDisplayWaitVblankStart();
  sceGuDisplay(1);

  struct Vertex* vertices;
  unsigned short *texture;

  texture = malloc(sizeof(pixels));
  
  // swizzle texture
  swizzle(texture, &pixels, 480*2, 272); // pixel width = 2 bytes, so x2
  
  sceGuStart(GU_DIRECT,list);
  
  // setup the source buffer as a 512x512 texture, but only copy 480x272
  sceGuTexMode(GU_PSM_5551,0,0,GU_TRUE);
  sceGuTexImage(0,512,512,512,texture);
  sceGuTexFunc(GU_TFX_REPLACE,GU_TCC_RGB);
  sceGuTexFilter(GU_NEAREST,GU_NEAREST);
  sceGuTexScale(1.0f/512.0f,1.0f/512.0f); // scale UVs to 0..1
  sceGuTexOffset(0.0f,0.0f);
  sceGuAmbientColor(0xffffffff);
  
  for (j = 0; j < 480; j = j+SLICE_SIZE) {
    vertices = (struct Vertex*)sceGuGetMemory(2 * sizeof(struct Vertex));

    vertices[0].u = j; vertices[0].v = 0;
    vertices[0].color = 0;
    vertices[0].x = j; vertices[0].y = 0; vertices[0].z = 0;
    vertices[1].u = j+SLICE_SIZE; vertices[1].v = 272;
    vertices[1].color = 0;
    vertices[1].x = j+SLICE_SIZE; vertices[1].y = 272; vertices[1].z = 0;
    sceGuDrawArray(GU_SPRITES,GU_TEXTURE_16BIT|GU_COLOR_5551|GU_VERTEX_16BIT|GU_TRANSFORM_2D,2,0,vertices);
  }

  sceGuFinish();
  sceGuSync(0,0);
This doesn't give the wanted result. I see parts of my image data back, but it's not ordered properly. Does anyone have a clue what I'm missing/doing wrong here?

Any help would be highly appreciated!
chp
Posts: 313
Joined: Wed Jun 23, 2004 7:16 am

Post by chp »

Quite simple: you're swizzling it as a 480x272 16-bit image, while the buffer is clearly 512x272 16-bit. With a width of 960 bytes (480*2) instead of the real 1024 (512*2), every row after the first is read from the wrong offset, and the unused padding shifts all the blocks.

If you wanted to swizzle just part of a texture, you'd have to take the buffer width into account as well. The swizzle functions I wrote do not do that at the moment.
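
To make the layout concrete, here is a minimal host-side sketch of the block arrangement the swizzle functions in this thread assume: blocks of 16 bytes by 8 rows, stored one after another. The names swizzled_offset and swizzle_ref are made up for illustration and are not part of any SDK. The width is the full row pitch in bytes, so 1024 for a 512-texel-wide 16-bit buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offset in the swizzled buffer of the byte at column x (in bytes), row y.
   Assumes 16-byte x 8-row blocks stored row-major (128 bytes per block). */
size_t swizzled_offset(size_t x, size_t y, size_t width_bytes)
{
    size_t blocks_per_row = width_bytes / 16;
    size_t block_index = (y / 8) * blocks_per_row + (x / 16);
    return block_index * 128 + (y % 8) * 16 + (x % 16);
}

/* Slow byte-by-byte reference swizzle; width_bytes is the real row pitch. */
void swizzle_ref(uint8_t *out, const uint8_t *in,
                 size_t width_bytes, size_t height)
{
    assert(width_bytes % 16 == 0 && height % 8 == 0);
    for (size_t y = 0; y < height; ++y)
        for (size_t x = 0; x < width_bytes; ++x)
            out[swizzled_offset(x, y, width_bytes)] = in[y * width_bytes + x];
}
```

It is far too slow for per-frame use, but it can serve as a reference to compare a faster routine against on a PC.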
GE Dominator
blasty
Posts: 9
Joined: Mon Aug 22, 2005 12:07 am

Post by blasty »

Damn, that was quite obvious actually. I just fixed that bug in the source.

But my texture still doesn't come out correctly. It's divided into 4 vertical strips and seems to skip a row of pixels each time (which ends up black in the result). Very strange.
blasty
Posts: 9
Joined: Mon Aug 22, 2005 12:07 am

Post by blasty »

Nobody has an idea what I am doing wrong here? Furthermore, I have looked through the PSPWARE SVN directory to see if any of the projects use texture swizzling, but none of them do. Why? Am I the only person experiencing heavy frame tearing when copying a 480x272 texture a few times every second? I have fiddled around with this for 12+ hours now, and I'm getting kind of fed up with it. Without swizzling my code looks exactly the same, except that no memory gets malloc()'d, the swizzle parameter is not set in sceGuTexMode(), and "pixels" is referenced instead of "texture". A working example, or an explanation of why my stuff is so slow, would be highly appreciated.
Panajev2001a
Posts: 100
Joined: Sat Aug 20, 2005 3:25 am

Post by Panajev2001a »

Try this function: it is the original function chp optimized for the Wiki submission, and I made a few small changes to it (shifts instead of divides, etc.).

Code:

/* width is the row width in bytes (a multiple of 16), height is in rows (a multiple of 8) */
void swizzle_fast(u8* out, const u8* in, unsigned int width, unsigned int height)
{
   unsigned int blockx, blocky;
   unsigned int j;
 
   unsigned int width_blocks = (width >> 4);   /* blocks are 16 bytes wide */
   unsigned int height_blocks = (height >> 3); /* and 8 rows high */
 
   unsigned int src_pitch = (width-16) >> 2;   /* in u32s: skip to the next row of the block */
   unsigned int src_row = width << 3;          /* in bytes: 8 source rows */
 
   const u8* ysrc = in;
   u32* dst = (u32*)out;
 
   for (blocky = 0; blocky < height_blocks; ++blocky)
   {
      const u8* xsrc = ysrc;
      for (blockx = 0; blockx < width_blocks; ++blockx)
      {
         const u32* src = (const u32*)xsrc;
         for (j = 0; j < 8; ++j)
         {
            /* copy one 16-byte row of the block */
            *(dst++) = *(src++);
            *(dst++) = *(src++);
            *(dst++) = *(src++);
            *(dst++) = *(src++);
            src += src_pitch;
         }
         xsrc += 16;
     }
     ysrc += src_row;
   }
}
The best thing would be to swizzle outside of your PSP program, so that the textures are already swizzled when you load them.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

Panajev2001a wrote: Try this function: it is the original function chp optimized for the Wiki submission, and I made a few small changes to it (shifts instead of divides, etc.).
Most compilers replace division by power-of-two integers with shifts. Check the compiler or objdump output to be sure.
Panajev2001a
Posts: 100
Joined: Sat Aug 20, 2005 3:25 am

Post by Panajev2001a »

holger wrote:
Panajev2001a wrote: Try this function: it is the original function chp optimized for the Wiki submission, and I made a few small changes to it (shifts instead of divides, etc.).
Most compilers replace division by power-of-two integers with shifts. Check the compiler or objdump output to be sure.
I tend not to trust the compiler ;).
ector
Posts: 195
Joined: Thu May 12, 2005 10:22 pm

Post by ector »

holger wrote:
Panajev2001a wrote: Try this function: it is the original function chp optimized for the Wiki submission, and I made a few small changes to it (shifts instead of divides, etc.).
Most compilers replace division by power-of-two integers with shifts. Check the compiler or objdump output to be sure.
That's true for multiplication (left shifts), but division (right shifts) isn't as simple. You must use unsigned variables for the compiler to be able to fully replace a division with a right shift; otherwise it has to add fix-up code to account for the fact that -1 >> 1 == -1, and not 0 as the division -1 / 2 would give.
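
A small sketch of the fix-up described above, assuming a compiler where >> on a negative signed value is an arithmetic shift (true for GCC, but implementation-defined in C). The function names are illustrative only.

```c
#include <stdint.h>

/* Unsigned: division by 2 and a right shift always agree. */
uint32_t half_unsigned(uint32_t x)
{
    return x >> 1;
}

/* Signed: C division truncates toward zero, while an arithmetic shift rounds
   toward negative infinity. Adding the sign bit first makes the shift match
   the result of x / 2. */
int32_t half_signed(int32_t x)
{
    int32_t sign = (int32_t)((uint32_t)x >> 31); /* 1 if x < 0, else 0 */
    return (x + sign) >> 1;
}
```

With unsigned operands, as in the swizzle code, none of this extra work is needed.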
http://www.dtek.chalmers.se/~tronic/PSPTexTool.zip Free texture converter for PSP with source. More to come.
blasty
Posts: 9
Joined: Mon Aug 22, 2005 12:07 am

Post by blasty »

Boohoo, I'm still stuck on this problem. I also tried Panajev's function, but it basically does the same as chp's. To clarify a little, I have redrawn the result of what I get when I swizzle my textures. It looks like [this], but it's supposed to look like [this]. (yes, a white rectangle ;))

My code still looks the same, except for the mistake I fixed in the swizzle() call (the one chp mentioned earlier in this thread).
ector
Posts: 195
Joined: Thu May 12, 2005 10:22 pm

Post by ector »

The width of your image has to be a multiple of 32 bytes. You can't just try to swizzle a 23x20 pixel image and expect it to come out right..
blasty
Posts: 9
Joined: Mon Aug 22, 2005 12:07 am

Post by blasty »

Heh, I guess I was a little vague. This is just a quick cut and paste of the result. The texture itself is 512x512.
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

Did you ever manage to fix your problem?

I'm having trouble swizzling anything. I'm trying to do a 128*128 texture, but it's not working. I went back to swizzling the texture that came with the cube demo instead, to find out what I'm doing wrong, but that's still not working either. Is anything obviously wrong with the following?

Code:

extern unsigned char logo_start[];
const unsigned short swizzleData[64*64] = {0};

...

swizzle(swizzleData,logo_start, 64, 64);

...

sceGuTexMode(GU_PSM_4444,0,0,TRUE);
sceGuTexImage(0,64,64,64,swizzleData);

www.PSP-Files.com - PSP News, Hacks etc.

www.HiAsh.com - My work and stuff
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

You must not declare arrays that you intend to modify as "const"; use "static" instead.
chp
Posts: 313
Joined: Wed Jun 23, 2004 7:16 am

Post by chp »

I can spot one problem in your code: you are passing the width in texels to swizzle(), while the code itself assumes the width in bytes, so the width for a 64x64 16-bit texture is in fact 128 (64 texels times 2 bytes).

I'll take a look at writing a "complete" approach for the wiki soon, which should make the code easier to understand, for example by passing a pixel format so the width no longer depends on the texel size, and by supporting swizzling of just part of a texture.
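
As a rough illustration of the conversion, here is a tiny helper. The name row_bytes is hypothetical (it is not part of the wiki code), and it assumes the usual texel sizes of 4, 8, 16 or 32 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes per row for a texture that is `texels` wide at `bits_per_texel`
   bits each. This is the width the swizzle functions in this thread expect. */
size_t row_bytes(size_t texels, unsigned bits_per_texel)
{
    size_t bits = texels * bits_per_texel;
    assert(bits % 8 == 0); /* a row must be a whole number of bytes */
    return bits / 8;
}
```

With it, the call for a 64x64 16-bit texture would read swizzle(swizzleData, logo_start, row_bytes(64, 16), 64).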
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

Hi chp,

If you change the calling convention of the swizzle/tile function so that the texture stride is passed as an argument and not calculated internally, you avoid bugs like this and even allow more flexible use.

By the way, using the VFPU you can turn the inner loop into a single 128-bit vector load/store pair.
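
A minimal sketch of what a stride-taking variant could look like. The name swizzle_pitch and its parameter order are assumptions for illustration, not the wiki function. It uses a plain memcpy per 16-byte block row rather than anything VFPU-specific.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Swizzle a width_bytes x height region whose source rows are src_pitch
   bytes apart. The output is a tightly packed swizzled image. */
void swizzle_pitch(uint8_t *out, const uint8_t *in, size_t src_pitch,
                   size_t width_bytes, size_t height)
{
    assert(width_bytes % 16 == 0 && height % 8 == 0);
    assert(src_pitch >= width_bytes);

    for (size_t by = 0; by < height / 8; ++by) {
        for (size_t bx = 0; bx < width_bytes / 16; ++bx) {
            const uint8_t *src = in + by * 8 * src_pitch + bx * 16;
            for (size_t row = 0; row < 8; ++row) {
                memcpy(out, src, 16); /* one 16-byte row of the block */
                out += 16;
                src += src_pitch;     /* next source row, padding included */
            }
        }
    }
}
```

Because the pitch is explicit, the same function covers both a full buffer (src_pitch equal to width_bytes) and a sub-rectangle of a wider one.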
chp
Posts: 313
Joined: Wed Jun 23, 2004 7:16 am

Post by chp »

But since people mistake the width for the texel width, it does show that it's unclear as it is now as well, doesn't it? One good approach would perhaps be to make the prototype identical to sceGuCopyImage() (from my perspective; I guess you want something else :)), since people know at least somewhat how to use it and what the parameters mean.

Optimizing using the VFPU is of secondary interest right now, actually. The question is whether there is any gain, depending on how we can pipeline memory accesses from the VFPU, or whether any gain can be had at all, since you'll probably be cache-bound when reading the source data anyway. Another approach would be to read the data in a linear fashion and write it in blocks. This could improve performance more than using 128-bit reads/writes, since we might not have to refill the cache as often.
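
The "read linearly, write in blocks" idea can be sketched as follows. This is only an illustration with a made-up name (swizzle_linear_read), and whether it is actually faster on the PSP is untested here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Walk the source strictly in order and scatter each 16-byte piece to its
   place in the swizzled output (16-byte x 8-row blocks, 128 bytes each). */
void swizzle_linear_read(uint8_t *out, const uint8_t *in,
                         size_t width_bytes, size_t height)
{
    assert(width_bytes % 16 == 0 && height % 8 == 0);
    size_t blocks_per_row = width_bytes / 16;

    for (size_t y = 0; y < height; ++y) {
        /* First block of this block-row, at row (y % 8) inside the block. */
        uint8_t *dst = out + (y / 8) * blocks_per_row * 128 + (y % 8) * 16;
        for (size_t bx = 0; bx < blocks_per_row; ++bx) {
            memcpy(dst, in, 16);
            in += 16;   /* source advances linearly */
            dst += 128; /* same row in the next block */
        }
    }
}
```

The output is the same as with the block-ordered loop; only the order of memory accesses changes.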
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

Thanks for clearing up the width parameter for me. And I can't believe I made such a stupid mistake leaving that 'const' there... The perils of Copy & Paste.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

chp wrote: Optimizing using the VFPU is of secondary interest right now, actually. The question is whether there is any gain, depending on how we can pipeline memory accesses from the VFPU, or whether any gain can be had at all, since you'll probably be cache-bound when reading the source data anyway. Another approach would be to read the data in a linear fashion and write it in blocks. This could improve performance more than using 128-bit reads/writes, since we might not have to refill the cache as often.
By setting the cache policy bit in the sv.q instruction you can bypass the cache entirely. You can also avoid cache pollution by using the uncached high memory address space. In our pspgl experiments, uncached word writes to the GE command buffer performed better than cached ones; I assume there is a write combiner unit in the uncached write path.
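
For reference, the uncached address space is commonly reached by setting one address bit. The sketch below assumes the usual PSP convention that OR-ing an address with 0x40000000 selects the uncached mirror; the helper names are hypothetical, and the mask does not come from this thread.

```c
#include <stdint.h>

/* Assumption: bit 30 of a PSP address selects the uncached mirror
   of the same memory. */
#define PSP_UNCACHED_BIT 0x40000000u

/* Address of the uncached alias of a cached address. */
uint32_t psp_uncached_addr(uint32_t cached_addr)
{
    return cached_addr | PSP_UNCACHED_BIT;
}

/* Back from the uncached alias to the normal cached address. */
uint32_t psp_cached_addr(uint32_t uncached_addr)
{
    return uncached_addr & ~PSP_UNCACHED_BIT;
}
```

Data that is still sitting in the data cache has to be written back before the same memory is accessed through the uncached alias.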
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

Code:

extern unsigned char logo_start[];
static unsigned short swizzleData[64*64] = {0};
...
swizzle(swizzleData,logo_start, 64*2, 64);
//swizzle_fast(swizzleData,logo_start, 64*2, 64);
...
sceGuTexMode(GU_PSM_4444,0,0,TRUE);
sceGuTexImage(0,64,64,64,swizzleData);
Odd that this still shows a corrupt image...
ector
Posts: 195
Joined: Thu May 12, 2005 10:22 pm

Post by ector »

ashleydb, do you flush the texture cache?
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

No, I wasn't. I just tried adding it here:

sceGuTexFlush();
sceGuTexImage(0,64,64,64,swizzleData);

But that made it look worse...
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

You also need to flush DCache before calling the TexImage function.
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

From this in another post:
sceKernelDcacheWritebackAll() before calling the glTex*() functions, glFinish() after the vertex render functions before you are moving to the next texture on the same texture object.
I made this:

Code:

void GameMain(void)
{
	sceGuStart(GU_DIRECT,list);

	//Clear screen and depth buffers.
	sceGuClearColor(0xff000000);
	sceGuClearDepth(0);
	sceGuClear(GU_COLOR_BUFFER_BIT|GU_DEPTH_BUFFER_BIT);

	//Read in the user input
	GetInput();


	//Setup model matrix for the cube. Projection and View already done once since camera is static.
	sceGumMatrixMode(GU_MODEL);
	sceGumLoadIdentity();
	{
		ScePspFVector3 pos = { g_fTranslateX, g_fTranslateY, g_fTranslateZ };
		ScePspFVector3 rot = {
			g_fRotateX * (M_PI/180.0f),
			g_fRotateY * (M_PI/180.0f),
			g_fRotateZ * (M_PI/180.0f) };
		sceGumRotateXYZ(&rot);
		sceGumTranslate(&pos);
	}

	// setup texture
	sceGuTexMode(GU_PSM_4444,0,0,TRUE);


	//-------------ADDED THESE LINES----------------
	sceKernelDcacheWritebackAll(); 
	sceGuTexFlush();


	//Set current texturemap.
	sceGuTexImage(0,64,64,64,swizzleData);

	//Set how textures are applied.
	sceGuTexFunc(GU_TFX_REPLACE,GU_TCC_RGB);

	//Set how the texture is filtered.
	sceGuTexFilter(GU_LINEAR,GU_LINEAR);

	sceGuTexScale(1.0f,1.0f);

	//?Where to start the texture when drawing?
	sceGuTexOffset(0.0f,0.0f);

	//?Set the ambient light colour?
	sceGuAmbientColor(0xffffffff);

	//Draw array of vertices forming the cube.
	sceGumDrawArray(GU_TRIANGLES,GU_TEXTURE_32BITF|GU_COLOR_8888|GU_VERTEX_32BITF|GU_TRANSFORM_3D,12*3,0,vertices);


	//Finish current display list and go back to the parent context.
	sceGuFinish();

	//Wait until display list has finished executing. Pass (0,0)
	sceGuSync(0,0);

	//Wait for the screen to finish drawing, (the next Vertical Refresh will be starting)
	sceDisplayWaitVblankStart();

	//Swap display and draw buffer so we see what we just drew.
	sceGuSwapBuffers();
}
The black bits are gone, but the texture still isn't right. I also tried moving those lines above sceGuTexMode(GU_PSM_4444,0,0,TRUE); but that didn't help either.
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

Call sceGuTexFlush() after sceGuTexImage() and before sceGumDrawArray(); otherwise it does not make much sense. In addition to sceGuTexFlush(), it may make even more sense to also call sceGuTexSync(), in order to force the GE to wait until the texture transfer is finished before starting to render.

So the entire sequence looks like:

Code:

/* paint your texture, and swizzle if you want to */
sceKernelDcacheWritebackAll();  /* get all data into memory */
/* upload your texture to VRAM if you want to, e.g. using CopyImage() */
TexImage();  /* set up DMA pointers of GE for texture access */
TexFlush();  /* start DMA transfer */
/* ... do other things ...*/
TexSync();   /* let the GE wait for end of DMA transfer */
DrawArray(); /* trigger vertex array rendering */

ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

Thanks again for your help. That all makes good sense. After implementing the calls in the order you suggest, I really don't know why it's still not working, but I don't want to keep bothering you guys with it. I guess I'll keep plugging away and see if I get anywhere eventually.
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

Has anyone got any example code of doing this stuff so that it works? Maybe looking at a working implementation would point out an obvious error in my code.

Thanks!
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

ashleydb wrote:Has anyone got any example code of doing this stuff so that it works? Maybe looking at a working implementation would point out an obvious error in my code.

Thanks!
Anyone? Please? ;)
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

what about the cube sample in gu/samples/ ?
ashleydb
Posts: 26
Joined: Mon Oct 03, 2005 2:06 am
Location: USA
Contact:

Post by ashleydb »

It doesn't swizzle anything. That is the demo I'm trying to edit. Unless it is swizzled and I'm trying to swizzle it again...
holger
Posts: 204
Joined: Thu Aug 18, 2005 10:57 am

Post by holger »

haven't you been talking about cache inconsistency problems?