Blitting 24 bit image data into 32 bit frame buffer...

pailes
Posts: 16
Joined: Fri Feb 16, 2007 4:31 am

Post by pailes »

Hi there,

I was wondering what would be the fastest way on MIPS to blit a 24 bit image (3 bytes per pixel) into the 32 bit frame buffer without causing alignment issues?
Right now I'm moving three single bytes which feels kinda slow. I could move from 24 bit image data to 32 bit image data and copy dwords but then I'd be wasting some memory I could use otherwise.
Any tips or hints?
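
For reference, the byte-wise approach described above might look like the following C sketch. The function name, the R, G, B, A byte order in the frame buffer, and the fully opaque alpha are assumptions for illustration, not details taken from the post.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise baseline: expand `count` packed RGB pixels (3 bytes each)
 * into 32-bit pixels (4 bytes each).
 * Assumption: the frame buffer stores R, G, B, A in ascending byte order
 * and alpha should be fully opaque (0xff). */
static void blit_rgb24_bytes(uint8_t *dst, const uint8_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        dst[0] = src[0];   /* R */
        dst[1] = src[1];   /* G */
        dst[2] = src[2];   /* B */
        dst[3] = 0xff;     /* A, assumed opaque */
        src += 3;
        dst += 4;
    }
}
```

Byte accesses never raise alignment faults, which is why this version is safe for any source address, at the cost of four stores per pixel.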
crazyc
Posts: 408
Joined: Fri Jun 17, 2005 10:13 am

Re: Blitting 24 bit image data into 32 bit frame buffer...

Post by crazyc »

AFAIK, the fastest way would be to use lwl and lwr to load the data then mask off the top byte.
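
A rough C sketch of that idea follows. It assumes a little-endian CPU, so that the three pixel bytes end up in the low 24 bits of the loaded word, and it assumes the source buffer has at least one readable byte after the last pixel, since every load reads four bytes. The function name is hypothetical. The `memcpy` expresses an unaligned 32-bit load portably; a compiler targeting MIPS may turn it into an lwl/lwr pair, but that depends on the compiler and flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One unaligned 32-bit load per pixel, then mask off the top byte.
 * Assumptions: little-endian CPU; `src` has at least 1 readable byte of
 * padding after the last pixel; alpha is fully opaque. */
static void blit_rgb24_unaligned(uint32_t *dst, const uint8_t *src, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t v;
        memcpy(&v, src + 3 * i, sizeof v);          /* unaligned load */
        dst[i] = (v & 0x00ffffffu) | 0xff000000u;   /* drop 4th byte, add alpha */
    }
}
```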
StrmnNrmn
Posts: 46
Joined: Wed Feb 14, 2007 11:32 pm
Location: London, UK

Post by StrmnNrmn »

Can you load 4*3 bytes at a time (4 RGB pixels) and use a few shifts and masks to form 4*4 bytes to write out (4 RGBA pixels)? Something like:

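lui t5, 0xff00    // t5 = 0xff000000, assuming alpha is 255

loop:
lw  t0, 0(a0)     // a0 = source pointer, assumed 4-byte aligned
lw  t1, 4(a0)
lw  t2, 8(a0)
addiu a0, a0, 12  // 4 pixels * 3 bytes consumed

// Pixel 1
srl t3, t0, 8  // extract RGBx -> 0RGB
or  t3, t3, t5   // or in alpha
sw t3, 0(a1)

// Pixel 2
sll t3, t0, 16  // xxxR -> xR00
srl t4, t1, 16 // GBxx -> 00GB
or t3, t3, t4  // Combine
or t3, t3, t5  // Add alpha (0xff also overwrites the stray top byte)
sw t3, 4(a1)

// Pixel 3
sll t3, t1, 8    // xxRG -> xRG0
srl t4, t2, 24  // Bxxx -> 000B
or t3, t3, t4  // Combine
or t3, t3, t5  // Add alpha
sw t3, 8(a1)

// Pixel 4
// t2 is xRGB - just or in alpha
or t2, t2, t5
sw t2, 12(a1)
addiu a1, a1, 16  // 4 pixels * 4 bytes written

// a2 = pixels remaining (hypothetical counter, assumed a multiple of 4)
addiu a2, a2, -4
bgtz a2, loop   // branch to loop while pixels remain
nop             // branch delay slot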
I'm assuming that there's quite a large penalty for lwl/lwr; otherwise you're probably not going to get anything for the extra effort.

The code above also assumes a certain ordering for the RGB elements, which I'm fairly sure is wrong... You also might get some mileage from using ext/ins (the bit-field extract and insert instructions)?

-StrmnNrmn
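
On the byte-ordering caveat: the shifts in the loop above match a big-endian layout. The following C sketch shows how the same four-pixels-from-three-words idea might look on a little-endian CPU, where the first byte in memory is the lowest byte of the loaded word. The function name is hypothetical, and the sketch assumes an aligned source, a pixel count that is a multiple of 4, and an alpha of 0xff. The last assumption matters, because OR-ing in 0xff is what overwrites the stray byte left in the top of some words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand packed RGB to 32-bit pixels, 4 pixels (3 aligned words) per step.
 * Assumptions: little-endian CPU, `src` is 4-byte aligned, `count` is a
 * multiple of 4, alpha is 0xff (so OR-ing it in also hides stray bits). */
static void blit_rgb24_x4(uint32_t *dst, const uint32_t *src, size_t count)
{
    const uint32_t alpha = 0xff000000u;
    assert(count % 4 == 0);
    for (size_t i = 0; i < count; i += 4) {
        uint32_t w0 = src[0];   /* memory bytes: R0 G0 B0 R1 */
        uint32_t w1 = src[1];   /* memory bytes: G1 B1 R2 G2 */
        uint32_t w2 = src[2];   /* memory bytes: B2 R3 G3 B3 */
        dst[0] = w0 | alpha;                        /* R0 G0 B0 */
        dst[1] = (w0 >> 24) | (w1 << 8) | alpha;    /* R1 G1 B1 */
        dst[2] = (w1 >> 16) | (w2 << 16) | alpha;   /* R2 G2 B2 */
        dst[3] = (w2 >> 8) | alpha;                 /* R3 G3 B3 */
        src += 3;
        dst += 4;
    }
}
```

If the alpha value were anything other than 0xff, pixels 1 to 3 would need an explicit mask with 0x00ffffff before the alpha is combined.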