Hi there,
I was wondering what would be the fastest way on MIPS to blit a 24 bit image (3 bytes per pixel) into the 32 bit frame buffer without causing alignment issues?
Right now I'm moving three single bytes which feels kinda slow. I could move from 24 bit image data to 32 bit image data and copy dwords but then I'd be wasting some memory I could use otherwise.
Any tips or hints?
Blitting 24 bit image data into 32 bit frame buffer...
Re: Blitting 24 bit image data into 32 bit frame buffer...
AFAIK, the fastest way would be to use lwl and lwr to load the data, then mask off the top byte.
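For reference, the lwl/lwr suggestion can be modelled in C: read four bytes starting at each pixel regardless of alignment, keep the low 24 bits, and OR in alpha. This is only a sketch under assumptions the thread does not state: a little-endian word layout, source bytes in R, G, B order, a destination word of the form 0xAABBGGRR, and at least one readable byte after the last pixel (the 4-byte read overruns the final pixel by one byte). The names `load_le32` and `blit24to32_unaligned` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: builds the value an unaligned little-endian
   32-bit load (the job of an lwl/lwr pair) would produce at p. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Assumption: src has one readable padding byte after the last pixel,
   because each pixel is fetched with a 4-byte read. */
void blit24to32_unaligned(uint32_t *dst, const uint8_t *src, size_t pixels)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < pixels; ++i) {
        uint32_t v = load_le32(src + 3 * i);      /* R, G, B + 1 stray byte */
        dst[i] = (v & 0x00FFFFFFu) | 0xFF000000u; /* drop stray byte, alpha = 255 */
    }
}
```

Whether this beats three byte loads on a particular core depends on how expensive unaligned loads are there, so it would need measuring.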
Can you load 4*3 bytes at a time (4 RGB pixels) and use a few shifts and masks to form 4*4 bytes to write out (4 RGBA pixels)? Something like:
I'm assuming that there's quite a large penalty for lwl/lwr; otherwise you're probably not going to gain anything from the extra effort.
    lui   t5, 0xff00        // alpha = 255 in the top byte
loop:
    lw    t0, 0(a0)         // a0 must be word-aligned for lw
    lw    t1, 4(a0)
    lw    t2, 8(a0)
    addiu a0, a0, 12        // 4 source pixels * 3 bytes
    // Pixel 1
    srl   t3, t0, 8         // extract RGBx -> 0RGB
    or    t3, t3, t5        // or in alpha
    sw    t3, 0(a1)
    // Pixel 2
    sll   t3, t0, 16        // xxxR -> xR00
    srl   t4, t1, 16        // GBxx -> 00GB
    or    t3, t3, t4        // Combine
    or    t3, t3, t5        // Add alpha (0xff also overwrites the stray top byte)
    sw    t3, 4(a1)
    // Pixel 3
    sll   t3, t1, 8         // xxRG -> xRG0
    srl   t4, t2, 24        // Bxxx -> 000B
    or    t3, t3, t4        // Combine
    or    t3, t3, t5        // Add alpha
    sw    t3, 8(a1)
    // Pixel 4
    // t2 is xRGB - just or in alpha
    or    t2, t2, t5
    sw    t2, 12(a1)
    addiu a1, a1, 16        // 4 destination pixels * 4 bytes (was 24)
    bne   a0, a2, loop      // assumption: a2 = address just past the source data
    nop                     // branch delay slot
The code above also assumes a certain ordering for the RGB elements, which I'm fairly sure is wrong... You also might get some mileage from using ext/ins?
-StrmnNrmn
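As a cross-check on the byte ordering, here is a C sketch of the same four-pixels-per-iteration idea with the shifts worked out for a little-endian word layout. It is not the code from the post: it assumes source bytes in R, G, B order, a destination word of the form 0xAABBGGRR, a pixel count that is a multiple of 4, and alpha fixed at 255. The names `load_le32` and `blit24to32_x4` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ALPHA_OPAQUE 0xFF000000u

/* Builds the value a little-endian lw would load from four bytes. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Converts RGB triplets to 0xAABBGGRR words, four pixels (12 bytes) at a time. */
void blit24to32_x4(uint32_t *dst, const uint8_t *src, size_t pixels)
{
    assert(pixels % 4 == 0); /* assumption: no partial group at the end */
    for (size_t i = 0; i < pixels; i += 4, src += 12, dst += 4) {
        /* Byte layout, most significant byte first: */
        uint32_t w0 = load_le32(src);     /* R1 B0 G0 R0 */
        uint32_t w1 = load_le32(src + 4); /* G2 R2 B1 G1 */
        uint32_t w2 = load_le32(src + 8); /* B3 G3 R3 B2 */

        dst[0] = (w0 & 0x00FFFFFFu) | ALPHA_OPAQUE;
        dst[1] = (w0 >> 24) | ((w1 & 0x0000FFFFu) << 8) | ALPHA_OPAQUE;
        dst[2] = (w1 >> 16) | ((w2 & 0x000000FFu) << 16) | ALPHA_OPAQUE;
        dst[3] = (w2 >> 8) | ALPHA_OPAQUE;
    }
}
```

On cores that implement them, `ext` (extract a bit field from a register) and `ins` (insert the low bits of one register into a bit field of another) could stand in for some of the mask/shift/or sequences above; whether that actually saves cycles would need measuring.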