VFPU Hang

pegasus2000
Posts: 160
Joined: Wed Jul 12, 2006 7:09 am

Post by pegasus2000 »

I have a problem with this code:

#define GEN___Store16IntsToMatrixDestr(_mpTypeFunc_,_mpTag_,_mpNrMatrix_) \
\
\
_mpTypeFunc_ void _mpTag_##Store16IntsToMatrixDestr_m##_mpNrMatrix_ (float *Data) \
{ \
__asm__ volatile ( \
"vf2in.q C"#_mpNrMatrix_"00, C"#_mpNrMatrix_"00, 0\n" /* Convert the register contents to */ \
"vf2in.q C"#_mpNrMatrix_"10, C"#_mpNrMatrix_"10, 0\n" /* integer: this destroys the contents of the VFPU */ \
"vf2in.q C"#_mpNrMatrix_"20, C"#_mpNrMatrix_"20, 0\n" /* matrix. */ \
"vf2in.q C"#_mpNrMatrix_"30, C"#_mpNrMatrix_"30, 0\n" \
\
"sv.s S"#_mpNrMatrix_"00, 0 + %0\n" /* Store the data */ \
"sv.s S"#_mpNrMatrix_"01, 4 + %0\n" \
"sv.s S"#_mpNrMatrix_"02, 8 + %0\n" \
"sv.s S"#_mpNrMatrix_"03, 12 + %0\n" \
\
"sv.s S"#_mpNrMatrix_"10, 16 + %0\n" \
"sv.s S"#_mpNrMatrix_"11, 20 + %0\n" \
"sv.s S"#_mpNrMatrix_"12, 24 + %0\n" \
"sv.s S"#_mpNrMatrix_"13, 28 + %0\n" \
\
"sv.s S"#_mpNrMatrix_"20, 32 + %0\n" \
"sv.s S"#_mpNrMatrix_"21, 36 + %0\n" \
"sv.s S"#_mpNrMatrix_"22, 40 + %0\n" \
"sv.s S"#_mpNrMatrix_"23, 44 + %0\n" \
\
"sv.s S"#_mpNrMatrix_"30, 48 + %0\n" \
"sv.s S"#_mpNrMatrix_"31, 52 + %0\n" \
"sv.s S"#_mpNrMatrix_"32, 56 + %0\n" \
"sv.s S"#_mpNrMatrix_"33, 60 + %0\n" \
\
: : "m"(*Data)); \
\
return; \
}

// End macro


MACROGEN1d(, Store16IntsToMatrixDestr, ndHEL_XFPU_)




The code compiles correctly, but vf2in triggers a CPU exception on the PSP.
What is wrong?
hlide
Posts: 739
Joined: Sun Sep 10, 2006 2:31 am

Post by hlide »

Huh... I no longer have internet access at home, so it would be difficult for me to test it :///

1) Make sure your thread is allowed to use the VFPU and that the function is not called from a callback, which can run in its own thread. It looks like you get an exception when executing a VFPU instruction. Just add a simple VADD.S S000, S000, S000 at the beginning of your code to check whether the exception already occurs at the VADD.

2) Why do you need 4x4 "sv.s" stores instead of 4x2 "svl.q"/"svr.q" stores? You would end up with only 8 stores instead of 16, or even only 4 stores ("sv.q") if your data is 16-byte aligned.
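
As a rough illustration of point 2, the choice between the store variants depends only on the alignment of the destination pointer. The sketch below is plain portable C, not VFPU assembly. quad_stores_needed and can_use_sv_q are hypothetical helper names, not part of the PSPSDK, and the sketch assumes (as stated above) that "sv.q" needs a 16-byte aligned address while an "svl.q"/"svr.q" pair covers the unaligned case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when dst is 16-byte aligned, i.e. when
   a single "sv.q" per column would be usable for the write-back. */
static int can_use_sv_q(const void *dst)
{
    assert(dst != NULL);
    return ((uintptr_t)dst & 15u) == 0;
}

/* Hypothetical helper: number of quad-word stores needed to write back
   one 4x4 matrix (4 columns), following the reasoning in point 2. */
static int quad_stores_needed(const void *dst)
{
    /* Aligned: one "sv.q" per column, 4 stores in total. */
    if (can_use_sv_q(dst))
        return 4;
    /* Unaligned: one "svl.q"/"svr.q" pair per column, 8 stores in total. */
    return 8;
}
```

If you declare the destination buffer with __attribute__((aligned(16))), quad_stores_needed() reports 4 for it, so the 16 "sv.s" stores in the macro could in principle be replaced by 4 "sv.q" stores.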