The hunt for HV's FIFO/Push buffer...

ps2devman · Post by **ps2devman** » Sat May 19, 2007 1:28 am

Since one parameter of the hypervisor function used to request a bitblt from the GPU looks just like the one we would put ourselves in an Nvidia miniport push buffer (alias FIFO) if we had direct access to the RSX... I think the hypervisor really does use a push buffer/FIFO: a circular main-memory area where a few dwords are written at a time (1 command + a few parameters).
There is certainly a variable (tail) that points at the end of this list.
The area may be very short if the hypervisor knows Linux is running and only bitblt commands will be issued.

When this variable (tail) changes, that means new commands just got written. A direct write to GPU mapped registers is probably made at the same time to warn GPU it should start reading them and where it should stop.
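
To make the idea concrete, here is a minimal sketch of such a circular dword buffer with a tail index. Everything in it is hypothetical (the names `push_ring` and `ring_push`, the 64-dword size); it only illustrates the "1 command + a few parameters, then move the tail" pattern described above, not the hypervisor's real layout. A real GPU ring would also have to tell the GPU about the new tail, and would typically wrap with a jump command rather than a plain modulo.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_DWORDS 64u /* hypothetical size, chosen for the example */

struct push_ring {
	uint32_t buf[RING_DWORDS];
	uint32_t tail; /* index of the next free dword */
};

/* Write one command dword followed by n parameter dwords,
 * wrapping around at the end of the buffer. Returns the new tail. */
static uint32_t ring_push(struct push_ring *r, uint32_t cmd,
			  const uint32_t *params, size_t n)
{
	assert(n + 1 <= RING_DWORDS);

	r->buf[r->tail] = cmd;
	r->tail = (r->tail + 1) % RING_DWORDS;
	for (size_t i = 0; i < n; i++) {
		r->buf[r->tail] = params[i];
		r->tail = (r->tail + 1) % RING_DWORDS;
	}
	return r->tail;
}
```

Anyone watching `tail` would see it jump by `n + 1` each time a command is queued, which is the change the next paragraph proposes to look for.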

If we manage to watch that variable (tail) and to put our own commands and parameters in the FIFO/push buffer immediately after it changes, it's our commands that will be executed (if the GPU is too fast, we can leave the first commands alone and target the next ones).

I'm not naive: I doubt software tricks can bypass the address-range checks made when the CPU attempts a read or write somewhere in memory.
But who knows, with marcus' other os demo...

I'm rather thinking about a RAM attack, that is, hardware mods that allow spying on or changing main RAM.

I also heard a rumour about a USB chipset in the PS3 hardware, to which we have full direct access, and which may act as a DMA controller, thus allowing reads/writes anywhere.

What's your opinion about this idea?

ldesnogu · Post by **ldesnogu** » Sat May 19, 2007 1:42 am

EDIT : Sorry I read your post too fast, I thought you wanted to do it in a pure software way :)

ps2devman wrote:What's your opinion about this idea?

My opinion is simple: you will probably never find any such thing that is accessible when not in hypervisor mode.

Here is the probable way hypervisor works:

1. the processor is either in hypervisor mode or not
2. there are different memory mappings with different protections depending on whether you run in hypervisor mode or not
3. the hypervisor setup code does not want non hypervisor modes to see most of the IO space and certainly not its own private memory.

Basically this means that if the hypervisor doesn't want you to see portions of memory, then there is no way you will see them, unless you enter hypervisor mode.

So if there is a hypervisor API call that can "unprotect" some memory (cf. GPU VRAM access kernel module), or if you find a way to run your code in hypervisor mode, then you will be able to find your memory location. Else... :)

StrontiumDog · Post by **StrontiumDog** » Sat May 19, 2007 11:19 am

ldesnogu wrote:
ps2devman wrote:What's your opinion about this idea?
My opinion is simple: you will probably never find any such thing that is accessible when not in hypervisor mode.

Here is the probable way hypervisor works:

1. the processor is either in hypervisor mode or not
2. there are different memory mappings with different protections depending on whether you run in hypervisor mode or not
3. the hypervisor setup code does not want non hypervisor modes to see most of the IO space and certainly not its own private memory.

Basically this means that if the hypervisor doesn't want you to see portions of memory, then there is no way you will see them, unless you enter hypervisor mode.

I think you are half right. There is a catch-22 in what you say. If the RSX can access it, then the RSX can DMA it. So if the HV has such a FIFO buffer for sending commands to the RSX, it must, by definition, be in an area of memory that can be DMA'd from. By my reading of the CELL specs, DMA is an all-or-nothing proposition: if anything can DMA to/from a region of memory, then everything can DMA to/from that memory. The HV code is 99.999% for sure locked in an area of memory that cannot be DMA'd to/from, which precludes it from holding such a FIFO. In that case the HV must be split: if it does in fact have a FIFO buffer the RSX can DMA from, it must have a DMA-accessible portion that it uses to communicate with other devices. This means that if a way can be found to control the DMA of some other device, and that FIFO can be found, it can be manipulated by writing over it with another DMA call.

ps2devman · Post by **ps2devman** » Sat May 19, 2007 5:11 pm

That's good and bad news.

Bad news:
That means the variable (tail) is probably in a non dma-able area.

Good news:
Still hope for the FIFO/push buffer itself!
Instead of watching the variable (tail), we can watch the first command and anticipate what command it will be and the number of parameters involved.

The first step would be to explore all DMA-able areas and check for leftover traces of one of the bitblt parameters: "(xres << 16) | yres".
Usually, what is written in the FIFO/push buffer is not erased after being read by the GPU. It just gets overwritten once the whole buffer has been used (a jump to head will occur), since it's a circular buffer. It shouldn't be too common a value in main RAM... Also, it should be repeated in memory at exactly the same "distance" (a few dwords).
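
A scan like that could be sketched as follows. The helper names (`blit_size_param`, `find_repeated`) and the idea of passing the expected distance as a `stride` argument are assumptions for illustration; the dump is read as native dwords, which matches the hardware's view only if the scan runs on the PS3 itself (or the dump is byte-swapped first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The bitblt size parameter as it is passed to the HV call. */
static uint32_t blit_size_param(uint32_t xres, uint32_t yres)
{
	return (xres << 16) | yres;
}

/* Return the first dword index i where pattern appears at both i and
 * i + stride (the same "distance" apart), or -1 if there is none. */
static long find_repeated(const uint32_t *buf, size_t ndwords,
			  uint32_t pattern, size_t stride)
{
	assert(stride > 0);

	for (size_t i = 0; i + stride < ndwords; i++)
		if (buf[i] == pattern && buf[i + stride] == pattern)
			return (long)i;
	return -1;
}
```

For a 1280 x 1024 mode the pattern is 0x05000400; trying a handful of small strides over each readable region would be enough to test the theory.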

Remember, ralferoo pointed at this kernel call (let's say HV API):

Code:

status = lv1_gpu_context_attribute(ps3fb.context_handle,
	L1GPU_CONTEXT_ATTRIBUTE_FB_BLIT,
	offset, fb_ioif,
	L1GPU_FB_BLIT_WAIT_FOR_COMPLETION |
	(xres << 16) | yres,
	xres * BPP);

(Sony engineer laziness is our friend, it seems... It's obvious this parameter is pre-formatted for direct sending to the FIFO/push buffer, because it's exactly the format expected by bitblt on NV20 chipsets.)

Good hunting to all!

ps2devman · Post by **ps2devman** » Sat May 19, 2007 5:14 pm

Maybe search for
L1GPU_FB_BLIT_WAIT_FOR_COMPLETION | (xres << 16) | yres
as well... or only look for relevant lower bits.

mc · Post by mc » Thu May 24, 2007 9:30 pm

StrontiumDog wrote:This means if a way can be found to control DMA of some other device and that FIFO can be found, it can be manipulated by writing over it with another DMA call.

Which is why the hypervisor will not allow you direct access to anything
capable of such DMA. You can control the DMA function of the MFCs in
the SPEs, but those DMA accesses are translated through the PTEs, so
you'll not be able to access anything you can't with regular CPU accesses.

rz950 · Post by **rz950** » Mon Jul 16, 2007 7:35 am

Does the PS3 activate the hypervisor when it boots into the Other OS? This is the only thing I don't understand: what actually tells the hypervisor to activate? Isn't it "off" when running a game? As I understand it, the Other OS or "boot loader" doesn't actually call the hypervisor (if it did, we could have had a modded boot loader), so I am guessing the PS3 firmware tells the hypervisor to go into block-RSX mode (yeah, I come up with some good names xD) when the Other OS is running. If I am wrong please feel free to correct me, I would like to know.

A friend just got a PS3 yesterday (version 1.51) and I have been reading up on it a bit more than before.

mc · Post by mc » Mon Jul 16, 2007 8:06 am

When booting, the system checks a bit in the flash which says if it should boot GameOS or "Other OS". If it's set to "Other OS", it starts the hypervisor which will in turn start otheros.bld. So when otheros.bld starts, the hypervisor is already active.

Whether GameOS runs without a hypervisor, or just with a different hypervisor, is unknown at this point.

rz950 · Post by **rz950** » Mon Jul 16, 2007 8:36 am

Ah, I see now, thanks for explaining. Other OS is basically just responsible for launching Linux, then, even though modded ones can do so many other things. I know it won't do much, but a modded Other OS that would call the hypervisor, it should report it as active, just a thought?

Well, thanks again for explaining so quickly, and nice work on the modded Other OS, mc :)

Edited: This is a bit off-topic, but I am wondering whether I should update or not. I told my friend not to, and he knows why and understands that he shouldn't, but I just want to know what anyone here thinks.

mc · Post by mc » Mon Jul 16, 2007 7:36 pm

The "Other OS" mechanism is intended to support launching any OS under
virtualization, but since this virtualization is not fully transparent, the
OS needs to be adapted to run in this environment, and currently Linux
is the only OS for which such an adaption has been done (since that's
the OS which Sony themselves chose to do it for). But yes, "otheros.bld"
is just intended as a launcher ("bld" = Bootloader) which loads the
actual OS from the harddrive. But any handover between bld and OS
is up to them, the hypervisor will not care but present them with the
exact same virtualized environment.

rz950 wrote: I know it won't do much, but a modded Other OS that would call the hypervisor, it should report it as active, just a thought?

This question I do not understand. What do you mean by "report it
as active", and who would report whom to whom?

IronPeter · Post by **IronPeter** » Sun Sep 30, 2007 4:51 am

The new sources for ps3fb.c http://www.everfall.com/paste/id.php?kjjqgvpbncu0 contain a GPU_CMD_BUF_SIZE macro.

It's clear that the memory region [ ps3fb_videomemory.address + ps3fb_videomemory.size - GPU_CMD_BUF_SIZE, ps3fb_videomemory.address + ps3fb_videomemory.size ) is the FIFO buffer.
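
Spelled out as code, that half-open range is just "the last GPU_CMD_BUF_SIZE bytes of the ps3fb video memory area". A small sketch, where the `region` struct and `fifo_region` helper are made up for illustration and the sizes are passed in as plain numbers rather than read from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Half-open address range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/* The FIFO occupies the last cmd_buf_size bytes of the area. */
static struct region fifo_region(uint64_t vmem_addr, uint64_t vmem_size,
				 uint64_t cmd_buf_size)
{
	struct region r;

	assert(cmd_buf_size <= vmem_size);
	r.end = vmem_addr + vmem_size;
	r.start = r.end - cmd_buf_size;
	return r;
}
```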

This region dump: http://www.everfall.com/paste/id.php?ieto25cyoy0g

PS. excuse my English.

IronPeter · Post by **IronPeter** » Mon Oct 01, 2007 12:26 am

Yes, it really is a push buffer.

A good idea would be to test all the undocumented entries from http://wiki.ps2dev.org/ps3:hypervisor:l ... _attribute , get a push buffer dump for each one, and compare it with the nouveau NV40 push buffer database.

IronPeter · Post by **IronPeter** » Sat Oct 06, 2007 2:44 am

Fresh news about the FIFO control regs.

There is an lpar_dma_control area, and it can be iomapped. The area is filled with zeroes except for 3 non-zero dwords, at indices 0x10, 0x11 and 0x15.

The value of dword 0x11 just after a hypervisor blit is 0xe1f0000, a few ms later it is 0xe1f0048, and finally 0xe1f00b8.

Compare with
http://nouveau.cvs.sourceforge.net/nouv ... iew=markup

Code:

	printf("FIFO put=0x%08x, get=0x%08x\n",
			fifo_regs[0x40/4],
			fifo_regs[0x44/4]
			);
	FIRE_RING();
	sleep(1);
	printf("FIFO put=0x%08x, get=0x%08x\n",
			fifo_regs[0x40/4],
			fifo_regs[0x44/4]
			);

Edited: constants
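
Note that byte offsets 0x40 and 0x44 in the nouveau snippet are dword indices 0x10 and 0x11, the same two non-zero dwords seen in lpar_dma_control. Assuming they play the same roles on the PS3 (0x10 = put, 0x11 = get, which the observed values suggest but do not prove), the state of the FIFO can be read like this. The helper names are hypothetical and wrap-around is ignored:

```c
#include <assert.h>
#include <stdint.h>

enum {
	FIFO_PUT = 0x40 / 4, /* dword 0x10: where the CPU says commands end */
	FIFO_GET = 0x44 / 4  /* dword 0x11: how far the GPU has read */
};

/* Bytes queued but not yet fetched (no wrap-around handling). */
static uint32_t fifo_pending(const volatile uint32_t *regs)
{
	return regs[FIFO_PUT] - regs[FIFO_GET];
}

/* The GPU has caught up once get equals put. */
static int fifo_idle(const volatile uint32_t *regs)
{
	return regs[FIFO_GET] == regs[FIFO_PUT];
}
```

With get at 0xe1f0048 and put at 0xe1f00b8, `fifo_pending` gives 0x70 bytes still to be fetched.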

IronPeter · Post by **IronPeter** » Sun Oct 07, 2007 3:58 am

I've parsed the blit push buffer into DMA packets (packet size, subchannel id, tag):

http://www.everfall.com/paste/id.php?z51ttk37j71s
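
For anyone wanting to repeat the parsing, here is a sketch of how a packet header dword can be split into those three fields. The bit layout (count in bits 18 and up, subchannel in bits 13 to 15, tag in bits 2 to 12) is an assumption based on the nouveau material; it does agree with the annotated header words in the YUYV blit example later in this thread, e.g. 0x0024c2fc = size 9, subchannel 6, tag 0x2fc. The struct and function names are made up.

```c
#include <assert.h>
#include <stdint.h>

struct fifo_header {
	unsigned count;      /* number of parameter dwords that follow */
	unsigned subchannel; /* 0..7 */
	unsigned tag;        /* method offset, a multiple of 4 */
};

static struct fifo_header decode_header(uint32_t h)
{
	struct fifo_header d;

	d.count = (h >> 18) & 0x7ff;
	d.subchannel = (h >> 13) & 0x7;
	d.tag = h & 0x1ffc;
	return d;
}

/* Inverse of decode_header, handy for building test packets. */
static uint32_t encode_header(unsigned count, unsigned subchannel,
			      unsigned tag)
{
	assert(count <= 0x7ff && subchannel <= 7 && (tag & ~0x1ffcu) == 0);
	return ((uint32_t)count << 18) | ((uint32_t)subchannel << 13) | tag;
}
```

Walking a dump is then a loop: decode a header, skip `count` dwords, repeat until the put offset is reached.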

My screen resolution is 1280 x 1024 ( 0x500 x 0x400 ).
Funny: the hypervisor made 2 blits (one 1024 x 1024 and one 256 x 1024), so the parameters of the hypervisor call are not "preformatted for the GPU".
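
The observed split is what you get by cutting the width into chunks of at most 1024 pixels. The hypervisor's actual logic is unknown; this sketch (with a hypothetical `split_width` helper) only reproduces the observation:

```c
#include <assert.h>
#include <stddef.h>

/* Cut width into pieces of at most max_tile pixels.
 * Writes up to cap pieces into out and returns how many were written. */
static size_t split_width(unsigned width, unsigned max_tile,
			  unsigned *out, size_t cap)
{
	size_t n = 0;

	assert(max_tile > 0);
	while (width > 0 && n < cap) {
		unsigned w = width < max_tile ? width : max_tile;

		out[n++] = w;
		width -= w;
	}
	return n;
}
```

Called with a width of 1280 and a 1024-pixel limit, it yields the two pieces seen in the dump: 1024 and 256.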

You can refer to this document for the push buffer tags:

http://gitweb.freedesktop.org/?p=nouvea ... veau_reg.h

jimparis · Post by **jimparis** » Sun Oct 07, 2007 4:47 am

Nice work, this is good stuff.

Post by **Warren** » Sun Oct 07, 2007 9:02 am

IronPeter wrote: My screen resolution is 1280 x 1024 ( 0x500 x 0x400 ).
Funny: the hypervisor made 2 blits (one 1024 x 1024 and one 256 x 1024), so the parameters of the hypervisor call are not "preformatted for the GPU".

Got to love power of 2 sized textures.

ps2devman · Post by **ps2devman** » Sun Oct 07, 2007 6:35 pm

Congratulations IronPeter! Smells good, very good!

IronPeter · Post by **IronPeter** » Sun Oct 07, 2007 6:43 pm

I was able to run the blit push buffer from user land using the FIFO control regs.

There was some kind of protection. Very weak protection.

It is unstable for now, but it does work. It is probably possible to write some kind of 2D support (stretched blits, color fills, etc).

The main question is about 3D support. We need the so-called "context objects" to be properly initialized. Probably the hypervisor does this work for us, and all we need are the handles (lpar_dma_reports contains something that looks like such handles). To initialize these objects "by hand" we would need access to a very special set of RSX registers, the so-called RAMIN area.

PS. I investigate RSX with only open-source information. I have no signed NDA with Sony or NVidia.

ps2devman · Post by **ps2devman** » Sun Oct 07, 2007 7:11 pm

Ok, things are getting serious now... Hehe.
What is your firmware version?

We need to be careful and detect when this new trick will become unusable in future firmware versions. Someone with infectus and the ability to swap firmware would be the best person for detecting such infamous change...

Thanks for your great work, IronPeter!

PS: If you could publish a minimal source, even unstable, that can be used to test this new trick for each firmware version that would be great! Thanks!

IronPeter · Post by **IronPeter** » Sun Oct 07, 2007 8:31 pm

The firmware version is 1.8

The trick is very simple, I can describe it without posting the full sources.

Look at the push buffer dump:

http://www.everfall.com/paste/id.php?uxdlpwlbfpo9

It is the image of push buffer after hypervisor blit. The end of buffer is at 0xb8.

The last packet sends zero to subchannel zero with tag 0x110. Replace the tag with NOP ( 0x100 ) while the buffer has been kicked by the hypervisor and is being executed by the RSX.

Fill push buffer with N + 1 copies of the first 0xb8 bytes.

In a loop, for( int i = 0; i < N; ++i ), modify the client screen buffer in some way (fill it with random numbers), then kick the push buffer by writing ( (uint32_t *)ioremap( lpar_dma_control, 1024) )[ 0x10 ] = 0xe1f0000 + 0xb8 * ( i + 1 ); and sleep( 1 ). The client screen in XDR memory will be blitted into video memory.
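
The two mechanical parts of that recipe, copying the first 0xb8 bytes several times and computing the value to write into dword 0x10 for each kick, can be sketched on plain memory like this. The helper names are hypothetical, and no hardware access is shown: on a real system the returned value would be written to the iomapped control area as described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill the buffer with `copies` back-to-back copies of its first chunk.
 * fifo must have room for copies * chunk_dwords dwords. */
static void replicate_chunk(uint32_t *fifo, size_t chunk_dwords,
			    unsigned copies)
{
	assert(chunk_dwords > 0);
	for (unsigned k = 1; k < copies; k++)
		memcpy(fifo + (size_t)k * chunk_dwords, fifo,
		       chunk_dwords * sizeof *fifo);
}

/* Put value that makes the GPU run up to the end of copy number i. */
static uint32_t put_after_copy(uint32_t fifo_base, uint32_t chunk_bytes,
			       unsigned i)
{
	return fifo_base + chunk_bytes * (i + 1);
}
```

With a base of 0xe1f0000 and 0xb8-byte chunks, the successive put values are 0xe1f00b8, 0xe1f0170, and so on.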

If nobody is able to repeat these steps, I will post the full sources.

Edited: bugfix

mc · Post by mc » Sun Oct 07, 2007 8:35 pm

Very nice work, IronPeter!

I find it intriguing that not only are you allowed to define the FIFO region
in user memory, but that you are also allowed to map the control area.
This suggests to me that Sony actually intended to support HW 3D under
Linux, as they could just as easily have made the control area accessible
only to the hypervisor.

IronPeter · Post by **IronPeter** » Sun Oct 07, 2007 8:47 pm

mc wrote: ... but that you are also allowed to map the control area.

We are allowed to map only part of the MMIO registers: only the context control registers. I do not know of a way to map the global RAMIN area.

A good source on this stuff:
http://nouveau.freedesktop.org/wiki/HonzaHavlicek

mc · Post by mc » Sun Oct 07, 2007 9:01 pm

Well, yeah, but that only makes it more likely that it is intentional
that this particular part can be mapped. It would make sense
to only expose the parts needed for performance (= FIFO interface)
and handle access to other parts through the HV.

Thanks for the link, BTW.

IronPeter · Post by **IronPeter** » Sun Oct 07, 2007 9:29 pm

Yes, mc, you are right.

The only thing that needs direct FIFO access is real-time 3D acceleration. The 2D part does not need it. So Sony probably wants to expose a 3D driver.

IronPeter · Post by **IronPeter** » Wed Oct 10, 2007 3:21 am

Here is the push buffer dump after the hypervisor's FB_SETUP. The Sony programmers decided not to clean up the push buffer after setting up the context objects.

http://www.everfall.com/paste/id.php?ew29498z816w

Enjoy it.

Glaurung · Post by **Glaurung** » Thu Oct 11, 2007 5:21 am

Hi all,

I could reproduce the FIFO hack described by IronPeter, using firmware v1.93. I had to setup a large blit (which is decomposed into many 1024x1024 blits by the HV) in order to have time to tweak the FIFO area. Instead of patching with a NOP (which works), I chose to set the FIFO end pointer two operations back. I also had to remove the L1GPU_FB_BLIT_WAIT_FOR_COMPLETION flag from the call to lv1_gpu_context_attribute(), so to sum it up:

Code:

	/* large blit */
	lv1_gpu_context_attribute(ps3.context_handle,
					   L1GPU_CONTEXT_ATTRIBUTE_FB_BLIT,
					   dst_offset,
					   GPU_IOIF + src_offset,
					   (1ULL << 31) |
					   (1280 << 16) | 1280,
					   1280*4);

	/* go back two operations */
	ps3.fifo_regs[0x10] -= 8;

	/* wait for end of GPU operation */
	while (ps3.fifo_regs[0x11] != ps3.fifo_regs[0x10] &&
	       ps3.fifo_regs[0x15] != ps3.fifo_regs[0x10]);

	/* copy our operation to the fifo */
	memcpy(&ps3.fifo[fifo_idx], blit_program, sizeof(blit_program));

	/* fill the vram in white */
	memset(ps3.vram, 0xff, ps3.vram_size);

	/* kick off the GPU */
	ps3.fifo_regs[0x10] += sizeof(blit_program);

I was also able to send various other blit commands in the FIFO, using documentation from the nouveau project. For example, the following FIFO commands will do a YUYV blit instead of an ARGB blit:

Code:

uint32_t blit_program[] = {
	0x00106300, // SURFACE_FORMAT (size 4, subchannel 3)
	0x0000000a, //  SURFACE_FORMAT_A8R8G8B8
	0x14001400, //  ((pitch{dst} << 16) | pitch{dst})
	0x00000000, //  src_offset
	0x00000000, //  dst_offset

	0x0024c2fc, // NV_IMAGE_BLIT_OPERATION (size 9, subchannel 6)
	0x00000001, //  not dither (0 = dither)
	0x00000005, //  STRETCH_BLIT_FORMAT_YUYV
	0x00000003, //  STRETCH_BLIT_OPERATION_COPY
	0x00000000, //  (dstX << 16 | dstY)
	0x02d00400, //  ((height << 16) | width)
	0x00000000, //  (dstX << 16 | dstY)
	0x02d00400, //  ((height << 16) | width)
	0x00100000, //  step_x in 12.20 fixed point
	0x00100000, //  step_y in 12.20 fixed point

	0x0010c400, // STRETCH_BLIT_SRC_SIZE (size 4, subchannel 6)
	0x02d00400, //  ((height << 16) | width)
	0x00021400, //  pitch_src
		    //    | (STRETCH_BLIT_SRC_FORMAT_ORIGIN_CORNER << 16)
		    //    | (STRETCH_BLIT_SRC_FORMAT_FILTER_POINT_SAMPLE << 24)
	0x0d000000, //  GPU_IOIF + src_offset
	0x00000000, //  srcX | (srcY<<16)
};
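
The magic numbers in that table are easier to adapt with a couple of helpers. This is only a sketch with made-up helper names: `pack_hw` builds the "(height << 16) | width" words, and `to_fixed_12_20` builds the step values, where 1.0 (no scaling) is 0x00100000.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a size as (height << 16) | width, 16 bits each. */
static uint32_t pack_hw(uint32_t height, uint32_t width)
{
	assert(height <= 0xffff && width <= 0xffff);
	return (height << 16) | width;
}

/* Convert a non-negative scale factor to 12.20 fixed point
 * (12 integer bits, 20 fractional bits). */
static uint32_t to_fixed_12_20(double v)
{
	assert(v >= 0.0 && v < 4096.0);
	return (uint32_t)(v * (double)(1u << 20));
}
```

So the 0x02d00400 in the example is a 1024-wide, 720-high rectangle, and a step of 0.5 would be 0x00080000.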

Thanks IronPeter for your hard work. I'll now try playing a bit with subchannel bindings, your most recent post looks quite interesting.

Seather · Post by **Seather** » Thu Oct 11, 2007 2:17 pm

Would this hypervisor be anything close to the Xen from Sun?

jimparis · Post by **jimparis** » Thu Oct 11, 2007 3:58 pm

1) Xen isn't from Sun, it's from XenSource.
2) This thread has nothing to do with that, please keep it on-topic.
3) It is the same in theory but there are no similarities that are useful to us.

IronPeter · Post by **IronPeter** » Thu Oct 11, 2007 4:07 pm

Glaurung, nice.

I want to say a few words about testing methodology. A segmentation fault in kernel mode is not a good thing, so I did a few things to make life easier:

1.) I extended ps3fb's memory mapping to cover the last 65536 bytes. open( "/dev/fb0" ), mmap it, and use the push buffer from client mode.

2.) I extended ps3fb's ioctl to read/write the FIFO control registers.

3.) lv1_gpu_context_intr (the interrupt call) is very useful. Extend the ioctl with this function. It is also better to disable (again via ioctl) the kernel's regular lv1_gpu_context_intr in the driver.

IronPeter · Post by **IronPeter** » Thu Oct 11, 2007 4:34 pm

Glaurung, would you like to test a blit from video memory to the system area? I won't have access to a PS3 for a short while.

http://wiki.ps2dev.org/ps3:hypervisor:l ... y_allocate returns only 252 megs; the top of video memory seems to contain the RAMIN area (the global GPU control area).

This area must be protected from read/write, but probably...

forums.ps2dev.org
