Who wants 252MB more RAM for PS3 homebrew.
> The driver is still experimental, feel free to test if interested:
> git clone http://mandos.homelinux.org/~glaurung/git/ps3vram.git/
Hello,
Thanks a lot for posting this very useful extension for the PS3.
I have compiled it against YDL 5 and in general it works quite well.
Two problems:
(1) Inserting the module for the first time usually fails with the error
"could not allocate XDR buffer". After 3-4 retries (with intermediate rmmod)
it works.
(2) I do not get a speed of 100-150MB/s. hdparm shows ~30MB/s, and
if I switch all disk I/O to ps3vram in my test program I get a slight speed decrease.
I do have the DMA version, not the memcpy version, of ps3vram.
Regards
Helmut Dersch
Stupid question, but...
Excuse me if this is a really dumb question.
Could the kernel be reconfigured to allocate the framebuffer from the video RAM? Maybe even to use the bottom chunk that it is currently blitting to already? It seems like that might have two beneficial effects:
1. Free up more primary RAM for applications.
2. Speed up video operations.
Even if the framebuffer is allocated to the video RAM above the currently used segment, it seems like maybe the blit operation could be handed off to the GPU, which arguably might speed things up a bit.
Re: Stupid question, but...
jovi wrote:
> Excuse me if this is a really dumb question.
> Could the kernel be reconfigured to allocate the framebuffer from the video RAM? Maybe even to use the bottom chunk that it is currently blitting to already? It seems like that might have two beneficial effects:
> 1. Free up more primary RAM for applications.
> 2. Speed up video operations.
> Even if the framebuffer is allocated to the video RAM above the currently used segment, it seems like maybe the blit operation could be handed off to the GPU, which arguably might speed things up a bit.
Having the CPU write directly to VRAM would be incredibly slow.
Currently we store the framebuffer in RAM and use the GPU to copy it to VRAM, which is the fastest way. The only benefit would be to free up a little bit of primary RAM, but if your intended application really cares that much about 1-2% of your RAM then you probably need to rethink your general approach anyway.
>Having the CPU write directly to VRAM would be incredibly slow.
Writes are fast. DMA writes are at ~5 GB/s I think.
The XDR frame buffer is allocated at system startup. By default ps3fb allocates 9 MB for a 1080i frame buffer. You can reduce that value to 4-5 MiB if you do not want to use HD resolutions.
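As a rough check on those numbers, the sketch below computes the size of a single frame. It assumes 4 bytes per pixel and one buffer per frame, which the post does not state, and the function names `fb_bytes` and `round_up_mib` are made up for this illustration; they are not part of ps3fb.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for one frame at the given resolution and pixel size.
 * The 4-bytes-per-pixel figure used in the note below is an assumption. */
size_t fb_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

/* Round a byte count up to a whole number of MiB. */
size_t round_up_mib(size_t bytes)
{
    const size_t mib = 1024u * 1024u;
    return (bytes + mib - 1) / mib;
}
```

Under those assumptions, `fb_bytes(1920, 1080, 4)` is 8,294,400 bytes (about 7.9 MiB), which is in the same range as the 9 MB default, and `fb_bytes(1280, 720, 4)` is 3,686,400 bytes, which fits in 4 MiB.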
> >Having the CPU write directly to VRAM would be incredibly slow.
> Writes are fast. DMA writes are at ~5 GB/s I think.
Sure, but that still requires that we prepare a buffer beforehand that we can pass to gpu_blit or whatever. If we want random access to the framebuffer (which AFAIK is something Linux expects), then direct CPU writes are all we can use, and those are something like 10.6MB/sec when going direct to VRAM.
IronPeter wrote:
> VRAM writes by CPU are fast ( GiBs/s for DMA ).
I still don't understand. If the CPU writes directly to video ram through a pointer, it's 10MB/sec. That's what I measured with my original ps3vram driver. If you want to DMA, you'll need to prepare a buffer beforehand that you can pass to the DMA hardware, so you'll still need a copy of the framebuffer in main RAM. Can you be more specific about how to get GiBs/s with random writes to VRAM from the CPU?
> I still don't understand. If the CPU writes directly to video ram through a pointer, it's 10MB/sec
It is better to try SPU-initiated DMAs for VRAM writes. I'm not sure about the exact numbers and need to retest, but I think 5 GiB/s is achievable.
A high VRAM write rate is very useful for uploading resources (textures, etc.). It is less useful for a software 2D driver, because you need to blend with the framebuffer or perform masked writes.
jimparis wrote:
> I still don't understand. If the CPU writes directly to video ram through a pointer, it's 10MB/sec.
The hardware is probably implementing your write as:
1) Read 128 bytes
2) Change one word
3) Write 128 bytes
The read will take you down to MB/s.
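To put a number on the three steps above, here is a toy model of the bus traffic, not a measurement. It assumes every single-word store costs a full 128-byte read plus a full 128-byte write, and compares that with writing whole lines at once. The names `rmw_bus_bytes` and `full_line_bus_bytes` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_SIZE 128u /* transfer granularity named in the post */

/* Bus bytes moved if every word store becomes: read line, patch, write line. */
size_t rmw_bus_bytes(size_t n_word_writes)
{
    return n_word_writes * 2u * LINE_SIZE;
}

/* Bus bytes moved if the same words are packed and written as whole lines. */
size_t full_line_bus_bytes(size_t n_word_writes, size_t word_size)
{
    assert(word_size > 0 && LINE_SIZE % word_size == 0);
    size_t payload = n_word_writes * word_size;
    size_t lines = (payload + LINE_SIZE - 1) / LINE_SIZE; /* round up */
    return lines * LINE_SIZE;
}
```

In this model, filling one 128-byte line with 32 separate 4-byte stores moves 8192 bytes over the bus, against 128 bytes for a single full-line write. The real slowdown is likely worse than this 64x ratio suggests, since each read also has to wait on latency.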
Is video memory cached? The cache will behave like this, but if you're writing entire cache lines at a time there's an instruction you can use to clear the line before you start writing. That'll remove step 1.
If it isn't, there probably isn't any way around it from PPU. DMA from SPU should get full speed (doing aligned transfers of multiples of 128 bytes).
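The alignment rule in the last sentence can be checked before issuing a transfer. The following is a minimal sketch in plain C, not SPU code: `dma_full_speed_ok` and `dma_split` are hypothetical helpers, and the actual DMA call is left out because it depends on the SDK in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_LINE 128u /* alignment and size granularity named in the post */

/* True if source, destination, and length are all multiples of 128 bytes. */
bool dma_full_speed_ok(uintptr_t src, uintptr_t dst, size_t len)
{
    return len != 0 &&
           src % DMA_LINE == 0 &&
           dst % DMA_LINE == 0 &&
           len % DMA_LINE == 0;
}

/* Split a length into a 128-byte-multiple bulk part (returned) and a
 * leftover tail (stored in *tail) that needs separate handling. */
size_t dma_split(size_t len, size_t *tail)
{
    assert(tail != NULL);
    *tail = len % DMA_LINE;
    return len - *tail;
}
```

A caller would transfer the bulk part with aligned DMA and handle the tail separately, for example by padding the buffer up to the next 128-byte boundary.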
> Is video memory cached? The cache will behave like this, but if you're writing entire cache lines at a time there's an instruction you can use to clear the line before you start writing. That'll remove step 1.
Right. But the original question was whether we could move the kernel's framebuffer memory directly into VRAM. I still don't think we can. Applications can mmap that space and we don't know their access patterns. We could implement our own caching in cpu-local RAM using a smaller buffer and paging in/out as necessary via MMU tricks, but that gets complicated and I'm not sure it's worth it.