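In July 2006, Geomerics, an academic and game-media production group affiliated with the University of Cambridge, announced its first product, "Enlighten": a toolkit that adds real-time radiosity to other rendering engines.
Then in February 2007 it signed with Epic to integrate Enlighten into UE3, and showed an Xbox 360 version at GDC07. (At about the same time, まいにちいっしょ also achieved something similar to real-time radiosity, or rather ambient occlusion; its algorithm came from SkyLight, a well-known 3ds Max plug-in.)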
Not so hot at replacing fillrate
- Bit difficult to use it as a substitute GPU
- Can't really render part of the scene on SPU and combine results on GPU
Triangle rasterisation setup, streaming…
Hardware filtering, mipmaps, perspective correction… Antialiasing, z-buffering, stencilling…
GPU really benefits from having dedicated hardware for this
Would be god damn complicated
- Also not really workable for post processing
Render scene > DMA round through SPUs > process > render through GPU again
Could delay results by a frame?
Still not particularly desirable and large amount of data to move
Most previous work has focused on vertex based operations
- Makes sense given flexibility of SPUs
RSX can efficiently read textures straight from main memory
- Huge advantage for generating anything intended for the GPU on the SPU
- One of the best points of the PS3 architecture
Textures easy to process on SPUs
- Simple to stream in/out in chunks
- However, random accesses need to be made coherent - so not much good if you can't do this
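
To make the "stream in/out in chunks" point concrete, here is a minimal sketch in plain Python rather than real SPU code. The slice copies stand in for DMA transfers, and `LOCAL_STORE_BUDGET`, `rows_per_chunk`, `process_in_chunks` and `brighten` are all hypothetical names and values invented for illustration, not anything from the slides or from a PS3 SDK.

```python
# Hypothetical sketch: push a texture through a small working buffer one
# band of rows at a time, the way an SPU job might DMA in, process, DMA out.

LOCAL_STORE_BUDGET = 16 * 1024  # assumed per-chunk byte budget, not a real SPU figure


def rows_per_chunk(width, bytes_per_pixel, budget=LOCAL_STORE_BUDGET):
    """Number of whole texture rows that fit in one chunk (at least 1)."""
    return max(1, budget // (width * bytes_per_pixel))


def process_in_chunks(texture, width, height, bytes_per_pixel, kernel):
    """Apply kernel to a bytearray texture one chunk of rows at a time.

    Returns the number of chunks that were processed.
    """
    row_bytes = width * bytes_per_pixel
    step = rows_per_chunk(width, bytes_per_pixel)
    chunks = 0
    for first_row in range(0, height, step):
        last_row = min(first_row + step, height)
        start, end = first_row * row_bytes, last_row * row_bytes
        local = bytearray(texture[start:end])  # stands in for "DMA in"
        kernel(local)                          # work only touches local data
        texture[start:end] = local             # stands in for "DMA out"
        chunks += 1
    return chunks


def brighten(buf):
    """Example kernel: add 16 to every byte, saturating at 255."""
    for i, value in enumerate(buf):
        buf[i] = min(255, value + 16)
```

The kernel only ever sees the local copy, which is why coherent access matters: a kernel that needs texels from an arbitrary other row would not fit this pattern without extra transfers.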
Easy to do in-place modifications
- Possibility of progressively updating a texture
- No need to double buffer if you get sync right
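
A rough sketch of what "progressively updating a texture" could look like: each frame rewrites only the next band of rows, in place, and wraps around at the bottom. This is a plain-Python illustration; `ProgressiveUpdater` and `stamp_row` are hypothetical names, and the synchronisation with the GPU that the slide mentions is deliberately not modelled here.

```python
# Hypothetical sketch: refresh a texture a few rows per frame, in place,
# instead of regenerating the whole thing every frame.

class ProgressiveUpdater:
    def __init__(self, height, rows_per_frame):
        self.height = height
        self.rows_per_frame = rows_per_frame
        self.next_row = 0  # first row to refresh on the next call

    def step(self, texture, row_bytes, update_row):
        """Update the next band of rows in place.

        texture is a bytearray; update_row(row, view) receives a writable
        memoryview over one row. Returns (first, last) with last exclusive.
        """
        first = self.next_row
        last = min(first + self.rows_per_frame, self.height)
        for row in range(first, last):
            view = memoryview(texture)[row * row_bytes:(row + 1) * row_bytes]
            update_row(row, view)
        self.next_row = last % self.height  # wrap to the top after the last row
        return first, last


def stamp_row(row, view):
    """Example update: fill the row with a value derived from its index."""
    view[:] = bytes([(row + 1) % 256]) * len(view)
```

With 8 rows and 3 rows per frame, successive calls to `step` touch rows 0-2, 3-5, 6-7 and then start over at 0, so the full texture is refreshed every three frames.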
Memory bandwidth is likely to be the bottleneck
- Generating large textures is going to generate problems in a heavily loaded system
- Generating full screen images still going to be unfriendly
- Compression is your friend
- Need to make your memory usage count
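
As a back-of-envelope illustration of why full-screen images are "unfriendly" and compression helps, the sketch below estimates memory traffic for a texture that is written once and read once per frame. The 1280x720 resolution, the 32-bit RGBA format and the two-pass assumption are mine, not from the slides; the 4 bits per pixel figure is the standard DXT1 rate.

```python
# Hypothetical sketch: rough memory traffic for regenerating a texture
# every frame. All inputs are illustrative assumptions.

def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of one width x height texture."""
    return width * height * bits_per_pixel // 8


def traffic_mb_per_s(width, height, bits_per_pixel, fps, passes=2):
    """Traffic in MB/s (1 MB = 1e6 bytes).

    passes=2 assumes one write by the producer and one read by the consumer.
    """
    return texture_bytes(width, height, bits_per_pixel) * passes * fps / 1e6


rgba8 = traffic_mb_per_s(1280, 720, 32, 60)  # uncompressed 32-bit texels
dxt1 = traffic_mb_per_s(1280, 720, 4, 60)    # DXT1 stores 4 bits per pixel
```

Under these assumptions the uncompressed case comes to roughly 442 MB/s against roughly 55 MB/s for DXT1, an eightfold difference before any other bus traffic in the system is counted.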
Textures are just a storage medium
- Ultimately just a way of getting data into shaders
- Many possibilities!
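That covers the principles, so now for some satisfying numbers. Geomerics published results and performance figures for their SPE implementation.
Because Enlighten has to fit into a customer's existing rendering pipeline, they state that Enlighten by itself reaches 100 fps (roughly 10 ms) on a typical GPU (presumably Xenos, or a contemporary high-end SM3 device; at GDC06 [06Q1] that would have been G71 and R580), which is what it takes to hold 60 fps under a real workload.
Of course it relies on techniques such as render to texture, so it consumes a fair amount of bandwidth. If the GPU is heavily loaded, they also offer the option of offloading to the host CPU (for example one thread of the Xenon host CPU, or the extra cores on a multi-core PC).
On the PS3, however, they set out to do everything on the SPEs, because RSX is not well supplied with bandwidth and render to texture is a resource-hungry operation to begin with.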
Originally implemented a GPU version for reference on the PS3
- Runs perfectly fine, but expensive resource
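And the result on the SPE... it was simply too fast. XD
It takes only 5 ms on one SPE, which is about 1/3 of a single SPE at 60 fps (about 16.7 ms per frame), or about 5% of the total SPE resources (5 / (16.7 x 6), about 5%).
(For comparison, まいにちいっしょ used a total of 4 SPEs to reach 30 fps, about 135 ms. Given the nature of that title, though, they did not spend much time on optimization, whereas Geomerics had at least the period from receiving PS3 hardware in January to the Devstation07 presentation in May to optimize, and this is their core business.)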
SPU version much faster
- System runs in 5ms on a single SPU (!)
- That's 1/3rd of an SPU @ 60 fps
- Or 5% of the total SPU potential at 60fps
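
The two percentages on this slide follow from simple frame-budget arithmetic. The sketch below reproduces them; `frame_ms` and `spu_share` are hypothetical helper names, and the count of 6 SPUs is the figure used in the commentary's own calculation.

```python
# Hypothetical sketch: what share of the SPU time budget a job uses per frame.

def frame_ms(fps):
    """Length of one frame in milliseconds."""
    return 1000.0 / fps


def spu_share(cost_ms, fps, spu_count=1):
    """Fraction of the combined per-frame SPU time taken by a job of cost_ms."""
    return cost_ms / (frame_ms(fps) * spu_count)


one_spu = spu_share(5.0, 60)                # share of a single SPU
all_spus = spu_share(5.0, 60, spu_count=6)  # share of six SPUs combined
```

A 5 ms job at 60 fps comes out at about 0.30 of one SPU and about 0.05 of six, which matches the "1/3rd" and "5%" quoted on the slide.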
That's why we are excited
- Algorithm is scalable so we can crank up the quality
- Still need to explore possibilities unique to the PS3
- SPUs are more flexible than GPUs - haven't really exploited this yet
- Very promising future!