■
The reason I say this: when I run PCIe Speed Test on my Radeon 4850
(http://developer.amd.com/gpu/ATISTREAMPOWERTOY/Pages/default.aspx)
===> Testing device 0 <===
Device type: RV770
Max resource 2D width/height: 8192/8192
Total GPU memory size: 512 MB
Total CPU cached space size: 508 MB
Total CPU uncached space size: 1279 MB
GPU engine clock: 665 MHz
GPU memory clock: 993 MHz
Number of timing loops: 100
[       16 bytes] CPU->GPU= 320.000 KB/sec, GPU->CPU 400.000 KB/sec
[       32 bytes] CPU->GPU= 640.000 KB/sec, GPU->CPU 800.000 KB/sec
[       64 bytes] CPU->GPU= 1.280 MB/sec, GPU->CPU 1.600 MB/sec
[      128 bytes] CPU->GPU= 2.560 MB/sec, GPU->CPU 3.200 MB/sec
[      256 bytes] CPU->GPU= 8.533 MB/sec, GPU->CPU 8.533 MB/sec
[      512 bytes] CPU->GPU= 17.067 MB/sec, GPU->CPU 17.067 MB/sec
[     1024 bytes] CPU->GPU= 34.133 MB/sec, GPU->CPU 34.133 MB/sec
[     2048 bytes] CPU->GPU= 68.267 MB/sec, GPU->CPU 68.267 MB/sec
[     4096 bytes] CPU->GPU= 136.533 MB/sec, GPU->CPU 204.800 MB/sec
[     8192 bytes] CPU->GPU= 273.067 MB/sec, GPU->CPU 409.600 MB/sec
[    16384 bytes] CPU->GPU= 546.133 MB/sec, GPU->CPU 819.200 MB/sec
[    32768 bytes] CPU->GPU= 1.638 GB/sec, GPU->CPU 1.092 GB/sec
[    65536 bytes] CPU->GPU= 2.185 GB/sec, GPU->CPU 2.185 GB/sec
[   131072 bytes] CPU->GPU= 2.621 GB/sec, GPU->CPU 2.185 GB/sec
[   262144 bytes] CPU->GPU= 2.621 GB/sec, GPU->CPU 2.016 GB/sec
[   524288 bytes] CPU->GPU= 2.621 GB/sec, GPU->CPU 2.185 GB/sec
[  1048576 bytes] CPU->GPU= 2.621 GB/sec, GPU->CPU 2.185 GB/sec
[  2097152 bytes] CPU->GPU= 2.655 GB/sec, GPU->CPU 2.208 GB/sec
[  4194304 bytes] CPU->GPU= 2.672 GB/sec, GPU->CPU 2.208 GB/sec
[  8388608 bytes] CPU->GPU= 2.663 GB/sec, GPU->CPU 2.213 GB/sec
[ 16777216 bytes] CPU->GPU= 2.693 GB/sec, GPU->CPU 2.213 GB/sec
[ 33554432 bytes] CPU->GPU= 2.695 GB/sec, GPU->CPU 2.213 GB/sec
[ 67108864 bytes] CPU->GPU= 2.696 GB/sec, GPU->CPU 2.213 GB/sec
[134217728 bytes] CPU->GPU= 2.694 GB/sec, GPU->CPU 2.213 GB/sec
calResAllocLocal2D() returned an error when trying to allocate 268435456 bytes!
Peak CPU->GPU Bandwidth = 2.696 GB/sec [data size = 67108864 bytes]
Peak GPU->CPU Bandwidth = 2.213 GB/sec [data size = 8388608 bytes]
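this is what I got, so my point was that it feels like it ought to go a bit higher.
If it really reached PCIe's nominal 8 GB/s, you could call it plenty fast even compared with the CPU, but...
Also, this may come in handy for something: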
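To put a number on "ought to go a bit higher", here is a rough back-of-the-envelope sketch in Python. The peak figures are copied from the log above. Two things are my own assumptions and not something the tool reports: that "8G" means a nominal 8 GB/s per direction, and that the tool's KB/MB/GB are decimal units. The function names are made up for illustration.

```python
# Peak figures copied from the PCIe Speed Test log above (RV770 / Radeon 4850).
PEAK_CPU_TO_GPU_GBPS = 2.696
PEAK_GPU_TO_CPU_GBPS = 2.213

# Assumption: "8G" is the nominal 8 GB/s per direction of the PCIe link.
PCIE_NOMINAL_GBPS = 8.0


def link_utilization(measured_gbps, nominal_gbps=PCIE_NOMINAL_GBPS):
    """Fraction of the nominal link bandwidth that was actually measured."""
    return measured_gbps / nominal_gbps


def per_transfer_time_us(size_bytes, bytes_per_sec):
    """Time one transfer takes, in microseconds, implied by size and rate."""
    return size_bytes / bytes_per_sec * 1e6


up = link_utilization(PEAK_CPU_TO_GPU_GBPS)
down = link_utilization(PEAK_GPU_TO_CPU_GBPS)

# Smallest row of the log: 16 bytes at 320.000 KB/sec (CPU->GPU) and
# 400.000 KB/sec (GPU->CPU). Assumption: KB here means 1000 bytes.
small_up_us = per_transfer_time_us(16, 320.000e3)
small_down_us = per_transfer_time_us(16, 400.000e3)

print(f"CPU->GPU peak: {up:.1%} of nominal")
print(f"GPU->CPU peak: {down:.1%} of nominal")
print(f"16-byte transfer: {small_up_us:.0f} us up, {small_down_us:.0f} us down")
```

Under those assumptions the peaks come out at roughly a third of nominal in one direction and a bit over a quarter in the other, and the 16-byte rows imply a cost of tens of microseconds per transfer regardless of size, which would explain why the small sizes look so slow.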
http://software.intel.com/en-us/articles/increasing-memory-throughput-with-intel-streaming-simd-extensions-4-intel-sse4-streaming-load/
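Will I ever actually need an uncacheable memory region? That was my first thought, but it looks like it could be useful for that other thing. I'll look into it later.
My machine is a 4850 + Q9550, which feels like it was (one step short of) the top of the line just about a year ago, yet in the year since then it doesn't feel like PCs have advanced at all.
GPUs still haven't really moved on from the 4870 or the GTX280, and Core2Quad and Nehalem aren't all that different either.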