Grid and block dimensions in (py)CUDA
I have a question about the dimensions of blocks and grids in (py)CUDA.
I know that the total number of threads in a block is limited, while
grids are far less restricted, and that the block size itself influences
the runtime. What I'm wondering is: if I have a block of 256 threads,
does it make a difference whether I launch it as (256,1), (128,2),
(64,4), etc.?
If it does make a difference, which shape is the fastest?
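
To make the question concrete, here is a minimal pure-Python sketch (no GPU needed) of how the threads of each block shape would be grouped into warps. It assumes a warp size of 32 and the linearization given in the CUDA programming guide, where the thread at (x, y) in a block of size (Dx, Dy) gets the ID x + y*Dx and consecutive IDs form a warp. The helper name `warp_layout` is hypothetical and only serves the illustration.

```python
WARP_SIZE = 32  # assumption: warp size of 32 threads


def warp_layout(block_dim_x, block_dim_y):
    """Return {warp index: set of threadIdx.y rows that the warp covers}."""
    layout = {}
    for y in range(block_dim_y):
        for x in range(block_dim_x):
            # Linear thread ID inside the block; x varies fastest.
            linear_id = x + y * block_dim_x
            warp = linear_id // WARP_SIZE
            layout.setdefault(warp, set()).add(y)
    return layout


if __name__ == "__main__":
    for shape in [(256, 1), (128, 2), (64, 4)]:
        layout = warp_layout(*shape)
        print(shape, "warps:", len(layout),
              "rows covered by warp 0:", sorted(layout[0]))
```

Under these assumptions all three shapes yield eight warps, and each warp stays within a single row, because the x dimension is a multiple of 32 in every case. Whether the runtime is then also the same presumably depends on how the kernel uses threadIdx.x and threadIdx.y to address memory, so the sketch narrows the question rather than answering it.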
Cheers, Andi