CUDA Part 05::Linear Memory

Related: CUDA Part 05::TextureReference

About linear memory

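Linear memory is device (GPU) memory that is addressed through ordinary pointers. On the devices covered by Programming Guide 2.3 it lives in a 32-bit address space; later architectures use wider addresses.
In general,

allocate it with "cudaMalloc()"
free it with "cudaFree()"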

Data is transferred between the CPU and the GPU with "cudaMemcpy()".
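The allocate, copy, free cycle described above can be sketched as follows. This is a minimal sketch, not code from the original page: the helper name roundTripThroughLinearMemory is hypothetical, and error handling is reduced to returning false.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper (not from the original text): copies `host` to the GPU
// and back again, returning true if the data survives the round trip.
bool roundTripThroughLinearMemory(const std::vector<float>& host)
{
    if (host.empty()) return true;  // nothing to transfer
    const size_t bytes = host.size() * sizeof(float);

    // Allocate linear memory on the device.
    float* device = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&device), bytes) != cudaSuccess)
        return false;

    // Host -> GPU, then GPU -> Host into a separate buffer.
    std::vector<float> back(host.size(), 0.0f);
    bool ok =
        cudaMemcpy(device, host.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy(back.data(), device, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    // Release the device allocation whether or not the copies succeeded.
    cudaFree(device);
    return ok && back == host;
}
```

Note that the size passed to cudaMalloc() and cudaMemcpy() is in bytes, not in elements.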

In addition, dedicated functions are provided for multi-dimensional data:

for 2D data: "cudaMallocPitch()" and "cudaMemcpy2D()"
for 3D data: "cudaMalloc3D()" and "cudaMemcpy3D()"
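For the 2D case, a minimal sketch might look like the following. The helper name roundTripPitched2D is hypothetical, and the host image is assumed to be tightly packed in row-major order. The point to notice is that the device row stride (pitch) is chosen by the runtime and may be larger than the row width in bytes, so both pitches must be passed to cudaMemcpy2D().

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: uploads a width x height float image into pitched
// linear memory, downloads it again, and reports whether the copy matched.
bool roundTripPitched2D(const std::vector<float>& image, size_t width, size_t height)
{
    if (width == 0 || height == 0 || image.size() != width * height) return false;
    const size_t rowBytes = width * sizeof(float);  // tightly packed host row

    float* device = nullptr;
    size_t pitch = 0;  // device row stride in bytes, chosen by the runtime (>= rowBytes)
    if (cudaMallocPitch(reinterpret_cast<void**>(&device), &pitch, rowBytes, height) != cudaSuccess)
        return false;

    // Arguments: dst, dst pitch, src, src pitch, row width in bytes, row count, direction.
    std::vector<float> back(image.size(), 0.0f);
    bool ok =
        cudaMemcpy2D(device, pitch, image.data(), rowBytes,
                     rowBytes, height, cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy2D(back.data(), rowBytes, device, pitch,
                     rowBytes, height, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(device);
    return ok && back == image;
}
```

Inside a kernel, row y of such an allocation starts at (char*)device + y * pitch, not at device + y * width.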

Binding linear memory to a texture

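A careful reading of the bundled Programming Guide shows that there are two kinds of functions for fetching texel values:

tex1Dfetch function　→　for linear memory
tex*D functions (tex1D, tex2D, tex3D)　→　for CUDA arrays

(Programming Guide 2.3, B.8)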

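Linear memory is therefore read with the "tex1Dfetch function",
which brings the following restrictions:

1D textures only
texture coordinates are integers
filtering and boundary (addressing) modes cannot be set
and so on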

Texture references usable with linear memory

Linear memory can be combined with the following texture references.

■texture<Type, 1, cudaReadModeElementType>
■texture<unsigned char, 1, cudaReadModeNormalizedFloat>
■texture<signed char, 1, cudaReadModeNormalizedFloat>
■texture<unsigned short, 1, cudaReadModeNormalizedFloat>
■texture<signed short, 1, cudaReadModeNormalizedFloat>

Note that "Type" may also be a 2- or 4-component vector type such as "uchar2" or "uchar4".
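For the cudaReadModeNormalizedFloat entries above, the fetch returns a float instead of the stored integer: the unsigned range is mapped linearly onto [0.0, 1.0] (signed types are mapped onto [-1.0, 1.0]). The host-side functions below are only an illustrative emulation of the unsigned mapping, with hypothetical names; they are not part of the CUDA API, and the signed case is deliberately left out.

```cuda
// Host-side emulation (hypothetical helpers, not CUDA API functions) of the
// value a normalized-float fetch yields for unsigned integer texels.

// unsigned char texel: 0 -> 0.0f, 255 -> 1.0f
float normalizedFromU8(unsigned char texel)
{
    return static_cast<float>(texel) / 255.0f;
}

// unsigned short texel: 0 -> 0.0f, 65535 -> 1.0f
float normalizedFromU16(unsigned short texel)
{
    return static_cast<float>(texel) / 65535.0f;
}
```

This is why a kernel that reads 8-bit image data through such a texture can work directly with values between 0 and 1.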

Binding linear memory

Linear memory is bound with the "cudaBindTexture function".

// C version
cudaError_t cudaBindTexture (size_t * offset, const struct textureReference *texref, 
		const void *devPtr, const struct cudaChannelFormatDesc * desc, size_t size)

// C++ version
template<class T , int dim, enum cudaTextureReadMode readMode> cudaError_t
cudaBindTexture (size_t  *offset, const struct texture< T, dim, readMode > & tex,
		const void *devPtr, const struct cudaChannelFormatDesc & desc, size_t size = UINT_MAX)

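【offset】
Receives the byte offset that must be applied to texture fetches when devPtr does not meet the texture alignment requirement. For a pointer returned by cudaMalloc() the offset is always 0, so NULL may be passed.

【texref】
The texture reference to bind (named tex in the C++ version).

【devPtr】
Address of the data to process in GPU memory.

【desc】
Channel format descriptor for the element type, for example the value returned by cudaCreateChannelDesc<float>().

【size】
Size in bytes of the memory area pointed to by devPtr.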

【Example】

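// Texture reference (must be declared at file scope)
texture<float,1,cudaReadModeElementType> texRef;

// Allocate GPU memory and transfer the data
float *deviceAry;
cudaMalloc( …　… );// allocate GPU memory
cudaMemcpy( …　…);// copy the data to the GPU (Host -> GPU)

// Bind (size is omitted, so the default UINT_MAX is used)
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc<float>();
CUDA_SAFE_CALL(cudaBindTexture(NULL,texRef,deviceAry,channelDesc));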

Texture fetch

With linear memory, texel values are read with the "tex1Dfetch function".

【tex1Dfetch function】

■template<class Type>
Type tex1Dfetch(texture<Type, 1, cudaReadModeElementType> texRef,  int x);

■float tex1Dfetch(texture<unsigned char, 1, cudaReadModeNormalizedFloat> texRef,  int x);
■float tex1Dfetch(texture<signed char, 1, cudaReadModeNormalizedFloat> texRef,  int x);
■float tex1Dfetch(texture<unsigned short, 1, cudaReadModeNormalizedFloat> texRef,  int x);
■float tex1Dfetch(texture<signed short, 1, cudaReadModeNormalizedFloat> texRef,  int x);

【Example】

texture<float,1,cudaReadModeElementType> texRef;
…　…
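//------------- GPU-side processing ---------------//
__global__ void cuInverse(float *gpuAry,int widthStep)
{
	// Compute this thread's pixel position.
	// No bounds check: the launch grid is assumed to match the image size exactly.
	unsigned int x = blockIdx.x*blockDim.x + threadIdx.x;
	unsigned int y = blockIdx.y*blockDim.y + threadIdx.y;
	unsigned int Idx = y*widthStep+x;// widthStep is the row stride in elements

	gpuAry[Idx] = 1.f - tex1Dfetch(texRef,Idx);// float data is treated as values in the 0 to 1 range
}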
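Putting the bind and fetch steps together, an end-to-end version might look like the sketch below. It is a simplified 1D variant, not the original author's program: the names texSrc, invertKernel and invertOnGpu are hypothetical, a bounds check is added, and error handling is minimal. It relies on the texture reference API used throughout this page, which was deprecated in later CUDA releases and removed in CUDA 12, so it assumes an older toolkit.

```cuda
#include <cuda_runtime.h>
#include <vector>

// File-scope texture reference (legacy texture reference API).
texture<float, 1, cudaReadModeElementType> texSrc;

// Writes 1 - src[i] for every element, reading src through the texture.
__global__ void invertKernel(float* dst, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)  // guard threads past the end of the data
        dst[i] = 1.0f - tex1Dfetch(texSrc, i);
}

// Hypothetical host wrapper: returns the inverted copy of `input`
// (an empty vector on any CUDA error).
std::vector<float> invertOnGpu(const std::vector<float>& input)
{
    std::vector<float> result;
    if (input.empty()) return result;
    const int count = static_cast<int>(input.size());
    const size_t bytes = input.size() * sizeof(float);

    float* src = nullptr;
    float* dst = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&src), bytes) != cudaSuccess) return result;
    if (cudaMalloc(reinterpret_cast<void**>(&dst), bytes) != cudaSuccess) {
        cudaFree(src);
        return result;
    }

    // Upload the input, then bind the linear memory to the texture reference.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    bool ok = cudaMemcpy(src, input.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaBindTexture(NULL, texSrc, src, desc, bytes) == cudaSuccess;
    if (ok) {
        const int threads = 256;
        const int blocks = (count + threads - 1) / threads;  // round up
        invertKernel<<<blocks, threads>>>(dst, count);

        // The blocking copy also waits for the kernel to finish.
        std::vector<float> out(input.size());
        ok = cudaMemcpy(out.data(), dst, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
        cudaUnbindTexture(texSrc);
        if (ok) result.swap(out);
    }
    cudaFree(src);
    cudaFree(dst);
    return result;
}
```

Because src comes straight from cudaMalloc(), NULL is passed for the offset, and the size is given explicitly in bytes rather than relying on the default.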