#ifndef HALIDE_MINI_CUDA_H
#define HALIDE_MINI_CUDA_H
#if defined(WINDOWS) && defined(BITS_32)
#define CUDAAPI __stdcall
#define CU_POINTER_ATTRIBUTE_CONTEXT 1
struct Halide::Runtime::Internal::Cuda::CUDA_MEMCPY3D_st CUDA_MEMCPY3D
Device is using TCC driver model.
Device supports caching globals in L1.
Alternate maximum 3D texture height.
Global memory bus width in bits.
Specifies whether there is a run time limit on kernels.
size_t WidthInBytes
Width of 3D memory copy in bytes.
Maximum block dimension X.
Maximum number of threads per block.
Deprecated, use CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_WIDTH.
size_t dstPitch
Destination pitch (ignored when dst is array)
Deprecated, use CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_LAYERS.
Maximum layers in a cubemap layered texture.
Maximum 3D surface depth.
void * reserved1
Must be NULL.
Deprecated, use CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_BLOCK.
Maximum block dimension Y.
Maximum cubemap surface width.
Unique id for a group of devices on the same multi-GPU board.
enum Halide::Runtime::Internal::Cuda::CUjit_option_enum CUjit_option
size_t dstXInBytes
Destination X in bytes.
PCI device ID of the device.
size_t Depth
Depth of 3D memory copy.
Deprecated, use CU_DEVICE_ATTRIBUTE_MAX_SHARED_MEMORY_PER_BLOCK.
Device is on a multi-GPU board.
Maximum 2D layered texture width.
Minor compute capability version number.
Maximum cubemap layered texture width/height.
Maximum grid dimension Z.
Maximum 3D surface width.
size_t dstY
Destination Y.
Alternate maximum 3D texture width.
CUarray srcArray
Source array reference.
CUdeviceptr srcDevice
Source device pointer.
Maximum layers in a cubemap layered surface.
Device can map host memory into CUDA address space.
Maximum 1D layered surface width.
size_t dstHeight
Destination height (ignored when dst is array; may be 0 if Depth==1)
This file defines the class FunctionDAG, which is our representation of a Halide pipeline, and contains methods that use Halide's bounds tools to query properties of it.
Maximum block dimension Z.
Maximum number of 32-bit registers available per multiprocessor.
Maximum layers in a 2D layered surface.
struct CUarray_st * CUarray
enum Halide::Runtime::Internal::Cuda::CUmemorytype_enum CUmemorytype
Maximum 3D surface height.
Alignment requirement for textures.
Maximum layers in a 1D layered surface.
CUarray dstArray
Destination array reference.
Maximum mipmapped 2D texture height.
CUmemorytype dstMemoryType
Destination memory type (host, device, array)
Maximum 2D surface width.
size_t Height
Height of 3D memory copy.
void * reserved0
Must be NULL.
Maximum mipmapped 2D texture width.
Maximum 2D layered texture height.
Maximum shared memory available per block in bytes.
size_t srcHeight
Source height (ignored when src is array; may be 0 if Depth==1)
Maximum mipmapped 1D texture width.
CUdeviceptr dstDevice
Destination device pointer.
Device supports stream priorities.
Maximum 2D layered surface height.
Maximum cubemap texture width/height.
size_t srcXInBytes
Source X in bytes.
struct CUmod_st * CUmodule
CUDA module.
Maximum 1D texture width.
Device has ECC support enabled.
Not visible externally, similar to 'static' linkage in C.
Maximum 2D linear texture pitch in bytes.
struct CUstream_st * CUstream
CUDA stream.
Maximum 2D linear texture width.
CUmemorytype srcMemoryType
Source memory type (host, device, array)
Maximum resident threads per multiprocessor.
struct CUfunc_st * CUfunction
CUDA function.
Maximum 3D texture depth.
const void * srcHost
Source host pointer.
Maximum 2D texture height.
Alternate maximum 3D texture depth.
Maximum 1D surface width.
Maximum shared memory available per multiprocessor in bytes.
Number of asynchronous engines.
Typical clock frequency in kilohertz.
PCI bus ID of the device.
Size of L2 cache in bytes.
Maximum 2D surface height.
Maximum 3D texture height.
struct CUevent_st * CUevent
CUDA event.
Major compute capability version number.
Maximum grid dimension Y.
Device can possibly copy memory and execute a kernel concurrently.
Device shares a unified address space with the host.
Maximum 1D linear texture width.
Alignment requirement for surfaces.
size_t dstLOD
Destination LOD.
Maximum 1D layered texture width.
Number of multiprocessors on device.
Maximum 2D layered surface width.
Device can possibly execute multiple kernels concurrently.
Maximum 3D texture width.
Maximum pitch in bytes allowed by memory copies.
Maximum grid dimension X.
Maximum 2D linear texture height.
Maximum number of 32-bit registers available per block.
Pitch alignment requirement for textures.
size_t dstZ
Destination Z.
Maximum 2D texture height if CUDA_ARRAY3D_TEXTURE_GATHER is set.
Peak memory clock frequency in kilohertz.
Maximum layers in a 2D layered texture.
void * dstHost
Destination host pointer.
struct CUctx_st * CUcontext
CUDA context.
Device supports caching locals in L1.
Maximum 2D texture width.
PCI domain ID of the device.
Memory available on device for constant variables in a CUDA C kernel in bytes.
Compute mode (See CUcomputemode for details)
Device can allocate managed memory on this system.
Maximum 2D texture width if CUDA_ARRAY3D_TEXTURE_GATHER is set.
Maximum cubemap layered surface width.
Device is integrated with host memory.
Deprecated, use CU_DEVICE_ATTRIBUTE_MAXIMUM_TEXTURE2D_LAYERED_HEIGHT.
Maximum layers in a 1D layered texture.
size_t srcPitch
Source pitch (ignored when src is array)
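
The pitch, height, and offset fields of CUDA_MEMCPY3D describe where a copy starts in a pitched 3D allocation. The following is a minimal sketch of that arithmetic for the destination side only. It assumes the addressing rule and size constraints as recalled from the CUDA driver API documentation for cuMemcpy3D. The struct `Memcpy3DDst` and the helpers `dst_start_offset`, `dst_params_valid`, and `make_example_copy` are hypothetical names introduced for illustration; they are not part of mini_cuda.h, and the field order does not mirror the real struct layout.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the destination-side subset of CUDA_MEMCPY3D.
// Field names follow the listing above; the layout is NOT the real one.
struct Memcpy3DDst {
    size_t dstXInBytes = 0;     // Destination X in bytes.
    size_t dstY = 0;            // Destination Y (rows).
    size_t dstZ = 0;            // Destination Z (slices).
    void *dstHost = nullptr;    // Destination host pointer.
    size_t dstPitch = 0;        // Bytes per row of the destination.
    size_t dstHeight = 0;       // Rows per slice of the destination.
    size_t WidthInBytes = 0;    // Width of the copy in bytes.
    size_t Height = 0;          // Height of the copy in rows.
    size_t Depth = 0;           // Depth of the copy in slices.
    void *reserved0 = nullptr;  // Must be NULL.
    void *reserved1 = nullptr;  // Must be NULL.
};

// Byte offset of the first copied element from the destination base pointer,
// assuming the usual pitched layout: slices of dstHeight rows, rows of
// dstPitch bytes.
inline size_t dst_start_offset(const Memcpy3DDst &c) {
    return (c.dstZ * c.dstHeight + c.dstY) * c.dstPitch + c.dstXInBytes;
}

// Sanity checks (assumed constraints): reserved fields are NULL, each copied
// row fits inside the pitch, and the copied rows fit inside a slice. The
// height check is skipped when Depth <= 1, since dstHeight may be 0 then.
inline bool dst_params_valid(const Memcpy3DDst &c) {
    if (c.reserved0 != nullptr || c.reserved1 != nullptr) return false;
    if (c.dstXInBytes + c.WidthInBytes > c.dstPitch) return false;
    if (c.Depth > 1 && c.dstY + c.Height > c.dstHeight) return false;
    return true;
}

// Example: copy a 4x3x2 block of floats to position (1, 1, 1) of a
// destination whose rows hold 8 floats and whose slices hold 4 rows.
inline Memcpy3DDst make_example_copy() {
    Memcpy3DDst c;
    c.dstPitch = 8 * sizeof(float);
    c.dstHeight = 4;
    c.dstXInBytes = 1 * sizeof(float);
    c.dstY = 1;
    c.dstZ = 1;
    c.WidthInBytes = 4 * sizeof(float);
    c.Height = 3;
    c.Depth = 2;
    assert(dst_params_valid(c));
    return c;
}
```

Keeping the X offset and width in bytes while Y, Z, Height, and Depth are counted in rows and slices matches the units given in the field descriptions above, so no element size needs to be passed to the helpers.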