PyCUDA, Google Colab and the GPU

Figure: PyCUDA GPU program compilation workflow.
Figure: Google Colab welcome page.
# Run in a Colab cell. Install PyCUDA first (the leading "!" runs a shell command).
!pip install pycuda

import pycuda.driver as cuda
import pycuda.autoinit  # initializes CUDA and creates a context on import

print("%d device(s) found." % cuda.Device.count())

dev = cuda.Device(0)
print("Device: %s" % dev.name())
print("  Compute Capability: %d.%d" % dev.compute_capability())
print("  Total Memory: %s KB" % (dev.total_memory() // 1024))

# Collect every device attribute as a (name, value) pair, sorted by name.
atts = [(str(att), value)
        for att, value in dev.get_attributes().items()]
atts.sort()

for att, value in atts:
    print("  %s: %s" % (att, value))
The script imports two PyCUDA modules:

  1. pycuda.driver: contains functions for memory handling (allocation, deallocation and transfers), for querying information about the GPU card, and so on; in the example it is imported under the shorthand cuda;
  2. pycuda.autoinit: imported without a shorthand; importing it initializes the device, creates a context, and takes care of cleanup when the program exits.
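
As a rough illustration of the two roles described above, the sketch below does by hand what pycuda.autoinit does on import (initialization, context creation, cleanup) and uses pycuda.driver for an allocate, copy and free round trip. The function name gpu_round_trip and the choice to return None when no GPU is usable are assumptions made for this example; they are not part of PyCUDA.

```python
def gpu_round_trip(values):
    """Copy `values` host -> device -> host; return None if no GPU is usable."""
    try:
        import numpy as np
        import pycuda.driver as cuda

        cuda.init()                          # done for you by pycuda.autoinit
        if cuda.Device.count() == 0:
            return None
        ctx = cuda.Device(0).make_context()  # also done by pycuda.autoinit
    except Exception:
        # PyCUDA missing, no NVIDIA driver, or no GPU runtime selected.
        return None

    try:
        host_in = np.asarray(values, dtype=np.float32)
        host_out = np.empty_like(host_in)

        dev_buf = cuda.mem_alloc(host_in.nbytes)  # allocation on the device
        cuda.memcpy_htod(dev_buf, host_in)        # transfer host -> device
        cuda.memcpy_dtoh(host_out, dev_buf)       # transfer device -> host
        dev_buf.free()                            # explicit deallocation

        return host_out.tolist()
    finally:
        ctx.pop()  # release the context created above


print(gpu_round_trip([1.0, 2.0, 3.0]))
```

On a Colab GPU runtime this should print the same three values back; on a CPU-only runtime it prints None. In everyday scripts, importing pycuda.autoinit is the simpler choice, since it removes the need for the explicit init, make_context and pop calls.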
Output of the device query script:
1 device(s) found.
Device: Tesla P100-PCIE-16GB
Compute Capability: 6.0
Total Memory: 16671616 KB
ASYNC_ENGINE_COUNT: 2
CAN_MAP_HOST_MEMORY: 1
CLOCK_RATE: 1328500
COMPUTE_CAPABILITY_MAJOR: 6
COMPUTE_CAPABILITY_MINOR: 0
COMPUTE_MODE: DEFAULT
CONCURRENT_KERNELS: 1
ECC_ENABLED: 1
GLOBAL_L1_CACHE_SUPPORTED: 1
GLOBAL_MEMORY_BUS_WIDTH: 4096
GPU_OVERLAP: 1
INTEGRATED: 0
KERNEL_EXEC_TIMEOUT: 0
L2_CACHE_SIZE: 4194304
LOCAL_L1_CACHE_SUPPORTED: 1
MANAGED_MEMORY: 1
MAXIMUM_SURFACE1D_LAYERED_LAYERS: 2048
MAXIMUM_SURFACE1D_LAYERED_WIDTH: 32768
MAXIMUM_SURFACE1D_WIDTH: 32768
MAXIMUM_SURFACE2D_HEIGHT: 65536
MAXIMUM_SURFACE2D_LAYERED_HEIGHT: 32768
MAXIMUM_SURFACE2D_LAYERED_LAYERS: 2048
MAXIMUM_SURFACE2D_LAYERED_WIDTH: 32768
MAXIMUM_SURFACE2D_WIDTH: 131072
MAXIMUM_SURFACE3D_DEPTH: 16384
MAXIMUM_SURFACE3D_HEIGHT: 16384
MAXIMUM_SURFACE3D_WIDTH: 16384
MAXIMUM_SURFACECUBEMAP_LAYERED_LAYERS: 2046
MAXIMUM_SURFACECUBEMAP_LAYERED_WIDTH: 32768
MAXIMUM_SURFACECUBEMAP_WIDTH: 32768
MAXIMUM_TEXTURE1D_LAYERED_LAYERS: 2048
MAXIMUM_TEXTURE1D_LAYERED_WIDTH: 32768
MAXIMUM_TEXTURE1D_LINEAR_WIDTH: 134217728
MAXIMUM_TEXTURE1D_MIPMAPPED_WIDTH: 16384
MAXIMUM_TEXTURE1D_WIDTH: 131072
MAXIMUM_TEXTURE2D_ARRAY_HEIGHT: 32768
MAXIMUM_TEXTURE2D_ARRAY_NUMSLICES: 2048
MAXIMUM_TEXTURE2D_ARRAY_WIDTH: 32768
MAXIMUM_TEXTURE2D_GATHER_HEIGHT: 32768
MAXIMUM_TEXTURE2D_GATHER_WIDTH: 32768
MAXIMUM_TEXTURE2D_HEIGHT: 65536
MAXIMUM_TEXTURE2D_LINEAR_HEIGHT: 65000
MAXIMUM_TEXTURE2D_LINEAR_PITCH: 2097120
MAXIMUM_TEXTURE2D_LINEAR_WIDTH: 131072
MAXIMUM_TEXTURE2D_MIPMAPPED_HEIGHT: 32768
MAXIMUM_TEXTURE2D_MIPMAPPED_WIDTH: 32768
MAXIMUM_TEXTURE2D_WIDTH: 131072
MAXIMUM_TEXTURE3D_DEPTH: 16384
MAXIMUM_TEXTURE3D_DEPTH_ALTERNATE: 32768
MAXIMUM_TEXTURE3D_HEIGHT: 16384
MAXIMUM_TEXTURE3D_HEIGHT_ALTERNATE: 8192
MAXIMUM_TEXTURE3D_WIDTH: 16384
MAXIMUM_TEXTURE3D_WIDTH_ALTERNATE: 8192
MAXIMUM_TEXTURECUBEMAP_LAYERED_LAYERS: 2046
MAXIMUM_TEXTURECUBEMAP_LAYERED_WIDTH: 32768
MAXIMUM_TEXTURECUBEMAP_WIDTH: 32768
MAX_BLOCK_DIM_X: 1024
MAX_BLOCK_DIM_Y: 1024
MAX_BLOCK_DIM_Z: 64
MAX_GRID_DIM_X: 2147483647
MAX_GRID_DIM_Y: 65535
MAX_GRID_DIM_Z: 65535
MAX_PITCH: 2147483647
MAX_REGISTERS_PER_BLOCK: 65536
MAX_REGISTERS_PER_MULTIPROCESSOR: 65536
MAX_SHARED_MEMORY_PER_BLOCK: 49152
MAX_SHARED_MEMORY_PER_MULTIPROCESSOR: 65536
MAX_THREADS_PER_BLOCK: 1024
MAX_THREADS_PER_MULTIPROCESSOR: 2048
MEMORY_CLOCK_RATE: 715000
MULTIPROCESSOR_COUNT: 56
MULTI_GPU_BOARD: 0
MULTI_GPU_BOARD_GROUP_ID: 0
PCI_BUS_ID: 0
PCI_DEVICE_ID: 4
PCI_DOMAIN_ID: 0
STREAM_PRIORITIES_SUPPORTED: 1
SURFACE_ALIGNMENT: 512
TCC_DRIVER: 0
TEXTURE_ALIGNMENT: 512
TEXTURE_PITCH_ALIGNMENT: 32
TOTAL_CONSTANT_MEMORY: 65536
UNIFIED_ADDRESSING: 1
WARP_SIZE: 32
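
To make a few of the raw attribute values above easier to read, here is a small sketch that converts them to more familiar units. The numbers are copied from the listing; the units (kHz for clock rates, bits for the bus width) follow the CUDA device attribute documentation, and the factor of 2 in the bandwidth estimate assumes double-data-rate memory, so the result is a theoretical peak rather than a measurement.

```python
# Values copied from the attribute listing above (Tesla P100-PCIE-16GB).
attrs = {
    "CLOCK_RATE": 1328500,                   # kHz
    "MEMORY_CLOCK_RATE": 715000,             # kHz
    "GLOBAL_MEMORY_BUS_WIDTH": 4096,         # bits
    "MULTIPROCESSOR_COUNT": 56,
    "MAX_THREADS_PER_MULTIPROCESSOR": 2048,
    "WARP_SIZE": 32,
}
total_memory_kb = 16671616                   # from the "Total Memory" line

# Unit conversions.
total_memory_gib = total_memory_kb / 1024**2
core_clock_mhz = attrs["CLOCK_RATE"] / 1000

# Upper bound on threads resident on the whole GPU at one time.
resident_threads = (attrs["MULTIPROCESSOR_COUNT"]
                    * attrs["MAX_THREADS_PER_MULTIPROCESSOR"])

# Warps that one multiprocessor can keep resident.
warps_per_sm = attrs["MAX_THREADS_PER_MULTIPROCESSOR"] // attrs["WARP_SIZE"]

# Theoretical peak bandwidth in GB/s.
# Assumption: 2 transfers per memory clock (double data rate).
bytes_per_transfer = attrs["GLOBAL_MEMORY_BUS_WIDTH"] // 8
peak_bandwidth_gb_s = (2 * attrs["MEMORY_CLOCK_RATE"] * 1000
                       * bytes_per_transfer) / 1e9

print(f"Total memory:        {total_memory_gib:.2f} GiB")
print(f"Core clock:          {core_clock_mhz:.1f} MHz")
print(f"Max resident threads: {resident_threads}")
print(f"Warps per SM:        {warps_per_sm}")
print(f"Peak bandwidth:      {peak_bandwidth_gb_s:.2f} GB/s")
```

The same arithmetic can be applied directly to the dictionary returned by dev.get_attributes() on other GPUs, keeping in mind that the bandwidth figure depends on the double-data-rate assumption.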

We have been teaching, researching and consulting on parallel programming for Graphics Processing Units (GPUs) since the release of CUDA. We also play with Matlab and Python.

Vitality Learning