Waits for all kernels in all streams on a CUDA device to complete.
cuda_synchronize(device = NULL)
device
Device to synchronize. If NULL (the default), the current device, as given by cuda_current_device(), is used.
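CUDA kernels are launched asynchronously, so a timing taken right after a GPU operation may measure only the launch, not the computation. The sketch below is a minimal illustration of calling cuda_synchronize() before stopping a timer. It assumes the torch R package is installed with a CUDA-capable device available. The matrix size and the variable names (x, y, start, elapsed) are illustrative choices, not part of the original documentation.

```r
# Guard: run only when torch is installed and a CUDA device is usable.
# (The tryCatch is a defensive assumption for machines without a working backend.)
has_cuda <- requireNamespace("torch", quietly = TRUE) &&
  isTRUE(tryCatch(torch::cuda_is_available(), error = function(e) FALSE))

elapsed <- NULL

if (has_cuda) {
  x <- torch::torch_randn(2000, 2000, device = "cuda")

  start <- Sys.time()
  y <- torch::torch_matmul(x, x)  # kernel is queued; the call may return early

  # Block until all kernels in all streams on the current device have finished.
  torch::cuda_synchronize()

  elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
  message(sprintf("matmul took %.4f seconds", elapsed))
} else {
  message("No CUDA device available; skipping the example.")
}
```

Without the cuda_synchronize() call, the measured time could be noticeably shorter than the real compute time, since the timer might stop while the kernel is still running.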