

Running CUDA programs on Quadro K620m

cuda,nvidia
I have a laptop with a Quadro K620m GPU. I am trying to learn CUDA programming and downloaded the network installer from the NVIDIA site. During CUDA SDK installation, just when it's checking the hardware of the machine, it displays: Do you want to continue? This graphics driver could not find compatible...

Video4Linux Loopback Device on Linux4Tegra

linux,ubuntu-14.04,nvidia,firewire
I am interfacing a Bumblebee2 camera with an Nvidia Tegra TK1 board. I have installed the firewire1394 driver along with coriander 2.0.2 to get the camera output and it is working fine. But I am not able to load the Video for Linux module. I have installed the following two packages as well...

Compiling Optix with Qt Creator - Linking Issues

c++,qt,cuda,nvidia,optix
I am trying to compile some sample projects from the Nvidia OptiX SDK with Qt Creator. I wrote the .pro file and edited it for my own needs with help from "Compiling Optix with Qt Creator". I have exactly the same .pro file, except that I edited the direction of...

Why use memset when using CUDA?

c,cuda,nvidia
I saw in a CUDA code example that memset is used to initialize vectors to all 0's that will store the sum of two other vectors. For example: hostRef = (float *)malloc(nBytes); gpuRef = (float *)malloc(nBytes); memset(hostRef, 0, nBytes); memset(gpuRef, 0, nBytes); What purpose does this serve if nothing else...
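For context on the excerpt above: malloc returns uninitialized memory, so one plausible reason for the memset calls is to give the result buffers a known starting value before anything is written to them. The following is only a minimal sketch in plain C, assuming IEEE 754 floats (where an all-zero bit pattern reads back as 0.0f). The helper names alloc_zeroed and all_zero are hypothetical and do not come from the original example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate n floats and set every byte to zero,
   mirroring the malloc + memset pattern from the excerpt.
   Returns NULL if the allocation fails. */
float *alloc_zeroed(size_t n)
{
    assert(n > 0);
    size_t nBytes = n * sizeof(float);
    float *buf = (float *)malloc(nBytes);
    if (buf != NULL) {
        memset(buf, 0, nBytes); /* all-zero bytes read back as 0.0f on IEEE 754 */
    }
    return buf;
}

/* Hypothetical helper: returns 1 if every element equals 0.0f, else 0. */
int all_zero(const float *buf, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (buf[i] != 0.0f) {
            return 0;
        }
    }
    return 1;
}
```

With a zeroed buffer, an element that a kernel or host loop never wrote shows up as an exact 0 rather than as arbitrary leftover data, which can make a failed comparison easier to interpret.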

Is it possible to say which pointer was allocated by cudaMalloc and which by malloc?

c,memory-management,cuda,gpgpu,nvidia
For example, I have a float pointer in the host code: float *p. Is it possible to determine the type (device/host) of memory it points to?...

Why doesn't OpenCL Nvidia compiler (nvcc) use the registers twice?

opencl,nvidia,nvcc,ptx
I'm doing a small OpenCL benchmark using Nvidia drivers. My kernel performs 1024 fused multiply-adds and stores the result in an array: #define FLOPS_MACRO_1(x) { (x) = (x) * 0.99f + 10.f; } // Multiply-add #define FLOPS_MACRO_2(x) { FLOPS_MACRO_1(x) FLOPS_MACRO_1(x) } #define FLOPS_MACRO_4(x) { FLOPS_MACRO_2(x) FLOPS_MACRO_2(x) } #define FLOPS_MACRO_8(x) {...

Coalesced memory access to 2d array with CUDA

c++,arrays,cuda,gpgpu,nvidia
I'm working on a piece of CUDA C++ code and need each thread to, essentially, access a 2D array in global memory by BOTH row-major AND column-major. Specifically, I need each thread-block to: generate its own 1-d array (let's say, gridDim # of elements) Write these to global memory Read...

Can I use NVIDIA nsight to troubleshoot WPF performance?

c#,wpf,nvidia,nsight
I have a WPF application with a bottleneck on the GPU. I thought I could use NVIDIA nsight to see what WPF is doing, but the setup documentation says I should disable WPF hardware acceleration. Without disabling hardware acceleration I still get results, but now I'm not sure - are...

QGLWidget - distortion occurred

qt,opengl,nvidia,optix
I would like to display sample6 of the OptixSDK in a QGLWidget. My application has only 3 QSliders for the rotation around the X, Y, Z axes and the QGLWidget. As I understand it, paintGL() gets called whenever updateGL() is called by my QSlider or mouse events. Then I initialize a rotation matrix and...

CUDA fails when trying to use both onboard iGPU and Nvidia discrete card. How can I use both discrete Nvidia and integrated (onboard) Intel GPU? [closed]

cuda,intel,nvidia,multi-gpu
I recently had some trouble making my PC (Ivy Bridge) use the onboard GPU (Intel iGPU HD4000) for normal screen display, while I run my CUDA programs for computations on the discrete Nvidia GT 640 I have in my machine. The problem was that under iGPU display, CUDA would be...

OpenGL: GL_FRAMEBUFFER_UNSUPPORTED on specific combinations of framebuffer attachments

c++,opengl,textures,nvidia,framebuffer
I'm trying to attach multiple targets to a framebuffer object. I have the following problem: There is no error when using float texture attachments and a depth attachment. There is also no error when using float texture attachments and integer texture attachments. Although these combinations work, I can't use float,...

Different results on CPU and GPU

c++,cuda,double,gpu,nvidia
I implemented the same algorithm on both CPU and GPU using C++ and CUDA C. To check whether the results are correct, I check if the 2 arrays of double calculated by both are the same with a precision of 1.0E-8. And the result is that the...
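The kind of check described in the excerpt above can be sketched as an element-wise comparison with an absolute tolerance. This is a minimal illustration in plain C and not the asker's actual code: the function name arrays_match and its parameters are hypothetical, and only the 1.0E-8 tolerance comes from the excerpt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every pair of elements differs by at most
   eps in absolute value, else 0. NaN values are treated as a mismatch. */
int arrays_match(const double *cpu, const double *gpu, size_t n, double eps)
{
    assert(eps >= 0.0);
    for (size_t i = 0; i < n; i++) {
        double diff = cpu[i] - gpu[i];
        if (diff < 0.0) {
            diff = -diff;
        }
        /* Written as !(diff <= eps) so that a NaN difference fails the check. */
        if (!(diff <= eps)) {
            return 0;
        }
    }
    return 1;
}
```

A call such as arrays_match(cpuResult, gpuResult, n, 1.0E-8) would reproduce the stated precision. An absolute tolerance is a design choice that suits values of moderate magnitude; for very large or very small values, a relative tolerance may be the better fit.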

How do I find the glx library name?

c++,opengl,nvidia,centos6,glx
I am trying to use a glX function (glXSwapIntervalMESA()) but the compiler is returning an undefined reference error. I have tried linking with X11 and Xext, and glx, though the last library apparently does not exist. libGL includes some entry points for glx, but I would guess that others (e.g....

What is the version of CUDA for nvidia 304.125

ubuntu,cuda,ubuntu-14.04,nvidia
I am using Ubuntu 14.04. I want to install CUDA, but I don't know which version is right for my laptop. I checked my driver: $ cat /proc/driver/nvidia/version NVRM version: NVIDIA UNIX x86_64 Kernel Module 304.125 Mon Dec 1 19:58:28 PST 2014 GCC version: gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1)...

Coriander installation on Ubuntu

ubuntu-14.04,nvidia,firewire
I am trying to install Coriander on Ubuntu, but when I tried to make it I got the following error: make all-recursive make[1]: Entering directory `/home/ubuntu/Downloads/coriander-2.0.0' Making all in po make[2]: Entering directory `/home/ubuntu/Downloads/coriander-2.0.0/po' make[2]: Nothing to be done for `all'. make[2]: Leaving directory `/home/ubuntu/Downloads/coriander-2.0.0/po' Making all in src make[2]:...

Why is my CUDA implementation equally fast as my CPU implementation

c++,cuda,nvidia,convolution
I created some code to do a 2D convolution on a 1300x1300 grayscale image and a 15x15 kernel, in standard C++ and in CUDA. Both versions: CPU: #include <iostream> #include <exception> #define N 1300 #define K 15 #define K2 ((K - 1) / 2) template<int mx, int my> inline int...

Cuda program not working for more than 1024 threads

cuda,gpu,nvidia
My program implements odd-even merge sort and it's not working for more than 1024 threads. I have already tried increasing the block size to 100 but it's still not working for more than 1024 threads. I'm using Visual Studio 2012 and I have an Nvidia GeForce 610M. This is my...

How can I find out which thread is getting executed on which core of the GPU?

cuda,gpu,nvidia
I'm developing some simple programs in CUDA and I want to know which thread is getting executed on which core of the GPU. I'm using Visual Studio 2012 and I have an NVIDIA GeForce 610M graphics card. Is it possible to do so... I've already searched a lot on Google...

Point Grey Bumblebee2 firewire 1394 with Nvidia Jetson TK1 board

opencv,ubuntu-14.04,nvidia
I have successfully interfaced a Point Grey Bumblebee2 firewire1394 camera with an Nvidia Jetson TK1 board. I get the video using Coriander, and the Video for Linux loopback device is working as well. But when I tried to access the camera using OpenCV and Coriander at the same time, I have conflicts....

What is the difference between the CUDA toolkit and the CUDA SDK

cuda,gpgpu,nvidia
I am installing CUDA on Ubuntu 14.04 and have a Maxwell card (GTX 9** series) and I think I have installed everything properly with the toolkit as I can compile my samples. However, I read in places that I should install the SDK (This appears to be talked about...

How to debug OpenCV program without Nvidia DLLs?

c++,visual-studio,opencv,nvidia,pdb
Visual Studio Community 2013 Windows 8.1 64bit OpenCV 3.0 beta GPU: NVIDIA GeForce GT 540M and an Intel core graphics. When I want to debug an OpenCV program, all symbol files (.pdb) load successfully except nvinit.dll, detoured.dll, Nvd3d9wrap.dll, nvdxgiwrap.dll. VS told me this: 'ImageWatchT.exe' (Win32): Loaded 'C:\Windows\SysWOW64\nvinit.dll'. Loading disabled by...

Display OptiX sample6 in QGLWidget

qt,opengl,nvidia,optix
I want to display sample6 of the OptixSDK in a QGLWidget. I've read the topic in the Nvidia OptiX Forum but I am not making progress, because unfortunately I have no idea how I should override the paintGL() method. At first I simply tried to read the output buffer of sample6...

Running CUDA GUI samples from a passive (inactive) GPU

cuda,nvidia,nsight,amd-processor
I managed to successfully run CUDA programs on a GeForce GTX 750 Ti while using an AMD Radeon HD 7900 as the rendering device (actually connected to the display) using this guide; for instance, the Vector Addition sample runs nicely. However, I can only run applications that do not produce...

OpenGL GLX_EXT_swap_control exists but can't link functions

c++,opengl,cmake,nvidia
I can't use glXSwapBufferEXT in my code; I get an undeclared identifier error. But, for instance, glXQueryDrawable works. In my CMake file I'm linking the OpenGL libraries and including them for the compiler. In my header I'm including GL/glx.h and GL/glxext.h. Running glxinfo shows GLX_EXT_swap_control exists, and testing extensions in my app also...

Firewire 1394 camera with OpenCV

c++,opencv,camera,nvidia,firewire
I am trying to run the face detection demo from OpenCV using a firewire 1394 camera. While doing so I got the following error. Unable to stop the stream.: Bad file descriptor In capture ... VIDIOC_STREAMON: Inappropriate ioctl for device Unable to stop the stream.: Inappropriate ioctl for device Here is the...

Not all work-items being used in OpenCL

c++,linux,opencl,gpgpu,nvidia
I'm able to compile and execute my kernel; the problem is that only two work-items are being used. I'm basically trying to fill up a float array[8] with {0,1,2,3,4,5,6,7}. So this is a very simple hello world application. Below is my kernel. // Highly simplified to demonstrate __kernel void...

Firewire1394 on Nvidia Jetson TK1

ubuntu-14.04,nvidia,tegra
I am trying to interface a Point Grey Bumblebee2 stereo camera with the Nvidia Tegra TK1 using PCI Express. The Nvidia board detects the PCI Express card. lspci 00:00.0 PCI bridge: NVIDIA Corporation Device 0e12 (rev a1) 01:00.0 FireWire (IEEE 1394): LSI Corporation FW643 [TrueFire] PCIe 1394b Controller (rev 08) 02:00.0 PCI bridge:...

Java OpenGL EXCEPTION_ACCESS_VIOLATION on glDrawArrays only on NVIDIA

java,opengl,lwjgl,nvidia,gldrawarrays
I'm working on a game in Java using lwjgl and its OpenGL implementation. Never had any problems until I exchanged it with a colleague who uses NVIDIA instead of AMD, and suddenly it crashes on a line that works on AMD but it only crashes at that point in the...

Is prefix scan CUDA sample code in gpugems3 correct?

cuda,gpu,nvidia,prefix-sum
I've written a piece of code to call the kernel in gpugems3 but the results I got are a bunch of negative numbers instead of a prefix scan. I'm wondering if my kernel call is wrong or if there is something wrong with the gpugems3 code? Here is my code: #include...

Reading event counters with concurrent execution

cuda,profiling,nvidia
I am trying to read performance counters with nvprof while executing two kernels concurrently. nvprof --concurrent-kernels on --events fb_subp0_write_sectors ./myprogram However, by doing this the kernel execution seems to serialize. What I want to see is exactly how they perform when they are running concurrently. Is it possible at...

Linux - relations between graphics drivers and Mesa

linux,opengl,nvidia,drivers
When I install an Nvidia proprietary driver, the Nvidia OpenGL implementation is used (I don't need Mesa). Which OpenGL implementation can be used with the open source Nvidia driver, Nouveau? Does Nouveau also provide an OpenGL implementation or does it have to use the Mesa OpenGL implementation? Can I use...

Practice computing grid size for CUDA

cuda,nvidia
dim3 block(4, 2); dim3 grid((nx+block.x-1)/block.x, (ny+block.y-1)/block.y); I found this code in Professional CUDA C Programming on page 53. It's meant to be a naive example of matrix multiplication. nx is the number of columns and ny is the number of rows. Can you explain how the grid size is computed?...
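The expression (n + block - 1) / block in the excerpt above is the usual integer idiom for rounding a division up, so that a partially filled last block is still launched. Below is a minimal sketch of that arithmetic in plain C, kept separate from any CUDA types; the function name blocks_needed is hypothetical and not from the book.

```c
#include <assert.h>

/* Hypothetical helper: number of blocks needed to cover n elements when each
   block holds block_dim threads. Equivalent to ceil(n / block_dim), computed
   with integer arithmetic only. */
unsigned int blocks_needed(unsigned int n, unsigned int block_dim)
{
    assert(block_dim > 0);
    return (n + block_dim - 1) / block_dim;
}
```

For example, with nx = 17 columns and block.x = 4, plain integer division gives 17 / 4 = 4 blocks, which cover only 16 columns, while (17 + 4 - 1) / 4 = 5 blocks cover all 17. The extra threads in the last block fall outside the matrix, so a kernel launched this way would normally need a bounds check.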

onSensorChanged() not fired on Android (nVidia Shield Tablet)

android,nvidia,android-sensors
I'm trying to create an augmented reality app for the nVidia shield. I tried my app on another Android device and it works. Unfortunately, on the shield, the onSensorChanged event is not fired. Here's my code: _sensorEventListener = new SensorEventListener() { public void onSensorChanged(SensorEvent event) { AndroidAttitude.this.processSensorEvent(event); } public void...