Optimising Sparse Array Math

I have a sparse array, term_doc, of size 622256x715 with Float64 elements. It is very sparse: of its 444,913,040 cells, only about 22,215 are normally nonempty. Of the 622,256 rows, only 4,699 are occupied, though all 715 columns are occupied. The operator I would like to...

Stream compaction with Thrust; best practices and fastest way?

I am interested in porting some existing code to use Thrust to see if I can speed it up on the GPU with relative ease. What I'm looking to accomplish is a stream compaction operation, where only nonzero elements are kept. I have this mostly working, per the example...
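For readers unfamiliar with the term, stream compaction can be sketched on the host with the C++ standard library. This is only a minimal CPU illustration of the operation described above, not the asker's code and not a Thrust implementation; the names `IsNonZero` and `compact_nonzero` and the `float` element type are assumptions made for the example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Predicate: an element survives compaction only if it is nonzero.
struct IsNonZero {
    bool operator()(float x) const { return x != 0.0f; }
};

// Hypothetical helper (not from the original post): returns a new vector
// holding the nonzero elements of `input`, in their original order.
std::vector<float> compact_nonzero(const std::vector<float>& input) {
    std::vector<float> output;
    output.reserve(input.size());  // upper bound on the compacted size
    std::copy_if(input.begin(), input.end(),
                 std::back_inserter(output), IsNonZero{});
    return output;
}
```

Thrust provides `thrust::copy_if` with a similar iterator-and-predicate shape; it returns an iterator to the end of the output range, so the number of kept elements is the distance from the start of the output. Whether that is the fastest route for a given workload would need to be measured.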