r,cuda,rcpp , Building a tiny R package with CUDA and Rcpp


Building a tiny R package with CUDA and Rcpp

Question:

Tag: r,cuda,rcpp

I'm working on a tiny R package that uses CUDA and Rcpp, adapted from the output of Rcpp.package.skeleton(). I will first describe what happens on the master branch for the commit entitled "fixed namespace". The package installs successfully if I forget CUDA (i.e., if I remove the src/Makefile, change src/rcppcuda.cu to src/rcppcuda.cpp, and comment out the code that defines and calls kernels). But as is, the compilation fails.

I also would like to know how to compile with a Makevars or Makevars.in instead of a Makefile, and in general, try to make this as platform independent as is realistic. I've read about Makevars in the R extensions manual, but I still haven't been able to make it work.

Some of you may suggest rCUDA, but what I'm really after here is improving a big package I've already been developing for some time, and I'm not sure that switching is worth starting again from scratch.

Anyway, here's what happens when I do an R CMD build and R CMD INSTALL on this one (master branch, commit entitled "fixed namespace").

* installing to library ‘/home/landau/.R/library’
* installing *source* package ‘rcppcuda’ ...
** libs
** arch - 
/usr/local/cuda/bin/nvcc -c rcppcuda.cu -o rcppcuda.o --shared -Xcompiler "-fPIC" -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -I/apps/R-3.2.0/include -I/usr/local/cuda/include 
rcppcuda.cu:1:18: error: Rcpp.h: No such file or directory
make: *** [rcppcuda.o] Error 1
ERROR: compilation failed for package ‘rcppcuda’
* removing ‘/home/landau/.R/library/rcppcuda’

...which is strange, because I do include Rcpp.h, and Rcpp is installed.

$ R

R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Copyright (C) 2015 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)

...

> library(Rcpp)
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: CentOS release 6.6 (Final)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rcpp_0.11.6
> 

I'm using CentOS,

$ cat /etc/*-release
CentOS release 6.6 (Final)
LSB_VERSION=base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
CentOS release 6.6 (Final)
CentOS release 6.6 (Final)

CUDA version 6,

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2013 NVIDIA Corporation
Built on Thu_Mar_13_11:58:58_PDT_2014
Cuda compilation tools, release 6.0, V6.0.1

and I have access to 4 GPUs of the same make and model.

$ /usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery 
/usr/local/cuda/samples/bin/x86_64/linux/release/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 4 CUDA Capable device(s)

Device 0: "Tesla M2070"
  CUDA Driver Version / Runtime Version          6.0 / 6.0
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 5375 MBytes (5636554752 bytes)
  (14) Multiprocessors, ( 32) CUDA Cores/MP:     448 CUDA Cores
  GPU Clock rate:                                1147 MHz (1.15 GHz)
  Memory Clock rate:                             1566 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 786432 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           11 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

...

> Peer access from Tesla M2070 (GPU0) -> Tesla M2070 (GPU1) : Yes
> Peer access from Tesla M2070 (GPU0) -> Tesla M2070 (GPU2) : Yes
> Peer access from Tesla M2070 (GPU0) -> Tesla M2070 (GPU3) : Yes
> Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU1) : No
> Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU2) : Yes
> Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU3) : Yes
> Peer access from Tesla M2070 (GPU2) -> Tesla M2070 (GPU1) : Yes
> Peer access from Tesla M2070 (GPU2) -> Tesla M2070 (GPU2) : No
> Peer access from Tesla M2070 (GPU2) -> Tesla M2070 (GPU3) : Yes
> Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU0) : Yes
> Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU1) : No
> Peer access from Tesla M2070 (GPU1) -> Tesla M2070 (GPU2) : Yes
> Peer access from Tesla M2070 (GPU2) -> Tesla M2070 (GPU0) : Yes
> Peer access from Tesla M2070 (GPU2) -> Tesla M2070 (GPU1) : Yes
> Peer access from Tesla M2070 (GPU2) -> Tesla M2070 (GPU2) : No
> Peer access from Tesla M2070 (GPU3) -> Tesla M2070 (GPU0) : Yes
> Peer access from Tesla M2070 (GPU3) -> Tesla M2070 (GPU1) : Yes
> Peer access from Tesla M2070 (GPU3) -> Tesla M2070 (GPU2) : Yes

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.0, CUDA Runtime Version = 6.0, NumDevs = 4, Device0 = Tesla M2070, Device1 = Tesla M2070, Device2 = Tesla M2070, Device3 = Tesla M2070
Result = PASS

Edit: it compiles for any commit after "fixed namespace" on either branch, but there are still problems with combining Rcpp and CUDA

To make the package compile, it turns out that I just needed to separate my C++ and CUDA code into separate *.cpp and *.cu files. However, when I try the "compiling cpp and cu separately" commit on the master branch, I get

> library(rcppcuda)
> hello()
An object of class "MyClass"
Slot "x":
 [1]  1  2  3  4  5  6  7  8  9 10

Slot "y":
 [1]  1  2  3  4  5  6  7  8  9 10

Error in .Call("someCPPcode", r) : 
  "someCPPcode" not resolved from current namespace (rcppcuda)
> 

The error goes away in the withoutCUDA branch in the commit entitled "adding branch withoutCUDA".

> library(rcppcuda)
> hello()
An object of class "MyClass"
Slot "x":
 [1]  1  2  3  4  5  6  7  8  9 10

Slot "y":
 [1]  1  2  3  4  5  6  7  8  9 10

[1] "Object changed."
An object of class "MyClass"
Slot "x":
 [1] 500   2   3   4   5   6   7   8   9  10

Slot "y":
 [1]    1 1000    3    4    5    6    7    8    9   10

> 

The only differences between the "compiling cpp and cu separately" commit on master and the "adding branch withoutCUDA" commit on withoutCUDA are

Also, it would still be convenient be able to use CUDA and Rcpp in the same *.cu file. I would really like to know how to fix the "fixed namespace" commit on the master branch.


Answer:

Going through your package there are multiple aspects that need to be changed.

  1. You shouldn't use a 'Makefile' but a 'Makevars' file instead to improve compatibility for multiple architecture builds.
  2. Try to follow the standard variable names (e.g. CPPC should be CXX), this makes everything play together much better.
  3. Don't try and compile the shared object yourself, there are good macros within the base R makefile that make this much simpler (e.g. PKG_LIBS, OBJECTS, etc.)
  4. With multiple compilers, you will want to use the OBJECTS macro. Here you will override R's base attempt to set the object files to be linked (make sure you include them all).
  5. You also need (AFAIK) to make CUDA functions available with extern "C". You will prefix both the function in the .cu file and when you declare it at the start of your cpp file.

The following Makevars worked for me whereby I modified my CUDA_HOME, R_HOME, and RCPP_INC (switched back for you). Note, this is where a configure file is recommended to make the package as portable as possible.

CUDA_HOME = /usr/local/cuda
R_HOME = /apps/R-3.2.0
CXX = /usr/bin/g++

# This defines what the shared object libraries will be
PKG_LIBS= -L/usr/local/cuda-7.0/lib64 -Wl,-rpath,/usr/local/cuda-7.0/lib64 -lcudart -d


#########################################

R_INC = /usr/share/R/include
RCPP_INC = $(R_HOME)/library/Rcpp/include

NVCC = $(CUDA_HOME)/bin/nvcc
CUDA_INC = $(CUDA_HOME)/include 
CUDA_LIB = $(CUDA_HOME)/lib64

LIBS = -lcudart -d
NVCC_FLAGS = -Xcompiler "-fPIC" -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -I$(R_INC)

### Define objects
cu_sources := $(wildcard *cu)
cu_sharedlibs := $(patsubst %.cu, %.o,$(cu_sources))

cpp_sources := $(wildcard *.cpp)
cpp_sharedlibs := $(patsubst %.cpp, %.o, $(cpp_sources))

OBJECTS = $(cu_sharedlibs) $(cpp_sharedlibs)

all : rcppcuda.so

rcppcuda.so: $(OBJECTS)

%.o: %.cpp $(cpp_sources)
        $(CXX) $< -c -fPIC -I$(R_INC) -I$(RCPP_INC)

%.o: %.cu $(cu_sources)
        $(NVCC) $(NVCC_FLAGS) -I$(CUDA_INC) $< -c

A follow-up point (as you say this is a learning exercise):

A. You aren't using one of the parts of Rcpp that make it such a wonderful package, namely 'attributes'. Here is how your cpp file should look:

#include <Rcpp.h>
using namespace Rcpp;

extern "C"
void someCUDAcode();

//[[Rcpp::export]]
SEXP someCPPcode(SEXP r) {
  S4 c(r);
  double *x = REAL(c.slot("x"));
  int *y = INTEGER(c.slot("y"));
  x[0] = 500.0;
  y[1] = 1000;
  someCUDAcode();
  return R_NilValue;
}

This will automatically generate the corresponding RcppExports.cpp and RcppExports.R files and you no longer need a .Call function yourself. You just call the function. Now .Call('someCPPcode', r) becomes someCPPcode(r) :)

For completeness, here is the updated someCUDAcode.cu file:

__global__ void mykernel(int a){
  int id = threadIdx.x;
  int b = a;
  b++;
  id++;
}


extern "C"
void someCUDAcode() {
  mykernel<<<1, 1>>>(1);
}

With respect to a configure file (using autoconf), you are welcome to check out my gpuRcuda package using Rcpp, CUDA, and ViennaCL (a C++ GPU computing library).


Related:


How (in a vectorized manner) to retrieve single value quantities from dataframe cells containing numeric arrays?


r,dataframes,vectorization
I've got a dataframe that includes columns like the one on the right here: lengthArray speed_max 1 4 24, 18, 24, 18 2 10 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 3 4 -999, -999, -999, -999 4 2 -999, -999 5 2 18, 18 6 1...

Converting column from military time to standard time


r,excel
I'm trying to convert a column showing the time of road traffic accidents from military time to standard time. The data looks like this: Col1 Time..24hr. 1 1404 2 322 3 1945 4 1005 5 945 I'd then like to convert to 12hr so for '322' I'd like to make...

Remove quotes to use result as dataset name


r,string
I've got a vector with a long list of dataset names. E.g myvector<-c('ds1','ds2,'ds3') I'd like to use the names ds1..ds3 to write a file, taking the file name from the vector. Like this: write.csv(dataset[i],file=paste(myvector[i],'.csv',sep='') with dataset being d1...ds3, but without quotes. How can I remove the quotes and refer to...

Replace -inf, NaN and NA values with zero in a dataset in R


r,time-series,nan,zoo
I am trying to run some trading strategies in R. I have downloaded some stock prices and calculated returns. The new return dataset has a number of -inf, NaN, and NA values. I am reproducing a row of the dataset (log_ret). Its a zoo dataset. library(zoo) log_ret <- structure( c(0.234,-0.012,-Inf,NaN,0.454,Inf),...

Subtract time in r, forcing unit of results to minutes [duplicate]


r,posix,posixct
This question already has an answer here: Getting consist units from diff command in R 4 answers I successfully subtracted two POSIXct cols of df1 (below). However, since the time differences are >= 1 hour in all rows, R gives the results in hours. I know that this make...

How can I minimize this function in R?


r,function,optimization,mathematical-optimization
I'm attempting to write a formula that will determine a value of a that minimizes the function output myfun (i.e. a-fptotal). MWE: c <- as.matrix(c(.25,.5,.25)) d <- as.matrix(c(10000,12500,15000)) e <- 700 f <- 1.1 tr <- .30 myfun <- function(a) { b <- max(a-e,0) df <- data.frame(u1=c(c*b*.40),u2=c(c*b*.60)) df$year <- 1:nrow(df)...

how to call Java method which returns any List from R Language? [on hold]


java,r,rjava
How to call java method which returns list from R Language.

Translating Stata to R: collapse


r,data.table,stata,code-translation
Just came across a .do file that I need to translate into R because I don't have a Stata license; my Stata is rusty, so can someone confirm that the code is doing what I think it is? Here's the Stata code: collapse (min) MinPctCollected = PctCollected /// (mean) AvgPctCollected...

Count number of rows meeting criteria in another table - R PRogramming


r
I have two tables, one with property listings and another one with contacts made for a property (i.e. is someone is interested in the property they will "contact" the owner). Sample "listings" table below: listings <- data.frame(id = c("6174", "2175", "9176", "4176", "9177"), city = c("A", "B", "B", "B" ,"A"),...

Sleep Shiny WebApp to let it refresh… Any alternative?


r,shiny,sleep
I have a WebApp that have some renderUI({})... and some of them depend on the input of another. This makes that, briefly, a red error in the webpage appear when I select some options. Because the if() clause of some renderUI({}) depend on the input of a selectizer. The error...

Serial modification of objects in R


r,oop
I have a number of matrices of the same size: m1.m <- matrix(c(1,2,3,4), nrow=2, ncol=2) m2.m <- matrix(c(5,6,7,8), nrow=2, ncol=2) ... I want to set uniform column and row names to all of them. Currently I am doing it like this: new_col_names <- c("Col1","Col2") new_row_names <- c("Row1","Row2") change_names <- function(m,...

Aggregating data in R


r
user_id date datetime page 217568 6/12/2015 49:23.9 Vodafone | How to get in touch with Vodafone 135437 6/10/2015 43:35.7 My Vodafone – Manage your Vodafone Pay Monthly Account Online – Vodafone 196094 6/13/2015 33:39.4 Check the status of Vodafone’s mobile network in real-time 74197 6/6/2015 52:46.1 undefined 153501 6/5/2015 02:55.5...

how to read a string as a complex number?


r
I have a string which has a complex format, how can I use complex() to treat it as a complex number? For example: myStr="0.76+0.41j" now I want to do sth like: myStr_complex=complex(myStr) # my question is how should I do this part? Eventually Im(myStr_complex) should print 0.41 ...

Histogram-like summary for interval data


r,statistics,histogram
How do I get a histogram-like summary of interval data in R? My MWE data has four intervals. interval range Int1 2-7 Int2 10-14 Int3 12-18 Int4 25-28 I want a histogram-like function which counts how the intervals Int1-Int4 span a range split across fixed-size bins. The function output should...

How to plot data points at particular location in a map in R


r,google-maps,ggmap
I have a dataset that looks like this: LOCALITY numbers 1 Airoli 72 2 Andheri East 286 3 Andheri west 208 4 Arya Nagar 5 5 Asalfa 7 6 Bandra East 36 7 Bandra West 72 I want to plot bubbles (bigger the number bigger would be the bubble) inside...

Am I using sapply incorrectly?


r,sapply
This code is suppose to take in a word, and compute values for letters of the word, based on the position of the letter in the word. So for a word like "broke" it's suppose to compute the values for the letter "r" and "k" strg <- 'broke' #this part...

CUDA cuBlasGetmatrix / cublasSetMatrix fails | Explanation of arguments


cuda,gpgpu,gpu-programming,cublas
I've attempted to copy the matrix [1 2 3 4 ; 5 6 7 8 ; 9 10 11 12 ] stored in column-major format as x, by first copying it to a matrix in an NVIDIA GPU d_x using cublasSetMatrix, and then copying d_x to y using cublasGetMatrix(). #include<stdio.h>...

How to quickly read a large txt data file (5GB) into R(RStudio) (Centrino 2 P8600, 4Gb RAM)


r,large-data
I have a large data set, one of the files is 5GB. Can someone suggest me how to quickly read it into R (RStudio)? Thanks

Store every value in a sequence except some values


r
If I do the following to a string of letters: x <- 'broke' y <- nchar(x) z <- sequence(y) How do I store every value of the z that isn't the first, last, or middle values of the sequence. In this example if z is 1 2 3 4 5...

Set a timer in R to execute a program


r,timer
I have a program to execute per 15 seconds, how can I achieve this, the program is as followed: print_test<-function{ cat("hello world") } ...

Find multiple consecutive empty lines


r
I'm trying to chop up a text file into the articles it contains. Usually this is done by identifying a pattern each article begins with. Unfortunately the database I downloaded the articles from doesn't have that. The only pattern I can find is that after each article there are 3...

R stops displaying maps


r,google-maps,ggmap
Few days ago I was familiarizing myself with displaying maps, plotting points on the map from http://rpubs.com/nickbearman/r-google-map-making Today, I have intermittent success in displaying maps. library(ggmap) map <- qmap('Anaheim', zoom = 10, maptype = 'roadmap') Outputs Map from URL : http://maps.googleapis.com/maps/api/staticmap?center=Anaheim&zoom=10&size=640x640&scale=2&maptype=roadmap&language=en-EN&sensor=false And when I go to the URL...

Select / subset spatial data in R


r,dictionary,spatial
I am working on a large data set with spatial data (lat/long). My data set contains some positions that I don´t want in my analysis (it makes the files to heavy to process in ArcMap- many Go of data). This is why I want to subset the relevant data for...

Highlighting specific ranges on a Graph in R


r,graph,highlight
library(season) plot(CVD$yrmon, CVD$cvd, type = 'o',pch = 19,ylab = 'Number of CVD deaths per month',xlab = 'Time') if i wanted to highlight a region of the graph based on x values say from 1994-1998 how do i do this? Any thought would be appreciated Thanks....

ggplot2 & facet_wrap - eliminate vertical distance between facets


r,ggplot2
I'm working with some data that I want to display as a nxn grid of plots. Edit: To be more clear, there's 21 categories in my data. I want to facet by category, and have those 21 plots in a 5 x 5 square grid (where the orphan is by...

How to build a 'for' loop with input$i in R Shiny


r,loops,for-loop,shiny
In my shiny app, I build a a number of checkboxes using a for loop, like this: landelist <- c("Danmark", "Tjekkiet", "Østrig", "Belgien", "Tyskland", "Sverige", "USA", "Norge", "Island") landecheckbox <- c() for (land in landelist){ landechek <- paste0("<label class=\"checkbox inline\"><input id=\"", land, "\" type=\"checkbox\" checked><span>", land, "</span></label>") landecheckbox <- c(landechek,...

ggplot equivalent for matplot


r,ggplot2
Is there an equivalent in ggplot2 to plot this dataset? I use matplot, and read that qplot could be used, but it really does not work. ggplot/matplot data<-rbind(c(6,16,25), c(1,4,7), c(NA, 1,2), c(NA, NA, 1)) as.data.frame(data) matplot(data, log="y",type='b', pch=1) ...

Using R to Assign Treatments to Groups


r
We have seven exposures and 24 groups. We would like to randomly assign five of the seven exposures to groups while also ensuring that we end up with a consistent count for each exposure, meaning that each exposure ends up being exposed about the same number of times. I have...

Skip some lines with fread


r,fread
I am interested to skip some lines of my data frame before the header names . How can i do it by skiping all the lines before ID_REF or if ID_REF is not present, check for the pattern ILMN_ and deleting all the lines keeping immediate first if not containing...

Fitted values in R forecast missing date / time component


r,time-series,forecasting
I've been doing a variety of models in R with time series data (in XTS format) and I keep running into the same issue where there's no date / time component to the fitted values / forecasts and thus I can't graph them on the same graph as the original...

dplyr multiple inputs from Shiny


r,shiny,dplyr
I have a Shiny app that takes input from radio button and then use that to perform filter to the data frame using dplyr in the server side. It works, but now I want to expand it to take multiple inputs to filter, and I have no idea how to...

Keep the second occurrence in a column in R


r,conditional,subset,find-occurrences
I have quite a simple dataset: ID Value Time 1 censored 1 1 censored 2 1 uncensored 3 1 uncensored 4 1 censored 5 1 censored 6 2 censored 1 2 uncensored 2 2 uncensored 3 2 uncensored 4 2 censored 5 I want to keep the first uncensored occurrence,...

copy a list of data.tables


r,data.table
I have the following situation: 1) a list of data tables 2) For testing purposes I deliberately want to (deeply) copy the whole list including the data tables 3) I want to take some element from the copied list and add a new column. Here is the code: library(data.table) x...

Appending a data frame with for if and else statements or how do put print in dataframe


r,loops,data.frame,append
How do I put what I printed in a dataframe with a for loop and if else statements? Basically, this code: list<-c("10","20","5") for (j in 1:3){ if (list[j] < 8) print("Greater") else print("Less") }) #[1] "Less" #[1] "Less" #[1] "Greater" Or should it be something more like this? f3 <-...

Correlate by levels of a variable in R


r,correlation
I would like to correlate two variables and have the output reported separately for levels of a third variable. My data are similar to this example: var1 <- c(7, 8, 9, 10, 11, 12) var2 <- c(18, 17, 16, 15, 14, 13) categories <- c(1, 2, 3, 1, 2, 3)...

Linear multivariate regression in R


r
I want to model that a factory takes an input of, say, x tonnes of raw material, which is then processed. In the first step waste materials are removed, and a product P1 is created. For the "rest" of the material, it is processed once again and another product P2...

Limit the color variation in R using scale_color_grey


r,colors,ggplot2
Before I start, allow me to explain my graph: I have two Genotypes (WTB and whd) and each have two conditions (0 and 7), so I have four lines. Now, I want to make a plot where each variable and its condition is the same color. Anything with whd will...

Return Column Names when True in R


r
I am using R for a project and I have a data frame in in the following format: A B C 1 1 0 0 2 0 1 1 I want to return a data frame that gives the Column Name when the value is 1. i.e. Impair1 Impair2 1...

R — frequencies within a variable for repeating values


r,count,duplicates
I've got a column A, which has several values, some of them repeating. So, example: A = c(5, 9, 6, 5, 5). I need to go through A and count the frequencies of each of the values in A. So, for this example, for the set of 5s in A,...

How to split a text into two meaningful words in R


r,string-split,stemming,text-analysis
I had a text data frame having sentences, and as I wanted the list of separate words in another dataframe I used the "qdap package" function "all_words" Words = all_words(df$problem_note_text, begins.with=NULL , alphabetical = FALSE, apostrophe.remove = TRUE, char.keep = char2space, char2space = "~~") Now have a dataframe which has...

Convert strings of data to “Data” objects in R [duplicate]


r,date,csv
This question already has an answer here: as.Date with dates in format m/d/y in R 2 answers My problem is that the as.Date function does not convert the values in a "date" column of a data frame into Date objects. I have a data.frame nmmaps. Here is a short...

R: recursive function to give groups of consecutive numbers


r,if-statement,recursion,vector,integer
Given a sorted vector x: x <- c(1,2,4,6,7,10,11,12,15) I am trying to write a small function that will yield a similar sized vector y giving the last consecutive integer in order to group consecutive numbers. In my case it is (defining groups 2, 4, 7, 12 and 15): > y...

R Program Vector, record Column Percent


r,vector,percentage
This is my vector head(sep) I must find percent of all SEP 11 in each row. For instance, in first row, percent of SEP 11 is 100 * ((63 + 124)/ (63 + 124 + 0 + 0)) And would like this stored in newly created 8th column Thanks dput...

Fitting a subset model with just one lag, using R package FitAR


r,time-series
I am trying to fit a subset model with only lag 4. In the manual it's written "you must use p=c(0,0,0,4) since p=4 will fit a full AR(4)". I did this. #fit a subset model with just lag 4 Fit=FitAR(p=c(0,0,0,4), lag.max = "default", ARModel = "ARz") However, I get the...

Subsetting rows by passing an argument to a function


r,subset
I have the following data frame which I imported into R using read.table() (I incorporated read.table() within read_data() which is a function I created that also throw messages in case the file name is not written appropriately): > raw_data <- read_data("n44.txt") [1] #### Reading txt file #### > head(raw_data) subject...

how to get values from selectInput with shiny


r,shiny
I am playing around with the shiny packages for some hours now, and wanted to make a select input widget that enables me to download a certain data set from the server. So i figured out a way to get me this data frame containing all my IDs for downloading:...

Rbind in variable row size not giving NA's


r,rbind
The initial data frame mergedDf is PROD_CODE 1 PRD0900033,PRD0900135,PRD0900220,PRD0900709 2 PRD0900097,PRD0900550 3 PRD0900121 4 PRD0900353 5 PRD0900547,PRD0900614 After calling mergedDf<-data.frame(do.call('rbind', strsplit(as.character(mergedDf$PROD_CODE),',',fixed=TRUE))) Output becomes X1 X2 X3 X4 1 PRD0900033 PRD0900135 PRD0900220 PRD0900709 2 PRD0900097 PRD0900550 PRD0900097 PRD0900550 3 PRD0900121 PRD0900121 PRD0900121 PRD0900121 4 PRD0900353 PRD0900353 PRD0900353 PRD0900353 5 PRD0900547 PRD0900614...

optimization algorithm for circular data


r,optimization,circular,maximization
Background: I am interested in localizing a sound source from a suite of audio recorders. Each audio array consists of 6 directional microphones spaced evenly every 60 degrees (0, 60, 120, 180, 240, 300 degrees). I am interested in finding the neighboring pair of microphones with the maximum set of...

How to set x-axis with decreasing power values in equal sizes


r,plot,ggplot2,cdf
Currently I am doing some cumulative distribution plot using R and I tried to set x-axis with decreasing power values (such as 10000,1000,100,10,1) in equal sizes but I failed: n<-ceiling(max(test)) qplot(1:n, ecdf(test)(1:n), geom="point",xlab="check-ins", ylab="Pr(X>=x)")+ geom_step() +scale_x_reverse(breaks=c(10000,1000,100,10,1)) +scale_shape_manual(values=c(15,19)) It seems that the output has large interval for 10000, then all the...

R: Using the “names” function on a dataset created within a loop


r,paste,assign,names
I am using a for loop to read in multiple csv files and naming the datasets import1, import2, etc. For example: assign(paste("import",i,sep=""), read.csv(files[i], header=FALSE)) However, I now want to rename the variables in each dataset. I have tried the following: names(as.name(paste("import",i,sep=""))) <- c("xxxx", "yyyy") But get the error "target of...