How can I make the function below faster? For larger datasets, sapply takes too long to complete. Here is what I am attempting in the code below: extracting a number from a string of characters. Since the extracted number may be 1 or 2 digits long, I have used...

I have a data frame "testData" as follows: id content 1 I came from China 2 I came from America 3 I came from Canada 4 I came from Japan 5 I came from Mars And I also have another data frame "addr" as follows: id addr 1 America 2...

In R, I want to split a data frame along a factor variable, and then apply a function to the data pertaining to each level of that variable. I want to do all of this inside my function. Somehow, the data aren't being split? I don't understand all of the...

I am trying some split-apply-combine methods. How do I split a data frame into different categories and then sort each category in descending order of a particular column? First I split mtcars: spmtcars <- split(mtcars, mtcars$cyl) Then if I do sort_mtc <- spmtcars[order(mpg), ] I get: Error in order(mpg) : object 'mpg' not found...
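A minimal sketch of one way to approach the excerpt above, not the asker's or any answerer's actual solution. `split()` returns a list of data frames, so indexing the list itself with `[order(mpg), ]` cannot work, and `mpg` is not visible outside a data frame. Sorting has to happen inside each piece, for example with `lapply()`:

```r
# Split mtcars by number of cylinders; the result is a named list of data frames.
spmtcars <- split(mtcars, mtcars$cyl)

# Sort each data frame by mpg in descending order.
# The column is referenced as d$mpg because mpg is not a free-standing object.
sort_mtc <- lapply(spmtcars, function(d) d[order(d$mpg, decreasing = TRUE), ])

# Inspect the first rows of the 4-cylinder group.
head(sort_mtc[["4"]])
```

The result is still a list with one element per cylinder count; `do.call(rbind, sort_mtc)` would combine the pieces back into one data frame if that is what is wanted.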

I've got this table/array in csv: GroupID Channel Daysbeforelast 1 A 35 1 B 31 1 C 29 1 D 17 1 E 15 1 D 5 1 C 0 2 B 66 2 E 17 2 D 15 2 A 2 2 C 0 2 F 0 2 A...

This code is supposed to take in a word and compute values for letters of the word, based on the position of the letter in the word. So for a word like "broke" it's supposed to compute the values for the letters "r" and "k" strg <- 'broke' #this part...

Is there a reason why I can't append values to an empty vector when called within a nested lapply/apply function? I have an empty vector bucket where I'd like to push values into, however, the output says the bucket is reinitialized with each iteration. I would appreciate any insight into...
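The excerpt above is cut off, so the following is only a simplified sketch of the likely cause, using a single `lapply()` rather than the nested calls the question mentions. An assignment with `<-` inside a function creates a local variable, so the outer `bucket` is never modified. Returning values and collecting them afterwards avoids the problem:

```r
bucket <- c()

# Each call assigns to a *local* bucket; the global one is untouched.
invisible(lapply(1:3, function(i) bucket <- c(bucket, i)))
length(bucket)  # still 0

# Return the values from the function and collect them after the loop.
collected <- unlist(lapply(1:3, function(i) i * 2))
```

The superassignment operator `<<-` would also modify the outer vector, but returning values tends to be easier to reason about.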

I would like to use an objective function based on a list of elements, each of which is the result of applying a function over a dataframe (df) ((function is, say, variance of df's observations' "measure")). That is, I have a list of dfs. I naturally want to sapply my...

I have the following type of data: token <- list( cameron = rep("people", 12)) I'm applying a function like the following: token <- sapply(token, function(x){ x <- str_trim(x, side = "both") }) The problem is sapply() messes up the name of the structure. Running names(token) returns NULL. Running str(token) shows...
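A short sketch of what is probably happening in the excerpt above. It is an assumption on my part that base R's `trimws()` is an acceptable stand-in for `str_trim()`; it is used here only to keep the example free of package dependencies. Because the single list element has length 12, `sapply()` simplifies the result to a 12 x 1 matrix, and `names()` of a matrix is NULL. The name survives as a column name. `lapply()` keeps the list structure and its names:

```r
token <- list(cameron = rep("people", 12))

# sapply() simplifies to a character matrix with one column.
token_mat <- sapply(token, trimws)
names(token_mat)     # NULL
colnames(token_mat)  # "cameron"

# lapply() returns a list, so the element name is preserved.
token_list <- lapply(token, trimws)
names(token_list)    # "cameron"
```

Passing `simplify = FALSE` to `sapply()` has the same effect as switching to `lapply()` here.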

It seems this question has been asked a couple of times in different forms, but I couldn't find the right solution. I have a SpatialPoint object with several Polygons and would like to subset and plot one polygon using the slot "ID". Using the example from this question: Sr1 =...

I hope someone will be able to help me on this problem. I have a list object that includes 48 vectors and each vector has a length of 2,000,000 observations in it. Here is a code that creates the same structure with only 100,000 items per vector: mtx_sim <- matrix(data...

<- updated for completeness (thanks to hrbrmstr for pointing it out)-> I'm trying to extract some data from Pubmed and I've been reading the example from here (relevant diagram here). A redacted version of my data looks like: <PubmedArticleSet> <PubmedArticle> <MedlineCitation Owner="NLM" Status="MEDLINE"> <PMID Version="1">11841882</PMID> <Article PubModel="Print"> <PublicationTypeList> <PublicationType UI="D002363">Case...

I have a simple question. Assuming I have a list Obj of length 500 Obj[[1]], Obj[[2]], ....Obj[[500]], #for each Obj[[i]], it has an element Obj[[i]]$logL, My question is how to extract logL of each Obj to avoid a for loop like this? logL = rep(NA, length(Obj)) for(i in 1: length(Obj)){...
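A minimal sketch of a loop-free way to do what the excerpt above describes. The small `Obj` built here is a hypothetical stand-in for the question's 500-element list. It assumes each element holds a single numeric `logL`:

```r
# Hypothetical stand-in for the question's Obj: each element carries a logL value.
Obj <- lapply(1:5, function(i) list(logL = -i * 1.5))

# vapply() extracts logL from every element and checks each result is one number.
logL <- vapply(Obj, function(o) o$logL, numeric(1))

# Equivalent shorthand: use the `[[` extractor directly.
logL2 <- sapply(Obj, `[[`, "logL")
```

`vapply()` is the stricter choice because it raises an error if any element lacks a scalar numeric `logL`, whereas `sapply()` would silently return a list.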

I have a function (weisurv) that has 2 parameters - sc and shp. It is a function through time (t). Time is a sequence, i.e. t<-seq(1:100). weisurv<-function(t,sc,shp){ surv<-exp(-(t/sc)^shp) return(surv) } I have a data frame (df) that contains a list of sc and shp values (like 300+ of them). For...

I have a dataframe consisting of a series of paired columns. Here is a small example. df1 <- as.data.frame(matrix(sample(0:1000, 36*10, replace=TRUE), ncol=1)) df2 <- as.data.frame(rep(1:12, each=30)) df3 <- as.data.frame(matrix(sample(0:500, 36*10, replace=TRUE), ncol=1)) df4 <- as.data.frame(c(rep(5:12, each=30),rep(1:4, each=30))) df5 <- as.data.frame(matrix(sample(0:200, 36*10, replace=TRUE), ncol=1)) df6 <- as.data.frame(c(rep(8:12, each=30),rep(1:7, each=30))) Example <-...

I have a four column matrix with a chronological index, and three columns of names (strings). Here is some toy data: x = rbind(c(1,"sam","harry","joe"), c(2,"joe","sam","jack"),c(3,"jack","joe","jill"),c(4,"harry","jill","joe")) I want to create three additional vectors that count (for each row) any previous (but not subsequent) occurrences of the name. Here would be the...

Trying to avoid using a for loop in the following code by utilizing sapply, if at all possible. The solution with loop works perfectly fine for me, I'm just trying to learn more R and explore as many methods as possible. Objective: have a vector i and two vectors sf...

I have a data frame df with two columns, col1 and col2, that contain NA values. I have to calculate the mean and sd for each. I have calculated them separately with the code below. # Random generation set.seed(12) df <- data.frame(col1 = sample(1:100, 10, replace=FALSE), col2 = sample(1:100, 10, replace=FALSE)) # Introducing null values...

I have 7 dataframes where the first variable is just a list of the 50 states. The problem is that in some of them, the states are all capitals, all lower case or mixed. Instead of writing 7 different tolower() commands, I was wondering if there is a way to...

Why are the two functions fn and gn below different? I don't think they should be, but I must be missing something. vars <- letters[1:10] a <- b <- 1 fn <- function (d) { sapply( vars, exists ) } gn <- function (d) { sapply( vars, function (x) {...

I have a matrix: mat <- matrix(c(0,0,0,0,1,1,1,1,-1,-1,-1,-1), ncol = 4 , nrow = 4) and I apply the following functions to filter out the columns with only positive entries, but for the columns that have negative entries I get a NULL. How can I suppress the NULLs from the output...