I hope someone will be able to help me with this problem. I have a list object that includes 48 vectors, and each vector has a length of 2,000,000 observations. Here is code that creates the same structure with only 100,000 items per vector: mtx_sim <- matrix(data...

I am trying some split-apply-combine methods. How do I split a data set into categories and then sort each category in descending order of a particular column? First I split mtcars: spmtcars <- split(mtcars, mtcars$cyl). Then if I do sort_mtc <- spmtcars[order(mpg), ] I get: Error in order(mpg) : object 'mpg' not found...
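A minimal sketch of one way to do this, assuming the goal is a list of data frames, each sorted by mpg in descending order. split() returns a list, so order(mpg) cannot find mpg and two-dimensional indexing does not apply to the list itself; the sorting has to happen inside each element. The name sorted_by_cyl is made up for illustration.

```r
# Split mtcars into a list of data frames, one per cylinder count.
spmtcars <- split(mtcars, mtcars$cyl)

# Sort each data frame by mpg, descending. Inside the anonymous function,
# d is a single data frame, so d$mpg is visible to order().
sorted_by_cyl <- lapply(spmtcars, function(d) {
  d[order(d$mpg, decreasing = TRUE), ]
})

# Inspect the top rows of the 4-cylinder group.
head(sorted_by_cyl[["4"]])
```

If a single data frame is wanted at the end, do.call(rbind, sorted_by_cyl) combines the pieces again.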

In R, I want to split a data frame along a factor variable, and then apply a function to the data pertaining to each level of that variable. I want to do all of this inside my function. Somehow, the data aren't being split? I don't understand all of the...

I have a data frame "testData" as follows: id content 1 I came from China 2 I came from America 3 I came from Canada 4 I came from Japan 5 I came from Mars And I also have another data frame "addr" as follows: id addr 1 America 2...

I have a function (weisurv) that has 2 parameters - sc and shp. It is a function through time (t). Time is a sequence, i.e. t<-seq(1:100). weisurv<-function(t,sc,shp){ surv<-exp(-(t/sc)^shp) return(surv) } I have a data frame (df) that contains a list of sc and shp values (like 300+ of them). For...

I have 7 dataframes where the first variable is just a list of the 50 states. The problem is that in some of them, the states are all capitals, all lower case or mixed. Instead of writing 7 different tolower() commands, I was wondering if there is a way to...

This code is supposed to take in a word and compute values for letters of the word, based on the position of the letter in the word. So for a word like "broke" it's supposed to compute the values for the letters "r" and "k". strg <- 'broke' #this part...

I have a dataframe consisting of a series of paired columns. Here is a small example. df1 <- as.data.frame(matrix(sample(0:1000, 36*10, replace=TRUE), ncol=1)) df2 <- as.data.frame(rep(1:12, each=30)) df3 <- as.data.frame(matrix(sample(0:500, 36*10, replace=TRUE), ncol=1)) df4 <- as.data.frame(c(rep(5:12, each=30),rep(1:4, each=30))) df5 <- as.data.frame(matrix(sample(0:200, 36*10, replace=TRUE), ncol=1)) df6 <- as.data.frame(c(rep(8:12, each=30),rep(1:7, each=30))) Example <-...

Trying to avoid using a for loop in the following code by utilizing sapply, if at all possible. The solution with loop works perfectly fine for me, I'm just trying to learn more R and explore as many methods as possible. Objective: have a vector i and two vectors sf...

I have the following type of data: token <- list( cameron = rep("people", 12)) I'm applying a function like the following: token <- sapply(token, function(x){ x <- str_trim(x, side = "both") }) The problem is sapply() messes up the name of the structure. Running names(token) returns NULL. Running str(token) shows...
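A hedged sketch of what is probably going on, using base R's trimws() as a stand-in for str_trim() so that no package is needed, and padded strings so the trimming is visible. When every element of the result has the same length, sapply() simplifies the list to a matrix; the original name survives as a column name, but names() on that matrix is NULL. lapply() never simplifies, so the list and its names are kept. The names simplified and kept are made up for illustration.

```r
# Same structure as in the question, with whitespace added around the strings.
token <- list(cameron = rep(" people ", 12))

# sapply() simplifies the length-12 result into a 12 x 1 character matrix.
simplified <- sapply(token, function(x) trimws(x, which = "both"))
names(simplified)     # NULL, because a matrix has dimnames rather than names
colnames(simplified)  # "cameron"

# lapply() returns a list, so the element name is preserved.
kept <- lapply(token, function(x) trimws(x, which = "both"))
names(kept)           # "cameron"
```

Passing simplify = FALSE to sapply() is another way to keep the list structure.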

How can I make the function below faster? For larger datasets, sapply takes too long to complete. Here is what I am attempting in the code below: extracting a number from a string of characters. Since the extracted number may be 1 or 2 digits long, I have used...

I have a data frame df with two columns, col1 and col2, that include NA values. I have to calculate the mean and sd for them. I have calculated them separately with the code below. # Random generation set.seed(12) df <- data.frame(col1 = sample(1:100, 10, replace=FALSE), col2 = sample(1:100, 10, replace=FALSE)) # Introducing null values...

I've got this table/array in csv: GroupID Channel Daysbeforelast 1 A 35 1 B 31 1 C 29 1 D 17 1 E 15 1 D 5 1 C 0 2 B 66 2 E 17 2 D 15 2 A 2 2 C 0 2 F 0 2 A...

It seems this question has been asked a couple of times in different forms, but I couldn't find the right solution. I have a SpatialPoint object with several Polygons and would like to subset and plot one polygon using the slot "ID". Using the example from this question: Sr1 =...

I would like to use an objective function based on a list of elements, each of which is the result of applying a function over a data frame (df); the function is, say, the variance of the df's observations' "measure". That is, I have a list of dfs. I naturally want to sapply my...

<- updated for completeness (thanks to hrbrmstr for pointing it out)-> I'm trying to extract some data from Pubmed and I've been reading the example from here (relevant diagram here). A redacted version of my data looks like: <PubmedArticleSet> <PubmedArticle> <MedlineCitation Owner="NLM" Status="MEDLINE"> <PMID Version="1">11841882</PMID> <Article PubModel="Print"> <PublicationTypeList> <PublicationType UI="D002363">Case...

I have a four column matrix with a chronological index, and three columns of names (strings). Here is some toy data: x = rbind(c(1,"sam","harry","joe"), c(2,"joe","sam","jack"),c(3,"jack","joe","jill"),c(4,"harry","jill","joe")) I want to create three additional vectors that count (for each row) any previous (but not subsequent) occurrences of the name. Here would be the...

Why are the two functions fn and gn below different? I don't think they should be, but I must be missing something. vars <- letters[1:10] a <- b <- 1 fn <- function (d) { sapply( vars, exists ) } gn <- function (d) { sapply( vars, function (x) {...

Is there a reason why I can't append values to an empty vector when called within a nested lapply/apply function? I have an empty vector, bucket, that I'd like to push values into; however, the output says the bucket is reinitialized with each iteration. I would appreciate any insight into...

I have a simple question. Assume I have a list Obj of length 500: Obj[[1]], Obj[[2]], ..., Obj[[500]]. Each Obj[[i]] has an element Obj[[i]]$logL. How can I extract logL from each Obj without a for loop like this? logL = rep(NA, length(Obj)) for(i in 1: length(Obj)){...
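A minimal sketch of the loop-free extraction, assuming each logL is a single number. The Obj built here is a made-up stand-in for the real list, and logL_alt is an illustrative name.

```r
# Hypothetical stand-in for Obj: 500 elements, each a list with a logL entry.
Obj <- lapply(1:500, function(i) list(logL = -i / 10, other = "x"))

# vapply() checks that every element yields exactly one numeric value.
logL <- vapply(Obj, function(o) o$logL, numeric(1))

# Equivalent shorthand: pass the extraction operator `[[` directly to sapply().
logL_alt <- sapply(Obj, `[[`, "logL")

head(logL)
```

vapply() is the stricter choice here: it raises an error if some element has a missing or non-scalar logL, whereas sapply() would quietly return a list in that case.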

I have a matrix: mat <- matrix(c(0,0,0,0,1,1,1,1,-1,-1,-1,-1), ncol = 4 , nrow = 4) and I apply the following functions to filter out the columns with only positive entries, but for the columns that have negative entries I get a NULL. How can I suppress the NULLs from the output...