R: Apply cut using row-specific breaks

r,matrix,apply
I would like to transform a matrix of latent scores to observed scores. One can do so by applying break points/thresholds to the original matrix, thus ending up with a new, categorical matrix. Doing so is simple, for example: #latent variable matrix true=matrix(c(1.45,2.45,3.45, 0.45,1.45,2.45, 3.45,4.45,5.45) ,ncol=3,byrow=TRUE) #breaks for the cut...

R apply function with multiple dynamic and static parameters

r,apply
I would like to apply the distancePointSegment function to all points in my vector, both are given in the code snippet below. The function takes in 6 values, 2 of which are dynamic (column/row specific) and 4 are static. # Function that I want to apply: distancePointSegment <- function(px, py,...

Apply a function over all combinations of arguments (output as list)

r,apply
This solution is almost what I need, but it did not work in my case. Here is what I have tried: comb_apply <- function(f,...){ exp <- expand.grid(...,stringsAsFactors = FALSE) apply(exp,1,function(x) do.call(f,x)) } #--- Testing Code l1 <- list("val1","val2") l2 <- list(2,3) testFunc<-function(x,y){ list(x,y) } #--- Executing Test Code comb_apply(testFunc,l1,l2) comb_apply(paste,l1,l2) It...

Create new variable based on size of value in other column

r,apply
I am attempting to create a df with a new variable called 'epi' (stands for episode), which is based on the 'days.since.last' variable. When the value of 'days.since.last' is greater than 90, I want the episode variable to increase by 1. Here is the original df deid session.number days.since.last 1...

how to subtract a value from one column from a value from a previous row, different column in r

r,function,data.frame,apply,difference
I have a dataframe composed of 3 columns and ~2000 rows. ID DistA DistB 1 100 200 2 239 390 3 392 550 4 700 760 5 770 900 The first column (ID) is a unique identifier for each row. I'd like my script to read each row, and subtract/compare...

apply, sapply and lapply return NULL

r,apply,lapply,sapply
I have a matrix: mat <- matrix(c(0,0,0,0,1,1,1,1,-1,-1,-1,-1), ncol = 4 , nrow = 4) and I apply the following functions to filter out the columns with only positive entries, but for the columns that have negative entries I get a NULL. How can I suppress the NULLs from the output...

Why does apply() convert date objects to numeric objects? [duplicate]

r,apply
This question already has an answer here: why all date strings are changed into numbers? 2 answers Why apply() converts my date objects to numeric before calling the user function? apply(matrix(seq(as.Date("2010-01-01"), as.Date("2010-01-05"), 1)), 1, function(x) { return(class(x)) }) [1] "numeric" "numeric" "numeric" "numeric" "numeric" And why as.Date() doesn't have...
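A minimal sketch of the behaviour described in this question, assuming base R only. matrix() keeps just the underlying day counts of a Date vector, so apply() hands plain numbers to the function. Iterating over the Date vector itself with sapply() goes through as.list(), which has a Date method, so the class survives. The names via_apply and via_sapply are illustrative.

```r
# A Date vector is numeric underneath (days since 1970-01-01).
d <- seq(as.Date("2010-01-01"), as.Date("2010-01-05"), by = 1)

# matrix() drops the Date class, so apply() only ever sees numerics.
via_apply <- apply(matrix(d), 1, function(x) class(x))

# sapply() iterates over as.list(d), whose elements are still Dates.
via_sapply <- sapply(d, function(x) class(x))

print(via_apply)
print(via_sapply)
```

If the goal is to run a function per date, iterating over the vector (or a list) rather than a matrix avoids the coercion entirely.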

Python pandas apply on more columns

python,pandas,dataframes,apply
How can I generate more columns in a dataframe using apply with more columns? My df is: A B C 0 11 21 31 1 12 22 31 If I want to generate only one column that works perfectly: df['new_1']=df[['A','C','B']].apply(lambda x: x[1]/2, axis=1) The result is: A B C new_1...

Create a datatable containing the Nth digit of each of a list of file names

r,apply
I have a list of files containing output from a large model. I load these as a datatable using: files <- list.files(path.expand("/XYZ/"), pattern = ".*\\.rds", full.names = TRUE) dt<- as.data.table(files) This datatable "dt" has just 1 column, the file name. e.g XZY_00_34234.rds the 50th and 51st character of each file...

Subset dataframe such that all values in each row are less than a certain value

r,dataframes,apply
I have a dataframe with a dimension column and 4 value columns. How can I subset the column such that all 4 columns for each record are less than a given x? I know I could do this manually using subset and specifying the condition for each column, but is...
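One possible approach, sketched on made-up data (the column names dimension, v1 to v4 and the cutoff x = 4 are assumptions, not taken from the question): compare all value columns to x at once and keep the rows where every comparison is TRUE.

```r
# Hypothetical data: one dimension column plus four value columns.
df <- data.frame(dimension = c("a", "b", "c"),
                 v1 = c(1, 5, 2), v2 = c(2, 1, 9),
                 v3 = c(3, 2, 1), v4 = c(0, 4, 3))
x <- 4
value_cols <- c("v1", "v2", "v3", "v4")

# df[value_cols] < x is a logical matrix; a row qualifies
# when the number of TRUE entries equals the number of value columns.
keep <- rowSums(df[value_cols] < x) == length(value_cols)
small <- df[keep, ]
print(small)
```

This avoids writing one condition per column; note that rows containing NA in a value column are not handled here.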

Combine a function and for loop

r,plyr,apply,bioinformatics
I have data for different tissues like so tissueA tissueB tissueC gene1 4.5 6.2 5.8 gene2 3.2 4.7 6.6 And I want to calculate a summary statistic that is x = Σ [1-log2(i,j)/log2(i,max)]/n-1 where n is the number of tissues (here it is 3), (i,max) is the highest value for...

dplyr - error message after applying function

r,apply,spatial,dplyr
I am trying to apply an IDW (inverse distance weighting) to different groups in a database. I am trying to use dplyr to apply this function to each group, but I am making a mistake in the Split-Apply-Combine step. The current function returns 10 values for each group of 10 observations,...

r - Use apply to take values of one column and calculate values for another column

r,apply
I have a data frame with measurements. One column shows the measurements in mm, and the other in units (which is a relative scale varying with zoom-level on the stereo microscope I was using). I want to go through every row of my data frame, and for each "length_mm" that...

Find the intersection of two dataframes and compute the average of an integer row in the dataframe

r,dataframes,apply
I have two dataframes that contain id, score, and studentName. I would like to create a dataframe that contains only ids that appear in both test1 and test2. Then, I would like to average the students' scores. Here is some sample data: test1 <- data.frame(id = numeric(0), score = integer(0),...

Using sapply (or apply) inside objective function for optim ; object in list element no longer recognized

r,apply,mathematical-optimization,sapply
I would like to use an objective function based on a list of elements, each of which is the result of applying a function over a dataframe (df) ((function is, say, variance of df's observations' "measure")). That is, I have a list of dfs. I naturally want to sapply my...

Counting with zero values in the apply function

r,count,data.frame,apply
I am attempting to use the count with zero occurrences based on a defined list within the apply function. I have managed to do these separately, but would ideally like to have them in a single line. Here is my aim: list <- c("x", "y", "z") df V1 V2 V3...

Apply a weighted average function to a dataframe without grouping it, as if it was a single group

python,pandas,apply
I want to apply a function that computes something similar to a weighted average absolute deviation of all the elements of my data frame. I already have a solution for it, but it seems quirky to me because I have to use groupby with a lambda function that always returns...

Loop or apply for sum of rows based on multiple conditions in R dataframe

r,loops,apply,subsetting,multiple-conditions
I've hacked together a quick solution to my problem, but I have a feeling it's quite obtuse. Moreover, it uses for loops, which from what I've gathered, should be avoided at all costs in R. Any and all advice to tidy up this code is appreciated. I'm still pretty new...

R: deleting columns where certain percentage of values is missing [duplicate]

r,data.frame,apply,na,missing-data
This question already has an answer here: Deleting columns from a data.frame where NA is more than 15% of the column length 1 answer I'm working with a data frame resembling the extract below. sample.df Obs Var1 Var2 Var3 A0001 21 21 21 A0002 21 78 321 A0003 32...
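A short sketch of the usual base R idiom for this, using toy data and an assumed threshold (toy.df and max_missing are hypothetical names, and 0.3 is only an example cutoff): colMeans(is.na(...)) gives the share of missing values per column, which can then index the columns to keep.

```r
# Hypothetical data frame with different amounts of missing data.
toy.df <- data.frame(a = c(1, NA, NA, 4),
                     b = c(1, 2, 3, 4),
                     c = c(NA, 2, 3, 4))
max_missing <- 0.3  # assumed threshold: drop columns with more than 30% NA

# Share of NA values in each column.
na_share <- colMeans(is.na(toy.df))

# drop = FALSE keeps a data frame even if only one column survives.
cleaned <- toy.df[, na_share <= max_missing, drop = FALSE]
print(names(cleaned))
```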

applying a function to a timeseries in a R dataframe

r,apply
I'm trying to apply a function to a column in a dataframe that contains dates and keep getting an error. Not exactly sure what I'm doing wrong. Here is my df: dates total 1 2014-12-08 01:10:00 163.7 2 2014-12-08 01:10:00 163.9 3 2014-12-08 01:12:00 163.6 4 2014-12-08 08:27:00 163.0 5 2014-12-08 08:35:00...

Faster way to capture regex

regex,r,performance,apply
I want to use regex to capture substrings - I already have a working solution, but I wonder if there is a faster solution. I am applying applyCaptureRegex on a vector with about 400.000 entries. exampleData <- as.data.frame(c("[hg19:21:34809787-34809808:+]","[hg19:11:105851118-105851139:+]","[hg19:17:7482245-7482266:+]","[hg19:6:19839915-19839936:+]")) captureRegex <- function(captRegEx,str){ sapply(regmatches(str,gregexpr(captRegEx,str))[[1]], function(m) regmatches(m,regexec(captRegEx,m))) } applyCaptureRegex <- function(mir,r){...

Apply Function with multiple parameter

r,apply
Here is the data source: https://www.dropbox.com/s/z5jsvwbzz5fumqp/countyComplete.csv?dl=0 I want to multiply 2 columns (pop2010 * percapitaincome) for each county and then divide it by the count of state, grouped by state. How can I do it using any of the apply functions in R? Here is my attempt: myfun<-function(x,y){ x*y } y<-county$per_capita_income...

R apply: using element indices in the function

r,apply
I have a three dimensional data structure reflecting data at particular longitudes, latitudes, and depth. I would like to apply a function to this data. Normally, say I want to find the depth-averaged value I'd do the following: apply(MyData, MAR = c(1, 2), mean) which makes sense to me. What...

Apply list of functions to list of values

r,apply
In reference to this question, I was trying to figure out the simplest way to apply a list of functions to a list of values. Basically, a nested lapply. For example, here we apply sd and mean to built in data set trees: funs <- list(sd=sd, mean=mean) sapply(funs, function(x) sapply(trees,...

Produce multiple ggplot boxplots through apply functions

r,ggplot2,apply
I was wondering if it was possible to produce a set of boxplots similar to that produced by this nested loop combinations using an apply function. It may not be possible/necessary but I thought it should be possible, I just can't wrap my head around how to do it. I...

Can't overload apply method in Scala Implicit class

scala,apply,implicit
I am writing a retry function with async and await def awaitRetry[T](times: Int)(block: => Future[T]): Future[T] = async { var i = 0 var result: Try[T] = Failure(new RuntimeException("failure")) while (result.isFailure && i < times) { result = await { Try(block) } // can't compile i += 1 } result.get...

R apply Return submatrix

r,matrix,syntax,apply
Given this code: test=matrix(c(1,2,3,4,5,6,7,8,9,10,11,12),4) splitData=data.frame(first=c(1,3),second=c(2,4)) apply(splitData,1,function (x) {test[x[1]:x[2],]}) I get this matrix: [,1] [,2] [1,] 1 3 [2,] 2 4 [3,] 5 7 [4,] 6 8 [5,] 9 11 [6,] 10 12 Why don't I get a list of matrices? Intended result: [[1]] [,1] [,2] [,3] [1,] 1 5 9...
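A sketch of why this happens and one way around it, assuming base R: apply() simplifies results of equal length into a single matrix, so each 2 x 3 submatrix is flattened into one column of six values. Map() (or lapply()) never simplifies and therefore returns a list. The name pieces is illustrative.

```r
test <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), 4)
splitData <- data.frame(first = c(1, 3), second = c(2, 4))

# Map() walks the two columns in parallel and always returns a list.
# drop = FALSE keeps a matrix even when a range covers a single row.
pieces <- Map(function(from, to) test[from:to, , drop = FALSE],
              splitData$first, splitData$second)
print(pieces)
```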

R: applying a function over a group

r,aggregate,apply
I am looking to apply a function to a data frame and then store the results of that function in a new column in the data frame. Here is a sample of my data frame, tradeData: Login AL Diff a 1 0 a 1 0 a 1 0 a 0...

Fill in elements of Pandas DataFrame element by element

python,pandas,dataframes,apply
I have a data frame that needs to be re-represented. The original data frame has each row as a unique search term and the columns are all the resulting products. So each row is a different length. I want to turn this into a rectangular dataframe (called rectangle in the...

Pandas speedup apply on max()

python,pandas,apply,pandasql
I'd like to know how I may speed up the following function, e.g. with Cython? def groupby_maxtarget(df, group, target): df_grouped = df.groupby([group]).apply(lambda row: row[row[target]==row[target].max()]) return df_grouped This function groups by a single column and returns all rows where each group's target achieves its max value; the resulting dataframe is returned....

R Applying vector to vector function for matrix to matrix

r,function,matrix,vector,apply
I am using distHaversine, which takes two points and gives a distance, i.e. distHaversine(c(35,-75),c(35.1,-74.9)) prints: [1] 11501.11 I have two matrices, A and B, that are (m by 2) and (n by 2), i.e. A has m points and B has n points. How can I use distHaversine on A...

Using lapply over list of data.tables to assign list member name as variable

r,list,data.table,apply
I have a list of data.tables library(data.table) set.seed(27) test <- list() test$a <- data.table(x = rnorm(n = 10), y = rnorm (n = 10)) test$b <- data.table(x = rnorm(n = 10), y = rnorm (n = 10)) Each member of the list has a unique name test In preparation to...

applying a Function to Each element in an array (O'Reilly cookbook example)

arrays,function,each,element,apply
for nested data. I tried <?php $names = array('firstnames' => array("Baba", "Billy"), 'lastnames' => array("O'Riley", "O'Reilly")); array_walk_recursive($names, function (&value, $key) { $value = htmlentities($value, ENT_QOUTES); }) foreach ($names as $nametypes) { foreach ($nametypes as $name) { print "$name\n"; } } ?> (An example from the book O'reilly PHP Cookbook 3rd...

Way to vectorize this loop? Multiply two matrices, store information, do this many times without looping

r,loops,matrix,vectorization,apply
Suppose (small numbers in this example) I have an array that is 3 x 14 x 5 call this set.seed(1) dfarray=array(rnorm(5*3*14,0,1),dim=c(3,14,5)) I have a matrix that corresponds to this and is 39 (which is 13*3) x 14 Call this matrix: dfmat = matrix(rnorm(13*3*14,0,1),39,14) dfmat = cbind(dfmat,rep(1:3,13)) dfmat = dfmat[order(dfmat [,15]),]...

Apply function to dataframe with changing argument

r,apply
I have 2 objects: A data frame with 3 variables: v1 <- 1:10 v2 <- 11:20 v3 <- 21:30 df <- data.frame(v1,v2,v3) A numeric vector with 3 elements: nv <- c(6,11,28) I would like to compare the first variable to the first number, the second variable to the second number...
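A possible sketch using mapply(), which pairs each column of df with the matching element of nv. The excerpt does not say which comparison is wanted, so the ">" used here is an assumption, as is the name above.

```r
v1 <- 1:10
v2 <- 11:20
v3 <- 21:30
df <- data.frame(v1, v2, v3)
nv <- c(6, 11, 28)

# A data frame is a list of columns, so mapply() visits
# (v1, 6), (v2, 11), (v3, 28) in turn.
# Assumed comparison: is each value greater than its column's number?
above <- mapply(function(column, threshold) column > threshold, df, nv)

print(colSums(above))  # how many values exceed the threshold per column
```

Because every comparison returns ten values, mapply() simplifies the result to a 10 x 3 logical matrix with the original column names.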

Which R implementation gives the fastest JSD matrix computation?

r,performance,algorithm,matrix,apply
JSD matrix is a similarity matrix of distributions based on Jensen-Shannon divergence. Given a matrix m whose rows represent distributions, we would like to find the JSD distance between each pair of distributions. The resulting JSD matrix is a square matrix with dimensions nrow(m) x nrow(m). This is a triangular matrix where each element contains JSD...

Create column using function

r,apply
I am having problems with trying to create a new column using a conditional calculation based on a function. I have some small datasets that are used to interpolate a reference temperature (Tref) based on altitude (CalcAlt). The function works when I try to do a single calculation but I...

How to generalize mapply to work “crosswisely”?

r,function,vectorization,apply,mapply
Mapply applies a 2-dimensional function to the 1st elements of each m-dimensional vector, and then to the 2nd elements of each, etc. The result is an m-dimensional vector. For example > mapply(sum, 1:5, 12:16) [1] 13 15 17 19 21 Now, is there a DIRECT alternative to mapply that applies...
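If "crosswise" means applying the function to every combination of elements (an assumption, since the excerpt is cut off), outer() is the closest base R counterpart. outer() expects a vectorised function, so a function like sum() needs wrapping with Vectorize(). The names pairwise and crosswise are illustrative.

```r
# Element-wise, as in the question.
pairwise <- mapply(sum, 1:5, 12:16)

# Every combination: entry [i, j] is sum(x[i], y[j]).
# Vectorize() makes the two-argument function safe for outer().
crosswise <- outer(1:5, 12:16, Vectorize(function(a, b) sum(a, b)))

print(pairwise)
print(crosswise)
```

The diagonal of the outer() result reproduces the mapply() output, which is a quick way to check that the two calls agree.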

Apply CASE WHEN in sqldf statement for manipulating multiple columns

r,data.frame,apply,sqldf
I have a dataframe datwe with 37 columns. I am interested in converting the integer values(1,2,99) in columns 23 to 35 to character values('Yes','No','NA'). datwe$COL23 <- sqldf("SELECT CASE COL23 WHEN 1 THEN 'Yes' WHEN 2 THEN 'No' WHEN 99 THEN 'NA' ELSE 'Name ittt' END as newCol FROM datwe")$newCol I...

Cleanest iteration/functional application on Pandas Dataframe regardless of length

python,pandas,apply
I constantly struggle with cleanly iterating or applying a function to Pandas DataFrames of variable length. Specifically, a length 1 DataFrame slice (Pandas Series). Simple example, a DataFrame and a function that acts on each row of it. The format of the dataframe is known/expected. def stringify(row): return "-".join([row["y"], str(row["x"]),...

arithmetic sequence: use apply or map instead of for loop

javascript,sequence,apply
I have a for loop like so: var myary = []; for(i=0; i<3; i++){ myary[i] = i; } //yields [0, 1, 2] I'd like to accomplish the same with myary.apply() or a functional equivalent, but I am not familiar with generating arithmetic sequences via functional methods in JavaScript. Is this...

Creating a matrix of multiple counters in R

r,counter,apply
So, my goal is to take an input vector and to make an output matrix of different counters. So every time a value appears in my inputs, I want to find that counter and iterate it by 1. I understand that I'm not good at explaining this, so I illustrated...

Scheme: Using apply on a list of tuples

scheme,apply
How do I use apply in Scheme to multiply the first element of each tuple by a number? Example, if my list x = ( (1 2) (3 4) ) I want to do something like: (apply * 2 (car x)) so that it would return ( (2 2) (6...

How to apply a function to multiple columns of a data frame aggregated by some other columns?

r,dataframes,apply
I have a data frame df with four columns, e.g. A B C D x a 1 3 x a 3 4 x b 5 5 x b 6 8 y a 6 5 y a 8 9 y b 7 0 y b 4 2 I want to aggregate...

Reshape start-end time intervals to smaller intervals in R

r,apply,reshape,seq
Here is duration data by time intervals. id <- c("A", "B", "B", "B", "C", "C", "D", "E", "F", "F", "F", "F") start <- c(368, 200, 230, 788, 230, 521, 272, 306, 0, 162, 337, 479) end <- c(373.98, 229.98, 233.98, 842.98, 239.98, 639.98, 285.98, 306.98, 95.98, 162.98, 339.98, 539.98) value...

Use Apply family function to create multiple new dataframe

r,apply
I am new to the R programming language and currently I am working on some financial data. The problem is a bit complicated to describe so I think it's better to start it step by step. First here is a small portion of the master dataframe (named log_return) I am working on: Date AUS.Yield...

Using “apply” to apply a function to a matrix where parameters are column-specific

r,apply
I'm trying to avoid using loops by using apply to apply a user-defined function to a matrix. The problem I have is that there are additional parameters that my function uses and they differ for each column of the matrix. Below is a toy example. Say I have the following...

Using backticks and operators in apply family functions

r,data.frame,operators,apply
I saw in a recent answer an apply family function with assignments built-in and can't generalize it. lst <- list(a=1, b=2:3) lst $a [1] 1 $b [1] 2 3 This can't yet be made into a data.frame because of the unequal lengths. But by coercing the max length to the...

Unexpected behavior of apply v. for loop in R

r,bigdata,apply
I want to use apply instead of a for loop to speed up a function that creates a character string vector from paste-collapsing each row in a data frame, which contains strings and numbers with many decimals. The speed up is notable, but apply forces the numbers to fill the...

Using lapply to list percentage of null variables in every column in R

r,apply,lapply,mapply
I was given a large csv that is 115 columns across and 1000 rows. The columns have a variety of data, some is character-based, some is integer, etc. However, the data has a LOT of null variables of varying types (NA, -999, NULL, etc.). What I want to do is...

apply.daily Error: length of 'dimnames' [1] not equal to array extent in R

r,apply,xts
It occurs when I use apply.daily for an asset that has a total of 10 days worth of intraday data called rs packages needed are: library("xts") library("highfrequency") Where the error occurs: ts <- apply.daily(rs,function(x){ aggregatets(x ,on="minutes", k=15)}) ** REPRODUCIBLE DATA ** rs <- structure(c(222950, 222880, 222960, 222975, 222800, 222750, 222769,...

New Column with query result in SQL Server

sql,sql-server,count,apply
I need to add a new column with a query result. I have this query: SELECT DISTINCT Arrival , Flight , TotalPax.SumPassengers , TotalPaxLocal.SumLocalPassengers , STD , STA --, PassengerID --, Departure --, JourneyNumber --, SegmentNumber --, LegNumber --, InventoryLegKey --, RecordLocator FROM #TempLocalOrg tmp CROSS APPLY ( SELECT COUNT(1) AS SumPassengers FROM...

Apply a list of n *expressions* to each row of a dataframe?

r,apply,lapply,mapply
In short, I have a list of expressions that I want to apply to each row of a dataframe. This is very similar to this question, but there is a subtle difference in that I do not have a list of functions, but have a list of expressions. Here's what...

monthly means with apply for multidimensional arrays

r,aggregate,apply
I want to compute the mean over the third dimension of a multidimensional array. As this dimension is supposed to be the time, I wanted to compute monthly means. For that, I tried to use apply, but I am not sure where the problem is. Let's say my data is as...

Pandas DataFrame apply function doubling size of DataFrame

python,function,pandas,apply
I have a Pandas DataFrame with numeric data. For each non-binary column, I want to identify the values larger than its 99th percentile and create a boolean mask that I will later use to remove the rows with outliers. I am trying to create this boolean mask using the apply...

R - Replace a double loop by a function from the apply family

r,loops,apply
I have these loops : xall = data.frame() for (k in 1:nrow(VectClasses)) { for (i in 1:nrow(VectIndVar)) { xall[i,k] = sum(VectClasses[k,] == VectIndVar[i,]) } } The data: VectClasses = Data Frame containing the characteristics of each class VectIndVar = Data Frame containing each record of the data base The two...

Rolling a function on a data frame

python,pandas,dataframes,apply
I have the following data frame C. >>> C a b c 2011-01-01 0 0 NaN 2011-01-02 41 12 NaN 2011-01-03 82 24 NaN 2011-01-04 123 36 NaN 2011-01-05 164 48 NaN 2011-01-06 205 60 2 2011-01-07 246 72 4 2011-01-08 287 84 6 2011-01-09 328 96 8 2011-01-10 369...

R: How to compute the mean by ID in a given data frame?

r,apply
I have the following data: ID Value 1 3 1 5 How can I compute the mean by ID, and put the mean in the data frame as a new variable such that it is repeated for the same ID. The result should look like this: ID Value Mean 1...
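A minimal sketch with base R's ave(), which computes a group statistic (the mean by default) and returns it repeated for every row of the group. The excerpt shows only two rows, so the second ID below is made up for illustration.

```r
# First two rows follow the question; the rows with ID 2 are hypothetical.
df <- data.frame(ID = c(1, 1, 2, 2), Value = c(3, 5, 4, 8))

# ave() returns a vector as long as Value, holding each row's group mean.
df$Mean <- ave(df$Value, df$ID)
print(df)
```

A different statistic can be supplied through the FUN argument, for example ave(df$Value, df$ID, FUN = median).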

Using the apply / (array subscripting) method in a builder pattern

arrays,scala,apply
Given a trivial function returning an array: scala> def methodReturnsArray() = { Array(1.0, 2.0) } methodReturnsArray: ()Array[Double] We can go ahead and invoke the function: scala> val myarr = methodReturnsArray myarr: Array[Double] = Array(1.0, 2.0) scala> myarr(0) res21: Double = 1.0 However, it is not possible to use the apply...

Scala apply with no parameters

scala,subclass,apply
I want to create a program that generates random things when requested, such as letters in the below example. I've defined an abstract class with a companion object to return one of the subclasses. abstract class Letter class LetterA extends Letter { override def toString = "A" } class LetterB...

R: faster alternative of period.apply

r,time-series,apply
I have the following data prepared Timestamp Weighted Value SumVal Group 1 1600 800 1 2 1000 1000 2 3 1000 1000 2 4 1000 1000 2 5 800 500 3 6 400 500 3 7 2000 800 4 8 1200 1000 4 I want to calculate for each group...

split dataframe in groups before each non-NA

r,split,subset,apply,na
I am looking to split my dataframe into subsets according to the column "Height" with each subset having one row with a value and 0-Inf rows with NAs. This is, to be able to apply functions to the subsets afterwards, specifically order the rows according to their "Diameter" value,...

Apply function over columns of dataframe in R, compile results

r,for-loop,apply,lapply
I've searched here and on Google and haven't found an answer that I can apply to my situation. Let's say I have a dataframe with columns for Element 1, Element 2, Element 3, Metric, Other. I have another internal function that has three arguments (input_dataframe, element_position, metric_position) that I use...

How to convert frequency distribution to probability distribution in R

r,matrix,probability,apply,frequency-distribution
I have a matrix with n rows of observations. Observations are frequency distributions of the features. I would like to transform the frequency distributions to probability distributions where the sum of each row is 1. Therefore each element in the matrix should be divided by the sum of the row...
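Two equivalent base R sketches on a small made-up matrix (m, p and p_alt are illustrative names): prop.table() with margin = 1 divides each row by its sum, and plain division by rowSums() gives the same result because R recycles the row sums down the columns.

```r
# Hypothetical frequency matrix: two observations, three features.
m <- matrix(c(1, 3, 2,
              2, 1, 5), nrow = 2, byrow = TRUE)

# Row-wise proportions; margin = 1 means "per row".
p <- prop.table(m, margin = 1)

# Same result: each column is divided element-wise by the row sums.
p_alt <- m / rowSums(m)

print(p)
print(rowSums(p))
```

Rows that sum to zero would produce NaN under either variant, so they may need separate handling.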

Improving run time for R with nested for loops

r,runtime,apply
My reproducible R example: f = runif(1500,10,50) p = matrix(0, nrow=1250, ncol=250) count = rep(0, 1250) for(i in 1:1250) { ref=f[i] for(j in 1:250) { p[i,j] = f[i + j - 1] / ref-1 if(p[i,j] == "NaN") { count[i] = count[i] } else if(p[i,j] > (0.026)) { count[i] = (count[i]...

Using ifelse Within apply

r,if-statement,nested,data.frame,apply
I am trying to make a new column in my dataset give a single output for each and every row, depending on the inputs from pre-existing columns. In this output column, I desire "NA" if any of the input values in a given row are "0". Otherwise (if none of...

combine the i:th matrix from N lists

r,list,matrix,apply
I would like to combine large rectangular matrices stored in multiple lists. E.g. rbind.fill.matrix {plyr} the i:th matrix from all N lists. The number of matrices n within each list is equal across N lists. #Dummy data using N=2, n=2 # binary matrices ls1 <- replicate(n=2, list(matrix(rbinom(1,0.5,n=20), nrow=2))) ls2 <-...

apply a function to a matrix of lists in R

r,list,matrix,apply
I have a matrix of lists. How do I apply a function to each set of the lists and return a matrix of the same dimensions as my original matrix? I tried apply(X=data.matrix , MARGIN=c(1,2) , function(x) min(x$P) ) but it returned Error in min(x$P) : (converted from warning) no...

Conditionally apply a function with values over a certain value

r,apply
I'm sure there is an easy solution to this but I cannot seem to output the correct values. I have a dataframe and I would like to calculate an average based on values above a certain value, in this case 150. df1 <- as.data.frame(matrix(sample(0:1000, 36*10, replace=TRUE), ncol=1)) df2 <- as.data.frame(matrix(sample(0:500,...

Averaging values between paired columns across a large data frame

r,apply,mean,sapply,tapply
I have a dataframe consisting of a series of paired columns. Here is a small example. df1 <- as.data.frame(matrix(sample(0:1000, 36*10, replace=TRUE), ncol=1)) df2 <- as.data.frame(rep(1:12, each=30)) df3 <- as.data.frame(matrix(sample(0:500, 36*10, replace=TRUE), ncol=1)) df4 <- as.data.frame(c(rep(5:12, each=30),rep(1:4, each=30))) df5 <- as.data.frame(matrix(sample(0:200, 36*10, replace=TRUE), ncol=1)) df6 <- as.data.frame(c(rep(8:12, each=30),rep(1:7, each=30))) Example <-...

How to replace NA values in a data.table with na.spline

r,data.table,apply,spline
I'm trying to prepare some demographic data retrieved from Eurostat for further processing, amongst others replacing any missing data with corresponding approximated ones. First I was using data.frames only, but then I got convinced that data.tables might offer some advantages over regular data.frames, so I migrated to data.tables. One thing...

Finding closest value by ID with unequal lengths

r,apply
I have a data frame and a vector of unequal lengths. They do not share an id. df <- data.frame( id = factor(rep(1:24, each = 10)), x = runif(20)*100 ) a <- sort(runif(100*100)) Now, I would really like to run over each row of the data frame and find the location...

Fitting fevd function to list R

r,list,dataframes,apply
I have data for 90 climate stations. For each station, I have made 100+ simulations using a statistical model. So, in R, I have 90 dataframes, each dataframe has 100+ simulations arranged column-wise. Now, I would like to fit an extreme value distribution (EVD) to each climate station. That is,...

Problems with apply R

r,svm,apply
I have a problem with using the apply function in R. I made the following function: TrainSupportVectorMachines <- function(trainingData,kernel,G,C){ ####train the model fit<-svm(Device~.,data=trainingData,kernel=kernel,probability=TRUE, gamma =G, costs=C) return(fit); } I want to train the model with different values of Cost(c). Therefore, I tried the following command: cst = matrix(2^(-4:-2),ncol=3) kernl =...

Elastic Search : General and conditional filters

elasticsearch,conditional,apply,exists
I'm using Elastic Search, with query match_all and filtering. In my situation I want to apply a general filter and filters by condition. Here in pseudo: query: match all (works fine) filter range date between d1 and d2 (works fine without bullet 3) filter (apply only if field exists, but...

Call data.frame columns inside of R functions?

r,function,apply
What is the proper way to do this? I have a function that works great on its own given a series of inputs and I'd like to use this function on a large dataset rather than singular values by looping through the data by row. I have tried to update...

R Code: using UDF with multiple arguments for apply function

r,user-defined-functions,apply,udf,multiple-arguments
My UDF: testfn = function(x1, x2, x3){ if(x1 > 0){y = x1 + x2 + x3} if(x1 < 0){y = x1 - x2 - x3} return(y) } My Sample Test set: test = cbind(rep(1,3),c(2,4,6),c(1,2,3)) Running of apply: apply(test, 1, testfn, x1 = test[1], x2 = test[2], x3 = test[3]) This...
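A sketch of one way to feed each row's three values into testfn, assuming that is the intent. apply() passes a whole row as a single vector, so a small wrapper can unpack it into x1, x2 and x3. The name res is illustrative; testfn and test are as in the question.

```r
testfn <- function(x1, x2, x3) {
  if (x1 > 0) { y <- x1 + x2 + x3 }
  if (x1 < 0) { y <- x1 - x2 - x3 }
  return(y)
}

test <- cbind(rep(1, 3), c(2, 4, 6), c(1, 2, 3))

# Each row arrives as one vector r; unpack it into the three arguments.
res <- apply(test, 1, function(r) testfn(r[1], r[2], r[3]))
print(res)
```

mapply(testfn, test[, 1], test[, 2], test[, 3]) would be an alternative that skips the wrapper. Note that testfn as written has no branch for x1 == 0.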

Using apply for simulation instead of nested for loops

r,for-loop,simulation,apply
I reproduced in R a simulation that was originally done in Stata. I used 'for' loops since this is the only way I know how to make this work. It takes quite a long time to run, so I would like to use one of the 'apply' commands instead to...

Mean by levels of factor in R, append as new column [duplicate]

r,apply,mean
This question already has an answer here: Joining aggregated values back to the original data frame 5 answers I have what I fear may be a simple problem, to which I almost have the solution (indeed, I do have a solution, but it's clumsy). I have a data frame...

Apply + lubridate returns numeric

r,apply,lubridate
I have a data set that looks like this birds[,1:3] Source: local data frame [15 x 3] year month day 1 2015 5 13 2 2015 5 14 3 2015 5 15 4 2015 5 16 5 2015 5 17 6 2014 5 28 7 2014 5 29 8 2014...

Create a matrix by comparing differently sized vectors (without a for loop)

r,apply
I'm starting out in R, and pretty sure this is achievable via one of the apply functions. I have two differently sized vectors, a <- c('A', 'B', 'C') and b <- c('A', 'B', 'C', 'D', 'E'). I want to compare the values of a and b, and where they match,...
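One loop-free option, sketched under the assumption that the desired matrix marks matches with TRUE and everything else with FALSE (the excerpt is cut off before saying what a match should produce): outer() with "==" compares every element of a with every element of b. The name hits is illustrative.

```r
a <- c('A', 'B', 'C')
b <- c('A', 'B', 'C', 'D', 'E')

# Entry [i, j] is TRUE when a[i] equals b[j].
hits <- outer(a, b, "==")

# Label rows and columns with the compared values for readability.
dimnames(hits) <- list(a, b)
print(hits)
```

If numbers are wanted instead of logicals, hits * 1 converts TRUE/FALSE to 1/0.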

Preventing apply from returning a vector?

r,apply
I'm running a large number of data frames with variable dimensions through a series of apply() calls that look something like the code below. df1 = t(data.frame('test'=c(0,0,1,0))) df1 = apply(df1,2,function(j){sub(0,'00',j)}) df1 = apply(df1,2,function(j){sub(1,'01',j)}) df1 = apply(df1,2,function(j){sub(2,'10',j)}) df1 In some rare cases where the data frame is size 1xn the first...
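A possible workaround, sketched on the 1 x 4 case from the question: sub() is already vectorised, so it can run on the whole matrix without apply(), and assigning into df1[] keeps the dim attribute regardless of shape.

```r
df1 <- t(data.frame(test = c(0, 0, 1, 0)))  # 1 x 4 matrix

# Assigning into df1[] replaces the contents but keeps the dimensions,
# so a 1 x n input stays a 1 x n matrix.
df1[] <- sub("0", "00", df1)
df1[] <- sub("1", "01", df1)
df1[] <- sub("2", "10", df1)

print(df1)
print(dim(df1))
```

The collapse in the original code comes from apply() simplifying results of length one into a plain vector; skipping apply() sidesteps that.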

apply() not assigning values

r,apply
I have a subset of a data frame of 16 columns. They are all factors, with the same levels and labels. I am trying to use one of the apply() functions to assign the levels and labels at once, but my function is printing the results rather than assigning them...

Percentage values after matrix multiplication

r,matrix,apply,matrix-multiplication
By matrix multiplication I get the following matrix, which, let's say, shows how many customers who purchased product A, sooner or later, also purchased product B, product C and so on. Obviously, the diagonal values represent 100% of all purchases of a particular product. I'm looking for a way of...

Nesting aggregate within apply to aggregate multiple columns by multiple variables in R

r,aggregate,nested-loops,apply,summary
I have a dataframe with sets of scores, and sets of grouping variables, something like: s1 s2 s3 g1 g2 g3 4 3 7 F F T 6 2 2 T T T 2 4 9 G G F 1 3 1 T F G I want to run an...