Appending a data frame with for, if and else statements, or how do I put print in a dataframe

r,loops,data.frame,append
How do I put what I printed in a dataframe with a for loop and if else statements? Basically, this code: list<-c("10","20","5") for (j in 1:3){ if (list[j] < 8) print("Greater") else print("Less") }) #[1] "Less" #[1] "Less" #[1] "Greater" Or should it be something more like this? f3 <-...

Plotting a data frame in R

r,graph,data.frame
I have this data frame and I'd like to know if there's a way to plot this using the ggplot2 library (or anything that works). The first row has a bunch of zip codes and the second row contains weather data (temperature in this case) associated with the corresponding zip...

Applying a function to each quantile of an R dataframe

r,data.frame,quantile
I have an R dataframe and I want to apply an estimation function for each of its quantiles. Here's an example with lm(): df <- data.frame(Y = sample(100), X1 = sample(100), X2 = sample(100)) estFun <- function(df){lm(Y ~ X1 + X2, data = df)} If I split that in two...

Subset data frames inside of a list based on column classes

r,list,data.frame,subset
I have a very large list comprised of data frames, every element of the list is a different data frame, where each column is comprised of different types of variables, and data frames of different lengths. I want to subset the data frames in this list, and keep only those...

R: Volatility function that interprets NAs

r,data.frame,time-series,na
I am looking for help with getting a volatility function to work with my dataframe. In the function below, I'm just trying to get price daily log returns for each security (each column in my data is a different security's prices over time), and then calculate an annualized vol. volcalc=...

R table manipulation

r,table,data.frame,row
I have a data.frame as below PRODUCT=c(rep("A",4),rep("B",2)) ww1=c(201438,201440,201444,201446,201411,201412) ww2=ww1-6 DIFF=rep(6,6) DEMAND=rep(100,6) df=data.frame(PRODUCT,ww1,ww2,DIFF,DEMAND) df<- df[with(df,order(PRODUCT, ww1)),] df PRODUCT ww1 ww2 DIFF DEMAND 1 A 201438 201432 6 100 2 A 201440 201434 6 100 3 A 201444 201438 6 100 4 A 201446 201440 6 100 5 B 201411 201405 6...

Paste the elements of two columns [duplicate]

r,performance,data.frame,paste,rcpp
This question already has an answer here: Speedy/elegant way to unite many pairs of columns 3 answers I have a data.frame of the following kind set.seed(12) d = data.frame(a=sample(5,x=1:9), b=sample(5,x=1:9), c=sample(5,x=1:9), d=sample(5,x=1:9), e=sample(5,x=1:9), f=sample(5,x=1:9)) d # a b c d e f # 1 1 1 4 4 2...

Combining time series data into a single data frame

r,date,data.frame,time-series
I have multiple data frames that look like this: > head(Standard.df) Count.S Date Month Week Year 552 15 2008-01-01 2008-01-01 2007-12-31 2008-01-01 594 11 2008-01-02 2008-01-01 2007-12-31 2008-01-01 1049 10 2008-01-03 2008-01-01 2007-12-31 2008-01-01 511 12 2008-01-04 2008-01-01 2007-12-31 2008-01-01 717 10 2008-01-06 2008-01-01 2007-12-31 2008-01-01 1744 3 2008-01-07 2008-01-01...

read multiple CSV and add to list in a loop in R

r,csv,data.frame
I have multiple CSV files and want to read them in R. the file names are provided as an argument, so I don't know the names in advance. This is the reason, why I do it in a loop. Next, I want each dataframe to be appended to a list....

Getting list from row R

r,data.frame,xls
I have a file, which you can check here: http://www.nbp.pl/kursy/Archiwum/archiwum_tab_a_1999.xls I want to get the first row of this file into a list. When I do this: dane <- read.xls("http://www.nbp.pl/kursy/Archiwum/archiwum_tab_a_1999.xls") names(dane) I received a list, but with some weird values like X1, X2. I want a list of these elements: Nr / No. Data / Date...

Separating one column into six after loading a .txt file

r,data.frame
I know this question has probably been asked quite often, but I am still having a lot of trouble with it. So far what I have is week1.txt. I am able to load the data into R with week1 = read.csv("week1.txt") but the file is created so that the values...

How do I draw a better quality graph for this example using R ggplot() function? [closed]

r,ggplot2,data.frame
I have a data frame, that comprises of three Categorical and one Continuous variable and I have converted into long format using melt and it is as shown below. data_long <- structure(list(Year = c(2010L, 2010L, 2010L, 2011L, 2011L, 2011L, 2012L, 2012L, 2012L, 2012L, 2013L, 2013L, 2013L, 2014L, 2014L, 2014L, 2010L,...

Events in last 21 days for every row by Name

r,data.frame,data.table,dplyr
This is what my dataframe looks like. The two rightmost columns are my desired columns. These two columns check the condition whether in the last 21 days there is an "Email" ActivityType and whether in the last 21 days there is a "Webinar" ActivityType. Name ActivityType ActivityDate Email(last21days) Webinar(last21day)** John Email...

handling NA values in apply functions returning more than one value

r,data.frame,sapply
I have dataframe df with two columns col1, col2, includes NA values in them. I have to calculate mean, sd for them. I have calculated them separately with below code. # Random generation set.seed(12) df <- data.frame(col1 = sample(1:100, 10, replace=FALSE), col2 = sample(1:100, 10, replace=FALSE)) # Introducing null values...

Apply rolling function across list of data frames

r,list,data.frame
I have a list of data frames, with each data frame corresponding to a date and the rows of each data frame corresponding to hourly periods in the day. I need to apply a rolling function over equivalent time stamps for every day. For example, for a rolling 5 period...

multidimensional arrays of data frames

r,multidimensional-array,data.frame
I would like to define a 3D array of data.frame(A,B,C,...) so that I can do for (x in 1:4) for (y in 1:5) for (z in 1:5) { m[x,y,z]$A <- dnorm(1) m[x,y,z]$B <- dnorm(1) m[x,y,z]$C <- dnorm(1) } It would be OK too if I get a data.frame(x,y,z,A,B,C) with x,y,z ids and...

Changing values between two data.frames in R

r,data.frame
I have the following data.frames(sample): >df1 number ACTION 1 1 this 2 2 that 3 3 theOther 4 4 another >df2 id VALUE 1 1 3 2 2 4 3 3 2 4 4 1 4 5 4 4 6 2 4 7 3 . . . . . ....

Error when I used R to Convert basket data format to single format

r,data.frame,reshape
I want to convert a basket format to single,and used the code by "Convert basket to single", but an error happened as below: > Data <- read.table("r9.txt") > Data V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 1 A3322 neutral 157 158 159 160 161 162 163...

XML to data frame with missing nodes

xml,r,data.frame
Versions of this question have been asked before, as here and here. However, I still can't get it to work. I'm trying to parse an XML document into a data frame. The problem is that the some of the variables are not present for some of the observations, so I'm...

An error while looping a linear regression

r,loops,data.frame,regression
I would like to run a loop that runs for each category of one of the variables and produces a prediction for each regression, so that the sum of the prediction variable will be deduced from the target variable. Here is my toy data and code: df <- read.table(text...

reorder factor levels by data in R

r,data.frame
Suppose I have a factor whose levels are "1/1/2013" "1/1/2014" "1/1/2015" "10/1/2012" "10/1/2013" "10/1/2014" "4/1/2013" "4/1/2014" "4/1/2015" "7/1/2012" "7/1/2013" "7/1/2014" What is the easiest way to resort this by date? I know I can do this manually by picking and choosing... Thanks!...

R - convert summary to data.frame

r,data.frame
I'm new to R. I have this admission_table containing ADMIT, GRE, GPA and RANK. > head(admission_table) ADMIT GRE GPA RANK 1 0 380 3.61 3 2 1 660 3.67 3 3 1 800 4.00 1 4 1 640 3.19 4 5 0 520 2.93 4 6 1 760 3.00 2...

Resetting TIME column when AMT > 0

r,data.frame
I have a data frame that looks like this: ID TIME AMT 1 0 50 1 1 0 1 2 0 1 3 0 1 4 0 1 4 50 1 5 0 1 7 0 1 9 0 1 10 0 1 10 50 The TIME column in the...

R extract matching values from list of data frames

r,list,data.frame
I have a relatively large amount of data stored in a list of data frames with several columns. For each element of the list I wish to check one column against a reference and if present extract the value held in another column of the same element and place in...

R text manipulation

r,table,data.frame
I have a dataframe as below. I want to take ww1 column and create a new column newww1 as follows: my excel formula is =2012&TEXT((LEFT(201438,4)-2012)*53+RIGHT(201438,2),"0000") where instead of 201438 I will have a value from column ww1 the explanation of my formula is: take left 4 characters of ww1 subtract...

Convert Comma-Separated Column to Columns with Booleans

r,csv,data.frame
I have the following comma-separated data in one of my data.frame's columns called services. > dput(structure(df$services[1:5])) list("Global Expense Management, Company Privacy Policy", "Removal Services, Global Expense Management", "Removal Services, Exception & Cost Admin, Global Cost Estimate, Company Privacy Policy", "Removal Services, Exception & Cost Admin, Ancillary Services, Global Cost Estimate,...

Iteratively adding new columns to a list of data frames

r,list,for-loop,data.frame
I am trying to write some code that iterates through a list of data frames, adding to each a new column that contains the same values as an older column but shifted by 1. The first value in this column will be NA. Below is my code: for(dataframe in 1:length(listOfDataFrames)){...

Creating a matrix based on a function in R

r,matrix,data.frame
I have a symmetric matrix (dimension: 12,000 X 12,000) named A and I want to create another one based on a formula, which depends on the elements position. To explain: I want to create the D matrix (based on the values from A) using the formula: Dij = 1 -...

R construct summary of values from columns

r,data.frame,unique
I would like to make an array that summarises the rows of a data frame with the unique values contained within said rows. With the following example code: ref <- c(1:8) data1 <- c("A","","C","","","","A","") data2 <- c("A","","","A","C","","","") data3 <- c("","B","","","","","","B") data4 <- c("A","B","","","","D","A","") initial.data <- data.frame(ref, data1, data2, data3,...

Replacing certain values in a data frame as NAs

r,data.frame,na
Suppose I have a data.frame names <- c("John", "Mark", "Larry", "Will", "Kate", "Daria", "Tom") gender <- c("M", "M", "M", "M", "F", "F", "M") mark <- c(1, 2, 3, 1, 2, 3, 1) df <- data.frame(names, gender, mark) df names gender mark 1 John M 1 2 Mark M 2 3...

Fastest way to check if dataframe is empty [duplicate]

r,data.frame,timing
This question already has an answer here: Determine if data frame is empty 1 answer What is the fastest (every microsecond counts) way to check if a data.frame is empty? I need it in the following context: if (<df is not empty>) { do something here } Possible solutions:...

R: Create a column with values based on grouping specific rows

r,group-by,data.frame
Here is a sample data frame, ID <- c(101,102,103,201,202,203,301,302,303,401,402,403) Point_A <- c(10,20,30,40,50,60,70,80,90,100,110,120) df <- data.frame(ID,Point_A) ID Point_A 1 101 10 2 102 20 3 103 30 4 201 40 5 202 50 6 203 60 7 301 70 8 302 80 9 303 90 10 401 100 11 402 110...

fill a data.table based on value in another data.table

r,data.frame,data.table
I'm very new to data.table but would like to solve my problem with it, as I have the feeling it would be 1000 times faster than with "regular" data.frames. Here is my problem: What I have: 2 data.tables dt1 and dt2 like so: dt1 <- data.table(SID=paste0("S", 1:15), Chromo=rep(1:3, e=5), PP=rep(1:5,...

Mean all other columns based on a single column in r

r,data.frame,unique,average
I have a large dataframe that has more than 40,000 columns and I am running into a problem similar to this Sum by distinct column value in R shop <- data.frame( 'shop_id' = c('Shop A', 'Shop A', 'Shop A', 'Shop B', 'Shop C', 'Shop C'), 'Assets' = c(2, 15, 7,...

Using ifelse Within apply

r,if-statement,nested,data.frame,apply
I am trying to make a new column in my dataset give a single output for each and every row, depending on the inputs from pre-existing columns. In this output column, I desire "NA" if any of the input values in a given row are "0". Otherwise (if none of...

Table week to date

r,data.frame
I have a data frame like: > df week month year x 1 1-7 sep 2013 566 2 8-14 sep 2013 65 3 15-21 sep 2013 144 4 22-28 sep 2013 455 5 29-30 sep 2013 1212 And need to convert it to: > df_out date x 1 01/09/2013 80.86...

how to calculate row means in a data frame?

r,data.frame,stat
I have a dataframe with 1000 columns and 8 rows, and I need to calculate row means. I tried this loop: final <- as.data.frame(matrix(nrow=8,ncol=1)) for(j in 1:8){ value<- mean(dataframe[j,]) final[j,]<-value } but got the following error: In mean.default(df2[j, ]) : argument is not numeric or logical: returning NA ...

Extracting a column from one dataset and creating another dataset with columns from a third dataset in R [duplicate]

r,data,merge,data.frame
This question already has an answer here: How to join (merge) data frames (inner, outer, left, right)? 8 answers So I have these two datasets: ID DOB ID2 count 1 4083 2007-10-01 3625 5 2 4408 2008-07-01 3603 2 3 4514 2007-07-01 3077 3 4 4396 2008-05-01 3413 5...

Best practice to get a dropped column in dplyr tbl_df

r,data.frame,dplyr
I remember a comment on r-help in 2001 saying that drop = TRUE in [.data.frame was the worst design decision in R history. dplyr corrects that and does not drop implicitly. When trying to convert old code to dplyr style, this introduces some nasty bugs when d[, 1] or d[1]...

Reshape a row to columns based on a condition in an R data frame

r,table,data.frame,reshape
I have this data frame in R: x 1 [email protected] 2 [email protected] 3 43 4 [email protected] 5 [email protected] 6 13 7 [email protected] 8 [email protected] 9 31 10 [email protected] 11 [email protected] 12 32 I would like to have a data frame with 3 columns, not just one: x y value 1...

Reshape a data frame

r,data.frame,reshape
I have the following data frame structure(list(X1 = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("1", "2", "3"), class = "factor"), V1 = c(1, NA, 1, 0, NA, NA, NA, NA, NA, NA, NA, NA), V2 = c(NA, NA, NA, NA, 0, 0.25,...

How to aggregate data with the dplyr package in R

r,data.frame,grouping,dplyr
I would like to understand how to write the following code using the dplyr package: averageStepsDayType <- aggregate( NAreplacement$steps, by=list(interval=NAreplacement$interval, dayType=NAreplacement$dayType), mean ) This is the original data frame: > head(NAreplacement) steps date interval dayType 1 1.7169811 2012-10-01 0 weekday 2 0.3396226 2012-10-01 5 weekday 3 0.1320755 2012-10-01 10 weekday...

Create date column from datetime in R

r,date,datetime,data.frame
I am new to R; I am an avid SAS programmer and am having a difficult time wrapping my head around R. Within a data frame I have a date-time column formatted as POSIXct, with values appearing as "2013-01-01 00:53:00". I would like...

apply a function on a data frame return incorrect result

mysql,sql,r,function,data.frame
I'm having a problem applying a function on a data frame where the function has some sql code. Consider the following dummy data frame: df <- read.table(header=T, text=' ID 20 21 22 ') when I'm trying to do this (after I establish a connection to the db and it works):...

R: Create 2 columns with difference and percentages values of another column

r,data.frame,data.table,plyr,dplyr
I have a dataframe like this ID <- c(101,101,101,102,102,102,103,103,103) Pt_A <- c(50,100,150,20,30,40,60,80,90) df <- data.frame(ID,Pt_A) +-----+------+ | ID | Pt_A | +-----+------+ | 101 | 50 | | 101 | 100 | | 101 | 150 | | 102 | 20 | | 102 | 30 | | 102 |...

Create data frame from TEI XML using xml2

xml,r,data.frame,tei
I'm trying to create a data frame of a TEI-XML version of Moby Dick using Hadley Wickham's xml2 package. I want the data frame to ultimately look like this (for all the words in the novel): df <- data.frame( chapter = c("1", "1", "1"), words = c("call", "me", "ishmael")) I'm...

Extract series of observations from dataframe for complete sets of data

r,loops,data.frame,pattern-matching,subset
I have a data frame of values composed of 5 variables (class in brackets) 1) DateTime (as.POSIXct), 2) ID (character), 3) Sensor 1 (numeric), 4) Sensor 2 (numeric), 5) Sensor 3 (numeric) This data was collected from 5 tagged fish. Each fish has one tag with 3 sensors on it,...

Name list of data frames from data frame

r,list,data.frame,names
I usually read a bunch of .csv files into a list of data frames and name it manually doing. #...code for creating the list named "datos" with files from library # Naming the columns of the data frames names(datos$v1r1)<-c("estado","tiempo","x1","x2","y1","y2") names(datos$v1r2)<-c(...) names(datos$v1r3)<-c(...) I want to do this renaming operation automatically. To...

How to save a data frame in R

r,save,data.frame
According to the answer to this question, you can save a data frame "foo" in R with the save() function as follows: save(foo,file="data.Rda") Here is data frame "df": > str(df) 'data.frame': 1254 obs. of 2 variables $ text : chr "RT @SchmittySays: I love this 1st grade #science teacher from...

Apply CASE WHEN in sqldf statement for manipulating multiple columns

r,data.frame,apply,sqldf
I have a dataframe datwe with 37 columns. I am interested in converting the integer values(1,2,99) in columns 23 to 35 to character values('Yes','No','NA'). datwe$COL23 <- sqldf("SELECT CASE COL23 WHEN 1 THEN 'Yes' WHEN 2 THEN 'No' WHEN 99 THEN 'NA' ELSE 'Name ittt' END as newCol FROM datwe")$newCol I...

How can I delete the columns that contain NA or have variance equal to 0

r,data.frame
I want to scale my data before doing a PCA, but unfortunately I found that some columns contain NA and the variance of some columns equals 0. I want to delete these columns. This is an example of my data df <- data.frame( v1 = 1:10 , v2 = rep(...

Performing Text Analytics on a text Column in Dataframe in R [closed]

r,data.frame,text-analysis
I have imported a CSV file into a dataframe in R and one of the columns contains Text. I want to perform analysis on the text. How do I go about it? I tried making a new dataframe containing only the text column. OnlyTXT= Txtanalytics1 %>% select(problem_note_text) View(OnlyTXT). ...

how to generate a linear regression matrix like cor()

r,data.frame,linear-regression
I have a dataframe like below : a1 a2 a3 a4 1 3 3 5 5 2 4 3 5 5 3 5 4 6 5 4 6 5 7 3 I want to do linear regression for every two columns in the dataframe, and set intercept as 0. In...

using the spread and/or dcast command

r,data.frame
Suppose I have the following data frame: Name Type Date Description CorrectionCode Bob X1 01/01 Desc1 394 Bob X2 01/01 Desc2 9348 Jim X3 03/04 Desc4 934 How would I get that into Name Type Date Description1 CorrectionCode1 Description2 CorrectionCode2 Bob X1 01/01 Desc1 394 Desc2 9348 Jim X3 03/04...

How to generate random numbers in a data.frame with range

r,random,data.frame,seq
I have a data.frame which I want to generate random numbers each list by a sequence. I used sample function to create random numbers but even I created random numbers for list [[1]], for set [[2]] same numbers produced again. So, here how can I create different random numbers for...

R: Transform 2 columns into a data frame based on counting column values

r,data.frame
I am trying to reshape 2 columns of a data frame into a new data frame. Both columns have text values. I need to count each combination of values and put them in a data frame. Below is an example of 2 columns I need to use: 2 Columns from...

Subtract from the previous row R [duplicate]

r,data.frame
This question already has an answer here: How to find the difference in value in every two consecutive rows in R? 3 answers I have a dataframe like so: df <- data.frame(start=c(5,4,2),end=c(2,6,3)) start end 5 2 4 6 2 3 And I want the following result: start end diff...

Filling a matrix from a dataframe with separate functions for diagonal and off-diagonal elements

r,matrix,data.frame
I have a following problem where I have a dataframe (df): df <- data.frame(inp = c("inp1", "inp2", "inp3"), A = c(1,2,3), B = c(1,2,3)) I need to construct a inp*inp square matrix from this dataframe that complies to certain formulas for diagonal and off-diagonal elements. The diagonal elements are calculated...

Combine two lists in R into a dataframe [duplicate]

r,list,data.frame
This question already has an answer here: R list to data frame 9 answers I have two lists in R of identical length and would like to combine them into a data frame with the total number of rows in the resulting data frame equivalent to the length of...

R: Avoid loop or row apply function

r,merge,data.frame,data.table
I have the following two data frames, df_sales and df_supply. I want to merge sales to supply in such a manner that my df_sales table has DATE_SUPPLY and QNT_SUPPLY from df_supply under the conditions below. *Condition: DATE_SUPPLY should be the most recent DATE_SUPPLY of the corresponding "ITEM" for the corresponding "STORE", i.e., DATE_SALE <- max(df_supply[df_supply$DATE_SUPPLY <=...

Using backticks and operators in apply family functions

r,data.frame,operators,apply
I saw in a recent answer an apply family function with assignments built-in and can't generalize it. lst <- list(a=1, b=2:3) lst $a [1] 1 $b [1] 2 3 This can't yet be made into a data.frame because of the unequal lengths. But by coercing the max length to the...

Replacing values in a huge dataframe using R

r,data.frame,subsetting
I have a huge dataframe (600,000 x 12,000) and I need to replace some values. I have tried as below, but it takes more than 3 hours: mydata[mydata == "AA"] <- 0 mydata[mydata == "AB"] <- 1 mydata[mydata == "BA"] <- 1 mydata[mydata == "BB"] <- 2 mydata[mydata == "--"]...

R creating a sequence table from two columns

r,data.frame,seq
I have a table as below product=c("a","b","c") min=c(1,5,3) max=c(1,7,7) dd=data.frame(product,min,max) > dd product min max 1 a 1 1 2 b 5 7 3 c 3 7 I want to create a table which will look like below. I want to create one row for each value between and including...

How to get a mean from data.frame?

xml,r,data.frame
I have a data.frame, which i got from this: data <- ldply(xmlToList("http://www.nbp.pl/kursy/xml/a025z100205.xml"),data.frame) I create a list like this : list <- data[[6]] then I deleted NA values list <- list[!is.na(list)] And I got this [1] 0,0900 2,9915 2,5851 0,3850 2,7805 2,0566 2,1043 4,0921 1,4918 2,7837 [11] 4,7009 0,3723 3,3450 0,1561...

Apply function iteratively across a dataframe

r,data.frame,tail,cbind
I have a two-part question for applying a function across a dataset in R. i) Firstly, I have 2 data frames that I would like to be combined and paired iteratively, so that something like a cbind function would line up the 1st columns of each data frame next to...

Parsing large XML file in R is very slow

xml,r,performance,xml-parsing,data.frame
I need to extract data from a large xml file in R. The file size is 60 MB. I use the following R code to download the data from the Internet: library(XML) library(httr) url = "http://hydro1.sci.gsfc.nasa.gov/daac-bin/his/1.0/NLDAS_NOAH_002.cgi" SOAPAction = "http://www.cuahsi.org/his/1.0/ws/GetSites" envelope = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<soap:Envelope xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"...

R - Conditional Substr from dataframe

r,data.frame,character,substring,substr
I need to substr from a column based on start and end locations. The start and end locations are derived from a character search. For example, a single column in Dataframe with 3 rows: 'Bond, Mr. :James' 'Woman, Mrs. :Wonder' 'Hood, Mr. :Robin' Expected Answer in Column 2 is: 'Mr.'...

R, create a new sorted dataframe with use of dplyr?

r,data.frame,subset
I am new to R and a bit overwhelmed by an assignment. I am asked to create a new dataframe out of an existing one (the diamonds data that comes preinstalled with ggplot2). The dataframe should look as follows: mean_price median_price min_price max_price n All sorted by clarity where...

How to get maximum value from a column in a data.frame and get ALL records

r,data.frame,max
I have a data.frame and I wish to get the row that contains the maximum value for the given column Total Txn_date Cust_no Acct_no cust_type Credit Debit Total 09DEC2013 17382 601298644 I 1500 0 1500 16DEC2013 17382 601298644 I 500 0 500 17DEC2013 17382 601298644 I 0 60 60 18DEC2013...

R - How to re-order row index number

r,vector,indexing,data.frame,row
Simply put, I have the following data frame: Signal 4 9998 3 549 1 18 5 2.342 2 0.043 and I want to reset the index numbers to be like : Signal 1 9998 2 549 3 18 4 2.342 5 0.043 ...

Split Variable and insert NA's in between

r,variables,split,data.frame
I have a variable which looks like this: Var [1] 3, 4, 5 2, 4, 5 2, 4 1, 4, 5 I need to split it into a dataframe which looks like this: V1 V2 V3 V4 V5 NA NA 3 4 5 NA 2 NA 4 5 NA 2...

Creating a dataframe from an lapply function with different numbers of rows

r,data.frame,lapply
I have a list of dates (df2) and a separate data frame with weekly dates and a measurement on that day (df1). What I need is to output a data frame within a year prior to the sample dates (df2) and the measurements with this. eg1 <- data.frame(Date=seq(as.Date("2008-12-30"), as.Date("2012-01-04"), by="weeks"))...

How can I send a list/vector/array of dataframes to a function in R

r,data.frame
I need to pass several dataframes, each of different dimensions, to another function and process each of these data frames one by one. How should I do this? I tried frame <- c(df1,df2,df3...) p <- function(frame) { for(i in 1:seq_along(frame) { do_something(frame[i]) ..... } } But this doesn't work....

Getting Values of Specific Elements of a data frame in R

r,vector,data.frame,r-factor
I have some very simple code, and I do not understand why it is not working the way I want. Basically, I have a data frame and want to capture the value of the n'th element of a column in the data frame, and store it in a vector. Here is my code: COL1_VALUES...

Grep in R (zero or any character)

r,data.frame,xls
I have files with a lot of data, but in several of them the date is in this format: YYYYMMDD, e.g. 20150704. In the others the date is in this format: YYYY-MM-DD, e.g. 2015-07-04. I want to grep to find a specific date; can I do this with one grep...

Replace NA's and delete columns in an efficient way

r,performance,data.frame
I've got a dataframe which looks like follows: # Code: m3 <- c(NA, -3, NA, NA, -3) m2 <- c(rep(NA, 5)) m1 <- c(rep(NA, 5)) Zero <- c(rep(NA, 5)) p1 <- c(1, NA, NA, 1, NA) p2 <- c(NA, NA, NA, 2, NA) p3 <- c(3, NA, 3, 3, NA)...

Apply User-Defined Function to Specific Dataframe Columns

r,data.frame
I have seen the question here R apply user define function on data frame columns but mine is a little different... I would like to pass specific columns of my dataframe to a function, then return the result of that function to a new column of my dataframe. Basically, I...

Manipulating Data Frame in R by ID

r,data.frame,reshape
I have a data frame: ID date term estimate unit1 1/1/2015 intercept 1.01 unit1 1/1/2015 x1 2.01 unit1 1/1/2015 x2 3.01 unit1 1/1/2015 x3 4.01 unit1 1/1/2015 x4 5.01 unit2 1/1/2015 intercept 1.01 unit2 1/1/2015 x1 -1.01 unit2 1/1/2015 x2 1.01 unit2 1/1/2015 x3 2.01 unit1 1/2/2015 intercept 1.01 unit1...

Count consecutive occurrences of a specific value in every row of a data frame in R

r,data.frame
I've got a data.frame of monthly values of a variable for many locations (so many rows) and I want to count the numbers of consecutive months (i.e consecutive cells) that have a value of zero. This would be easy if it was just being read left to right, but the...

r cumsum-like function for splitting dataframe

r,data.frame
Given the following dataframe: mydf <- data.frame(x=c(1:10,10:1),y=c(10:1,1:10)) How is it possible to split it such that each sub-dataframe will have consecutive values of one column which are greater than the other column? For example in mydf, the outcome that I am hoping for is spliting it into three dataframes: (y...

Regression loop in R for data frames

r,loops,statistics,data.frame,regression
rm(list=ls()) myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") for(i in names(myData)) { colNum <- grep(i,colnames(myData)) ##asigns a value to each column if(is.numeric(myData[3,colNum])) ##if row 3 is numeric, the entire column is { ##print(nxeData[,i]) fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'...

Creating new columns by splitting a variable into many variables (in R)

r,string,split,data.frame
I want to create new columns by splitting a vector in a data frame. I have such a data frame: YEAR Variable1 Variable2 2009 000000 00000001 2010 000000 00000001 2011 000000 00000001 2009 000000 00000002 2010 000000 00000002 2009 000000 00000003 ... 2009 100000 10000001 2010 100000 10000001 ... 2009...

R - extract row as character string with double quote

r,data.frame,character
I have a data frame called dataf dataf<-data.frame(replicate(10,sample(0:1,10,rep=TRUE))) and I would like to extract each row of this data frame as a character string like with this function : result=data.frame(matrix(NA, ncol=1, nrow=10)) i=0 for(i in 0:9) { result[i+1,]=toString(dataf[i+1,]) } but result is not as expected : 1, 0, 0, 0,...

Convert a column of strings in data frame into numeric in R (not the usual kind)

r,string,vector,data.frame,numeric
So in R I have a column consisting of strings that look like something similar to this: "Peter","Paul","John","Melissa","Paul","Peter" ... And I want to convert these names to a numerical ID format, like this: 1,2,3,4,2,1 In other words - I want to create a numerical ID for the names where same...

R create new column based on if else condition

r,data.frame
I have a data frame consisting of n column and one of them is food. food column possible values are apple, tomato, cabbage, sausage, beer, vodka, potato. I want to create a new column in my data data frame as follows: if food==apple or food==tomato or food==potato, then assign vegetables,...

Split multiple values from a single variable within a data frame

r,data.frame
I have the following dataframe which contains several values for a single variable (Problemas.habituales) (see below) > read.csv("http://pastebin.com/raw.php?i=gnWRqJnY") Nombre.barrio Problemas.habituales 1 Actur Robos con violencia, Agresiones, Otros problemas 2 Actur Ningún problema 3 Centro Robos con violencia, Agresiones 4 San Pablo Ningún problema 5 San Pablo Ningún problema 6 Delicias...

How to select only complete in a panda data.frame

python,machine-learning,data.frame
I have the following data-set on python import pandas as pd bcw = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', header=None) Lines like 24 have missing values: 1057013,8,4,5,1,2,?,7,3,1,4 On column 7, there is a '?', and I want to drop this line. How can I achieve this? ...

R breaking words into new columns [duplicate]

r,data.frame
This question already has an answer here: Split column at delimiter in data frame 6 answers I'm having some problems in the following data frame: Treat.Name HWAH P_Control_1 2918.000 P_Control_2 2818.536 P_Control_3 2619.036 P_EMFL10_1 2740.786 P_EMFL10_2 2616.893 P_EMFL10_3 2395.964 I'm trying to break the character names in Treat.Name right...

How to call a data.frame inside an R Function by name

r,function,data.frame,call
I want to write a function that runs the same analysis on different data.frames. Here is a simple version of my code: set1 <- data.frame(x=c(1,2,4,6,2), y=c(4,6,3,56,4)) set2 <- data.frame(x=c(3,2,3,8,2), y=c(2,6,3,6,3)) mydata <- c("set1", "set2") for (dataCount in 1:length(mydata)) { lm(x~y, data=mydata) } How do I call a data.frame by name...
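
One way this might be done is with `get()`, which looks up an object from its name given as a string. The sketch below reuses `set1`, `set2`, and `mydata` from the question; `fits` is a name introduced here for illustration.

```r
set1 <- data.frame(x = c(1, 2, 4, 6, 2), y = c(4, 6, 3, 56, 4))
set2 <- data.frame(x = c(3, 2, 3, 8, 2), y = c(2, 6, 3, 6, 3))
mydata <- c("set1", "set2")

# get(nm) resolves the string "set1" / "set2" to the actual data frame,
# so lm() receives a data frame rather than a character vector
fits <- lapply(mydata, function(nm) lm(x ~ y, data = get(nm)))
names(fits) <- mydata

# Coefficients of the model fitted on set1
print(coef(fits[["set1"]]))
```

Keeping the data frames in a named list from the start (for example `list(set1 = set1, set2 = set2)`) would avoid the lookup by name altogether and is often easier to maintain.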

Divide dataframe rows based on repeat sequence

r,data.frame
I have an example dataframe below. I am trying to take each sequence of 3 rows and divide the first by the 3rd (or in other words, class "a" by class "c", for every id). What's the most straightforward way to do this? Thanks in advance. id class value 0...

How to subset by distinct rows in a data frame or matrix?

r,matrix,filter,data.frame,subset
Suppose I had the following matrix: matrix(c(1,1,2,1,2,3,2,1,3,2,2,1),ncol=3) Result: [,1] [,2] [,3] [1,] 1 2 3 [2,] 1 3 2 [3,] 2 2 2 [4,] 1 1 1 How can I filter/subset this matrix by whether or not each row has duplicate values? For example, in this case, I would only...
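
A possible sketch using `anyDuplicated()` row by row. Because the excerpt is cut off before it says which rows should be kept, both selections are shown; `m` and `has_dup` are names chosen here for illustration.

```r
# Matrix from the question (filled column by column)
m <- matrix(c(1, 1, 2, 1, 2, 3, 2, 1, 3, 2, 2, 1), ncol = 3)

# anyDuplicated() returns 0 when all values in the row are distinct,
# otherwise the index of the first duplicated value
has_dup <- apply(m, 1, function(r) anyDuplicated(r) > 0)

# Rows whose values are all distinct
distinct_rows <- m[!has_dup, , drop = FALSE]

# Rows containing at least one repeated value
dup_rows <- m[has_dup, , drop = FALSE]

print(distinct_rows)
print(dup_rows)
```

`drop = FALSE` keeps the result a matrix even when only one row is selected.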

Summing multiple columns to equal -1,0,1 [closed]

r,matrix,data.frame
I'm trying to create a simple way to change the Total column to either -1, 1, or 0. When x=1, y=1, z=1, then Total=1. When x=-1, y=1, z=1, then Total=-1. In all other cases, Total=0. So, rows 2013-07-03 and 2013-07-05 should have Total=1. Row 2013-07-09 should have Total=-1. All other...

Common elements in data frames

r,data.frame,bioinformatics,intersection
I have three data frames, with a lot of information and the following row names: ENSG00000000971 ENSG00000000971 ENSG00000000971 ENSG00000004139 ENSG00000004139 ENSG00000003987 ENSG00000005001 ENSG00000004848 ENSG00000004848 ENSG00000005102 ENSG00000002330 ENSG00000002330 ENSG00000005486 ENSG00000005102 ENSG00000006047 ... ... ... What I want to do, is to find all the entries (row names) that are common in...
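
A minimal sketch of one way to find row names shared by all three data frames, using `Reduce()` with `intersect()`. The data frames `df1`, `df2`, `df3` and the short row names below are made-up stand-ins for the real data.

```r
# Toy data frames with hypothetical row names standing in for the gene IDs
df1 <- data.frame(v = 1:3, row.names = c("g1", "g2", "g3"))
df2 <- data.frame(v = 1:3, row.names = c("g1", "g3", "g4"))
df3 <- data.frame(v = 1:3, row.names = c("g3", "g1", "g5"))

# Reduce() applies intersect() pairwise across the list,
# leaving only row names present in every data frame
common <- Reduce(intersect, list(rownames(df1), rownames(df2), rownames(df3)))

print(common)

# Subset one of the data frames to the shared rows
df1_common <- df1[common, , drop = FALSE]
```

The same pattern extends to any number of data frames by adding them to the list.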

Convert list of overlapping data.frames into single data.frame

r,data.frame,plyr
I have some population information from multiple cohorts in a list. Each cohort covers an overlapping time period. The data looks like the following: > raw.data $`1` Year Pop 1 1920 1927433 2 1921 1914551 3 1922 1900776 $`2` Year Pop 1 1921 1915576 2 1922 1902075 3 1923 1887613...

R dataframe column multiplication with sapply

r,data.frame
I need to multiply columns in R data.frame. I want to do this based on certain patterns in the column names. This is very elementary task, but I struggle to make it work with sapply() or some related function. This is what I've tried thus far. df <- data.frame("pA" =...

R - ff package: find the most frequent element in ffdf and delete the rows where it is located

r,data.frame,ff,ffbase
I need a suggestion on how to find the most frequent element in an ffdf and then delete the rows where it is located. I decided to try the ff package as I'm working with very big data and with base R I am running out of memory. Here is a little...

Grouping key/value columns into single rows

r,data.frame,data.table,tidyr
I'm trying to take key-value combinations and put all the values on the same row as the keys. I'm pretty sure I knew how to do this at one point (I think with data.table) and I've been looking at the usual suspects reshape2, tidyr, data.table, etc, but I can't seem...

Double for loop to save several files using R

r,for-loop,data.frame
I am trying to do a “for loop” to generate files based on the column "group". I want to create a file for each group. My data is much bigger, but a sample would be: id = c(1,2,3,4,5,6,7,8,9,10) group = c(3,1,3,2,1,3,1,2,4,4) weight = c(10,11,12,13,14,15,16,17,18,19) index1 = c(50,50,50,50,50,50,50,50,50,50) index2 = c(50,50,50,50,50,50,50,50,50,50)...

how to extract two columns of data using R by loop

r,loops,data.frame,multiple-columns
I have a dataframe with 1000 columns of data str(MT) 'data.frame': 1356 obs. of 1000 variables: $ Date : Factor w/ 1356 levels "Apr-1900","Apr-1901",..: 453 340 792 1 905 679 566 114 1244 1131 ... $ Year : int 1900 1900 1900 1900 1900 1900 1900 1900 1900 1900 ......

R: how to access members of array loaded in dataframe elements

r,data.frame
From a csv file I loaded data into an R dataframe that looks like this: > head(mydata) row lengthArray sports num_runs percent_runs 1 0 4 [24, 18, 24, 18] 0 0 2 1 10 [2, 2, 2, 2, 2, 2, 2, 2, 2, 2] 0 0 3 2 4 [0,...