replacing NA with value in adjacent column in R

r,vector,replace,atomic,na
I want to replace the NA in the IMIAVG column with the value in the IMILEFT or IMIRIGHT column in the same row when necessary (i.e., rows 1, 6, and 7). I've tried multiple things but nothing seems to work. Does this need a loop? Please note errors keep coming up...
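
A loop is probably not needed here. A minimal vectorized sketch, assuming made-up data (only the column names come from the question) and assuming IMILEFT should be preferred over IMIRIGHT when both are available:

```r
# Hypothetical data; only the column names are taken from the question.
df <- data.frame(
  IMILEFT  = c(1.2, 0.8, NA, 1.0),
  IMIRIGHT = c(1.1, 0.9, 0.7, NA),
  IMIAVG   = c(NA, 0.85, NA, NA)
)

# Assumption: use IMILEFT first, fall back to IMIRIGHT if IMILEFT is NA too.
fallback <- ifelse(is.na(df$IMILEFT), df$IMIRIGHT, df$IMILEFT)

# Only rows where IMIAVG is NA are overwritten.
df$IMIAVG <- ifelse(is.na(df$IMIAVG), fallback, df$IMIAVG)
```

Rows that already have an IMIAVG value are left untouched.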

Calculate column medians with NA's

r,na,median
I am trying to calculate the median of individual columns in R and then subtract the median from every value in the column. The problem I face is that I have NAs in my columns that I don't want to remove, but just return them without subtracting the...
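
One possible approach, sketched on a small invented data frame: compute each median with `na.rm = TRUE` and subtract it from the whole column. Arithmetic on NA yields NA, so the missing entries stay in place.

```r
# Hypothetical example data.
df <- data.frame(a = c(1, NA, 3), b = c(2, 4, NA))

# Subtract each column's median (computed without NAs) from that column.
# NA - number is NA, so missing values are returned unchanged.
centered <- as.data.frame(
  lapply(df, function(x) x - median(x, na.rm = TRUE))
)
```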

Sum NA values in r

r,sorting,na
I am using a dataframe that has multiple NA values so I was thinking about sorting the attributes based on their NA values. I was trying to use a for loop and this is what I have so far: > data <- read.csv("C:/Users/Nikita/Desktop/first1k.csv") > for (i in 1:length(data) ) {...

How to select the first and last one test without NA in r

r,select,na
I asked a similar question yesterday: How to select the last one test without NA in r. But the problem becomes more difficult when NAs exist in the first column. My dataframe looks like this: Person W.1 W.2 W.3 W.4 W.5 1 NA 57 52...

functions na.rv(T), na.omit, is.finite, etc. don't work for the mean of a column

r,mean,na,inf
I'm trying to calculate the mean of a large df, dividing observations by Id and month and none of the answers I found work as I expect, sometimes they empty my sample and that's not useful. If df is: permno company amihud illiq MonthYr 10026 J & J SNACK FOODS...

Group instances based on NA values in r

r,file,csv,instance,na
I am reading a csv file and unfortunately my dataframe has many missing values. A small snip is as following: df <- data.frame(Size= c(800, 850, 1100, 1200, 1000), Value= c(900, NA, 1300, 1100, NA), Location= c(NA, 'midcity', 'uptown', NA, 'Lakeview'), Num1 = c(2, NA, 3, 2, NA), Num2 = c(2,3,3,1,2),...

How to test if an NA value is equal to zero; replace if so, leave as NA if not

r,replace,na
Edited as hadn't included the full set of factors in the example dataset, which is causing the original solutions to break. I'm trying to clean a dataset by determining if an NA should be replaced with a 0, or if left as NA. The below is a sample data set....

Aggregate NAs in R

r,aggregate,nan,na
I'm having trouble handling NAs while calculating aggregated means. Please see the following code: tab=data.frame(a=c(1:3,1:3), b=c(1,2,NA,3,NA,NA)) tab a b 1 1 1 2 2 2 3 3 NA 4 1 3 5 2 NA 6 3 NA attach(tab) aggregate(b, by=list(a), data=tab, FUN=mean, na.rm=TRUE) Group.1 x 1 1 2 2 2...
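
A sketch using the formula interface of `aggregate` on the `tab` data from the question. By default that interface drops rows containing NA (`na.action = na.omit`), so a group with only NAs would disappear. Passing `na.action = na.pass` keeps those rows and lets `na.rm = TRUE` reach `mean`. Whether an all-NA group should show up as NaN is an assumption about what is wanted here.

```r
tab <- data.frame(a = c(1:3, 1:3), b = c(1, 2, NA, 3, NA, NA))

# na.pass keeps rows with NA; na.rm = TRUE is forwarded to mean().
res <- aggregate(b ~ a, data = tab, FUN = mean,
                 na.rm = TRUE, na.action = na.pass)

# Group 3 contains only NAs, so its mean is NaN rather than being dropped.
print(res)
```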

How to deal with NA when using lapply in R

r,if-statement,na
I have a data frame err consisting of 796 rows and 54432 columns. I have to check the columns that have values not exceeding 20 and -20. This is my approach: do.call(cbind, (lapply(err, function(x) if((all(x<20) & all(x>-20))) return(x) ))) I have NA values in all of the columns and after...
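
With NAs present, `all(x < 20)` can return NA, and `if (NA)` raises an error. A minimal sketch, assuming NAs should simply be ignored when checking the range; the small `err` below is invented for illustration:

```r
# Hypothetical stand-in for the much larger 'err' data frame.
err <- data.frame(
  x = c(1, NA, -5),   # all non-missing values inside (-20, 20)
  y = c(25, 3, NA),   # 25 is out of range
  z = c(NA, -30, 2)   # -30 is out of range
)

# TRUE for columns whose non-missing values all lie strictly in (-20, 20).
in_range <- sapply(err, function(col) all(col < 20 & col > -20, na.rm = TRUE))

# Keep only those columns; drop = FALSE preserves the data frame.
kept <- err[, in_range, drop = FALSE]
```

Indexing with a logical vector also avoids building a list with NULL entries and binding it back together.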

strptime returning NA values [closed]

r,date,format,na,strptime
I'm trying to use strptime to format dates I'm reading in, but only NA values are returned in the output. My raw data is in the format of 1974-01-01, and the length of the dataset is 12049 so the last date is 2006-12-31. The code I use is: Data$date.yyyymmdd...

Create Flag Variables in R using a function

r,function,data.frame,na
I am wanting to create a function in R, that would output flag variables that are derived from the original variables in the data frame, and then ideally for every variable in the data frame. I want to create a new variable for each variable in the data frame, and...

Why pmax(dataFrame, int) would introduce NAs?

r,data.frame,na
I'm running into a behavior of pmax that I can't quite understand: pmax(data.frame(matrix(1:16, nrow=4)), c(6)) would return X1 X2 X3 X4 1 6 NA 9 13 2 6 6 10 14 3 6 7 11 15 4 6 8 12 16 What I don't understand is why only the entries...

converting “1984-03-25 02:00:00” to POSIX gives NA

r,datetime,posix,na
While converting a vector of date-times to POSIXlt, just one particular time, "25-Mar-1984-02:00", is converted to POSIXlt but returns NA! So, this row was getting omitted in my analysis/plots. is.na(as.POSIXlt("25-Mar-1984-02:00",format = "%d-%b-%Y-%H:%M")) is.na(as.POSIXlt("25-Mar-1984-03:00",format = "%d-%b-%Y-%H:%M")) is.na(as.POSIXlt("25-Mar-1984-01:00",format = "%d-%b-%Y-%H:%M")) is.na(as.POSIXlt("24-Mar-1984-02:00",format = "%d-%b-%Y-%H:%M")) is.na(as.POSIXlt("26-Mar-1984-02:00",format = "%d-%b-%Y-%H:%M")) returns TRUE, FALSE, FALSE,...
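
One plausible cause, assuming the session runs in a time zone that switched to daylight saving time at that moment (Central European Time did on 25 March 1984): 02:00 local time never existed that day, so the conversion may yield NA. A sketch that sidesteps the gap by parsing in UTC, which has no DST transitions:

```r
x <- "1984-03-25 02:00:00"

# UTC has no daylight saving transitions, so every clock time exists.
utc <- as.POSIXct(x, tz = "UTC")

is.na(utc)
```

Whether UTC is the right zone for the data is a separate question; the point is only that the time zone, not the format string, decides whether this timestamp is valid.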

From multiple matrices, calculate percentage of a value from each matrix and store them into a new vector

r,function,matrix,percentage,na
I searched through stackoverflow and have failed to find a solution to my specific problem. Here, I want to calculate the percentage of an element from multiple matrices with NA values, and store the calculations into a new vector - specifically, each matrix contains 1s, 0s, and NA values. This...

Excel VLookup #NV error

excel,vlookup,na
I'm trying to do a VLookup in Excel but I get a #NV error every time. This is table EVENTS: This is table TRACK: the formula on field F2 in table EVENTS is =SVERWEIS(E2;TRACKS!$A$2:$B$52;1;FALSCH) SVERWEIS is the word for VLOOKUP in the German version. FALSCH means wrong...

Merging data frames row-wise and column-wise in R

r,merge,repeat,na
How can one merge two data frames, one column-wise and other one row-wise? For example, I have two data frames like this: A: add1 add2 add3 add4 1 k NA NA NA 2 l k NA NA 3 j NA NA NA 4 j l NA NA B: age size...

split dataframe in groups before each non-NA

r,split,subset,apply,na
I am looking to split my dataframe into subsets according to the column "Height" with each subset having one row with a value and 0-Inf rows with NAs. This is, to be able to apply functions to the subsets afterwards, specifically order the rows according to their "Diameter" value,...

ggplot line graph with NA values

r,ggplot2,na
I'm having trouble with ggplot trying to plot 2 incomplete time series on the same graph where the y data does not have the same values on the x-axis (year) - NAs are thus present for certain years: test<-structure(list(YEAR = c(1937, 1938, 1942, 1943, 1947, 1948, 1952, 1953,...

logical dataframe with numerical dataframe and substitute FALSE by NA with R

r,na
This is probably a very stupid question to be asked here, but I'm an absolute beginner in R and I looked everywhere and tried multiple things and I couldn't solve the problem. So I have two dataframes: df, which contains numeric values and NAs, and another, j, which contains...

NA output from sum of numbers in R

r,sum,na,func
I have a function and I get NA from sum function, the first sum works well, but the second sum does not work and returns NA. This is the function: Gramm.Pred.Err <- function(acts , grammProbs) { acts <- as.numeric(acts) grammProbs <- as.numeric(grammProbs) print("acts is:") print(acts) print("grammProbs is:") print(grammProbs) false.ind =...

R date column error using data[data==“”] <- NA

r,date,na
I am working with a data set which has all kinds of column classes, including class "Date". I try to assign NA to all empty values in this data set the following way: data[data==""] <- NA Obviously the date column makes some problems here, because there is the following error:...
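
A sketch of one way around this, assuming the empty strings only occur in character columns: restrict the replacement to those columns so the Date column is never compared with `""`. The data frame below is invented.

```r
# Hypothetical data with a character column and a Date column.
data <- data.frame(
  name = c("a", "", "c"),
  when = as.Date(c("2015-01-01", NA, "2015-03-01")),
  stringsAsFactors = FALSE
)

# Find character columns and blank out empty strings only there.
is_chr <- vapply(data, is.character, logical(1))
data[is_chr] <- lapply(data[is_chr], function(col) {
  col[which(col == "")] <- NA  # which() skips entries that are already NA
  col
})
```

Factor columns would need separate handling; they are not covered by this sketch.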

return NA value in NumericVector Rcpp unexpected behavior

r,rcpp,na
I am writing a cpp function to replace any NA values with the next non-na value. Code works properly regarding the replacement, however I want to return back the NA values for those that don't have a later non-NA value. Eg: fill_backward(c(1, NA, 2)) --> 1, 2, 2 fill_backward(c(1, NA,...

Inserting NA after Test

r,list,na
I have a list full of data.frames with two columns, time and signal. The data.frames are the results of GC chromatographic analysis from a process that was periodically sampled. I want to compare the gc data I've collected. I've written a function to convert the times and peak areas...

Python pandas - averaging 10 min measurements to 15min mean and 60min mean depending on the length of the data gap

python,pandas,mean,na
I'm quite new to programming with Python and I hope one of you is in the mood to help me. Well, I have many different climate stations with solar radiation measurements in 1-minute and also in 10-minute time resolution. The measurements also contain NA values. Now I'd like...

Replacing certain values in a data frame as NAs

r,data.frame,na
Suppose I have a data.frame names <- c("John", "Mark", "Larry", "Will", "Kate", "Daria", "Tom") gender <- c("M", "M", "M", "M", "F", "F", "M") mark <- c(1, 2, 3, 1, 2, 3, 1) df <- data.frame(names, gender, mark) df names gender mark 1 John M 1 2 Mark M 2 3...

Replace Inf in R data.table / Show number of Inf in colums

r,data.table,infinite,na
I can't figure out how to use an is.na(x) like function for infinite numbers in R with a data table or show per column how many Inf's there are: colSums(is.infinite(x)) I use the following example data set: DT <- data.table(a=c(1/0,1,2/0),b=c("a","b","c"),c=c(1/0,5,NA)) DT a b c 1: Inf a Inf 2: 1...
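
`colSums(is.infinite(x))` fails here because `is.infinite` has no method for a whole table, but it can be applied column by column. A base R sketch using a plain data.frame with the same values as the question's `DT` (a data.table is also a list of columns, so the same idea should carry over). Replacing Inf with NA is an assumption about the intended fill value.

```r
# Same values as in the question, but as a base data.frame (assumption).
DT <- data.frame(a = c(1/0, 1, 2/0), b = c("a", "b", "c"), c = c(1/0, 5, NA))

# Number of infinite values per column; non-numeric columns give 0.
inf_counts <- sapply(DT, function(col) sum(is.infinite(col)))

# Replace Inf/-Inf with NA in numeric columns only.
num <- vapply(DT, is.numeric, logical(1))
DT[num] <- lapply(DT[num], function(col) replace(col, is.infinite(col), NA))
```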

How to convert NA from factor vector to value of 0

r,na
I want to convert NA from factor vector to value of 0 This is an example myVec <- c(NA, 1, 2, NA) myVec [1] NA 1 2 NA myVec <- factor(myVec) [1] <NA> 1 2 <NA> Levels: 1 2 for(i in 1:4){ if( is.na(myVec[i]) ) { myVec[i] = "0"} }...
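
Assigning `"0"` to a factor element only works if `"0"` is one of its levels; otherwise R warns and the element stays NA. A minimal sketch that adds the level first and then fills the NAs in one vectorized step:

```r
myVec <- factor(c(NA, 1, 2, NA))

# "0" must exist as a level before it can be assigned.
levels(myVec) <- c(levels(myVec), "0")

# Vectorized replacement; no loop needed.
myVec[is.na(myVec)] <- "0"
```

If numbers are wanted afterwards, `as.numeric(as.character(myVec))` converts the labels rather than the internal level codes.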

Function to assign cell value to subsequent NA-cells (same column) [duplicate]

r,if-statement,na
This question already has an answer here: Replacing NAs with latest non-NA value 4 answers Thanks for taking your time to look at my problem! Am new to the forum and relatively new to R, but I'll do my best to formulate the question clearly. I have a big...

barplot column for NA

r,bar-chart,na
I would like to have a column in my barplot for missing data. adult <- read.csv( "http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", header = FALSE, na.strings = "?", strip.white = TRUE ) colnames(adult) <- c("age", "workClass", "fnlwgt", "education", "educationNum", "maritalStatus", "occupation", "relationship", "race", "sex", "capitalGain", "capitalLoss", "hoursPerWeek", "nativeCountry", "prediction") barplot(table(adult$workClass), main="Job Distribution", xlab="Job", ylab="Count",las=2) I...

dplyr join define NA values

r,left-join,dplyr,na
Can I define a "fill" value for NA in dplyr join? For example in the join define that all NA values should be 1? require(dplyr) lookup <- data.frame(cbind(c("USD","MYR"),c(0.9,1.1))) names(lookup) <- c("rate","value") fx <- data.frame(c("USD","MYR","USD","MYR","XXX","YYY")) names(fx)[1] <- "rate" left_join(x=fx,y=lookup,by=c("rate")) Above code will create NA for values "XXX" and "YYY". In my...

Identify data blocks

r,na
I have a vector with either a negative value or NA and a threshold: threshold <- -1 example <- c(NA, NA, -0.108, NA, NA, NA, NA, NA, -0.601, -0.889, -1.178, -1.089, -1.401, -1.178, -0.959, -1.085, -1.483, -0.891, -0.817, -0.095, -1.305, NA, NA, NA, NA, -0.981, -0.457, -0.003, -0.358, NA, NA)...

R: deleting columns where certain percentage of values is missing [duplicate]

r,data.frame,apply,na,missing-data
This question already has an answer here: Deleting columns from a data.frame where NA is more than 15% of the column length 1 answer I'm working with a data frame resembling the extract below. sample.df Obs Var1 Var2 Var3 A0001 21 21 21 A0002 21 78 321 A0003 32...

Removing non-numeric values from data in R

r,data.frame,na
I have a large matrix of data I want to import. Annoyingly all of the "NA" values are displayed as "*****" and when I read my data into R it imports as a matrix of factors. str(x) 'data.frame': 5 obs. of 5 variables: $ 1: Factor w/ 704 levels "*****","0","100.1",..:...
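
If the file is read with `read.csv` or `read.table`, the `na.strings` argument can mark `"*****"` as missing at import time, so the columns come in as numbers instead of factors. A small self-contained sketch with inline text standing in for the real file:

```r
# Inline stand-in for the real file (hypothetical values).
raw <- "v1,v2\n1.5,*****\n*****,100.1\n0,2"

# Treat "*****" as NA while reading.
x <- read.csv(text = raw, na.strings = "*****")

str(x)
```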

make sum of an empty set/set of NA's NA instead of 0?

r,sum,na
The sum function returns 0 if it is applied to an empty set. Is there a simple way to make it return NA if it is applied to a set of NA values? Here is a borrowed example: test <- data.frame(name = rep(c("A", "B", "C"), each = 4), var1 =...
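
One simple option is a small wrapper. `sum_na` is a hypothetical helper name, and treating an empty vector like an all-NA one is an assumption that follows from `all(logical(0))` being TRUE:

```r
# Hypothetical helper: NA when there is nothing to sum, otherwise the
# sum of the non-missing values.
sum_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

sum_na(c(NA, NA))     # NA
sum_na(c(1, NA, 2))   # 3
sum_na(numeric(0))    # NA (empty input is treated like all-NA)
```

The wrapper can be passed to `aggregate`, `tapply`, or similar grouping functions in place of `sum`.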

omitting NA values with data.table

r,data.table,na
How can I get this to work? library(data.table) RRR <-data.table(1:15,runif(15),rgeom(15,0.5),rbinom(15,2,0.5)) na.omit(RRR[(RRR==0)] <- NA) I want to replace some values (here those ==0) by NA. And then remove those rows. Or if you want to run benchmarks you can use a larger data.table: set.seed(1) n <- 1000000 RRR <- data.table(matrix(rgeom(100*n,0.5), ncol=100))...

Recoding variables with NAs in R

r,na,recode
I am trying to code a new variable based on the values of three other variables. Specifically, if all of the variables are NA, I would like the new variable to take NA and if any of them are 1, it should take a 1, otherwise it should take a...

R: Volatility function that interprets NAs

r,data.frame,time-series,na
I am looking for help with getting a volatility function to work with my dataframe. In the function below, I'm just trying to get price daily log returns for each security (each column in my data is a different security's prices over time), and then calculate an annualized vol. volcalc=...

How to sum each row in a .xts object, where values are NOT missing

r,xts,na,data-manipulation
I currently have a line of code set up to sum each row/date of a xts object, where value for each data point = 1: universe.rt=sapply(X=2:nrow(rt),FUN=function(x){sum(rt[x,which(live[x,]==1)])/count[x]}) I want to change the code such that instead of summing up all points in a row where value = 1, I want to...

How to select the last one test without NA in r

r,select,na
My dataframe looks like this: Person W.1 W.2 W.3 W.4 W.5 1 62 57 52 59 NA 2 49 38 60 NA NA 3 59 34 NA NA NA Is there a way to select the first and last test without "NA"? I have 300 data entries, and W.1...
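
A row-wise sketch using the three rows shown in the question. The helper names `first_non_na` and `last_non_na` are made up, and returning NA for a row with no tests at all is an assumption:

```r
df <- data.frame(
  Person = 1:3,
  W.1 = c(62, 49, 59),
  W.2 = c(57, 38, 34),
  W.3 = c(52, 60, NA),
  W.4 = c(59, NA, NA),
  W.5 = rep(NA_real_, 3)
)

# Hypothetical helpers: first/last non-missing value, NA if none exists.
first_non_na <- function(r) { r <- r[!is.na(r)]; if (length(r)) r[[1]] else NA }
last_non_na  <- function(r) { r <- r[!is.na(r)]; if (length(r)) r[[length(r)]] else NA }

tests <- as.matrix(df[, -1])          # drop the Person column
df$first <- apply(tests, 1, first_non_na)
df$last  <- apply(tests, 1, last_non_na)
```

Because the helpers skip NAs wherever they occur, this also covers rows where the first column is missing.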

Empty rows in list as NA values in data.frame in R

r,list,lapply,na,rbind
I have a dataframe as follows: hospital <- c("PROVIDENCE ALASKA MEDICAL CENTER", "ALASKA REGIONAL HOSPITAL", "FAIRBANKS MEMORIAL HOSPITAL", "CRESTWOOD MEDICAL CENTER", "BAPTIST MEDICAL CENTER EAST", "ARKANSAS HEART HOSPITAL", "MEDICAL CENTER NORTH LITTLE ROCK", "CRITTENDEN MEMORIAL HOSPITAL") state <- c("AK", "AK", "AK", "AL", "AL", "AR", "AR", "AR") rank <- c(1,2,3,1,2,1,2,3) df...

Removing Columns Named “NA”

r,na
I'm dealing with some RNA-seq count data for which I have ~60,000 columns containing gene names and 24 rows containing sample names. When I did some gene name conversions I was left with a bunch of columns that are named NA. I know that R handles NA differently than a...
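
A sketch that drops such columns whether the name is a true missing value or the literal string "NA". The tiny matrix is invented, and the same logical index should work on a data frame via `names()`:

```r
# Hypothetical count matrix: 2 samples, one column with a missing name.
counts <- matrix(1:6, nrow = 2,
                 dimnames = list(c("s1", "s2"), c("geneA", NA, "geneB")))

# FALSE for names that are NA or the string "NA", TRUE otherwise.
keep <- !is.na(colnames(counts)) & colnames(counts) != "NA"

counts_clean <- counts[, keep, drop = FALSE]
```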