Error when fitting a glmer with Poisson error structure


Question:

Tag: r

I hope somebody can help me. I'm trying to conduct an analysis which examines the number of samples of Hymenoptera caught over an elevational gradient. I want to examine the possibility of a uni-modal distribution in relation to elevation, as well as a linear distribution. Hence I am including I(Altitude^2) as an explanatory variable in the analysis.

I am trying to fit the following model, which uses a Poisson error structure (as we are dealing with count data) and includes date (Date) and trap type (Trap) as random effects.

model7 <- glmer(No.Specimens~Altitude+I(Altitude^2)+(1|Date)+(1|Trap),
       family="poisson",data=Santa.Lucia,na.action=na.omit)

However I keep receiving the following error message:

Error: (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate
In addition: Warning messages:
1: Some predictor variables are on very different scales: consider rescaling 
2: In pwrssUpdate(pp, resp, tolPwrss, GQmat, compDev, fac, verbose) :
  Cholmod warning 'not positive definite' at file:../Cholesky/t_cholmod_rowfac.c, line 431
3: In pwrssUpdate(pp, resp, tolPwrss, GQmat, compDev, fac, verbose) :
  Cholmod warning 'not positive definite' at file:../Cholesky/t_cholmod_rowfac.c, line 431

Clearly I am making some big mistakes. Can anybody help me figure out where I am going wrong?

Here is the structure of the dataframe:

str(Santa.Lucia)
'data.frame':   97 obs. of  6 variables:
 $ Date        : Factor w/ 8 levels "01-Sep-2014",..: 6 6 6 6 6 6 6 6 6 6 ...
 $ Trap.No     : Factor w/ 85 levels "N1","N10","N11",..: 23 48 51 14 17 20 24 27 30 33 ...
 $ Altitude    : int  1558 1635 1703 1771 1840 1929 1990 2047 2112 2193 ...
 $ Trail       : Factor w/ 3 levels "Cascadas","Limones",..: 1 1 1 1 1 3 3 3 3 3 ...
 $ No.Specimens: int  1 0 2 2 3 4 5 0 1 1 ...
 $ Trap        : Factor w/ 2 levels "Net","Pan": 2 2 2 2 2 2 2 2 2 2 ...

And here is the complete data set (these are only my preliminary analyses):

           Date Trap.No Altitude    Trail No.Specimens Trap
1   28-Aug-2014      W2     1558 Cascadas            1  Pan
2   28-Aug-2014      W5     1635 Cascadas            0  Pan
3   28-Aug-2014      W8     1703 Cascadas            2  Pan
4   28-Aug-2014     W11     1771 Cascadas            2  Pan
5   28-Aug-2014     W14     1840 Cascadas            3  Pan
6   28-Aug-2014     W17     1929    Tower            4  Pan
7   28-Aug-2014     W20     1990    Tower            5  Pan
8   28-Aug-2014     W23     2047    Tower            0  Pan
9   28-Aug-2014     W26     2112    Tower            1  Pan
10  28-Aug-2014     W29     2193    Tower            1  Pan
11  28-Aug-2014     W32     2255    Tower            0  Pan
12  30-Aug-2014      N1     1562 Cascadas            5  Net
13  30-Aug-2014      N2     1635 Cascadas            0  Net
14  30-Aug-2014      N3     1723 Cascadas            2  Net
15  30-Aug-2014      N4     1779 Cascadas            0  Net
16  30-Aug-2014      N5     1842 Cascadas            3  Net
17  30-Aug-2014      N6     1924    Tower            2  Net
18  30-Aug-2014      N7     1979    Tower            2  Net
19  30-Aug-2014      N8     2046    Tower            0  Net
20  30-Aug-2014      N9     2110    Tower            0  Net
21  30-Aug-2014     N10     2185    Tower            0  Net
22  30-Aug-2014     N11     2241    Tower            0  Net
23  31-Aug-2014      N1     1562 Cascadas            1  Net
24  31-Aug-2014      N2     1635 Cascadas            1  Net
25  31-Aug-2014      N3     1723 Cascadas            0  Net
26  31-Aug-2014      N4     1779 Cascadas            0  Net
27  31-Aug-2014      N5     1842 Cascadas            0  Net
28  31-Aug-2014      N6     1924    Tower            0  Net
29  31-Aug-2014      N7     1979    Tower            7  Net
30  31-Aug-2014      N8     2046    Tower            4  Net
31  31-Aug-2014      N9     2110    Tower            6  Net
32  31-Aug-2014     N10     2185    Tower            1  Net
33  31-Aug-2014     N11     2241    Tower            1  Net
34  01-Sep-2014      W1     1539 Cascadas            0  Pan
35  01-Sep-2014      W2     1558 Cascadas            0  Pan
36  01-Sep-2014      W3     1585 Cascadas            2  Pan
37  01-Sep-2014      W4     1604 Cascadas            0  Pan
38  01-Sep-2014      W5     1623 Cascadas            1  Pan
39  01-Sep-2014      W6     1666 Cascadas            4  Pan
40  01-Sep-2014      W7     1699 Cascadas            0  Pan
41  01-Sep-2014      W8     1703 Cascadas            0  Pan
42  01-Sep-2014      W9     1746 Cascadas            1  Pan
43  01-Sep-2014     W10     1762 Cascadas            0  Pan
44  01-Sep-2014     W11     1771 Cascadas            0  Pan
45  01-Sep-2014     W12     1796 Cascadas            1  Pan
46  01-Sep-2014     W13     1825 Cascadas            0  Pan
47  01-Sep-2014     W14     1840    Tower            4  Pan
48  01-Sep-2014     W15     1859    Tower            2  Pan
49  01-Sep-2014     W16     1889    Tower            2  Pan
50  01-Sep-2014     W17     1929    Tower            0  Pan
51  01-Sep-2014     W18     1956    Tower            0  Pan
52  01-Sep-2014     W19     1990    Tower            1  Pan
53  01-Sep-2014     W20     2002    Tower            3  Pan
54  01-Sep-2014     W21     2023    Tower            2  Pan
55  01-Sep-2014     W22     2047    Tower            0  Pan
56  01-Sep-2014     W23     2068    Tower            1  Pan
57  01-Sep-2014     W24     2084    Tower            0  Pan
58  01-Sep-2014     W25     2112    Tower            1  Pan
59  01-Sep-2014     W26     2136    Tower            0  Pan
60  01-Sep-2014     W27     2150    Tower            1  Pan
61  01-Sep-2014     W28     2193    Tower            1  Pan
62  01-Sep-2014     W29     2219    Tower            0  Pan
63  01-Sep-2014     W30     2227    Tower            1  Pan
64  01-Sep-2014     W31     2255    Tower            0  Pan
85   03/06/2015    WT47     1901    Tower            2  Pan
86   03/06/2015    WT48     1938    Tower            2  Pan
87   03/06/2015    WT49     1963    Tower            2  Pan
88   03/06/2015    WT50     1986    Tower            0  Pan
89   03/06/2015    WT51     2012    Tower            9  Pan
90   03/06/2015    WT52     2033    Tower            0  Pan
91   03/06/2015    WT53     2050    Tower            4  Pan
92   03/06/2015    WT54     2081    Tower            2  Pan
93   03/06/2015    WT55     2107    Tower            1  Pan
94   03/06/2015    WT56     2128    Tower            4  Pan
95   03/06/2015    WT57     2155    Tower            0  Pan
96   03/06/2015    WT58     2179    Tower            2  Pan
97   03/06/2015    WT59     2214    Tower            0  Pan
98   03/06/2015    WT60     2233    Tower            0  Pan
99   03/06/2015    WT61     2261    Tower            0  Pan
100  03/06/2015    WT62     2278    Tower            0  Pan
101  03/06/2015    WT63     2300    Tower            0  Pan
102  04/06/2015    WT31     1497 Cascadas            0  Pan
103  04/06/2015    WT32     1544 Cascadas            1  Pan
104  04/06/2015    WT33     1568 Cascadas            1  Pan
105  04/06/2015    WT34     1574 Cascadas            0  Pan
106  04/06/2015    WT35     1608 Cascadas            5  Pan
107  04/06/2015    WT36     1630 Cascadas            3  Pan
108  04/06/2015    WT37     1642 Cascadas            0  Pan
109  04/06/2015    WT38     1672 Cascadas            5  Pan
110  04/06/2015    WT39     1685 Cascadas            6  Pan
111  04/06/2015    WT40     1723 Cascadas            3  Pan
112  04/06/2015    WT41     1744 Cascadas            2  Pan
113  04/06/2015    WT42     1781 Cascadas            1  Pan
114  04/06/2015    WT43     1794 Cascadas            2  Pan
115  04/06/2015    WT44     1833 Cascadas            0  Pan
116  04/06/2015    WT45     1855 Cascadas            4  Pan
117  04/06/2015    WT46     1876 Cascadas            2  Pan           

Answer:

You're almost there. As @BondedDust suggests, it's not practical to use a two-level factor (Trap) as a random effect, because a variance cannot be estimated reliably from only two levels; in fact, it doesn't seem right in principle either (the levels of Trap are not arbitrary/randomly chosen/exchangeable). When I tried a model with quadratic altitude, a fixed effect of Trap, and a random effect of Date, I was warned that I might want to rescale a parameter:

Some predictor variables are on very different scales: consider rescaling 

(you saw this warning mixed in with your error messages). The only continuous (and hence worth rescaling) predictor is Altitude, so I centered and scaled it with scale() (the only disadvantage is that this changes the quantitative interpretation of the coefficients, but the model itself is practically identical). I also added an observation-level random effect to allow for overdispersion.

The results seem OK, and agree with the picture.

library(lme4)
Santa.Lucia <- transform(Santa.Lucia,
                         scAlt=scale(Altitude),
                         obs=factor(seq(nrow(Santa.Lucia))))
model7 <- glmer(No.Specimens~scAlt+I(scAlt^2)+Trap+(1|Date)+(1|obs),
                family="poisson",data=Santa.Lucia,na.action=na.omit)

summary(model7)

## Random effects:
##  Groups Name        Variance Std.Dev.
##  obs    (Intercept) 0.64712  0.8044  
##  Date   (Intercept) 0.02029  0.1425  
## Number of obs: 97, groups:  obs, 97; Date, 6
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  0.53166    0.31556   1.685  0.09202 . 
## scAlt       -0.22867    0.14898  -1.535  0.12480   
## I(scAlt^2)  -0.52840    0.16355  -3.231  0.00123 **
## TrapPan     -0.01853    0.32487  -0.057  0.95451   
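
Because the model was fitted on scaled altitude, the coefficients above are in units of standard deviations of Altitude. The following is a minimal sketch (not part of the original fit) of how they could be mapped back to the raw Altitude scale, and where the fitted curve peaks. The function names `unscale_quadratic` and `peak_location` are made up for illustration, and the values of `m` and `s` are placeholders: in practice they would be `mean(Santa.Lucia$Altitude)` and `sd(Santa.Lucia$Altitude)`.

```r
# Map coefficients fitted on z = (x - m) / s back to the raw x scale:
#   b0 + b1*z + b2*z^2  ==  a0 + a1*x + a2*x^2
unscale_quadratic <- function(b0, b1, b2, m, s) {
  c(intercept = b0 - b1 * m / s + b2 * m^2 / s^2,
    linear    = b1 / s - 2 * b2 * m / s^2,
    quadratic = b2 / s^2)
}

# Turning point of the fitted quadratic, in raw units
# (a maximum when b2 is negative, as it is here)
peak_location <- function(b1, b2, m, s) {
  m + s * (-b1 / (2 * b2))
}

# Fixed-effect estimates copied from the summary above:
# (Intercept), scAlt, I(scAlt^2)
b <- c(0.53166, -0.22867, -0.52840)

# PLACEHOLDER centre and spread, for illustration only; replace with
# mean(Santa.Lucia$Altitude) and sd(Santa.Lucia$Altitude)
m <- 1900
s <- 230

raw  <- unscale_quadratic(b[1], b[2], b[3], m, s)
peak <- peak_location(b[2], b[3], m, s)
```

In scaled units the turning point is at -b1/(2*b2), about -0.22, i.e. the fitted curve peaks roughly 0.22 standard deviations below the mean altitude, whatever the actual mean and standard deviation turn out to be.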

Test the quadratic term by comparing with a model that lacks it ...

model7R <- update(model7, . ~ . - I(scAlt^2))
## convergence warning, but probably OK ...
anova(model7,model7R)
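
For reference, the comparison made by anova() here is a likelihood-ratio test. The sketch below shows the same calculation by hand; `lrt` is a hypothetical helper, and the data are simulated stand-ins fitted with plain Poisson GLMs (not the Santa.Lucia data or the mixed models above), so that the example runs with base R alone. Applied to the models above as `lrt(model7, model7R)`, it should reproduce the statistic reported by anova(), assuming both models are fitted to the same rows.

```r
# Likelihood-ratio test for a nested pair of models, computed by hand
lrt <- function(full, reduced) {
  ll_full <- logLik(full)
  ll_red  <- logLik(reduced)
  # Test statistic: twice the difference in maximised log-likelihoods
  stat <- 2 * (as.numeric(ll_full) - as.numeric(ll_red))
  # Degrees of freedom: difference in number of estimated parameters
  df <- attr(ll_full, "df") - attr(ll_red, "df")
  c(Chisq = stat, Df = df, p = pchisq(stat, df, lower.tail = FALSE))
}

# Simulated stand-in data, for illustration only
set.seed(1)
z <- rnorm(100)
y <- rpois(100, exp(0.5 - 0.5 * z^2))

full    <- glm(y ~ z + I(z^2), family = poisson)
reduced <- update(full, . ~ . - I(z^2))

res <- lrt(full, reduced)
```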

On principle it might be worth looking at the interaction between the quadratic altitude model and Trap (allowing for different altitude trends by trap type), but the picture suggests it won't do much ...

library(ggplot2); theme_set(theme_bw())
## note: recent ggplot2 versions use after_stat(n) rather than ..n..,
## and pass model arguments such as family via method.args
ggplot(Santa.Lucia,aes(Altitude,No.Specimens,colour=Trap))+
    stat_sum(aes(size=factor(after_stat(n))))+
    scale_size_discrete(range=c(2,4))+
    geom_line(aes(group=Date),colour="gray",alpha=0.3)+
    geom_smooth(method="gam",method.args=list(family=quasipoisson),
                formula=y~poly(x,2))+
    geom_smooth(method="gam",method.args=list(family=quasipoisson),
                formula=y~poly(x,2),se=FALSE,
                aes(group=1),colour="black")

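[Figure: No.Specimens plotted against Altitude, coloured by Trap, with quadratic quasi-Poisson smooths for each trap type and an overall smooth in black]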

