R: HAC by NeweyWest using dynlm

r,time-series,regression
This is what I would like to do: library("lmtest") library("dynlm") test$Date = as.Date(test$Date, format = "%d.%m.%Y") zooX = zoo(test[, -1], order.by = test$Date) f <- d(Euribor3) ~ d(Ois3) + d(CDS) + d(Vstoxx) + d(log(omo)) + d(L(Euribor3)) m1 <- dynlm(f, data = zooX, start = as.Date("2005-01-05"),end = as.Date("2005-01-24")) m2 <- dynlm(f,...

forward subset selection in R without intercept

r,statistics,regression,regression-testing
I am developing a multiple regression model, using forward subset selection to reduce the number of parameters with Mallows' Cp as the selection criterion. However, this is an engineering problem and it does not make sense to have an intercept, i.e. when all the...

Used Predict function on New Dataset with different Columns

r,regression,predict
Using "stackloss" data in R, I created a regression model as seen below: stackloss.lm = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,data=stackloss) stackloss.lm newdata = data.frame(Air.Flow=stackloss$Air.Flow, Water.Temp= stackloss$Water.Temp, Acid.Conc.=stackloss$Acid.Conc.) Suppose I get a new data set and need to predict its "stack.loss" based on the previous model as seen below:...

Observation deleted due to missingness in R

r,regression
I am busy with a regression model in R and I have about 16,000 observations. One of these observations causes the following message: (1 observation deleted due to missingness). Is there a way in R to identify this one observation?...

Placing Limits on Optim

r,optimization,regression,rscript
I'm trying to use an algorithm to minimise the least squares of models. I'd like to confine all the parameters to sensible ranges; however, when I run this script it disregards my limits for some reason. More of a debugging issue than anything else. Any...

How do you use variables in a regression formula in R?

r,regression
How do you use a variable in a regression formula? For example, using the 'Animals' dataset (in MASS), the following works fine: data(Animals) model <- lm(body ~ brain, data = Animals) But what I want to do is: data(Animals) x <- "body" y <- "brain" model <- lm(x ~ y,...

R: Isotonic regression Minimisation

r,regression,mathematical-optimization,linear-programming,minimization
I want to minimize the following function: $F=\sum_{u=1}^{20}\sum_{w=1}^{10} Q_{uw}(r_{uw}-y_{uw})$ subject to the constraints $y_{u,w} \ge y_{u,w+1}$, $y_{u,w} \ge y_{u-1,w}$, $y_{20,0} \ge 100$, $y_{0,10} \ge 0$. I have a 20*10 matrix $r_{uw}$ and a 20*10 matrix $Q_{uw}$, and I now need to generate a matrix $y_{uw}$ which adheres to the constraints. I am...

Java 8 change in UTF-8 decoding

java,utf-8,java-8,regression
We recently migrated our application to JDK 8 from JDK 7. After the change, we ran into a problem with the following snippet of code. String output = new String(byteArray, "UTF-8"); The byte array may contain invalid UTF-8 byte sequences. The same byte array upon UTF-8 decoding, results in two...

Regression loop in R for data frames

r,loops,statistics,data.frame,regression
rm(list=ls()) myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") for(i in names(myData)) { colNum <- grep(i,colnames(myData)) ##asigns a value to each column if(is.numeric(myData[3,colNum])) ##if row 3 is numeric, the entire column is { ##print(nxeData[,i]) fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'...

Getting coefficient at best lambda in glmnet in R

r,lambda,regression,glmnet
I am using the following code with glmnet: > library(glmnet) > fit = glmnet(as.matrix(mtcars[-1]), mtcars[,1]) > plot(fit, xvar='lambda') However, I want to print out the coefficients at the best lambda, as is done in ridge regression. I see the following structure of fit: > str(fit) List of 12 $ a0 : Named...

P values from fastbw regression function of rms package

r,regression,linear-regression,rms
I am trying the fastbw function of the rms package for backward regression as follows (using the mtcars dataset): > mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars) > mod Linear Regression Model ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data = mtcars) Model Likelihood Discrimination Ratio Test Indexes...

SciKit-learn for data driven regression of oscillating data

python,time-series,scikit-learn,regression,prediction
Long time lurker first time poster. I have data that roughly follows a y=sin(time) distribution, but also depends on other variables than time. In terms of correlations, since the target y-variable oscillates there is almost zero statistical correlation with time, but y obviously depends very strongly on time. The goal...

Modelling interactions with only a subset of the levels of a factor in R

regression,interaction
Let's first look at lm. I have a continuous explanatory $X$ and a factor $F$ modelling seasonal aspects (in the example 8 levels). Let $\beta$ denote the slope for $X$ then I want to model interactions of the slope with the factor. It is some kind of physical model thus...

Nonlinear total least squares/Deming regression

r,regression
I've been using nls() to fit a custom model to my data, but I don't like how the model is fitting and I would like to use an approach that minimizes residuals in both x and y axes. I've done a lot of searching, and have found solutions for fitting...

Loop through various data subsets in lm() in R

r,loops,regression,subset
I would like to loop over various regressions referencing different data subsets, however I'm unable to appropriately call different subsets. For example: dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10) ) x.list <- list(dat$x1,dat$x2,dat$x3) dat1 <- dat[-9,] fit <- list() for(i in 1:length(x.list)){ fit[[i]]...

Why do I get this error below while using the Cubist package in R?

r,regression,decision-tree,non-linear-regression
I have some personal dataset. So I split it into variable to predict and predictors. Following is the syntax: library(Cubist) str(A) 'data.frame': 6038 obs. of 3 variables: $ ads_return_count : num 7 10 10 4 10 10 10 10 10 9 ... $ actual_cpc : num 0.0678 0.3888 0.2947 0.0179...

Why doesn't GridSearchCV give the best score? - scikit-learn

python,r,machine-learning,scikit-learn,regression
I have a dataset with 158 rows and 10 columns. I am trying to build a multiple linear regression model and predict future values. I used GridSearchCV for tuning parameters. Here is my GridSearchCV and regression function: def GridSearch(data): X_train, X_test, y_train, y_test = cross_validation.train_test_split(data, ground_truth_data, test_size=0.3, random_state =...

How to find algo type(regression,classification) in Caret in R for all algos at once?

r,machine-learning,classification,regression,caret
How do I find the model type for all models at once? I know how to access this info if I know the algo name, e.g.: library('caret') tail(names(getModelInfo())) [1] "widekernelpls" "WM" "wsrf" "xgbLinear" "xgbTree" [6] "xyf" getModelInfo()$xyf$type [1] "Classification" "Regression" How do I see the $type for all the algos...

Change basic assumptions of “add trendline” in excel

excel,regression,trendline
I'm plotting some interaction effects that stem from a regression in Stata. I'm using Excel for convenience. The data are curvilinear and I'm adding a polynomial trendline to maximize the fit. The problem I have is that the trendline function seems to assume that the x values are 1, 2,...

Partition dataset using CART regression by leaf node

r,regression
I'm currently trying to modify an existing Stata model in R, and I'm running into problems with a specific step in the process. I need to use a CART regression to divide my dataset up into individual clusters based on their leaf node, such that each leaf node becomes a...

Stata — predict after regression by group_id

regression,stata,predict
I have to run regressions by group_id and then generate the predictions. It doesn't seem like predict allows the "by" option. Is there a way I can predict after running regressions by group_id? The data are stacked by group_id. The regression command I am thinking of using is as follows:...

How to find fourth and fifth regression coefficients in R?

r,regression
I would like to compute 5 regression coefficients. I searched the Internet but did not find anything on this. My data: y=c(2,13,0.4,5,8,10,13) x=c(2,13,0.004,5,8,1,13) z=c(2,3,0.004,15,8,10,1) Normal equation: y=a1x+a2z+a3, where x and z are the independent variables, y is the dependent variable, and a1, a2, and a3 are the parameters of the model. normal fit for...

R Forecast with lagged dependent variable

r,loops,regression,lag,dummy-data
I am using the lm function to fit Pt = a*Pt-1 + b*Xt + a dummy variable for each quarter to the sample data. How do I create an n.ahead=12 forecast? I couldn't figure out how to set up the dummies and Pt-1 for the iteration. Any help is appreciated!
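The iteration asked about above can be sketched as a plain loop: each forecast is fed back in as the lagged value for the next step, and the quarter dummy rotates through 1 to 4. This is only an illustration in Python rather than R. The function name `forecast_lagged`, the coefficient values, and the future X values are all made up here and would in practice come from the fitted model and from whatever X path is assumed for the forecast horizon.

```python
def forecast_lagged(p_last, x_future, quarter_start, a, b, quarter_effects):
    """Iterate P_t = a * P_{t-1} + b * X_t + (effect of the current quarter).

    p_last          -- last observed value of P (hypothetical input)
    x_future        -- assumed future values of X, one per forecast step
    quarter_start   -- quarter (1..4) of the first forecast step
    a, b            -- coefficients taken from a previously fitted model
    quarter_effects -- dict mapping quarter (1..4) to its dummy coefficient
    """
    forecasts = []
    p_prev = p_last
    quarter = quarter_start
    for x_t in x_future:
        p_t = a * p_prev + b * x_t + quarter_effects[quarter]
        forecasts.append(p_t)
        p_prev = p_t                 # the forecast becomes the next lag
        quarter = quarter % 4 + 1    # 1 -> 2 -> 3 -> 4 -> 1
    return forecasts


# Made-up numbers purely for illustration: 12 steps ahead.
effects = {1: 0.0, 2: 1.0, 3: 0.0, 4: -1.0}
path = forecast_lagged(p_last=2.0, x_future=[1.0] * 12, quarter_start=1,
                       a=0.5, b=1.0, quarter_effects=effects)
```

The key design point is that only the first step uses an observed lag; every later step relies on the previous forecast, so errors compound over the 12 steps.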

getting fitted lines with scatterplot matrix in r

r,regression,linear
How do I get a scatterplot matrix that also shows the fitted line in each plot? I know how to use the "abline" function with individual plots, but I don't know how to apply it to a scatterplot matrix.

How to fit an elliptic cone to a set of data?

matlab,regression,curve-fitting,ellipse,best-fit-curve
I have a set of 3d data (300 points) that create a surface which looks like two cones or ellipsoids connected to each other. I want a way to find the equation of a best fit ellipsoid or cone to this dataset. The regression method is not important, the easier...

How to rescale “linear predictor” in drawing nomogram with “rms” package in R?

r,regression
I am trying to draw a nomogram from a logistic regression in R by using the rms package, but currently I have a problem: indeed, I can get the nomogram, but the "linear predictor" axis ranges from -2.5 to +3, and I'd like to know whether I can make it...

R— repeating linear regression in a large dataset

r,regression
I'm an R newbie working with an annual time series dataset (named "timeseries"). The set has one column for year and another 600 columns with the yearly values for different locations ("L1," "L2", etc), e.g. similar to the following: Year L1 L2 L3 L4 1963 0.63 0.23 1.33 1.41 1964...

Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

python,regression,statsmodels
I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity corrected standard errors (via properties like HC0_se, etc.). However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way...

Input format for functions in package strucchange?

r,regression,trend
I'm trying to do change point detection with "monitor" from the strucchange package, but I have trouble getting useful output. My input is a time-stamped data frame, and I would like the breaks to be returned as dates, but they are returned as observation numbers: cDF1 <- myDF[1:80,] >...

Plotting an independent variable under a parameter of another variable in R

r,plot,regression
I have a function predictshrine<-0*rain-399.8993+5*crops+50.4296*log(citysize)+ 4.5071*wonders*chief+.02301*children*deaths+1.806*children+ .10799*deaths-2.0755*wonders-.0878*children^2+.001062*children^3- .000004288*children^4-.009*deaths^2+.0000530238*deaths^3+ 7.974*sqrt(children)+.026937*wonders^2-.0001305*wonders^3 I also have a sequence children<-seq(0,100,length=500) and a for loop for(deaths in c(0,5,10,50,100,200)). Now what I want to do is plot predictshrine vs children when deaths equals certain amounts and...

An error while looping a linear regression

r,loops,data.frame,regression
I would like to run a loop over each category of one of the variables and produce a prediction from each regression, so that the sum of the prediction variable will be deducted from the target variable. Here is my toy data and code: df <- read.table(text...

How to make a for loop to find interactions between several variables in R?

r,regression,linear
I have a data set with 17 variables; the data is available at this link: http://www.uwyo.edu/crawford/stat3050/final%20project/maxwellchandler.txt I want to find significant interactions between the variables. For example: fitcivilian<-lm(Civilian~Stock+Terrorism+log(Firepower)+Payload+Bombs*Temperature+FirstAid+Spies+Personnel+IG88, data=data) where Bombs*Temperature is significant. What I want to do is test EVERY variable against EVERY OTHER variable, like doing Bombs*Temperature Bombs*Napalm...

How do I add a trendline with categories to HighCharts?

javascript,jquery,highcharts,regression
With the data source expressed as X,Y the trendline appears correctly. However, things don't work with categories. Is there a good way to add a timeline without reformatting the data? JSFiddle: http://jsfiddle.net/9r9ba64r/5/ $(function () { // series data with Y and categories (this doesn't work!) //var sourceData = [{y:100} , {y:200}, {y:300}, {y:400}];...

Python stats.linregress syntax error

python,syntax,regression,linear
I am trying to calculate the regression of the x and y variables, trace_no and twtt, respectively. The variables are 151 x 1 arrays. The code produces a syntax error: File "./seabed_dip_correction.py", line 32 slope, intercept, r_value, p_value, std_err, Syy/Sxx = stats.linregress(trace_no,twtt) SyntaxError: can't assign to operator I have...
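The traceback above points at the assignment target `Syy/Sxx`: an expression containing an operator cannot appear on the left-hand side of `=`, only names (or other assignable targets) can. A minimal sketch of the pattern follows. It uses a hypothetical pure-Python stand-in, `simple_linregress`, which returns only three values so that it runs without SciPy, and the input arrays are made up.

```python
# Hypothetical stand-in for scipy.stats.linregress so the sketch runs with
# the standard library only; it returns fewer values than the SciPy function.
def simple_linregress(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_value = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r_value


def centered_sum_sq(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# Made-up data standing in for the 151-element arrays in the question.
trace_no = [0.0, 1.0, 2.0, 3.0]
twtt = [1.0, 3.0, 5.0, 7.0]

# Every assignment target is a plain name, one per returned value.
slope, intercept, r_value = simple_linregress(trace_no, twtt)

# The ratio is an expression, so it is computed separately and bound to a name.
syy_over_sxx = centered_sum_sq(twtt) / centered_sum_sq(trace_no)
```

With SciPy itself, `stats.linregress` unpacks into five values (slope, intercept, r-value, p-value, standard error), so five plain names would be needed on the left-hand side and any extra quantity such as the ratio would be computed on its own line.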

Multiple regressions with subsets of data using dplyr in R

r,regression,dplyr
I have a data frame "DF" with this glimpse(): Observations: 1244160 Variables: $ Test (fctr) 72001.txt, 72002.txt, 72003.txt, 72004.txt, 72005.txt,... $ x (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1... $ y (int) 1, 1, 1, 1, 1,...

Constrained high order polynomial regression

matlab,regression
I am doing some bone segmentation, where the result of the segmentation is a set of points placed in a circular pattern around the bone. However, as it is taken using a qCT scan, there is quite a lot of noise (from e.g. flesh) on the points that I have. So the overall problem...

Determining regression coefficients for data - MATLAB

matlab,matrix,regression,numerical-methods
I am doing a project involving scientific computing. The following are three variables and their values I got after some experiments. There is also an equation with three unknowns, a, b and c: x=(a+0.98)/y+(b+0.7)/z+c How do I get values of a,b,c using the above? Is this possible in MATLAB?...
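One way to read the equation above: it is linear in a, b and c once the known constants are moved to the left-hand side, so it can be treated as an ordinary least-squares problem. The rearrangement below is a sketch assuming n measured triples $(x_i, y_i, z_i)$; the index $i$ and the symbols $A$, $t$ and $\theta$ are introduced here for illustration and do not appear in the question.

```latex
x_i - \frac{0.98}{y_i} - \frac{0.7}{z_i}
  = a\,\frac{1}{y_i} + b\,\frac{1}{z_i} + c, \qquad i = 1,\dots,n
\\[4pt]
A = \begin{bmatrix}
      1/y_1 & 1/z_1 & 1 \\
      \vdots & \vdots & \vdots \\
      1/y_n & 1/z_n & 1
    \end{bmatrix},
\qquad
t_i = x_i - \frac{0.98}{y_i} - \frac{0.7}{z_i},
\qquad
\hat{\theta} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}
             = \arg\min_{\theta}\, \lVert A\theta - t \rVert_2^2
```

In MATLAB, a least-squares solution of an overdetermined system of this form can be obtained with the backslash operator, e.g. `A\t`, provided there are at least three measurements and the columns of A are not collinear.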

chaid regression tree to table conversion in r

r,packages,regression,decision-tree
I used the CHAID package from this link. It gives me a chaid object which can be plotted. I want a decision table with each decision rule in a column instead of a decision tree, but I don't understand how to access nodes and paths in this chaid object. Kindly help me...

Regression gives error on one of the input variables “contrasts can be applied only to factors with 2 or more levels”

r,regression,categorical-data
I am running a logit regression in R with a large number of input variables. newlogit <- glm(install. ~ SIZES + GROSSCONSUMPTION.... + NETTCONSUMPTION..... + NETTGENERATION....... + GROSSGENERATION.... + Variable. + Fixed + Cost.of.gross.cons + Cost.of.net.cons + Cons.savings + generation.gains + Total.savings + Cost.of.system + Payback + Self.consumption + Total.consumption.as.solar...

How to plot a scatter plot with error bars indicating standard deviation

matlab,statistics,regression
I have a set of data, Y vs. X (~20k data points), which forms a scatter when plotted. I want to plot error bars for Y over ranges of X (e.g. if the X axis is of length 100, then I want the error bars to represent the standard deviation of Y...

Graphing different sets of data on same graph within a ‘for’ loop MATLAB

matlab,for-loop,plot,regression
I have a problem with graphing different plots on the same graph within a ‘for’ loop. I hope someone can point me in the right direction. I have a 2-D array, with discrete chunks of data in and amongst zeros. My data is the following: A= 0 0...

Tidy approach to regression models, ideally with dplyr

r,regression,dplyr,lm
Reading the documentation for do() in dplyr, I've been impressed by the ability to create regression models for groups of data and was wondering whether it would be possible to replicate it using different independent variables rather than groups of data. So far I've tried require(dplyr) data(mtcars) models <- data.frame(var...