I am using lm for MLR and CVlm for cross-validation. My data contains two categorical variables (one of them with 11 levels and the other one with only 2). Everything seems to work fine when using lm, the problem is when I try to use CVlm. I have errors because...

I would like to predict multiple dependent variables using multiple predictors. If I understood correctly, in principle one could make a bunch of linear regression models that each predict one dependent variable, but if the dependent variables are correlated, it makes more sense to use multivariate regression. I would like...
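
A minimal sketch in Python with NumPy of one point relevant to this question: under ordinary least squares, fitting all dependent variables jointly gives the same coefficient estimates as fitting each one separately (correlation among the outcomes matters for joint inference, not for the point estimates). The data is synthetic and every variable name is hypothetical.

```python
import numpy as np

# Synthetic data (an assumption for illustration): 50 samples, 3 predictors, 2 outcomes.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_B = np.array([[1.0, -2.0], [0.5, 0.0], [3.0, 1.5]])
Y = X @ true_B + 0.1 * rng.normal(size=(50, 2))

# Add an intercept column.
X1 = np.column_stack([np.ones(len(X)), X])

# Joint fit: lstsq accepts a 2-D right-hand side and solves every outcome column at once.
B_joint, *_ = np.linalg.lstsq(X1, Y, rcond=None)

# Separate fits: one regression per outcome column.
B_separate = np.column_stack([
    np.linalg.lstsq(X1, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])
])

same_coefficients = np.allclose(B_joint, B_separate)
print(B_joint.shape, same_coefficients)
```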

I am working on a machine learning project. I am building a multivariate linear regression model in Python, and here is my code: import matplotlib.pyplot as plt import numpy as np import pandas as pd from sklearn.linear_model import LinearRegression data = pd.read_csv("train.csv", delimiter=",", header=0) x = data['Col1'][:, np.newaxis] y = data['Expected']...

I am building a linear regression model to predict 2015 values. I have data from 2013 and 2014. My question is, how can I use both the data from 2013 and 2014 to train my linear regression model in R? I have: model1 = lm(x ~ y, data = data2013)...

I want to run linear regression for the same outcome and a number of covariates, leaving out one covariate in each model. I have looked at the example on this page, but that did not provide what I wanted. Sample data a <- data.frame(y = c(30,12,18), x1 = c(7,6,9), x2...

I have implemented a simple Linear Regression (single variate for now) example in C++ to help me get my head around the concepts. I'm pretty sure that the key algorithm is right but my performance is terrible. This is the method which actually performs the gradient descent: void LinearRegression::BatchGradientDescent(std::vector<std::pair<int,int>> &...

I'm running a phylogenetic analysis using the caper package, where the regression function (which uses phylogenetically independent contrasts) is crunch. The crunch function uses an object internal to the caper package called caic. The model is started via: crunchMod <- crunch(y ~ f(x), data = comparison) When I run summary(crunchMod)...

How do I perform a regression with date constraints? I only want to perform a regression on the "non-zero" part of the data set. The main issue is that columns 2 and 3 start at different dates, and I have written a loop that will perform a regression from a...

I am trying to output 2 different graphs, each with a regression line. I am using the mtcars data set, which I believe you can load into R. So, I am comparing 2 different pairs of variables to create a regression line. And the problem seems to be that the...

I am new to Python and I am trying to build a simple linear regression model. I am able to build the model and see the results, but when I try to look at the parameters I get an error and I am not sure where I am going wrong....

Here is my data. Comparing the means seems to yield some interesting results, and it indeed does, as revealed by the linear model: lm(data=data, y~factor(x)) Now, it also looks like the variances are not equal in all groups. Here is a plot of the variance in y for each...

Plotting a single variable function in Python is pretty straightforward with matplotlib. But I'm trying to add a third axis to the scatter plot so I can visualize my multivariate model. Here's an example snippet, with 30 outputs: import numpy as np np.random.seed(2) ## generate a random data set x...

Matlab defines LinearModel and GeneralizedLinearMixedModel classes. Browsing the documentation indicates that either (i) one is derived from the other, or (ii) there is automatic conversion. These are complex objects, and I am just starting to explore them, so I apologize if their relationship is obvious, but what exactly is their...

I want to identify data points with high leverage and large residuals. My aim is to remove them and repeat the linear regression analyses. Specifically, I want to remove data points with studentized residuals larger than 3 and data points with Cook's D > 4/n. How could I perform that in the sample data...
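
The two rules named in this question can be sketched in Python with NumPy as follows. This is an illustration under stated assumptions, not the asker's code: the data is synthetic, the residuals are internally studentized (the excerpt does not say which variant is meant), the threshold of 3 is applied to the absolute value, and a point is dropped if it breaks either rule.

```python
import numpy as np

# Synthetic data (an assumption): a straight line with one gross outlier at index 10.
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0 + 0.1 * np.sin(x)
y[10] += 50.0

X = np.column_stack([np.ones_like(x), x])  # design matrix with intercept
n, p = X.shape

# Hat matrix diagonal (leverage) and ordinary residuals.
H = X @ np.linalg.inv(X.T @ X) @ X.T
leverage = np.diag(H)
residuals = y - H @ y

# Internally studentized residuals and Cook's distance.
s2 = residuals @ residuals / (n - p)
studentized = residuals / np.sqrt(s2 * (1.0 - leverage))
cooks_d = studentized**2 / p * leverage / (1.0 - leverage)

# Flag a point if it breaks either rule, then refit without the flagged points.
flagged = (np.abs(studentized) > 3) | (cooks_d > 4.0 / n)
coef_clean, *_ = np.linalg.lstsq(X[~flagged], y[~flagged], rcond=None)
print(np.flatnonzero(flagged), coef_clean)
```

Refitting after removal changes the leverages and residuals of the remaining points, so whether to repeat the screening on the cleaned data is a separate design decision.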

In R, is model.matrix(~ Treatment + Time + Treatment*Time, table_design) equivalent to model.matrix(~ Treatment*Time, table_design) Thanks....

I am a newbie to Java and I want to apply ordinary linear regression to two series, say [1, 2, 3, 4, 5] and [2, 3, 4, 5, 6]. I learned that there is a library called Commons Math. However, the documentation is difficult to understand, is there any...
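
For reference, the arithmetic behind a simple linear regression on these two series is small enough to write out directly. The sketch below is in Python with NumPy rather than Java and does not use the library mentioned in the question; it only shows the closed-form slope and intercept, which any library should reproduce.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 3, 4, 5, 6], dtype=float)

# Closed-form simple linear regression: slope = S_xy / S_xx.
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean
print(slope, intercept)
```

Because the second series is the first shifted up by one, the fitted line is y = x + 1.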

I want to perform a moving window regression on every pixel of two raster stacks representing Band3 and Band4 of Landsat data. The result should be two additional stacks, one representing the Intercept and the other one representing the slope of the regression. So layer 1 of stack "B3" and...

I have a linear model: mod=lm(weight~age, data=f2) I would like to input an age value and have returned the corresponding weight from this model. This is probably simple, but I have not found a simple way to do this....

I'm trying to build a Linear regression Shiny app with a custom file input. I have a problem with the reactive function in Server.R. The reactive function data returns a data frame called qvdata. When data() is called in renderPlot and I plot from the qvdata I get the following...

Suppose there are 2 inputs (X1 and X2) and 1 target output (t) to be estimated by a neural network (each node has 6 samples): X1 = [2.765405915 2.403146899 1.843932529 1.321474515 0.916837222 1.251301467]; X2 = [84870 363024 983062 1352580 804723 845200]; t = [-0.12685144347197 -0.19172223428950 -0.29330584684934 -0.35078062276141 0.03826908777226 0.06633047875487]; I...

Below are 4 datasets (I've just created them randomly for the sake of providing a reproducible code). I created a list of these so I could apply "lm" to these multiple datasets at once : H<-data.frame(replicate(10,sample(0:20,10,rep=TRUE))) C<-data.frame(replicate(5,sample(0:100,10,rep=FALSE))) R<-data.frame(replicate(7,sample(0:30,10,rep=TRUE))) E<-data.frame(replicate(4,sample(0:40,10,rep=FALSE))) dsets<-list(H,C,R,E) models<-lapply(dsets,function(x)lm(X1~.,data=x)) lapply(models,summary) The variables in each of the...

Can anyone explain to me the difference between ols in statsmodels.formula.api versus ols in statsmodels.api? Using the Advertising data from the ISLR text, I ran an ols using both, and got different results. I then compared with scikit-learn's LinearRegression. import numpy as np import pandas as pd import statsmodels.formula.api as...

I'm doing some regression analysis and I've come across some strange behavior from the lda function in the MASS library. Specifically, it seems to be unable to accept a string as its formula argument. This doesn't appear to be a problem for the base glm functions. I've constructed a small...

I have a dataframe like below : a1 a2 a3 a4 1 3 3 5 5 2 4 3 5 5 3 5 4 6 5 4 6 5 7 3 I want to do linear regression for every two columns in the dataframe, and set intercept as 0. In...

I am trying to build a dynamic regression model and so far I did it with the dynlm package. Basically the model looks like this y_t = a*x1_t + b*x2_t + ... + c*y_(t-1). y_t shall be predicted, x1_t and x2_t will be given and so is y_(t-1). Building the...

I have a large data set of the following format: First column is type, and the subsequent columns are different times that 'type' happens. I want to calculate the slope of each row (~7000 rows) for subset T0-T2 and then t0-t2 and output that information, then get the average of...

I am using scikit-learn linear regression with a single variable to predict y from x. The argument is of float datatype. How can I transform the float into a numpy array to predict the output? import matplotlib.pyplot as plt import pandas import numpy as np from sklearn import linear_model import sys...
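
scikit-learn estimators expect their input as a 2-D array of shape (n_samples, n_features), so a single float has to be wrapped before it is passed to predict. A minimal sketch of that reshaping with NumPy alone; `value` and `values` are hypothetical inputs, not names from the question's code.

```python
import numpy as np

value = 3.5  # hypothetical single float input

# One sample with one feature becomes a 1x1 array.
X_new = np.array([[value]])
print(X_new.shape)

# The same idea for an existing 1-D array of several inputs:
# reshape(-1, 1) makes each element its own single-feature sample.
values = np.array([1.0, 2.5, 4.0])
X_many = values.reshape(-1, 1)
print(X_many.shape)
```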

Hi, I have a linear regression model that I am trying to optimise. I am optimising the span of an exponential moving average and the number of lagged variables that I use in the regression. However, I keep finding that the results and the calculated MSE keep coming up with...

I have the following 2D list list = [[1,1,a],[2,2,b],[3,3,c]] and I want to convert it into one 2D list and one array: sublist = [[1,1],[2,2],[3,3]] subarray = [a,b,c] Is there any convenient way to do that in Python? I'm new to Python, so I don't know if there's...
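
One way to sketch this split is with two list comprehensions. The strings "a", "b", "c" stand in for whatever values the question's a, b, c hold, and the variable is called `rows` here rather than `list` so that the built-in name is not shadowed.

```python
rows = [[1, 1, "a"], [2, 2, "b"], [3, 3, "c"]]  # placeholder values for a, b, c

# Everything except the last element of each row.
sublist = [row[:-1] for row in rows]

# Only the last element of each row.
subarray = [row[-1] for row in rows]

print(sublist)
print(subarray)
```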

I am trying fastbw function of rms package for backward regression as follows (using mtcars dataset): > mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars) > mod Linear Regression Model ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data = mtcars) Model Likelihood Discrimination Ratio Test Indexes...

I have used Statsmodels to generate an OLS linear regression model to predict a dependent variable based on about 10 independent variables. The independent variables are all categorical. I am interested in looking closer at the significance of the coefficients for one of the independent variables. There are 4 categories,...

The function MASS::lm.gls fits a linear model using generalized least squares, and returns an object of class "lm.gls", but it has no print, summary or other methods. I could define these simply by hijacking the methods for "lm" objects print.lm.gls <- function(object, ...) { class(object) <- "lm" print(object, ...) }...

If it has one feature it's easy. Just graph it. One of the records there looks like (18, 15). Simple. But if we have multiple features that adds more dimensions to the graph, right? So how can you visualize your data set and determine whether or not linear regression is...

I am new to machine learning and trying to write a linear regression algorithm where I have a categorical feature - Keywords. I can have around 10 million keywords in my model. As per the instructions given here - http://www.psychstat.missouristate.edu/multibook/mlt08m.html It seems like I should dichotomize categorical features. Does it...