how to generate a linear regression matrix like cor()

r,data.frame,linear-regression
I have a data frame like the one below:

      a1 a2 a3 a4
    1  3  3  5  5
    2  4  3  5  5
    3  5  4  6  5
    4  6  5  7  3

I want to fit a linear regression for every pair of columns in the data frame, with the intercept set to 0. In...

Calculating the slope of each row in a large data set using R

r,data.frame,subset,linear-regression
I have a large data set of the following format: First column is type, and the subsequent columns are different times that 'type' happens. I want to calculate the slope of each row (~7000 rows) for subset T0-T2 and then t0-t2 and output that information, then get the average of...

Predict y value for a given x in R

r,linear-regression,predict
I have a linear model, mod=lm(weight~age, data=f2), and I would like to input an age value and get back the corresponding weight predicted by this model. This is probably simple, but I have not found a simple way to do this....

Does scikit-learn perform “real” multivariate regression (multiple dependent variables)?

python,machine-learning,scikit-learn,linear-regression,multivariate-testing
I would like to predict multiple dependent variables using multiple predictors. If I understood correctly, in principle one could make a bunch of linear regression models that each predict one dependent variable, but if the dependent variables are correlated, it makes more sense to use multivariate regression. I would like...
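A minimal sketch of what fitting several dependent variables at once looks like, assuming scikit-learn and NumPy are installed. The data are random and purely illustrative, and the variable names are hypothetical. It suggests that passing a 2-D target to LinearRegression yields the same coefficients as fitting one model per target, so correlation between the targets is not used in the fit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative random data: 50 samples, 3 predictors, 2 dependent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y = rng.normal(size=(50, 2))

# One model fitted on both targets at once (y has shape (n_samples, n_targets)).
joint = LinearRegression().fit(X, Y)

# One separate model per target, for comparison.
separate = [LinearRegression().fit(X, Y[:, k]) for k in range(Y.shape[1])]

# Compare the coefficient rows of the joint fit with the per-target fits.
same_coefs = all(
    np.allclose(joint.coef_[k], separate[k].coef_) for k in range(Y.shape[1])
)

print(joint.coef_.shape)  # (2, 3): one row of coefficients per target
print(same_coefs)
```

If the comparison holds on real data as well, a model that exploits correlated targets would need a different estimator than plain least squares.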

Statsmodels - Wald Test for significance of trend in coefficients in Linear Regression Model (OLS)

python,statistics,linear-regression,statsmodels
I have used Statsmodels to generate an OLS linear regression model to predict a dependent variable based on about 10 independent variables. The independent variables are all categorical. I am interested in looking closer at the significance of the coefficients for one of the independent variables. There are 4 categories,...

Moving window regression

r,linear-regression,raster
I want to perform a moving window regression on every pixel of two raster stacks representing Band3 and Band4 of Landsat data. The result should be two additional stacks, one representing the Intercept and the other one representing the slope of the regression. So layer 1 of stack "B3" and...

Multiple Regression lines in R

r,linear-regression
I am trying to output 2 different graphs, each with a regression line. I am using the mtcars data set, which I believe you can load into R. So, I am comparing 2 different pairs of variables to create a regression line. And the problem seems to be that the...

writing a wrapper for a linear modeling function [MASS::lm.gls()]

r,closures,wrapper,linear-regression
The function MASS::lm.gls fits a linear model using generalized least squares and returns an object of class "lm.gls", but it has no print, summary or other methods. I could define these simply by hijacking the methods for "lm" objects:

    print.lm.gls <- function(object, ...) {
      class(object) <- "lm"
      print(object, ...)
    }...

How to perform linear regression on the starting points of a dataset using R

r,linear-regression
How do I perform a regression with date constraints? I only want to perform a regression on the "non-zero" part of the data set. The main issue is that columns 2 and 3 start at different dates, and I have written a loop that will perform a regression from a...

R: Base functions cannot use object from the package Caper

r,package,linear-regression
I'm running a phylogenetic analysis using the caper package, where the regression function (which uses phylogenetically independent contrasts) is crunch. The crunch function uses an object internal to the caper package called caic. The model is started via: crunchMod <- crunch(y ~ f(x), data = comparison) When I run summary(crunchMod)...

Multiplicative design matrix in R

r,design,matrix,linear-regression
In R, is

    model.matrix(~ Treatment + Time + Treatment*Time, table_design)

equivalent to

    model.matrix(~ Treatment*Time, table_design)

? Thanks....

Linear regression poor gradient descent performance

c++,algorithm,machine-learning,artificial-intelligence,linear-regression
I have implemented a simple Linear Regression (single variate for now) example in C++ to help me get my head around the concepts. I'm pretty sure that the key algorithm is right but my performance is terrible. This is the method which actually performs the gradient descent: void LinearRegression::BatchGradientDescent(std::vector<std::pair<int,int>> &...

Any Ideas for Predicting Multiple Linear Regression Coefficients by using Neural Networks (ANN)?

matlab,neural-network,linear-regression,backpropagation,perceptron
Suppose there are 2 inputs (X1 and X2) and 1 target output (t) to be estimated by a neural network (each node has 6 samples):

    X1 = [2.765405915 2.403146899 1.843932529 1.321474515 0.916837222 1.251301467];
    X2 = [84870 363024 983062 1352580 804723 845200];
    t = [-0.12685144347197 -0.19172223428950 -0.29330584684934 -0.35078062276141 0.03826908777226 0.06633047875487];

I...

How to divide a 2D list into one 2D list and one array (remove the last column)

python,linear-regression
I have the following 2D list: list = [[1,1,a],[2,2,b],[3,3,c]]. I want to convert it into one 2D list and one array: sublist = [[1,1],[2,2],[3,3]] and subarray = [a,b,c]. Is there any convenient way to do that in Python? I'm new to Python, so I don't know if there's...
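The split described above can be sketched in plain Python with two list comprehensions. This is only an illustration: the question's a, b and c are undefined placeholders, so strings stand in for them here, and the names rows, features and labels are hypothetical.

```python
# Hypothetical data: strings stand in for the question's undefined a, b, c.
rows = [[1, 1, "a"], [2, 2, "b"], [3, 3, "c"]]

# Everything except the last column of each row (still a 2D list).
features = [row[:-1] for row in rows]

# The last column of each row, as a flat list.
labels = [row[-1] for row in rows]

print(features)  # [[1, 1], [2, 2], [3, 3]]
print(labels)    # ['a', 'b', 'c']
```

Slicing copies each row, so the original list is left unchanged.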

OLS using statsmodels.formula.api versus statsmodels.api

python,linear-regression
Can anyone explain to me the difference between ols in statsmodels.formula.api and ols in statsmodels.api? Using the Advertising data from the ISLR text, I ran an ols using both, and got different results. I then compared with scikit-learn's LinearRegression.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as...

series object not callable with linear regression in python

python,scikit-learn,linear-regression,statsmodels
I am new to Python and I am trying to build a simple linear regression model. I am able to build the model and see the results, but when I try to look at the parameters I get an error and I am not sure where I am going wrong....

Identify and remove data points with high leverage and large residuals

r,linear-regression,outliers
I want to identify data points with high leverage and large residuals. My aim is to remove them and repeat the linear regression analyses. Specifically, I want to remove data points with studentized residuals larger than 3 and data points with Cook's D > 4/n. How could I perform that in the sample data...

Linear regression of same outcome, similar number of covariates and one unique covariate in each model

r,formula,linear-regression
I want to run linear regression for the same outcome and a number of covariates minus one covariate in each model. I have looked at the example on this page, but that did not provide what I wanted. Sample data: a <- data.frame(y = c(30,12,18), x1 = c(7,6,9), x2...

need finite 'xlim' values using reactive function in Shiny

r,shiny,linear-regression
I'm trying to build a Linear regression Shiny app with a custom file input. I have a problem with the reactive function in Server.R. The reactive function data returns a data frame called qvdata. When data() is called in renderPlot and I plot from the qvdata I get the following...

P values from fastbw regression function of rms package

r,regression,linear-regression,rms
I am trying the fastbw function of the rms package for backward regression as follows (using the mtcars dataset):

    > mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars)
    > mod
    Linear Regression Model
    ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data = mtcars)
    Model Likelihood Discrimination Ratio Test Indexes...

CVlm with categorical variables: factor has new levels

r,statistics,linear-regression
I am using lm for MLR and CVlm for cross-validation. My data contains two categorical variables (one of them with 11 levels and the other one with only 2). Everything seems to work fine when using lm; the problem arises when I try to use CVlm. I have errors because...

multivariate linear regression inputs fitting

python-2.7,numpy,machine-learning,linear-regression
I am working on a machine learning project. I am building a multivariate linear regression model in Python, and here is my code:

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LinearRegression
    data = pd.read_csv("train.csv", delimiter=",", header=0)
    x = data['Col1'][:, np.newaxis]
    y = data['Expected']...

Getting different result each time I run a linear regression using scikit

python,pandas,scikit-learn,linear-regression
Hi, I have a linear regression model that I am trying to optimise. I am optimising the span of an exponential moving average and the number of lagged variables that I use in the regression. However, I keep finding that the results and the calculated mse keep coming up with...

R: Dynamic linear regression with dynlm package, how to predict()?

r,dynamic,linear-regression,predict
I am trying to build a dynamic regression model and so far I did it with the dynlm package. Basically the model looks like this y_t = a*x1_t + b*x2_t + ... + c*y_(t-1). y_t shall be predicted, x1_t and x2_t will be given and so is y_(t-1). Building the...

How can I perform a linear regression on my group variances in R?

r,statistics,linear-regression
Here is my data. Comparing the means seems to yield some interesting results, and it indeed does, as revealed by the linear model: lm(data=data, y~factor(x)). Now, it also looks like the variances are not equal in all groups. Here is a plot of the variance in y for each...

How to plot a multivariate function in Python?

python,numpy,matplotlib,linear-regression
Plotting a single-variable function in Python is pretty straightforward with matplotlib. But I'm trying to add a third axis to the scatter plot so I can visualize my multivariate model. Here's an example snippet, with 30 outputs:

    import numpy as np
    np.random.seed(2)
    ## generate a random data set
    x...

How to manage a huge number of values for a categorical feature in linear regression

machine-learning,linear-regression
I am new to machine learning and trying to write a linear regression algorithm where I have a categorical feature, Keywords. I can have around 10 million keywords in my model. According to the instructions given at http://www.psychstat.missouristate.edu/multibook/mlt08m.html, it seems like I should dichotomize categorical features. Does it...

How do you know if a data set is right for linear regression if it has multiple features?

machine-learning,statistics,linear-regression
If it has one feature it's easy. Just graph it. One of the records there looks like (18, 15). Simple. But if we have multiple features that adds more dimensions to the graph, right? So how can you visualize your data set and determine whether or not linear regression is...

How to use multiple data to train a linear regression model in R

r,linear-regression,data-analysis
I am building a linear regression model to predict 2015 values. I have data from 2013 and 2014. My question is, how can I use both the data from 2013 and 2014 to train my linear regression model in R? I have: model1 = lm(x ~ y, data = data2013)...

applying lm to multiple datasets

r,loops,linear-regression,lm
Below are 4 datasets (I've just created them randomly for the sake of providing reproducible code). I created a list of these so I could apply "lm" to these multiple datasets at once:

    H<-data.frame(replicate(10,sample(0:20,10,rep=TRUE)))
    C<-data.frame(replicate(5,sample(0:100,10,rep=FALSE)))
    R<-data.frame(replicate(7,sample(0:30,10,rep=TRUE)))
    E<-data.frame(replicate(4,sample(0:40,10,rep=FALSE)))
    dsets<-list(H,C,R,E)
    models<-lapply(dsets,function(x)lm(X1~.,data=x))
    lapply(models,summary)

The variables in each of the...

Why won't lda() accept a string as its 'formula' argument?

r,linear-regression
I'm doing some regression analysis and I've come across some strange behavior from the lda function in the MASS library. Specifically, it seems to be unable to accept a string as its formula argument. This doesn't appear to be a problem for the base glm functions. I've constructed a small...

Relationship between LinearModel & GeneralizedLinearMixedModel classes

matlab,oop,time-series,linear-regression,superclass
Matlab defines LinearModel and GeneralizedLinearMixedModel classes. Browsing the documentation indicates that either (i) one is derived from the other, or (ii) there is automatic conversion. These are complex objects, and I am just starting to explore them, so I apologize if their relationship is obvious, but what exactly is their...

use common math library in java

java,linear-regression
I am a newbie to Java and I want to apply ordinary linear regression to two series, say [1, 2, 3, 4, 5] and [2, 3, 4, 5, 6]. I learned that there is a library called Commons Math. However, the documentation is difficult to understand. Is there any...

How to pass float argument in predict function of scikit linear regression?

python,numpy,scikit-learn,linear-regression
I am using scikit linear regression with a single variable to predict y from x. The argument is a float. How can I transform the float into a numpy array to predict the output?

    import matplotlib.pyplot as plt
    import pandas
    import numpy as np
    from sklearn import linear_model
    import sys...
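One way to pass a single float to predict is to wrap it in a 2-D array of shape (1, 1), since scikit-learn estimators expect input of shape (n_samples, n_features). The following is a minimal sketch, assuming scikit-learn and NumPy are installed. The training data are made up (points on y = 2x + 1) and are not the asker's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data lying exactly on y = 2x + 1 (an assumption for illustration).
X_train = np.array([[1.0], [2.0], [3.0], [4.0]])  # shape (4, 1): 4 samples, 1 feature
y_train = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression().fit(X_train, y_train)

x_new = 5.0                    # the single float to predict for
X_new = np.array([[x_new]])    # wrap it as one sample with one feature: shape (1, 1)

y_pred = model.predict(X_new)  # returns an array with one prediction
print(float(y_pred[0]))
```

An equivalent alternative is np.array([x_new]).reshape(1, -1), which produces the same (1, 1) shape.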