
## Why doesn't GridSearchCV give the best score? - scikit-learn

python,r,machine-learning,scikit-learn,regression
I have a dataset with 158 rows and 10 columns. I am trying to build a multiple linear regression model to predict future values. I used GridSearchCV for tuning parameters. Here is my GridSearchCV and regression function: def GridSearch(data): X_train, X_test, y_train, y_test = cross_validation.train_test_split(data, ground_truth_data, test_size=0.3, random_state =...

## Input format for functions in package strucchange?

r,regression,trend
I'm trying to do change point detection with 'monitor' from the strucchange package, but I have trouble getting useful output. My input is a time-stamped data frame, and I would like the breaks to be returned as dates, but they are returned as observation numbers: cDF1 <- myDF[1:80,] >...

## Graphing different sets of data on the same graph within a ‘for’ loop in MATLAB

matlab,for-loop,plot,regression
I have a problem with graphing different plots on the same graph within a ‘for’ loop. I hope someone can point me in the right direction. I have a 2-D array, with discrete chunks of data in and amongst zeros. My data is the following: A= 0 0...

## Change basic assumptions of “add trendline” in Excel

excel,regression,trendline
I'm plotting some interaction effects that stem from a regression in Stata. I'm using Excel for convenience. The data are curvilinear and I'm adding a polynomial trendline to maximize the fit. The problem I have is that the trendline function seems to assume that the x values are 1, 2,...

## Getting coefficient at best lambda in glmnet in R

r,lambda,regression,glmnet

## Forward subset selection in R without intercept

r,statistics,regression,regression-testing
I am developing a multiple regression model, using the forward subset selection method to reduce the number of parameters and Mallows' Cp as the selection criterion. However, this is an engineering problem and it does not make sense to have an intercept, i.e. when all the...

## Modelling interactions with only a subset of the levels of a factor in R

regression,interaction
Let's first look at lm. I have a continuous explanatory variable $X$ and a factor $F$ modelling seasonal aspects (8 levels in the example). Let $\beta$ denote the slope for $X$; I want to model interactions of the slope with the factor. It is some kind of physical model thus...

## How do you use variables in a regression formula in R?

r,regression
How do you use a variable in a regression formula? For example, using the 'Animals' dataset (in MASS), the following works fine: data(Animals) model <- lm(body ~ brain, data = Animals) But what I want to do is: data(Animals) x <- "body" y <- "brain" model <- lm(x ~ y,...

## Regression gives error on one of the input variables “contrasts can be applied only to factors with 2 or more levels”

r,regression,categorical-data
I am running a logit regression in R with a large number of input variables. newlogit <- glm(install. ~ SIZES + GROSSCONSUMPTION.... + NETTCONSUMPTION..... + NETTGENERATION....... + GROSSGENERATION.... + Variable. + Fixed + Cost.of.gross.cons + Cost.of.net.cons + Cons.savings + generation.gains + Total.savings + Cost.of.system + Payback + Self.consumption + Total.consumption.as.solar...

## How to find fourth and fifth regression coefficients in R?

r,regression
I would like to compute 5 regression coefficients. I searched the Internet but did not find anything on this. My data: y=c(2,13,0.4,5,8,10,13) x=c(2,13,0.004,5,8,1,13) z=c(2,3,0.004,15,8,10,1) Normal equation: y=a1x+a2z+a3, where x and z are the independent variables, y is the dependent variable, and a1, a2, and a3 are the parameters of the model. Normal fit for...

## Constrained high order polynomial regression

matlab,regression
I am doing some bone segmentation, the result of which is a set of points placed in a circular pattern around the bone. However, as it is taken using a qCT scan, there is quite a lot of noise (e.g. from flesh) in the points that I have. So the overall problem...

## How to rescale “linear predictor” in drawing nomogram with “rms” package in R?

r,regression
I am trying to draw a nomogram from a logistic regression in R using the rms package, but I have a problem: I can get the nomogram, but the "linear predictor" axis ranges from -2.5 to +3, and I'd like to know whether I can make it...

## Python stats.linregress syntax error

python,syntax,regression,linear
I am trying to calculate the regression of the x and y variables, trace_no and twtt, respectively. The variables are 151 x 1 arrays. The code is outputting a syntax error: File "./seabed_dip_correction.py", line 32 slope, intercept, r_value, p_value, std_err, Syy/Sxx = stats.linregress(trace_no,twtt) SyntaxError: can't assign to operator I have...
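One possible reading of this error, offered as a sketch rather than as the accepted answer: `Syy/Sxx` is an expression, not a name, so Python cannot use it as an assignment target, and `scipy.stats.linregress` returns five values (slope, intercept, r-value, p-value, standard error of the slope). The arrays below are made-up stand-ins for `trace_no` and `twtt`; only the unpacking pattern matters.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-ins for the 151-element arrays in the question.
trace_no = np.arange(151, dtype=float)
twtt = 2.0 * trace_no + 5.0  # exactly linear, for illustration only

# Unpack the five values linregress returns; every target is a plain name.
slope, intercept, r_value, p_value, std_err = stats.linregress(trace_no, twtt)

print(slope, intercept, r_value)
```

If the real arrays have shape (151, 1), flattening them first with `.ravel()` is probably a sensible precaution. Any ratio such as Syy/Sxx would then be computed on a separate line from named variables.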

## Why do I get this error below while using the Cubist package in R?

r,regression,decision-tree,non-linear-regression
I have a personal dataset, which I split into the variable to predict and the predictors. The code is as follows: library(Cubist) str(A) 'data.frame': 6038 obs. of 3 variables: $ ads_return_count: num 7 10 10 4 10 10 10 10 10 9 ... $ actual_cpc : num 0.0678 0.3888 0.2947 0.0179...

## CHAID regression tree to table conversion in R

r,packages,regression,decision-tree
I used the CHAID package from this link. It gives me a chaid object which can be plotted. I want a decision table with each decision rule in a column instead of a decision tree, but I don't understand how to access nodes and paths in this chaid object. Kindly help me...

## Determining regression coefficients for data - MATLAB

matlab,matrix,regression,numerical-methods
I am doing a project involving scientific computing. The following are three variables and their values I got after some experiments. There is also an equation with three unknowns, a, b and c: x=(a+0.98)/y+(b+0.7)/z+c How do I get values of a,b,c using the above? Is this possible in MATLAB?...
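A minimal sketch of one way to approach this, assuming the measurements are available as equal-length arrays: the equation is linear in a, b and c once the known constants are moved to the left-hand side, so ordinary least squares applies. The example uses Python with NumPy rather than MATLAB, and the data are synthetic because the question's values are not shown here; `a_true`, `b_true` and `c_true` are hypothetical.

```python
import numpy as np

# Synthetic, illustrative data only; the question's measurements are not shown.
rng = np.random.default_rng(0)
y = rng.uniform(1.0, 5.0, size=20)
z = rng.uniform(1.0, 5.0, size=20)
a_true, b_true, c_true = 1.5, -0.4, 2.0  # hypothetical values to recover
x = (a_true + 0.98) / y + (b_true + 0.7) / z + c_true

# Move the known constants to the left: x - 0.98/y - 0.7/z = a/y + b/z + c
rhs = x - 0.98 / y - 0.7 / z
design = np.column_stack([1.0 / y, 1.0 / z, np.ones_like(y)])

# Ordinary least squares for the three unknowns.
(a, b, c), *_ = np.linalg.lstsq(design, rhs, rcond=None)
print(a, b, c)
```

The same design matrix and right-hand side could be built in MATLAB, where the backslash operator solves the least-squares system.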

## Getting statsmodels to use heteroskedasticity corrected standard errors in coefficient t-tests

python,regression,statsmodels
I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity corrected standard errors (via properties like HC0_se, etc.) However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way...

## Loop through various data subsets in lm() in R

r,loops,regression,subset

## Tidy approach to regression models, ideally with dplyr

r,regression,dplyr,lm
Reading the documentation for do() in dplyr, I've been impressed by the ability to create regression models for groups of data and was wondering whether it would be possible to replicate it using different independent variables rather than groups of data. So far I've tried require(dplyr) data(mtcars) models <- data.frame(var...

## Placing Limits on Optim

r,optimization,regression,rscript
I'm trying to use an algorithm to minimise the least squares of models. I'd like to be able to confine all the parameters to within sensible ranges; however, when I run this script it disregards my limits for whatever reason. More of a debugging issue than anything else. Any...

## Getting fitted lines with a scatterplot matrix in R

r,regression,linear
How do I get a scatterplot matrix which will also show the fitted lines in each plot? I know how to use the "abline" function with individual plots but don't know how to implement it in a scatterplot matrix.

## How to make a for loop to find interactions between several variables in R?

r,regression,linear
I have a data set with 17 variables; the data is available at this link: http://www.uwyo.edu/crawford/stat3050/final%20project/maxwellchandler.txt I want to find significant interactions between the variables. For example, fitcivilian<-lm(Civilian~Stock+Terrorism+log(Firepower)+Payload+Bombs*Temperature+FirstAid+Spies+Personnel+IG88, data=data), where Bombs*Temperature is significant. What I want to do is test EVERY variable against EVERY OTHER variable, like doing Bombs*Temperature Bombs*Napalm...

## R: Isotonic regression Minimisation

r,regression,mathematical-optimization,linear-programming,minimization
I want to minimize the following expression: $F = \sum_{u=1}^{20} \sum_{w=1}^{10} q_{uw}(r_{uw} - y_{uw})$ subject to the following constraints: $y_{uw} \ge y_{u,w+1}$, $y_{uw} \ge y_{u-1,w}$, $y_{20,0} \ge 100$, $y_{0,10} \ge 0$. I have a 20*10 $r_{uw}$ matrix and a 20*10 $q_{uw}$ matrix; I now need to generate a $y_{uw}$ matrix which adheres to the constraints. I am...

## SciKit-learn for data driven regression of oscillating data

python,time-series,scikit-learn,regression,prediction
Long-time lurker, first-time poster. I have data that roughly follows a y=sin(time) distribution, but also depends on variables other than time. In terms of correlations, since the target y-variable oscillates there is almost zero statistical correlation with time, but y obviously depends very strongly on time. The goal...

## R: HAC by NeweyWest using dynlm

r,time-series,regression

## Partition dataset using CART regression by leaf node

r,regression
I'm currently trying to modify an existing Stata model in R, and I'm running into problems with a specific step in the process. I need to use a CART regression to divide my dataset up into individual clusters based on their leaf node, such that each leaf node becomes a...

## P values from fastbw regression function of rms package

r,regression,linear-regression,rms
I am trying the fastbw function of the rms package for backward regression as follows (using the mtcars dataset): > mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars) > mod Linear Regression Model ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data = mtcars) Model Likelihood Discrimination Ratio Test Indexes...

## Observation deleted due to missingness in R

r,regression
I am busy with a regression model in R and I have about 16 000 observations. One of these observations causes me to get the following error message: (1 observation deleted due to missingness). Is there a way in R to identify this one observation?...

## How to plot a scatter plot with error bars indicating standard deviation

matlab,statistics,regression
I have a set of data, Y vs. X (~20k data points), which forms a scatter when plotted. I want to plot error bars for Y over ranges of X (e.g. if the X axis is of length 100, then I want the error bars to represent the standard deviation of Y...

## Plotting an independent variable under a parameter of another variable in R

r,plot,regression
I have a function predictshrine<-0*rain-399.8993+5*crops+50.4296*log(citysize)+ 4.5071*wonders*chief+.02301*children*deaths+1.806*children+ .10799*deaths-2.0755*wonders-.0878*children^2+.001062*children^3- .000004288*children^4-.009*deaths^2+.0000530238*deaths^3+ 7.974*sqrt(children)+.026937*wonders^2-.0001305*wonders^3 I also have a sequence children<-seq(0,100,length=500) And a for loop for(deaths in c(0,5,10,50,100,200)) Now what I want to do is be able to plot predictshrine vs children when deaths equals certain amounts and...

## An error while looping a linear regression

r,loops,data.frame,regression
I would like to run a loop over each category of one of the variables and produce a prediction for each regression, so that the sum of the prediction variable will be deducted from the target variable. Here is my toy data and code: df <- read.table(text...

## Regression loop in R for data frames

r,loops,statistics,data.frame,regression
rm(list=ls()) myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") for(i in names(myData)) { colNum <- grep(i,colnames(myData)) ##asigns a value to each column if(is.numeric(myData[3,colNum])) ##if row 3 is numeric, the entire column is { ##print(nxeData[,i]) fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'...

## How to find the algorithm type (regression, classification) in caret in R for all algorithms at once?

r,machine-learning,classification,regression,caret
How do I find the model type for all models at once? I know how to access this info if I know the algo name, e.g.: library('caret') tail(names(getModelInfo())) [1] "widekernelpls" "WM" "wsrf" "xgbLinear" "xgbTree" [6] "xyf" getModelInfo()$xyf$type [1] "Classification" "Regression" How do I see the $type for all the algos...

## Stata — predict after regression by group_id

regression,stata,predict
I have to run regressions by group_id and then generate the predictions. It doesn't seem like predict allows the "by" option. Is there a way I can predict after running regressions by group_id? The data are stacked by group_id. The regression command I am thinking of using is as follows:...

## R— repeating linear regression in a large dataset

r,regression
I'm an R newbie working with an annual time series dataset (named "timeseries"). The set has one column for year and another 600 columns with the yearly values for different locations ("L1," "L2", etc), e.g. similar to the following: Year L1 L2 L3 L4 1963 0.63 0.23 1.33 1.41 1964...