I have a data frame "DF" with this glimpse(): Observations: 1244160 Variables: $ Test (fctr) 72001.txt, 72002.txt, 72003.txt, 72004.txt, 72005.txt,... $ x (int) 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1... $ y (int) 1, 1, 1, 1, 1,...

Reading the documentation for do() in dplyr, I've been impressed by the ability to create regression models for groups of data and was wondering whether it would be possible to replicate it using different independent variables rather than groups of data. So far I've tried require(dplyr) data(mtcars) models <- data.frame(var...

I've been using nls() to fit a custom model to my data, but I don't like how the model is fitting and I would like to use an approach that minimizes residuals in both x and y axes. I've done a lot of searching, and have found solutions for fitting...

I have a dataset with 158 rows and 10 columns. I am trying to build a multiple linear regression model to predict future values. I used GridSearchCV for tuning parameters. Here is my GridSearchCV and regression function: def GridSearch(data): X_train, X_test, y_train, y_test = cross_validation.train_test_split(data, ground_truth_data, test_size=0.3, random_state =...

I have a set of 3D data (300 points) that forms a surface which looks like two cones or ellipsoids connected to each other. I want a way to find the equation of a best-fit ellipsoid or cone for this dataset. The regression method is not important, the easier...

With the data source expressed as X,Y the trendline appears correctly. However, things don't work with categories. Is there a good way to add a timeline without reformatting the data? JSFiddle: http://jsfiddle.net/9r9ba64r/5/ $(function () { // series data with Y and categories (this doesnt work!) //var sourceData = [{y:100} , {y:200}, {y:300}, {y:400}];...

How do you use a variable in a regression formula? For example, using the 'Animals' dataset (in MASS), the following works fine: data(Animals) model <- lm(body ~ brain, data = Animals) But what I want to do is: data(Animals) x <- "body" y <- "brain" model <- lm(x ~ y,...

I used the lm function to fit (Pt=aPt-1 + bXt + Dummy variable for each quarter) to the sample data. How do I create an n.ahead=12 forecast? I couldn't figure out how to set up the dummy and Pt-1 for iteration. Any help is appreciated!

rm(list=ls()) myData <-read.csv(file="C:/Users/Documents/myfile.csv",header=TRUE, sep=",") for(i in names(myData)) { colNum <- grep(i,colnames(myData)) ##assigns a value to each column if(is.numeric(myData[3,colNum])) ##if row 3 is numeric, the entire column is { ##print(nxeData[,i]) fit <- lm(myData[,i] ~ etch_source_Avg, data=myData) #does a regression for each column in my csv file against my independent variable 'etch'...

How do I find the model type for all models at once? I know how to access this info if I know the algo name, e.g.: library('caret') tail(names(getModelInfo())) [1] "widekernelpls" "WM" "wsrf" "xgbLinear" "xgbTree" [6] "xyf" getModelInfo()$xyf$type [1] "Classification" "Regression" How do I see the $type for all the algos...

I am trying the fastbw function of the rms package for backward regression as follows (using the mtcars dataset): > mod = ols(mpg~am+vs+cyl+drat+wt+gear, mtcars) > mod Linear Regression Model ols(formula = mpg ~ am + vs + cyl + drat + wt + gear, data = mtcars) Model Likelihood Discrimination Ratio Test Indexes...

We recently migrated our application to JDK 8 from JDK 7. After the change, we ran into a problem with the following snippet of code. String output = new String(byteArray, "UTF-8"); The byte array may contain invalid UTF-8 byte sequences. The same byte array upon UTF-8 decoding, results in two...

I am trying to calculate the regression of the x and y variables, trace_no and twtt, respectively. The variables are 151 x 1 arrays. The code is outputting a syntax error: File "./seabed_dip_correction.py", line 32 slope, intercept, r_value, p_value, std_err, Syy/Sxx = stats.linregress(trace_no,twtt) SyntaxError: can't assign to operator I have...
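The error message in this excerpt points at the assignment target `Syy/Sxx`: Python only allows names (or other assignable targets) on the left-hand side of `=`, not arithmetic expressions. A minimal pure-Python sketch of the pattern follows. The helper `simple_linregress` is hypothetical and only stands in for a regression routine (scipy's `stats.linregress` itself returns slope, intercept, r-value, p-value and standard error), and the sample data are made up.

```python
def simple_linregress(x, y):
    """Hypothetical stand-in for a regression routine.

    Returns (slope, intercept, sxx, syy) for an ordinary least-squares line.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxx, syy


# Made-up illustrative data (not from the question).
trace_no = [1.0, 2.0, 3.0, 4.0, 5.0]
twtt = [2.1, 3.9, 6.2, 7.8, 10.1]

# Every target on the left-hand side is a plain name.
slope, intercept, sxx, syy = simple_linregress(trace_no, twtt)

# Derived quantities are computed afterwards, in their own statement.
ratio = syy / sxx
```

The number of names on the left must match the number of values returned, so any extra quantity such as a ratio of sums of squares has to be computed in a separate statement.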

I've been digging into the API of statsmodels.regression.linear_model.RegressionResults and have found how to retrieve different flavors of heteroskedasticity corrected standard errors (via properties like HC0_se, etc.) However, I can't quite figure out how to get the t-tests on the coefficients to use these corrected standard errors. Is there a way...

I am running a logit regression in R with a large number of input variables. newlogit <- glm(install. ~ SIZES + GROSSCONSUMPTION.... + NETTCONSUMPTION..... + NETTGENERATION....... + GROSSGENERATION.... + Variable. + Fixed + Cost.of.gross.cons + Cost.of.net.cons + Cons.savings + generation.gains + Total.savings + Cost.of.system + Payback + Self.consumption + Total.consumption.as.solar...

I'm trying to do change point detection with 'monitor' from the strucchange package, but I have trouble getting a useful output. My input is a time-stamped dataframe, and I would like the breaks to be returned as dates, but they are returned as observation numbers: cDF1 <- myDF[1:80,] >...

I am doing some bone segmentation, and the result of this segmentation is a set of points placed in a circular pattern around the bone. However, as it is taken using a qCT scan, there is quite a lot of noise (e.g. from flesh) in the points that I have. So the overall problem...

I'm trying to use an algorithm to minimise the least squares of models. I'd like to be able to confine all the parameters to within sensible ranges; however, when I run this script, for whatever reason it is disregarding my limits. More of a debugging issue than anything else. Any...

I am busy with a regression model in R and I have about 16 000 observations. One of these observations causes me to get the following message: (1 observation deleted due to missingness). Is there a way in R to identify this one observation?...

I'm currently trying to modify an existing Stata model in R, and I'm running into problems with a specific step in the process. I need to use a CART regression to divide my dataset up into individual clusters based on their leaf node, such that each leaf node becomes a...

I used the CHAID package from this link. It gives me a chaid object which can be plotted. I want a decision table with each decision rule in a column instead of a decision tree, but I don't understand how to access nodes and paths in this chaid object. Kindly help me.....

Long time lurker first time poster. I have data that roughly follows a y=sin(time) distribution, but also depends on other variables than time. In terms of correlations, since the target y-variable oscillates there is almost zero statistical correlation with time, but y obviously depends very strongly on time. The goal...

How do I get a scatterplot matrix which will also show the fitted lines in each plot? I know how to use the "abline" function with individual plots but don't know how to implement it in a scatterplot matrix.

I would like to run a loop over each category of one of the variables and produce a prediction for each regression, so that the sum of the prediction variable will be deduced from the target variable. Here is my toy data and code: df <- read.table(text...

I have a data set with 17 variables; the data is available at this link: http://www.uwyo.edu/crawford/stat3050/final%20project/maxwellchandler.txt I want to find significant interactions between the variables. For example, fitcivilian<-lm(Civilian~Stock+Terrorism+log(Firepower)+Payload+Bombs*Temperature+FirstAid+Spies+Personnel+IG88, data=data) where Bombs*Temperature is significant. What I want to do is test EVERY variable against EVERY OTHER variable, like doing Bombs*Temperature Bombs*Napalm...
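Testing every variable against every other variable amounts to enumerating all unordered pairs of predictors. The sketch below shows that enumeration and assembles an R-style formula string from it. The list `predictors` is only an assumed subset of the names visible in the excerpt, and nothing is fitted here.

```python
from itertools import combinations

# Assumed subset of predictor names taken from the formula in the question.
predictors = ["Stock", "Terrorism", "Payload", "Bombs", "Temperature"]

# Each unordered pair appears exactly once: n * (n - 1) / 2 terms in total.
interaction_terms = [f"{a}:{b}" for a, b in combinations(predictors, 2)]

# R-style formula text with main effects plus all pairwise interactions.
formula = "Civilian ~ " + " + ".join(predictors + interaction_terms)

print(len(interaction_terms))
print(formula)
```

In R's formula syntax the same set of terms can also be requested compactly with the `(a + b + c)^2` notation, which expands to all main effects and two-way interactions.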

Hey, so I am developing a multiple regression model, using the forward subset selection method to reduce the number of parameters and "Mallows' Cp" as the selection criterion. However, this is an engineering problem and it does not make sense to have an intercept, i.e. when all the...

I would like to loop over various regressions referencing different data subsets, however I'm unable to appropriately call different subsets. For example: dat <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10) ) x.list <- list(dat$x1,dat$x2,dat$x3) dat1 <- dat[-9,] fit <- list() for(i in 1:length(x.list)){ fit[[i]]...

I'm plotting some interaction effects that stem from a regression in Stata. I'm using Excel for convenience. The data are curvilinear and I'm adding a polynomial trendline to maximize the fit. The problem I have is that the trendline function seems to assume that the x values are 1, 2,...

Using "stackloss" data in R, I created a regression model as seen below: stackloss.lm = lm(stack.loss ~ Air.Flow + Water.Temp + Acid.Conc.,data=stackloss) stackloss.lm newdata = data.frame(Air.Flow=stackloss$Air.Flow, Water.Temp= stackloss$Water.Temp, Acid.Conc.=stackloss$Acid.Conc.) Suppose I get a new data set and need to predict its "stack.loss" based on the previous model as seen below:...

I have a problem with graphing different plots on the same graph within a 'for' loop. I hope someone can point me in the right direction. I have a 2-D array, with discrete chunks of data in and amongst zeros. My data is the following: A= 0 0...

I have a function predictshrine<-0*rain-399.8993+5*crops+50.4296*log(citysize)+ 4.5071*wonders*chief+.02301*children*deaths+1.806*children+ .10799*deaths-2.0755*wonders-.0878*children^2+.001062*children^3- .000004288*children^4-.009*deaths^2+.0000530238*deaths^3+ 7.974*sqrt(children)+.026937*wonders^2-.0001305*wonders^3 I also have a sequence children<-seq(0,100,length=500) and a for loop for(deaths in c(0,5,10,50,100,200)) Now what I want to do is be able to plot predictshrine vs children when deaths equals certain amounts and...

I would like to compute 5 regression coefficients. I searched through the Internet but did not find anything for this. My data: y=c(2,13,0.4,5,8,10,13) x=c(2,13,0.004,5,8,1,13) z=c(2,3,0.004,15,8,10,1) Normal equation: y=a1x+a2z+a3, where x and z are the independent variables, y is the dependent variable, and a1, a2, and a3 are the parameters of the model. normal fit for...
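For the three-parameter model y = a1x + a2z + a3 stated in the excerpt, the least-squares estimates satisfy the normal equations (XᵀX)a = Xᵀy, where each row of X is [x, z, 1]. The following is a minimal pure-Python sketch using the data quoted in the question. The helper `solve_linear_system` is a hypothetical name introduced here, and the sketch covers only this linear model, not whatever extended fit the truncated question goes on to describe.

```python
def solve_linear_system(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return coeffs


# Data as quoted in the question.
y = [2, 13, 0.4, 5, 8, 10, 13]
x = [2, 13, 0.004, 5, 8, 1, 13]
z = [2, 3, 0.004, 15, 8, 10, 1]

# Design matrix rows [x_i, z_i, 1] for y = a1*x + a2*z + a3.
design = [[xi, zi, 1.0] for xi, zi in zip(x, z)]

# Normal equations: (X^T X) a = X^T y.
xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(3)]

a1, a2, a3 = solve_linear_system(xtx, xty)
residuals = [yi - (a1 * xi + a2 * zi + a3) for xi, zi, yi in zip(x, z, y)]
```

A quick sanity check on such a fit is that the residuals sum to zero and are orthogonal to each predictor column, which is exactly what the normal equations state.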

I am using the following code with glmnet: > library(glmnet) > fit = glmnet(as.matrix(mtcars[-1]), mtcars[,1]) > plot(fit, xvar='lambda') However, I want to print out the coefficients at the best lambda, like it is done in ridge regression. I see the following structure of fit: > str(fit) List of 12 $ a0 : Named...

I am trying to draw a nomogram from a logistic regression in R by using the rms package, but currently I have a problem: indeed, I can get the nomogram, but the "linear predictor" axis ranges from -2.5 to +3, and I'd like to know whether I can make it...

This is what I would like to do: library("lmtest") library("dynlm") test$Date = as.Date(test$Date, format = "%d.%m.%Y") zooX = zoo(test[, -1], order.by = test$Date) f <- d(Euribor3) ~ d(Ois3) + d(CDS) + d(Vstoxx) + d(log(omo)) + d(L(Euribor3)) m1 <- dynlm(f, data = zooX, start = as.Date("2005-01-05"),end = as.Date("2005-01-24")) m2 <- dynlm(f,...

I'm an R newbie working with an annual time series dataset (named "timeseries"). The set has one column for year and another 600 columns with the yearly values for different locations ("L1", "L2", etc.), e.g. similar to the following: Year L1 L2 L3 L4 1963 0.63 0.23 1.33 1.41 1964...

I have to run regressions by group_id and then generate the predictions. It doesn't seem like predict allows the "by" option. Is there a way I can predict after running regressions by group_id? The data are stacked by group_id. The regression command I am thinking of using is as follows:...

I want to minimize the following equation: F=SUM{u 1:20}sum{w 1:10} Quw(ruw-yuw) with the following constraints: yuw >= yu,w+1 yuw >= yu-1,w y20,0 >= 100 y0,10 >= 0 I have a 20*10 ruw and 20*10 quw matrix, I now need to generate a yuw matrix which adheres to the constraints. I am...

I am doing a project involving scientific computing. The following are three variables and their values I got after some experiments. There is also an equation with three unknowns, a, b and c: x=(a+0.98)/y+(b+0.7)/z+c How do I get values of a,b,c using the above? Is this possible in MATLAB?...

I have a set of data, Y vs. X (~20k data points), which when plotted form a scatter. I want to plot error bars for Y over ranges of X (e.g. the X axis is of length 100, then I want the error bars to represent the standard deviation of Y...
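One way to read this request is to split the X range into equal-width bins and compute the mean and standard deviation of Y within each bin. The sketch below does that with the standard library only. The function name `binned_error_bars`, the bin count, and the sample data are all assumptions made for illustration, and x values are assumed to lie within [x_min, x_max].

```python
from statistics import mean, stdev


def binned_error_bars(xs, ys, x_min, x_max, n_bins):
    """Group points into equal-width x bins.

    Returns a list of (bin center, mean of y, sample std of y) tuples,
    skipping bins with fewer than two points.
    """
    width = (x_max - x_min) / n_bins
    bins = [[] for _ in range(n_bins)]
    for xv, yv in zip(xs, ys):
        # Clamp so that x == x_max falls in the last bin.
        idx = min(int((xv - x_min) / width), n_bins - 1)
        bins[idx].append(yv)
    summary = []
    for i, values in enumerate(bins):
        if len(values) < 2:
            continue  # a sample standard deviation needs at least two points
        center = x_min + (i + 0.5) * width
        summary.append((center, mean(values), stdev(values)))
    return summary


# Made-up illustrative data (not from the question).
xs = [1, 2, 3, 4, 11, 12, 13, 14]
ys = [1.0, 3.0, 1.0, 3.0, 10.0, 10.0, 12.0, 12.0]

summary = binned_error_bars(xs, ys, 0, 20, 2)
```

Each tuple can then be passed to a plotting routine that accepts symmetric y errors, for example the `yerr` argument of matplotlib's `errorbar`.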

I have some personal dataset. So I split it into variable to predict and predictors. Following is the syntax: library(Cubist) str(A) 'data.frame': 6038 obs. of 3 variables: $ ads_return_count : num 7 10 10 4 10 10 10 10 10 9 ... $ actual_cpc : num 0.0678 0.3888 0.2947 0.0179...

Let's first look at lm. I have a continuous explanatory variable $X$ and a factor $F$ modelling seasonal aspects (in the example, 8 levels). Let $\beta$ denote the slope for $X$; then I want to model interactions of the slope with the factor. It is some kind of physical model, thus...