I'm using Weka to do some text mining and I'm a little confused, so I'm here to ask: given a set of comments that are already classified in some way (as notes, status of work, non-conformity, warning), how can I predict whether a new comment belongs to a...

Long-time lurker, first-time poster. I have data that roughly follows a y=sin(time) distribution, but it also depends on variables other than time. Because the target y oscillates, its statistical correlation with time is almost zero, even though y obviously depends very strongly on time. The goal...

I work on calibration of probabilities. I'm using a probability mapping approach called generalized additive models. The algorithm I wrote is:
    probMapping = function(x, y, datax, datay) {
      if(length(x) < length(y)) stop("train smaller than test")
      if(length(datax) < length(datay)) stop("train smaller than test")
      datax$prob = x  # trainset: data and raw probabilities
      datay$prob...

So I have two models and I want to calculate these statistics. Is there any package to calculate them in Stata? PRESS statistic (wiki) And, if I am not mistaken, $$ R^2_{predicted} = 1 - \frac{PRESS}{TSS} $$ ...
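For reference, a short sketch of the standard definitions behind these two statistics. The symbols $\hat{y}_{(i)}$, $e_i$, $h_{ii}$ and $TSS$ are notation introduced here, not taken from the question, and the leave-one-out shortcut on the right assumes an ordinary least squares fit:

$$ PRESS = \sum_{i=1}^{n} \left( y_i - \hat{y}_{(i)} \right)^2 = \sum_{i=1}^{n} \left( \frac{e_i}{1 - h_{ii}} \right)^2 $$

$$ TSS = \sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2, \qquad R^2_{predicted} = 1 - \frac{PRESS}{TSS} $$

Here $\hat{y}_{(i)}$ is the prediction for observation $i$ from the model refitted without that observation, $e_i$ is the ordinary residual, and $h_{ii}$ is the $i$-th diagonal element of the hat matrix. So both statistics can be computed from a single fit once residuals and leverages are available, without refitting the model $n$ times.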

I've written a GA to model a handful of stocks (4) over a period of time (5 years). It's impressive how quickly the GA can find an optimal solution for the training data, but I am also aware that this is mainly due to its tendency to over-fit in the...

I have a CSV file which contains the user ID, the books each user has read, and the rating for each book. I want to use Lenskit to predict a book rating for a user. For example, user A has read 3 books, A, B, C, and I want to predict the rating for...

Quick question on prediction. The value I'm trying to predict is either 0 or 1 (it is set as numeric, not as a factor), so when I run my random forest: fit <- randomForest(PredictValue ~ <variables>, data=trainData, ntree=50) and predict: pred <- predict(fit, testData), all my predictions are between 0 and 1...

I hope I have come to the right forum. I'm an ecologist making species distribution models using the maxent (version 3.3.3, http://www.cs.princeton.edu/~schapire/maxent/) function in R, through the dismo package. I have used the argument "replicates = 5" which tells maxent to do a 5-fold cross-validation. When running maxent from the...

I have been working on machine learning and prediction for about a month. I have tried IBM Watson with Bluemix, Amazon Machine Learning and PredictionIO. What I want to do is predict a text field based on other fields. My CSV file has four text fields named Question,Summary,Description,Answer and about...

New to R. Looking to limit the range of values that can be predicted.
    df.Train <- data.frame(S=c(1,2,2,2,1),L=c(1,2,3,3,1),M=c(400,450,400,700,795),V=c(423,400,555,600,800),G=c(4,3.2,2,2.7,3.4), stringsAsFactors=FALSE)
    m.Train <- lm(G~S+L+M+V,data=df.Train)
    df.Test <- data.frame(S=c(1,2,1,2,1),L=c(1,2,3,1,1),M=c(400,450,500,800,795),V=c(423,475,555,600,555), stringsAsFactors=FALSE)
    round(predict(m.Train, df.Test, type="response"),digits=1)
    #seq(0,4,.1) #Predicted values should fall in this range
I've experimented with the...
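One simple way to keep predictions inside a range is to clamp them after the fact. The sketch below reuses the data from the question and assumes the target range is 0 to 4, as the `seq(0,4,.1)` comment suggests. The names `raw` and `clamped` are introduced here for illustration. This only post-processes the output of `lm`; it does not constrain the model itself.

```r
# Training and test data copied from the question.
df.Train <- data.frame(S = c(1, 2, 2, 2, 1), L = c(1, 2, 3, 3, 1),
                       M = c(400, 450, 400, 700, 795),
                       V = c(423, 400, 555, 600, 800),
                       G = c(4, 3.2, 2, 2.7, 3.4))
df.Test <- data.frame(S = c(1, 2, 1, 2, 1), L = c(1, 2, 3, 1, 1),
                      M = c(400, 450, 500, 800, 795),
                      V = c(423, 475, 555, 600, 555))

m.Train <- lm(G ~ S + L + M + V, data = df.Train)

# Unconstrained predictions from the linear model.
raw <- predict(m.Train, df.Test)

# Clamp to the assumed range [0, 4]: pmax raises values below 0,
# pmin lowers values above 4.
clamped <- pmin(pmax(raw, 0), 4)

# Round to one decimal so values land on the 0.1 grid.
print(round(clamped, 1))
```

Clamping is the bluntest option: any prediction outside the range collapses onto the boundary. If many predictions end up clamped, that may be a sign that a model with a bounded response would fit the problem better.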

With my data (2 variables, Xt and Yt), I fitted a linear model in R Commander, which is named LinearModel.1. Then I wanted to predict the values that Yt would take for different values of Xt, along with their 95% confidence limits. After the linear model was...
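A minimal sketch of how predictions with 95% limits can be obtained in plain R using `predict` with its `interval` argument. The data below is simulated and purely hypothetical, as are the names `Dataset` and `newXt`. Only `Xt`, `Yt` and `LinearModel.1` come from the question, and the `lm` call is an assumption about what R Commander fitted.

```r
# Hypothetical data standing in for the asker's Xt and Yt.
set.seed(1)
Dataset <- data.frame(Xt = 1:20)
Dataset$Yt <- 2 + 0.5 * Dataset$Xt + rnorm(20, sd = 1)

# Assumed equivalent of the model fitted through R Commander.
LinearModel.1 <- lm(Yt ~ Xt, data = Dataset)

# New Xt values to predict at; the column name must match the model term.
newXt <- data.frame(Xt = c(5, 10, 25))

# 95% confidence limits for the mean of Yt at each new Xt.
conf <- predict(LinearModel.1, newdata = newXt,
                interval = "confidence", level = 0.95)

# 95% prediction limits for a single new observation of Yt.
pred <- predict(LinearModel.1, newdata = newXt,
                interval = "prediction", level = 0.95)

print(conf)
print(pred)
```

Both calls return a matrix with columns `fit`, `lwr` and `upr`. The prediction interval is always wider than the confidence interval, because it also accounts for the scatter of individual observations around the fitted line.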