Find top deciles from dataframe by group

r,data.frame,rank,quantile,split-apply-combine
I am attempting to create new variables using a function and lapply rather than working right in the data with loops. I used to use Stata and would have solved this problem with a method similar to that discussed here. Since naming variables programmatically is so difficult or at least...

Quantize Integers into discrete buckets

d3.js,charts,quantile,quantization
I have a list of ~7500 items which all have a similar signature: { revenue: integer, title: string, sector: string } The revenue will range from 0 to ~1 billion. I'd like to construct a scale such that, given a particular company's revenue, it returns its position relative to the following...

Plot quantiles of distribution in ggplot2 with facets

r,ggplot2,quantile,density-plot
I'm currently plotting a number of different distributions of first differences from a number of regression models in ggplot. To facilitate interpretation of the differences, I want to mark the 2.5th and 97.5th percentiles of each distribution. Since I will be doing quite a few plots, and because the...

Matlab Calculating mean of distribution quantile in a for-loop

matlab,loops,for-loop,distribution,quantile
I am trying to calculate portfolio cVaR (conditional value at risk) levels from my simulated data for various portfolios. I am able to do that for one single portfolio using the following code: % Without a for-loop for series 1 test2 = test(:,1) VaR_Calib_EVT = 100 * quantile(test2, VarLevel_Calib); help1...

r get value only from quantile() function

r,quantile
I'm sorry for what may be a silly question. When I do: > quantile(df$column, .75) #get 3rd quartile I get something like 75% 1234.5 Is there a way to just get the value (1234.5) without the descriptive "75%" string? Thank you very much....

R function with functions as arguments, each with variable arguments

r,function,arguments,simulation,quantile
In answer to a question on Cross Validated, I wrote a simple function that used arbitrary quantile functions as its arguments etacor=function(rho=0,nsim=1e4,fx=qnorm,fy=qnorm){ #generate a bivariate correlated normal sample x1=rnorm(nsim);x2=rnorm(nsim) if (length(rho)==1){ y=pnorm(cbind(x1,rho*x1+sqrt((1-rho^2))*x2)) return(cor(fx(y[,1]),fy(y[,2]))) } coeur=rho rho2=sqrt(1-rho^2) for (t in 1:length(rho)){ y=pnorm(cbind(x1,rho[t]*x1+rho2[t]*x2)) coeur[t]=cor(fx(y[,1]),fy(y[,2]))} return(coeur) } However, both fx and fy may...

R qqplot argument “y” is missing error

r,plot,quantile
I am relatively new to R and I am struggling with an error message related to qqplot. Some sample data are at the bottom. I am trying to do a qqplot on some azimuth data, i.e. compass directions. I've looked around here and at the ?qqplot R documentation, but I...

How can I get a percentile value for each dataframe row considering a subset of the data?

r,data.frame,quantile
I have a dataframe obs with 145 rows and more than 1000 columns. For each row I would like to extract the value of the 95th percentile, but calculated only on the data greater than or equal to 1. I managed to calculate a value for each row, considering all data, as...

Quantile per group in ddply

r,classification,plyr,quantile
I tried to classify my grouped data into quartiles, therefore adding a column "diam_quart" to the dataframe z assigning each row one of the four classes 1, 2, 3, or 4: quart = ddply(z, .(Code), transform, diam_quart = ifelse(Diameter <= quantile(Diameter , 0.25), 1, ifelse(Diameter <= quantile(Diameter , 0.5), 2,...

Assigning Percentile Based Groups to Dataframe in R

r,quantile
I am having trouble figuring out how to take on this particular problem. Suppose I have the following data frame: set.seed(123) Factors <- sample(LETTERS[1:26],50,replace=TRUE) Values <- sample(c(5,10,15,20,25,30),50,replace=TRUE) df <- data.frame(Factors,Values) df Factors Values 1 H 5 2 U 15 3 K 25 4 W 5 5 Y 20 6 B...

Applying a function to each quantile of an R dataframe

r,data.frame,quantile
I have an R dataframe and I want to apply an estimation function for each of its quantiles. Here's an example with lm(): df <- data.frame(Y = sample(100), X1 = sample(100), X2 = sample(100)) estFun <- function(df){lm(Y ~ X1 + X2, data = df)} If I split that in two...

Quantile-Quantile plot using two vectors with ggplot

r,ggplot2,quantile
I am looking for a quantile-quantile plot using ggplot2. I have these data: model <- c(539.4573, 555.9882, 594.6838, 597.5712, 623.6659, 626.7169, 626.8539, 627.9992, 629.1139, 634.7405, 636.3438, 646.4692, 654.3024, 663.0181, 670.0985, 672.8391, 680.5557, 683.2600, 683.5159, 692.0328, 695.7185, 698.9505, 702.3676, 707.4271, 726.6507, 726.8524, 732.1197, 741.6183, 750.3617, 752.5978, 757.1609, 762.2874, 767.0678, 776.9476, 779.2352,...

create quantile category variables using defined cut-points in Stata

category,stata,quantile
I am trying to create indicator variables using different quantile levels. I am creating a variable that contains categories corresponding to quantiles. For one variable, the code I am using is xtile PH_scale = PH, nq(4) tab PH_scale, gen(PH_scale_) Also, I know that if I want to use my own...