Library: | gplm |
See also: | discrete sort cumsum paf |
Quantlet: | replicdata | |
Description: | replicdata reduces a matrix x to its distinct rows and gives the number of replications of each row in the original dataset. An optional second matrix y can be given; the rows of y are summed up accordingly. replicdata does in fact the same as discrete but additionally provides an index vector that identifies each original observation with its row in the reduced data. It takes a little longer due to an additional sort. |
Usage: | {xr,yr,ro} = replicdata(x{,y}) | |
Input: | ||
x | n x p matrix, the data matrix to reduce, in regression usually the design matrix. | |
y | optional, n x q matrix, in regression usually the observations of the dependent variable. | |
Output: | ||
xr | m x p matrix, reduced data matrix (sorted). | |
yr | m x 1 vector or m x (q+1) matrix; the first column contains the number of replications. If y was given, the other q columns of yr contain the sums of the y-rows that share the same x-row. | |
ro | n x 1 vector, index vector that maps estimates computed from xr, yr back to the original observations. |
library("gplm")
n=100
x=2*sort(ceil(10*uniform(n))./10)-1
y=cos(pi*x) + normal(n)
; --------------------------------------
; data reduction
; --------------------------------------
{xr,yr,o}=replicdata(x,y)
r =yr[,1]
yr=yr[,2]
rows(r)
; --------------------------------------
; kernel regression of yr on xr
; --------------------------------------
sr =sker(xr,1,"qua",r~yr)
mhr=sr[,2]./sr[,1]
; --------------------------------------
; get prediction for all obs. from mhr
; --------------------------------------
mh=mhr[o]
; --------------------------------------
; compare
; --------------------------------------
s = sker(x,1,"qua",matrix(n)~y)
mc = s[,2]./s[,1]
sum(abs(mc-mh))
The matrices x and y with 100 rows are reduced to xr (the distinct rows of x) and yr (the sums of y over equal rows of x); r gives the number of replications. The regression of y on x coincides with the weighted regression of yr on xr, so the final sum of absolute differences should be zero up to rounding error.
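
To make the three outputs concrete, the reduction described above can be sketched outside XploRe. The following is a minimal Python illustration, not the quantlet's implementation: the function name `replicdata_sketch` is hypothetical, the lexicographic sort order of the distinct rows is an assumption, and the index vector is 0-based here, whereas XploRe indexes from 1.

```python
def replicdata_sketch(x, y=None):
    """Illustrative reduction of x to its distinct rows (not the XploRe code).

    x : list of rows (each row a list of numbers), n rows
    y : optional list of rows, n rows with q columns
    Returns (xr, yr, ro):
      xr : sorted distinct rows of x (m rows)
      yr : m rows; column 0 is the replication count, followed by the
           q column sums of y over the rows sharing the same x-row
      ro : for each original row, the 0-based index of its row in xr
    """
    keys = [tuple(row) for row in x]
    distinct = sorted(set(keys))          # assumed: lexicographic order
    position = {key: j for j, key in enumerate(distinct)}
    q = len(y[0]) if y is not None else 0

    yr = [[0] + [0.0] * q for _ in distinct]
    ro = []
    for i, key in enumerate(keys):
        j = position[key]
        ro.append(j)
        yr[j][0] += 1                     # count the replication
        for c in range(q):
            yr[j][c + 1] += y[i][c]       # accumulate the y sums

    xr = [list(key) for key in distinct]
    return xr, yr, ro


# Small example: six observations, three distinct x values.
x = [[0.5], [0.1], [0.5], [0.3], [0.1], [0.5]]
y = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
xr, yr, ro = replicdata_sketch(x, y)

# Group means computed on the reduced data, then mapped back to all
# six observations through ro (the analogue of mh=mhr[o] above).
group_mean = [row[1] / row[0] for row in yr]
mean_per_obs = [group_mean[j] for j in ro]
```

Any estimate computed once per distinct row can be expanded to the full sample this way, which is the purpose of the index vector ro.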