Library: gplm
See also: discrete sort cumsum paf

Quantlet: replicdata
Description: replicdata reduces a matrix x to its distinct rows and returns the number of replications of each row in the original dataset. If an optional second matrix y is given, the rows of y are summed up accordingly. replicdata does essentially the same as discrete but provides an additional index vector that links the reduced data to the original observations. It takes a little longer due to an additional sort.

Usage: {xr,yr,ro} = replicdata(x{,y})
Input:
x n x p matrix, the data matrix to reduce, in regression usually the design matrix.
y optional, n x q matrix, in regression usually the observations of the dependent variable.
Output:
xr m x p matrix, reduced data matrix (sorted).
yr m x 1 vector or m x (q+1) matrix; the first column contains the number of replications. If y was given, the other q columns of yr contain the sums of the y-rows that share the same x-row.
ro n x 1 vector, index vector that assigns each original observation its row in xr, yr, so that estimators obtained from xr, yr can be rearranged according to the original observations.
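
The input/output contract above can be illustrated with a small Python sketch. This is not the XploRe implementation; the function name replicdata_sketch and the toy data are hypothetical, and the index vector is 0-based here, whereas XploRe indices start at 1.

```python
def replicdata_sketch(x, y=None):
    """Illustrative stand-in for the reduction described above (not XploRe code).

    x: list of n rows (each a list of p values)
    y: optional list of n rows (each a list of q values)
    Returns (xr, yr, ro): distinct sorted rows, [count, sums of y...] per
    distinct row, and a 0-based index from each original row into xr.
    """
    keys = sorted(set(tuple(row) for row in x))   # distinct rows, sorted
    pos = {k: j for j, k in enumerate(keys)}      # row -> position in xr
    q = len(y[0]) if y is not None else 0
    yr = [[0] + [0.0] * q for _ in keys]          # first column: replications
    ro = []
    for i, row in enumerate(x):
        j = pos[tuple(row)]
        ro.append(j)
        yr[j][0] += 1
        for c in range(q):
            yr[j][c + 1] += y[i][c]               # accumulate y-rows per x-row
    xr = [list(k) for k in keys]
    return xr, yr, ro


x = [[1], [2], [1], [3], [2], [1]]
y = [[10.0], [20.0], [30.0], [40.0], [50.0], [60.0]]
xr, yr, ro = replicdata_sketch(x, y)
print(xr)  # [[1], [2], [3]]
print(yr)  # [[3, 100.0], [2, 70.0], [1, 40.0]]
print(ro)  # [0, 1, 0, 2, 1, 0]
```

Indexing xr with ro reproduces the original rows of x, which is the property that allows estimates computed on the reduced data to be expanded back to all n observations.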

Example:
library("gplm")
n=100
x=2*sort(ceil(10*uniform(n))./10)-1
y=cos(pi*x) + normal(n)
; --------------------------------------
;  data reduction
; --------------------------------------
{xr,yr,o}=replicdata(x,y)
r =yr[,1]
yr=yr[,2]
rows(r)
; --------------------------------------
;  kernel regression of yr on xr
; --------------------------------------
sr =sker(xr,1,"qua",r~yr)
mhr=sr[,2]./sr[,1]
; --------------------------------------
;  get prediction for all obs. from mhr
; --------------------------------------
mh=mhr[o]
; --------------------------------------
;  compare
; --------------------------------------
s  = sker(x,1,"qua",matrix(n)~y)
mc = s[,2]./s[,1]
sum(abs(mc-mh))

Result:
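Matrices x, y with 100 rows are reduced to a matrix xr
(containing the distinct rows of x, here at most 10) and yr
(the number of replications and the sums of y over equal
rows of x).
r gives the number of replications.
The regression of y on x coincides with the weighted
regression of yr on xr, so the final sum of absolute
differences should be zero up to rounding error.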
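
The reason the two regressions coincide is that every kernel sum over the original observations can be grouped by distinct x value: the denominator picks up the replication counts r as weights, and the numerator picks up the summed responses yr. The following Python sketch checks this numerically. It is not XploRe code; it assumes a quartic kernel with bandwidth 1 (as the arguments 1 and "qua" in the example suggest) and evaluation at the data points, and the helper names quartic and kernel_ratio are hypothetical.

```python
import math
import random


def quartic(u):
    """Quartic (biweight) kernel, zero outside [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def kernel_ratio(points, x, w, wy, h):
    """At each point t: sum_i K((t-x_i)/h)*wy_i divided by sum_i K((t-x_i)/h)*w_i."""
    out = []
    for t in points:
        k = [quartic((t - xi) / h) for xi in x]
        num = sum(ki * v for ki, v in zip(k, wy))
        den = sum(ki * v for ki, v in zip(k, w))
        out.append(num / den)
    return out


random.seed(0)
n, h = 100, 1.0
# Discrete design with at most 10 distinct values, as in the example above.
x = sorted(2 * random.randint(1, 10) / 10 - 1 for _ in range(n))
y = [math.cos(math.pi * xi) + random.gauss(0.0, 1.0) for xi in x]

# Data reduction: distinct values, replication counts, summed responses, index.
xr = sorted(set(x))
pos = {v: j for j, v in enumerate(xr)}
r = [0] * len(xr)
yr = [0.0] * len(xr)
o = []
for xi, yi in zip(x, y):
    j = pos[xi]
    o.append(j)
    r[j] += 1
    yr[j] += yi

mhr = kernel_ratio(xr, xr, r, yr, h)      # weighted regression on reduced data
mh = [mhr[j] for j in o]                  # expand back to all observations
mc = kernel_ratio(x, x, [1] * n, y, h)    # regression on the full data
diff = sum(abs(a - b) for a, b in zip(mc, mh))
print(diff)  # close to zero, up to floating-point rounding
```

The reduced computation loops over at most 10 distinct design points instead of 100 observations, which is where the saving comes from when the design has many ties.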



Author: M. Mueller, 20010228
(C) MD*TECH Method and Data Technologies, 05.02.2006