This section demonstrates the processing of real data. We use two data sets of Wisconsin farm data from 1987, taken from an original 1000 observations. Only middle-sized animal farms were selected, and outliers were removed. The first data set, animal.dat, contains 250 observations (rows) of family labor, hired labor, miscellaneous inputs, animal inputs and intermediate run assets. The response variable, livestock, is contained in the second data set, goods.dat. A detailed description of the data, their source, possible models of interest and some nonparametric analysis can be found in Sperlich (1998).
In this example we deal with the first four inputs, i.e. family labor, hired labor, miscellaneous inputs and animal inputs. We store them in the variable t and also read the response variable y:
  data=read("animal.dat")
  t1 = data[,1]
  t2 = data[,2]
  t3 = data[,3]
  t4 = data[,4]
  t=t1~t2~t3~t4
  y=read("goods.dat")

Now we can calculate approximate bandwidths, taking half the standard deviation of each input:
  h1=0.5*sqrt(cov(t1))
  h2=0.5*sqrt(cov(t2))
  h3=0.5*sqrt(cov(t3))
  h4=0.5*sqrt(cov(t4))
  h=h1|h2|h3|h4

Finally, we set up the parameters for estimation and run the partial integration procedure:
  g=h
  loc=0
  opt=gamopt("shf",1)
  m = intest(t,y,h,g,loc,opt)

To inspect the results we create the graphical output shown in Figure 7.1. It is produced by the following statements:
  const=mean(y)*0.25
  m1 = t[,1]~(m[,1]+const)
  m2 = t[,2]~(m[,2]+const)
  m3 = t[,3]~(m[,3]+const)
  m4 = t[,4]~(m[,4]+const)
  setmaskp(m1,4,4,4)
  setmaskp(m2,4,4,4)
  setmaskp(m3,4,4,4)
  setmaskp(m4,4,4,4)
  setmaskl(m1,(sort(m1~(1:rows(m1)))[,3])',4,1,1)
  setmaskl(m2,(sort(m2~(1:rows(m2)))[,3])',4,1,1)
  setmaskl(m3,(sort(m3~(1:rows(m3)))[,3])',4,1,1)
  setmaskl(m4,(sort(m4~(1:rows(m4)))[,3])',4,1,1)
  yy=y-mean(y)-sum(m,2)
  d1=t[,1]~(yy+m[,1])
  d2=t[,2]~(yy+m[,2])
  d3=t[,3]~(yy+m[,3])
  d4=t[,4]~(yy+m[,4])
  setmaskp(d1,1,11,4)
  setmaskp(d2,1,11,4)
  setmaskp(d3,1,11,4)
  setmaskp(d4,1,11,4)
  pic = createdisplay(2,2)
  show(pic,1,1,m1,d1)
  show(pic,1,2,m2,d2)
  show(pic,2,1,m3,d3)
  show(pic,2,2,m4,d4)
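To make the steps above more concrete, here is a minimal sketch of the idea in Python rather than XploRe. It is not the implementation of intest; it assumes a local constant (Nadaraya-Watson) pre-estimator with a quartic product kernel, centres each component by the mean of y, and uses a small synthetic data set as a stand-in for the farm data. The names quartic, nw_estimate, marginal_integration and partial_residuals are hypothetical helpers introduced only for this illustration.

```python
import math
import random
import statistics


def quartic(u):
    """Quartic (biweight) kernel with support [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def nw_estimate(point, t, y, bw):
    """Nadaraya-Watson estimate at `point` using a product kernel."""
    num = den = 0.0
    for row, yi in zip(t, y):
        w = 1.0
        for x0, xi, b in zip(point, row, bw):
            w *= quartic((x0 - xi) / b)
        num += w * yi
        den += w
    return num / den if den > 0.0 else math.nan


def marginal_integration(t, y, h, g):
    """Return m[i][j], the centred j-th additive component at observation i."""
    n, d = len(t), len(t[0])
    y_bar = statistics.fmean(y)
    m = [[0.0] * d for _ in range(n)]
    for j in range(d):
        # h[j] smooths the direction of interest, g[k] the nuisance directions.
        bw = [h[k] if k == j else g[k] for k in range(d)]
        for i in range(n):
            fits = []
            for s in range(n):
                point = list(t[s])
                point[j] = t[i][j]  # fix direction j, keep the rest of obs. s
                fit = nw_estimate(point, t, y, bw)
                if not math.isnan(fit):  # skip empty kernel windows
                    fits.append(fit)
            # Average the pre-estimator over the nuisance directions.
            m[i][j] = statistics.fmean(fits) - y_bar
    return m


def partial_residuals(y, m):
    """Residuals of the additive fit, added back to each component."""
    y_bar = statistics.fmean(y)
    resid = [yi - y_bar - sum(mi) for yi, mi in zip(y, m)]
    return [[r + mij for mij in mi] for r, mi in zip(resid, m)]


# Synthetic stand-in for the farm data (an assumption, not the real data):
# two inputs and an additive truth y = 2*t1 + t2^2 + noise.
random.seed(0)
t = [[random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)] for _ in range(50)]
y = [2.0 * a + b * b + random.gauss(0.0, 0.1) for a, b in t]

# Rule-of-thumb bandwidths: half the sample standard deviation of each input.
h = [0.5 * statistics.stdev(col) for col in zip(*t)]
g = h
m = marginal_integration(t, y, h, g)
d = partial_residuals(y, m)
```

Here m plays the role of the estimated additive components and d the role of the points d1 to d4 in the plots: each column of d is the overall residual yy plus the corresponding component, so the scatter around each fitted curve shows how well that component describes the data.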
To understand the data better, we can also estimate the model with the backfitting algorithm (quantlet backfit) and compare the results.
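  kern="qua"
  {mb,b,const} = backfit(t,y,h,loc,kern,opt)

For the graphical output we can use an approach similar to the one above, with a few differences: the estimated components mb are plotted without an added constant, and the constant returned by backfit replaces mean(y) in the residuals.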
  m1 = t[,1]~mb[,1]
  m2 = t[,2]~mb[,2]
  m3 = t[,3]~mb[,3]
  m4 = t[,4]~mb[,4]
  setmaskp(m1,4,4,4)
  setmaskp(m2,4,4,4)
  setmaskp(m3,4,4,4)
  setmaskp(m4,4,4,4)
  setmaskl(m1,(sort(m1~(1:rows(m1)))[,3])',4,1,1)
  setmaskl(m2,(sort(m2~(1:rows(m2)))[,3])',4,1,1)
  setmaskl(m3,(sort(m3~(1:rows(m3)))[,3])',4,1,1)
  setmaskl(m4,(sort(m4~(1:rows(m4)))[,3])',4,1,1)
  yy=y-const-sum(mb,2)
  d1=t[,1]~(yy+mb[,1])
  d2=t[,2]~(yy+mb[,2])
  d3=t[,3]~(yy+mb[,3])
  d4=t[,4]~(yy+mb[,4])
  setmaskp(d1,1,11,4)
  setmaskp(d2,1,11,4)
  setmaskp(d3,1,11,4)
  setmaskp(d4,1,11,4)
  pic2 = createdisplay(2,2)
  show(pic2,1,1,m1,d1)
  show(pic2,1,2,m2,d2)
  show(pic2,2,1,m3,d3)
  show(pic2,2,2,m4,d4)
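For readers who want to see what a backfitting estimate does internally, the following Python sketch shows one common form of the algorithm. It is only an illustration under stated assumptions, not the code behind the backfit quantlet: it uses a univariate Nadaraya-Watson smoother with a quartic kernel, a fixed number of iterations instead of a convergence check, and returns only the components and the constant (the second output b of backfit is not reproduced). The names smooth and backfit_sketch and the synthetic data are hypothetical.

```python
import random
import statistics


def quartic(u):
    """Quartic (biweight) kernel with support [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def smooth(x, r, bw):
    """Nadaraya-Watson smooth of r against x, evaluated at the observed x."""
    fitted = []
    for x0 in x:
        w = [quartic((x0 - xi) / bw) for xi in x]
        # sum(w) > 0 because each point gets positive weight at itself.
        fitted.append(sum(wi * ri for wi, ri in zip(w, r)) / sum(w))
    return fitted


def backfit_sketch(t, y, h, n_iter=20):
    """Return (mb, const), where mb[j][i] is component j at observation i."""
    n, d = len(t), len(t[0])
    const = statistics.fmean(y)
    cols = list(zip(*t))
    mb = [[0.0] * n for _ in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            # Remove the constant and all other components from y ...
            partial = [
                y[i] - const - sum(mb[k][i] for k in range(d) if k != j)
                for i in range(n)
            ]
            # ... then smooth what is left against input j and recentre it.
            fit = smooth(cols[j], partial, h[j])
            centre = statistics.fmean(fit)
            mb[j] = [f - centre for f in fit]
    return mb, const


# Synthetic stand-in for the farm data (an assumption, not the real data).
random.seed(0)
t = [[random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)] for _ in range(50)]
y = [2.0 * a + b * b + random.gauss(0.0, 0.1) for a, b in t]
h = [0.5 * statistics.stdev(col) for col in zip(*t)]

mb, const = backfit_sketch(t, y, h)
```

Note that mb is stored component by component here, whereas the XploRe code above indexes the components as columns. Because every component is recentred in each pass, the constant stays equal to the mean of y, which matches the use of const in the residuals yy.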