ECONOMIC STUDIES
DEPARTMENT OF ECONOMICS
SCHOOL OF ECONOMICS AND COMMERCIAL LAW
GÖTEBORG UNIVERSITY
141

FOUR ESSAYS ON THE MEASUREMENT OF PRODUCTIVE EFFICIENCY

Dag Fjeld Edvardsen

ISBN 91-85169-00-5
ISSN 1651-4289 print
ISSN 1651-4297 online

Doctoral thesis by Dag Fjeld Edvardsen (dfe@byggforsk.no)

Preface

After graduating in economics from the University of Oslo I started working as a research assistant at the Frisch Centre in January 1999. My first task was trying to understand a strange method I had never heard of before. It was referred to as Data Envelopment Analysis (DEA). Soon I was working with Finn R. Førsund and Sverre A.C. Kittelsen on applied projects where DEA was used to measure technical efficiency. Examples were nursing homes and home care, employment offices, colleges, electricity distribution utilities, and physical therapists.

In 2000 the Norwegian Building Research Institute (NBI), in cooperation with the Frisch Centre, wrote an application to the Norwegian Research Council (NFR). The topic to be investigated was the efficiency of the Norwegian construction industry. When the application was accepted I was hired at NBI as a doctoral student. I would like to thank NFR for financing the three years it has taken to write the four essays in this thesis.

I am deeply grateful to NBI for offering me the opportunity to be part of the project “Productivity in Construction” (and for providing a very large amount of coffee). Frank Henning Holm (now head of NBI) and Grethe Bergly (now at Multiconsult) deserve thanks for hiring me. In 2001 Thorbjørn Ingvaldsen became leader of the project when Grethe Bergly went to Multiconsult.
His encouragement, humour, and patience have been enormous, as is his knowledge of the Norwegian construction industry. Jon Rønning became head of PROS (the department where this project is located) when Frank Henning Holm left to become head of NBI. Jon’s support has been without parallel. I would also like to thank my patient and understanding colleagues at NBI for their encouragement.

My thesis advisors Lennart Hjalmarsson and Finn R. Førsund have been the best advisors a doctoral student could ever wish for. Lennart has been very supportive and helped me every time things seemed very difficult. Finn has always been there for me, and his advice and support have been a necessary condition for this thesis to exist. Finn and Lennart’s knowledge of microeconomics and efficiency analysis is without doubt world class.

Sverre A.C. Kittelsen (Frisch Centre) has been an enormous resource for me. His knowledge of the subtle and difficult parts of efficiency analysis is more than impressive. He has also been invaluable when it comes to the development of the software used for the bootstrap calculations in this thesis.

The members of the reference group for “Productivity in Construction” deserve thanks for understanding that some tasks are worth doing even if they require time: Rolf Albriktsen (Veidekke ASA), Finn R. Førsund (University of Oslo/Frisch Centre), Frank Henning Holm (NBI), Sverre Larsen (BNL), Knut Samset (NTNU), Arild Thommasen (Statistics Norway at Kongsvinger), and Grethe Bergly (Multiconsult).

Last, but not least, I would like to thank my family: my late mother Laila (who died last year), my father Johny, my sister Janne, my aunt Unni, and my uncle Hugo. Their support has been invaluable, and without it this thesis would not exist.

Oslo, November 2004
Dag Fjeld Edvardsen

Contents

1. Abstract
2. Introduction
3. Essay I. International benchmarking of electricity distribution utilities
4. Essay II. Far out or alone in the crowd: classification of self evaluators in DEA
5. Essay III. Climbing the efficiency stepladder: robustness of efficiency scores in DEA
6. Essay IV. Efficiency of Norwegian construction firms

Abstract

This collection of essays contains two kinds of contributions. All four essays include applications of the existing DEA (Data Envelopment Analysis) toolbox on real world datasets. But the main contribution is that they also offer new and useful tools for practitioners doing efficiency and productivity analysis.

Essay I is about benchmarking by means of applying the DEA model on electricity distributors. A sample of large electricity distribution utilities from Denmark, Finland, Norway, Sweden and the Netherlands for the year 1997 is studied by assuming a common production frontier for all countries. The peers supporting the benchmark frontier are from all countries. New indices describing cross-country connections at the level of individual peers and their inefficient units as well as between countries are developed, and novel applications of Malmquist productivity indices comparing units from different countries are performed.

The contribution of Essay II is to develop a method for classifying self-evaluators based on the additive DEA model into interior and exterior ones. The exterior self-evaluators are efficient “by default”; there is no firm evidence from observations for the classification. These units should therefore not be regarded as efficient, and should be removed from the observations of efficiency scores when performing a two-stage analysis of explaining the distribution of the scores.
The application to municipal nursing and home care services of Norway shows significant effects of removing exterior self-evaluators from the data when doing a two-stage analysis.

The robustness of the efficiency scores in DEA is addressed in Essay III. It is of crucial importance for the practical use of efficiency scores. The purpose is to demonstrate the usefulness of a new way of getting an indication of the sensitivity of each of the efficiency scores to measurement error. The main idea is to investigate a DMU’s (Decision Making Unit) sensitivity to sequential removal of its most influential peer (with new peer identification as a part of each of the iterations). The Efficiency stepladder approach is shown to provide relevant and useful information when applied on a dataset of Nordic and Dutch electricity distribution utilities. Some of the empirical efficiency estimates are shown to be very sensitive to the validity and existence of one or a low number of other observations in the sample. The main competing method is Peeling, which consists of removing all the frontier units in each step. The new method has some strengths and some weaknesses in comparison. All in all, the Efficiency stepladder measure is simple and crude, but it is shown that it can provide useful information for practitioners about the robustness of the efficiency scores in DEA.

Essay IV is an attempt to perform an efficiency study of the construction industry at the micro level. In this essay information on multiple outputs is utilized by applying DEA on a cross section dataset of Norwegian construction firms. Bootstrapping is applied to select the scale specification of the model.
Constant returns to scale was rejected. Furthermore, bootstrapping was used to estimate and correct for the sampling bias in the DEA efficiency scores. One important lesson that can be learned from this application is the danger of taking the efficiency scores from uncorrected DEA calculations at face value. A new contribution is to use the inverse of the standard errors (from the bias correction of the efficiency scores) as weights in a regression to explain the efficiency scores. Several of the hypotheses investigated concerning the latter are found to have statistically significant empirical relevance.

Introduction

A key paradigm in neo-classical production theory is that firms operate on the production frontier. However, even a superficial observation of real production units indicates that this is most often not the case. It is then rather odd that economists continue to believe in this paradigm, and that so little effort is spent on revealing inefficiencies and their causes. Most of the old tricks learned in microeconomics become invalid since they adopt the assumption from neo-classical economics that firms behave as if they were technically efficient.

A natural starting point for developing methods for the study of productive efficiency is the seminal 1957 paper by Michael J. Farrell with the appropriate title “The measurement of productive efficiency.” Farrell’s key contribution was introducing a non-parametric method for estimating the efficient production frontier as a reference for his efficiency measures, based on enveloping data “from above.” This approach generalizes naturally to multiple inputs and multiple outputs.
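
As a point of reference, the envelopment idea can be written as a linear program. The sketch below uses the standard input-oriented, constant-returns-to-scale DEA formulation from the later literature rather than Farrell's own notation. The symbols are assumptions introduced here for illustration: $x_{ij}$ is input $i$ of unit $j$, $y_{rj}$ is output $r$ of unit $j$, $\lambda_j$ are intensity weights, and $\theta$ is the efficiency score of the unit under evaluation (unit $0$).

```latex
\[
\begin{aligned}
E_0 = \min_{\theta,\,\lambda}\ & \theta \\
\text{s.t. } & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{i0}, \quad i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0}, \quad r = 1,\dots,s, \\
& \lambda_j \ge 0, \quad j = 1,\dots,n.
\end{aligned}
\]
```

Adding the constraint $\sum_j \lambda_j = 1$ gives the variable-returns-to-scale version. Units with $\lambda_j > 0$ at the optimum are the peers of unit $0$.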
The four essays in this thesis are modest attempts to follow up the Farrell tradition as it has been developed both within economics and operations research, where the term DEA was coined in Charnes et al. (1978).

Essay I. International benchmarking of electricity distribution utilities¹

Improvement of efficiency in electricity distribution utilities has come on the agenda, as an increasing number of countries moved towards deregulation of the sector in the last decade. A key element in assessing potentials for efficiency improvement is to establish benchmarks for efficient operation. A standard definition of benchmarking is a comparison of some measure of actual performance against a reference performance. One way of obtaining a comprehensive benchmarking as opposed to partial key ratios is to establish a frontier production function for utilities, and then calculate efficiency scores relative to the frontier. In this study a piecewise linear frontier is used, and technical efficiency measures (Farrell, 1957) and Malmquist productivity measures (Caves et al., 1982a) are calculated by employing the DEA model (Charnes et al., 1978). The DEA model has been used in several studies of the utilities sector recently. A special feature of the present cross section study is that the data (for 1997) are based on a sample of utilities from five countries: Denmark, Finland, the Netherlands, Norway and Sweden. Most of the efficiency studies of utilities have focused on utilities within a single country (Førsund and Kittelsen, 1998), but a few studies have also compared utilities from different countries (Jamasb and Pollitt, 2001).
In some cases an international basis for benchmarking is a necessity due to the limited number of similar firms within a country. When the number of units is not the key motivation for an international sample for benchmarking, the motivation may be to ensure that the national best practice utilities are also benchmarked. There are some additional problems with using an international data set for benchmarking. The main problem is that of comparability of data. One is forced to use the strategy of the least common denominator. A special issue is the correct handling of currency exchange rates. There are really only two practical alternatives: the average rates of exchange and the Purchasing Power Parity (PPP) as measured by OECD. The latter approach is chosen here. Relative differences in input prices like wage rates and rates of return on capital may also create problems in distinguishing between substitution effects and inefficiency.

¹ This essay was published in Resource and Energy Economics in 2003.

According to the findings in Jamasb and Pollitt (2001), international comparisons are often restricted to comparison of operating costs because of the heterogeneity of capital. As a precondition for international comparisons they focus on improving the quality of the data collection process, auditing, and standardization within and across countries. Our data have been collected specifically for this study by national regulators, and special attention has been paid to standardizing the capital input as a replacement cost concept.

When doing international benchmarking for the same type of production activity in several countries, applying a common frontier technology seems to yield the most satisfactory environment for identifying multinational peers and assessing the extent of inefficiency. In our exercise for a sample of large electricity distribution utilities from Denmark, Finland, Norway, Sweden and the Netherlands it is remarkable that peers come from all countries. The importance of exposing national units, and especially units that would have been peers within a national technology, to international benchmarking is clearly demonstrated. The multinational setting has called for the development of new indices to capture the cross-country pattern of the nationality of peers and the nationality of units in their referencing sets. Bilateral Malmquist productivity comparisons can be performed between units of particular interest in addition to country origin, e.g. sorting by size, or location of utility (urban-rural), etc. We have focused on a single unit against the (geometric) average performance of all units, as well as bilateral comparisons of (geometric) averages of each country. Our results point to Finland as the most productive country within the common technology. This result reflects the more even distribution of the Finnish units and the high share of units above the total sample mean of efficiency scores.

Essay II. Far out or alone in the crowd: classification of self evaluators in DEA

The DEA method classifies units as efficient or inefficient.
The units found strongly efficient in DEA studies on efficiency can be divided into self-evaluators and active peers, depending on whether the peers are referencing any inefficient units or not. The term self-evaluator was introduced by Charnes et al. (1985). The contribution of the paper starts with subdividing the self-evaluators into interior and exterior ones. The exterior self-evaluators are efficient “by default”; there is no firm evidence from observations for the classification. Self-evaluators may most naturally appear at the “edges” of the technology, but it is also possible that self-evaluators appear in the interior. It may be of importance to distinguish between the self-evaluators being exterior or interior. Finding the influence of some variables on the level of efficiency by running regressions of efficiency scores on a set of potential explanatory variables is an approach often followed in actual investigations. Using exterior self-evaluators with an efficiency score of 1 in such a “two-stage” procedure may then distort the results, because assigning the value of 1 to these self-evaluators is arbitrary. Interior self-evaluators, on the other hand, may have peers that are fairly similar. They should then not be dropped when applying the two-stage approach.

A method for classifying self-evaluators based on the additive DEA model, either CRS or VRS, is developed. The exterior strongly efficient units are found by running the enveloping procedure “from below”, i.e. reversing the signs of the slack variables in the additive model, after removing all the inefficient units from the data set.
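
For orientation, the additive model referred to above is usually written as follows. This is the textbook form, not a formula taken from the essay, and the notation is assumed for illustration: $x_{ij}$ and $y_{rj}$ are inputs and outputs of unit $j$, $\lambda_j$ are intensity weights, and $s_i^-$, $s_r^+$ are the input and output slacks of the unit under evaluation (unit $0$).

```latex
\[
\begin{aligned}
\max_{\lambda,\,s^-,\,s^+}\ & \sum_{i=1}^{m} s_i^- + \sum_{r=1}^{s} s_r^+ \\
\text{s.t. } & \sum_{j=1}^{n} \lambda_j x_{ij} + s_i^- = x_{i0}, \quad i = 1,\dots,m, \\
& \sum_{j=1}^{n} \lambda_j y_{rj} - s_r^+ = y_{r0}, \quad r = 1,\dots,s, \\
& \lambda_j,\ s_i^-,\ s_r^+ \ge 0 \qquad \Big(\text{VRS adds } \sum_{j=1}^{n} \lambda_j = 1\Big).
\end{aligned}
\]
```

Unit $0$ is strongly efficient when the optimal slack sum is zero. One possible reading of "reversing the signs of the slack variables" is that the constraints become $\sum_j \lambda_j x_{ij} - s_i^- = x_{i0}$ and $\sum_j \lambda_j y_{rj} + s_r^+ = y_{r0}$, so that the remaining efficient units are enveloped from the opposite side. That reading is an assumption; the essay itself gives the exact formulation.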
Which of the strongly efficient units from the additive model turn out to be self-evaluators or active peers will depend on the orientation of the efficiency analysis, i.e. whether input or output orientation is adopted. The classification into exterior and interior peers is determined by which of the strongly efficient units turn out to be exterior ones when running the “reversed” additive model. The exterior self-evaluators should be removed from the observations on efficiency scores when performing a two-stage analysis of explaining the distribution of the scores. The application to municipal nursing and home care services of Norway shows significant effects of removing exterior self-evaluators from the data when doing a two-stage analysis. Thus the conclusions as to explanations of the efficiency score distribution will be qualified by taking our new taxonomy into use.

Essay III. Climbing the efficiency stepladder: robustness of efficiency scores in DEA

The robustness of the efficiency scores in DEA has been addressed in a number of research papers. There are several potential problems that can disturb precise efficiency estimation, such as sampling error, specification error, and measurement error. It is almost exclusively the latter that is dealt with in this paper. It has been proven analytically that the DEA efficiency estimators are asymptotically consistent given that a set of assumptions is satisfied. The most critical assumption might be that there are no measurement errors.
The DEA method estimates the production possibility set by enveloping the data as closely as possible, in the sense that the frontier consists of convex combinations of actual observations, given that the frontier estimate can never be “below” an observed value. If the assumption of no measurement error is broken we might observe input-output vectors that are outside the true production possibility set, and the DEA frontier estimate will be too optimistic. Calculating the efficiency of a correctly measured observation against this optimistic frontier will lead to efficiency scores that are biased downwards. In other words, even symmetric measurement errors can produce efficiency estimates that are too pessimistic.

It is of crucial importance for the practical use of the efficiency scores that information about their sensitivity is available. The reason why measuring sensitivity is a challenge is in a sense related to the difficulty of looking at n-dimensional space. In two dimensions, and possibly three, one can get an idea of the sensitivity of one observation’s efficiency score by visually inspecting a scatter diagram. But when the number of dimensions is higher than three, help is needed. The Efficiency Stepladder method introduced in this paper is an offer to empirically oriented DEA applications.

This paper is not about detecting outliers; it is about investigating the robustness of each DMU’s efficiency score. The main inspiration is Timmer (1971), and the intention is to offer a crude and simple method that works relatively quickly and is available to practitioners as a freely downloadable software package. In the following only DEA related approaches are considered.

There are mainly two ways sensitivity to measurement error in DEA has been examined: (1) perturbations of the observations, often with strong focus on the underlying LP model, and (2) exclusion of one or more of the observations in the dataset. The Efficiency Stepladder is based on the latter alternative. The main idea is to examine how the efficiency score of a given inefficient DMU develops as the most influential other DMU is removed in each of the iterative steps. The first step is to determine which of the peers is the one whose removal is associated with the largest increase in the efficiency score. This peer is permanently removed, and the DEA model is recalculated, giving a new efficiency score and a new set of peers. The removal continues in this fashion until the DMU in question is fully efficient. This series of iterative DMU exclusions provides an “efficiency curve” of the increasing efficiency values connected with each step.

There are few alternative approaches available that provide information about the sensitivity of efficiency scores. Related methods in the literature are Peeling (Barr et al., 1994), Efficiency Order (Sinuany-Stern et al., 1994) and Efficiency Depth (Cherchye et al., 2000). Peeling consists of removing all the frontier units in each step. There are also similarities between the Efficiency stepladder and the Efficiency Order/Efficiency Depth methods.
The main difference is that the Efficiency stepladder approach is concerned with the stepwise increase in the efficiency scores after each iterative peer removal, while the Efficiency Order/Efficiency Depth methods are more concerned with the number of observation removals required for the DMU in question to reach full efficiency.

The empirical application is mainly used as an illustration of how the Efficiency stepladder method works on real world data. The application is used to show what kind of analysis can be performed using this method. To carry out a full scale empirical analysis is an extensive undertaking, and is outside the scope of this paper.

Ideally sensitivity analysis, detection of potential outliers, and estimation of sampling bias should be carried out simultaneously. It is easier to detect outliers if we have some information about the sampling bias, and it is easier to estimate sampling bias if we have first identified the outliers. There have been developments in all these areas in the last few years, but at the time of writing no single method offers a solution to all the mentioned challenges.

The Efficiency stepladder method is simple and crude, but it can still be useful for applied DEA investigations. It should be thought of as one way safe: an Efficiency stepladder that is very steep is a clear indication that the DEA estimated efficiency is strongly dependent on the correctness of a low number of other observations. A slow increase, on the other hand, should not be interpreted as a strong indication that the efficiency is at least this low. The reason is that the method is only one-step-optimal.
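
The iterative peer-removal procedure described above can be sketched in a few lines. This is a minimal illustration under strong simplifying assumptions, not the author's software: it uses one input, two outputs and constant returns to scale, so that the DEA optimum can be found by enumerating single units and pairs instead of calling an LP solver. The function names, the tie-breaking rule and the data are all hypothetical.

```python
from itertools import combinations


def crs_efficiency(units, target, active):
    """Farrell efficiency of `target` against the CRS frontier of `active`.

    units: name -> (input, output1, output2), all strictly positive (assumed).
    With one input and two outputs the optimum needs at most two reference
    units, so checking every single unit and every pair is exact here.
    """
    x0, p0, q0 = units[target]
    # Outputs per unit of input, rescaled so that the target sits at (1, 1).
    u = {j: (units[j][1] / units[j][0] * x0 / p0,
             units[j][2] / units[j][0] * x0 / q0) for j in active}
    best_phi, peers = 0.0, ()
    for j in sorted(active):                      # one reference unit
        if min(u[j]) > best_phi:
            best_phi, peers = min(u[j]), (j,)
    for j, k in combinations(sorted(active), 2):  # mix of two reference units
        da, db = u[j][0] - u[j][1], u[k][1] - u[k][0]
        if da + db == 0:
            continue
        t = db / (da + db)                        # weight on j where both outputs tie
        if 0 < t < 1:
            phi = t * u[j][0] + (1 - t) * u[k][0]
            if phi > best_phi:
                best_phi, peers = phi, (j, k)
    return 1.0 / best_phi, set(peers)


def efficiency_stepladder(units, target):
    """Remove the most influential peer step by step until `target` is efficient."""
    active = set(units)
    score, peers = crs_efficiency(units, target, active)
    curve, removed = [score], []
    while score < 1 - 1e-9:
        # Trial-remove each current peer and keep the removal that helps most.
        trial = {p: crs_efficiency(units, target, active - {p})[0]
                 for p in sorted(peers - {target})}
        drop = max(trial, key=trial.get)          # ties: first name alphabetically
        active.remove(drop)
        removed.append(drop)
        score, peers = crs_efficiency(units, target, active)
        curve.append(score)
    return curve, removed


# Hypothetical data: (input, output 1, output 2) for four units.
units = {"A": (1, 4, 1), "B": (1, 1, 4), "C": (1, 3, 3), "D": (1, 2, 2)}
curve, removed = efficiency_stepladder(units, "D")
print(curve, removed)
```

For this toy data the score of unit D rises from about 0.67 to 0.8 when its only peer C is removed, and reaches 1 after one further removal. The list `curve` plays the role of the “efficiency curve” mentioned in the text; a steep first step signals dependence on a single other observation.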

In addition to measuring the sensitivity of the efficiency scores for efficient and inefficient units, the method might be used in combination with bootstrapping to identify possible outliers. The necessary software for carrying out the Efficiency stepladder calculations will be made available from the author’s website.

The purpose of the Efficiency stepladder (ESL) method is to examine the sensitivity of the efficiency scores to measurement errors. Bootstrapping, on the other hand, is in the DEA context (primarily) used to measure sensitivity to sampling errors. We would expect that a DMU with a large ESL(1) value would also have a large standard error of the bias corrected efficiency score. The reason is that we expect the part of the (input, output) space where the DMU is located to be sparsely populated. Tentative runs have shown a statistically significant and positive correlation between the ESL(1) values and the standard errors of the bootstrapped bias corrected efficiency scores. Furthermore, there is a strong empirical association between the ESL(1) values for the fully efficient DMUs (= superefficiency) and the sampling bias estimated using bootstrapping. This is a promising topic for further research.

Essay IV. Efficiency of Norwegian construction firms

Low productivity growth of the construction industry in the nineties (based on national accounting figures) is causing substantial concern in Norway. To identify the underlying causes, investigations at the micro level are needed. However, efficiency studies at the micro level of the construction industry are very rare. The objective of this study is to analyze productive efficiency in the Norwegian construction industry.
A piecewise linear frontier is used, and technical efficiency measures (Farrell, 1957) are calculated on cross section data following a DEA (data envelopment analysis) approach (Charnes et al., 1978). The DEA efficiency scores are bias corrected by bootstrapping (Simar and Wilson, 1998, 2000), and a bootstrapped scale specification test is performed (Simar and Wilson, 2002). A new contribution is to use weights based on the standard errors from the bootstrapped bias correction in the two stage model when searching for explanations for the efficiency scores.

One reason for the small number of efficiency analyses of the construction industry may be the problem of “identifying” the activities in terms of technology, inputs and outputs in this industry. It is well known that there are large organizational and technological differences between building firms. Even when the products are seemingly similar there are large differences in the way projects are carried out. For instance, some building projects use a large share of prefabricated elements, while other projects produce almost everything on the building site. This often happens even when the resulting construction is seemingly similar. It is interesting to note that projects with such large differences in the technological approach can exist at the same time. Moreover, the composition of output varies a lot between different construction companies, so the definition of the output vector may also be a problem. Thus to capture such industry characteristics, a multiple input multiple output approach is required. Large differences in the efficiency and productivity scores were discovered.

One important lesson that can be learned from this application is the danger of taking the efficiency scores from uncorrected DEA calculations at face value. If one decided to learn from a few DMUs based on their uncorrected efficiency scores, one might get into trouble. It is not unreasonable to think that similar things have happened in the last few years as DEA has been embraced by a very large number of practitioners (researchers and consultants). It would be interesting if the large number of empirical DEA papers were recalculated using the bootstrap methodology. Anecdotal observations indicate that very few practitioners use bootstrapping. The reason for this might be that bootstrapping is not yet available in the standard DEA software packages.

Based on a scale specification test, a variable returns to scale specification was selected. A scale chart indicated that firms with total production values lower than 100 mill. NOK might be operating at a suboptimal scale level.

The differences in the efficiency scores may be explained by environmental and managerial variables. Such variables have been tried in a two stage approach. A new contribution is the demonstration of how one can use the standard errors from the bias correction in stage one to improve the power of the regression model in stage two. Five possible explanations were examined for empirical relevance, and four of them were found to be statistically significant in a multivariate weighted regression setting.
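
The idea of weighting the second-stage regression by the precision of the first-stage scores can be illustrated with a small weighted least squares fit. This is a sketch under stated assumptions, not the essay's estimator: it uses a single explanatory variable, it takes the weight on each squared residual to be the inverse of the standard error (the text does not say whether the inverse or its square is applied), and the function name and all numbers are hypothetical.

```python
def weighted_least_squares(z, y, se):
    """Fit y = a + b*z, weighting each squared residual by 1/se (assumed)."""
    w = [1.0 / s for s in se]
    sw = sum(w)
    # Weighted means of the regressor and the response.
    z_bar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (zi - z_bar) * (yi - y_bar) for wi, zi, yi in zip(w, z, y))
    sxx = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, z))
    slope = sxy / sxx
    return y_bar - slope * z_bar, slope


# Hypothetical stage-one output: bias-corrected scores and their standard errors.
scores = [0.60, 0.70, 0.95]
std_errors = [0.02, 0.02, 0.20]   # the third score is estimated imprecisely
wage = [1.0, 2.0, 3.0]            # hypothetical explanatory variable

intercept, slope = weighted_least_squares(wage, scores, std_errors)
unweighted = weighted_least_squares(wage, scores, [1.0, 1.0, 1.0])
print(slope, unweighted[1])
```

In this toy example the imprecisely estimated third observation pulls the fitted slope less when weights are used (0.13 against 0.175 without weights), which is the mechanism by which the weighting is meant to help in stage two.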
More detailed data would be necessary before strong conclusions can be made, but there are indications that the most efficient building firms are characterized by high average wages, low numbers of apprentices, diversified product mixes and high numbers of hours worked per employee.

References

Barr, R.S., M.L. Durchholz and L. Seiford, 1994, Peeling the DEA Onion. Layering and Rank-Ordering DMUs Using Tiered DEA, Southern Methodist University technical report, Dallas, Texas.
Caves, D.W., L.R. Christensen and E. Diewert, 1982a, The economic theory of index numbers and the measurement of input, output, and productivity, Econometrica 50, 1393-1414.
Charnes, A., W.W. Cooper and E. Rhodes, 1978, Measuring the efficiency of decision making units, European Journal of Operational Research 2, 429-444.
Charnes, A., W.W. Cooper, A.Y. Lewin, R.C. Morey and J.J. Rousseau, 1985, Sensitivity and Stability Analysis in DEA, Annals of Operations Research 2, 139-150.
Cherchye, L., T. Kuosmanen and G.T. Post, 2000, New Tools for Dealing with Errors-In-Variables in DEA, Katholieke Universiteit Leuven, Center for Economic Studies, Discussion Paper Series DPS 00.06.
Farrell, M.J., 1957, The measurement of productive efficiency, J.R. Statis. Soc. Series A 120, 253-281.
Førsund, F.R. and S.A.C. Kittelsen, 1998, Productivity development of Norwegian electricity distribution utilities, Resource and Energy Economics 20(3), 207-224.
Jamasb, T. and M. Pollitt, 2001, Benchmarking and regulation: international electricity experience, Utilities Policy 9(3), 107-130.
Sinuany-Stern, Z., A. Mehrez and A. Barboy, 1994, Academic Departments Efficiency via DEA, Computers Ops. Res. 21(5), 543-556.
Simar, L. and P.W. Wilson, 1998, Sensitivity analysis of efficiency scores: How to bootstrap in nonparametric frontier models, Management Science 44, 49-61.
Simar, L. and P. Wilson, 2000, A general methodology for bootstrapping in nonparametric frontier models, Journal of Applied Statistics 27, 779-802.
Simar, L. and P. Wilson, 2002, Nonparametric Tests of Returns to Scale, European Journal of Operational Research 139, 115-132.
Timmer, C.P., 1971, Using a Probabilistic Frontier Production Function to Measure Technical Efficiency, Journal of Political Economy 79(4), 776-794.

INTERNATIONAL BENCHMARKING OF ELECTRICITY DISTRIBUTION UTILITIES∗

by

Dag Fjeld Edvardsen
The Norwegian Building Research Institute
Forskningsvn. 3 b, P.O. Box 123, Blindern, 0314 Oslo, Norway

and

Finn R. Førsund±
Department of Economics, University of Oslo, and the Frisch Centre
P.O. Box 1095, Blindern, 0317 Oslo, Norway

Abstract: Benchmarking by means of applying the DEA model is appearing as an interesting alternative for regulators under the new regimes for electricity distributors. A sample of large electricity distribution utilities from Denmark, Finland, Norway, Sweden and the Netherlands for the year 1997 is studied by assuming a common production frontier for all countries. The peers supporting the benchmark frontier are from all countries. New indices describing cross-country connections at the level of individual peers and their inefficient units as well as between countries are developed, and novel applications of Malmquist productivity indices comparing units from different countries are performed.
Key words: Electricity distribution utility, benchmarking, efficiency, DEA, Malmquist productivity index

JEL classification: C43, C61, D24, L94.

∗ The study is done within the research project “Efficiency in Nordic Electricity Distribution” at the Frisch Centre, financed by the Nordic Economic Research Council. Finn R. Førsund was visiting fellow at ICER during the fall 2001 and spring 2002 when completing the paper. We are indebted to a group of Danish, Dutch, Finnish, Norwegian and Swedish electricity regulators for cooperation and comments on earlier drafts at project meetings in Denmark, Norway, Finland and the Netherlands. We will especially thank Susanne Hansen, Kari Lavaste and Victoria Shestalova for written comments. We are indebted to Sverre A. C. Kittelsen for valuable comments on the last draft, and a referee for stimulating further improvements. The electricity regulators, headed by Arne Martin Torgersen and Eva Næss Karlsen from NVE, have done extensive work on data collection. However, notice that the responsibility for the final model choice and focus of the study rests with the authors. Furthermore, the analysis is only addressing technical efficiency measurement, and in particular not cost efficiency. The study is not intended for regulatory purposes.

± Corresponding author. Tel.: +47-2285-5132; fax: +47-2285-5035. Email address: f.r.forsund@econ.uio.no (F.R. Førsund).

1. Introduction

Improvement of efficiency in electricity distribution utilities has come on the agenda, as an increasing number of countries moved towards deregulation of the sector in the last decade. A key element in assessing potentials for efficiency improvement is to establish benchmarks for efficient operation.
A standard definition of benchmarking is a comparison of some measure of actual performance against a reference performance. One way of obtaining a comprehensive benchmarking as opposed to partial key ratios is to establish a frontier production function for utilities, and then calculate efficiency scores relative to the frontier. In this study a piecewise linear frontier is used, and technical efficiency measures (Farrell, 1957) and Malmquist productivity measures (Caves et al., 1982a) are calculated by employing the DEA model (Charnes et al., 1978).

The DEA model has been used in several studies of the utilities sector recently (see a review in Jamasb and Pollitt, 2001). A special feature of the present cross-section study is that the data (for 1997) are based on a sample of utilities from five countries: Denmark, Finland, the Netherlands, Norway and Sweden. Most of the efficiency studies of utilities have focused on utilities within a single country (Førsund and Kittelsen, 1998), but a few studies have also compared utilities from different countries (Jamasb and Pollitt, 2001). In some cases an international basis for benchmarking is a necessity due to the limited number of similar firms within a country. When the number of units is not the key motivation for an international sample for benchmarking, the motivation may be to ensure that the national best practice utilities are also benchmarked.¹

There are some additional problems with using an international data set for benchmarking. The main problem is that of comparability of data. One is forced to use the strategy of the least common denominator. A special issue is the correct handling of currency exchange rates.
There are really only two practical alternatives: the average rates of exchange and the Purchasing Power Parity (PPP) as measured by OECD. The latter approach is chosen here. Relative differences in input prices like wage rates and rates of return on capital may also create problems in distinguishing between substitution effects and inefficiency.

¹ An alternative is to use hypothetical units based on engineering information, as mentioned already in Farrell (1957). In Chile and Spain hypothetical model best practice units are used for benchmarking (Jamasb and Pollitt, 2001).

According to the findings in Jamasb and Pollitt (2001) international comparisons are often restricted to comparison of operating costs because of the heterogeneity of capital. As a precondition for international comparisons they focus on improving the quality of the data collection process, auditing, and standardization within and across countries. Our data have been collected specifically for this study by national regulators, and special attention has been paid to standardizing the capital input as a replacement cost concept.

Regarding the extent of international studies Jamasb and Pollitt (2001) found that 10 of the countries covered in the survey (OECD and some non-OECD countries) have used some form of benchmarking, and about half of these use the frontier-oriented methods: DEA, Corrected Least Squares (COLS) and the Stochastic Frontier Approach (SFA). They predict that benchmarking is likely to become more common as more countries implement power sector reforms. (For an opposing view, see Shuttleworth, 1999.)
The rest of the paper is organized in the following way: In Section 2 the DEA model is introduced and new indices are developed to capture the cross-country pattern of the nationality of peers and the nationality of units in their sets of associated inefficient units. Malmquist productivity approaches are developed for cross-section international comparisons. In Section 3 the theory of distribution of electricity as production is briefly reviewed with regard to the choice of variable specification. Structural differences between the countries revealed by the data are illustrated. The results on efficiency distributions and inter-country productivity differences using Malmquist indices are presented in Section 4. Conclusions and further research options are offered in Section 5.

2. The methodological approach

2.1. The DEA model

As a basis for benchmarking we will employ a piecewise linear frontier production function exhibiting the transformations between outputs, $y_m$ ($m = 1,..,M$), and the substitutions between inputs, $x_s$ ($s = 1,..,S$). We will assume constant returns to scale (CRS). The frontier is enveloping the data as tightly as possible, and observed utilities, termed best practice, will form the benchmarking technology. The Farrell technical efficiency measures are calculated simultaneously with determining the nature of the envelopment, subject to basic properties of the general transformation of inputs into outputs (Färe and Primont, 1995). The efficiency scores for the input-oriented DEA model, $E_i$ for utility no. $i$ ($i \in N$ = set of units), are found by solving the following linear program:

$$
\begin{aligned}
E_i = \min\ & \theta_i \\
\text{s.t.}\quad & \sum_{j \in N} \lambda_{ij} y_{mj} - y_{mi} \ge 0, \quad m = 1,..,M \\
& \theta_i x_{si} - \sum_{j \in N} \lambda_{ij} x_{sj} \ge 0, \quad s = 1,..,S \\
& \lambda_{ij} \ge 0, \quad j \in N
\end{aligned}
\tag{1}
$$

The point $\left(\sum_{j \in N} \lambda_{ij} x_{1j}, .., \sum_{j \in N} \lambda_{ij} x_{Sj}, \sum_{j \in N} \lambda_{ij} y_{1j}, .., \sum_{j \in N} \lambda_{ij} y_{Mj}\right)$ is on the frontier and is termed the reference point. In the CRS case the input- and output-oriented scores are identical. However, we may need to keep non-discretionary variables fixed when calculating the efficiency scores. Then, in the case of an output fixed, the input-oriented model (1) and the scores remain the same. But if one of the inputs is fixed the efficiency correction of that input constraint in (1) is dropped and the numerical results for efficiency scores may be different.²

2.2. The Peers

The efficient units identified by solving the problem (1) are defined as peers if the efficiency score is 1 and all the output- and input constraints in (1) are binding. Each inefficient unit will be related to one or more benchmark or peer units. Let $P$ be the set of peers and $I$ the set of inefficient units, $P \cup I = N$. A Reference set or Peer group set for an inefficient unit, $i$, (Cooper, Seiford and Tone, 2000), is defined as:

$$ P_i = \{ p \in P : \lambda_{ip} > 0 \}, \quad i \in I \tag{2} $$

Each inefficient unit, $i$, has a positive weight, $\lambda_{ip}$, associated with each of its peers, $p$, from the solution of the DEA model (1). The weights, $\lambda_{ip}$, are zero for inefficient units not having unit $p$ as a peer. Since all peers have the efficiency score of one there is a need to discriminate between peers as to importance as role models.
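To make program (1) and the peer group set (2) concrete, the following is a minimal sketch for the special case of one input and one output, where the CRS program has a closed-form solution: the frontier is the ray through the unit with the highest output-input ratio. The function name, the unit labels and the numbers are hypothetical and not taken from the study; the general multi-input, multi-output case would need a linear programming solver instead, and ties for the best ratio are not handled here.

```python
def dea_crs_single(x, y):
    """Input-oriented CRS efficiency for ONE input and ONE output.

    x, y: dicts mapping unit id -> strictly positive input / output quantity.
    Returns (scores, weights), where scores[i] is E_i and weights[i] holds
    the non-zero lambda_ij of unit i (illustrative special case only).
    """
    # With a single input and output, the CRS frontier is spanned by the
    # unit with the highest output-input ratio.
    best = max(x, key=lambda j: y[j] / x[j])
    best_ratio = y[best] / x[best]
    scores, weights = {}, {}
    for i in x:
        # theta_i is the ratio of unit i's productivity to frontier productivity.
        scores[i] = (y[i] / x[i]) / best_ratio
        # lambda scales the frontier unit so that its output equals y_i.
        weights[i] = {best: y[i] / y[best]}
    return scores, weights


# Hypothetical data: three utilities, one input and one output each.
x = {"A": 10.0, "B": 20.0, "C": 30.0}
y = {"A": 5.0, "B": 20.0, "C": 15.0}
scores, weights = dea_crs_single(x, y)

# Peer group set (2) for each inefficient unit: peers with positive weight.
peer_groups = {i: {p for p, w in weights[i].items() if w > 0}
               for i in scores if scores[i] < 1.0}
```

In this toy data set unit B spans the frontier, so both A and C have B as their only peer and their reference points are scaled-down copies of B.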
Measures used in the literature are a pure count measure based on the number of peer group sets (2) that a peer is a member of, a Super-Efficiency measure calculated for a peer against a frontier recalculated without this peer in the data set supporting the frontier (Andersen and Petersen, 1993), and a Peer index (Torgersen et al., 1996) showing the importance of a peer as a role model based on the share of the input savings of the inefficient units referenced by a peer, weighted by the weights $\lambda_{ip}$ found by solving (1).

² Correspondingly, an output-oriented model will be different if one of the outputs is fixed (but not if one of the inputs is fixed), since the constraint involving this variable will be reformulated to hold without the efficiency correction of the output variable for the unit being investigated.

2.3. Cross group influence of peers

For our situation with units from different countries we are more interested in developing measures that show the interconnections between peers and inefficient units from different countries. We will need to consider a peer and the set of inefficient units that are referenced by the peer. We will term this set, apparently new in the literature, $I_p$, the Referencing set for a peer, $p$:

$$ I_p = \{ i \in I : \lambda_{ip} > 0 \}, \quad p \in P \tag{3} $$

One approach is to focus on the country distribution of the inefficient units in a peer’s referencing set. Units must now be identified by country. Let $L$ be the set of countries and $I^q$ the set of inefficient units of country $q$ ($\cup_{q \in L} I^q = I$).
Partitioning the Referencing set (3) by grouping the inefficient units according to country yields:

$$ I_p^q = \{ i \in I^q : \lambda_{ip} > 0 \}, \quad p \in P,\ q \in L,\ \bigcup_{q \in L} I_p^q = I_p \tag{4} $$

Let the number of units in the Referencing set (3) be $\#I_p$, the number of units in the set (4) be $\#I_p^q$ and the set of peers from country $q$ be $P^q$ ($\cup_{q \in L} P^q = P$). The Degree of peer localness index, $DL_p^q$, for peer $p$ in country $q$, is then defined as:

$$ DL_p^q = \frac{\#I_p^q}{\#I_p}, \quad p \in P^q,\ q \in L \tag{5} $$

The index varies between zero and one. Zero means that the peer is “extreme-international”, only referencing inefficient units from other countries, and one means that the peer is “extreme-national”, only referencing inefficient units from its own country.

In Schaffnit et al. (1997) a count measure was developed describing the number of inefficient units belonging to a group referenced by peers from another group, relative to the total number of units of the first group (maybe the number of inefficient units would be more appropriate). In order to obtain more detailed information we will instead develop an index for Cross-country peer importance by using characteristics of the inefficient units analogous to the Peer index mentioned above.
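As an illustration of the referencing set (3) and the Degree of peer localness index (5), the sketch below builds each peer's referencing set from a table of DEA weights and counts the share of own-country units in it. The function name, the unit labels, the country codes and the weights are all hypothetical; in practice the weights would come from solving program (1).

```python
def peer_localness(lam, country):
    """Degree of peer localness, index (5), for every referenced peer.

    lam:     dict inefficient unit -> dict peer -> lambda weight
    country: dict unit -> country code (peers and inefficient units)
    """
    # Referencing set (3): inefficient units with a positive weight on peer p.
    referencing = {}
    for i, peer_weights in lam.items():
        for p, w in peer_weights.items():
            if w > 0:
                referencing.setdefault(p, set()).add(i)
    # Index (5): share of the referencing set that shares the peer's country.
    return {p: sum(country[i] == country[p] for i in units) / len(units)
            for p, units in referencing.items()}


# Hypothetical weights for three inefficient units and two peers.
lam = {
    "n1": {"P_NO": 0.6, "P_SE": 0.4},
    "n2": {"P_NO": 1.0},
    "s1": {"P_NO": 0.3, "P_SE": 0.7},
}
country = {"n1": "NO", "n2": "NO", "s1": "SE", "P_NO": "NO", "P_SE": "SE"}
dl = peer_localness(lam, country)
```

Here the Norwegian peer is referenced by two Norwegian units and one Swedish unit, while the Swedish peer is referenced by one unit from each country. Peers that no inefficient unit references do not appear in the result, since index (5) is undefined for an empty referencing set.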
In the case of input orientation³ the index, $\rho_{qr}^s$, can be established by weighting the saving potential of an input, $s$, for the inefficient units from a country, $q$ ($= x_{ks}(1 - E_k),\ k \in I^q$), with the relevant $\lambda_{kp}$-weights associated with peers from another country, $r$ ($p \in P^r$), being in the peer group set of the inefficient units from country $q$, and then comparing with the total saving potential of all inefficient units in country $q$:

$$ \rho_{qr}^s = \frac{\sum_{k \in I^q} \sum_{p \in P^r} \left( \lambda_{kp} / \sum_{p' \in P} \lambda_{kp'} \right) x_{ks} (1 - E_k)}{\sum_{k \in I^q} x_{ks} (1 - E_k)}, \quad s = 1,..,S,\ q, r \in L \tag{6} $$

The weights in the numerator are normalized with the sum of weights for all peers for the inefficient unit $k$ from country $q$. In the variable returns to scale case this sum is restricted to be one, but not in the CRS case we are working with. This index will be input (output) variable specific, as is the case for the Peer index. The maximal value of the index is 1. This will be the case if peers belonging to country $r$ reference all the inefficient units of country $q$, and these units are not referenced by peers from any other country. The minimal index value of zero is obtained if peers from country $r$ do not reference any inefficient unit from country $q$.

2.4. The Malmquist productivity index

The Malmquist productivity index, introduced in Caves et al. (1982a), is a binary comparison of the productivity of two entities, usually the same unit at different points in time, but we may also compare different units at the same point in time. Let the set of units in country $q$ be $N^q$, etc. ($\cup_{q \in L} N^q = N$). The output- and input vectors of a unit, $j$, are written $y_j = (y_{1j},..,y_{Mj})$, $x_j = (x_{1j},..,x_{Sj})$, $j \in N$.
The Malmquist productivity index, $M_{k,l}^q$, for the two units $k$ and $l$ from country $q$ and $r$ respectively, is:

$$ M_{k,l}^q (y_k, x_k, y_l, x_l) = \frac{E_l^q (y_l, x_l)}{E_k^q (y_k, x_k)}, \quad k \in N^q,\ l \in N^r,\ q, r \in L \tag{7} $$

The Malmquist index is the ratio of the Farrell technical efficiency measures for the two units, as calculated by solving the program (1). The superscript on the indexes shows the reference technology base (relevant for one of the units being compared, i.e. $q$ means that the efficiency measures are calculated with respect to the frontier for country $q$). We follow the convention of having the unit indicated first in the subscript of the Malmquist index on the left-hand side of (7) in the denominator and the second in the numerator, thus unit $l$ is more productive than unit $k$ if $M_{k,l}^q > 1$, and vice versa. If it is appropriate to operate with different reference technologies for countries, following Färe et al. (1994) the Malmquist index can be decomposed multiplicatively into a term reflecting each unit catching up with its reference technology, and a term reflecting the distance between the two reference technologies.⁴

³ An output-oriented Cross-country peer index can be formulated analogously following the definition of the Peer index in Torgersen et al. (1996) for output orientation.

Since we are dealing with countries it may also be of interest to compare productivity levels between countries. The crucial point concerning how to construct indices for comparisons is the assumption about production technologies. There are two basic alternatives:

i) A common frontier technology may be assumed, allowing utilities from different countries to support the DEA envelope.
ii) The technologies are national, i.e. only own country units may be best practice ones.
2.5. Common inter-country technology

As pointed out in Caves et al. (1982b) it is an advantage to use a circular index when comparing productivities of two countries (units). Berg et al. (1992), (1993), and Førsund (1993) demonstrate that the Malmquist index (7) is not circular (see also the general discussion in Førsund, 2002). In the case of the same frontier technology being valid for all countries, corresponding to assumption i) above, the index is then circular. The calculation of the Malmquist productivity index is greatly simplified, since the benchmark technology will be common for all productivity calculations. The notation of the expressions below is simplified by removing the technology index.

⁴ An application of such decomposition in a study of Norwegian electricity distributors is found in Førsund and Kittelsen (1998).

A useful characterization of the productivity of a unit, $k$, may be obtained by comparing the efficiency score for this unit with the geometric mean of all the other scores, following up Caves et al. (1982b), (p. 81, Eq. (34)), where the productivity of one unit was measured against the geometric mean of the productivities of all units:

$$ M_k = \frac{E_k (y_k, x_k)}{\left[ \prod_{l \in N} E_l (y_l, x_l) \right]^{1/\#N}}, \quad k \in N \tag{8} $$

where $\#N$ is the total number of all utilities. This geometric mean-based Malmquist index is a function of all observations.
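The circularity property and index (8) can be checked numerically. The sketch below assumes a common frontier, so that every efficiency score refers to the same technology, and uses hypothetical scores; the function names are illustrative only. Following (8) as printed, the geometric mean is taken over all units in $N$, including unit $k$ itself.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def malmquist_pair(scores, k, l):
    """Index (7) under a common frontier: productivity of unit l relative to unit k."""
    return scores[l] / scores[k]


def malmquist_vs_sample(scores, k):
    """Index (8): score of unit k relative to the geometric mean of all scores."""
    return scores[k] / geometric_mean(list(scores.values()))


# Hypothetical efficiency scores against one common frontier.
scores = {"a": 0.5, "b": 0.8, "c": 1.0}

# Circularity: comparing a with c directly equals chaining through b.
direct = malmquist_pair(scores, "a", "c")
chained = malmquist_pair(scores, "a", "b") * malmquist_pair(scores, "b", "c")

# By construction, the indices (8) multiply to one over the whole sample.
product_of_indices = math.prod(malmquist_vs_sample(scores, k) for k in scores)
```

With national frontiers the two scores in a chained comparison would generally refer to different technologies, and the equality between `direct` and `chained` would no longer be guaranteed.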
To focus on bilateral productivity comparisons between countries, one way of formulating this is to compare the geometric means of efficiencies over units for each country, $q$ and $r$, symbolized by the sub-index $g(r,q)$:

$$ M_{g(r,q)} = \frac{\left[ \prod_{k \in N^q} E_k (y_k, x_k) \right]^{1/\#N^q}}{\left[ \prod_{l \in N^r} E_l (y_l, x_l) \right]^{1/\#N^r}}, \quad q, r \in L \tag{9} $$

where $\#N^q$ and $\#N^r$ are the total number of utilities within country $q$ and $r$ respectively. This geometric mean-based Malmquist index is a function of all the observations in countries $r$ and $q$. The index may be termed the bilateral country productivity index, and is circular, in the sense that the index is invariant with respect to which third country efficiency score average we may wish to compare with countries $q$ and $r$.

If we want to express how, on the average, the units within a country, $q$, are doing compared with the average over all units, the country $r$ specific index in the denominator of (9) can be substituted with the geometric average of the efficiency scores of all the utilities, i.e. the denominator in (8). The geometric mean of efficiencies for units within a country, symbolized by the sub-index $g(q)$, is compared with the geometric mean over all units:

$$ M_{g(q)} = \frac{\left[ \prod_{k \in N^q} E_k (y_k, x_k) \right]^{1/\#N^q}}{\left[ \prod_{l \in N} E_l (y_l, x_l) \right]^{1/\#N}}, \quad q \in L \tag{10} $$

3. Model specification and data

3.1. Distribution as production

In the review of transmission and distribution efficiency studies Jamasb and Pollitt (2001) point to the variety of variables that have been used as an indication that there is no firm consensus on how the basic functions of electric utilities are to be modeled as production activities. However, they mention that this may, to some extent, be explained by the lack of data.
Modeling the production activity of transportation of electricity has old traditions within engineering economics (see e.g. Førsund (1999) for a review). On a general abstract level the outputs of distribution utilities are the energy delivered through a network of lines and transformers to the consumption nodes of the network and losses in lines and transformers. The inputs are the energy received by the utility, real capital in the form of lines and transformers, and labor and materials used for general distribution activities. Due to the high number of customers for a standard utility it is impossible to implement the conceptualization of a multi-output production function to the full extent. The usual approximation is to operate with total energy delivered and number of customers separately as outputs (Salvanes and Tjøtta, 1994). The latter variable is also often used in engineering studies as the key dimensioning output variable, and taken as the absolute size of a utility (Weiss, 1975).

In engineering studies the load density may be a characterization of capital. Load density is the product of customer density and coincident peak load per customer (kWh per square mile). The maximum peak load may also describe capital as a quality attribute, or be used as an output attribute characterizing energy delivered.

In the short run the utilities take the existing lines, transformer capacity and the geographical distribution and number of customers as given. But, as pointed out in Neuberg (1977), this is not the same as saying that these variables must be regarded as constants in our analysis. Past decisions reflected in configurations of lines and transformers may give rise to current differences in efficiency.
These variables, which are exogenous for the firm, may be seen as endogenous from the point of view of society. Even distribution jurisdictions can be rearranged, making the number of customers endogenous.

The role of lines varies. It can be regarded as a capital input, but it is also used as a proxy for the geographical extent of the service area. For fixed geographical distribution of customers the miles of distribution line would be approximately set (but note the possibilities of inefficient configurations), thus line length may serve as a proxy for service area. The service area can be measured in different ways. The problem is to find a measure the utility cannot influence (see Kittelsen (1993) and Langset and Kittelsen, 1997). Due to the probability of wire outage and the cost of servicing, the extent of the customer area will influence distribution costs. Non-traditional variables such as size of service area may also be used to specify differences in the production system or technology from utility to utility.

According to the extensive review in Jamasb and Pollitt (2001) the most frequently used inputs are operating costs, number of employees, transformer capacity, and network length. The most widely used outputs are units of energy delivered, number of customers, and size of service area.

3.2. Choice of model specification

Concerning our choice of input variables it has not been possible to use a volume measure of labor due to the lack of this information for one country (Denmark). Instead a cost measure has been adopted. Labor cost, other operating costs and maintenance have been aggregated to total operating and maintenance costs (TOM). We then face the problem mentioned in the introduction about national differences in wages for labor. It has been chosen to measure TOM in Swedish (SEK) prices.
A measure for real capital volume has been established for 1997 by the involved regulators by first creating for the sample utilities a physical inventory of existing real capital in the form of length of types of lines (air, underground and sea) distributed in three classes according to voltage, categories of transformers according to type (distribution, main) and capacity in kV, transformer kiosks for distribution, and transformer stations for main transformers. The number of capital items for each country has been in the range of 60 to 100. As a measure of real capital the replacement value (RV) is the theoretically correct measure (Johansen and Sørsveen, 1967). To obtain such a measure, aggregation over the categories has been necessary due to the large number of items. The same weights should be used, i.e. using national prices will not yield a correct picture if prices differ. It has been chosen to use Norwegian prices for all countries. A more preferred set of weights may be average prices for all countries, but it has not been feasible to establish such a database for this study. Although lines and transformers have been used separately as inputs in the literature (see e.g. Hjalmarsson and Veiderpass (1992a), (1992b) and Jamasb and Pollitt, 2001), the groups have been aggregated into a single aggregated capital volume measure in this study, partly due to different classification systems used by the countries.

We will simplify on the energy input side and only use the loss in MWh in the system as a proxy for input. This variable will also capture a quality component of the distribution system. A problem is that data on losses may be measured with less precision due to measuring periods not coinciding with the calendar year.
For some countries an average loss for the last three years is used, while loss for the last year or its estimate is used for other countries.

On the output side energy delivered and the number of customers are used as outputs. The countries have information on low and high voltage, but since the classification of high and low voltage differs, we had to use the aggregate figures. Some measure of geographical configuration of the distribution networks should also be included for a relevant analysis of efficiency. In this study the total length of distribution lines is the only available measure for service area. In addition to service area the density of customers of a distributor is usually considered to influence the efficiency. But when using absolute number of customers and energy delivered as separate outputs there is no room for an additional density variable of the type energy per customer. By nature of the radial efficiency measure, the reference point on the frontier has the same energy-per-customer density as the observation in question.

The countries involved have very different population densities. But it is not so obvious how this will influence efficiency in distribution. A rural distributor in Norway may serve a community located along a valley bottom with people living fairly close to each other, while the geographical area of the municipality may include vast uninhabited areas of mountains and forests above the valley floor. A densely populated area in the Netherlands may not necessarily save on lines per unit of area if low-rise housing dominates.

3.3. The data structure

An overview of key characteristics of the data is presented in Table 1. The difference in size between utilities is large, as revealed by the last two columns.
Table 1. Summary statistics. Cross-section 1997. Number of units: 122

Variable         Average    Median    Standard Deviation   Minimum     Maximum
TOM (kSEK)        152388     97026                182923     11274      981538
LossMWh            91449     52318                104777      7020      615281
RV (kSEK)        2826609   1907286               3288382    211789    22035846
NumCust           109260     55980                163422     20035     1052096
TotLines            7640      4948                  8824       450       54166
MWhDelivered     2110064   1003472               2815025    166015   178054730

A summary of the structure of the data of the individual countries is shown in the radar diagram in Figure 1, where country averages relative to the total sample averages (the 100% contour) are portrayed. By using the contour curves for percentages, relative comparisons can also be done between countries. The domination in size of the Netherlands is obvious in all dimensions except for energy delivered. The Netherlands is especially large in number of customers, but also in replacement value. It is relatively smaller in length of lines. Norway is largest with respect to energy delivered and also correspondingly large in energy loss, although with a smaller value than the Netherlands. Sweden stands out with relatively high operating and maintenance costs (TOM), while Finland stands out with a high number for length of lines. Denmark has the smallest number for length of lines and energy loss, and has a relatively high number of customers. The combinations of number of customers and length of line show the highest customer density in the Netherlands and then Denmark second, and the lowest density in Finland.

Figure 1. The average structure of the countries
[Radar diagram with contours at 0 %, 100 %, 200 % and 300 %; axes: Opex, LossMWh, RV, NumCust, TotLines, MWhDelivered; series: Denmark (24 units), Finland (25 units), Sweden (42 units), Norway (16 units), Netherlands (15 units).]

4. The results

4.1. Efficiency scores

The distribution of efficiency scores⁵ for model (1) is shown in Figure 2.
The units for each country are grouped together and sorted according to ascending values of the efficiency score. Each bar represents a unit, an electricity distribution utility company. The size of each unit, measured as total operating and maintenance costs (TOM) (including labor costs), is proportional to the width of each bar.⁶ The efficiency score is measured on the vertical axis and the TOM values measured in SEK (in 1000) are accumulated on the horizontal axis.

Figure 2. Country distribution of efficiency scores
[Bar chart; vertical axis: efficiency score E from 0.0 to 1.0; horizontal axis: accumulated size in TOM from 0 to 18 000 000; legend: Common, Local, Geometric mean; country groups: Denmark, Finland, Netherlands, Norway, Sweden.]

⁵ The efficiency score values are given in Edvardsen and Førsund (2002). One Dutch unit has been removed from the original data set after performing a sensitivity test and considering the atypical structure of the unit. Notice that service area may be regarded as a fixed non-discretionary variable without any consequence for the values of the efficiency scores since input-orientation is adopted, cf. the discussion below the model (1).

⁶ The regulators chose this input-based size measure as being most relevant to them. Other candidates for size variables are mentioned in Section 3. It does not matter much which one is chosen for the purpose of getting information about the location of units according to size.

As a general characterization the units are distributed in the interval from 0.44 to 1, and the share of TOM of fully efficient units is rather small, representing about 5 percent of accumulated TOM. When looking at the country distributions it is remarkable that all countries have fully efficient units. This supports the use of a common technology, in the sense that no country is completely dominated by another, and all countries contribute to spanning the frontier.

There are two aspects that the figure sheds light on: the size of the efficient units, measured by the input total operating costs, and how the efficient units stand out in the country specific distributions. For the three countries Denmark, the Netherlands and Sweden, the efficient units are quite small compared to average size within each country. This is especially striking for the Netherlands with the most pronounced dichotomy in size with one group of large units and the other with considerably smaller ones. The units within the group of large units have about equal efficiency levels, while the group with small units has units both at the least efficient part and the most efficient part of the distribution. The least efficient units have only half the value of the efficiency score of the average. For Finland and Norway the efficient units are closer to the medium size (disregarding the large Norwegian self-evaluator). The Swedish distribution is characterized by an even distribution of efficiency scores with large units being at the upper end of the inefficiency distribution, and medium- and small sized units being evenly located over the entire distribution.
The inefficient units with the highest efficiency scores have values quite a bit lower than 1 for Denmark, the Netherlands and Norway, while the values are much closer to the fully efficient ones in Finland and Sweden. The Norwegian distribution has no marked size pattern, but has a much narrower range of efficiency scores for the inefficient units than Sweden. The range of the distribution for Finland is the narrowest, without the one or two extremely inefficient units found in the Netherlands, Norway and Sweden. For both Finland and Denmark the largest units are located centrally in the distributions.

A rough measure of the total potential improvement for each country may be read off graphically in Figure 2 as the area between the value 1 for the efficiency score and the top of the bars representing the individual units for each country. The total savings potential for operating and maintenance costs is about 20 percent (the potential for the other two inputs cannot be seen so accurately since TOM is used as the size variable). Finland has the smallest potential while Sweden and the Netherlands have the highest. As a summary expression for the different shapes of the efficiency distributions, the different numbers of units, the absolute size differences between units and the location of size classes within country distributions, the country shares of the savings potential,

$$\sum_{i \in I_q} (1 - E_i)\, x_{is} \Big/ \sum_{j \in I} (1 - E_j)\, x_{js}, \qquad q \in L,$$

for each of the three inputs are set out in Table 2, using the radial projections.⁷ Due to the large, inefficient Dutch units that we see in Figure 2, the Netherlands has a higher savings potential than the other countries, especially for replacement value of capital.
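The country share of the savings potential can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `savings_shares` and the four example units are hypothetical and are not data from the study. Each unit contributes (1 - E) times its input quantity, and a country's share is its sum divided by the sum over all units (fully efficient units contribute zero, so including them does not change the result).

```python
from collections import defaultdict


def savings_shares(units):
    """Country shares of the radial savings potential for one input.

    units: iterable of (country, efficiency_score, input_quantity).
    Hypothetical helper for illustration only.
    """
    potential = defaultdict(float)
    for country, score, quantity in units:
        # Radial savings potential of one unit: (1 - E) * x
        potential[country] += (1.0 - score) * quantity
    total = sum(potential.values())
    if total == 0:
        raise ValueError("no savings potential: all units are efficient")
    return {country: p / total for country, p in potential.items()}


# Hypothetical units (country, efficiency score, input use); not study data.
example_units = [
    ("A", 1.00, 500.0),
    ("A", 0.80, 1000.0),
    ("B", 0.50, 400.0),
    ("B", 0.90, 2000.0),
]
shares = savings_shares(example_units)
for country in sorted(shares):
    print(country, round(shares[country], 3))
```

In this made-up example the large but only mildly inefficient unit in country B contributes as much to the potential as the small, very inefficient one, which mirrors the point made in the text about large inefficient units dominating a country's share.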
Sweden has a high potential for savings in total operating and maintenance costs, and Norway for savings in energy loss. Denmark comes second to the Netherlands in savings potential for replacement value of capital, and has the smallest share for energy loss, roughly on the same level as Finland. Finland has significantly lower savings potential for total operating and maintenance costs and replacement value of capital than the other countries.

Table 2. Country distribution of savings potential shares

               TOM    LossMWh   RV
Denmark        0.19   0.14      0.22
Finland        0.08   0.14      0.10
Netherlands    0.29   0.28      0.33
Norway         0.16   0.25      0.18
Sweden         0.28   0.19      0.17

⁷ If the reference points on the frontier had been used, some differences in shares may occur if slacks on the input constraints in (1) are present and unevenly distributed on countries.

In order to assess the efficiency of countries, measuring an individual unit against the total (geometric) mean was introduced in Equation (8). The line of this geometric mean is inserted in Figure 2 (E = 0.82). The figure gives a visual impression of such comparisons. As overall characterizations we may note that the median efficiency score of Denmark and Norway is below the total mean, while the median value of Finland, the Netherlands and Sweden is higher. The Netherlands is a special case since all the large units are less productive than the sample (geometric) average.

4.2. Structural features of best- and worst practice units

From the efficiency distribution shown in Figure 2 we identify the 12 active peers (excluding the self-evaluator) and the 12 worst practice units and calculate the average input- and output values. Since we have 121 units this number represents the upper and lower deciles of the distribution. The comparison is shown in Figure 3. It is the relative position in the radar diagram that reveals the structure. Both the best practice units (BP) and the worst practice units (WP) are smaller than the sample average (the 100% contour), except the input RV for the WP units. The BP units have, on average, higher values for all outputs than the WP units, and relatively lower number of customers compared with the WP units. Concerning inputs, the WP units have a significant over-use of capital (measured by the replacement value) leading to a much higher use of this input than for BP units, and also higher use of total operating and maintenance costs (TOM), while energy loss is actually a little lower than for BP units.

4.3. The degree of localness of peers

We have already seen in Figure 2 that peers are found in all countries. An interesting question is the nature of the peers: are they multinational or pure national peers? If all the peers turn out to be national, the common technology is partitioned into country parts, and there is no foundation for international benchmarking. We will use the Degree of peer localness index (5) to describe the connection between a peer and its associated inefficient units, and the scope for international benchmarking.
[Figure 3. Structural comparison of best- and worst practice units. Radar diagram with axes TOM, LossMWh, RV, NumCust, TotLines and MWhDelivered, scaled from 0 % to 200 % of the sample average. Series: Average 12 best, Average 12 worst.]

The information is found by partitioning the Referencing sets (3) on countries according to (4). This is done in the columns of Table 3 for each peer, entered according to nationality. An inefficient unit may appear in one or more of the peer columns.⁸ All the active peers are referencing one or more inefficient units from their own country. We use as a criterion for a national peer that 50 percent or more of the inefficient units in its Referencing set are from its own country. The Degree of localness index in the last row of Table 3 shows that three peers are national. Both Swedish units (5022 and 5047) and one Danish unit (1023) have national roles as peers. The two Swedish peers have the highest Degree of peer localness index values of all peers, 1.00 and 0.73. A Finnish unit (2026) is close to being national, with an index value of 0.48. There are five truly multinational peers in the group of 13 efficient units, in the sense that they are referencing inefficient units from all five countries. Three of these stand out as referencing a considerably higher number of inefficient units, as seen from the second row from below. This is the pure count measure of peer importance. Only one peer (Swedish unit 5047) is truly national in the sense that it is only referencing inefficient units from its own country. Based on the pattern of country origin of peers and referenced units, Sweden has the most national peers with only one of two peers referencing a few inefficient units from Norway, Finland and Denmark.
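The classification of peers as national can be reproduced from the counts in Table 3. The sketch below assumes that the Degree of peer localness index is simply the own-country share of a peer's Referencing set; the defining formula (5) is not restated in this section, so this is an assumption, although it does reproduce the index row of Table 3. The names `REFERENCING` and `localness` are introduced here for illustration only.

```python
# Counts of inefficient units in each peer's Referencing set, by country of
# the inefficient unit, taken from the columns of Table 3.
COUNTRIES = ["Denmark", "Finland", "Netherlands", "Norway", "Sweden"]
REFERENCING = {
    "1009": ("Denmark", [10, 8, 6, 2, 1]),
    "1023": ("Denmark", [21, 3, 11, 3, 1]),
    "2014": ("Finland", [13, 15, 6, 12, 33]),
    "2016": ("Finland", [4, 3, 0, 4, 0]),
    "2026": ("Finland", [4, 13, 7, 3, 0]),
    "2124": ("Finland", [8, 2, 0, 1, 8]),
    "3005": ("Netherlands", [5, 2, 2, 0, 0]),
    "3010": ("Netherlands", [12, 12, 6, 5, 28]),
    "3017": ("Netherlands", [4, 0, 1, 0, 2]),
    "4192": ("Norway", [6, 9, 7, 15, 34]),
    "4462": ("Norway", [0, 0, 0, 0, 0]),  # the self-evaluator
    "5022": ("Sweden", [1, 2, 0, 8, 30]),
    "5047": ("Sweden", [0, 0, 0, 0, 6]),
}


def localness(home, counts):
    """Own-country share of the Referencing set (assumed form of the index)."""
    total = sum(counts)
    if total == 0:
        return None  # a self-evaluator references no inefficient units
    return counts[COUNTRIES.index(home)] / total


index = {peer: localness(home, counts)
         for peer, (home, counts) in REFERENCING.items()}

# Criterion from the text: national if 50 percent or more are own-country units.
national_peers = sorted(p for p, v in index.items() if v is not None and v >= 0.5)
print(national_peers)
```

Under this assumption the 50 percent criterion singles out the same three peers as the text (1023, 5022 and 5047), and peer 2026 falls just short.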
Denmark and Sweden seem to be furthest apart with reference to the common technology frontier, since two of Denmark's peers have only a single Swedish inefficient unit in their Referencing sets, and only one Danish inefficient unit has a Swedish peer. Two of the four Finnish peers have no Swedish units in their Referencing sets. Three peers, one each from Finland, the Netherlands and Norway, have the maximal number of inefficient Swedish units in their Referencing sets. Actually the Finnish and Norwegian peers are referencing more Swedish units than the Swedish peers themselves, and the Dutch peer just a few Swedish units less than the Swedish peer with the highest number of Swedish inefficient units in its Referencing set.

Table 3. The degree of localness index (5). Country partitioning (4) of Referencing sets (3)

Peers by country: Denmark (1009, 1023); Finland (2014, 2016, 2026, 2124); the Netherlands (3005, 3010, 3017); Norway (4192, 4462); Sweden (5022, 5047). Rows give the country of the inefficient units.

                  1009  1023  2014  2016  2026  2124  3005  3010  3017  4192  4462  5022  5047
Denmark             10    21    13     4     4     8     5    12     4     6     0     1     0
Finland              8     3    15     3    13     2     2    12     0     9     0     2     0
Netherlands          6    11     6     0     7     0     2     6     1     7     0     0     0
Norway               2     3    12     4     3     1     0     5     0    15     0     8     0
Sweden               1     1    33     0     0     8     0    28     2    34     0    30     6
Total count         27    39    79    11    27    19     9    63     7    71     0    41     6
Localness index   0.37  0.54  0.19  0.27  0.48  0.11  0.22  0.10  0.14  0.21     -  0.73  1.00

⁸ The maximal number for each inefficient unit is five peers, since there are six constraints in (1) and the solution for the efficiency score is always positive.

We interpret the obtained values of the Degree of peer localness index as empirical support for the importance of international benchmarking. Furthermore, the frontier seems to be well supported by the data in the sense that there is only one self-evaluator among the peers (Norwegian peer 4462).

4.4. Cross-country peer patterns

While the Degree of localness index is peer-specific, there may also be a need for a description of how countries are interconnected. The results for the Cross-country peer importance index, $\rho^{s}_{qr}$, defined in (6), are set out in Table 4 (a-c). As explained in Section 2 the index is based on combining the numbers in Table 3 on the occurrence of inefficient units in the referencing sets, and the weights, $\lambda_{ip}$, which are part of the solution to model (1).⁹ The origin of the inefficient units is given by the rows, and the columns give the origin of peers. The interpretation of a cell number, e.g. 17.5 in the second cell in the first row of Panel a, is the relative share of the weighted input saving (in percent) of replacement value of capital of inefficient Danish units referenced by Finnish peers.

If we look at the most influential peer country for the inefficient units we see that for two of the three inputs Dutch peers are more important than Danish ones for inefficient units in Denmark, while for Dutch inefficient units Danish peers are the most important for one input and Dutch peers for two inputs. For Finnish inefficient units Finnish peers are the most important by a large margin, and this national influence is also the case for Norway. For Swedish inefficient units Finnish peers are the most important for all inputs, and then the Dutch peers, the Swedish peers coming third.

⁹ The numbers are reported in Edvardsen and Førsund (2002).

Table 4. Cross-country peer importance index (6) in percent. Rows: origin of inefficient units. Columns: origin of peers.

Panel a. Replacement value of capital

               Denmark   Finland   the Netherlands   Norway   Sweden
Denmark           39.8      17.5              37.8      0.3      4.7
Finland            2.1      72.4              20.3      0.7      4.5
Netherlands       45.4      10.9              40.6      3.1      0.0
Norway            15.5      23.7               5.7     43.1     12.0
Sweden             0.2      46.5              26.1      4.9     22.3

Panel b. Total operating and maintenance costs

               Denmark   Finland   the Netherlands   Norway   Sweden
Denmark           34.6      15.2              40.1      0.4      9.8
Finland            3.6      57.5              32.2      1.2      5.4
Netherlands       38.4      14.0              45.2      2.4      0.0
Norway            10.2      22.9               8.1     40.6     18.2
Sweden             0.6      36.7              33.0      4.5     25.3

Panel c. Energy loss

               Denmark   Finland   the Netherlands   Norway   Sweden
Denmark           36.7      16.1              38.4      0.3      8.4
Finland            3.1      60.1              31.3      1.2      4.3
Netherlands       38.3      14.9              44.4      2.5      0.0
Norway             9.1      22.6               8.7     44.0     15.5
Sweden             0.4      35.4              30.3      4.4     29.5

Inspecting the peer groups we see that the Danish peers are more important for Dutch inefficient units than for Danish ones, the latter coming second for all inputs. The Finnish peers are most important for Finnish inefficient units and then for Swedish units for all inputs. The Dutch peers are most important for Dutch units, and then come Danish inefficient units. Norway and Sweden are most strongly connected, with Norwegian peers having Swedish inefficient units as the second most important group of inefficient units after their own. Swedish peers also have Norwegian inefficient units as the second most important group after their own inefficient units.

The location of small and high values of the index shows the pattern of cross-country connections. The Norwegian peers have a very low importance for inefficient units in all other countries than Norway itself and Sweden. As seen also from Table 3 there is no connection between Swedish peers and Dutch inefficient units. The impact of Swedish peers on Danish and Finnish units is small compared with the impact on Norwegian units and their own inefficient units. The connections between Denmark and the Netherlands work symmetrically both ways, while Finnish peers influence Swedish inefficient units much more than Swedish peers are of importance for Finnish inefficient units, and Dutch peers are more important for Finnish inefficient units than Finnish peers are for Dutch inefficient units.

4.5. Local versus common technology

We have investigated the possibility of operating with individual country technology by running the DEA model for the three output- and three input variables. However, we may have a problem of dimensionality with Denmark, Finland, the Netherlands and Norway, since this sample includes 24, 25, 16 and 14 units respectively. The ad hoc rule (Cooper et al., 2000) that there are dimensionality problems if the number of dimensions multiplied with three is higher than the number of observations applies to the Netherlands and Norway (with six variables the threshold is 18 observations). A run of country specific technologies is presented together with the common frontier in Figure 2. The ordering of units within the countries from the common technology run is kept, and the scores for country specific technologies shown by the step curve above the bars are ordered identically. As expected the number of efficient units in the Netherlands and Norway increases drastically, and also for Denmark. The individual changes for the units can be large, illustrating the dimensionality problem for all countries except Sweden. The distribution for Sweden with 42 observations is much more stable, and we see a more or less parallel shift upwards of the whole distribution. Of the 11 units being efficient within the local frontier, only two remain so within the common frontier, and only one peer has other countries' inefficient units in its referencing set. Other countries' peers setting a higher standard for Swedish units cause the downward shift of the efficiency distribution. The importance of exposing national peers to international benchmarking is clearly demonstrated.

4.6. Productivity comparisons of countries

Table 5 shows the ratios of the geometric average of the efficiency scores for each country relative to all other countries and also to the total geometric mean (cf.
Equations (9) and (10)). Finland seems to be the most productive country within the common technology, having a bilateral index value compared with all the other countries higher than one. Sweden comes closest, while Norway and the Netherlands are on about the same level, and Denmark is the least productive country. Starting with Denmark: Finland and Sweden are the most productive countries relative to it, while the Netherlands and Norway are ahead by 4 to 6 percentage points. Norway's performance is closest to the Netherlands', lagging it by about 1 percentage point. It is interesting to note, in view of the special situation for Sweden revealed earlier, that Sweden, after all, on average, is in front of all countries with the exception of Finland. We can use the performance against the total sample average as a final ranking. The last row shows that the ranking has Finland at the top, then Sweden, the Netherlands, Norway and Denmark, the two first countries being in front of the total (geometric) average and the other three behind. The use of the Malmquist indices to rank countries corresponds closely to the form of the country efficiency distributions discussed above in connection with Figure 2, where it was pointed out that Finland and Sweden had the most even distributions and the highest share of units above the total sample mean.

5. Conclusions

When doing international benchmarking for the same type of production activity in several countries, applying a common frontier technology seems to yield the most satisfactory environment for identifying multinational peers and assessing the extent of inefficiency. In our exercise for a sample of large electricity distribution utilities from Denmark, Finland, Norway, Sweden and the Netherlands it is remarkable that peers come from all countries.
The importance of exposing national units, and especially units that would have been peers within a national technology, to international benchmarking is clearly demonstrated. The multinational setting has called for the development of new indices to capture the cross-country pattern of the nationality of peers and the nationality of units in their referencing sets. Bilateral Malmquist productivity comparisons can be performed between units of particular interest in addition to country origin, e.g. sorting by size, or location of utility (urban - rural), etc. We have focused on a single unit against the (geometric) average performance of all units, as well as bilateral comparisons of (geometric) averages of each country. Our results point to Finland as the most productive country within the common technology. This result reflects the more even distribution of the Finnish units and the high share of units above the total sample mean of efficiency scores.

The advantage of working with the DEA model is the richness of details available from the model solutions and the concrete connections to actual units. However, this may also be a problem because it is not always so easy to find explanations for specific findings such as why some units are efficient. The main practical purpose of the paper is to serve as a pilot study for the Nordic electricity regulators and the Dutch regulator as a start of a process of finding tools for regulation.

Table 5. Productivity comparisons of countries. Malmquist productivity indices (9), (10) calculated as ratios of geometric means

                    Denmark   Finland   the Netherlands   Norway   Sweden
Denmark                1.00      1.16              1.06     1.04     1.12
Finland                0.86      1.00              0.91     0.90     0.97
Netherlands            0.95      1.10              1.00     0.99     1.06
Norway                 0.96      1.11              1.01     1.00     1.08
Sweden                 0.89      1.04              0.94     0.93     1.00
Average all units      0.92      1.07              0.97     0.96     1.03
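The country comparisons in Table 5 are described as ratios of geometric means of efficiency scores. A minimal sketch of that calculation follows. It assumes that a cell in row r and column c is the geometric mean for country c divided by that for country r, which is how the table appears to read (for example 1.07 / 0.92 is about 1.16 for Finland against Denmark); Equations (9) and (10) are not restated here, and the scores for the two countries "A" and "B" are hypothetical.

```python
import math


def geometric_mean(scores):
    """Geometric mean of strictly positive efficiency scores."""
    return math.exp(sum(math.log(s) for s in scores) / len(scores))


def country_indices(scores_by_country):
    """Bilateral ratios of country geometric means, and each country's
    geometric mean relative to the geometric mean of all units."""
    means = {c: geometric_mean(s) for c, s in scores_by_country.items()}
    all_scores = [s for scores in scores_by_country.values() for s in scores]
    total = geometric_mean(all_scores)
    # Assumed orientation: cell (row, col) = mean of col / mean of row.
    bilateral = {(row, col): means[col] / means[row]
                 for row in means for col in means}
    against_total = {c: m / total for c, m in means.items()}
    return bilateral, against_total


# Hypothetical efficiency scores; not data from the study.
example = {"A": [1.0, 0.64], "B": [0.81, 0.49]}
bilateral, against_total = country_indices(example)
print(round(bilateral[("B", "A")], 3), round(against_total["A"], 3))
```

Because every cell is a ratio of the same set of country means, the bilateral indices built this way are reciprocal (the A-to-B value is the inverse of the B-to-A value), which is consistent with the pattern of values above and below the diagonal of Table 5.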
The quality of the data and the acceptance of the model framework are of crucial importance for regulation since the units are regulated based on their individual performance as portrayed by the model results. There is at present some disagreement about the possibility of basing regulation of utilities on the approach used here (Shuttleworth, 1999; Nillesen and Telling, 2001).

In order to improve upon the model approach as a benchmarking tool we would like to follow up the developments and results of the present study with the following research agenda:

i) Find explanations for the cross country peer and inefficient unit patterns revealed by the novel cross country peer importance indices
ii) Improve the comparability of data between countries by harmonizing definitions of variables and extending collection to cover environmental variables
iii) Define financial variables and collect data for cost efficiency exercises
iv) Investigate the scale properties by specifying variable returns to scale technology
v) Increase the number of cross section observations to cover all units within a country, enabling country specific technologies also to be studied (if the total number of national units allows)
vi) Establish time series of cross sections enabling productivity developments to be studied
vii) Develop a more general transitive Malmquist index for the latter two cases

References

Andersen, P. and N. C. Petersen, 1993, A procedure for ranking efficient units in data envelopment analysis, Management Science 39, 1261-1264.
Berg, S. A., F. R. Førsund and E. S. Jansen, 1992, Malmquist indices of productivity growth during the deregulation of Norwegian banking, Scandinavian Journal of Economics 94, Supplement, 211-228.
Berg, S. A., F. R. Førsund, L. Hjalmarsson and M.
Suominen, 1993, Banking efficiency in the Nordic countries, Journal of Banking and Finance 17, 371-388.
Caves, D. W., L. R. Christensen and W. E. Diewert, 1982a, The economic theory of index numbers and the measurement of input, output, and productivity, Econometrica 50, 1393-1414.
Caves, D. W., L. R. Christensen and W. E. Diewert, 1982b, Multilateral comparisons of output, input, and productivity using superlative index numbers, Economic Journal 92 (March), 73-86.
Charnes, A., W. W. Cooper and E. Rhodes, 1978, Measuring the efficiency of decision making units, European Journal of Operational Research 2(6), 429-444.
Cooper, W. W., L. M. Seiford and K. Tone, 2000, Data envelopment analysis. A comprehensive text with models, applications, references and DEA-Solver software (Kluwer Academic Publishers, Boston/Dordrecht/London).
Edvardsen, D. F. and F. R. Førsund, 2002, International benchmarking of electricity distribution utilities, Working Paper No 08/02 ICER [http.//www.icer.it/docs/wp2002/forsund08-02.pdf].
Farrell, M. J., 1957, The measurement of productive efficiency, Journal of the Royal Statistical Society, Series A, 120, III, 253-281.
Färe, R. and D. Primont, 1995, Multi-output production and duality: theory and applications (Kluwer Academic Publishers, Boston).
Färe, R., S. Grosskopf, B. Lindgren and P. Roos, 1994, Productivity developments in Swedish hospitals: a Malmquist output index approach, in: A. Charnes, W. W. Cooper, A. Y. Lewin and L. M. Seiford, eds., Data envelopment analysis: theory, methodology, and applications (Kluwer Academic Publishers, Boston/Dordrecht/London), 253-272.
Førsund, F. R., 1993, Productivity growth in Norwegian ferries, in: H. Fried, C. A. K. Lovell, and S. Schmidt, eds., The measurement of productive efficiency, techniques and applications (Oxford University Press, Oxford), 352-373.
Førsund, F.
R., 1999, On the contribution of Ragnar Frisch to production theory, Rivista Internazionale di Scienze Economiche e Commerciali (International Review of Economics and Business) XLVI (1), 1-34.
Førsund, F. R., 2002, On the circularity of the Malmquist productivity index, Working Paper No 29/02, ICER [http.//www.icer.it/docs/wp2002/forsund29-02.pdf].
Førsund, F. R. and S. A. C. Kittelsen, 1998, Productivity development of Norwegian electricity distribution utilities, Resource and Energy Economics 20(3), 207-224.
Hjalmarsson, L. and A. Veiderpass, 1992a, Efficiency and ownership in Swedish electricity retail distribution, Journal of Productivity Analysis 3, 7-23.
Hjalmarsson, L. and A. Veiderpass, 1992b, Productivity in Swedish electricity retail distribution, Scandinavian Journal of Economics 94, Supplement, 193-205.
Jamasb, T. and M. Pollitt, 2001, Benchmarking and regulation: international electricity experience, Utilities Policy 9(3), 107-130.
Johansen, L. and Å. Sørsveen, 1967, Notes on the measurement of real capital in relation to economic planning models, The Review of Income and Wealth, Series 13, 175-197.
Kittelsen, S. A. C., 1993, Stepwise DEA; choosing variables for measuring technical efficiency in Norwegian electricity distribution, Memorandum No. 6/1993, Department of Economics, University of Oslo.
Langset, T. and S. A. C. Kittelsen, 1997, Forsyningsareal og metodevalg ved beregning av effektivitet i elektrisitetsfordeling [Service area and choice of method when calculating efficiency in electricity distribution], Rapport 85, Stiftelsen for samfunns- og næringslivsforskning, Oslo.
Neuberg, L. G., 1977, Two issues in the municipal ownership of electric power distribution systems, Bell Journal of Economics 8(1), 303-323.
Nillesen, P. and J.
Telling, 2001, Benchmarking distribution companies, EPRM Electricity March 2001, 10-12 [http.//www.icfconsulting.com/Publications/doc_files/BenchmarkingDistributionCompanies.pdf].
Salvanes, K. G. and S. Tjøtta, 1994, Productivity differences in multiple output industries: an empirical application to electricity distribution, Journal of Productivity Analysis 5, 23-43.
Schaffnit, C., D. Rosen and J. C. Paradi, 1997, Best practice analysis of bank branches: an application of DEA in a large Canadian bank, European Journal of Operational Research 98, 269-289.
Shuttleworth, G., 1999, Energy regulation brief, National Economic Research Associates, n/e/r/a, London, 1-4 [http.//www.nera.com/wwt/newsletter_issues/4030.pdf].
Torgersen, A. M., F. R. Førsund and S. A. C. Kittelsen, 1996, Slack adjusted efficiency measures and ranking of efficient units, Journal of Productivity Analysis 7, 379-398.
Weiss, L. W., 1975, Antitrust in the electric power industry, in: A. Phillips, ed., Promoting competition in regulated markets (Brookings Institute, Washington, DC).

FAR OUT OR ALONE IN THE CROWD: CLASSIFICATION OF SELF-EVALUATORS IN DEA∗

by

Dag Fjeld Edvardsen
The Norwegian Building Research Institute,

Finn R. Førsund†
Department of Economics, University of Oslo / The Frisch Centre

and

Sverre A. C. Kittelsen
The Frisch Centre

Abstract: The units found strongly efficient in DEA studies on efficiency can be divided into self-evaluators and active peers, depending on whether the peers are referencing any inefficient units or not. The contribution of the paper is to develop a method for classifying self-evaluators based on the additive DEA model into interior and exterior ones.
The exterior self-evaluators are efficient “by default”; there is no firm evidence from observations for the classification. These units should therefore not be regarded as efficient, and should be removed from the observations of efficiency scores when performing a two-stage analysis of explaining the distribution of the scores. The application to municipal nursing- and home care services of Norway shows significant effects of removing exterior self-evaluators from the data when doing a two-stage analysis.

Keywords: Self-evaluator, interior and exterior self-evaluator, DEA, efficiency, referencing zone, nursing homes

JEL classification: C44, C61, D24, I19, L32

∗ The paper is based on results from the NFR-financed Frisch Centre project “Better and Cheaper?” and written within the project “Efficiency analyses of the nursing and home care sector of Norway” at the Health Economics Research Programme at the University of Oslo (HERO) and the Frisch Centre.

† Corresponding author. Email: f.r.forsund@econ.uio.no, postal address: Department of Economics, University of Oslo, Box 1095, 0317 Blindern, Oslo, Norway.

1. Introduction

The calculation of efficiency scores for production units based on a non-parametric piecewise linear frontier production function has become well established within the last two decades. Originally introduced by Farrell (1957), the method was further developed in Charnes, Cooper and Rhodes (1978), where the term Data envelopment analysis (DEA) was coined. The efficient units span the frontier, but the classification of some of these units as efficient is not based on other observations being similar, but is due to the method.
We are referring to units which are classified as self-evaluators in the literature, a concept introduced by Charnes et al. (1985a). Self-evaluators may most naturally appear at the “edges” of the technology, but it is also possible that self-evaluators appear in the interior. It may be of importance to distinguish between those self-evaluators that are exterior and those that are interior. Finding the influence of some variables on the level of efficiency by running regressions of efficiency scores on a set of potential explanatory variables is an approach often followed in actual investigations.¹ Using exterior self-evaluators with an efficiency score of 1 may then distort the results, because assigning the value of 1 to these self-evaluators is arbitrary. Interior self-evaluators, on the other hand, may have peers that are fairly similar. They should therefore not necessarily be dropped when applying the two-stage approach.

The plan of the paper is to review the DEA models in Section 2 and define the new concepts of interior and exterior self-evaluators. In Section 3 the method for classifying the self-evaluators is introduced. Actual data are presented in Section 4 and the method for classifying self-evaluators is applied. The effect of removing exterior self-evaluators is shown. Section 5 concludes.

¹ The approach was originally introduced in Seitz (1967), inspired by Nerlove (1965), see Førsund and Sarafoglou (2002).
Simar and Wilson (2003) review the approach and find it at fault in general due to serial correlation between the efficiency scores, and provide a new statistically sound procedure based on specifying explicitly the data generating process and bootstrapping to obtain confidence intervals.

2. Self-evaluators

DEA models

Consider a set, J, of production units transforming multiple inputs into multiple outputs. Let $y_{mj}$ be an output $(m \in M,\ j \in J)$ and $x_{nj}$ an input $(n \in N,\ j \in J)$. As the reference for the units in efficiency analyses we want to calculate a piecewise linear frontier based on observations, fitting as closely as possible and obeying some fundamental assumptions, like free disposal, and the technology set being convex and closed, as usually entertained (Banker et al., 1984, Färe and Primont, 1995). This frontier can be found by solving the following LP problem, termed the additive model in the DEA literature (Charnes et al., 1985b):

$$
\begin{aligned}
\max\ & \Big\{ \sum_{m \in M} s^{+}_{mi} + \sum_{n \in N} s^{-}_{ni} \Big\} \\
\text{s.t.}\quad & \sum_{j \in J} \lambda_{ij} y_{mj} - y_{mi} - s^{+}_{mi} = 0, \quad m \in M \\
& x_{ni} - \sum_{j \in J} \lambda_{ij} x_{nj} - s^{-}_{ni} = 0, \quad n \in N \\
& s^{+}_{mi},\ s^{-}_{ni} \ge 0, \quad \lambda_{ij} \ge 0 \\
& \sum_{j \in J} \lambda_{ij} = 1
\end{aligned}
\tag{1}
$$

The last equality constraint in (1) imposes variable returns to scale (VRS) on the frontier, while dropping this constraint imposes constant returns to scale (CRS). Our analysis will be valid for both scale assumptions. The frontier is found by maximising the sum of the slacks on the output constraints, $s^{+}_{mi}$, and input constraints, $s^{-}_{ni}$. The strongly efficient units (using the terminology of Charnes et al., 1985b) are identified by the sum of the slacks, and therefore all the slack variables, being zero.
All weights, $\lambda_{ij}$, must be zero except the weight for the unit itself, which will be one (i.e. $\lambda_{ij} = 0$ for $j \neq i$ and $\lambda_{ii} = 1$ if $i$ is an efficient unit).[2] The efficient points will appear as vertex points on the frontier function surface, or corner points of facets. The sets of strongly efficient units, $P$, and inefficient units, $I$, are:

$$
\begin{aligned}
P &= \Big\{ i \in J : \sum_{m \in M} s^{+}_{mi} + \sum_{n \in N} s^{-}_{ni} = 0 \Big\} \\
I &= \Big\{ i \in J : \sum_{m \in M} s^{+}_{mi} + \sum_{n \in N} s^{-}_{ni} > 0 \Big\}, \qquad P \cup I = J
\end{aligned}
\tag{2}
$$

[2] A strongly efficient unit, $i$, may end up being located exactly on a facet. We may then have multiple solutions for the weights, although the maximal sum of slacks is still zero. One of the solutions will be $\lambda_{ij} = 0$ for $j \neq i$, and $\lambda_{ii} = 1$.

So far we only have slacks as measures of inefficiency. If we want only one measure for each unit, and a measure that is independent of units of measurement, the Farrell (1957) measure of technical inefficiency is the natural choice. The standard DEA model in primal (enveloping) form is set up as a problem of determining the Farrell technical efficiency score, $E_{oi}$ $(o = 1, 2)$, either in the input ($o = 1$) or the output ($o = 2$) direction for an observation, $i$. The following LP model is formulated for each observation in the case of input orientation:

$$
\begin{aligned}
E_{1i} \equiv \min \;& \theta_i \\
\text{s.t.}\;& \sum_{j \in P} \lambda_{ij} y_{mj} - y_{mi} \ge 0, \quad m \in M \\
& \theta_i x_{ni} - \sum_{j \in P} \lambda_{ij} x_{nj} \ge 0, \quad n \in N \\
& \lambda_{ij} \ge 0, \qquad \sum_{j \in P} \lambda_{ij} = 1
\end{aligned}
\tag{3}
$$

In the case of output orientation we have the following LP program:

$$
\begin{aligned}
1/E_{2i} \equiv \max \;& \phi_i \\
\text{s.t.}\;& \phi_i y_{mi} - \sum_{j \in P} \lambda_{ij} y_{mj} \le 0, \quad m \in M \\
& \sum_{j \in P} \lambda_{ij} x_{nj} - x_{ni} \le 0, \quad n \in N \\
& \lambda_{ij} \ge 0, \qquad \sum_{j \in P} \lambda_{ij} = 1
\end{aligned}
\tag{4}
$$

For notational ease the same symbols have been used for weights in (1), (3) and (4). The proportionality factor, $\theta_i$ or $\phi_i$, and the weights, $\lambda_{ij}$, are the endogenous variables. Adopting the notation $\#N$ and $\#M$ for the number of inputs and outputs respectively, the point

$$
\Big( \sum_{j \in P} \lambda_{ij} x_{1j},\; \ldots,\; \sum_{j \in P} \lambda_{ij} x_{\#N,j},\; \sum_{j \in P} \lambda_{ij} y_{1j},\; \ldots,\; \sum_{j \in P} \lambda_{ij} y_{\#M,j} \Big)
\tag{5}
$$

is by construction on the frontier surface, and is defined as the reference point for unit $i$. If there are no slacks on the output and input constraints in (3) or (4), then the reference point coincides with the radial projection point, using either $\theta_i$ or $\phi_i$ when adjusting an inefficient observation. These points will normally be interior points on facets (but may fall on border lines). With one or more slacks positive the reference point and the radial projection point differ. The reference points will again appear as vertex points on the frontier function surface, or corner points of facets.

It is well known that the radial Farrell efficiency measure $E_{oi}$ may be one, but that the unit may still improve its performance by either using less inputs or producing more outputs. All units with a radial efficiency score of one are by definition located on the frontier, but it is only for the strongly efficient units that the reference points coincide with the observation. A unit may have $E_{oi} = 1$ with one or more of the constraints in (3) or (4) being non-binding (i.e. one or more slacks positive and zero shadow prices on the constraints in question).

Although the model (3) or (4) can be solved directly by letting the index $j$ run over all observations in $J$, a two-stage procedure of solving (1) first is often followed. By using the information on strongly efficient units when solving (3) or (4), the LP computations are done more efficiently, and one will only identify reference points by (5) that are in the strongly efficient subset of the frontier. In the context of the DEA models (3) and (4), the strongly efficient units are termed peers.
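As a minimal illustration of models (3) and (4), consider the special case of one input, one output and CRS (the convexity constraint dropped). In that case the LP reduces to comparing each unit's output-input ratio with the best observed ratio, so no LP solver is needed. The function name `farrell_crs_1x1` and the data are hypothetical and serve only the illustration; the general multi-dimensional case requires solving the LP itself.

```python
def farrell_crs_1x1(units):
    """Farrell efficiency for one input, one output and CRS.

    units maps a unit name to (input x, output y), both strictly positive.
    In this special case the input-oriented score E1 of model (3) without
    the convexity constraint equals the unit's productivity y/x divided
    by the best observed productivity; under CRS E1 and E2 coincide.
    """
    best = max(y / x for x, y in units.values())
    return {name: (y / x) / best for name, (x, y) in units.items()}


# Hypothetical data: (input, output)
units = {"A": (2.0, 4.0), "B": (4.0, 6.0), "C": (5.0, 5.0)}
scores = farrell_crs_1x1(units)
for name, e in scores.items():
    print(name, round(e, 3))
```

Unit A has the highest productivity and therefore defines the CRS frontier in this toy data set; the other scores state the share of the observed input that would suffice at frontier productivity.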
For each inefficient unit, $i$, a Peer group set, $P_i$ (Cooper, Seiford and Tone, 2000), may be formed:

$$
P_i = \{\, p \in P : \lambda_{ip} > 0 \,\}, \quad i \in I
\tag{6}
$$

where $\lambda_{ip}$ are the solution values of the weights in either (3) or (4), depending on the orientation in question. If the Peer group sets are empty, then all the units are efficient. The solutions to (1), (3) or (4) do not identify facets systematically, but by using (6) we can identify the corner points of facets where one or more radial projection points of inefficient units are located.

It will also turn out useful to look at the group of inefficient units referenced by a peer. Such a set is defined for each peer, $p$, as the Referencing set in Edvardsen and Førsund (2001), with reference to the solutions of (3) or (4):

$$
I_p = \{\, i \in I : \lambda_{ip} > 0 \,\}, \quad p \in P
\tag{7}
$$

The self evaluators

The Referencing set (7) may be empty, in which case the unit is called a self-evaluator:

Definition 1: A peer $p \in P$, where the set $P$ is defined in (2), is a self-evaluator if $I_p = \emptyset$, where $I_p$ is defined in (7).[3]

[3] An alternative definition could be in terms of the reference shares defined in Torgersen, Førsund and Kittelsen (1996), where a self-evaluator has a reference share of zero.

The set of peers may thus be partitioned into a set of self-evaluators, $P^S$, and a set, $P^A$, of active peers, i.e. peers with non-empty referencing sets:

$$
\begin{aligned}
P^S &= \{\, p \in P : I_p = \emptyset \,\} \\
P^A &= \{\, p \in P : I_p \neq \emptyset \,\}, \qquad P^S \cup P^A = P
\end{aligned}
\tag{8}
$$

The self-evaluators are vertex points of facets without any reference points, defined as the radial projection points of inefficient observations located on these facets. The LP solutions to (3) or (4) do not give us any information as to which efficient units constitute the vertex points of such a facet without reference points. An efficient unit may be a vertex point for many facets. Our definition of a self-evaluator implies that there are no reference points on any of its facets.

3. The determination of type of self-evaluator

There are two possibilities as to the location of facets formed by self-evaluators on the frontier surface. Such facets may be part of the extreme areas of the frontier, i.e. facets closest to the axes in the case of CRS, or, in the case of VRS, also facets furthest away from the origin or closest to the origin (the VRS frontier will in general not contain the origin). In the case of CRS only mixes of inputs or outputs may be extreme, while in the case of VRS we in addition have the scale dimension. Such self-evaluators will be termed exterior self-evaluators. In the case of CRS, facets without any reference points may also be found in the interior of the frontier surface with respect to mixes, while for VRS interior also means interior regarding scale. Such self-evaluators will be termed interior self-evaluators.

Figure 1 shows the two different cases in the simplest case of two dimensions. The observations represented by points A, B, C, D, F and G are efficient, while O1 is inefficient. The radial reference or projection point for unit O1 is a in the case of input orientation. The reference point (5) in this simple case coincides with the peer A. Considering output orientation, the peers are D and F, and the reference point is d. To illustrate the referencing set of a peer, the shaded area in Figure 1 shows the referencing zone for the efficient unit D in the case of output orientation. All the inefficient units in unit D's referencing set must be located here (such inefficient units may also appear in referencing sets of other peers; here unit F's).
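The sets in (6) to (8) follow mechanically from the optimal weights. The sketch below builds them for the output-oriented case of Figure 1, where O1 is referred to D and F. The function name `peer_sets`, the tolerance and the weight values 0.6 and 0.4 are assumptions made for the illustration; only the peer relations are taken from the text.

```python
def peer_sets(weights, efficient):
    """Build the sets in (6)-(8) from optimal weights.

    weights: {inefficient unit i: {peer p: lambda_ip}} from model (3) or (4).
    efficient: the set P of strongly efficient units from model (1).
    """
    tol = 1e-9  # treat tiny weights as zero (numerical noise from a solver)
    peer_group = {i: {p for p, lam in w.items() if lam > tol}
                  for i, w in weights.items()}                      # (6)
    referencing = {p: {i for i, grp in peer_group.items() if p in grp}
                   for p in efficient}                              # (7)
    self_evaluators = {p for p in efficient if not referencing[p]}  # P^S in (8)
    active = set(efficient) - self_evaluators                       # P^A in (8)
    return peer_group, referencing, self_evaluators, active


# Figure 1, output orientation; the weight values are made up.
P = {"A", "B", "C", "D", "F", "G"}
weights = {"O1": {"D": 0.6, "F": 0.4}}
peer_group, referencing, PS, PA = peer_sets(weights, P)
print(sorted(PS), sorted(PA))
```

With a single inefficient unit, every efficient unit other than D and F has an empty referencing set and is therefore a self-evaluator.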
If the referencing zone is empty, then the peer is a self-evaluator. Removal of such a self-evaluator will not change the efficiency scores for any other units. We would expect the self-evaluators to be extreme points in one or more of the mix or scale dimensions, but if the referencing zone is narrow a self-evaluator may also be centrally placed within the set of observations. A narrow zone means that other peers are close to the self-evaluator.

Figure 1: DEA and the two types of self evaluators

Notice that the classification as a self-evaluator depends on the orientation of the efficiency measure. Considering output orientation, both B and C are interior self-evaluators, while A and G are exterior self-evaluators. Considering input orientation, B, C, D and F are interior self-evaluators, while G is an exterior one. In both cases the unit G could have been observed anywhere between the line g' (the continuation of the line DF) and the line g'' (referenced by F) without any unit changing its estimated efficiency or its status as peer. The efficiency score of 1 assigned to unit G therefore contains little information. In the output-oriented case, for example, we see that there is considerable scope for output variation for a given input yielding the efficiency score of 1. Our purpose is to develop a method for classification into exterior or interior self-evaluators using only the standard DEA format.

Enveloping from below

The production set is by construction convex. If all inefficient units are removed from the data set, and a new run is done with only the efficient units, we will find the exterior peers by reversing the enveloping of the data from "above" to be from "below". All that needs to be done is to reverse the inequalities in the LP program (1) by adding the slack variables instead of subtracting them:

$$
\begin{aligned}
\max \;& \Big\{ \sum_{m \in M} s^{+}_{mi} + \sum_{n \in N} s^{-}_{ni} \Big\}, \quad i \in P \\
\text{s.t.}\;& \sum_{j \in P} \lambda_{ij} y_{mj} - y_{mi} + s^{+}_{mi} = 0, \quad m \in M \\
& x_{ni} - \sum_{j \in P} \lambda_{ij} x_{nj} + s^{-}_{ni} = 0, \quad n \in N \\
& s^{+}_{mi},\; s^{-}_{ni} \ge 0, \quad m \in M,\; n \in N \\
& \lambda_{ij} \ge 0, \qquad \sum_{j \in P} \lambda_{ij} = 1
\end{aligned}
\tag{9}
$$

Notice that we are only considering observations belonging to the set of strongly efficient units $P$ determined by solving (1). This envelopment of the data is by construction concave. The units that turn out as "efficient" in solving (9), in the sense that all slacks are zero, must be units belonging to the exterior facets in the solution to the original model (1). We will use this result to define exterior and interior strongly efficient units:

Definition 2: A strongly efficient unit belonging to the set $P$ defined by (2) is exterior if it belongs to the set $P^E$:

$$
P^E = \Big\{ p \in P : \sum_{m \in M} s^{+}_{mp} + \sum_{n \in N} s^{-}_{np} = 0 \Big\}
\tag{10}
$$

where the slack variables, $s^{+}_{mp}$ and $s^{-}_{np}$, are solutions to the problem (9). A strongly efficient unit belonging to the set $P$ defined by (2) is interior if it belongs to the set $P^I$:

$$
P^I = \Big\{ p \in P : \sum_{m \in M} s^{+}_{mp} + \sum_{n \in N} s^{-}_{np} > 0 \Big\} \qquad (P^E \cup P^I = P)
\tag{11}
$$

where the set $P^E$ is defined in (10).[4]

To determine the nature of a self-evaluator, an orientation for the calculation of the Farrell efficiency measures has to be chosen, i.e. either input or output orientation. The following definition can then be made as to the classification of self-evaluators:

Definition 3: Consider a peer $p \in P$, where the set $P$ is defined in (2), that is a self-evaluator, $p \in P^S$, where the set $P^S$ is defined in (8) and found by running either the input-oriented program (3) or the output-oriented program (4). If $p \in P^E$, where the set $P^E$ is defined in (10), then $p$ is an exterior self-evaluator. If $p \notin P^E$, then $p$ is an interior self-evaluator:

$$
P^{SE} = P^S \cap P^E, \qquad P^{SI} = P^S \cap P^I \qquad (P^{SE} \cup P^{SI} = P^S)
\tag{12}
$$

where $P^{SE}$ and $P^{SI}$ are the sets of the exterior and interior self-evaluators respectively.
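For one input and one output under VRS, program (9) can be solved by enumeration instead of an LP solver: a basic optimal solution with a positive slack uses at most two units, so it suffices to check every single unit and, for every pair of units, the points on the connecting segment where the input or the output equals that of unit p. The sketch below does this and then applies (10) to (12). It is not the general method; the coordinates are hypothetical, chosen only to mimic a concave frontier like the one in Figure 1, and the set of self-evaluators is taken as given for output orientation.

```python
from fractions import Fraction
from itertools import combinations


def reverse_slack(p, units):
    """Maximal slack sum in program (9) for unit p (one input, one output, VRS).

    units maps each strongly efficient unit to integer (input x, output y).
    A reference point is admissible if it uses at least as much input and
    produces at most as much output as unit p.
    """
    xp, yp = units[p]
    points = list(units.values())                # one unit with weight 1
    for j, k in combinations(units, 2):          # two units with positive weight
        (xj, yj), (xk, yk) = units[j], units[k]
        for a, b, target in ((xj, xk, xp), (yj, yk, yp)):
            if a != b:
                t = Fraction(target - a, b - a)  # weight on unit k
                if 0 <= t <= 1:
                    points.append((xj + t * (xk - xj), yj + t * (yk - yj)))
    return max((x - xp) + (yp - y) for x, y in points if x >= xp and y <= yp)


# Hypothetical coordinates on a concave frontier, in the spirit of Figure 1
units = {"A": (1, 2), "B": (2, 6), "C": (3, 9),
         "D": (4, 11), "F": (6, 13), "G": (9, 14)}
PE = {p for p in units if reverse_slack(p, units) == 0}   # (10)
PI = set(units) - PE                                      # (11)

PS = {"A", "B", "C", "G"}      # self-evaluators, output orientation (taken as given)
PSE, PSI = PS & PE, PS & PI    # (12)
print(sorted(PE), sorted(PSE), sorted(PSI))
```

Exact rational arithmetic is used so that the test for a zero slack sum does not depend on a numerical tolerance. For data with several inputs or outputs the enumeration no longer applies and an LP solver would be needed.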
Illustrating the approach using Figure 1, the new "from below frontier" will be the line from A to G. These units are thus the only ones on the "from below frontier" and therefore exterior points in $P^E$. This classification is independent of orientation, and both units are located on exterior facets in the original problem (1). In the case of output orientation, the self-evaluators B and C, according to the solution to problem (4), will not appear on the new frontier, and they are therefore interior according to Definition 3. The self-evaluators A and G appear on the new frontier and are therefore exterior. In the case of input orientation, solving problem (3) gives B, C, D, F and G as self-evaluators, and we have that B, C, D and F are interior self-evaluators and G an exterior one. While A is an exterior peer in input orientation, it is not a self-evaluator.

[4] Note that in the special case where two units have identical input-output vectors, both could be classified as exterior by this criterion, but would not have unique intensity weights in (9). In this situation, which is likely to be rare in empirical applications, it seems natural to classify the units as interior.

Figure 2 provides another illustration. In a two-dimensional input space an isoquant through the efficient units A, B, C and D is shown. Consider input orientation and CRS. Assuming inefficient units are only located north-east of the isoquant segment AB in the cone delimited by the rays going through the points A and B, we have that C is an interior self-evaluator and D is an exterior self-evaluator. Running the "reverse" program (9) we will envelop the four peers from "behind" by the broken line from A to D.
We then know that units A and D are exterior, and using the information from running the DEA model (1) we then have that unit C is an interior self-evaluator and unit D an exterior one.

Figure 2: Determining the type of self-evaluator

It may also be of interest to classify the active peers as exterior or interior. Building on Definition 3 we have:

Definition 4: The active peers defined in (8) belong to the subsets $P^{AE}$ and $P^{AI}$:

$$
P^{AE} = P^A \cap P^E, \qquad P^{AI} = P^A \cap P^I \qquad (P^{AE} \cup P^{AI} = P^A)
\tag{13}
$$

where $P^E$ and $P^I$ are defined in (10) and (11) respectively.

The program (9) is not the standard DEA additive formulation, since the signs of the slacks in the restrictions on inputs and outputs have been changed. However, by negating these equalities, (9) can be rewritten as:

$$
\begin{aligned}
\max \;& \Big\{ \sum_{n \in N} s^{-}_{ni} + \sum_{m \in M} s^{+}_{mi} \Big\}, \quad i \in P \\
\text{s.t.}\;& \sum_{j \in P} \lambda_{ij} x_{nj} - x_{ni} - s^{-}_{ni} = 0, \quad n \in N \\
& y_{mi} - \sum_{j \in P} \lambda_{ij} y_{mj} - s^{+}_{mi} = 0, \quad m \in M \\
& s^{+}_{mi},\; s^{-}_{ni} \ge 0, \quad m \in M,\; n \in N \\
& \lambda_{ij} \ge 0, \qquad \sum_{j \in P} \lambda_{ij} = 1
\end{aligned}
\tag{14}
$$

Comparing (1) and (14) we see that these are identical except that inputs and outputs are exchanged. Since existing DEA software will often solve the additive model (1), we may as well, for convenience, find the set of exterior units $P^E$, and hence the exterior self-evaluators $P^{SE}$, by exchanging inputs and outputs and running (14) on the strongly efficient units, rather than running (9) on these units.

4. An empirical application

The data

We will apply the method for determining interior and exterior self-evaluators to a cross-section data set for the nursing and home care sector of Norwegian municipalities. The data is found in Edvardsen et al. (2000). The primary data source is the official yearly statistics for municipal activities published by Statistics Norway. Resource usage is measured by financial data and the number of man-years of different categories.
Production data contains mainly the number of clients dealt with by institutionalised nursing, home based nursing, and practical assistance. Quality information is lacking, but the clients are split into some age groups that may be of significance for resource use. In cooperation with representatives from the municipalities and the ministries of Finance, Municipal and Regional Affairs, and Social and Health Affairs, we have chosen to split the clients into two major age groups, 0-66 and above 66 (67+), and use institutions and home care as separate outputs. Within institutions there are also a number of short-stay clients, either coming on a day care basis or on a limited stay of convalescence. These usually require fewer resources than the permanent clients. As indicators of quality of institutions we have information on the number of single person rooms and on clients staying in closed wards. The separation is regarded as a quality factor both for the clients taken care of (dementia cases) and for the other clients. In home-based care, mentally disabled clients may be quite resource demanding. They may also be found in the 0-66 age group within institutions.

There is no information on how long a home visit may last or how often it is received. Such information would obviously have given us some quality indicators. We also run the risk of municipalities cutting down on both length and number of visits while showing the same number of clients as other municipalities that provide more generous support.

To ensure that the data quality was good enough we entered a phase of quality control. We strongly feel that one should not automatically remove outliers, but if possible contact the municipality in question and ask if the data is correct.
This is especially important if the methodology is frontier based (such as DEA), because the units defining the frontier are outliers by definition. This led to many changes in the dataset and required quite a lot of work, but as a result we could be much more confident in the quality of the data (see Aas (2000) for details).

Table 1: Primary variables used in the DEA model, cross-section 1997 of 469 municipalities.

Variable                      Symbol   Average   Standard deviation      Min        Max
Inputs
Trained nurses                x1          31.1                 41.4      1.5      410.4
Other employees               x2         137.4                169.4      5.3     1821.5
Other expenses                x3        9066.2              13449.5    190.0   108990.0
Outputs: No. of clients
Institutions, age 0-66        y1           3.4                  4.9      0.0       50.0
Institutions, age 67+         y2          87.7                108.6      0.0     1024.0
Short-term stay               y3         113.8                163.3      0.0     1614.0
Closed wards                  y4          11.8                 19.3      0.0      195.0
Single person room            y5          65.7                 82.2      0.0      747.0
Mentally disabled             y6          48.7                 79.5      0.0      857.0
Practical assistance, 0-66    y7          51.3                 66.3      0.0      597.0
Practical assistance, 67+     y8         212.7                272.4      1.0     2190.0
Home based nursing, 0-66      y9          34.1                 45.3      0.0      407.0
Home based nursing, 67+       y10        125.8                153.3      1.0     1480.0

Table 1 shows descriptive statistics for the variables used in the DEA model. The first three rows measure the inputs in the model. "Trained nurses" and "Other employees" show that about 18% of the employees (measured in man-years) are trained nurses. "Other expenses" are measured in 1000 NOK (Norwegian currency). The last 10 rows in Table 1 measure the outputs. "Institutions, age 0-66" and "Institutions, age 67+" are the number of institutionalized clients in the age groups 0-66 and 67 and above respectively. "Short-term stay" shows how many visits the institutions in the municipality have received from clients who are not residents, while "Closed wards" shows how many of the residents are in a special ward for dementia clients.
"Mentally disabled" shows how many of the clients are mentally disabled (almost all of these clients get home care). "Practical assistance, 0-66" and "Practical assistance, 67+" count how many clients get practical assistance (such as cleaning and making food) in the indicated age groups, while "Home based nursing, 0-66" and "Home based nursing, 67+" count the same for clients getting nursing services in their own homes.

The Farrell output-oriented efficiency scores

Figure 3 shows $E_2$ (output-increasing efficiency assuming variable returns to scale). Each bar in the diagram represents one of the 469 municipalities, sorted by increasing efficiency. The height of the bars represents the efficiency of the DMU, while the width of the bar shows the size measured by man-years (sum of trained nurses and other employees). Both large and small DMUs can be found in all parts of the diagram, with the exception that no large municipalities are located in the (very inefficient) leftmost part of the diagram. The average efficiency is 86 percent, while the efficiency of the average unit is 67 percent.

Figure 3: Sorted output-oriented efficiency scores (vertical axis: $E_2$, from 0.0 to 1.0; horizontal axis: size measured in man-years, from 0 to 70 000)

Figure 4: The taxonomy of units in DEA efficiency analyses

Total (J)                                        469
  Inefficient (I)                                340
  Efficient (P)                                  129
    Additive model:
      Efficient exterior (PE)                     97
      Efficient interior (PI)                     32
    Farrell efficiency model:
      Self-evaluators (PS)                        29
        Exterior (PSE)                            25
        Interior (PSI)                             4
      Active (PA)                                100
        Exterior (PAE)                            72
        Interior (PAI)                            28

An overview of the taxonomy developed in Sections 2 and 3 for classification of units is given in Figure 4, together with the actual decomposition for the data set at hand.
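The counts in Figure 4 can be checked for internal consistency, and the shares they imply computed, with a few lines of arithmetic. The dictionary keys are shorthand labels chosen here for the sets in the taxonomy; the numbers are those of Figure 4.

```python
# Counts taken from Figure 4
counts = {"J": 469, "P": 129, "I": 340, "PE": 97, "PI": 32,
          "PS": 29, "PSE": 25, "PSI": 4, "PA": 100, "PAE": 72, "PAI": 28}


def share(part, whole):
    """Percentage share of one set in another."""
    return 100.0 * counts[part] / counts[whole]


# Each partition in the taxonomy must add up
assert counts["P"] + counts["I"] == counts["J"]
assert counts["PE"] + counts["PI"] == counts["P"]
assert counts["PS"] + counts["PA"] == counts["P"]
assert counts["PSE"] + counts["PAE"] == counts["PE"]
assert counts["PSI"] + counts["PAI"] == counts["PI"]

for part, whole in [("P", "J"), ("PE", "P"), ("PS", "P"),
                    ("PSE", "PS"), ("PAE", "PA")]:
    print(f"{part}/{whole}: {share(part, whole):.1f}%")
```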
In view of the relatively large number of observations it may be surprising that as many as 28 percent of the units are efficient. This may be due to the unusually high number of dimensions, 13 variables in all. Since the efficient units span the frontier technology, it is to be expected that the number of exterior ones is higher than the number of interior ones, 75 and 25 percent respectively. Turning to the Farrell efficiency model (4), the self-evaluators represent 23 percent of the efficient units. As expected, the relative share of exterior peers is larger in the group of self-evaluators than in the group of active peers, 86 versus 72 percent. Among the active peers the share of interior units is higher, 28 percent. This distribution is of importance for the empirical support of the frontier and the associated efficiency distribution.

Table 2: Relative size of interior and exterior self-evaluators measured as percentage deviation from the sample average

Column key. Inputs: TN = Trained nurses, OE = Other employees, OX = Other expenses. Outputs: I0-66 = Inst 0-66, I67+ = Inst 67+, STS = Short time stay, CW = Closed ward, SPR = Single person room, MD = Mentally disabled, PA0-66 = PA 0-66, PA67+ = PA 67+, HS0-66 = HS 0-66, HS67+ = HS 67+ (PA and HS follow the order of practical assistance and home based nursing in Table 1).

No.   Municipality               TN     OE     OX  I0-66   I67+    STS     CW    SPR     MD PA0-66  PA67+ HS0-66  HS67+
Interior self-evaluators
425   Åsnes                      20      3     -1    -12     68     15     61     25      7     -2     27      6     54
616   Nes                       -56    -63    -58    -41    -44    -38    -49    -25    -71    -40    -36    -38    -41
807   Notodden                   14     36     22     17     83    167     53    106     15    -34     42     17     84
1567  Rindal                    -51    -68    -64    -71    -41    -32    -41    -45    -84    -73    -54    -82    -39
Exterior self-evaluators
101   Halden                    126    194     62     17     32    -32   -100     14    198    167    319    193    402
213   Ski                        63     43     66    311    -12    -38     69     11     54    257     58    -30     -3
217   Oppegård                   92      6     29     76     -9     22   -100     28     42     15     43    390    184
219   Bærum                     929    736   1102    825    748    551   1112   1038    781    659    538    577    415
430   Stor-Elvdal               -68    -48    -49   -100    -52    -49     -7    -50    -77    -67    -37     26    -42
615   Flå                       -90    -73    -73   -100    -71    -77   -100    -70    -90    -84    -81    -97    -77
632   Rollag                    -81    -72    -74    -71    -69    -89   -100    -63    -65    -80    -80    -24    -81
709   Larvik                    345    296    205    517    283    303    273    241    393    356    390    170    280
806   Skien                     353    356    346    252    297     93     19    304    235    349    551    322    562
904   Grimstad                   92     21     -2    -12     27    -23    188     -7     54    159     44    225      9
941   Bykle                     -86    -76    -77   -100    -73    -82    -58    -63    -98    -94    -96    -91    -94
1144  Kvitsøy                   -85    -95    -94   -100    -95    -89   -100    -91   -100    -94    -94    -94    -90
1222  Fitjar                    -61    -64    -73    -71    -70    -58    -32    -27    -77    -75    -73    -77    -79
1411  Gulen                     -68    -49    -57    -41    -24    -84   -100    -57    -75    -84    -59    -65    -45
1612  Hemne                     -51    -57    -58    -71    -65    -16     -7    -82    -88    -49    -41     38      7
1632  Roan                      -88    -82    -86   -100    -82    -77   -100    -71    -84    -94    -79    -91    -74
1702  Steinkjer                 201    116     27    135     85     87    358     68     60    157    172    126    115
1714  Stjørdal                  128     96     29    487     97     99     36    128    153    126     60      3     18
1723  Mosvik                    -72    -87    -89   -100    -76    -82   -100    -76    -77    -88    -85    -88    -75
1839  Beiarn                    -78    -75    -74    -41    -75    -75    -49    -62    -90    -90    -71    -77    -65
1868  Øksnes                    -48    -25    -48    -12    -37    -57    -41    -19    -42    -40    -56    143    -26
1920  Lavangen                  -87    -82    -83    -71    -77    -82    -66    -73    -79    -96    -85    -97    -84
3001  Bygdøy-Frogner             41     10    150   -100    -37   -100   -100    -16    -32     15    194     99    188
3003  St.Hanshaugen-Ullevål     262    271    695     47    392    167    222    441     33     37    347     -9    231
3004  Sagene-Torshov            241    339    808    311    305    338    171    252     23    372    554    176    351

Far out or alone in the crowd?

In Table 2 the relative distance from the average unit is illustrated by measuring each of a unit's variables against the average for the sample ($J$). The first four units are the interior self-evaluators in $P^{SI}$. These units are on both sides of the average, and one of the four units is quite close to the sample average. No unit is close to either the small or the large exterior units. It seems appropriate to use the expression "alone in the crowd." The 25 exterior self-evaluators are distributed with about half above and half below the sample average. One unit has maximal sample values for two of the variables. There are several output variables with zero as the lower limit.
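The measure used in Table 2 is a simple percentage deviation from the sample average, so an entry of -100 corresponds to a variable at its lower limit of zero. The sketch below states the measure and its inverse; the function names are chosen here, and the two averages are those reported in Table 1.

```python
def pct_deviation(value, sample_mean):
    """Percentage deviation of a unit's value from the sample average."""
    return 100.0 * (value - sample_mean) / sample_mean


def implied_level(pct, sample_mean):
    """Invert the measure: the level implied by a tabulated deviation."""
    return sample_mean * (1.0 + pct / 100.0)


mean_closed_wards = 11.8     # Table 1, average of y4
mean_trained_nurses = 31.1   # Table 1, average of x1

# A unit with no clients in closed wards
print(pct_deviation(0.0, mean_closed_wards))
# Level of trained nurses implied by the entry 929 for Bærum in Table 2
print(round(implied_level(929, mean_trained_nurses), 1))
```

Because the tabulated deviations are rounded to whole percentages, implied levels are approximate.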
The variable "Institutions, 0-66" has seven exterior units with the minimum value of zero, while for "Closed ward" there are eight exterior units with the minimum value of zero. So given that "far out" means both small and large units, the exterior units well deserve this classification. The influence of extreme mixes may also be investigated, but due to the large number of possible comparisons we leave this exercise out.

The idea behind the two-stage approach is based on the distinction between pure inputs and outputs on the one hand, and environmental variables on the other. By the assumptions of the DEA method, the input-output vectors must belong to a deterministic technology set bounded by the frontier. However, environmental variables may be relevant for the performance of the units, but their influence may be regarded to be of a stochastic nature that is most appropriately revealed by studying the statistical association between some measure of performance and the environmental variables. Since the crucial point of being concerned with environmental variables is that they must have some influence on either the discretionary inputs or the outputs, there is a good case for advocating a single-stage approach and incorporating all relevant variables in one single model. One reason for treating environmental variables differently from standard outputs and inputs is that the way the variables interact with the standard production variables may be difficult to model. It may, for example, not be so clear-cut whether the variable is an input or an output.

The formulation of the second stage is to establish an association between the efficiency score and the environmental variables, $z_k$:

$$
E_{oi} = f(z_1, \ldots, z_K) + \varepsilon_i, \quad i \in J,\; o = 1, 2
\tag{15}
$$

where $\varepsilon_i$ is a random variable. There have been several approaches to estimating (15).
The first approach was to specify $f(\cdot)$ as a linear function and apply OLS (Seitz 1967, 1971). But there are two special features of the model (15). By definition the efficiency scores are restricted to be between zero and one,

$$
0 \le E_{oi} = f(z_1, \ldots, z_K) + \varepsilon_i \le 1, \quad i \in J,\; o = 1, 2
\tag{16}
$$

and using the DEA model (3) or (4) to generate the efficiency scores usually leads to a concentration of values at 1. As shown in Figure 4, 28 percent of the efficiency scores are at the upper limit of one. This has led researchers to apply a censored regression like the Tobit model, or truncated regressions. These approaches are strongly criticized in Simar and Wilson (2003). The fundamental point is made that the efficiency scores in (15) are estimates of the unknown efficiencies, and that these scores are serially correlated. Therefore, applying neither a Tobit nor a truncated regression will solve this problem. A sequence of bootstrapping techniques is proposed that will yield proper confidence intervals for the parameters of $f(\cdot)$.

Table 3: Stage 2 regression results applying OLS to a linear model

                                          All units included*)    Excluding exterior self-evaluators
R2                                        0.1737                  0.2082
Variable                                  Coeff.     p-value      Coeff.     p-value
Climate indicator                         -0.007     0.035        -0.006     0.056
Share of private institutions              0.098     0.019         0.099     0.015
Free disposable income, 1996              -0.020     0.054        -0.028     0.010
Share of users in home care               -1.019     0.000        -1.089     0.000
Share in home care of age group 0-66       1.823     0.170         1.437     0.282
Share in home care of age group 67-79      0.574     0.006         0.665     0.001
Share in home care of age group 80-89      0.270     0.004         0.261     0.004
Share in home care of age group 90+        0.100     0.019         0.109     0.011
Share in inst. care of age group 0-66     24.785     0.019        27.615     0.009
Share in inst. care of age group 67-79    -1.072     0.011        -0.926     0.026
Share in inst. care of age group 80-89    -0.101     0.524        -0.152     0.331
Share in inst. care of age group 90+       0.026     0.561         0.053     0.235
Constant term                              1.527     0.000         1.562     0.000

*) Communities within the two major cities Bergen and Oslo are aggregated and one unit is removed from the data set.

However, since the purpose of our paper is to demonstrate the importance of the role of exterior peers, the relation (15) is here interpreted just to represent an investigation of association and not to be a causality model. Therefore, OLS is used to estimate a linear function (15). An advantage of OLS is that better diagnostics to characterize the covariations are available, like the multiple correlation coefficient.

Table 3 shows the result of an OLS regression using a linear model in (15). The p-values are also given, although they should not be taken at face value due to the inherent statistical problems with the approach, as mentioned above. We perform regressions firstly with the complete data set, and secondly excluding the exterior self-evaluators.

The environmental variables represent background variables that experts have suggested may influence the efficiencies of municipalities. "Climate indicator" is a measure of the average temperature over the year in the municipality. It can also be seen as a proxy for amount of snow, altitude and distance from the coast. We note that removing the exterior self-evaluators changes both the regression coefficient and the p-value, indicating a weaker connection between efficiency scores and this variable.

"Share of private institutions" is measured by how large a share of the total number of institutions is in the private sector (most often NGOs). It would be better to measure this by the number of clients, but such data was not available.
Possible interpretations of a positive parameter estimate (and low p-values in both regression models) are that the municipalities' own care providers get a learning effect from the presence of private service providers, or that private presence reduces inefficiency because it increases the fear of privatization in the municipal nursing sector.

"Free disposable income, 1996" is a measure of the relative wealth of the municipality (per inhabitant). It is calculated by finding the difference between the actual income in the municipality and the "required expenses" in the municipality in sectors other than care for the elderly (e.g. schools, roads etc.). Required expenses are calculated from demographic variables and other factors exogenous to the municipality. (See Aaberge and Langørgen (2003) for the details behind the construction of this indicator.) Data for 1997 (the year all the other data is from) was also available, but we reasoned that the municipality's decision on how it wants to provide care for the elderly is more strongly based on income in the previous year than in the current year. This has some statistical support in that the 1996 variable has larger explanatory power measured by the R2 of the model and the t-value of the parameter estimate. The p-value for the parameter estimate for this variable improves when the exterior self-evaluators are removed from the regression model. One possible explanation of the negative parameter estimate is that a "rich" municipality might use the extra resources on higher quality (not picked up by the DEA model) and/or allow inefficiency in the production of services.

"Share of users in home care" is a measure of the share of home care clients in relation to all the clients getting nursing services. This coefficient has a negative parameter estimate.
This is an indication that the technical efficiency tends to be lower when a larger part of the municipality's clients is in home care. This is interesting, because it is a measure of the product mix in the municipality. The DEA method takes into account the case mix when estimating the frontier. However, the distance between the frontier and the average unit behind the frontier might vary with case mix. It is important to remember that since we have no price information on the products (home care and institutionalized care), we do not know which group has the highest total efficiency. Without price information we can only estimate technical efficiency and scale efficiency, not allocative efficiency, which is also a component of total efficiency. Thus, we can make no recommendation of what is better, only point out that the variation of technical efficiency seems to grow with the share of home based nursing.

Share in home care of users in age group … (four age groups) measures how large a share of the total population in an age group gets home based nursing services. With the exception of the lowest age group (0-66) all of the parameter estimates are statistically significant and positive. This supports our hypothesis that the higher the coverage of home based nursing, the lower the required resource usage per client. The reasoning is that the nursing sector behaves as if it ranks its potential clients from the ones that require the most nursing to the ones that require the least, and that it uses this ranking as a prioritized list of which clients to accept first. If the municipality has a larger share of the population in an age group as its clients, we expect the average required resource usage per client to be lower because the average client is healthier.

Share in inst. care of users in age group … (four age groups) is similar to the variables described above, but for institutionalized care. The parameter estimate for the youngest age group (0-66) is positive and statistically significant. It is a priori known that some of these clients require a lot of resource usage, but remember that the number of users in this group (inst. 0-66) is included in the DEA model. It might be that the municipalities that have a relatively large share of these users compared to their total population have healthier clients on average. The only other age group in inst. care that gets a statistically significant parameter estimate is 67-79, where the sign is negative. This is an indication that the "youngest of the oldest" require more resource usage in inst. care than the other groups above 67. It might be that it is more difficult for the clients in this relatively young age group to get inst. care, and that the clients who actually get it require more resources on average than in the older age groups.

Removing the exterior self-evaluators can make a difference. In this case the explained share of the total variance in the model increased as R2 rose from 17% to 21%. Both coefficient estimates and p-values change, sharpening the estimates of seven coefficients while only three had increased p-values (footnote 5). While numerical changes are small, they are still sizeable considering that only 25 out of 469 observations (5%) were removed. Essentially, we have removed the units that are most likely not to contain any information, i.e. to be pure noise (footnote 6). This is of course not conclusive evidence that one approach is better than the other. The point we want to make is that it may make a difference. We have already argued that it makes theoretical sense to remove the exterior self-evaluators.
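The comparison between the two regressions can be illustrated with a minimal sketch. It is not the regression behind Table 3: it assumes a single explanatory variable, and the data, the function names `ols_r2` and `two_stage`, and the set of flagged units are all made up for illustration. It only shows the mechanics of fitting the second stage once on all units and once without the units flagged as exterior self-evaluators.

```python
def ols_r2(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b, R^2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


def two_stage(z, eff, exterior):
    """Second-stage fit on all units, and on the units not flagged as exterior."""
    keep = [i for i in range(len(z)) if i not in exterior]
    full = ols_r2(z, eff)
    reduced = ols_r2([z[i] for i in keep], [eff[i] for i in keep])
    return full, reduced


# Hypothetical data: the last two units carry a score of 1 "by default".
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]            # made-up environmental variable
eff = [0.60, 0.65, 0.70, 0.75, 1.00, 1.00]    # made-up efficiency scores
exterior = {4, 5}                             # indices flagged as exterior self-evaluators

full, reduced = two_stage(z, eff, exterior)
print("full sample:    slope=%.3f  R2=%.3f" % (full[1], full[2]))
print("reduced sample: slope=%.3f  R2=%.3f" % (reduced[1], reduced[2]))
```

In this toy data the scores of 1 assigned to the flagged units pull the fitted slope upward and lower the fit, so both the coefficient and R2 change once they are dropped. The direction and size of the change in real data are an empirical matter.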
It may be added that in Simar and Wilson (2003) it is conjectured that the bootstrap works better the denser the data. Since we have removed data points in regions that by definition are as "thin" as possible, the bootstrap should also work better. In sum, we feel that we have made a solid case for the advantages of identifying and removing the exterior self-evaluators when doing a two-stage analysis in a DEA setting.

5 In contrast, excluding all self-evaluators, both interior and exterior, would have lowered R2 and decreased p-values only for three coefficients and increased them for seven.

6 Preliminary results from using the homogeneous bootstrap suggested by Simar and Wilson (1998) show a standard error of the bias-corrected estimates that is consistently twice as large for the exterior as for the interior self-evaluators, supporting the lack of information content in the efficiency estimates of the former.

5. Conclusions

The units found strongly efficient in DEA studies on efficiency can be divided into self-evaluators and active peers, depending on whether or not they serve as reference units for any inefficient units. The contribution of the paper starts with subdividing the self-evaluators into interior and exterior ones. The exterior self-evaluators are efficient "by default"; there is no firm evidence from observations for the classification. Self-evaluators may most naturally appear at the "edges" of the technology, but it is also possible that self-evaluators appear in the interior. It may be of importance to distinguish between the self-evaluators being exterior or interior.
Finding the influence of some variables on the level of efficiency by running regressions of efficiency scores on a set of potential explanatory variables is an approach often followed in actual investigations. Using exterior self-evaluators with an efficiency score of 1 in such a "two-stage" procedure may then distort the results, because assigning the value of 1 to these self-evaluators is arbitrary. Interior self-evaluators, on the other hand, may have peers that are fairly similar. They should then not be dropped when applying the two-stage approach.

A method for classifying self-evaluators based on the additive DEA model, either CRS or VRS, is developed. The exterior strongly efficient units are found by running the enveloping procedure "from below", i.e. reversing the signs of the slack variables in the additive model (1), after removing all the inefficient units from the data set. Which of the strongly efficient units from the additive model (1) turn out to be self-evaluators or active peers will depend on the orientation of the efficiency analysis, i.e. whether input or output orientation is adopted. The classification into exterior and interior peers is determined by the strongly efficient units turning out to be exterior ones when running the "reversed" additive model (9).

The exterior self-evaluator units should be removed from the observations on efficiency scores when performing a two-stage analysis explaining the distribution of the scores. The application to municipal nursing- and home care services of Norway shows significant effects of removing exterior self-evaluators from the data when doing a two-stage analysis. Thus the conclusions as to explanations of the efficiency score distribution will be qualified by taking our new taxonomy into use.

References

Banker, R. D., A. Charnes and W. W. Cooper (1984): "Some models for estimating technical and scale inefficiencies," Management Science 30, 1078-92.

Charnes, A., C. T. Clark, W. W. Cooper and B. Golany (1985a): "A Developmental Study of Data Envelopment Analysis in Measuring the Efficiency of Maintenance Units in the U.S. Air Forces," Annals of Operations Research 2, 95-112.

Charnes, A., W. W. Cooper, B. Golany, L. Seiford and J. Stutz (1985b): "Foundations of Data Envelopment Analysis for Pareto-Koopmans Efficient Empirical Production Functions," Journal of Econometrics 30, 91-107.

Charnes, A., W. W. Cooper and E. Rhodes (1978): "Measuring the efficiency of decision making units," European Journal of Operational Research 2, 429-444.

Cooper, W. W., L. M. Seiford and K. Tone (2000): Data Envelopment Analysis. A comprehensive text with models, applications, references and DEA-solver software, Boston/Dordrecht/London: Kluwer Academic Publishers.

Edvardsen, D. F. and F. R. Førsund (2001): "International benchmarking of electricity distribution utilities," Memorandum 35/2001, Department of Economics, University of Oslo.

Edvardsen, D. F., F. R. Førsund and E. Aas (2000): "Effektivitet i pleie- og omsorgssektoren" [Efficiency in the nursing- and home care sector], Rapport 2/2000, Oslo: Frischsenteret.

Erlandsen, E. and F. R. Førsund (2002): "Efficiency in the Provision of Municipal Nursing- and Home Care Services: The Norwegian Experience," in K. J. Fox (ed.): Efficiency in the Public Sector, Boston/Dordrecht/London: Kluwer Academic Publishers, x-y.

Färe, R. and D. Primont (1995): Multi-output production and duality: Theory and applications, Southern Illinois University at Carbondale.

Farrell, M. J. (1957): "The measurement of productive efficiency," Journal of the Royal Statistical Society, Series A, 120 (III), 253-281.

Førsund, F. R. and N. Sarafoglou (2002): "On the origins of data envelopment analysis," Journal of Productivity Analysis 17, 23-40.

Nerlove, M. (1965): Estimation and identification of Cobb-Douglas production functions, Amsterdam: North-Holland Publishing Company.

Seitz, W. D. (1967): "Efficiency measures for steam-electric generating plants," Western Farm Economic Association, Proceedings 1966, Pullman, Washington, 143-151.

Seitz, W. D. (1971): "Productive efficiency in the steam-electric generating industry," Journal of Political Economy 79, 878-886.

Simar, L. and P. W. Wilson (1998): "Sensitivity Analysis of Efficiency Scores: How to Bootstrap in Nonparametric Frontier Models," Management Science 44, 49-61.

Simar, L. and P. W. Wilson (2003): "Estimation and inference in two-stage, semi-parametric models of production processes," Technical report 0310, IAP statistics network (http://www.stat.ucl.ac.be/Iapdp/tr2003/TR0310.ps).

Torgersen, A. M., F. R. Førsund and S. A. C. Kittelsen (1996): "Slack-Adjusted Efficiency Measures and Ranking of Efficient Units," Journal of Productivity Analysis 7, 379-39.

Aas, E. (2000): "På leting etter målefeil – en studie av pleie- og omsorgssektoren" [Searching for measurement errors: a study of the nursing and care sector], Notater 2000:10, Statistics Norway, Oslo.

CLIMBING THE EFFICIENCY STEPLADDER: ROBUSTNESS OF EFFICIENCY SCORES IN DEA∗

by Dag Fjeld Edvardsen
Norwegian Building Research Institute, Forskningsveien 3b, NO-0314 Oslo, Norway. Email: dfe@byggforsk.no

Abstract: The robustness of the efficiency scores in DEA (Data Envelopment Analysis) has been addressed on a number of occasions. It is of crucial importance for the practical use of efficiency scores. The purpose of this paper is to demonstrate the usefulness of a new way of getting an indication of the sensitivity of each of the efficiency scores to measurement error.
The main idea is to investigate a DMU's (Decision Making Unit) sensitivity to sequential removal of its most influential peer (with new peer identification as a part of each of the iterations). The Efficiency stepladder approach is shown to provide relevant and useful information when applied to a dataset of Nordic and Dutch electricity distribution utilities. Some of the empirical efficiency estimates are shown to be very sensitive to the validity and existence of one or a low number of other observations in the sample. The main competing method is Peeling, which consists of removing all the frontier units in each step. The new method has some strengths and some weaknesses in comparison. All in all, the Efficiency stepladder measure is simple and crude, but it is shown that it can provide useful information for practitioners about the robustness of the efficiency scores in DEA.

Keywords: DEA, Sensitivity, Robustness, Efficiency stepladder, Peeling.

∗ This study is part of the methodological development within the research project "Productivity in Construction" at the Norwegian Building Research Institute (NBI), financed by the Norwegian Research Council. Finn R. Førsund has followed the whole research process and offered detailed comments. Hans Bjurek, Håkan Eggert, Lennart Hjalmarsson, and Sverre A.C. Kittelsen have also given valuable comments. Any remaining misunderstandings are solely this author's responsibility.

1. Introduction

The robustness of the efficiency scores in DEA has been addressed in a number of research papers.
There are several potential problems that can disturb precise efficiency estimation, such as sampling error, specification error, and measurement error. It is almost exclusively the last of these that is dealt with in this paper.

It has been proven analytically that the DEA efficiency estimators are asymptotically consistent given that a set of assumptions is satisfied (footnote 1). The most critical assumption might be that there are no measurement errors. The DEA method estimates the production possibility set by enveloping the data as closely as possible, in the sense that the frontier consists of convex combinations of actual observations, given that the frontier estimate can never be "below" an observed value. If the assumption of no measurement error is broken we might observe input-output vectors that are outside the true production possibility set, and the DEA frontier estimate will be too optimistic. Calculating the efficiency of a correctly measured observation against this optimistic frontier will lead to efficiency scores that are biased downwards. In other words, even symmetric measurement errors can produce efficiency estimates that are too pessimistic.

It is of crucial importance for the practical use of the efficiency scores that information about their sensitivity is available. The reason why measuring sensitivity is a challenge is in a sense related to the difficulty of looking at n-dimensional space. In two dimensions, and possibly three, one can get an idea of the sensitivity of one observation's efficiency score by visually inspecting a scatter diagram.
But when the number of dimensions is higher than three, help is needed. The Efficiency stepladder method introduced in this paper is an offer to empirically oriented DEA applications. This paper is not about detecting outliers; it is about investigating the robustness of each DMU's efficiency score. The main inspiration is Timmer (1971), and the intention is to offer a crude and simple method that works relatively quickly and is available to practitioners as a freely downloadable software package.

In the following only DEA related approaches are considered. There are mainly two ways in which sensitivity to measurement error in DEA has been examined: (1) perturbations of the observations, often with strong focus on the underlying LP model, and (2) exclusion of one or more of the observations of the dataset. The Efficiency stepladder is based on the latter alternative.

1 See Banker (1993) and Kneip et al. (1998) for details.

The main idea is to examine how the efficiency score of a given inefficient DMU develops as the most influential other DMU is removed in each of the iterative steps. The first step is to determine which of the peers is the one whose removal is associated with the largest increase in the efficiency score. This peer is permanently removed, and the DEA model is recalculated, giving a new efficiency score and a new set of peers. The removal continues in this fashion until the DMU in question is fully efficient. This series of iterative DMU exclusions provides an "efficiency curve" of the increasing efficiency values connected with each step.

There are few alternative approaches available that provide information about the sensitivity of efficiency scores.
Related methods in the literature are Peeling (Barr et al., 1994), Efficiency Order (Sinuany-Stern et al., 1994) and Efficiency Depth (Cherchye et al., 2000). Peeling consists of removing all the frontier units in each step. There are also similarities between the Efficiency stepladder and the Efficiency Order/Efficiency Depth methods. The main difference is that the Efficiency stepladder approach is concerned with the stepwise increase in the efficiency scores after each iterative peer removal, while the Efficiency Order/Efficiency Depth methods are more concerned with the number of observation removals that is required for the DMU in question to reach full efficiency.

The empirical application is mainly used as an illustration of how the Efficiency stepladder method works on real world data. The application is used to show what kind of analysis can be performed using this method. To carry out a full scale empirical analysis is an extensive undertaking, and is outside the scope of this paper.

The layout of the rest of the paper is according to the following plan. Section 2 gives a brief survey of some of the literature related to the sensitivity of the efficiency scores in DEA. Section 3 explains the basic properties of the DEA method. Introduction of the Efficiency stepladder approach is the topic of Section 4. In Section 5, model specification and the basic facts about the dataset are presented. The empirical results and how the Efficiency stepladder method can provide insight about the sensitivity of the dataset used are found in Section 6. Section 7 rounds off the paper with the conclusions.

2. Sensitivity in DEA – a brief survey

The topic of this paper is the sensitivity of the efficiency scores in DEA. Other non-parametric approaches are claimed to be more robust to noisy data. One example is the Order-M frontier method. It is described in Cazals et al. (2002). One application of this method (on U.S. commercial banks) is Wheelock and Wilson (2003). Instead of measuring performance relative to the unknown (and difficult-to-estimate) boundary of the production set, performance for a given DMU is measured relative to expected maximum output among banks using no more of each input than the given DMU. The authors claim that this approach permits a fully non-parametric estimation with a much better rate of convergence than DEA, avoiding the usual curse of dimensionality that plagues traditional non-parametric efficiency estimators.

In the following, only DEA related approaches are considered. There are mainly two ways in which sensitivity to measurement error in DEA has been examined: (1) perturbations of the observations, often with strong focus on the underlying LP model, and (2) exclusion of one or more of the dataset observations. Other alternatives have been used when more information about the uncertainty of one or a few of the dimensions is available (footnote 2).

2.1 Investigations based on perturbations of the data in the LP model

Charnes et al. (1985) examined the consequences of varying one of the output variables. The intention was to identify the efficient DMUs that have wide ranging effects and distinguish them from others whose effects are more limited.
In the conclusion they state that "More work needs to be done to extend this for studying the consequences of altering several outputs simultaneously. Input variations and also simultaneous variations of inputs and outputs need to be addressed in other research that should be of value for sensitivity analysis in general."

One of the papers that picked up that challenge was Charnes et al. (1992), who used "distance" (the norm of a vector) in order to determine the "radii of stability" for a DMU. Within this region, data variations do not alter a DMU's status from inefficient to efficient (or vice versa). This is done by centring a box on the original observation for the DMU in question. This box (they refer to it as a "Unit ball", even when it is not round in any possible sense) is defined by the Chebyshev norm, which is described by the smallest distance from the centre of the box to any of the sides. For an inefficient DMU the radius defining this box is increased from zero until an observation within this box can be reclassified from inefficient to efficient. The sensitivity of the efficient units is estimated in a similar way.

2 See Kittelsen et al. (2001).

Thompson et al. (1994) wanted to determine the magnitudes of data variations that can cause changes in status for the DMUs classified as fully efficient. Their method is based on studying the effects of small increments and decrements in the inputs and outputs with regard to the DMU's classification as efficient or inefficient. They applied this method to two real world datasets (Kansas farming and Illinois coal mining).
In the latter they found that within the data variations considered (+/-20% or less in absolute value), 98% of the DMUs in the subset originally classified as 100% efficient were insensitive to potential data errors. The authors claim that their sensitivity analysis shows that DEA results tend to be robust for extreme efficient DMUs.

Zhu (1996) examines how to identify the sensitivity or robustness of efficient DMUs in DEA. His approach is based on linear programming problems whose optimal values yield particular regions of stability. Sufficient and necessary conditions for variations in inputs and outputs of an efficient DMU to maintain full efficiency are provided.

2.2 Investigations based on exclusion of observations from the data set

An early and influential contribution was Timmer (1971). This paper heavily quoted Farrell (1957), but used deterministic frontiers (estimated with linear programming) instead of DEA. Though not mentioned in its abstract, the paper was a pioneering contribution when it comes to measuring the sensitivity of the efficiency scores when removing selected units from the dataset. Timmer showed two ways to do this. The first alternative he suggested was to remove observations from the dataset until a given percentage of the dataset is outside the probabilistic frontier. The other alternative he suggested was to remove efficient observations one by one until the resulting frontier stabilizes. Timmer claimed that either of these approaches may overcome the objections to estimating a frontier function because of data problems.

Superefficiency was introduced in Andersen and Petersen (1993).
It was introduced to rank efficient units, but as pointed out in Banker and Chang (2000), it is probably more useful for detecting outliers when there is reason to believe that the data might be noisy. Superefficiency is a measure of the relative radial distance from the origin to the DMU in question, when the frontier is estimated without this DMU included in the dataset. For an efficient unit, superefficiency is by construction greater than (or equal to) one. A superefficiency value of 1.2 implies that the DMU is positioned "20% outside" where the frontier would have been without this DMU (in a radial sense).

Peeling is described in Barr et al. (1994). This approach measures how much the efficiency of a DMU would change if the whole frontier was removed. They used the allegory that peeling in DEA is like removing layers from an onion. The DEA dataset can be seen as a series of frontiers inside other frontiers. If we remove all the observations in the first frontier, a new frontier is generated when the LP model is recalculated for the remaining units. This continues until there are no more observations left. With peeling one is typically mostly concerned with which frontier a DMU belongs to, namely the one where it becomes efficient. A weakness is that the frontiers in DEA typically consist of different numbers of units. For the individual DMU, removing one single unit can be sufficient for it to reach the frontier. Removing the entire frontier is counted as one operation, independently of the number of units this particular frontier consisted of. One attractive aspect of Peeling is that it is very fast to compute.
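The layer numbering produced by peeling can be sketched as follows. This is a simplified illustration, not the procedure of Barr et al. (1994): it assumes one input, one output and constant returns to scale, so that the frontier in each round is simply the set of units with the highest output/input ratio and no LP model has to be solved. The function name `peel_layers` and the data are made up.

```python
def peel_layers(units, tol=1e-12):
    """Return the peeling layer (1 = outermost frontier) of each (input, output) pair.

    Assumes one input, one output and CRS, so the frontier of the remaining
    units is the set of units with the highest output/input ratio.
    """
    remaining = dict(enumerate(units))
    layer_of = {}
    layer = 1
    while remaining:
        best = max(y / x for x, y in remaining.values())
        frontier = [i for i, (x, y) in remaining.items() if y / x >= best - tol]
        for i in frontier:
            # the whole frontier is removed in one step, however many units it holds
            layer_of[i] = layer
            del remaining[i]
        layer += 1
    return [layer_of[i] for i in range(len(units))]


# Hypothetical (input, output) observations
units = [(10.0, 10.0), (5.0, 5.0), (10.0, 8.0), (10.0, 4.0)]
print(peel_layers(units))  # [1, 1, 2, 3]
```

The first layer here contains two units and the later layers one each, which is the weakness noted above: one peeling step can correspond to very different numbers of removed observations.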
Peeling is well known in the DEA research community, but surprisingly few empirical DEA applications take advantage of this method. One possible explanation is that none of the mainstream commercial and freeware DEA software packages offer automatic generation of the layer number of each DMU.

Peeling in DEA is in spirit very close to Timmer (1971), but since Timmer did not use DEA the selection of which and how many DMUs to remove is a little different. Timmer suggested removing a given number or a given percentage of the DMUs, while Barr et al. suggest removing the entire frontier, independently of whether the frontier is made up of 1 or 20 DMUs.

Sinuany-Stern et al. (1994) introduced Efficiency Order as "the number of units we need to delete in order to reach efficiency." What algorithm one should use to identify the number of units that is required to be deleted is not explained in detail. A similar approach can be found in Cherchye et al. (2000) (footnote 3), who used a mixed integer algorithm to identify the Efficiency Order. Further information on how the Efficiency Order relates to the Efficiency stepladder is given in Section 4.

Wilson (1995) investigated the consequences of removing observations from the dataset. If removing an observation makes big differences in the efficiency scores of the other DMUs, then the area of the dataset where this input/output combination was found is not densely populated, and convex combinations of other DMUs offer little help. This is an indication that the observation in question is a possible outlier and should be investigated for measurement error. By definition this approach works only on the fully efficient units.
3 They use the term "efficiency depth", and do not refer to Sinuany-Stern et al. (1994).

3. Data Envelopment Analysis

3.1 The origins of DEA

The original idea behind DEA was introduced in Farrell (1957). It was further developed in a very influential paper by Charnes, Cooper and Rhodes (1978). The term Data Envelopment Analysis (DEA) was coined in their paper. However, the first use of Linear Programming (LP) in the calculation of the DEA efficiency scores was made by Farrell and Fieldhouse (1962). The DEA model with variable returns to scale is often referred to as the BCC-model (Banker, Charnes and Cooper, 1984), but it was introduced in Afriat (1972) in the single output case, and empirically implemented in the case of multiple outputs in Färe, Grosskopf and Logan (1983) (footnote 4).

Banker (1993) proved that the output oriented efficiency score is consistent in the case of a single output, while Kneip et al. (1998) showed statistical consistency and rate of convergence in the general multiple-input and multiple-output case. Unfortunately the rate of convergence is low, leading to sampling bias. The expected size of this bias increases exponentially in the number of inputs and outputs for a given sample size. The bias can be estimated and the efficiency estimate bias adjusted with a statistical technique referred to as bootstrapping (Efron, 1979). If the required number of inputs and outputs is large compared to the number of DMUs available, the standard errors for the (bootstrapped) bias corrected efficiency scores will be very large, and the discrimination of the efficiency scores will have little statistical significance.
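As a hedged illustration of the bias adjustment mentioned above, the usual bootstrap bias correction can be written out as below. The notation is assumed here for illustration and is not taken from the paper: $\hat{\theta}_i$ is the DEA estimate for DMU $i$, $\hat{\theta}^{*}_{i,b}$ is the estimate obtained in bootstrap replication $b$, and $B$ is the number of replications.

```latex
\[
\widehat{\mathrm{bias}}_i \;=\; \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^{*}_{i,b} \;-\; \hat{\theta}_i ,
\qquad
\tilde{\theta}_i \;=\; \hat{\theta}_i - \widehat{\mathrm{bias}}_i
\;=\; 2\,\hat{\theta}_i \;-\; \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^{*}_{i,b} .
\]
```

The correction removes the estimated bias but adds the sampling variability of the bias estimate itself, which is one way to see why the bias corrected scores can have large standard errors when the number of DMUs is small relative to the number of dimensions.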
Including too few inputs and outputs will reduce the curse of dimensionality, but will lead to a wrongly specified efficiency model. In a sense this is worse, because the confidence intervals will misleadingly tend to be smaller the fewer inputs and outputs we include. [5] Including too many inputs or outputs (as long as the correct ones are included) will not make the efficiency estimator inconsistent (in an asymptotic sense), but with finite samples it will make the efficiency estimate more noisy and biased. Statistical tools for choosing model specification have been developed, but they do (of course) require that observations of the important inputs and outputs are available, and that a sufficiently large number of DMUs are available for the tests to give significant results (depending on the power of the tests). This line of thought leads back to Banker (1993, 1996) and Kittelsen (1993).

[4] For the history of the development of DEA, see Førsund and Sarafoglou (2002).
[5] This is a complicated mechanism, and will not be covered in further detail in this paper.

3.2 The LP formulation of the DEA model

Førsund and Hjalmarsson (1979) define the measures E_1 to E_5, where E_1 is radial efficiency assuming variable returns to scale. This is the same as the Banker et al. (1984) model formulated as:

\[
\begin{aligned}
E_{1i} \equiv \min\ & \theta_i \\
\text{s.t.}\quad & \sum_{j \in N} \lambda_{ij} y_{mj} - y_{mi} \ge 0, \quad m = 1, \dots, M \\
& \theta_i x_{si} - \sum_{j \in N} \lambda_{ij} x_{sj} \ge 0, \quad s = 1, \dots, S \\
& \sum_{j \in N} \lambda_{ij} = 1 \\
& \lambda_{ij} \ge 0, \quad j \in N
\end{aligned}
\tag{1}
\]

The usage of symbols in model (1) is as follows: E_1i is the input saving VRS efficiency for DMU i, θ_i is a scalar, S is the number of input dimensions, M is the number of output dimensions, and N is the set of DMUs. The indices i and j belong to the set N, y_mj is the level of output m for DMU j, and x_sj is the level of input s for DMU j. λ_ij is a reference weight. The peers for DMU i in problem (1) are the ones for which λ_ij is strictly positive. If DMU i is strongly efficient then λ_ii has the value of 1. In other words, a strongly efficient DMU is its own peer. [6]

Notice that not all units with radial efficiency equal to 1 are Pareto efficient. They might have slack in one or more dimensions. To identify which of the DMUs are Pareto efficient, the “additive” DEA model can be used. Here the sum of slacks for each DMU is maximized, and only the Pareto efficient units have zero slack (see Charnes et al., 1985).

Figure 1 is an illustration of how the DEA model works in the VRS case with two inputs and one output. The DMUs A, B and C are efficient and define the boundary of the Production Possibility Set (PPS), while DMU D is inefficient and is positioned strictly in the interior of the PPS. The input saving radial efficiency of DMU D is equivalent to the proportional radial contraction of all inputs possible while staying within the PPS. The radial contraction is stopped at point F. The radial input saving efficiency of DMU D (E_1) is equal to the ratio GF/GD. However, point F is not a Pareto efficient point. In addition to reducing both the inputs with the same percentage, it would also be possible to further reduce the usage of input x1 equivalent with the distance FH. The existence of this extra slack is not captured by the E_1 measure.

[6] If two (or more) DMUs have identical input-output vectors, the choice of peer(s) is not unique.

Figure 1: Illustration of DEA with two inputs and one output (VRS) using the radial Farrell input reducing efficiency measure. [Axes: Input x1, Input x2, Output y. Labelled points: O, A, B, C, D, F, G, H; the slack is marked in the figure.]

4. The Efficiency stepladder – a method for measuring sensitivity in DEA

The basic idea behind the Efficiency stepladder approach is quite similar to the efficiency order (Section 2.2). The robustness of the efficiency score of the unit under investigation is examined in light of the exclusion of other observations in the sample. In both cases one is interested in the lowest possible number of observations that has to be removed for the DMU in question to reach the frontier, but with the Efficiency stepladder approach there is greater focus on the entire development from the original efficiency and then step by step until the DMU is fully efficient. The exact algorithm used is presented in Section 4.1, but the basic idea used in the computer program is in each step to determine which of the peers is the one whose removal is associated with the largest efficiency increase. This peer is then removed, and the DEA model is recalculated, leading to a new peer group set. This is repeated until the DMU has an efficiency score of 100%.

One alternative algorithm could be to iterate over all alternatives and then determine which sequence of observation removals most quickly moves the DMU to the frontier.
This approach would however be extremely time consuming with even medium sized datasets because of the very high number of possible sequences to consider. A natural implementation when an unlimited number of CPU cycles is not available is a first step optimal algorithm. It can easily be shown that this algorithm can choose stupid paths when digging out peers in order to get the unit to the frontier with as few steps as possible (one example is provided further down in the text), but the results are still useful as long as one remembers that a high number of steps to the frontier should not be taken as very strong evidence that the DMU is inefficient. On the other hand a low number of Efficiency stepladder iterations before we reach the frontier means with certainty that the efficiency of the DMU in question is very sensitive to the quality of the observations for the DMUs removed in the Efficiency stepladder sequence. In other words, be concerned if the slope of the Efficiency stepladder is steep, but don't be too calm if the increase is slow.

The computer program used to calculate the numbers in this paper is “DagEA”, which has been developed for this exact purpose. In the current version it is a front end [7] that uses DEAP (Coelli, 1996) as its DEA solver. DEAP is freely available on the Internet, and DagEA will be. In the front end of DagEA there are routines for automatic calculation of the Efficiency stepladder for a dataset. [8] For middle-sized datasets, calculation is relatively fast, but calculation time increases exponentially with the dimensionality. [9]

[7] A front end is a computer program that provides the visual interface that the user interacts with, but it uses another computer program as its calculation engine.
[8] DagEA is developed by Dag Fjeld Edvardsen. It will eventually be downloadable for free from http://home.broadpark.no/~dfedvard.
[9] Calculating all the Efficiency stepladder values on the dataset used in this paper took 37 minutes on a 1.2 GHz Pentium M notebook, but there are still possibilities for optimizing the source code. Increasing the speed by a factor of 10 on the same hardware might be realistic.

4.1 The Efficiency stepladder approach illustrated

The one step optimal algorithm is very simple:

1. Calculate the DEA efficiency score for the unit of interest (“DMU P1”) and write down which units serve as peers for DMU P1. The peers are characterized by having a strictly positive λ in the optimal solution of the LP model formulated in (1).

2. For each of the peers identified in the step above: Calculate the efficiency score of DMU P1 if that peer is removed from the dataset, write down the efficiency score, and put the peer back in the dataset before the efficiency of DMU P1 is calculated with another peer temporarily removed.

3. Permanently remove the most influential peer identified in the step above. This is the peer whose removal is associated with the largest change in the efficiency of DMU P1 (that is, the peer whose removal makes the efficiency score of DMU P1 the largest). This efficiency score and the most influential peer's identity are added to the Efficiency stepladder table.

4. Repeat (1)-(3) while permanently removing the peers identified in (3), and for each iteration add the id-number of the most influential peer and the efficiency score associated with its removal. Stop the repetitions when the efficiency of DMU P1 reaches 1.

4.2 Possible problems with the one step optimal algorithm

Figure 2 is an example of how a one step optimal routine can potentially choose a route towards the frontier that takes a higher number of steps than necessary. The challenge is to find the lowest number of sequential peer exclusions that will make DMU J fully efficient. Looking at Figure 2, it is easy to see that excluding DMU H and then DMU I results in DMU J reaching the frontier in two steps. However, the one step optimal algorithm by definition only compares the alternatives one step further down the road. Because of this it will choose to remove DMU A instead of removing DMU H, since this will result in the greatest increase in the efficiency score for DMU J. Next, with DMU A out of the way, it will choose to remove DMU B instead of DMU H, for the same reason. This continues as the one step optimal algorithm chooses to remove the DMUs C, D, E, F, and G. Only after this does it decide to eliminate DMU H and DMU I. In this example the algorithm used nine steps to accomplish what really needed only two steps.

Figure 2: Illustration of how the one step optimal algorithm can choose the wrong path. [Axes: Input (horizontal), Output (vertical). Labelled points: A to J.]

However, this example is constructed to show the one step optimal algorithm in the worst possible light. There is no indication that such behaviour is common when real world data is used.
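The four steps of Section 4.1 can be sketched in a few lines of Python. This is only an illustration under strong assumptions, not DagEA or DEAP: it handles one input and one output, so that model (1) can be solved by checking single units and pairs of units instead of calling an LP solver. The function names and the four-unit dataset are hypothetical.

```python
from itertools import combinations


def vrs_input_efficiency(data, target):
    """Input saving VRS efficiency of `target`, ONE input and ONE output only.

    `data` maps a DMU name to (input x, output y). Returns (score, peers).
    Assumption: with one input and one output, an optimal solution of
    model (1) needs at most two positive reference weights, so it is
    enough to check single units and pairs of units.
    """
    x0, y0 = data[target]
    best_x, best_peers = None, ()
    # Case 1: a single unit producing at least y0 is the reference point.
    for j, (xj, yj) in data.items():
        if yj >= y0 and (best_x is None or xj < best_x):
            best_x, best_peers = xj, (j,)
    # Case 2: a convex combination of two units whose outputs bracket y0.
    for j, k in combinations(data, 2):
        (xj, yj), (xk, yk) = data[j], data[k]
        if yj > yk:  # make j the unit with the lower output
            j, k, xj, yj, xk, yk = k, j, xk, yk, xj, yj
        if yj < y0 < yk:
            t = (yk - y0) / (yk - yj)  # weight on unit j
            x = t * xj + (1 - t) * xk  # input used by the combination
            if x < best_x - 1e-12:
                best_x, best_peers = x, (j, k)
    return best_x / x0, best_peers


def efficiency_stepladder(data, target, tol=1e-9):
    """One step optimal stepladder: list of (removed peer, new score)."""
    data = dict(data)  # work on a copy, the caller's data is untouched
    ladder = []
    score, peers = vrs_input_efficiency(data, target)  # step 1
    while score < 1 - tol:
        # Step 2: remove each peer temporarily and record the new score.
        trials = []
        for p in peers:
            reduced = {k: v for k, v in data.items() if k != p}
            trials.append((vrs_input_efficiency(reduced, target)[0], p))
        # Step 3: permanently remove the peer whose removal gives the
        # largest score (ties are broken by name).
        score, removed = max(trials)
        del data[removed]
        ladder.append((removed, score))
        # Step 4: repeat with the peers of the reduced sample.
        peers = vrs_input_efficiency(data, target)[1]
    return ladder


# Hypothetical data: name -> (input, output); "P" is the unit of interest.
data = {"A": (2, 1), "B": (4, 4), "C": (8, 6), "P": (6, 3)}
for peer, score in efficiency_stepladder(data, "P"):
    print(peer, round(score, 4))
```

In this toy dataset the unit P needs two removals to reach the frontier. In the last round, removing either remaining peer gives full efficiency, so only the identity of the final peer depends on the tie-breaking rule. With several inputs or outputs the two helper cases would have to be replaced by a proper LP solution of model (1).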
At the same time it is a demonstration of why it is important to think of the Efficiency stepladder approach as one way safe. If it reports that it only takes a low number of sequential peer removals to move from large inefficiency to the frontier, one can be certain that the sensitivity of the efficiency score is high. But as demonstrated by Figure 2, one should not be too calm if the algorithm indicates that a high number of peer removals are necessary.

It could be tempting to use brute force and compare all possible peer removals until the path with the lowest number is found, but this may not be a practical alternative because it exhausts the capacity of today's generation of PCs. The reason is that the number of alternatives to compare will easily be extremely high. There are some ways to reduce this problem. One way is to cluster two observations close to each other into a single entity, and possibly minimize a penalty function where removing this entity counts as double the removal of only one DMU. Another possibility is to combine the Efficiency stepladder approach with peeling, and notice the cases where there are large differences between the results of these two methods. Peeling has its own weaknesses, but it is simpler, faster and in some cases more robust in difficult situations such as the one presented in Figure 2.

4.3 Efficiency stepladder for fully efficient units

The Efficiency stepladder can also be calculated for fully efficient DMUs, or DMUs which become fully efficient after having gone through a number of iterations from their original position below the frontier. In these cases we measure using the “superefficiency” concept (Andersen and Petersen, 1993).
We continue to do Efficiency stepladder iterations, and stop when the superefficiency is undefined. The higher the number of steps from the original position to undefined, the more units are involved in calculating the efficiency of the DMU. The reason why the sensitivity of the fully efficient DMUs is interesting is the same as for the inefficient units: it is relevant to know how robust the frontier that this unit is compared with is to measurement error. If the efficiency of the unit becomes undefined after a low number of Efficiency stepladder iterations, this is an indication that the part of the frontier that this unit is compared with is not very robust.

Calculating the Efficiency stepladder involves removing the most influential peer for a DMU in each step. The fully efficient DMUs are their own peers, and when we remove them the value is by definition greater than or equal to 1 (they are no longer allowed to be part of the convex combinations that define the frontier, but remain as ghost units that we measure against). For this reason ESL(1) for the frontier units is the same as superefficiency, while ESL(2) and later take the superefficiency concept further.

5. Model specification and data

The empirical part of this paper is mainly intended as an illustration of the ESL method. The dataset is a quite typical example of the datasets used in empirical applications of the DEA method when it comes to the number of observations and the number of inputs and outputs. The dataset is cross section data on the Nordic and Dutch electricity distributors in 1997 (see Edvardsen and Førsund, 2003). The data was collected by the national regulators.
The key characteristics of the data are presented in Table 1. The difference in size between the DMUs is large, as revealed by the last two columns. TOM is Total Operating and Maintenance cost (including labor costs) measured in thousands of Swedish kronor. LossMWh is energy loss in megawatt hours, and RV is Replacement Value measured in thousands of Swedish kronor. NumCust is the number of customers. TotLines is the total length of lines. MWhDelivered is the sum of megawatt hours delivered. See Edvardsen and Førsund (2003) for further details on the content and history of the dataset.

Table 1. Summary statistics. Cross-section 1997. Number of units 122.

Variable        Average    Median    Standard Deviation    Minimum    Maximum
TOM (kSEK)       152388     97026                182923      11274       981538
LossMWh           91449     52318                104777       7020       615281
RV (kSEK)       2826609   1907286               3288382     211789     22035846
NumCust          109260     55980                163422      20035      1052096
TotLines           7640      4948                  8824        450        54166
MWhDelivered    2110064   1003472               2815025     166015    178054730

6. The results

6.1 The basic DEA efficiency results

Figure 3 is an Efficiency Diagram [10] showing the results of the efficiency calculations assuming Variable Returns to Scale (VRS). Each of the efficiency scores is calculated by solving the linear programming problem in (1). The DEA calculations shown in Figure 3 assume no measurement error, but what if this assumption does not hold? The purpose of the Efficiency stepladder approach is to examine the sensitivity of efficiency scores to measurement errors.

Figure 3: Input saving efficiency scores when assuming VRS (E_1). [Axes: size in TOM (horizontal), E_1 (vertical).]
[10] One interesting feature of Efficiency diagrams is that both the heights and the widths of the bars can contain information, unlike a bar chart where only the heights of the bars are actively used. This is especially useful when illustrating the results of efficiency analysis. The efficiency of each DMU is shown by the height of the bar, while its economic size (man-years in Fig. 3) is shown by the width of the bars. This means that it is possible to examine whether there are any systematic correlations between the sizes of the units and their efficiencies. Another interesting geometric aspect of these figures is that they are sorted according to increasing efficiency from left to right. The distance from the top of each bar to 1.00 is a measure of that DMU's inefficiency, and the width of the bar is a measure of its economic size. For this reason the area above each of the bars is proportional to the economic cost of that DMU not being 100% efficient. This means that there will typically be a “white triangle” above the inefficient units, and that the size of that area is proportional to the economic cost of the total inefficiency in the sample.

Figure 4: First step ESL values for all the inefficient DMUs, sorted in increasing order.

The changes in the efficiency scores from the first step in the ESL algorithm are shown in Figure 4.
For more than half of the inefficient DMUs the change after removing their originally most influential peer, ESL(1), is larger than 5 percentage points, and for a fifth of the DMUs the change is larger than 10 percentage points. Two DMUs experience increases in their efficiency scores larger than 20 percentage points. This suggests that the individual efficiency scores in DEA applications strongly depend on the assumption of no measurement error. If the most influential of a DMU's peers is outside the true production possibility set, one can get a very large negative measurement-error bias in the estimated efficiency scores.

Figure 5: The original efficiency scores and the ESL(1) values for all the inefficient DMUs, sorted pairwise.

Figure 5 is similar to Figure 4, but now the ESL(1) values for the inefficient units are shown together with their DMU's original efficiency (sorted pairwise). A visual inspection of Figure 5 confirms that a number of the inefficient units move from being quite inefficient to being quite efficient (if we place the border between these two conditions at the ad hoc value of 0.85). It is also interesting to notice that there does not seem to be any strong pattern concerning the correlation between the original value and the ESL(1) value, especially if one sees it in light of the fact that the DMUs that are originally assigned a high efficiency are limited in how big the ESL(1) value can be, since the efficiency number cannot be larger than 1.
Figure 6 is similar to Figure 5, but shows the Efficiency stepladder values for the first six steps of the sequential Efficiency stepladder iterations for all the inefficient DMUs. The changes in the efficiency score with ESL(2) tend to be smaller than in ESL(1), but there are examples of the opposite. A few of the DMUs have low sensitivities to the validity of their peers, but the general picture is that most of the DMUs experience large changes in their efficiency scores after two or three Efficiency stepladder iterations.

Figure 6: Stacked bar chart showing original efficiency and ESL(1) to ESL(6).

Figure 7: The Efficiency stepladder for a few selected inefficient DMUs (horizontal axis is the Efficiency stepladder number; vertical axis is the E-score). [Curve labels, with steps to the frontier in parentheses: A(1), B(4), C(10), D(19), E(33), F(39), G(42), H(56).]

Figure 7 shows the ESL curves for eight of the inefficient DMUs in the dataset (referred to as A-H). They are selected because their curves show some of the different developments. By construction all of the curves are non-decreasing. The identity of each of the curves is indicated on the top of the figure, together with the required number of Efficiency stepladder iterations for that DMU to reach full efficiency. DMU A has an original efficiency of 0.84, but it becomes fully efficient after excluding only one of its peers from the sample. DMU B has an original DEA efficiency score of 0.66, but after four steps it reaches the frontier. DMU C starts at 0.57, but has a very steep increase in efficiency. The other DMUs (D-H) have slower efficiency increases.
One natural summary measure of the steepness of the Efficiency stepladder is the average increase per step. This is presented in Table 2. It is worth noticing that DMU B starts at a much lower efficiency than DMU D, but because of DMU B's higher average increase per step it reaches the frontier in only four steps while DMU D needs 19 steps. DMU H is a bit different from the other DMUs in that it experiences a mostly convex (broadly speaking) development, while the other DMUs that need a large number of ESL steps before they reach the frontier (D-G) follow a mostly concave pattern.

Table 2: Average increase per step for the selected DMUs

DMU    Original value    Inefficiency    Steps    Average increase per step
A                0.84            0.16        1                        0.16
B                0.66            0.34        4                        0.085
C                0.57            0.43       10                        0.043
D                0.78            0.22       19                        0.012
E                0.79            0.21       33                        0.006
F                0.47            0.53       39                        0.013
G                0.77            0.23       42                        0.005
H                0.58            0.42       56                        0.008

6.2 Efficiency stepladder (ESL) versus Efficiency Order

The Efficiency Order approach (Sinuany-Stern et al., 1994) is mainly concerned with the minimum number of DMUs that need to be deleted for the DMU in question to reach the frontier. In relation to Figure 7 the Efficiency Order approach would be interested in the number of steps each of the DMUs requires to get to the frontier, while the ESL approach is more focused on the steepness of the Efficiency stepladder.
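The steepness measure in Table 2 is simple arithmetic: the original inefficiency divided by the number of steps needed to reach the frontier. The short sketch below recomputes it from the rounded values printed in the table. The function name is hypothetical, and because the inputs are rounded the last digit can differ slightly from the published column (DMU F is one such case).

```python
# (original efficiency, steps to the frontier) as printed in Table 2
table2 = {
    "A": (0.84, 1), "B": (0.66, 4), "C": (0.57, 10), "D": (0.78, 19),
    "E": (0.79, 33), "F": (0.47, 39), "G": (0.77, 42), "H": (0.58, 56),
}


def average_increase_per_step(original, steps):
    """Inefficiency (1 - original score) spread evenly over the steps."""
    return (1.0 - original) / steps


for dmu, (original, steps) in table2.items():
    print(dmu, steps, round(average_increase_per_step(original, steps), 3))
```

The comparison between DMU B and DMU D shows why the number of steps alone can mislead: B starts lower than D but climbs about seven times faster per step.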
Another difference is that the Efficiency Order approach does not seem to be interested in the first few steps in the stepladder, but only in the total number of iterative DMU exclusions that leads to full efficiency. This might be a weakness of the Efficiency Order approach, since it is quite possible that a low number of observations lie outside the true production possibility set, whereas the likelihood that a large number of observations are infected with serious measurement errors may be quite low. In other words, one of the most interesting indications of a largely inefficient DMU's robustness is its sensitivity to one, two, or maybe three sequential peer removals. The Efficiency Order approach would not capture the fact that the first peer removal might move an inefficient unit from 40% to 90% efficiency if the rest of its way to full efficiency takes 10 more steps. The ESL approach proposed in this paper is concerned with the changes in efficiency in each step, and not only with the number of removals necessary before the frontier is reached.

Another difference is that the ESL approach is also relevant for the fully efficient units. The first step is then identical to superefficiency (Andersen and Petersen, 1993).

Thirdly, to calculate the Efficiency Order using the algorithm proposed in Cherchye et al. (2000), specialised software and some knowledge of computer programming are required.
The authors claim that the calculation of the Efficiency Order (they refer to it as “efficiency depth”) with their approach “should not involve substantial computational burden” and “require only minimal effort using an ordinary PC desktop (sic).” The exact approach for calculating the efficiency depth is unclear, and they formulate a Mixed Integer Linear Programming (MILP) problem without explaining how much CPU time is required on a desktop PC. Identifying which of the DMUs should be removed and in what sequence is left to the CPLEX [11] MILP optimizer. It is not certain that a different MILP optimizer would choose the same path towards the frontier.

The simpler algorithm used in this paper is more accessible to practitioners. A computer program that calculates the Efficiency stepladder has been developed to calculate the numbers presented. It will be freely available on the Internet so that practitioners can use it to get some crude but useful information on the sensitivity of the efficiency scores in DEA. Since the algorithm always chooses the one-step optimal solution, it is predictable how the Efficiency stepladder is constructed. [12]

7. Conclusions

Ideally sensitivity analysis, detection of potential outliers, and estimation of sampling bias should be carried out simultaneously. It is easier to detect outliers if we have some information about the sampling bias, and it is easier to estimate sampling bias if we have first identified the outliers. There have been developments in all these areas in the last few years, but at the time of writing no single method offers a solution to all the mentioned challenges.
The Efficiency stepladder method is simple and crude, but it can still be useful for applied DEA investigations. It should be thought of as one way safe: An Efficiency stepladder that is very steep is a clear indication that the DEA estimated efficiency is strongly dependent on the correctness of a low number of other observations. A slow increase, on the other hand, should not be interpreted as a strong indication that the efficiency really is this low. The reason is that the method is only one-step-optimal. In addition to measuring the sensitivity of the efficiency scores for efficient and inefficient units, it might be used in combination with bootstrapping to identify possible outliers. The necessary software for carrying out the Efficiency stepladder calculations will be made available from the author's website.

[11] CPLEX is a commercial optimizer capable of solving Mixed Integer Linear Programming problems. See http://www.ilog.com/products/cplex/ for more information.
[12] But even with the open algorithm used in the ESL approach there can be situations where the computer program has to choose between two or more equally good alternatives. However, in most of these cases this is the last step before the DMU in question reaches the frontier, and the “problem” is that removing any of several alternative peers will lead to full efficiency. In this case the Efficiency stepladder curve and all the efficiency values remain the same; only the identity of the last peer removed will differ.

The purpose of the ESL method is to examine the sensitivity of the efficiency scores to measurement errors.
Bootstrapping on the other hand is in the DEA context (primarily) used to measure sensitivity to sampling errors. We would expect that a DMU with a large ESL(1) value would also have a large standard error of the bias corrected efficiency score. The reason is that we expect the part of the (input, output) space where the DMU is located to be sparsely populated. Tentative runs have shown statistically significant and positive correlation between the ESL(1) values and the standard errors of the bootstrapped bias corrected efficiency scores. Furthermore, there is strong empirical association between the ESL(1) values for the fully efficient DMUs (=superefficiency) and the sampling bias estimated using bootstrapping. This is a promising topic for further research.

References

Afriat, S., 1972, Efficiency estimation of production functions, International Economic Review, 13(3), 568-598.

Andersen, P. and Petersen, N.C., 1993, A Procedure for Ranking Efficient Units in Data Envelopment Analysis, Management Science, 39(10), 1261-1264.

Banker, R.D., 1993, Maximum Likelihood, Consistency and Data Envelopment Analysis, a statistical foundation, Management Science, 39, 1265-1273.

Banker, R.D., 1996, Hypothesis Tests Using Data Envelopment Analysis, Journal of Productivity Analysis, 7, 139-159.

Banker, R.D., A. Charnes and W.W. Cooper, 1984, Some Models for Estimating Technical and Scale Inefficiencies in Data Envelopment Analysis, Management Science 30, 1078-1092.

Banker, R.D. and Chang, H., 2000, Evaluating the Super-Efficiency Procedure in Data Envelopment Analysis for Outlier Identification and for Ranking Efficient Units, Working Paper from the University of Texas at Dallas.

Barr, R.S., M.L. Durchholz and Seiford, L., 1994, Peeling the DEA Onion. Layering and Rank-Ordering DMUs Using Tiered DEA, Southern Methodist University technical report, Dallas, Texas.

Cazals, C., J.-P. Florens, and L. Simar, 2002, Nonparametric frontier estimation: a robust approach, Journal of Econometrics, 106, 1-25.

Charnes, A., Haag, S., Jaska, P., and Semple, J., 1992, Sensitivity of Efficiency Calculations in the Additive Model of Data Envelopment Analysis, International Journal of System Sciences, 23, pp. 789-798.

Charnes, A., Cooper, W.W. and Rhodes, E., 1978, Measuring the efficiency of decision making units, European Journal of Operations Research 2, 429-444.

Charnes, A., Cooper, W.W., Lewin, A.Y., Morey, R.C., and Rousseau, J.J., 1985, Sensitivity and Stability Analysis in DEA, Annals of Operations Research 2, 139-150.

Cherchye, L., Kuosmanen, T. and Post, G.T., 2000, New Tools for Dealing with Errors-In-Variables in DEA, Katholieke Universiteit Leuven, Center for Economic Studies, Discussion Paper Series DPS 00.06.

Coelli, T.J., 1996, “A Guide to DEAP Version 2.1: A Data Envelopment Analysis (Computer) Program”, CEPA Working Paper 96/8, Department of Econometrics, University of New England, Armidale NSW Australia.

Edvardsen, D.F. and Førsund, F.R., 2003, International benchmarking of electricity distribution utilities, Resource and Energy Economics, 25, 353-371.

Farrell, M.J., 1957, The measurement of productive efficiency, J.R. Statis. Soc. Series A 120, 253-281.

Farrell, M.J. and Fieldhouse M., 1962, Estimating efficient production functions under increasing returns to scale, J.R. Statis. Soc. Series A 125, 252-267.

Førsund, F.R. and N. Sarafoglou, 2002, On the origins of Data Envelopment Analysis, Journal of Productivity Analysis 17, 23-40.

Färe, R., Grosskopf, S. and Logan, J., 1983, The relative efficiency of Illinois electric utilities, Resource and Energy 5, 349-367.

Kittelsen, S.A.C., 1993, Stepwise DEA; Choosing Variables for Measuring Technical Efficiency in Norwegian Electricity Distribution, Memo 06/1993, Department of Economics, University of Oslo.

Kittelsen, S.A.C., G.G. Kjæserud and O.J. Kvamme: Errors in Survey Based Quality Evaluation Variables in Efficiency Models of Primary Care Physicians, HERO Memoranda 24/2001, Oslo. [http://www.oekonomi.uio.no/memo/memopdf/memo2401.pdf]

Kneip, A., Park, B.U. and Simar, L., 1998, A note on the convergence of nonparametric DEA estimators for production efficiency scores, Econometric Theory, 14, 783-793.

Sinuany-Stern, Z., A. Mehrez and A. Barboy, 1994, Academic Departments Efficiency via DEA, Computers Ops. Res., vol. 21, No. 5, pp. 543-556.

Timmer, C.P., 1971, Using a Probabilistic Frontier Production Function to Measure Technical Efficiency, Journal of Political Economy, Vol. 79, No. 4 (Jul.-Aug. 1971), 776-794.

Thompson R., Dharmapala, P.S. and Thrall, R.M., 1994, Sensitivity analysis of efficiency measures with applications to Kansas farming and Illinois coal mining. In: Data Envelopment Analysis, Theory, methodology and applications. Edited by Charnes A., W. Cooper, Lewin A.Y, Seiford, L.M. Kluwer.

Wilson, P.W., 1995, Detecting Influential Observations in Data Envelopment Analysis, Journal of Productivity Analysis 6, 27-46.
Zhu, J., 1996, Robustness of the efficient DMUs in data envelopment analysis, European Journal of Operational Research, 90, pp. 451-460.

EFFICIENCY OF NORWEGIAN CONSTRUCTION FIRMS∗

by Dag Fjeld Edvardsen
Norwegian Building Research Institute, Forskningsveien 3b, NO-0314 Oslo, Norway. Email: dfe@byggforsk.no

Abstract: Efficiency studies of the construction industry at the micro level are few and far between. In this paper information on multiple outputs is utilized by applying Data Envelopment Analysis (DEA) on a cross section dataset of Norwegian construction firms. Bootstrapping is applied to select the scale specification of the model. Constant returns to scale was rejected. Furthermore, bootstrapping was used to estimate and correct for the sampling bias in the DEA efficiency scores. One important lesson that can be learned from this application is the danger of taking the efficiency scores from uncorrected DEA calculations at face value. A new contribution is to use the inverse of the standard errors (from the bias correction of the efficiency scores) as weights in a regression to explain the efficiency scores. Several of the hypotheses investigated are found to have statistically significant empirical relevance.

Keywords: Construction industry, DEA, efficiency, bootstrapping, weighted two stage.

∗ This study is part of the research project "Productivity in Construction" at the Norwegian Building Research Institute (NBI) financed by the Norwegian Research Council. I would like to thank Thorbjørn Ingvaldsen (NBI) for insightful comments about the nature of construction activities. Finn R. Førsund (University of Oslo) has followed the whole research process and offered detailed comments. Sverre A.C. Kittelsen (Frisch Centre) has been invaluable in the development of the software for bootstrapping, and for giving detailed comments. Comments from Lennart Hjalmarsson, Håkan Eggert and Hans Bjurek have highly improved the presentation of this paper. Roger Jensen (Statistics Norway at Kongsvinger) provided the dataset and offered help in the data selection process. Any remaining errors are solely this author's responsibility.

1. Introduction

Low productivity growth of the construction industry in the nineties (based on national accounting figures) is causing substantial concern in Norway. To identify the underlying causes, investigations at the micro level are needed. However, efficiency studies at the micro level of the construction industry are very rare.¹ The objective of this study is to analyze productive efficiency in the Norwegian construction industry. A piecewise linear frontier is used, and technical efficiency measures (Farrell, 1957) are calculated on cross section data following a DEA (data envelopment analysis) approach (Charnes et al., 1978). The DEA efficiency scores are bias corrected by bootstrapping (Simar and Wilson, 1998, 2000), and a bootstrapped scale specification test is performed (Simar and Wilson, 2002). A new contribution is to use weights based on the standard errors from the bootstrapped bias correction in the two stage model when searching for explanations for the efficiency scores.
One reason for the small number of efficiency analyses of the construction industry may be the difficulty of "identifying" the activities in terms of technology, inputs and outputs in this industry. It is well known that there are large organizational and technological differences between building firms. Even when the products are seemingly similar there are large differences in the way projects are carried out. For instance, some building projects use a large share of prefabricated elements, while other projects produce almost everything on the building site. This often happens even when the resulting construction is seemingly similar. It is interesting to note that projects with such large differences in the technological approach can exist at the same time. Moreover, the composition of output varies a lot between different construction companies, so the definition of the output vector may also be a problem. Thus, to capture such industry characteristics, a multiple input multiple output approach is required.

A debated issue is whether an efficiency analysis should be carried out at the project level or at the firm level. In many ways it is more natural to think of the project level as the decision making unit (DMU) in this industry. In addition it might be easier to find relatively homogenous projects than firms. A third aspect is that when one tries to explain any efficiency differences it is likely that there are bigger differences between the projects than between the firms when it comes to choice of construction technology. However, the required data for studying productivity at the project level is not (yet) available, so the firm level is the only available level for an efficiency study of the construction industry at the micro level in Norway.² It should be noted that the firm level should not necessarily be seen as a higher aggregation than the project level. It is not unusual that a project in this industry is larger than any of the participating firms, and quite often a large project can span two or three accounting years.

The rest of the paper is laid out as follows. Section 2 gives a brief overview of the methods used in this paper. The main ideas are explained, notation is introduced, and the most central references are listed. In Section 2.4 a new approach is developed that explains the possible benefits of using weighted regression in a two stage DEA setting. Section 3 deals with how the data used in this paper was collected and processed. Selection of the scale specification in the DEA model is the topic of Section 4. In Section 5 results of the DEA efficiency calculations are reported, and the effects of correcting the efficiency scores for bias are shown. Some interesting hypotheses that might explain some of the differences in the firms' efficiency scores are investigated in Section 6. Section 7 rounds off the paper with a summary.

¹ Two Scandinavian studies are Jonsson (1996), which looked at construction productivity at the project level, and Albriktsen and Førsund (1991), which investigated the efficiency of Norwegian construction firms. The latter was based on a parametric frontier approach specifying only one output.
² Collecting data at the project level is part of the research within "Productivity in Construction."

2. The methods

The efficiency scores in this paper are calculated with DEA and then bias corrected with bootstrapping. The model selection is also done with the help of bootstrapping, while the statistical power of the stage two regression is increased by taking advantage of the standard errors of the bias corrected efficiency estimates.

2.1 Data Envelopment Analysis (DEA)

The idea behind DEA is to use the closest possible piecewise linear envelope of the actual data as an estimate of the border of the production possibility area. A more detailed explanation than is given here can be found in e.g. Cooper et al. (2000). The efficiency of an observation (often referred to as a "DMU," Decision Making Unit, in the DEA literature) is calculated as the relative distance to the frontier. The efficiency score is a number between 0 and 1, and the units positioned on the frontier are assigned the efficiency score of 1. Input saving efficiency is a measure of how much it is possible to simultaneously reduce all inputs, while the outputs are at least the same.

[Figure 1: The DEA method illustrated in two dimensions. Axes: Input (horizontal) and Output (vertical). The figure shows the VRS frontier and the CRS frontier, the units A, B, C, D, F, G and P1, and the points h, k, m and n. Legend: reference point for unit P1 (output increasing); self-evaluator (interior); self-evaluator (exterior); reference zone for unit D (shaded area); peers for unit P1 (output increasing); reference point for unit P1 (input saving, weighted); reference point for unit P1 (input saving, radial).]

Banker et al.
(1984) formalized the axioms that an envelopment should satisfy, and showed that the DEA production possibility set is the smallest set that satisfies the following assumptions ($x_1$, $x_2$ are vectors of inputs; $y_1$, $y_2$ are vectors of outputs):

1) All observations are possible: If we observe $(x_1, y_1)$, then it is possible to produce $y_1$ with the use of $x_1$.
2) Convexity: If $(x_1, y_1)$ and $(x_2, y_2)$ are observed, then $a\,(x_1, y_1) + (1-a)\,(x_2, y_2)$ is possible for all $a$ in [0,1] (this is true when assuming Variable Returns to Scale (VRS). When assuming Constant Returns to Scale (CRS) any positive $a$ is allowed).
3) Free disposal: Higher usage of resources always means that it is possible to produce the same or a larger amount of products. It is also always possible to produce fewer products with the same amount of resources.

In Figure 1 the most important concepts in DEA are illustrated. A, B, C, D, F, and G are DMUs that in DEA would be calculated as technically efficient when we assume VRS technology, while P1 is technically inefficient. With CRS only the DMU with the highest output/input ratio would be considered technically efficient, because in this case the productivity of all units is compared independently of the size of the DMUs. In the following I will concentrate on the VRS production frontier in Figure 1. The reason is that some interesting aspects of the DEA method apply to CRS only if we have more than two dimensions. This again is because under CRS in two dimensions all units are compared to the same facet (= the part of the efficiency frontier that the inefficient units can be compared to; each linear part of the frontier), but with VRS we typically have more than one facet.
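The difference between the two frontiers can be made concrete with a small numerical sketch. The code below is illustrative only: it covers the single-input, single-output case of Figure 1, where the input saving efficiency can be found by enumerating frontier segments rather than by solving a linear program, and the four (input, output) pairs are made-up numbers, not data from this study. The function names are hypothetical.

```python
def crs_input_efficiency(units):
    """Input saving efficiency under CRS for (input, output) pairs.

    With one input and one output, every unit is compared with the
    unit that has the highest output/input ratio.
    """
    best_ratio = max(y / x for x, y in units)
    return [(y / x) / best_ratio for x, y in units]


def vrs_input_efficiency(units):
    """Input saving efficiency under VRS for (input, output) pairs.

    In two dimensions the reference point is either a single unit or a
    convex combination of two units, so all candidates can be enumerated.
    """
    scores = []
    for x_i, y_i in units:
        # Single units that produce at least y_i (free disposal of output).
        candidates = [x for x, y in units if y >= y_i]
        # Convex combinations of a unit below y_i and a unit at or above y_i
        # that produce exactly y_i.
        for x_lo, y_lo in units:
            if y_lo >= y_i:
                continue
            for x_hi, y_hi in units:
                if y_hi < y_i:
                    continue
                a = (y_hi - y_i) / (y_hi - y_lo)
                candidates.append(a * x_lo + (1 - a) * x_hi)
        scores.append(min(candidates) / x_i)
    return scores


# Hypothetical (input, output) observations; the last unit is inefficient.
units = [(2.0, 2.0), (4.0, 6.0), (8.0, 8.0), (6.0, 4.0)]
crs = crs_input_efficiency(units)
vrs = vrs_input_efficiency(units)
for (x, y), e_crs, e_vrs in zip(units, crs, vrs):
    print(f"x={x}, y={y}: CRS={e_crs:.3f}, VRS={e_vrs:.3f}")
```

In this made-up example three units lie on the VRS frontier but only one on the CRS frontier, and no unit scores higher under CRS than under VRS, which is the pattern described in the text.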
The efficiency measures can be set up mathematically as Linear Programming (LP) problems in the following way:³

$E_1$: Input oriented VRS efficiency can be calculated by solving the following LP problem for each DMU. For unit P1 in Figure 1 this equals $x_A/x_1$.

\[
\begin{aligned}
E_{1i} \equiv \min\ & \theta_i \\
\text{s.t.}\quad & \sum_{j \in P} \lambda_{ij} y_{mj} - y_{mi} \ge 0, \quad m \in M \\
& \theta_i x_{ni} - \sum_{j \in P} \lambda_{ij} x_{nj} \ge 0, \quad n \in N \\
& \lambda_{ij} \ge 0, \quad \sum_{j \in P} \lambda_{ij} = 1
\end{aligned}
\tag{1}
\]

The reference point for DMU $i$ is $\left(\sum_{j \in P} \lambda_{ij} x_{nj},\ \sum_{j \in P} \lambda_{ij} y_{mj}\right)$. DMU A in Figure 1 is the input saving reference point for DMU P1. Point "m" is the radial projection point on the VRS frontier, and does not take advantage of the possibility to increase output in addition to the reduction of input.

$E_2$: Output oriented VRS efficiency can be calculated for each DMU by solving the following LP problem. For unit P1 in Figure 1 this equals $y_1/y_n$.

\[
\begin{aligned}
1/E_{2i} \equiv \max\ & \phi_i \\
\text{s.t.}\quad & \phi_i y_{mi} - \sum_{j \in P} \lambda_{ij} y_{mj} \le 0, \quad m \in M \\
& \sum_{j \in P} \lambda_{ij} x_{nj} - x_{ni} \le 0, \quad n \in N \\
& \lambda_{ij} \ge 0, \quad \sum_{j \in P} \lambda_{ij} = 1
\end{aligned}
\tag{2}
\]

$E_3$: Efficiency assuming CRS can be calculated based on either (1) or (2) if we remove the constraint in the last line which demands the sum of the weights to equal one. For unit P1 in Figure 1 this equals either $x_h/x_1$ or $y_1/y_k$; input and output orientation both return the same number when CRS is assumed.

$E_4$: Input reducing scale efficiency equals $E_3/E_1$.

$E_5$: Output increasing scale efficiency equals $E_3/E_2$.

³ See Førsund and Hjalmarsson (1979) where these measures are defined in the general case, independently of the choice of frontier estimation methodology.

2.2 Estimating sampling bias using bootstrapping

It is well known that empirical estimations of the DEA models defined in the formulas above are influenced by sampling bias (Simar and Wilson, 1998).
The reason is that the DEA estimate of the production frontier is based on a convex combination of best practice observations. If we had sampled all possible DMUs generated by the same underlying Data Generating Process (DGP), we would expect to get a new production possibility area that is strictly outside the DEA estimate. The sampling bias for a given DMU can be expected to be higher the lower the number of other observations in the sample. The DEA frontier estimate is based on the best observed practice. But this is a biased estimate of the best possible practice in any real world (finite sample) situation.

The following DGP is assumed (Simar and Wilson, 1998): observations are randomly drawn from the true production possibility area. There is a strictly positive probability of drawing observations close to all parts of the true production frontier, and the DEA assumptions (no measurement error, convexity, free disposability) hold. A homogenous efficiency distribution⁴ is assumed in the following, but this can be relaxed with a more complicated DEA bootstrap methodology (Simar and Wilson, 2000).

Banker (1993) proved that as the number of draws goes towards infinity, the distance between the DEA estimate and the true efficiency score goes towards zero.⁵ In other words, the DEA estimator is consistent. But it is biased in finite samples. The reason is that there is zero probability that a finite number of sampled observations will span the entire outer edge of a continuous production possibility area.

The true efficiency of a DMU is the relative radial distance from the DMU to the true production frontier.
The DEA estimated efficiency of the same DMU is the relative radial distance from the DMU to the estimated production frontier. The difference between these two distances is the sampling bias. One thing we know is that it is strictly positive, in the sense that the DEA estimated efficiency is higher than the true efficiency.

⁴ An efficiency distribution is denoted as homogenous when it is independent of input mix, output mix, and scale.
⁵ Banker's paper showed this formally in the single-output multiple-input case. This has later been generalized to the more general multiple-inputs multiple-outputs case (Kneip et al., 2003).

Simar and Wilson (1998) showed how to estimate the sampling bias in DEA with a method referred to as "bootstrapping" (Efron, 1979). Bootstrapping is in general a way of testing the reliability of the dataset, and works by creating pseudoreplicate datasets using resampling. Bias correction in DEA using bootstrapping is based on the following assumption:

\[
(E_1 - E_1^{*}) \;\overset{\text{approx.}}{\sim}\; (E_1^{*} - E_1^{**}),
\tag{3}
\]

where $E_1$ is the true unknown (input oriented VRS) efficiency, $E_1^{*}$ is the original DEA efficiency estimate, and $E_1^{**}$ is the bootstrapped efficiency estimate. We cannot directly calculate the left hand side of equation (3) since the true production frontier is unknown. However, it can be approximated by running computer simulations based on the right hand side of the same equation. This is possible since both the DEA efficiency scores and the linear programs that created them are known.
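Rearranged, assumption (3) says that the true efficiency is approximately twice the original estimate minus the bootstrapped estimate. The following sketch shows that arithmetic for one DMU. It is an illustration under stated assumptions, not the software used for this paper: the bootstrapped estimate is taken to be the average over the replicates, the standard error is taken to be the sample standard deviation of the replicates, and the numbers are made up.

```python
from statistics import mean, stdev


def bias_correct(original_score, bootstrap_scores):
    """Bias-correct one DEA score from its bootstrap replicates.

    Uses assumption (3): (E - E*) is approximated by (E* - E**),
    so the corrected score is 2 * E* - E**.
    """
    bootstrap_mean = mean(bootstrap_scores)       # E** (average of replicates)
    bias = bootstrap_mean - original_score        # estimated upward bias of E*
    corrected = original_score - bias             # equals 2 * E* - E**
    # Assumption: spread of the replicates used as the standard error.
    std_error = stdev(bootstrap_scores)
    return corrected, bias, std_error


# Hypothetical numbers: an original DEA score and five bootstrap replicates.
original = 0.80
replicates = [0.84, 0.86, 0.85, 0.83, 0.87]
corrected, bias, std_error = bias_correct(original, replicates)
print(f"bias={bias:.3f}, corrected={corrected:.3f}, std_error={std_error:.4f}")
```

In practice far more than five replicates would be used; the short list only keeps the arithmetic easy to follow.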
The homogenous bootstrap used in this paper can be calculated with the following algorithm (inspired by Simar and Wilson, 1998):

a) Use the original dataset and calculate the DEA efficiency scores.
b) Create a Kernel Density Estimate⁶ (KDE) of the efficiency scores from (a).
c) Move all the DMUs to their comparison point on the frontier.
d) Create a pseudo-dataset by dividing the input values from (c) by values obtained by drawing randomly from the KDE in (b) with reflection (Silverman, 1986).
e) Calculate a new series of efficiency scores on the pseudo-dataset in (d).
f) Repeat (d)-(e) a large number of times (2000 is recommended by Simar and Wilson, 1998).

For $E_1^{**}$ in (3) the average value of the efficiency scores in (f) is used. KDE is used to smooth the empirical distribution of the original efficiency scores (bootstrapping without smoothing is referred to as "naïve bootstrapping"). Reflection is used to deal with the boundary condition that is problematic for nonparametric density estimation. The reason is that the KDE smoothing typically places part of the smoothed density at values greater than 1.

Denote the difference between $E_1^{*}$ and $E_1^{**}$ with Bias*. Based on this we can create a bias corrected efficiency estimate. Silverman (1993) gives a warning against using bias correction carelessly. The danger is that the bias corrected estimator might have a substantially greater standard error than the original estimator. The result is that we might end up with a new estimator that is unbiased, but at the same time "more wrong on average" than the original biased estimator (larger MRSE). See Simar and Wilson (1998) for a more detailed description.
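Step (d) of the algorithm above can be sketched as follows. This is a simplified illustration, not the procedure of Simar and Wilson (1998) in full: it assumes a Gaussian kernel, reflects draws above 1 back below 1, redraws the (rare) non-positive values, and leaves out the variance rescaling used in the literature. The function names, the bandwidth value and the data are all hypothetical.

```python
import random


def draw_with_reflection(scores, bandwidth, n_draws, rng):
    """Draw pseudo-efficiencies from a Gaussian KDE of `scores`.

    Drawing from a Gaussian KDE amounts to picking one observed score
    at random and adding bandwidth-scaled normal noise. Values above 1
    are reflected around 1 so that every draw stays in (0, 1].
    """
    draws = []
    while len(draws) < n_draws:
        value = rng.choice(scores) + bandwidth * rng.gauss(0.0, 1.0)
        if value > 1.0:
            value = 2.0 - value          # reflection at the boundary
        if value > 0.0:                  # assumption: discard non-positive draws
            draws.append(value)
    return draws


def pseudo_inputs(frontier_inputs, draws):
    """Divide frontier-projected inputs by the drawn pseudo-efficiencies."""
    return [x / d for x, d in zip(frontier_inputs, draws)]


rng = random.Random(12345)                       # fixed seed for reproducibility
dea_scores = [1.0, 0.92, 0.85, 1.0, 0.71, 0.64]  # hypothetical scores from step (a)
frontier_x = [10.0, 12.0, 8.0, 15.0, 9.0, 11.0]  # hypothetical inputs from step (c)
draws = draw_with_reflection(dea_scores, 0.05, len(frontier_x), rng)
new_x = pseudo_inputs(frontier_x, draws)
```

Because each draw is at most 1, every pseudo-input is at least as large as the frontier input it was generated from, so the pseudo-observations lie on or inside the estimated frontier.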
One difficulty with the algorithm above is that the kernel density estimation requires two parameters: the kernel function (e.g. Gaussian) and the bandwidth parameter (determining the length of the tails of the kernel function). In practice, the choice of the kernel is not nearly as important as the choice of the bandwidth. The theoretical background of this observation is that kernel functions can be rescaled such that the difference between two kernel density estimates using two different kernels is almost negligible (Marron and Nolan, 1988).

⁶ Kernel Density Estimation is a way of getting a smoother estimate of an empirical distribution (when the true shape of the distribution is unknown). See Silverman (1986) for details.

In the kernel literature it is documented that the standard formulas for choosing bandwidths pick too large a bandwidth if the distribution is multi-modal or highly skewed (Silverman, 1986). Since the latter may well be the case for efficiency distributions we avoid using the normal reference rule. Applying leave-one-out cross validation⁷ would probably have been the best alternative (Efron and Tibshirani, 1993). The main reason is that selecting bandwidth based on a predetermined mathematical formula would be less subjective.⁸ Bandwidth in this paper is selected using visual inspection of the kernel density estimate. This is done using an interactive tool⁹ created in Object Pascal for visually inspecting the effects of different bandwidths. The bandwidth selected was 1.0. The effects of choosing other bandwidths (0.5 and 1.5) have also been examined.
It made quite a large difference for a low number of units, but the difference in the overall distribution of the efficiency scores was relatively small.

⁷ Leave-one-out cross validation is a technique to investigate the probability that a certain observation was drawn from the same underlying population as the rest of the sample. See Silverman (1986) for details.
⁸ As far as is known, there is not any generally available tool for bandwidth selection based on cross validation.
⁹ This is a computer program (for the Win32 platform, or Linux under "Wine") developed by Dag Fjeld Edvardsen.

2.3 Testing the scale specification using bootstrapping

As pointed out in Simar and Wilson (1998) the question of whether the production possibility set exhibits CRS has not only economical but also statistical importance. If the true technology is globally CRS then both $E_3^{*}$ and $E_1^{*}$ are consistent estimators of the true $E_3$, but $E_1^{*}$ might be less efficient than $E_3^{*}$ in a statistical sense due to slower convergence. Simar and Wilson (1998) suggest several tests of scale specification using a bootstrapped test. One alternative is the mean of the ratios (with their notation):¹⁰

\[
\hat{S}_1^{crs} = n^{-1} \sum_{i=1}^{n} \hat{D}^{crs}(x_i, y_i) \,/\, \hat{D}^{vrs}(x_i, y_i)
\tag{4a}
\]

Using the Førsund and Hjalmarsson (1979) notation (Section 2.1 in this paper):

\[
\hat{S}_1^{crs} = n^{-1} \sum_{i=1}^{n} \hat{E}_{3i}(x_i, y_i) \,/\, \hat{E}_{1i}(x_i, y_i) = n^{-1} \sum_{i=1}^{n} \hat{E}_{4i}(x_i, y_i)
\tag{4b}
\]

¹⁰ It might be worth mentioning that the notation in Simar and Wilson's (1998) formula 4.1 is a bit unclear since that paper uses a "hat" symbol on top of both the numerator and the denominator. However, if one reads their paper carefully, one will see in the text that they clearly state that it is the ratio that should be estimated for each of the iterations, not the numerator and the denominator in separate iterations. This is to ensure simultaneous estimation.

The question is whether the average scale efficiency we observed (using uncorrected DEA; $E_4^{*}$) could have been generated by a CRS technology. An attempt to answer this is made by running a bootstrap simulation where we assume that the true technology is CRS. In each of the iterations we record the average value of $E_4$. If the average $E_4^{*}$ that we originally calculated using DEA is outside the given density range, e.g. 95%, then we choose to discard the $H_0$ that "The true technology exhibits CRS" and use VRS instead.

In addition to the test above, Simar and Wilson (1998) suggest several other tests; among these was the ratio of the means:

\[
\hat{S}_2^{crs} = \sum_{i=1}^{n} \hat{E}_{3i}(x_i, y_i) \,\Big/\, \sum_{i=1}^{n} \hat{E}_{1i}(x_i, y_i)
\tag{5}
\]

They end up recommending the ratio of the means ($S_2$) since it performs best in the Monte Carlo tests. But the mean of the ratios ($S_1$) performs almost as well, and has an intuitive geometric interpretation. In this paper both $S_1$ and $S_2$ will be calculated in the scale specification test.

2.4 Weighted regression in stage two

In the empirical DEA literature it is common to use a "two stage" approach¹¹ when efficiency estimates are to be both measured and "explained." The first stage refers to the DEA calculation of efficiency scores, based on the data on inputs and outputs.
In the second stage it is investigated whether the efficiency scores from stage one are empirically correlated with other variables we believe may "explain" the efficiency scores. The variables used in stage two are typically environmental or managerial variables (both discretionary and non-discretionary variables are commonly used). This possible "empirical correlation" is investigated using multivariate regression models with the efficiency score on the left side of the equation, though other approaches can also be used.

¹¹ See Førsund and Sarafoglou (2002) for a historical account of the origin of the two stage approach.

It is often argued that one should do the estimation of both the efficiency explanatory variables and the efficiency itself in the same stage. This argument for improving statistical efficiency is frequently put forward in "standard econometrics." However, there might be situations where that is not possible or desirable. If it can be assumed that the explanatory variables affect the production in a different way than the regular inputs, i.e. that the explanatory variables do not influence the rate of substitution between the latter, then one might not lose statistical efficiency by using the two stage approach. The explanatory variables might have the character of general shift factors.

Another reason for choosing a two-stage approach is if the explanatory variables are too "rough" for DEA. The assumptions behind the DEA model do not allow measurement error, even in those cases where the measurement errors can be assumed to be symmetrically distributed.
A second stage regression model will in such a situation be more robust than DEA. In addition there are more tools available for doing diagnostics and for correcting possible problems.

A reason that has been mentioned when these discussions arise is that the explanatory variables are non-discretionary. However, this is not a good motivation to avoid a single stage approach. Several of the generally available DEA software packages allow making one or more of the included variables "fixed" (non-discretionary). An additional argument is that we might not know if the variable is an input or an output (this has been addressed in Simar and Wilson, 2001). Lastly, it might be difficult to include the variable in the single-stage DEA calculation if we have reasons to believe that the relationship between this variable and the efficiency score is not monotonic. It might be partly dealt with either by transformation of data, or possibly by relaxing the assumption of free disposability (allowing for congestion).

Given that a two stage approach is chosen it will be more efficient, statistically speaking, to bring with us the inverse of the standard errors over to stage two. See for instance Carroll and Ruppert (1988) for a more detailed description of using weighted regression from an econometric perspective. The motivation for using weighted regression is that we have different degrees of certainty when it comes to the precision of the estimation of the efficiency scores in stage 1. For this reason we want to put a larger weight on the more precise observations when we fit the regression hyperplane to the data.
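The weighting idea can be sketched for the simplest case of one explanatory variable. The code below is a hedged illustration, not the estimation routine used in this paper: it assumes that the weight of each observation is the inverse of its standard error and that this weight multiplies the squared residual in the least squares criterion. The function name and all numbers are hypothetical.

```python
def weighted_least_squares(x, y, std_errors):
    """Fit y = intercept + slope * x by weighted least squares.

    Each observation gets weight 1 / std_error (an assumption about how
    the weights enter), and the weighted sum of squared residuals is
    minimized using the closed-form solution for one regressor.
    """
    weights = [1.0 / se for se in std_errors]
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar)
               for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical stage-two data: the last score is far off the pattern of the
# others and also has a much larger bootstrap standard error.
x = [0.0, 1.0, 2.0, 3.0]
scores = [0.5, 0.6, 0.7, 1.2]
equal_fit = weighted_least_squares(x, scores, [1.0, 1.0, 1.0, 1.0])
weighted_fit = weighted_least_squares(x, scores, [0.01, 0.01, 0.01, 1.0])
print("equal weights:   ", equal_fit)
print("inverse-SE fit:  ", weighted_fit)
```

With equal weights the imprecise observation pulls the fitted slope well away from the pattern of the other three; with inverse standard error weights its influence is much smaller.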
This means that there is a greater penalty if the hyperplane is far away from observations with a high weight than for those with a low weight. When we do the bootstrapped bias correction as described in Section 2.2 we get not only bias corrected point estimates, but also standard errors. These standard errors are (given the assumptions) good measures of how certain we are of the value of each of these point estimates. Since higher standard errors mean lower certainty, the inverse of the standard errors will be used as weights in the stage two regression.

Simar and Wilson (2003) describe two possible approaches for how estimation can be done in a statistically consistent way in DEA in a two-stage setting. They suggest using either a single or a double bootstrap approach, and then argue that the latter is preferable since it has a more rapidly declining MRSE of the intercept and the slope in the regression. Comparing these methods (analytically or using Monte Carlo simulations) with weighted regression, and discovering which of them performs best, is a task for further research.

One of the motivations behind this paper is to investigate causes for the efficiency differences among firms in the Norwegian building industry. As described in this section (Section 2.4), weighted regression in a two stage setting will be used, and the reason is to reduce the influence of the bias corrected efficiency scores with a large estimated standard error. The view of the current paper is that an unbiased estimator might be useful even if it has an estimated MRSE larger than the original biased one.
One example is to use it in a second stage regression (with weights based on the inverse of the standard errors), which is done in Section 6.3.

3. The data

In 2001 the Norwegian building industry consisted of about 34 500 enterprises, and employed about 132 500 persons (about 10 percent of the Norwegian labour force). From 2000 to 2001 there was an 8.7% growth in turnover and 7.4% growth in employee compensation. The efficiency calculations in this paper must be seen in the light of the fact that the industry experienced strong growth in the year under investigation.

The primary data on the building enterprises is collected on a yearly basis by Statistics Norway. All the firms in the dataset used in this paper have a NACE code of 45.211. This means that at least 50% of their production value is in the category "construction of buildings." The sample collected by Statistics Norway consists of all enterprises with more than 100 employees, and a sample of the smaller enterprises. The sample contains at least 30% of the total employment in the NACE 45.211 subgroup.

Based on data for each building enterprise we have created a cross section database on production and resource usage for the most recent year available (2001). A rather extensive set of input and output data were available based on annual company accounts and the structural survey conducted by Statistics Norway. After extensive discussions with Statistics Norway and sector experts the input-output specification was selected. Output is measured as value split on three different categories: Residential Buildings, Non-Residential Buildings, and Civil Engineering. The three inputs are External Expenditure, Labor, and Real Capital. Details are laid out in Section 3.2.
3.1 Data quality filters

Statistics Norway has several routines for detecting and correcting erroneous data. This should help improve the quality of the data. However, the data collected in the yearly surveys is for general purposes, and the definition of which observations we believe to have good enough quality depends on what they are to be used for. Productivity measurement with frontier models is especially sensitive to outliers. When we use the DEA model we formally assume that there are no measurement errors in the data we feed into the efficiency estimation model. However, it is important to avoid shaping the results to confirm a priori suspicions. This is especially important in a frontier setting, because the frontier-defining units are by definition outliers. But experience with empirical DEA applications strongly suggests that not cleaning the data for suspicious units can lead to very questionable and sometimes absurd results. Very influential units should be checked extra carefully, since errors in these DMUs can strongly influence the efficiency estimate for a large number of other DMUs. [12]

It was required that all the observations used in the DEA model should be able to meet the following three requirements.
(1): At least 90% of the production (measured in total value) has to be from the construction industry. [13] 15.8% of the companies did not meet this requirement.
(2): All three inputs must be greater than zero. 23.6% of the companies did not meet this requirement.
(3): The observed usage of labor in man-years must be larger than or equal to one. [14] 4.9% of the dataset did not meet this requirement.
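The three requirements can be expressed as a simple screening function. The following is a minimal sketch in Python, assuming a hypothetical record layout: the class `Firm` and its field names are illustrative and are not taken from the Statistics Norway data. It only restates the thresholds given above.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    # Hypothetical field names, chosen for illustration only.
    construction_share: float    # share of production value from construction (0 to 1)
    external_expenditure: float
    labor_man_years: float
    real_capital: float


def failed_requirements(firm):
    """Return the numbers of the requirements (1, 2, 3) that the firm fails."""
    failed = []
    # (1) At least 90% of the production value must come from construction.
    if firm.construction_share < 0.90:
        failed.append(1)
    # (2) All three inputs must be strictly positive.
    inputs = (firm.external_expenditure, firm.labor_man_years, firm.real_capital)
    if min(inputs) <= 0:
        failed.append(2)
    # (3) Labor usage must be at least one man-year.
    if firm.labor_man_years < 1:
        failed.append(3)
    return failed


def passes_filters(firm):
    """A firm is kept only if it fails none of the three requirements."""
    return not failed_requirements(firm)
```

Returning the list of failed requirements, rather than a single flag, makes it possible to count how many firms fail each requirement and how many fail more than one.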
3.2% of the companies failed to meet more than one of the three requirements.

After this automatic cleaning, five more units were removed from the dataset after test runs of the DEA model (using a VRS specification). They showed up as strongly influential (large peer index, see Torgersen et al., 1996) and with high superefficiency [15] (see Andersen and Petersen, 1993). Originally superefficiency was used as a way to rank among the efficient units, but recently it has more often been seen as a way of detecting strange units. Removing strongly influential units if they are radial outliers might be questionable, but is probably the least evil choice.

[12] In Torgersen et al. (1996) the “Peer index” was introduced. This measures the influence of each of the peers in the DEA estimation relative to how large a share of the improvement potential (for each of the dimensions) this peer is referencing. The calculation of the Peer index is based on the optimal weights in the DEA calculation. The maximum value is 100%, and can only be attained if this peer references all potential improvements in this dimension. The Peer index is a useful measure of the influence of each of the peers in the dataset.
[13] The number 90% is ad hoc, but is selected to make sure that the building firms are homogeneous in the sense that none of them are allowed to have a large share of their sales outside the building industry.

3.2 Descriptive statistics for the primary dataset

The resource usage of the building entrepreneurs is captured by three inputs (the three first columns in Table 1): External Expenditure includes materials, subcontractors, energy, transportation etc.
Labor in Man-Years is a measure of the labor usage. Real Capital is a measure of capital service based on the use of production equipment, machines, etc. It is calculated from rental expenditures and depreciation. The last three columns of Table 1 contain summary statistics on the production of the building entrepreneurs. Residential is a measure of the sales value of the residential and recreational buildings. Non-Residential is a measure of the sales value of other buildings, such as office buildings and institutional buildings (schools, prisons, hospitals etc). Civil Engineering measures the sales value of constructions such as roads, tunnels, harbors, etc.

Because of the data filters all three inputs have strictly positive values, and the lowest value for Labor in Man-years is 1. The lowest value for all the three product variables is 0, but all firms have a strictly positive sum of output values. Civil Engineering is clearly the output with the lowest number of strictly positive output values, and is also the variable with the largest CV-number. Concerning the size distribution, 39% of the firms use less than 10 man years, 47% use between 10 and 50 man years, 11% use between 50 and 100 man years, while 5% use more than 100 man years. The average firm has close to 41 employees and uses 38 man years.

Table 1: Descriptive statistics for the primary variables (342 observations after the data cleaning).

            External Exp.   Labor in Man-years   Real Capital   Residential   Non-Residential   Civil Engineering
Minimum              18.0                  1.0            2.0           0.0               0.0                 0.0
Maximum       4 083 634.0              2 950.0      239 847.0   1 597 609.0       2 556 175.0         1 170 239.0
Average          44 968.6                 38.3        1 970.1      27 141.4          32 160.2             5 688.4
St.dev          233 063.7                167.2       13 655.7     113 635.8         147 942.9            66 574.4
Count >0              342                  342            342           301               218                  31
CV [16]               5.2                  4.4            6.9           4.2               4.6                11.7

[14] The reason is that only real production firms are included. Firms not meeting this demand may be pure accounting units, or may be newly started, closing down or in hibernation.
[15] Superefficiency is a measure of the relative radial distance from the origin to the DMU in question, when the frontier is estimated without this DMU included in the dataset. Superefficiency is by construction greater than (or equal to) one. A superefficiency value of 1.2 implies that the DMU is positioned “20% outside” where the frontier would have been without this DMU (in a radial sense).
[16] CV is the Coefficient of Variation. It is defined as the ratio of the standard deviation and the average.

4. Choosing model specification

In Section 2.3 it is explained how one can use the bootstrapping methodology to help select the scale specification of the DEA model. If the scale efficiency (E4) in the original DEA model is outside the (95%) one sided lower confidence interval we reject the H0 that the technology is CRS and apply VRS instead. In Figure 2 the bootstrapped simulations required are plotted in a histogram. If the null hypothesis were true we would expect the observed E4 from the uncorrected DEA estimation to be located inside the 95% confidence interval. The observed E4 is 0.777 and we get a strong rejection of H0. The histogram of S2 is practically identical. In both cases (S1 and S2) we get a very solid rejection of the null hypothesis (“The true production technology is globally CRS”).
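The decision rule described above can be sketched as follows. This is only an illustration under stated assumptions: the bootstrap replicates of the scale statistic are taken as given (generating them requires the DEA bootstrap of Section 2.3, which is not reproduced here), the function name `scale_test` is hypothetical, the replicates in the example are made up, and the simple order-statistic quantile is one of several possible conventions.

```python
def scale_test(observed, bootstrap_stats, alpha=0.05):
    """One-sided lower-tail test of H0 (CRS) from bootstrap replicates.

    observed        -- scale statistic from the original DEA run
    bootstrap_stats -- replicates of the statistic simulated under H0
    alpha           -- significance level (0.05 gives the 95% one-sided interval)
    """
    ordered = sorted(bootstrap_stats)
    # Lower critical value: the empirical alpha-quantile of the replicates.
    index = min(int(alpha * len(ordered)), len(ordered) - 1)
    critical = ordered[index]
    # Share of replicates at or below the observed value (empirical p-value).
    p_value = sum(1 for s in ordered if s <= observed) / len(ordered)
    # Reject CRS when the observed value falls below the critical value.
    reject_crs = observed < critical
    return reject_crs, critical, p_value


# Made-up replicates, evenly spread between 0.900 and 0.999.
replicates = [0.900 + 0.001 * i for i in range(100)]
reject, critical, p = scale_test(0.777, replicates)
```

With these made-up replicates, an observed value of 0.777 lies below every replicate, so the sketch rejects CRS, which mirrors the qualitative situation described for Figure 2.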
Based on this result we will in the following assume that the technology exhibits variable returns to scale.

Figure 2: Histogram of the bootstrapped distribution of the average scale efficiency (S1) assuming CRS. [Histogram; horizontal axis: S1; vertical axis: Fraction.]

In this paper the only statistical tools for choosing the correct model are the ones designed for tests of scale specification. A similar set of bootstrapped tests for model specification could also be used for selecting which variables should be included. This line of thought is based on Banker (1993, 1996) and Kittelsen (1993), but it would be better from a statistical perspective to perform these tests using the bootstrap methodology. However, it is important to select the model based on economic theory and the knowledge of the sector we are investigating, not purely on statistical tests.

5. Estimating the efficiency scores

5.1 DEA efficiency scores

The figures showing the uncorrected DEA efficiency scores (E1 and E3) will only be commented on briefly since the numbers change greatly when correcting for sampling error using the bootstrap method. However, it is important to point out that almost all of the published DEA papers stop with calculating only the DEA efficiency scores, and do not estimate and correct for sampling bias. It will be shown that this makes a big difference for the interpretation of the results. Refer to Section 2.2 for an explanation of bias correction.

Figure 3: Uncorrected efficiency scores assuming variable returns to scale. [Efficiency diagram; horizontal axis: Size in man-years; vertical axis: E1.]

Figure 3 shows the uncorrected efficiency scores, assuming VRS, in an Efficiency diagram.
One interesting feature of Efficiency diagrams [17] is that both the height and the width of the bars can contain information, unlike a bar chart where only the heights of the bars are actively used. This is especially useful when illustrating the results of efficiency analysis. The efficiency of each of the DMUs is shown by the height of the bar, while its economic size (man-years in Fig. 3) is shown by the width of the bars. This means that it is possible to examine whether there are any systematic correlations between the sizes of the units and their efficiencies. Another interesting geometric aspect of these figures is that they are sorted according to increasing efficiency from left to right. The distance from the top of each bar to 1.00 is a measure of that particular DMU’s inefficiency, and the width of the bar is a measure of its economic size. For this reason the area above each of the bars is proportional to the economic cost of that DMU not being 100% efficient. This means that there will typically be a “white triangle” above the inefficient units, and that the size of this area is proportional to the economic cost of the total inefficiency in the sample. The software used to construct these graphs is an “add-in” for Microsoft Excel. [18]

Figure 4: Technical productivity (E3). [Efficiency diagram; horizontal axis: Size in man-years; vertical axis: E3.]

[17] This diagram was first introduced in Førsund and Hjalmarsson (1979), and is based on the input coefficient diagram in Salter (1960).
[18] This add-in (MS Excel under Win32) is authored by Dag Fjeld Edvardsen, and will be made available for free from http://home.broadpark.no/~dfedvard.

When it comes to interpreting Figure 3 there are two aspects that are dominating. The first is that the average efficiency is quite high (83.44% to be exact), and the second is that all of the largest units have been considered fully efficient.

Figure 4 shows efficiency under CRS (E3). E3 is still interesting even when we have chosen VRS as the correct scale assumption. The reason for this is that E3 is also a measure of “technical productivity,” so it is useful even when we don’t believe in it as a measure of efficiency. The efficiency is much lower with E3, and the difference is quite striking for the largest units. However, when comparing Figures 3 and 4, it is important to remember that sampling error has not been taken account of. Since the difference between the LP formulation for E1 and E3 is that the former implies restrictions on the multiplier weight, we have reason to believe that the efficiency measure E1 is more affected by sampling error than the productivity measure E3. A rough explanation is that with E3 all units are potentially compared independent of size, so each DMU has a higher number of units to be compared with.

Figure 5 shows the histogram of the uncorrected efficiency scores from DEA, while Figure 6 shows the same for the bias corrected efficiency scores. It is obvious to the eye that the change of the distribution is dramatic.
The most obvious difference is that the strong concentration of fully efficient units at the right of the histogram disappears when we correct for the estimated sampling bias. In fact, only three DMUs are assigned unit efficiency score after the bias correction.

Figure 5: Histogram of uncorrected efficiency scores (E1). [Histogram; vertical axis: Fraction.]

Figure 6: Histogram of bias corrected efficiency scores (E1). [Histogram; vertical axis: Fraction.]

Figure 7 contains both the bias corrected and the uncorrected efficiency scores in an Efficiency diagram. The bias corrected values are the lower bars, while the uncorrected values are plotted in the upper curve. Both series are sorted independently of each other. It is obvious to the eye that the estimated inefficiency is much larger with the bias corrected values.

Figure 7: Efficiency diagram of bias corrected and uncorrected efficiency scores (E1), sorted separately. [Horizontal axis: Size in man-years; plotted series labelled E3 and Corrected E3.]

Figure 8 is based on Figure 7, but the bias corrected efficiency scores are sorted pairwise with the uncorrected efficiency scores. This allows us to compare the efficiency score for each individual DMU before and after the bias correction, and also examine whether there is any systematic difference when it comes to how the sampling error influences firms of different sizes. Inspection of Figure 8 confirms that all of the large construction firms have a large estimated bias. This is often the case, because the sampling error typically hits the large firms harder in a VRS model. The tendency is relatively strong, as shown by the regression on estimated bias versus production volume (and its square) below.
Figure 8 is quite instructive because it shows the big difference that the bias correction makes for the very large units. This strongly suggests that analyzing scale economies without checking for sampling bias in DEA can give misleading results. In addition, since the large firms very often contribute a large share of the production and resource usage of an industry, measures of efficiency at the aggregated level will tend to be more distorted. The same problem is present for the smallest firms, but this is much more difficult to point out in Figure 8 since the width of the bars is proportional to the size of the firm. An OLS regression between the estimated bias and the size of the firm (measured by the sum of sales and its square) obtains statistical significance for both parameter estimates (and the intercept), and the R-squared is 0.1755.

Figure 8: Efficiency diagram of bias corrected and uncorrected efficiency scores (E1), pairwise sorted. [Horizontal axis: Size in man-years; plotted series labelled E3 and Corrected E3.]

Figure 9: Scatter diagram of bias corrected and uncorrected efficiency scores (E1). [Scatter diagram; one axis labelled CorrE.]

The suggestion that bias tends to be correlated with scale is a little alarming. One reason is that we might get distortions when we investigate the economics of scale. Another reason is that it might not be unusual that the explanatory variables are correlated with the size of the DMU. A statistically significant finding of the correlation between uncorrected DEA and the explanatory variable might be the result, and misinterpretations are in these cases easily made.
The effect might be strongest in VRS, where the effects of scale are supposedly removed from the equation. Practitioners that have carried out regression analysis on the uncorrected VRS efficiency scores without checking for the effects of scale (because it was supposedly already taken care of in VRS) might want to revisit old results and see if this happened in their case.

Figure 9 shows almost the same information as Figure 8, but in a scatter diagram. The information about the size of the units is removed, but on the other hand it is easier to compare the individual changes with and without bias correction. The units with an uncorrected efficiency score of 1.00 are the ones that are changed the most by the bias correction.

Table 2: Original efficiency scores and bias corrected efficiency scores.

Orig. range   Avg CorrE   Obs.
1             0.855         77
0.9-1.0       0.891         69
0.8-0.9       0.809         76
0.7-0.8       0.723         71
0.6-0.7       0.634         28
0.5-0.6       0.513         19
0.4-0.5       0.457          1
0.3-0.4       --             0
0.2-0.3       0.277          1
Sum Obs.                   342

Table 2 shows the original e-scores (not corrected for sampling bias) in its left column. In the middle column are the average bias corrected efficiency scores for the same observations. It is interesting to note that the average bias corrected e-score for the units with unit uncorrected e-score is lower than the e-scores for the group with uncorrected e-score in the range between 0.9 and 1.0. This is a reminder of the large sampling bias associated with the DMUs that were assumed to be fully efficient in the uncorrected DEA calculations. The units we want to learn from are the ones with the highest corrected efficiency scores and the smallest confidence interval.
When choosing between combinations of these two attractive aspects it would probably be wise to focus on the DMUs whose efficiency scores have low standard errors among the DMUs with quite high corrected e-scores. This is done in contrast to being totally focused on choosing the DMU(s) with the very highest corrected efficiency scores.

Table 3 displays the descriptive statistics for the VRS (E1) and CRS (E3) efficiency scores, with and without bias correction. Before examining the numbers, one might expect the difference between the uncorrected and corrected average efficiency score to be largest in the VRS case. The reason is that the problem with sampling error might be expected to be largest in the VRS case, since the LP formulation (2) for the DEA problem requires the sum of reference weights to equal one. This is not the case in the CRS model, and for this reason there is a higher number of potential units that a given DMU can be compared with under the CRS assumption. But the difference between the uncorrected and bias corrected average efficiency score is slightly larger under CRS than under VRS in Table 3.

Table 3: Descriptive statistics for VRS and CRS efficiency scores, with and without bias correction.

                     Obs   Average   Stdev   Min.    Max
E1 uncorrected       342   0.848     0.136   0.291   1
E3 uncorrected       342   0.687     0.167   0.255   1
E1 bias corrected    342   0.785     0.116   0.277   1
E3 bias corrected    342   0.615     0.137   0.235   1

R² for a regression between the uncorrected and the corrected efficiency scores is 0.801 if we allow for an endogenous constant term in the fitted regression equation. The fit is quite good, but as mentioned the difference is large for a high number of units.
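The comparison of the average bias corrections under VRS and CRS can be read directly off the averages reported in Table 3. As a small worked check (the variable names are only illustrative):

```python
# Average efficiency scores from Table 3: (uncorrected, bias corrected).
averages = {
    "VRS (E1)": (0.848, 0.785),
    "CRS (E3)": (0.687, 0.615),
}

# Average estimated bias = uncorrected average minus bias corrected average.
avg_bias = {
    name: round(uncorrected - corrected, 3)
    for name, (uncorrected, corrected) in averages.items()
}

for name, bias in avg_bias.items():
    print(f"{name}: average bias correction = {bias:.3f}")
```

The averages differ by 0.063 under VRS and by 0.072 under CRS, which is the sense in which the correction is slightly larger under CRS.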
There is a tendency (as seen in Figure 10) that the lower the original efficiency score, the lower the estimated sampling bias. There might be several reasons for this, but one possible explanation is that the more centrally positioned a DMU is in the dataset (with regards to size and input/output mix), typically the higher the number of units it will be compared with. In other words, we expect a unit with a low observed uncorrected DEA efficiency score to be closer to its real value than one with a high observed efficiency score. The reason is that an observed uncorrected efficiency score is expected to be correlated with centrality in the dataset. The bias can be expected to be the largest for the units with the highest scores in the uncorrected DEA model, and the lower the uncorrected efficiency score the lower the expected bias. But the decrease in bias seems to be slower the lower the efficiency score of the DMU under investigation (a smooth fitted parametric function would have a positive first and second derivative).

Figure 10: Scatter diagram of estimated bias vs the original uncorrected efficiency scores (E1). [Scatter diagram; one axis labelled Bias.]

Figure 11: Scatter diagram of bias versus standard error of the corrected efficiency scores. [Scatter diagram; axes labelled StErrCorrE and Bias.]

This is closely related to the problem with the curse of dimensionality and the speed of convergence of the DEA model relative to sample size. It might be that the empirically strong relation discovered in the dataset can be investigated in light of a theoretical relationship. Such a formula might show the relation between the true efficiency and the size of the estimated bias.
It seems that the lower the true efficiency, the lower the size of the bias (Bias = f(E1), f' > 0, f'' > 0). However, this task will have to remain for further research.

Figure 11 shows the strong empirical correlation between the estimated bias and the standard error of the bias corrected efficiency estimate. The relation is so strong that one might want to examine if this can be established formally: the higher the estimated bias, the higher one can expect the standard errors to be. A possible explanation is that the high estimated bias occurs for samples that one has little information about. If the area of input/output space is scarcely populated, we have little information about the location of the frontier, especially if the area is outside the center of the sample (we get little help from the convexity assumption and surrounding observations). At the same time, this lack of information is also captured in a large standard error. In other words, there might not be two separate phenomena, but rather two manifestations of the lack of information about the area in the input/output space where the DMU is located. This strong empirical correlation has probably not been shown before in the literature, and an extensive search on the Internet for similar findings turned up no relevant results. [19]

As mentioned in Section 2.2 one should be careful when using the bias corrected efficiency estimates without evaluating the standard errors. The reason is that the bias corrected efficiency estimates might get a higher mean square error (MSE) than the original DEA estimates.
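The trade-off between bias and variance can be illustrated with a short sketch. The criterion below is an assumption, not a restatement of Section 2.2 (which is not reproduced here and may use a different rule): it uses the approximation, common in the DEA bootstrap literature, that the bias corrected estimate has roughly four times the variance of the original estimate, so that MSE(original) is about se² + bias² and MSE(corrected) is about 4·se². Under that approximation the correction lowers the MSE only when bias² > 3·se². The function name and the numbers are hypothetical.

```python
def correction_lowers_mse(bias, std_error):
    """Return True if bias correction is expected to reduce the MSE.

    Assumed approximation (see the lead-in):
      MSE(original)  ~ std_error**2 + bias**2
      MSE(corrected) ~ 4 * std_error**2
    so the correction pays off only when bias**2 > 3 * std_error**2.
    """
    return bias ** 2 > 3 * std_error ** 2


# Hypothetical (bias, standard error) pairs, not taken from the dataset.
examples = [(0.10, 0.03), (0.05, 0.04)]
decisions = [correction_lowers_mse(b, s) for b, s in examples]
```

In the first pair the bias is large relative to the standard error, so correcting is expected to help; in the second pair the added noise outweighs the removed bias.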
Only 202 of the 371 DMUs (54.4% of the sample) in the dataset get a bias corrected efficiency with lower estimated mean square error than the original DEA efficiency estimate.

6. Efficiency and productivity explained

The efficiency estimation in Section 5 revealed large differences among the firms when it comes to technical efficiency and productivity. In this section, different hypotheses that might “explain” these differences are developed and tested in a stage two setting for empirical relevance. In addition to including the efficiency scores from stage one, it is argued in Section 2.4 that the certainty level for each of the observations should also be taken into account. It should however be noted that the available data is not sufficiently detailed to give a clear indication of why some firms appear to be much more efficient than others. Some hypotheses are associated with statistically significant parameter estimates, but these should mainly be viewed as indications of interesting topics for further research.

6.1 Constructing hypotheses based on existing theory and knowledge of the industry

The hypotheses developed below were generated based on knowledge about the industry, within the limits set by the availability of data.

a) Wage cost per hour. Higher wages can attract the best workers. In addition, piecework contracts can lead to higher average wages. One might suspect that this factor is closely related to hypothesis (d) since overtime is also associated with a higher hourly pay. However, a regression with Wage Cost per Hour explained by Hours Worked per Employee gets an R² of only 0.03, so there should be no problem with multicollinearity.

b) Apprentices.
This is defined as the number of apprentices relative to the number of employees. The idea is that the most efficient companies have low shares of apprentices. The reason is that we expect the companies with high shares of apprentices to have higher costs and lower production since apprentices are under training, which should imply lower productivity of the apprentices and also man-hours used by other employees to offer them guidance. On the other hand, a history of having a high number of apprentices could give good access to high quality human capital in the long run. The hypothesis is nevertheless that the total effect of a high number of apprentices is reduced efficiency.

[19] A search was carried out on the Internet search engine Google.com with the terms: DEA bootstrap correlation bias standard error (and confidence interval). It did not return any relevant results.

c) Product Mix. This is measured by the Herfindahl index, that is, the sum of the squared shares of each company's sales across the seven underlying markets. The expectation is that the most diversified firms are more efficient since they have the option of using their resources in the most attractive market depending on short term business cycles. It should be mentioned that the business cycles in the construction industry can change very fast. In addition, there is a possible selection effect since this variable might pick up that the “best” firms get contracts outside of their key markets. Notice that testing the economics of scope using a DEA model can be problematic. The reason is that DEA assumes global convexity.
If we find that the most diversified companies are the least efficient, this might be a warning that we have a serious breach of one of the assumptions underlying the DEA model used in stage one.

d) Hours Worked per Employee. The hypothesis is that the firms with high numbers of hours per employee are more efficient. The reasoning is that these firms get more efficient production by the use of overtime. It might be that some employees work better under a certain degree of pressure. It could also be that the best workers choose their employer based on the opportunity to work overtime. An additional possibility is that the repetition effect is positive and that the use of overtime allows for longer repetition sequences.

e) Located in Oslo. Oslo is the capital and the largest city of Norway, and a pressure area. Housing prices are usually higher in the Oslo area, and efficiency as measured in this paper may be influenced by this price effect.

It would have been interesting to follow up Albriktsen and Førsund (1991) and examine if the amount paid to subcontractors is correlated with efficiency. However, the quality of the data at hand is too low (a large number of firms have reported a value of zero even when this is not believable).

Table 4: Correlation table for the explanatory variables.

                             Wage cost   Share of      Product   Hours worked   Location
                             per hour    apprentices   mix       per employee   Oslo
Wage cost per hour            1
Share of apprentices         -0.2188*     1
Product mix                   0.0197     -0.1446*       1
Hours worked per employee    -0.1546*     0.0208        0.0842    1
Location Oslo                 0.2242*    -0.1          -0.0298    0.068          1

Table 4 shows the correlation table for the explanatory variables. The pairwise statistically significant correlations (at the 5% significance level) are marked with an asterisk, and only those will be commented on. Wage Cost per Hour is negatively related to Share of Apprentices and Hours Worked per Employee, and positively related to Location Oslo. In addition Share of Apprentices is negatively related to Product Mix. That Wage Cost per Hour is negatively correlated with Share of Apprentices and positively correlated with Location Oslo is not surprising, since the pay to apprentices is less than for trained labor and the wages in Oslo are known to be higher than in other parts of Norway. It is surprising that Wage Cost per Hour is negatively correlated with Hours Worked per Employee, but it might be that the employees with low hourly salaries choose to compensate by working extra hours. It might be that the substitution effect, known from Labor Economics, dominates. It is difficult to explain why Share of Apprentices is negatively correlated with Product Mix.

6.2 Do the suggested hypotheses have empirical relevance?

The regression models used below are weighted least squares and OLS (for comparison). The weights in the weighted regression are the inverse of the squares of the estimated standard errors from the bootstrap simulations. The motivation is to put low weight on an observation when there is low certainty of its real value. The regression calculation was carried out in Stata 7. This statistics package (and many others) has built-in support for assigning a priori (in a sense) known weights to the observations.

Table 5 shows the results from an ordinary OLS regression (left part of the table) and a weighted least squares regression. A truncated regression was also computed (with right truncation at 1) but the results were as good as identical to those laid out in Table 5.
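The weighting scheme can be illustrated with a small sketch. It is not the Stata code used for Table 5. It is a plain-Python illustration with a single explanatory variable, hypothetical function and variable names, and made-up data, showing how weights equal to the inverse of the squared standard errors enter a weighted least squares fit.

```python
def weighted_least_squares(x, y, std_errors):
    """Fit y = intercept + slope * x with weights 1 / std_error**2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    # Weighted means of x and y.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    # Weighted covariance over weighted variance gives the slope.
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar)
               for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data: one regressor, efficiency scores, bootstrap standard errors.
wage = [150.0, 180.0, 200.0, 230.0]
score = [0.60, 0.70, 0.72, 0.95]
se = [0.02, 0.03, 0.02, 0.20]  # the last observation is very uncertain

intercept, slope = weighted_least_squares(wage, score, se)
```

With equal standard errors the result coincides with OLS. An observation with a large standard error, such as the last one above, receives a small weight and therefore has little influence on the fitted line.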
The coefficients with statistically significant parameter estimates (at the 5% significance level) are marked with an asterisk.

Table 5: Weighted and unweighted regression.

                                 Unweighted (R² = 0.14)        Weighted (R² = 0.26)
Explanatory variables            Coef.      t       P>|t|      Coef.      t       P>|t|
Wage cost per hour (a)            0.0012*    6.69   0           0.0019*    9.14   0
Share of apprentices (b)         -0.0802    -1.15   0.25       -0.1487*   -2.18   0.03
Product mix (c)                  -0.0478*   -2.04   0.042      -0.0880*   -3.83   0
Hours worked per employee (d)     0.0001*    2.64   0.009       0.0001*    4.02   0
Oslo (e)                         -0.0106    -0.49   0.622       0.0093     0.32   0.752
Intercept                         0.4349*    5.91   0           0.2456*    3.1    0.002

A higher number of the parameter estimates are statistically significant in the weighted regression compared to the OLS regression. The p-values are also lower in the weighted regression, with the exception of the Intercept estimate, which has a slightly lower p-value in the OLS regression.

If we believe in the statistically significant parameters from the weighted regression (Table 5), then the most efficient construction companies are characterized by:
– High average wage per hour
– Low numbers of apprentices
– Low concentrations in product mix
– High numbers of hours worked per employee
Not statistically significant:
– Located in Oslo

Earlier in this paper it has been shown (Fig. 4) that the bias correction and the standard error of the bias corrected efficiency score have a strong and positive correlation. Remember that the inverse of the square of the latter was used as the weight in the main regression model.
The empirical implication is that low weights are put on the units that have gotten a strong bias correction (because they very often get large confidence intervals). It was noted above that the units with the highest bias correction tend to be found among the units with efficiency scores equal to 1 from the uncorrected DEA calculations.

Many applied DEA papers have used tobit regression in a second stage. The reason is probably that the authors have observed a concentration of DMUs with uncorrected efficiency scores of 100%, and that they, based on this, have thought of the DEA efficiency scores as being truncated. This is wrong since (as seen in the LP formulation) they are not truncated (see footnote 20); they are serially correlated. Simar and Wilson (2003) show that using a tobit regression in stage two is both theoretically and empirically (using Monte Carlo simulations) wrong.

6.3 Productivity and scale

Earlier in this paper a bootstrapped scale specification test rejected the null hypothesis that the correct model specification was CRS. However, even when we choose to believe that the true production function exhibits VRS, we can find use for the CRS measure and interpret it as productivity. This can be used as a measure of the degree to which the sector has an efficient structure.

In Figure 12 the maximal value plotted on the horizontal axis is 1,200,000. The intention is to zoom in on the range where there seem to be the most interesting systematic tendencies in the simultaneous distribution of average productivity and scale. It seems that the average productivity of the construction firms increases until the size is about 100 million NOK.
There does not seem to be any systematic change after this value is reached. However, there are not many construction firms with production values much higher than 100 million NOK in the sample, nor in the population of all Norwegian construction firms.

Figure 12: Scale chart showing Production Value and E3 (range 0 - 1,200,000). Vertical axis: E3 corrected; horizontal axis: size in production value.

20 The efficiency scores can never be above 1. The reason for the concentration at 1 is that the efficiency score of each unit depends on the input-output vectors of the other units (leading to serial correlation in the best-practice calculation). They are not truncated at 1 as such.

7. Conclusions

This paper concerns using DEA to investigate the efficiency of Norwegian building firms. Large differences in the efficiency and productivity scores were discovered. One important lesson that can be learned from this application is the danger of taking the efficiency scores from uncorrected DEA calculations at face value. If one decided to learn from a few DMUs based on their uncorrected efficiency scores, one might get into trouble. It is not unreasonable to think that similar things have happened in the last few years, as DEA has been embraced by a very large number of practitioners (researchers and consultants). It would be interesting if the large number of empirical DEA papers were recalculated using the bootstrap methodology. Anecdotal observations indicate that very few practitioners use bootstrapping. The reason for this might be that bootstrapping is not yet available in the standard DEA software packages.
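Once bootstrap replicates of a unit's efficiency score exist, the remaining bookkeeping is small. The sketch below is a hedged illustration, not the software used for this paper: it assumes the usual definition of the bootstrap bias estimate (mean of the replicates minus the original estimate), takes the replicates as given without showing the smoothed resampling that generates them, and uses hypothetical names and numbers throughout.

```python
import statistics


def bias_correct(theta_hat, replicates):
    """Summarise bootstrap replicates for one unit.

    theta_hat  : uncorrected DEA efficiency score of the unit
    replicates : bootstrap estimates of the same score (at least two,
                 not all identical, otherwise the weight is undefined)

    Returns (bias, corrected score, standard error, regression weight).
    """
    boot_mean = statistics.mean(replicates)
    bias = boot_mean - theta_hat             # assumed definition of the bias
    corrected = theta_hat - bias             # equals 2 * theta_hat - boot_mean
    std_error = statistics.stdev(replicates) # sample standard deviation
    weight = 1.0 / std_error ** 2            # stage-two weight, 1 / se^2
    return bias, corrected, std_error, weight


# Hypothetical unit that looks fully efficient before correction.
bias, corrected, se, weight = bias_correct(1.0, [1.05, 1.12, 1.20, 1.08, 1.15])
print(bias, corrected, se, weight)
```

A unit with widely scattered replicates gets both a large standard error and a small weight, which is how the strongly corrected units end up with little influence in the second-stage regression.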
Based on a scale specification test, a variable returns to scale specification was selected. A scale chart indicated that firms with total production values lower than 100 million NOK might be operating at a suboptimal scale level.

The differences in the efficiency scores may be explained by environmental and managerial variables. Such variables have been tried in a two-stage approach. A new contribution is the demonstration of how one can use the standard errors from the bias correction in stage one to improve the power of the regression model in stage two. Five possible explanations were examined for empirical relevance, and four of them were found to be statistically significant in a multivariate weighted regression setting. More detailed data would be necessary before strong conclusions can be made, but there are indications that the most efficient building firms are characterized by high average wages, low numbers of apprentices, diversified product mixes and high numbers of hours worked per employee.

One possible problem when it comes to interpreting these results is that of unbalanced selection. It might be that the firms that were removed from the dataset belong to a different population when it comes to the inefficiency distribution. There might be a positive correlation between entering correct data and the true technical efficiency of the units included. If the units included in the dataset are on average more efficient than the average in the population, then the overall picture of the efficiency of the industry is too optimistic.
Concerning further research, a possible extension is to study time series by including data for other years. The Malmquist index could be used to decompose the productivity development of each firm into frontier shift and catching up. The relationship between productivity change and entry/exit analysis could provide additional insights.

In the current paper a bootstrapped model specification test is used to select the scale specification, but a similar approach can also be used to help select which of the inputs and outputs should be included.

It could be rewarding to examine how the weighted regression method suggested in this paper performs compared to bootstrapping in both stage one and stage two. This comparison could be done in a Monte Carlo setting.

If data on the project level became available, it could be investigated whether the findings in this paper have empirical relevance on project level data.

It would also be interesting to further investigate the theoretical relationship between the estimated bias and the original uncorrected DEA efficiency score, as well as the relationship between the estimated bias and the standard error of the bias corrected efficiency score.

References

Albriktsen, R., 1989, Produktivitet i byggebransjen i Norden, NBI Project report no. 40, Norwegian Building Research Institute.
Albriktsen, R. and Førsund, F., 1990, A productivity study of the Norwegian building industry, Journal of Productivity Analysis, 2-1990, pp. 53-66.
Andersen, P. and Petersen, N.C., 1993, A Procedure for Ranking Efficient Units in Data Envelopment Analysis, Management Science, 39(10), 1261-1264.
Banker, R.D., 1993, Maximum Likelihood, consistency and data envelopment analysis: a statistical foundation, Management Science, 39, 10, 1265-1273.
Banker, R.D., 1996, Hypothesis Tests Using Data Envelopment Analysis, Journal of Productivity Analysis, 7, 139-159.
Carroll, R.J. and D. Ruppert, 1988, Transformation and Weighting in Regression, Chapman and Hall, New York.
Charnes, A., Cooper, W.W. and Rhodes, E., 1978, Measuring the efficiency of decision making units, European Journal of Operational Research 2, 429-444.
Cooper, W.W., L.M. Seiford, and K. Tone, 2000, Data Envelopment Analysis: A comprehensive text with models, applications, references and DEA-solver software, Boston/Dordrecht/London: Kluwer Academic Publishers.
Edvardsen, D.F., Førsund, F.R., Kittelsen, S.A.C., 2003, Far out or alone in the crowd: Classification of self-evaluators in DEA, Working paper 2003:7 from the Health Economics research program, University of Oslo.
Efron, B., 1979, Bootstrap methods: another look at the jackknife, Annals of Statistics 7, 1-6.
Førsund, F.R. and L. Hjalmarsson, 1979, Generalized Farrell measures of efficiency: an application to milk processing in Swedish dairy plants, Economic Journal 89, 294-315.
Farrell, M.J., 1957, The measurement of productive efficiency, J.R. Statis. Soc. Series A 120, 253-281.
Groak, S., 1994, Is construction an industry?, Construction Management and Economics, 12, 4, pp 187-193.
Jonsson, J., 1996, Construction site productivity measurement: selection, application and evaluation of methods and measures. Doctoral thesis, Lulea University of Technology.
Kittelsen, S.A.C., 1993, Stepwise DEA; Choosing Variables for Measuring Technical Efficiency in Norwegian Electricity Distribution, Memo 06/1993, Department of Economics, University of Oslo.
Kneip, A., Simar, L. and Wilson, P., 2003, Asymptotics for DEA Estimators in Non-parametric Frontier Models, Discussion Paper 317, Institut de Statistique, Université Catholique de Louvain.
Marron, J.S. and Nolan, D., 1988, Canonical kernels for density estimation, Statistics & Probability Letters 7(3): 195-199.
Ofori, G., 1994, Establishing construction economics as an academic discipline, Construction Management and Economics, 14, 4, pp 295-306.
Salter, W.E.G., 1960, Productivity and Technical Change, Cambridge, UK: Cambridge University Press.
Silverman, B.W., 1986, Density Estimation for Statistics and Data Analysis, Chapman and Hall.
Silverman, B.W. and Young, G.A., 1987, The bootstrap: to smooth or not to smooth? Biometrika 74, 469-479.
Simar, L. and Wilson, P.W., 1998, Sensitivity analysis of efficiency scores: How to bootstrap in nonparametric frontier models, Management Science, 44, 49-61.
Simar, L. and Wilson, P., 2000, A general methodology for bootstrapping in nonparametric frontier models, Journal of Applied Statistics 27, 779-802.
Simar, L. and Wilson, P., 2001, Testing restrictions in nonparametric efficiency models, Communications in Statistics, 30, 159-184.
Simar, L. and Wilson, P., 2002, Nonparametric Tests of Returns to Scale, European Journal of Operational Research, 139, 115-132.
Simar, L. and P. Wilson, 2003, Estimation and Inference in Two-Stage, Semi-Parametric Models of Production Processes, Discussion Paper 307, Institut de Statistique, Université Catholique de Louvain.
Torgersen, A.M., Førsund, F.R., Kittelsen, S.A.C., 1996, Slack adjusted efficiency measures and ranking of efficient units, Journal of Productivity Analysis, 7, 379-398.