DEAML — a lingua franca for the DEA community
November 18th, 2008
Data envelopment analysis (DEA), as a comparative technique for determining the efficiency of (generally large) sets of decision-making entities, is data intensive. It lives and dies on the availability of suitable quantitative data. Such datasets are expensive to acquire and manage, so it is imperative that the maximum value be drawn from them.
All experienced DEA researchers and practitioners will be familiar with the “joys” of data management. In addition to the data relating to the decision-making entities, DEA models utilize a dizzying array of parameters and a wealth of output data to be mined and studied. Configuration control is a major challenge in many large DEA projects.
Custom spreadsheets, databases and proprietary file formats are the staple means by which we, as a community, record and share our results. After three decades of DEA research, given the critical role data plays in effective DEA modeling, is this really good enough?
DEA is a niche discipline. If we wish to nudge it into the mainstream we need to be better at sharing our models — sharing with other DEA professionals, sharing with other data analysts and sharing with clients/stakeholders.
Sharing using the ubiquitous spreadsheets doesn’t cut the mustard. Every spreadsheet tends to be different and must be interpreted afresh, resulting in avoidable errors. While spreadsheets provide a natural way of representing the tabular decision-making entity data, they do not lend themselves to the specification of model parameters. Also, the task of transferring data from a spreadsheet to a DEA tool and then back again is, at best, laborious and, at worst, error-prone.
The lack of a standard format for representing DEA models retards research. With no agreed standard for publishing models, there is little incentive for researchers to make their data sets available. It is not a simple case of just making the data available on a website. The format of the data would have to be documented extensively — a laborious, tedious, difficult and potentially thankless task. We’re sure that most readers have had the frustrating experience of trying to replicate the results in a paper only to find that crucial information is missing. Standardization opens up the possibility of supplementing all published research with complete, accurate, unambiguous and immediately testable models.
Commercial DEA practitioners would also benefit from the existence of a standard for publishing DEA models and results. Practitioners often work in teams, sometimes utilizing different tools, and a common data format would decrease their workload and reduce the number of errors introduced through data housekeeping activities. In commercial work, DEA results are often produced for inclusion in other “downstream” systems. An agreed format for DEA results would allow work on downstream systems to proceed independently of the DEA modeling.
Finally, a standard for specifying DEA models must be taken up by the DEA software providers. Interoperability between different applications, commercial and free, allows DEA professionals to pick the right tools for the job. Support for standards in software also makes it easy for researchers and practitioners to publish models that adhere to these standards. Software providers will be led by the cries of the market: if we want it badly enough, we’ll get it.
In this paper, we introduce a potential XML-based standard for representing DEA models, i.e. decision-making entity data, model parameters and results. We’re calling it “DEAML”, short for DEA Markup Language. We don’t see this as a finished product; far from it. Our ambition is to kick-start the development of a standard that is designed and guided by the DEA community.
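To make the idea concrete, here is a rough sketch of how a single XML document could carry all three parts of a model (entity data, parameters and a slot for results) and be read back by another tool. Every element and attribute name below (deaModel, parameters, orientation, returnsToScale, variable, dmu, value, results) is invented purely for illustration and is not the DEAML schema; the branch data is made up too. It uses only Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def build_model(dmus, inputs, outputs,
                orientation="input", returns_to_scale="constant"):
    """Build a HYPOTHETICAL XML document for a DEA model.

    All tag and attribute names are assumptions made for this sketch;
    they are not taken from any published DEAML schema.
    """
    root = ET.Element("deaModel")

    # Model parameters live alongside the data instead of in a side note.
    params = ET.SubElement(root, "parameters")
    ET.SubElement(params, "orientation").text = orientation
    ET.SubElement(params, "returnsToScale").text = returns_to_scale

    # Declare which variables are inputs and which are outputs.
    variables = ET.SubElement(root, "variables")
    for name in inputs:
        ET.SubElement(variables, "variable", name=name, role="input")
    for name in outputs:
        ET.SubElement(variables, "variable", name=name, role="output")

    # One element per decision-making entity, one value per variable.
    entities = ET.SubElement(root, "entities")
    for dmu_name, values in dmus.items():
        dmu = ET.SubElement(entities, "dmu", name=dmu_name)
        for var_name, value in values.items():
            ET.SubElement(dmu, "value", variable=var_name).text = str(value)

    # Empty placeholder that a DEA tool could fill with efficiency scores.
    ET.SubElement(root, "results")
    return root


def read_entities(root):
    """Recover the entity data as {dmu name: {variable name: float}}."""
    return {
        dmu.get("name"): {
            v.get("variable"): float(v.text) for v in dmu.findall("value")
        }
        for dmu in root.find("entities").findall("dmu")
    }


# Made-up sample data: two bank branches, one input and one output.
sample = {
    "Branch A": {"staff": 12, "deposits": 340.5},
    "Branch B": {"staff": 9, "deposits": 281.0},
}
xml_text = ET.tostring(build_model(sample, ["staff"], ["deposits"]),
                       encoding="unicode")
recovered = read_entities(ET.fromstring(xml_text))
```

The point of the sketch is only that a tool which knows the agreed element names can recover the data and the parameters without a human re-interpreting a spreadsheet layout; the real vocabulary is exactly what a community-designed standard would need to settle.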
If you’d like to join the effort, please contact us at info@banxia.com.