Document vs. Data-Centric Design

Documents are so flexible, aren’t they? You can mix text and numbers, tables and charts, all within the same page. You can even spend lots of time making a page aesthetically pleasing and the information enticing. To an individual human reader, a single well-designed page can communicate a lot of information very effectively.

But what if you want to communicate data to many different information consumers, each with a different interest in the data? What if your audience is not a single human ‘page reader’ but the rapidly expanding universe of online web services that ‘consume’ data programmatically, making it easier for human information consumers to crowdsource, share, compare and contrast that data?

That’s when document-centric information design falls down in comparison to data-centric design. To illustrate why, I’ll examine the Global Reporting Initiative’s NGO Sector Supplement Economic Indicator NG08 as an example.

Here is the definition of GRI NG08:

Sources of funding by category and five largest donors and monetary value of their contribution

What you can already tell is that this single indicator combines at least two information concepts: sources of funding by category, and the five largest donors.

This becomes even clearer when you look at what the GRI defines as the ‘compilation’ of the indicator:

2.1 Identify sources of funding by category (e.g. government, corporate, foundation, private, membership fees, in-kind donations, and other).

2.2 Identify the five largest donors in monetary value. For in-kind donations, use estimates of the monetary value of the donation, using standard accounting principles.

2.3 Report aggregated monetary value of funding received by source.

2.4 Report the five largest donors and the monetary value of their contribution


So in this one indicator you are expected to report n sources of funding with their monetary values, and the 5 largest donors with their monetary values. This is actually two ‘multi-member’ indicators in one: one with a defined ‘extent’ (the five largest donors) and one with an undefined extent (any number of funding sources). All this works well if the target is a document, which it is. You can imagine the page in an online PDF: some text followed by a pie chart of funding sources, then some text followed by a bar chart of the top 5 funders. It looks great, and some interactivity is possible using hovers and drilldowns from the page.

But what if you want to compare that information to another organization’s NG08? Or what if you want to allow another software application to consume this data with minimal or no human intervention? Well, then you have two options:

  • Cut and paste the data out of the document into something else – say a spreadsheet – massage the data and export it out as a file.
  • Run a parsing program on the document to ‘shred’ it – i.e. identify specific bits of data programmatically and write them to a database.

The first is a manual, error-prone process. The second requires someone to write an NG08 parsing program and provide it to anyone who wants to use it. Not ideal, in other words.

Yet if the definition of this indicator had been approached with a different endgame in mind (not just a pretty page but the delivery of machine-readable data that is easy to share and compare), then a data-centric design approach would have been used instead of a document-centric one. And the added value of a data-centric approach is that while it’s relatively hard to convert a page into data, it’s relatively easy to render data as a page.
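
To make that last point concrete, here is a minimal sketch in Python of rendering data as a page. The records, the amounts, the currency and the `render_page` function are all made up for illustration; none of them come from the GRI or from any real NG08 report.

```python
# Hypothetical figures for illustration only; not from any real NG08 report.
funding_by_source = [
    {"category": "government", "amount": 500_000},
    {"category": "foundation", "amount": 300_000},
    {"category": "private", "amount": 200_000},
]


def render_page(records, currency="USD"):
    """Render funding records as a plain-text table with percentage shares."""
    total = sum(r["amount"] for r in records)
    lines = [f"Sources of funding ({currency})"]
    # Largest source first, as a reader of the page would expect.
    for r in sorted(records, key=lambda r: r["amount"], reverse=True):
        share = 100 * r["amount"] / total
        lines.append(f"{r['category']:<12}{r['amount']:>12,}{share:>7.1f}%")
    lines.append(f"{'total':<12}{total:>12,}")
    return "\n".join(lines)


print(render_page(funding_by_source))
```

Going the other way, from a finished page back to these records, is where the cut-and-paste or the parsing program comes in.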

If a data-centric approach had been used here, then as a minimum this Ethical Fundraising indicator would be split into two indicators, let’s say NG08-01 and NG08-02: Funding by Source and Funding by Largest Donor. And both indicators, not just one, would have an ‘extent’ or boundary. So NG08-01 might be limited to the top 10 sources rather than just ‘sources’, in the same way that NG08-02 is limited to the top 5 donors. This makes both easier to use for comparative purposes. Ideally, you would also specify the currency or unit of measure used and the periodicity of the data (year, quarter, month etc.) so that you can compare like-for-like.
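
As a rough sketch of what such a definition could look like as data, the Python below models an indicator with an explicit extent, currency and period. The class names, field names, the `comparable` helper and all figures are my own assumptions for illustration; they are not part of the GRI guidelines or of any XBRL taxonomy.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FundingItem:
    name: str      # source category (NG08-01) or donor name (NG08-02)
    amount: float  # monetary value in the indicator's currency


@dataclass(frozen=True)
class Indicator:
    code: str      # hypothetical split codes, e.g. "NG08-01"
    currency: str  # assumed to be a currency code such as "USD"
    period: str    # assumed label such as "2012" or "2012-Q1"
    extent: int    # maximum number of items the indicator may carry
    items: Tuple[FundingItem, ...]

    def __post_init__(self):
        # Enforce the boundary so every report has the same shape.
        if len(self.items) > self.extent:
            raise ValueError(
                f"{self.code} allows at most {self.extent} items, "
                f"got {len(self.items)}"
            )


def comparable(a: Indicator, b: Indicator) -> bool:
    """Two reports are like-for-like if code, currency and period match."""
    return (a.code, a.currency, a.period) == (b.code, b.currency, b.period)


# Made-up reports from two hypothetical organizations.
org_a = Indicator("NG08-02", "USD", "2012", 5,
                  (FundingItem("Donor A", 120_000.0),
                   FundingItem("Donor B", 80_000.0)))
org_b = Indicator("NG08-02", "USD", "2012", 5,
                  (FundingItem("Donor C", 95_000.0),))

print(comparable(org_a, org_b))
```

With the extent, currency and period carried in the data itself, a consuming application can decide whether two organizations’ figures are comparable without a human reading either report.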

It’s great that some forward-thinking NGOs have collaborated with the GRI to create a potential source of useful and comparable information about their ethical fundraising efforts. It’s a shame that a document-centric rather than a data-centric approach was used to define this specific indicator concept (and a number of others in the GRI Index generally).

That’s why it’s important to develop content standards (like the GRI indicators) and data standards (like an XBRL taxonomy) in tandem, rather than treating either one as an afterthought of the other. It’s also why more effort needs to be put into the GRI taxonomy, so that these kinds of issues become clearer and the indicator definitions become more effective for information consumption and comparison.

