Generating Data is Outstripping our Ability to Interpret It
I was reading a newspaper article the other day about how it is now possible to create the genetic map of an unborn child. Fascinating stuff. But one line in the article got me thinking about data more than genetics: “The capacity of genomics to generate data is outstripping our ability to interpret it in useful ways.”
I don’t think this phenomenon is limited to genomics. I can think of lots of fields and industries and individual organizations that place a premium on collecting data but do not have the ability to interpret it in useful ways.
Some errors that drive me around the bend:
Data for data’s sake. I have encountered too many organizations that jumped on the data bandwagon and collected oodles of info that they never use. They devote loads of time to data for data’s sake without doing anything with it. Data should drive decisions and program improvements.
Feeling. I once had a boss who used a gut check and nothing else to see if he could use the data to tell the story he wanted to tell. Ugh. If ever the data collided with his world view he would want to bury the data, have us re-run the analysis (which came up with the same conclusions time and again) or challenge the methods of data collection (there was nothing wrong with the collection methods). We cannot shy away from data when it tells us something different from what we were expecting or wanted to hear. Data won’t always tell you that you are doing an awesome job.
Incorrect language. This is a well-intentioned error, but a common one all the same. People who don’t use data a lot will use words like “significant” or “sample” in ways that mean something completely different to data nerds and analysts than what was intended. The result? Readers question the findings because of the incorrect language. We could also get into incorrect manipulation of data (like averaging averages) but I won’t go there.
Data analysis plan after the fact. Sound, defensible analysis requires a plan for how the data will be examined before the analysis begins. Otherwise there can always be the accusation that an analysis designed after the fact was set up to bias the findings.
People without any training drawing conclusions. Data analysis isn’t something that just anyone can do without proper training. We need to be infusing instruction on how to interpret data and take action upon it at the frontline and supervisory levels if we want organizations to use it properly. Oh, and Boards and Funders and Government too where there seems to be no shortage of people who have no clue how to read and interpret data and yet make huge decisions based upon the data.
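For readers wondering why “averaging averages” counts as incorrect manipulation, here is a minimal sketch in Python. The program names, caseloads and figures are made up purely for illustration, and the function names are my own. It shows that a simple average of two program averages ignores how many clients sit behind each one, while a weighted average does not.

```python
# Hypothetical figures: two programs with very different caseloads.
programs = [
    {"name": "Program A", "clients": 10, "avg_days_to_housing": 30.0},
    {"name": "Program B", "clients": 90, "avg_days_to_housing": 60.0},
]


def average_of_averages(rows):
    """Naive approach: treat every program average as equally important."""
    return sum(r["avg_days_to_housing"] for r in rows) / len(rows)


def weighted_average(rows):
    """Weight each program average by the number of clients behind it."""
    total_clients = sum(r["clients"] for r in rows)
    total_days = sum(r["clients"] * r["avg_days_to_housing"] for r in rows)
    return total_days / total_clients


naive = average_of_averages(programs)
correct = weighted_average(programs)

print(f"Average of averages: {naive}")
print(f"Weighted by clients: {correct}")
```

With these made-up numbers the naive figure is 45 days and the weighted figure is 57 days, because nine out of ten clients are in the slower program. The gap shrinks as caseloads get closer in size, which is why the error can go unnoticed for a long time.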
But it isn’t just about interpretation or the common errors noted above. One major problem is that in many instances the data-driven mentality has resulted in groups collecting (or at least trying to collect) way more data than they need. A few things that happen as a result:
Incomplete data sets. Hiring an outside expert to make sense of your data when the data sets are largely incomplete will not result in robust findings. Without some key fields filled in within your data system it is sometimes possible to make inferences and use proxy data, but it is not as reliable. We need staff to feel that data entry is part of the real work that they do – not something that happens after the real work is done.
Constantly tweaking data asks. I could throttle (as I’m sure some service providers could as well) senior managers or funders who keep changing what data they want. Doing a file crawl or file audit to try to track down various pieces of data is not only inefficient, it often produces inaccurate data. Data asks should only be altered at the start of a funding year and should only be changed from one year to the next when there is a compelling reason to do so.
Insufficient infrastructure to support analysis. And then there are times when service providers collect all of the data requested and send it to their funder. I have lost count of the number of times there are not enough staff with the Funder to pull the data together across the agencies, undertake the necessary quality assurance analysis on the data, analyze the data appropriately and report back out on the data. It goes into a black hole. Tragic.
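On the incomplete data sets point above, one inexpensive habit is to check how well the key fields are filled in before anyone attempts an analysis. The sketch below is only an illustration in Python: the field names, the sample records and the `completeness` helper are all hypothetical, and a real data system would have its own export format.

```python
# Hypothetical key fields; substitute whatever your objectives actually require.
KEY_FIELDS = ["intake_date", "exit_date", "housing_outcome"]


def completeness(records, fields):
    """Return the share of records with a non-empty value for each field."""
    if not records:
        return {field: 0.0 for field in fields}
    report = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = filled / len(records)
    return report


# Made-up client records with some blanks, as often happens in practice.
records = [
    {"intake_date": "2012-01-05", "exit_date": "2012-03-01", "housing_outcome": "housed"},
    {"intake_date": "2012-01-09", "exit_date": "", "housing_outcome": None},
    {"intake_date": "2012-02-11", "exit_date": "2012-04-20", "housing_outcome": ""},
    {"intake_date": "2012-02-15", "exit_date": None, "housing_outcome": "housed"},
]

report = completeness(records, KEY_FIELDS)
for field, share in report.items():
    print(f"{field}: {share:.0%} complete")
```

A report like this tells you which fields can support real findings and which would force you back onto inferences and proxy data. What counts as “complete enough” is a judgment call for each organization; the sketch does not assume any threshold.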
So if you want to make the best use of the data collection and analysis in the environment you work in, ask yourself these questions:
Do we correctly capture the information we need to know if we are meeting our stated objectives? Yes, the data collected should be directly linked to the objectives of your activities. Do not collect more than you need to. Keep it simple. And here is a tip – pull together a small group of people who do the work on the frontlines to help define the data to be collected relative to the objectives. They are a great barometer on what is helpful and what is crap.
Do we have a plan for analyzing this information at regular intervals? Set out an analysis and reporting schedule in advance. Don’t get too ambitious. Figure out what needs to be shared internally and what should be shared externally, how and when. Once you have the plan, stick to it. If you let it slide it is amazing how complacent the organization will be about data and reporting out.
What do we do as a result of this information? This isn’t data collection and analysis just because a funder told you that you had to in order to get the money. You need to get into a mindset that if you collect only the data you need you should be able to reflect on service delivery and make it better.
How can we do this better? Take a step back once every 6-12 months and ask yourselves how you can collect and use data better. You may find that this is the key to decreasing the amount of data that you collect and doing more meaningful things with the data you do have to improve programming and service outputs and outcomes.
For the umpteenth time, Iain will be presenting Data and Performance Simplified at the National Alliance to End Homelessness Conference in Washington, DC this July – and he is happy to do so. Stay tuned to see the presentation on the Alliance website shortly after the conference.