Reliably biased
One thing that virtually everyone seems to believe about
survey research is that a bigger sample is better: a sample of 500 respondents
is better than 100, and 1000 is better still. And best of all is a census where,
technically, the margin of error is reduced to zero.
However, the research adage about bigger sample
sizes holds if, and only if, there is no bias. And zero bias
is rare.
Researchers know this – but the media and the general public
do not. The press and their audiences fixate on the number of respondents in
research. To overcome this problem, we need to talk about bias.
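The point can be illustrated with a minimal simulation (all numbers here are invented for illustration): if the sampling frame is biased, a larger sample simply narrows the margin of error around the wrong answer.

```python
import math
import random

def survey_estimate(n, true_p=0.50, bias=0.10, seed=1):
    """Simulate a survey whose reachable population over-represents
    'yes' answers by `bias` (hypothetical figures for illustration).
    Returns the sample estimate and its 95% margin of error."""
    random.seed(seed)
    p_reached = true_p + bias  # the biased frame: 60% say yes, truth is 50%
    yes = sum(random.random() < p_reached for _ in range(n))
    p_hat = yes / n
    moe = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)  # shrinks as n grows
    return p_hat, moe

for n in (100, 1_000, 100_000):
    p_hat, moe = survey_estimate(n)
    print(f"n={n:>7}: estimate {p_hat:.3f} \u00b1 {moe:.3f} (truth 0.500)")
```

The margin of error collapses from roughly ten points to a fraction of a point as the sample grows, yet every estimate hovers near 0.60, not the true 0.50: precision improves, accuracy does not.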
When the Census 2016 online form was inaccessible for more than 48 hours from 7.30pm on census
night (August 9), it created a media flurry. However, when the ABS reported
that more than 96 per cent of households had completed the census by the closing date (September
23), many concluded that the problem was resolved.
However, the problem of bias remains, and a high response rate
does not necessarily resolve it. The real #censusfail is less a data collection glitch than the
threat it has posed to data quality.
This potential for biased results in the census has been
barely discussed and appears to have been largely overlooked or ignored. One
columnist at The Australian notes
that it is a mystery why the government has not
acknowledged the problem.
The data provided by the census are very important, which reinforces the
need to identify potential problems including bias.
According to one news report, ‘the ABS is adamant the
quality of data has not been compromised.’ But is this true
– are the data unbiased?
Whether or not there is bias is vexingly uncertain. Bias is
much more difficult to measure than sample size and sampling error, but it is much
more important to try to do so.
There are perhaps two key measures that might be considered
in assessing potential bias.
Non-response
What proportion of invited respondents did not respond
(which the ABS calls ‘undercount’)? And, more importantly, are
those non-respondents different from those who did respond?
The media, the public and other research audiences need to
be reminded of how wrong even a very large sample can be as illustrated by the
infamous failure of The Literary Digest poll
in predicting the winner of the Landon-Roosevelt presidential race in 1936.
While many remember this case study as an example of a
sample selection bias (using readers of The
Literary Digest and lists of automobile and telephone owners), empirical research to determine why the
poll failed concluded that the incorrect prediction was due to non-response
bias.
The Literary Digest predicted a Landon win based on an
enormous sample of 2.4 million respondents. However, a total of 10 million were
invited to participate. If the 76 per cent non-response rate had been eliminated, the poll would have
correctly predicted a Roosevelt win.
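The arithmetic can be sketched in a few lines. Only the 2.4 million responses out of 10 million invitations come from the case itself; the candidate-preference splits below are invented purely to illustrate the mechanism.

```python
# Illustrative (not historical) arithmetic: how non-response flips a poll.
invited = 10_000_000
responded = 2_400_000                  # 24% response rate
non_respondents = invited - responded  # 7.6 million, a 76% non-response rate

# Suppose respondents leaned Landon and non-respondents leaned Roosevelt.
# The directions match the case study; the exact splits are invented.
landon_share_resp = 0.57      # what the poll measured among respondents
landon_share_nonresp = 0.30   # hypothetical share among non-respondents

landon_overall = (responded * landon_share_resp
                  + non_respondents * landon_share_nonresp) / invited
print(f"Non-response rate: {non_respondents / invited:.0%}")   # 76%
print(f"Poll's Landon share: {landon_share_resp:.0%}")         # 57%
print(f"Full-frame Landon share: {landon_overall:.0%}")        # 36%
```

With these hypothetical splits, a poll of 2.4 million confidently calls the race for Landon while the full invited frame has him losing decisively: the respondents' sheer number is no protection when the missing 76 per cent differ systematically.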
The danger of high non-response rates and non-response bias
continues to undermine the accuracy and usefulness of much survey research
today including more recent US Presidential polls.
While it appears that only four per cent have not responded
to Census 2016, could the non-respondents differ from those who have responded?
Failing to get responses from a small, distinctly different
segment can have a significant impact. This is why the ABS makes special
efforts to capture potential non-responders such as those living rough (<1 per cent of the population), on the
road or out bush.
If we hypothesise that the four per cent of non-respondents
to Census 2016 were those who tried to respond online, were frustrated in their
efforts by the website outage, and subsequently refused to respond, we could
guess that they would be more likely to be younger, live in major cities and
have children under 15 years, based on ABS research about internet users.
Mis-response
What proportion of respondents mis-responded? Mis-responding
(which the ABS describes as ‘respondent error’) can, of course, be
intentional, unintentional or both.
The unprecedented antipathy expressed towards Census 2016 in
the media and in the Twittersphere seems to suggest a degree of
uncooperativeness.
Even before the census collection began in earnest, privacy concerns led many to consider mis-responding. A number of politicians then publicly indicated that
they would not provide their names as required, thereby openly admitting their
intention to mis-respond.
Incipient frustration was likely fanned by the outage of the
website, suspicions about the reasons for the outage, and some fairly heavy-handed threats about substantial fines for not
completing the census form.
Might respondents have intentionally falsified or fabricated
responses? It seems unreasonable not to expect it!
Even if we could rely on the saintly nature of our
respondents to remain unfazed by the entire furore, could unintentional
mis-responding have resulted from memory failure due to delayed responding?
One week after census night, only 50 per cent of households were reported as having completed the
census. This means the remaining 46 per cent (of the 96 per cent who eventually responded) completed their census form in the
following five weeks up to September 23. They therefore had to remember all the
persons present on census night and to remember (if they ever knew) all the
relevant details for each person: age, previous addresses, religion, race,
occupation, income, education, etc.
Perhaps Australian householders retained all that was needed
in memory and/or were very forgiving and remained helpful throughout the
process of completing their form.
Or perhaps there is a strong likelihood of bias.
Conclusion
Census 2016 reminds us that:
- All survey research is subject to bias; the census is no exception.
- Bigger samples may be better, but unbiased is best.
- Noisy estimates can be reduced with larger samples, but systematic bias is not so easily eliminated. We need to start talking about bias.
- We need to determine both the non-response rate and the mis-response rate – and establish whether those not responding or mis-responding are different from others.
(Published in Research News, November 2016, pp 6-7, Australian Market & Social Research Society, http://www.amsrs.com.au/searchbook?id=102#page=3).