
Friday, October 24, 2008

Applying IDQ Principles of Research To The Bible

By applying principles of Information and Data Quality (IDQ) in research to the Bible, it can be shown that a high level of confidence in the accuracy and reliability of the information in the Bible is irrational; therefore, arguments or claims that use the Bible as a premise are inherently weak.

Cross-check, Cross-check, Cross-check!
Accuracy and verifiability are part of the foundation of IDQ.

Researchers of Information and Data Quality (IDQ) have created classifications for Data Collectors, Data Custodians and Data Consumers. Those who collect the data provide it to those who store and maintain it, and to those who use it. Different values are associated with IDQ dimensions depending on which of these categories the context falls into(16). For example, the data custodian considers accuracy the number one value, while the consumer (depending on the context) may not consider accuracy the most important dimension. In all cases the most important criterion for the user is whether or not the information is useful.

The fact that the consumer does not necessarily regard accuracy as the highest value creates a market for less accurate information which enterprising data producers are willing to satisfy. One example is the "tabloid" and "gossip magazine" industry. However, the desire for useful though inaccurate information extends across categories into business, marketing, politics and religion. Unfortunately, to ensure accurate data when needed, some extra work is necessary in the form of cross-checking.

Who is the author?
Cross-checking should make it possible to verify a piece of information by seeing whether it makes sense from another perspective. One way to do that is to identify the author. When the author can be identified, their credentials can be reviewed. It can be assessed whether or not the author is an expert, what their peers thought of them and what environment they lived in. These properties can be used to cross-check whether the information has external consistency and makes sense from other perspectives. They allow the use of inference to assess the credibility, plausibility, believability and, most importantly, the accuracy of the information. There is no precise definition of accuracy, and in fact many of the dimensions of IDQ are self-referential, but what accuracy is NOT is apparent, and using that as a criterion, a working definition can be derived.

Accuracy implies that the datum represents a real world state.
It implies that when the data are reviewed and compared to the real world event or object, they describe it sufficiently for more than one person to have as close to the same understanding of it as possible. An accurate representation of a real world event will not be ambiguous, will not lack precision and will not be incomplete, because these defects lead to inferences about things that do not exist or never existed in the real world, or that represent an element of the real world incorrectly(3).

Accurate and verifiable data are crucial to having enough understanding about the subject to be able to make reliable decisions, inferences and predictions in order to increase the likelihood of successful outcomes. Verifiability increases the credibility of information.

Your spouse, parents and reputable organizations endorse accurate reporting.
Almost everyone who has an interest in making some kind of investment, whether it is monetary from a giant corporation or emotional from a trusting spouse, desires, requires and demands IDQ. Human understanding and knowledge depend on it. Technology is successful because it builds on the accurate reporting and successful reproduction of work that came before it. Relationships are successful because Information Quality (also known as truth) fosters trust. Since Information Quality is so fundamental, it is easy to find reputable organizations, and not just your mother, father, spouse or friend, that endorse it.

Reputable organizations such as Cornell University(17), East Tennessee State University(19), George Mason University(20), McGraw-Hill(21) and the U.S. Government(18) have websites devoted to promoting criteria for assessing the quality of information from sources. They place a high value on it and stress its importance. Two other websites related to education are "The Virtual Chase"(22), which is devoted to "teaching legal professionals how to do research", and Robert Harris's VirtualSalt(15), which is heavily referenced throughout the Internet. VirtualSalt has a checklist called "CARS", named for the first letters of its major criteria: Credibility, Accuracy, Reasonableness and Support. The CARS Checklist encapsulates the research criteria endorsed by reputable organizations in an easy-to-remember mnemonic and can be found on the VirtualSalt site(15).

Criteria for Data and Information Quality in research
Listed below are the components of the CARS checklist. The initials of some of the other organizations listed above are used to show where their criteria fit into it. Their initials are beside the data quality dimension they endorse: vs is VirtualSalt, c is Cornell and vc is The Virtual Chase.

* Credibility (Credentials)
vs Author, c Author, vc Authority, c Publisher, c Title of Journal
Two relevant indicators of a lack of credibility are Anonymity and lack of quality control.

Critical Questions to ask are:
- Why should I trust this source?
- What is it that makes this source believable?
- How does this author know this information?
- Why is this source believable over any other?
- What are the author's credentials?
- What type of quality control did it undergo?
- Was it peer reviewed?

* Accuracy
vc Accuracy, vs Timeliness, vc Timeliness, vs Comprehensiveness, c Coverage, vc Scope of Coverage, vs Audience and Purpose, c Intended Audience, c Edition or Revision, c Date of Publication
Three relevant indicators of a lack of accuracy are no date for the document, vague or sweeping generalizations and bias toward one point of view.

Critical Questions to ask are:
- Is it accurate? Is it correct?
- Is it up to date? Is it relevant?
- Is it Comprehensive? Does it leave anything out?
- What was the intended audience and purpose?

* Reasonableness
vs Fairness, vs Objectivity, vc Objectivity, c Objective Reasoning, vs Moderateness, vs Consistency, vs World View, c Writing Style
Some relevant indicators of a lack of reasonableness are intemperate tone or language, incredible claims, sweeping statements of excessive significance and inconsistency (written on VirtualSalt as "conflict of interest").

Critical Questions to ask are:
- Does it offer a balanced, reasoned argument that is not selective or slanted?
- Is it biased?
- Is a reality check in order? Are the claims hard to believe? Are they likely, possible or probable?
- Does this conflict with what I know from my experience?
- Does it contradict itself?

* Support
vs [source documentation or bibliography], vs corroboration, vs External Consistency, c Evaluative Reviews
Some relevant indicators of a lack of support are numbers and statistics without a source, absence of source documentation and/or there are no other corroborative sources to be found.

Critical Questions to ask are:
- Where did this information come from? What sources did the author use?
- What support is given?
- Can this be cross-checked with at least two other independent sources?
- Is the information in the other independent sources consistent with this information?
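
The cross-checking questions above can be sketched as a small check. This is only a minimal illustration in Python, not part of the CARS checklist itself: the `Source` record, its `derived_from` field and the `is_corroborated` function are hypothetical names, and treating sources that derive from one another as a single witness is an assumption about what "independent" means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Source:
    """A hypothetical record of one source consulted during cross-checking."""
    name: str
    agrees: bool                        # is it consistent with the claim?
    derived_from: Optional[str] = None  # name of the source it copies, if any


def root_of(source: Source, by_name: dict) -> str:
    """Follow derived_from links back to the original, underived source."""
    seen = set()  # only guards against an accidental loop in the links
    current = source
    while current.derived_from in by_name and current.name not in seen:
        seen.add(current.name)
        current = by_name[current.derived_from]
    return current.name


def is_corroborated(sources: list, minimum: int = 2) -> bool:
    """True if at least `minimum` independent sources agree and none disagree."""
    by_name = {s.name: s for s in sources}
    # Sources that trace back to the same original count only once.
    agreeing_roots = {root_of(s, by_name) for s in sources if s.agrees}
    any_disagree = any(not s.agrees for s in sources)
    return len(agreeing_roots) >= minimum and not any_disagree


if __name__ == "__main__":
    consulted = [
        Source("A", agrees=True),
        Source("B", agrees=True, derived_from="A"),  # a copy of A, not independent
        Source("C", agrees=True),
    ]
    print(is_corroborated(consulted))
```

In this sketch, B adds nothing to A because it was derived from A, so the claim passes only because C agrees independently.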


What are some real world examples of poor Data and Information Quality research?
Conclusions about history are necessarily defeasible. One of the problems is that methodology and techniques improve a little every century. Conclusions made about a certain topic are revised as new information turns up. New information is compared to the old information for coherency and consistency. Some of these problems stem from poor data creation by the originator: the data are not accurate or complete. Users still struggle with these problems today. "A Website Dedicated to Information/Data Quality Disasters from Around the World" has been set up by the International Association for Information and Data Quality (IAIDQ), and it's called IQ Trainwrecks(14). "Poor data quality can have a severe impact on the overall effectiveness of an organization"(3) and "Poor data quality can have substantial social and economic impacts"(11) that span the spectrum from news to marketing to textbooks to health care. Fortunately, we can examine the methods of the ancient historians and scientists to see what led to poor results so that we can avoid those methods, improve what can be improved and derive new ones to replace the old.

Applying Data and Information Quality for research to the Bible.
As accurate as they tried to be, the authors of scripture still suffered from the same sorts of problems common to ancient historians and scientists. They were biased and inaccurate, had no way to verify information, depended on second- or third-hand information from relatively uneducated people, were influenced by political affiliations and by commissions from aristocrats and state leaders, and had poor tools to work with.

The authors of the Bible did no better a job than their historian and scientist peers in documenting the world. In fact, of the three categories, scientists fared somewhat better because of the quality of their documentation. The Library of Alexandria was destroyed by fire over time, so much of ancient scholarship and science was lost, but some of the works that do remain leave little doubt about how to reproduce their experiments or about their authorship.

It used to be believed that every author of every book in the Bible could be identified, but over time it has come to be recognized that tradition is a poor way to record who authored what. External verification of the data revealed how unlikely it was that the person traditionally believed to be the author actually was the author, or even existed.

According to several sources, "The Bible comprises 24 books for Jews, 66 for Protestants, 73 for Catholics, and 78 for most Orthodox Christians" (Wikipedia). From others: "The Protestant Bible contains 66 books (39 OT, 27 NT); the Catholic Bible contains 73 books (46 OT, 27 NT); the Eastern Orthodox Bible contains 78 books (51 OT, 27 NT). The Hebrew Bible (the name of the OT by Jews) contains only 24 books."(23)

Most of the authors of the original information about the Abrahamic God are unknown
There are different books in the Bible depending on whether you use, for example, the Hebrew, the Protestant, the Catholic or the Orthodox canon. If we use the greatest number of books in any Bible as our total, then the author can be identified for only about 21% of them. The authors of the other 79% are unknown(24). In other words, 79% of the original information that exists about the Abrahamic God comes from unknown sources. One of the indicators of a lack of credibility in a work is anonymity(15). A small percentage of scriptures are considered worthy of inclusion by some denominations but not by others. What makes one worthy to one group and not worthy to another? Lack of credibility is one criterion that comes to mind.
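
The percentages above can be checked with a little arithmetic. The sketch below is in Python and assumes the 78-book Orthodox count quoted earlier as the total. The number of books with an identifiable author is not given in the source, so it is only back-calculated here from the stated 21% and should be read as an illustration rather than a sourced figure.

```python
TOTAL_BOOKS = 78           # largest canon quoted above (Orthodox)
STATED_KNOWN_SHARE = 0.21  # share of books with an identifiable author, per (24)

# Back-calculate the whole number of books implied by the stated share.
# This count is an inference for illustration, not a figure from the source.
implied_known = round(TOTAL_BOOKS * STATED_KNOWN_SHARE)
implied_unknown = TOTAL_BOOKS - implied_known

known_pct = 100 * implied_known / TOTAL_BOOKS
unknown_pct = 100 * implied_unknown / TOTAL_BOOKS

print(f"{implied_known} of {TOTAL_BOOKS} books: {known_pct:.1f}% known author")
print(f"{implied_unknown} of {TOTAL_BOOKS} books: {unknown_pct:.1f}% unknown author")
```

Under these assumptions the implied split is 16 known to 62 unknown, which rounds to the 21% and 79% quoted above.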

The Bible is an amalgam of scriptures that span centuries. Some of the scriptures seem to be derived from other scriptures, most of which were also included in the Bible. Trying to apply the criterion of varied sources for cross-checking to the Bible is difficult because the scriptures were derived from each other, a large portion of the authors are unknown and the quality of production was poor. The criteria used to put them together are not clear, but a presumption of at least a need for coherency and consistency is warranted.

The word "trust" is used liberally to describe IDQ criteria. While the Bible is generally considered to be trustworthy, is it really? What is it about something that makes it "trustworthy"? Accuracy? Coherency and consistency with what we know from our experience?

What follows is a summary of principled research criteria which the Bible does not meet, with some generic examples.
For the sake of brevity I did not include many solid examples, but I do welcome audience participation in documenting them in the comments.

* Authorship - Traditional attributions of authorship have been overturned by later scholarship
* Not up to date - Leviticus and Deuteronomy in the OT, Paul's bias against women in the NT
* Inaccurate, incorrect - The rivers of Eden in the OT, Inconsistencies between the gospels
* Irrelevant - Leviticus and Deuteronomy in the OT, ambiguous and apparently contradictory sayings in the NT such as "Whoever is not against us is for us" (Mark 9:40) vs "He who is not with me is against me" (Matthew 12:30a)
* Bias - Old Testament treatment of worshipers of other gods, NT treatment of Jewish leadership and scholars.
* Unlikely - Most of the OT; in the NT, Jesus sternly rebuked his disciples for sleeping in the garden of Gethsemane, so who witnessed what happened while they slept?
* Conflicts with knowledge obtained from our experiences - Magicians do water-to-wine tricks.
* Contradicts itself - Who discovered the empty tomb?
* Cross-checking with external sources is extremely difficult and, to a large degree, does not support the text. There is no verifiable eyewitness account of the existence of Jesus; however, that does not mean he did not exist.

Robert Harris's VirtualSalt has a checklist with a mnemonic for how to deal with information.

Living with Information: The CAFÉ Advice from VirtualSalt(15)
Challenge
Challenge the information with critical questions and expect accountability.

Adapt
Adapt your requirements for information quality to match the importance of the information and what is being claimed. Extraordinary claims warrant extraordinary evidence.

File
File new information in your mind rather than immediately reaching a conclusion. Turn your conclusion into a question. Gather more information until there is little room for doubt.

Evaluate
Evaluate and re-evaluate regularly. New information or changing circumstances will affect the accuracy and the evaluation of previous information.

I will sum it up in a word.
Cross-check, Cross-check, Cross-check.

REFERENCES AND FURTHER READING
1. Wikipedia, "Data Management"
2. Information Quality at MIT
3. Anchoring Data Quality Dimensions in Ontological Foundations
4. DMReview, Data Management Review
5. IQ-1 Certificate Program
6. Wikipedia, 2003 Invasion of Iraq
7. How Accurate Is The Bible?
8. Datalever.com
9. Wikipedia, Tanakh
10. Null Hypothesis
11. Beyond Accuracy: What Data Quality Means To Consumers
12. IQ Benchmarks
13. Reasonable Doubt About Adaption Theory
14. IQ Trainwrecks
15. Robert Harris' VirtualSalt
16. Data Quality Assessment
17. Cornell University Library
18. Guidelines for Ensuring and Maximizing the Quality, Objectivity, Utility and Integrity of Information Disseminated by Federal Agencies
19. East Tennessee State University Researchers Toolbox
20. George Mason University
21. McGraw-Hill Higher Education, Evaluating Internet Resources
22. The Virtual Chase, Criteria for Quality in Information--Checklist
23. Know Your Bible
24. Wikipedia, Authors of The Bible
25. Ancient Historians, Part 1 and Part 2