## “Weight of evidence”, or why internally consistent theories that explain more are more likely to be true

This post is dedicated to Edenist Whackjob because he’s having trouble with this idea and I haven’t been able to keep up with all his comments. Also he is the best example of “massive lateral thinking” that I’ve ever seen :-P

For example, a detective may have two distinct theories about a murder: 1) the butler did it, or 2) somebody else did it. His job is to craft an internally cohesive narrative which fits all the facts together with perfect certainty. Not one fact may be out of place. If it can be proven that the butler has a solid alibi, then it doesn’t matter how much of the rest of the evidence points his way.

The question of what is a “fact” is a perceptual matter of induction and belief. Whether we believe that the butler’s alibi is “solid” (sound, in logical parlance) is a matter of perception. But if we accept it as truth, then the detective cannot also accuse the butler of the murder.

The narrative (theory) does not have to describe everything that has ever happened, from the solubility of salt to the Warsaw Pact, in perfect causal detail. However, it must explain as many facts as possible, with emphasis on those we consider relevant to the case. With this in mind, I am approaching a logical definition for the phrase “weight of evidence”.

Given a set of facts A = {a, b, c, d…} where each fact is weighted by “relevance”, a theory’s “weight of evidence” is the sum of the weighted facts it explains, divided by the total weight explained by all the competing theories put together. Remember, we are assuming all “theories” are internally consistent. If a theory is not internally consistent, then it has a 0% probability.
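Written out as a formula, the definition above might look like the following sketch. The symbols are my own shorthand and not part of the original wording: w(x) is the relevance weight of fact x, E_i is the set of facts that theory i explains, and W(θ_i) is that theory’s weight of evidence. Only internally consistent theories are included in the sum on the bottom.

$$
W(\theta_i) = \frac{\sum_{x \in E_i} w(x)}{\sum_{j} \sum_{x \in E_j} w(x)}
$$

Note that a fact explained by more than one theory is counted once for each theory that explains it, so the weights of evidence across all theories add up to 100%.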

For example, there are two competing theories “Butler” and “Not-butler” which are internally consistent. Butler explains a, b, c but d is unaccounted for. Not-butler explains c and d. The total weight T is then (a + b + c) + (c + d), with c counted twice because both theories explain it. The weight of evidence for the butler theory is (a + b + c)/T.

If they’re weighted equally (a = b = c = d), then T = (a + b + c) + (c + d) = (a + a + a) + (a + a) = 5a. And “Butler” = (a + b + c)/5a = (a + a + a)/5a = 3a/5a = 60%. That is, we can say there is a 60% chance that the butler did it. Not enough to convict, but enough to continue the investigation. (Because the other theory is mutually exclusive, “not butler” has a 40% chance, or 1 – P(“Butler”).)
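The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch of the calculation as described in this post: the function name `weight_of_evidence` and the dictionary layout are made up for illustration, and it assumes every theory passed in is already internally consistent.

```python
from fractions import Fraction


def weight_of_evidence(explained, weights):
    """Return each theory's share of the total explained weight.

    explained: maps a theory name to the set of facts it explains.
    weights:   maps each fact to its relevance weight.
    Assumes every theory given here is internally consistent.
    """
    # Sum the weights of the facts each theory explains.
    totals = {
        theory: sum(Fraction(weights[fact]) for fact in facts)
        for theory, facts in explained.items()
    }
    # T: a fact explained by several theories is counted once per theory.
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("no theory explains any weighted fact")
    return {theory: total / grand_total for theory, total in totals.items()}


# Equal weights, as in the example: a = b = c = d.
weights = {"a": 1, "b": 1, "c": 1, "d": 1}
explained = {"Butler": {"a", "b", "c"}, "Not-butler": {"c", "d"}}

scores = weight_of_evidence(explained, weights)
print({theory: float(score) for theory, score in scores.items()})
# prints {'Butler': 0.6, 'Not-butler': 0.4}
```

Changing the numbers in `weights` shows how much the result depends on the relevance assigned to each fact, which this sketch takes as given.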

If the probability is high enough, then in court terms we would say it is “proven beyond a reasonable doubt”.

### 40 Responses to “Weight of evidence”, or why internally consistent theories that explain more are more likely to be true

1. Edenist whackjob says:

Nicely explained. Do I have a problem with this? :)

• Aeoli Pera says:

No, I don’t think so. I have merely formalized the source of your uncertainty.

• Edenist whackjob says:

Example where I’ve given voice to that? Just trying to get it.

• Edenist whackjob says:

“Here, particularly,”

I don’t think I’m using your mode of deduction, though. This needs clarified.

• Aeoli Pera says:

I was pointing out where you expressed anxiety over the uncertainty of this method. But maybe I misunderstood your question.

• Edenist whackjob says:

“I may not be able to keep up anymore with my intent to reply to every comment, but reading and considering them all is the least I can do.”

Should I stop posting every random thought I have here? Put down your ground rules. No need to abuse your ingenopathy (ie need to consider everything worthy of a reply).

• Aeoli Pera says:

Of course not, you produce an extraordinary amount of good ideas! You really ought to have a blog, but I’m fine with you writing one in my comments if that’s easiest for you. And I enjoy interacting with you too.

• Edenist whackjob says:

“I was pointing out where you expressed anxiety over the uncertainty of this method. But maybe I misunderstood your question.”

I don’t think my situation is analogous to your whodunnit example. It’s not like I’m stringing together two internally coherent explanations for comparison. My trouble is more with what is actually relevant. Ie pattern-matching more than deduction.

• Edenist whackjob says:

“Of course not, you produce an extraordinary amount of good ideas!”

That’s great to hear. I often doubt myself (not a humble-brag, it’s the truth).

Maybe you should go through all of my comments and create some compilations, like you did with Game. Post it on the blog and see what people say. Maybe I’ll start a blog then.

Main problem with starting a blog: I sometimes suspect I am mentally ill (I *know* I have an anxiety disorder and a generally depressive/anhedonic outlook on life, but I don’t really know if I am eg paranoid schizophrenic too). Having a blog would reinforce those delusions that I maybe-have.

2. Edenist whackjob says:

Problem: how do you weight objectively? In your example you just assume equigravity of all links.

• Aeoli Pera says:

You can’t. There is no such thing as an objective perception.

• Edenist whackjob says:

So you can only say theory 1 > 2 if you first stipulate the weights to use.

• Edenist whackjob says:

Does that mean that your theory needs another paradigm where relevance is the key factor, or am I asking the wrong question here?

• Aeoli Pera says:

Yeah, a complete theory would have to explain how to decide what is relevant. The fact that ordinary people do this very well is just another miraculous property of the supercomputer upstairs.

• Edenist whackjob says:

“The fact that ordinary people do this very well ”

Can you explain a bit further? Not trying to be a dick, I just can’t relate to that statement.

• Aeoli Pera says:

If there’s been a murder, and there’s a gun on the floor, even a dull person will assign a high relevance to the gun.

If you don’t think that’s downright miraculous, then try to write a program to tell a computer how to do that in any situation.

• Edenist whackjob says:

“If there’s been a murder, and there’s a gun on the floor, even a dull person will assign a high relevance to the gun.”

It’s certainly the most salient thing, yes.

I just don’t see how that mode of reasoning is analogous to what I do.

I think a better analogy would be finding a room with a gun, a naked lady (alive), Donald Duck, and a bro from a frat, and then being asked “what genre is the movie?”

“If you don’t think that’s downright miraculous, then try to write a program to tell a computer how to do that in any situation.”

Well if you give it a corpus of text and ask it to pick out the most salient facts, then I think it would be within reason to expect an expert-system AI to pick out “gun” given the parameters “seek for relevant associations to ‘murder’”.

3. Edenist whackjob says:

4. Edenist whackjob says:

“Also he is the best example of “massive lateral thinking” that I’ve ever seen :-P”

Care to expand that a bit? I’m flattered :)

5. Edenist whackjob says:

• Aeoli Pera says:

That is downright pathetic. It is cruel not to inform an idiot that they are an idiot, and it is downright evil to give an idiot a position of real power over others.