Machine Intelligence

http://bi.snu.ac.kr/NRL/HBN/HBN_sf1_5000.jpg

Digital Statisticians

INST 4200

David J Stucki

Spring 2015

Introduction

Suppose you are trying todetermine if a patient hasinhalational anthrax. You observethe following symptoms:

•The patient has a cough

•The patient has a fever

•The patient has difficultybreathing

Introduction

You would like to determine howlikely the patient is infected withinhalational anthrax given that thepatient has a cough, a fever, anddifficulty breathing

We are not 100% certain that thepatient has anthrax because ofthese symptoms. We are dealingwith uncertainty!

Introduction

Now suppose you order an x-rayand observe that the patient has awide mediastinum.

Your belief that that the patient isinfected with inhalational anthraxis now much higher.

Introduction

•In the previous slides, what you observed affectedyour belief that the patient is infected withanthrax

•This is called reasoning with uncertainty

•Wouldn’t it be nice if we had some methodologyfor reasoning with uncertainty? Why in fact, wedo…

Probabilities

The sum of the redand blue areas is 1

P(A = false)

P(A = true)

We will write P(A = true) to mean the probability that A = true.

What is probability? It is the relative frequency with which an outcome would beobtained if the process were repeated a large number of times under similarconditions*

*Ahem…there’s also the Bayesiandefinition which says probability is yourdegree of belief in an outcome

Introduction - Bayes’ Theorem

Test A (test to screen for disease X)

Prevalence of disease X was 0.3%

Sensitivity (true positive) of the test was 50%

False positive rate was 3%.

What is the probability thatsomeone who tests positiveactually has disease X?

Doctors’ answers ranged from 1% to 99%(with ~half of them estimating theprobability as 50% or 47%)

Gerd Gigerenzer, Adrian Edwards “Simple tools for understanding risks: from innumeracy to insight” BMJ VOLUME 327 (2003)

http://www.capitalwired.com/wp-content/uploads/2014/12/a-few-helpful-doctors-ready-to-excuse-you.jpg

Test Terminology

Introduction - Bayes’ Theorem

Test A (test to screen for disease X)

Prevalence of disease X was 0.3%

Sensitivity (true positive) of the test was 50%

False positive rate was 3%.

What is the probability thatsomeone who tests positiveactually has disease X?

Doctors’ answers ranged from 1% to 99%(with ~half of them estimating theprobability as 50% or 47%)

The correct answer is ~5%!

Gerd Gigerenzer, Adrian Edwards “Simple tools for understanding risks: from innumeracy to insight” BMJ VOLUME 327 (2003)

Let’s do the math…

What is the probability thatsomeone who tests positiveactually has disease X?

Conditional Probability

•P(A | B) = Out of all the outcomes in which B is true, howmany also have A equal to true

•Read this as: “Probability of A conditioned on B” or“Probability of A given B”

P(F)

P(H)

H = “Have a headache”

F = “Coming down with Flu”

P(H) = 1/10

P(F) = 1/40

P(H | F) = 1/2

“Headaches are rare and flu is rarer,but if you’re coming down with fluthere’s a 50-50 chance you’ll have aheadache.”

Bayes’ Theorem

where A and B are events.

•P(A) and P(B) are the probabilitiesof A and B independent of each other.

•P(A|B), a conditional probability, is theprobability of A given that B is true.

•P(B|A), is the probability of B giventhat A is true.

Bayes’ Theorem

Belief

Prior distribution

Evidence (observed data)

Posterior distribution

The prior distribution is theprobability value that the personhas before observing data.

Disease X

Tests

The posterior distribution is the probabilityvalue that has been revised by usingadditional information that is laterobtained.

Bayesian Networks

Disease X

Tests

Belief

Prior distribution

Evidence (observed data)

Posterior distribution

BAYESIAN NETWORKS

Bayesian Networks

Disease X

Tests

Belief

DISEASE

Yes

0.003

0.997

DISEASE

Yes

TEST

Positive

0.50

0.03

Negative

0.50

0.97

Belief

Prior distribution

Evidence (observed data)

Posterior distribution

BAYESIAN NETWORKS

So let’s calculate it out…

A Bayesian Network

A Bayesian network is made up of:

P(A)

false

0.6

true

0.4

P(B|A)

false

0.01

false

true

0.99

true

false

0.7

true

0.3

P(C|B)

false

0.4

false

true

0.6

true

false

0.9

true

0.1

P(D|B)

false

0.02

false

true

0.98

true

false

0.05

true

0.95

1. A Directed Acyclic Graph

2. A set of tables for each node in the graph

A Directed Acyclic Graph

Each node in the graph is a randomvariable

A node X is a parent of another node Yif there is an arrow from node X tonode Y eg. A is a parent of B

Informally, an arrow from node X tonode Y means X has a directinfluence on Y

A Set of Tables for Each Node

Each node Xi has a conditionalprobability distribution P(Xi |Parents(Xi)) that quantifies the effectof the parents on the node

The parameters are the probabilities inthese conditional probability tables(CPTs)

P(A)

false

0.6

true

0.4

P(B|A)

false

0.01

false

true

0.99

true

false

0.7

true

0.3

P(C|B)

false

0.4

false

true

0.6

true

false

0.9

true

0.1

P(D|B)

false

0.02

false

true

0.98

true

false

0.05

true

0.95

Inference

•Using a Bayesian network to computeprobabilities is called inference

•In general, inference involves queries of the form:

P( X | E )

X = The query variable(s)

E = The evidence variable(s)

Inference

•An example of a query would be:

P( HasAnthrax| HasFever and HasCough)

•Note: Even though HasDifficultyBreathing andHasWideMediastinum are in the Bayesian network, they arenot given values in the query (ie. they do not appear either asquery variables or evidence variables)

•They are treated as unobserved variables

HasAnthrax

HasCough

HasFever

HasDifficultyBreathing

HasWideMediastinum

The Bad News

•Exact inference is feasible in small to medium-sized networks

•Exact inference in large networks takes a very long time

•We resort to approximate inference techniques which aremuch faster and give pretty good results

Person Model (Initial Prototype)

Anthrax Release

Location of Release

Time Of Release

Anthrax Infection

Home Zip

Respiratory

from Anthrax

Other ED

Disease

Gender

Age Decile

Respiratory CC

From Other

Respiratory

Respiratory CC

When Admitted

ED Admit

from Anthrax

ED Admit

from Other

ED Admission

Anthrax Infection

Home Zip

Respiratory

from Anthrax

Other ED

Disease

Gender

Age Decile

Respiratory CC

From Other

Respiratory

Respiratory CC

When Admitted

ED Admit

from Anthrax

ED Admit

from Other

ED Admission

…

Yesterday

never

False

15213

20-30

Female

Unknown

15146

50-60

Male

Bayesian Networks

•In the opinion of many AI researchers, Bayesiannetworks are the most significant contribution inAI in the last 10 years

•They are used in many applications eg. spamfiltering, speech recognition, robotics, diagnosticsystems and even syndromic surveillance

HasAnthrax

HasCough

HasFever

HasDifficultyBreathing

HasWideMediastinum