Our first presentation for today and our

next for the program is on statistics and probability theory and originally

this was scheduled to be a duet however because of weather issues and

flight cancellations and things of that nature it’s going to be a solo and I can

feel for our speaker Hal Stern because several years ago I was asked to

moderate a panel at Northwestern and I made it through weather to Chicago but

the panel didn’t and so being resourceful in finding that out when I

got off my flight that night in Chicago I went to an H &M and purchased a box of

socks and drew little faces on the socks, so we did sock puppets. Hal is, I guess, more daring than that. So, absent sock puppets, our next speaker is Hal Stern. Hal is Chancellor's Professor in the Department of Statistics at the University of California, Irvine. He came to UC Irvine in 2002 as the

founding chair of the department of statistics he served for eight years as

chair of statistics and then six and a half years as dean of the Donald Bren School of Information and Computer Sciences. Prior to joining the faculty at UC Irvine he had faculty appointments at Iowa State University and Harvard. Within statistics he's known for his research work in Bayesian

statistical methodology and model assessment techniques and for

collaborative projects in the life sciences and Social Sciences

his current focus is on the application of statistical methods in forensic

science and he is a co-director of the NIST funded center for statistics and

applications in forensic evidence, and also serves on the Physics/Pattern Interpretation committee of the Organization of Scientific Area Committees, or OSAC. Please welcome Hal Stern. Thanks, Matt. So what we learned from the introduction is I'm not as creative as Matt; that's number one. A couple of preliminary remarks. As Matt mentioned, this is kind of work I do with Alicia Carriquiry, who's the director of the center, CSAFE, at Iowa State University; she is at the AFTE meeting, the firearm and toolmark examiners' meeting, in Nashville. The

presentation I’m gonna give this morning and then the second half this afternoon

which are labeled in the program statistics one and two are based on a

presentation that we developed last summer just a little bit of background

that we did at the ABA meeting in Chicago for the Judicial Division, and so it was a partnership with a few judges, and what they told us

as we talked with them was that to really motivate it you should have some

cases so we would introduce a case and talk about it and then talk about the

statistics of the case. So that's the genesis of this. When we were planning

this, you know, I sent Matt that material; he thought it was good and a good way to introduce statistics and probability, especially probability, for this group. So that's the basis for it. I sent in a set of slides

so that we could get continuing legal education credit for this. The slides you're about to see look nothing like those slides. So the bad news is they look nothing like those slides, if you studied them or did anything with them; the good news is these are much better, and I'm more than happy to share these after the fact if you'd like. A teeny bit about myself, and the first thing I want to do is a little experiment. Okay... okay, well, I don't like that either. Okay, a teeny bit about myself, I mean beyond what Matt told you. So I am NOT a forensic scientist; I do not play one on TV.

I’m a statistician that’s all I do but for the last 20 years on and off and for

the last decade quite a bit I’ve been spending time on forensic statistics and

the role of statistics in the analysis and interpretation of forensic evidence

and we have this research center, CSAFE it's called, and I'll give an update about that later this morning. So that's what informs all of this, and so let me get started, still in the basics, setting the table or

context I think everyone in the room knows this we find ourselves in an

interesting time in forensic science for a variety of reasons. The National Academy report of a decade ago is at the top left; the PCAST report is the second one over. They analyzed the state of the world with respect to forensic science and its application in the courts, and expressed some concerns, especially about what we call the pattern disciplines and the scientific foundation for the kinds of statements that are being made, and so a lot of what you'll hear today focuses in and around that. Of course there are a number of exonerations, so I have something from the Innocence Project website as well, with about 40% of those having false, misleading, or incorrect forensic evidence as a contributor. And then Matt mentioned my involvement with OSAC. At the bottom of the slide are some of the things that have happened since those reports: the National Commission on Forensic Science, attempts to standardize procedures, and the like. So this part is where you have

more expertise than I, so I won't say a ton about it, but the key point for today is I'm going to try and refer back to this notion of reliability, especially this afternoon: the need for reliability in general and reliability in the case at hand. And all of this has led to an increase in the role of probability and statistics. I've spoken to a number of groups of lawyers and judges

over the last few years I’m always reminded when I do so that many of the

people I'm speaking to went to law school because they had no interest in probability and statistics or anything else mathematical. So I'm cognizant of that: formulas are, I can't say zero, but minimized. There's one that's central to the application of forensic evidence, the interpretation of forensic evidence, but there should be, you know, no tears, no heavy mathematics, none of that. Okay, it's conceptual, but it's real.

I’m gonna be talking about probability and where it comes from and how it’s

used and the like ok so as I said most of my work has been around what’s called

the pattern evidence disciplines and within that there’s a focus on what’s

called the source comparison question so there are typically two items of

evidence one found at the crime scene one associated with the suspect and the

goal is to determine if they have the same source and so lots of examples of

this the picture at the bottom left obviously is a latent fingerprint on the

left and an exemplar, or known, fingerprint on the right, and the question is: did the latent print get made by the same person to whom that exemplar on the right belongs? Same thing for shoe prints. But even though DNA is not called pattern evidence, the same question arises, right: there's a DNA sample, a blood sample or DNA derived from the crime scene, and DNA from the suspect; you know, do they match, and what's the importance of that match? So I guess I do have to point with this; it's a little bit easier, and I have to remember not to touch it. Okay, so again, framework-wise, I think

everyone agrees on what I have on this slide. That is, if you talk to examiners or you talk to statisticians about how this should be done, they all say we do the same thing, which is: we look at the two pieces of evidence, we assess the similarities we see and maybe any differences we see, and try to judge whether those observations are likely under the first hypothesis, that the objects came from different sources, and then under the second, that they have the same source. And I have my little scales here: we try to balance those and have that help us make a decision. From there, there are a variety of ways

that gets implemented and so again this is all in the form of background this is

a separate talk that I have where I talk about the three approaches to

interpretation of forensic evidence so there are three approaches that I think

of as common. If you bring another speaker here and ask how many approaches there are, you'll almost certainly get a different number; people parse things in different ways. But the way I'm parsing it for today is: number

one there is what is the status quo with many of the disciplines that I’m going

to talk about which is expert assessment the expert opinion so an examiner looks

at the evidence based on their experience training and the standards in

their field and that typically gets summarized by a categorical conclusion

in fingerprints you only get one of three answers from an examiner at the

present time there is an identification the two prints have the same source

there’s an exclusion the latent print was not made by the person whose known

print you have or an inconclusive no determination was possible for a variety

of reasons the second approach is called the two-stage procedure in it there’s a

first stage in which a binary decision is made

does the evidence match or not and once you determine that it matches

it's not always "matches"; "match" is not, in fact, a popular word, "indistinguishable" is a popular word. If the two evidence pieces are indistinguishable, then we would move on to the second stage, which is labeled identification. A little confusing, but that was the term used by Parker and Holford, and

identification in this case is meant to mean okay they match but how probative

is that how likely is it they might match by coincidence for example and the

third approach is the likelihood ratio. The likelihood ratio at present is applied

only in DNA and really only well in simple DNA problems but it’s the subject

of a great deal of discussion and so I will end up at the likelihood ratio okay
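Since the talk ends up at the likelihood ratio, here is what that third approach computes in its simplest form. The two probabilities below are invented purely for illustration, not taken from any real case:

```python
# The likelihood ratio weighs the evidence under the two source hypotheses:
#   LR = Pr(observations | same source) / Pr(observations | different sources)
# Both inputs below are invented for illustration.

p_obs_given_same_source = 0.90   # observations very likely if same source
p_obs_given_diff_source = 0.001  # observations rare by coincidence

likelihood_ratio = p_obs_given_same_source / p_obs_given_diff_source

# An LR of 900 says the observations are 900 times more probable if the
# two items share a source than if they do not.
print(likelihood_ratio)
```

An LR above 1 supports the same-source hypothesis; an LR below 1 supports the different-source hypothesis.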

So, case 1. The first case is very famous within the world of statistics; it appears in a lot of introductory statistics books. It's from California: People v. Collins, and you see a short summary of it here, but the basic description is that there was an elderly woman who was attacked. She remembered something, there were other witnesses who remembered pieces of it, and what came out of some of the witness statements was that the attack was carried out by a woman with blond hair in a ponytail, and then she met up with a black man with a mustache and a beard and a yellow car, and they drove off. The police eventually were led to an interracial couple that lived in the area that had a yellow Lincoln, so they were charged. And at the trial the prosecutor said, you know, these are the frequencies of these various characteristics that the eyewitnesses have identified: how often you would see a woman with a ponytail, how often you would see a yellow car in this neighborhood, etc., etc.
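For reference, the specific frequencies presented at trial, as reported in the published opinion in People v. Collins (1968), were the following; multiplying them, as described next, is what produced the famous figure:

```python
from fractions import Fraction

# Characteristic frequencies as reported in the People v. Collins opinion.
frequencies = {
    "partly yellow automobile":  Fraction(1, 10),
    "man with mustache":         Fraction(1, 4),
    "girl with ponytail":        Fraction(1, 10),
    "girl with blond hair":      Fraction(1, 3),
    "black man with beard":      Fraction(1, 10),
    "interracial couple in car": Fraction(1, 1000),
}

# The product rule for independent events: multiply the probabilities.
# (The appeal turned in part on the fact that these events are NOT independent.)
product = Fraction(1, 1)
for p in frequencies.values():
    product *= p

print(product)  # 1/12000000, the "one in 12 million" figure
```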

Then the prosecutor called a mathematics professor (a math professor, not a statistics professor; statisticians make fun of mathematics professors, but anyway) and told the mathematics professor: here are the probabilities for all these characteristics, can you help us combine them to get the probability of all of these characteristics being found in the same couple? And the math professor used a rule, which I'll go into in a little bit more detail, called the product rule for independent events, to multiply them together. So one out of every ten couples would have a yellow car, one out of every four of them would have a man with a moustache, one out of every ten of them would have a woman with a ponytail, and so on and so forth. You multiply them all together and you end up with a number that's one in 12 million, and the prosecutor used this to say: finding a couple like this that matches the eyewitness description is so rare, they must be the couple that committed the robbery. Okay, so that's the setup for this trial, and now I'm gonna do a little

bit of a deeper tour into probability and statistics. So that was the part you're supposed to be comfortable with; now comes the part you're not supposed to be comfortable with. Every single statistics course that I teach, from the first one

to the PhD students starts with this picture okay because this is the

fundamentally what goes on when you do statistics in any discipline there is

some population typically that you are interested in learning about and you are

typically not able to see the entire population so you focus in on a sample

and probability and statistics two terms Matt used in the introduction fit into

this story in the following way. The laws of probability (probability is a branch of mathematics): what they allow you to do is take information you know about

the population and use it to predict or calculate what you might see in the

sample. Statistics is the part that goes the other way: from the sample, and what I see in the sample, what can I say about what's in the population?
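That two-way picture can be mimicked in a few lines of code; the population and its 30% rate here are invented for illustration:

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Probability direction: a known population tells you what a sample should
# look like. Suppose 30% of a population of 100,000 holds some opinion.
population = [1] * 30_000 + [0] * 70_000

# Statistics direction: from one sample of 2,000, estimate the population rate.
sample = random.sample(population, 2_000)
estimate = sum(sample) / len(sample)

# The estimate should land near 0.30, with some sampling variability.
print(estimate)
```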

I have the circle kind of because that's actually the way it's applied: we will typically start by making some assumption about the population that tells us what we should see in the sample; we'll look at the sample and see if it matches what I thought, and if it does I draw one conclusion, and if it doesn't I kind of go back and say maybe I should change what I thought I knew about the population. So that's the way this gets operationalized. You know, there are two forms of logic here, right: probability is deductive, going from something I know about a population to a sample that I'm going to collect, and the statistics part is known as inductive inference, trying to get from the specific sample to the general population. In forensics it's not always crystal

clear, right. This story is crystal clear if you're going to do a political poll: there's this population of people, you want to know how they feel about something, you're going to take a sample of a hundred or four hundred or a thousand, and this picture works perfectly well. It's not always clear how it works in forensics, but it's there. So for the first part of the day I'm

going to focus on probability probability is the mathematical language

of uncertainty and the way that it works is the probability of an event the

chance of rain today for example is a number between 0 and 1 describing the

likelihood of the event sometimes expressed as a percent when we hear

about the chances of rain today, it's 60 percent, or today's is, I think, a little higher, you know that's equivalent to saying the probability is 0.6 or 0.8

or whatever so one little bit of notation that I’m going to use is

periodically: E, or some other letter if we like, is an event that I'm interested in, rain today, and Pr is short for probability, of course, so Pr(E) is the probability of the event, the probability that it rains today. And just to get you

to understand our scale: something having probability 0 means it can't happen, won't happen, will under no circumstances happen; probability 1 means a certainty. One thing you should know about statisticians: we don't believe anything has probability one. You could probably find things that have probability zero, but we never believe in probability one. There are two common interpretations of probability. One is as

the long-run frequency of the event in a number of repeated identical trials, and so I have a picture of a pair of dice. If you took statistics long ago, most of you probably did but have long banished the memory, there were lots of examples of dice and

coins and things like that they have the advantage of being easy to work with

fitting into this long term definition of probability you can do lots of rolls

of dice but they have the disadvantage of not being particularly interesting to

anyone other than compulsive gamblers the second definition of probability is

about subjective belief; it's about my opinion or my assessment. And so I grew

up in Queens not far from Shea Stadium so this is my first foray into Photoshop

so it says 2019 World Series champion New York Mets. Not likely to happen, but you could ask: what's the probability that the Mets are going to win the World Series? Now, I said there are things with zero probability, but, Mets fans, let's forget that for now. You know, there's no way to answer that question by reference to the long-run frequency. What does it mean to say, you know, if I played the 2019 season a thousand times? We do say that, I say that in my introductory courses, but it's kind of a ridiculous thing; it just gets played once. And so that leads to a more subjective view, and we use it all the time, right, when we have

to talk about how likely something is to happen right in terms of making

decisions all the time. All right: what deductible should I get on my auto insurance? You know, that requires you to think about (you don't think about it this way, but) probabilities of accidents, and how big accidents and small accidents are, and things like that. So that's the language; probability is just our language. And now, probability is very confusing. That is, people who study kind of the science of science, as it were, or the psychology of science, have studied and shown people have a very hard time with probability. You can ask all kinds of counterintuitive questions. There were two problems I was trying to decide between,

which one to demonstrate, so I'm only going to talk about the birthday problem. A more interesting one, but I decided it would take too much time, is the Monty Hall problem; so if you like probability after this and want to learn more, look up the Monty Hall problem, from Let's Make a Deal. But a famous one is

known as the birthday problem. Okay, so we have a room here of forty or fifty people. If I asked what's the probability that somebody in this room shares my birthday, it turns out that's actually unlikely; it's about a, you know, fifteen percent chance or something like that. If I asked what's the probability two people in the room share a birthday, that answer, whether you know this or not, is actually remarkably high. I don't know what it is for the size of this room; I know that the break-even point is twenty-three people. If you have twenty-three people, it's fifty percent that two of them share a birthday; with fifty it gets up to seventy-five, eighty, maybe even ninety percent, it's closer to ninety percent. It's actually quite interesting. So probability sometimes acts in counterintuitive ways; that is the only point of this question. You don't have to know why, but it does act in counterintuitive ways. The intuition for what's going on: obviously the first one is a specific date, and so it's pretty rare that somebody falls on that specific date; the second allows it to be any date of the year, and so it turns out that it's hard to place fifty people on the calendar without getting two on the same day. Okay, so that was probability. It'd be

good if we could stop there I think but what turns out to be most relevant for

forensics is what's known as conditional probability. So I'm gonna take our probability story and then modify it a little bit. The story you're about to hear is based on true events. I looked up my flight yesterday from LAX to JFK, and based on historical data that flight is delayed 27 percent of the time, I think, which means it lands 15 minutes or more late 27 percent of the time.

So that's what I knew, right, when I made the reservation: the probability that I was gonna be delayed getting into New York is 27%. Now the flight gets close, and I'm not a great traveler, so I'm checking the weather, and I see forecasts of thunderstorms in New York, and now, as I'm doing my own personal calculation, I'm saying wow, the probability I may be delayed now is greater, right? So more information led me to change the answer, and in mathematics we formalize that as conditional probability. So I had a probability based on one set of information, and now I update it because I have new information: forecasts of thunderstorms. A little bit more notation. So

there's a vertical bar here; that's the tricky part. Now, this is written in words, so you know, notation exists, right, because it's shorthand; it'd be a pain to write out the full sentence that I'm about to say over and over again. But the full sentence is important, and that's the one you want to know, which is: this notation stands for "what's the probability that the flight is delayed" (the vertical bar is read as "given") "given that thunderstorms are forecast." So we can do lots of conditional probabilities, right: what's the probability that the Mets win the World Series given that Robinson Cano gets hurt,

something like that I apologize if you’re not a baseball fan I’ll think of

other examples, so maybe. And this is where it gets tricky: the 27%, actually I can tell you where it came from; the 27% came from the FAA looking at the last month and determining that 27% of the time this particular flight was delayed. The update, I don't have the data for that, so that became me subjectively saying okay, my chances of being delayed have gone up to 50%. I was delayed by more than an hour, just for

your information how does this fit in with our picture the first statement the

probability of a delay is 0.27, or a 27% chance of a delay, is based on thinking of one population: all flights from LAX to JFK, actually all 11 a.m. flights, right. So that's the population, and in that case I happen to have data, as I said, that tells me that 27 percent of those are late. So one way to think about that is that the population has a bunch of little airplanes in it; you can think of that circle as having (I should have drawn this, but I didn't think of it) a bunch of little airplanes in it, and 27% of them are red because they're gonna be delayed and 73% are green because they're not gonna be delayed. Okay.

the second statement changes the population I’m interested in I’m not

interested in all of those flights I’m only interested in the flights that

occurred on days for which thunderstorms were forecast. So now I get rid of a bunch of those planes, because they did not occur on days when thunderstorms were forecast, and I look at the ones that are left in the population, and I look at how many of them are delayed and how many are not. So conceptually that's what I want to do, and we can't do that very easily in this case, so I made up a number of 0.5; but sometimes you can actually do that: you have one probability, and you condition, and you change the information set that you're working with. I didn't put this on the slide because some people don't find it helpful, but in point of fact what's really going on is that all probability is conditional. It's conditional on some set of information that you have. We often start with a very limited set of information, in many cases an absence of information, and so we just think of all possible outcomes,

and then we learn something that restricts this set but all probability

is conditional on what you’re assuming about the population so that’s

conditional probability and we’re getting closer to the Collins case okay
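The flight story amounts to counting within a shrinking population. A sketch with invented records, constructed so the overall delay rate matches the 27 percent quoted and the storm-day rate matches the 0.5 made up in the talk:

```python
# Each record: (thunderstorms_forecast, delayed). The counts are invented so
# that the overall delay rate is 27% and the storm-day delay rate is 50%.
flights = (
    [(True, True)] * 15 + [(True, False)] * 15 +   # storm days: 50% delayed
    [(False, True)] * 12 + [(False, False)] * 58   # clear days: ~17% delayed
)

# Unconditional probability of a delay: count over the whole population.
p_delay = sum(delayed for _, delayed in flights) / len(flights)

# Conditional probability: discard flights without a storm forecast,
# then count delays among what is left.
storm_days = [delayed for storm, delayed in flights if storm]
p_delay_given_storm = sum(storm_days) / len(storm_days)

print(p_delay)              # 0.27
print(p_delay_given_storm)  # 0.5
```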

a couple of things to help us understand conditional probability so this came

from a published article that looked at sentencing of murderers in Georgia, African-American murderers, and the people doing the study were trying to determine what factors are associated with this. They collected the data on all such cases and put them in the table that you see here. So, to make sure we're on the same page: there were 45 cases in which the murder had a white victim and the death penalty was given; there were 85 cases in which the victim was white and no death penalty was given. So that's how this table works, and there

are totals around the edges of the table so if I thought about this as my

population then I could first start by saying let’s think about the entire

population and ask what’s the probability that a death penalty

sentence was given and it turns out in total 59 of the 362 cases had a death

penalty sentence which is 16 percent probability 0.16 now I can ask for more

specific information suppose I look at that only for the cases with a white

victim so this moves me into my conditional probabilities I focus just

on the first row and I find out that in 45 of those 130 cases the murderer got

the death penalty so that was 35 percent and for a black victim it was 6 percent
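The table's arithmetic can be reproduced directly. The white-victim row (45 and 85) and the totals (59 death sentences among 362 cases) are quoted in the talk; the black-victim counts of 14 and 218 are inferred from those totals:

```python
# Death-penalty sentencing table, as quoted in the talk.
# Rows: victim's race; columns: death penalty given or not.
table = {
    "white victim": {"death": 45, "no death": 85},
    "black victim": {"death": 14, "no death": 218},  # inferred from the totals
}

total_cases = sum(sum(row.values()) for row in table.values())  # 362
total_death = sum(row["death"] for row in table.values())       # 59

# Unconditional probability of a death sentence: whole population.
p_death = total_death / total_cases

# Conditional on the victim's race: restrict attention to one row.
p_death_white = table["white victim"]["death"] / sum(table["white victim"].values())
p_death_black = table["black victim"]["death"] / sum(table["black victim"].values())

print(round(p_death, 2), round(p_death_white, 2), round(p_death_black, 2))
# 0.16 0.35 0.06
```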

and so conditional probability again changes where we're sitting, and maybe provides some clue as to what's going on. This is just really intended to

illustrate conditional probability it’s a very simple analysis it ignores a lot

of things and it’s not intended to be particularly germane to the subject

matter so sometimes learning additional

information doesn't change the answer. So my example here is: I could ask the same question, instead of wondering about thunderstorms, looking at what I ate last night for dinner, and ask what's the probability my flight's gonna be delayed given that I had pasta for dinner the night before. We certainly don't think that has an impact; you know, there is this whole butterfly-flapping-its-wings thing, that my eating pasta might have changed the world in some way, but it's not very likely. And so when that happens, when you end up in, I won't say good or bad, but when you end up in the circumstance that the conditional probability of a flight delay is exactly the same as the unconditional or the overall probability, we say that the two things involved, the flight delay and my pasta dinner, are independent of each other.
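Independence has a precise meaning: conditioning does not move the probability, which is exactly when the product rule discussed shortly is valid. A sketch with invented numbers for the pasta example:

```python
# Invented probabilities for (pasta last night, flight delayed), constructed
# so that pasta carries no information about delays.
p_delay = 0.27
p_pasta = 0.40
p_both = 0.108  # = 0.27 * 0.40, by construction

# Conditional probability of a delay given pasta.
p_delay_given_pasta = p_both / p_pasta

# Independence: conditioning does not change the probability ...
assert abs(p_delay_given_pasta - p_delay) < 1e-12

# ... which is exactly when multiplying probabilities is legitimate:
assert abs(p_both - p_delay * p_pasta) < 1e-12

# The same rule gives the familiar coin answer: Pr(two heads) = 1/2 * 1/2.
p_two_heads = 0.5 * 0.5
print(p_two_heads)  # 0.25
```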

so independence plays a huge role in your introductory statistics class and

in the application of statistics all of those examples again coin flips dice

rolls are by definition independent when I roll the dice this time it has nothing

to do with what came before or after and so we kind of know just from the

construction that they’re independent in forensics one well-known example of

independence is in DNA analysis: we look at markers, different alleles, at different locations on the human genome, and if you take a marker on one chromosome and a

marker on a second chromosome they are independent because of the way the

biology works and so we actually have real independent events in forensics

occasionally. Independence is relevant to the Collins case because I mentioned, in giving you the little build-up, that the math professor used the product rule.

What is the product rule? Well, we may be interested in the probability that two events both happen. So far I've only been worried about my flight delay, because I'm a very self-centered person and it's all about that, but maybe I'm interested in two different events and how they're related. And so I can ask what's the probability my flight is delayed and my luggage lost, so it's still all about me. Or I could ask what's the probability I get a head on my first coin toss and my second coin toss; that's what we do in introductory statistics classes. Or I can ask what's the probability the Toronto Raptors win Game 1 and win Game 2, for example, in the NBA. I told you they wouldn't all be baseball, right, so basketball too: that's the basketball championship, which starts tonight.

so such probabilities of more than one event can be very complicated to figure

out. Okay, why? Because if the things depend on each other, it can complicate the world, right. So if the Raptors lose Game 1 they may, quote unquote, try harder in Game 2 or something, so it might change the probability; it may not be right to think of the two games as just independent efforts by Toronto and Golden State, and what happens in one may affect the other. The product rule works for the case where we have independent events, and when you have independent events it turns out that you can just multiply probabilities. So that, for example, and I apologize, I try to avoid the simplistic coin tosses, but they're helpful here: if I asked what's the probability I get a head on both coin tosses, I'm gonna toss one, toss two, I know that there's a 50% chance that I'll get a head on that first toss, so that's half the time, and half of those times I'll get a head on the second toss, so that's how I get my twenty-five percent that we all know is the right answer. And it turns out that it's the product rule: 1/2 times 1/2 is 1/4. Again, you have the advantage

of not being asked to do any of this but I do ask your indulgence as we go kinda

back to the Collins case. This slide is just a reminder of the framework we set up: a set of probabilities were given, the expert witness said we can

multiply these together and the prosecutor drew the conclusion that

because the probability of all of those events happening in one couple was so

small that they must be the people that committed the murder sorry the robbery

in this case so let’s return to that case and tell you what happened

They were found guilty on a couple of counts. The husband, Malcolm Collins, appealed; his wife did not. There are some interesting dynamics there; I think he had previous offenses. And they appealed on two bases: the inappropriate admission of statements made while they were in custody, and they felt that the probability evidence was prejudicial to, I guess against, the defendant. Obviously I'm no expert on admission of statements,

so I will not address that part of the appeal process but we will talk about

the probability, based on what we just learned. And so here's a little play-along at home: was the trial court correct in admitting the probability evidence, yes or no, and if not, why not? Talk to your neighbor for a minute or

two, let me know what you think, and then we'll talk it through. Okay, I hear people making lunch plans, which means you must be done with the exercise. So, quick

show of hands before we reveal the judges', the Supreme Court of the State of California's, decision: who thinks the trial court was correct, so yes, they should have admitted the evidence? Nobody. Who thinks no, they should not have admitted the evidence? Okay. David Kaye, you're not allowed to answer. No? Okay, you're all correct. Um, you want to tell me why? Anyone brave? Okay, so, yeah, very good, very good. They're not independent either; that's both of those statements, they talk about the same thing, exactly right. Let me walk

through some of this the Supreme Court reversed the conviction they cited two

fundamental prejudicial errors which I’ve broken down into kind of three

points that I want to talk about. The first was, and they agreed with what's been said here, that the testimony lacked adequate foundation: in legal terms, it was wrong statistically. But the second point is also worth talking about: they found that the testimony, and in particular the prosecutor's use of said testimony, distracted the jury from its proper role in this case. So let me elaborate on those. The first question that they asked, that we did not hear here, is: where did the probabilities come from in the first

place, right? The prosecution did not provide any sources. These are hard things to know, trust me; I'm going to tell you in a second why I know that they're hard probabilities to find, actually. But the prosecutor didn't make any attempt at all to justify them, okay, and the court was concerned

about that in my opinion correctly what the court said in its decision basically

is there needs to be some basis for doing this it doesn’t have to be the

right answer but it needs some basis so for example you might walk around the

neighborhood and count the proportion of yellow cars right that would be some

data that would support a probability it wouldn’t make the number right you know

you could go to DMV maybe and find out right but they didn’t do any of that so

the court was very concerned about where the numbers came from, and I mention here

there's a case in the United Kingdom a few years ago about shoe prints in which a similar, not the same, but a similar issue came up, and the judge there was concerned about the way the probabilities were created by the expert, and in that case more how they were applied. But so: where did the numbers come from? And now, I think both of you were pointing out the independence, so I wanted to, you know, I can claim they're not independent, but I wanted to try and find some data, and that's why I know it's actually really hard to find out

about stuff like this. Here's what I could find: this is a survey from France, so not particularly germane to Los Angeles in 1968, but data is data, and I've sort of taken their survey and distilled it down a little bit. So all men are in one of four categories here: no facial hair, mustache and beard, mustache only, beard only. And if you look at these data, you can see that the proportion of the male population with beards is actually 44%, so that's actually higher than he gave, but as I say, it's a different population. And you can also see that the proportion of males with mustaches is 47 percent. So now, the math professor multiplied these two together, acting as if they were independent, and if you look at those two numbers and multiply them together, .44 times .47, I was told there would be no math in this course, you actually end up with .21, sorry. If they were independent, you

would expect 21% of men to have both a mustache and a beard but we see that in

fact 43% do the way that I like to think about this is to actually turn around

that just proves they’re not independent to understand how dependent they are we

can look at the conditional probability and the conditional probability says

let’s take restrict attention on my box at the left to only those who have a

beard so that’s category two and category for

a total of forty four percent of population so that becomes by new

population I’m interested only in that forty four percent I don’t know the

population of France I apologize but you know we’ll call it France fifty million

people I will say the 44 percent of them is therefore 22 million people or

something like that okay so I’m only interests in those 22 million men not

the 28 million who don’t have of those 22 million it turns out that 21 million

also has a mustache and so I end up with this conditional probability of 98% so

they are not independent in fact they’re super super dependents and when

something like this happens you can actually tell what’s going to happen

just this fact if this were true in Los Angeles we change one in twelve million

to one in six million so we start cutting it down so it doesn’t look so

these dependencies will have the effect of cutting down that probability so it

doesn’t seem quite so overwhelming so very important independence yes great
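The independence check above can be sketched in a few lines of Python, using the survey numbers from the talk (the variable names are mine):

```python
# Numbers from the French facial-hair survey discussed above (rounded as in the talk)
p_beard = 0.44               # P(beard): 43% beard-and-mustache plus 1% beard only
p_mustache = 0.47            # P(mustache): mustache-and-beard plus mustache only
p_both = 0.43                # P(beard AND mustache), read directly off the table

# If the traits were independent, the joint probability would be the product
naive_product = p_beard * p_mustache          # about 0.21, far from the observed 0.43

# The conditional probability P(mustache | beard) shows how dependent they are
p_mustache_given_beard = p_both / p_beard     # about 0.98

print(f"predicted if independent: {naive_product:.2f}")
print(f"observed joint:           {p_both:.2f}")
print(f"P(mustache | beard):      {p_mustache_given_beard:.2f}")
```

The gap between the naive product (about 0.21) and the observed joint probability (0.43) is exactly the dependence the court was worried about.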

thank you for asking that, Jim. All I did for those numbers was say, if I want to know the proportion of the population with beards, okay, I take the 43 percent that have a mustache and a beard and the 1% that only have a beard and add them up; so that's where those came from. Your question is very important; there are actually two comments here. One is, you know, these are based on a survey, and so there's some uncertainty associated with them; as in most surveys, there's kind of a plus or minus five percent, typically, depending on the size of the survey, and so it could be as high as, you know, forty-nine percent or as low as thirty-nine percent, but that would have very little impact on the rest of what's been said here. We're gonna come to your other point, by

the way, too. So those were the two things that the court was concerned with that I would call quote-unquote technical errors, right: where did the probabilities come from, and are they independent or not? That gets to the heart of whether the one in twelve million is remotely the right number. But the court was, I would say, probably even more concerned about the mathematics in this case being a distraction to the jury and to the lawyers, you know, and so you can see this quote, you know, about how misguided they

believe this effort was. In particular, the court believed that this was not helpful, that it kind of seemed to excuse the jury from doing the heavy lifting of figuring out whether these were the people or not. The prosecution laid it on heavily that these were the people, because there's this mathematical, and I believe they used these terms, a mathematical proof of guilt: the chances of finding another couple like this were so remote that it had to be this couple. And we'll come back to that; that's an example of a very common error that's made, known as the prosecutor's fallacy, that is, taking a probability, even if the one in twelve million were correct, and interpreting it the wrong way. But I'm not emphasizing that now; I'm emphasizing more here the court's concern

that there is this kind of exactitude and overwhelming nature to the mathematical calculation that led the jury not to worry about things. Here you see they mentioned exactly what you mentioned, that these traits are changeable, so it's not right to use them to assess whether the Collinses were the right people. Part of the thing the jury has to decide is whether there might have been a case of disguise, or, you know, cars can be painted in your time frame, and so on and so forth, and whether there might be another couple. There's a fairly elaborate mathematical calculation in the appendix to this Supreme Court decision arguing that it's quite likely there's another couple; I didn't love that part of it, but that's beyond our scope for today. So

here are a few more quotes from the court, my favorite being the third one down: mathematics, a veritable sorcerer in our computerized society, and remember this is 1968, what would they think of our computerized society today, while assisting the trier of fact in the search for truth, must not cast a spell over him. I found this really interesting. I've talked about the Collins case, as I said, in introductory stat courses and otherwise, but it was in preparing for this, and really just for this, in my previous legal talks I didn't quite get it as much, that this part about the distraction became really interesting. Looking at it 50 years on, in the rear-view mirror: obviously we do this now, and so, you know, DNA calculations are very much of this form, better justified in fact, but very much of this form, and so that kind of evidence is relevant and, when appropriate, is now being used. So I think that's also interesting. Okay, okay, I'm

done. Okay, case 2: State of Connecticut versus Skipper. The defendant in this case was charged with multiple counts of sexual assault, with a child actually; it's a terrible case. At trial, the state's expert witness from a testing lab reported on the results of a paternity test, and the results of that test, as well as other evidence, led to a conviction. What the expert reported in the case is something that is called a paternity index. It turns out you don't necessarily know what this is yet: the paternity index is a likelihood ratio. I mentioned likelihood ratios are one way of assessing evidence; the paternity index is a likelihood ratio, so it's a ratio of two probabilities. The first is: what's the probability that the defendant would have produced a child with the given genotype, given the mother's genotype? And the second probability is the probability that a random man could produce a child with that genotype, given the mother's genotype. And that number was 3,500 or so. So that's one way of assessing evidence, and we'll talk more about this, but the state's expert went one step further. The state's expert indicated that we can take that paternity index and turn it into a statement about the probability that this man was the father, and he did that, and the answer he came up with was ninety-nine point nine seven percent. So, very compelling.

Okay, so I'm gonna do again a little bit of a probability and statistics piece, and then we'll come back to that case. So far, from the Collins case, we know about probabilities and conditional probabilities and whether we should multiply them or not. The Skipper case raises the bar on us a little bit: the Skipper case uses more sophisticated probability calculations, and I won't go through them in great detail, but in particular it takes those probability calculations and uses them through the likelihood ratio. So remember I mentioned there are three approaches to analyzing evidence. You can think about the Collins case as being an example of what I call the two-stage process: stage one was to see, I have all these characteristics from the eyewitness, and then I have the suspects, and they match; and then in the second stage I figure out how important it is that they match, what's the chance of a coincidental match. So that's what's known as the two-stage approach. This case brings us to the likelihood ratio approach, and I'm gonna get to likelihood ratios, but before I do that I need to do a little bit more probability, and in fact even a little bit more confusing probability, so apologies in advance. Historical note:

Thomas Bayes was an English mathematician, philosopher, and minister, and he's very well known in the world of statistics for his work on quote-unquote inverse probability. A word on inverse probability, with my picture reproduced on the bottom here: remember I told you probability goes one way, from population to sample, and statistics goes the other way. What is now called statistics is something that Bayes and other mathematicians of the 1600s and 1700s called inverse probability; that is, we're trying to go in the opposite direction, right, from the sample to the population. We now use the term statistics for that, and in recognition of Bayes' mathematical contribution, the approach that I'm about to describe is known as Bayesian statistics. Interesting fact for the day. So his contribution is known as Bayes' theorem, and to talk about Bayes' theorem

it's always nice to have an example. Here's a little forensic example. I have now two events that I'm interested in. G is the event that there's gunshot residue on a person, on the suspect's hand, and in this case I'll sometimes want to refer to no gunshot residue, so I'll just say not G. And then I have a test for gunshot residue; we'll call the outcome of the test T, for test. So T is the test is positive, the test shows gunshot residue on the person, and not T, of course, is a negative test. Everything I'm about to say you can do with a home pregnancy test or a drug test or whatever you like. It's common to think

about these testing scenarios in a two-by-two table. This is our second two-by-two table, and we're not even an hour in; 2×2 tables are all over statistics. So in this two-by-two table, the two rows are the truth. That's usually not known to us; that's why we do the test. We don't know if the person has gunshot residue or not, but the rows are labeled by the truth: G, the person has gunshot residue; not G, the person does not have gunshot residue. And the columns are labeled by the test outcome: T, test positive; not T, test negative. Why some of the fonts are different, I don't know; they weren't supposed to be. The four cells then correspond to the following: if the person had gunshot residue and the test was positive, we would call that a true positive. If the person did not have gunshot residue and the test was positive, that's what's known as a false positive test, and the same for the negatives in the right-hand column. These are in fact conditional probabilities, and we love conditional probabilities, we already know that, okay. So they're expressed at the bottom here. The true positive rate is the probability that the test is positive given that the suspect does in fact have gunshot residue on their hand; that's sometimes known as the sensitivity, that is, this same language works for medical diagnostic tests, and there it's called the sensitivity. And the true negative rate is known as the specificity: how specific is the test, when it says no, is it really no? So sensitivity asks, if it's true, will I detect it, am I sensitive enough to detect it; and specificity asks, when the test says no, does it really mean no. And then you have these error rates, when I

have false positives and false negatives. So the way these tests typically work, if you've ever done the home pregnancy test thing, is the box will give you their sensitivity and specificity based on lab tests, so we know those numbers, okay. So we know how the test works, and Bayes' theorem, he wasn't working on pregnancy tests, but Bayes' theorem is actually a mathematical result for turning the conditional probability around. Why would we want to do that? Because what matters to me, as the person who may be pregnant, is the test came back positive; what's the chance that I'm really positive, that I'm really pregnant? This stuff can be confusing, okay, lots

of, I tried to minimize symbols, but lots of symbols and lots of confusion. Let's think about these two things: probability of T given G, the probability the test is positive given the person has gunshot residue; and the second one, probability of G given T, the probability the person actually has gunshot residue given the test is positive. Those two things are different; that's the first thing we want to know, right? They're different; they're talking about different things. The first is talking about the probability of getting a positive test; remember, the conditioning part just says what population we're looking at, it's only looking at the population where people actually have gunshot residue, but it is fundamentally about the probability that the test is positive. The second quantity that I pointed to is talking about the probability the person has gunshot residue on their hand. It's a probability that has nothing to do with the test; in fact, I already know the test was positive, that's what I'm conditioning on, that's the information I'm given. So the first is a probability about the test, the second is a probability about gunshot residue; the conditioning says which population you look at. And Bayes' theorem is a mathematical theorem that tells us how to flip it, to get from one to the other. Okay, I'm sparing you the formula, by the

way. Bayes' theorem turns out to do this, and this is a critical point, actually pay attention, this comes back, critical point: to do this, you have to know something about the prevalence of whatever it is you're testing for. If everyone in Wyoming has gunshot residue, that's different than in New York, where it's presumably a smaller proportion; I won't hazard a guess, I haven't lived here for many years. So that prior prevalence turns out to matter; I will demonstrate that, but without demonstrating the formula, it matters a lot. So let's see how this works,

and these are made-up examples, intended to be plausible. So here's what I assume known: on the back of my gunshot residue test box it says the test is sensitive; if there is gunshot residue present, if you are pregnant, there's a 98% chance that the test will come back positive. And the test is pretty specific; there are a few false positives: if you do not have gunshot residue, or if you are not pregnant, the test will come back positive only 4% of the time, so it gets it right 96% on that side. Sounds like a pretty good test: it gets it right 98 on one side and 96 on the other. And for my first demonstration, let's assume that we are

in Wyoming, and that 90% of the people have gunshot residue on their hands. Unfortunately, you can't do Bayes' rule without that information, okay. So how do we do this? The dreaded tree diagrams, okay. Let's imagine a thousand people, because the probability math is not fun, but we can all do this, because it just requires thinking about people. Let's suppose you have a thousand people, and I told you a few things. I told you that 90% of the people in this population in Wyoming, a thousand, that's the whole population, have gunshot residue. So 900 of them actually have gunshot

residue on their hand; a hundred of them do not. The second part of the tree is, let's do the test on those people. Of the 900 people who have residue, we run the test, and I told you that for 98% of them the test is going to come back positive and for 2% it's gonna come back negative, which means that 98% of 900, or 882, people have the following characteristics: they have residue and the test correctly identified it. 18 people have residue, but the test made an error and said they did not. On the bottom, we do the same thing for the 100 people that don't have residue, and here I told you that the test was right 96% of the time and wrong 4% of the time, so four of the people got a positive test even though they did not have gunshot residue, and 96 percent, in this case 96 of the people, did not have residue and correctly got a negative test. Questions?

I should have said this at the very beginning, it's a little late now, we're an hour in: interrupt with questions as you like, I welcome them. Okay, at the end of the day, what Bayes did is collect the information in the right-hand column. What do I know now? I want to look at the part of the population that got a positive test. How many people got a positive test? 882 plus 4: 886 got a positive test. How many of them actually had gunshot residue? 882; the other four were errors. So the probability that a person has gunshot residue, given that the test was positive, is very, very high, ninety-nine and a half percent. That's the way we think the world should work: I have a very accurate test, I got my answer back, boom, the person has gunshot residue, we can bring that into the courtroom. Let's do example two, the same

exact calculation, I'm just gonna change one number. We live in a tranquil place where gunshot residue is rare; in this place, only 5% of the population has gunshot residue, 95% do not. Go through the same spiel, and I won't bore you with the calculations. I just made that number up, by the way; it's not really knowable, right. It turns out, in this case, at the end of the day there are many fewer positive tests, of course, because there are fewer people who have gunshot residue, but there are a total of 87 positive tests, only 49 of whom have gunshot residue. The probability that you actually have gunshot residue in this place, given that the test came back positive, is 56%. There's this large number of people

going through the no-gunshot-residue branch, and some of them come back as false positives. When people talk about drug testing everyone, this is what we worry about: there's a small probability that people are using drugs, maybe, hopefully, and when you do the Bayes' rule calculation for the drug test, you end up with a reasonably high proportion of false positives. And, you know, there are debates about breast cancer exams, how often you should have them; same exact issue if you do exams on everyone. The debate is around young women: they have a lower probability, and so they have more false positives, and so there's a cost to doing that, both a monetary cost but also a cost in people's lives, and there's no obvious answer, what's the right trade-off there.
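The two tree-diagram examples above can be reproduced with a short counting sketch in Python (sensitivity, specificity, and the two prevalences are the talk's made-up numbers; the function name is mine):

```python
def p_residue_given_positive(prevalence, sensitivity=0.98, specificity=0.96, n=1000):
    """Bayes' rule by counting: of n people, what fraction of the positive
    tests come from people who really have gunshot residue?"""
    have = prevalence * n                  # people with residue
    lack = n - have                        # people without residue
    true_pos = sensitivity * have          # correct positives
    false_pos = (1 - specificity) * lack   # incorrect positives
    return true_pos / (true_pos + false_pos)

# Wyoming-style example: 90% prevalence gives 882 / 886, about 99.5%
print(p_residue_given_positive(0.90))
# Tranquil-place example: 5% prevalence gives 49 / 87, about 56%
print(p_residue_given_positive(0.05))
```

Changing only the prevalence moves the answer from about 99.5% down to about 56%, which is the whole point: the test's accuracy numbers alone don't determine the probability that matters.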

Okay, so that's why Bayes' theorem is important: it allows us to do this calculation, this quote-unquote reversing of the conditioning. So, if you remember Ghostbusters, right, don't cross the beams; we're kind of crossing the beams here. That is, we have information about the test given the true state of the world, but we're much more interested in the true state of the world given the test. Okay, so what does this have to do with forensics,

okay? So here's a framework that will take us through most of the rest of my remarks today, both now and in the afternoon. I should have said that the same set of slides kind of carries into the afternoon; I have a possible breaking point, we'll see if I get there or not. The E in this case is evidence, just as a generic placeholder. It could be two DNA profiles; it could be two sets of measurements on glass fragments, fragments found at the scene of the crime and fragments found on a suspect. So, lots of examples. And there are two hypotheses:

the hypotheses here are in some respects playing the role of tests, no, sorry, I said that wrong: the hypotheses here are playing the role of the true state of the world, does the person have gunshot residue or not. In this case, the two states of the world that I want to talk about are what's labeled here H with a little s, which is the same-source view of the world: these two sets of measurements I have came from the same source; the glass at the crime scene and the glass on the person's jacket are from the same window, okay. H sub d is the different-source proposition: the two samples have different sources; this guy has glass on his jacket because he was working on his house this morning, two different pieces of glass, okay. And then you can see how I can kind of put this in the framework that we just talked about for the test and the

gunshot residue. We can talk about, first, the probability of the evidence, the test, given the true state of the world: how likely is it that I would see evidence like this if the suspect is the source? And second, how likely is it that I would see evidence like this if it were not the source, if the suspect were not the source, if they were not from the same source? It turns out that's something we can often say something about; just as we could in the gunshot residue example, we may be able to say something about the probability of the evidence given the true state of the world. You know, the drug companies that develop pregnancy tests do experiments with samples where they know the answer, and that's why they can tell you the probability the test will come back positive if you are, right. In this case, there's often scientific knowledge that allows us to say: if in fact these two samples came from the same source, I know how similar I would expect their glass measurements to be; I know I would expect their DNA profiles to match exactly. So there's stuff that we can figure out for those two. But what do we want to know in the courtroom? We actually want to know the opposite, the flip, right: we want to know, I've seen the evidence, can you tell me now about the probability of these two hypotheses? So this is why what we just went through is

relevant to us: we can do one version of the probability, and we want to flip it if we could, okay. But remember, flipping was hard; it required some information about the population. So, a little bit about the likelihood ratio, and then we'll return to the Skipper case. The likelihood ratio, as I said before, is getting a lot of attention. It's a statistical concept, and there are some people, and I would probably count myself among them, who believe that it provides a single unifying logic for thinking about forensic evidence, that everything could kind of fit in there; not, not everything. The likelihood ratio, I want to point out, is not specific to forensics at all; it was not developed for forensics, it doesn't care if forensics uses it or not, okay. It has a very nice business on its own already: it's used in statistics to generate all kinds of tests, and it is used with medical diagnostic tests. Europe loves the likelihood ratio; I can't tell you why, they just do, I'll tell you more about it later, but they've kind of agreed with the statement that I made: they think the likelihood ratio is the right way to do this. More on that later. So here's

the formula. This is one way of writing Bayes' theorem, the mathematical theorem. It turns out to be super, super important for the ongoing discussions about the likelihood ratio, and it turns out to be super, super important for the Skipper case that we're talking about. So again, E is evidence, H sub s same source, H sub d different source. I'm not gonna prove this; there is a proof, but I'd just like to say that it is true, this part is not up for debate, it's mathematically true, okay. Sometimes people say, I don't know if I want to use the likelihood ratio; you don't have to use the likelihood ratio, but this is mathematically true. And what it says is: the piece in the middle is what's known as the likelihood ratio. It is a ratio of two probabilities, call it a likelihood, a probability is the likelihood of an event happening. So one is the probability that I would see evidence like this if the two evidence samples come from the same source, and the denominator

is the probability I would see evidence like this if it has a different source. And so, you know, if you think about fingerprints, what's going on? The evidence is the latent print and the known print; the evidence is, I look at them and I say, oh my god, they agree on this bifurcation, they agree on this ridge ending, but gee, it's kind of blurry there, so maybe it's there and I'm just not seeing it. That's the evidence; it's a little fuzzy evidence, but that's evidence. And I say, or the examiner says: how likely is it that I'd see these matching features, and these few dissimilarities in those blurry areas, if it was the same source; how likely would I see this much matching and these smudges if it were a different source? That's the likelihood ratio. What Bayes' theorem tells us how to do is take those probabilities and turn them into what I think we all agree the jury wants to know, which is: given this evidence, what's the probability that this suspect was the source of the evidence? And it's expressed here as a ratio, the odds that the suspect was the source of the evidence; a ratio of two probabilities is the odds. So this is the odds that it's the same source rather than a different source. And what Bayes' theorem tells us, what we saw in the simple examples, is that to turn the conditional around, you need some information about the true state of the world. You need to know beforehand, sometimes in mathematics we say a priori, before looking at the evidence: how likely is it that it's the same source, how likely is it that it's different sources? So Bayes' theorem is a way of taking prior information, you know, you can think about guilt and innocence, we're really only talking about one piece of evidence here, but prior information about same source versus different source, updating it with the information I see in the evidence, this is the good part, we actually have data about this, and that allows you to speak about the posterior probability, what's known as the a posteriori probability, the probability after looking at the evidence, okay.
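In odds form, that update is a one-line calculation. Here is a sketch with illustrative numbers only: the prior odds below are entirely made up for the example, while 3,500 is the paternity index figure from the Skipper case:

```python
def update(prior_odds, likelihood_ratio):
    # Bayes' theorem in odds form: posterior odds = likelihood ratio * prior odds
    return likelihood_ratio * prior_odds

prior_odds = 1 / 1000      # hypothetical prior: 1-to-1000 on same source
lr = 3500                  # paternity index reported in the Skipper case
post_odds = update(prior_odds, lr)        # 3.5-to-1 in favor of same source
post_prob = post_odds / (1 + post_odds)   # convert odds back to a probability
print(post_odds, round(post_prob, 2))
```

Notice that the likelihood ratio comes from the evidence, but the posterior can't be computed without some prior, which is exactly the issue the Skipper case turns on.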

There's actually a lot of subtlety here, and I know that lawyers like fine print and footnotes, so this is my fine print and footnotes. There's a lot that you could talk about here that we're not going to talk about today, including something that confuses a lot of people, even in the statistics world, which is whether we're calling it a likelihood ratio or calling it a Bayes factor; for us today, they're the same thing, okay, okay.

So again, I already said this: the likelihood ratio is an attempt to generate a quantitative summary of the strength of the evidence by comparing two probabilities, how likely the evidence is if it's the same source, how likely the evidence is if it's a different source, okay. So, as we talked about, there's a lot of room for confusion here, so let me tell you a couple of common confusions that arise. We'll start with an example. Let's think about DNA, because it generates big numbers. Suppose I determine that the probability that the two profiles match at random, that is, the suspect is not actually the source of the DNA sample at the crime scene, the suspect is innocent, they were nowhere near there, what's the probability that they would show up with a matching DNA profile, turns out to be one in a million in this case, because I made it up, so I know it, okay.

So that's the probability of the evidence given a specific hypothesis, in this case that it was not the suspect. The prosecutor's fallacy, which is the term I introduced earlier in talking about the Collins case, is to look at that and say, wow, the chances of getting evidence like this if the person was innocent is one in a million, and to read that as saying the probability that this person is innocent is one in a million. Those are not the same thing: the first is a statement about the evidence, the second is a statement about the hypothesis. So that's why it's called a fallacy, obviously. If you do that, then you are led, as the prosecutor did in the Collins case, to say, wow, these really low numbers mean they must be guilty, because the chances they're innocent are so low. But that is not what the data say; the data are telling you how likely you are to get the evidence under one particular hypothesis, but as we saw, we want to compare two hypotheses, so we know we can't get the answer that the prosecutor wants by just looking at that number. There's a corresponding, I

guess since everything is fair play in our adversarial system, there's a corresponding idea that's known as the defense attorney's fallacy. Here, at least in this case, one in a million in a US population of more than 300 million means nearly 300 such people in the US. So the evidence is not helpful, says the defense attorney: yeah, my suspect matches this, but gee, there are 300 other people, so, you know, who cares, not that unusual an event. So both of those are wrong, in part because you need to look at the two hypotheses to judge how much they tell you about the evidence. So how

do you describe the likelihood ratio? This is how I describe it, and it may or may not be helpful in court. If the person is the suspect, we assume the probability they'll match is one, certainty; that's not quite true, but we'll allow that. If they're not the person, I told you the probability of the evidence, of a match, is one in a million. The likelihood ratio is the ratio of those two numbers, one divided by one in a million, and the answer turns out to be a million. And then I would say: the evidence that we see in this case, ladies and gentlemen of the jury, is a million times more likely if the suspect is the source of the evidence than if a random other person was the source of the evidence. That should sound familiar, because in the Skipper case that's what the paternity index said, right: I compared the probability that the suspect was the father to the probability that a random man was the father, and I came up with a number. I'm gonna say a little bit more about

this later this afternoon. You can probably vouch for this: this stuff is hard to understand, and if this group of highly educated people has trouble understanding it, what's a jury gonna do with it? Interesting question.
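One way to see why the likelihood ratio of a million is not itself a probability of guilt: the posterior depends on the prior. A sketch with made-up priors (the one-in-a-million random-match probability is the invented number from the example above):

```python
# Likelihood ratio from the DNA example: certain match if same source,
# one-in-a-million match if a random other person is the source
lr = 1.0 / 1e-6

# Three hypothetical prior odds that the suspect is the source
for prior_odds in (1 / 1_000_000, 1 / 1000, 1.0):
    post_odds = lr * prior_odds
    post_prob = post_odds / (1 + post_odds)
    print(f"prior odds {prior_odds:g} -> posterior probability {post_prob:.3f}")
    # prints 0.500, then 0.999, then 1.000: the same million-to-one
    # likelihood ratio gives very different posteriors
```

The same evidence that is nearly conclusive under one prior leaves the question at a coin flip under another, which is what both the prosecutor's and the defense attorney's fallacies miss.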

I mentioned the Europeans like likelihood ratios very much; they have endorsed this as the appropriate way to report evidence. They recognize that the numbers are tricky for people, and so they created a verbal equivalent scale. If the likelihood ratio is really large, like at the bottom, you can either give the likelihood ratio, that is their preference, that you would give the likelihood ratio, but if you don't want to give the likelihood ratio, you can say the evidence provides extremely strong support for the same-source proposition relative to the different-source proposition. So there's a verbal scale. They found that these are less precise, obviously, than the numbers, but may be easier for people to understand. This is so important that I feel like I say it on 12 slides, okay. So, you

Okay, so, you know, we don't want to cross the beams: the probability of the evidence given the same-source hypothesis is not the same as the probability of the same-source hypothesis given the evidence. In words, thinking about the Skipper case: the probability of getting the evidence given that the suspect was in fact the father is not the same as the probability that the suspect was the father given the evidence. They are probabilities of different things. One is a probability of evidence, the probability of a specific profile; the other is a probability of a hypothesis, of a specific state of the world. They have to be different.
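A tiny numeric sketch, with made-up numbers, shows just how far apart those two conditional probabilities can be:

```python
# Made-up illustration that P(E | H) and P(H | E) are different quantities.
p_h = 0.001             # prior probability that the hypothesis H is true
p_e_given_h = 0.99      # probability of the evidence if H is true
p_e_given_not_h = 0.05  # probability of the evidence if H is false

# Total probability of the evidence, then Bayes' theorem.
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
p_h_given_e = p_e_given_h * p_h / p_e

print(f"P(E|H) = {p_e_given_h:.2f}")   # 0.99
print(f"P(H|E) = {p_h_given_e:.4f}")   # about 0.019, a very different number
```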

And the last two bullets here, the last two items, say: remember, to turn it around in this way, we must have what I called here the pre-evidence, prior opinion. Before looking at the evidence, if I'm going to give you this posterior probability, I must have had an a priori opinion about the same-source hypothesis. And there's an open question about whether we want our forensic experts to form such opinions before they look at the data; most people, I think, believe they should not. How are we doing? Okay. So, to quickly run through it: yes, the argument I make goes, the likelihood ratio is a summary of the evidence, and what gets done with it is the jury's to manage, within their purview, right? They have to decide which hypothesis is true, and it is appropriate for the jury to have some opinion once we get to this point in the trial; they've heard a lot of stuff already.

Alright, the evidence now comes in. They've heard stuff; they have some prior opinion, or not. Then an expert sits down and says, you know, wow, this glass evidence is much more likely if the suspect was at the scene of the crime than if not. That is, to my mind, a statement of fact, or close to it: there's evidence there, there's a mathematical summary of the evidence there, I made certain calculations and I got this answer. I'll come back to this. So the argument I'm making is that that part can reasonably be shared as the expert's view of the strength of the evidence, and then the jury gets to update. What I worry about is the expert saying, based on my analysis, I'm 99% sure that the suspect was at the scene of the crime, because for the expert to do that, they must have had a prior opinion that, in my view, they're not entitled to. Okay, I'm not sure I

completely addressed your problem. Well, the likelihood ratio, I would argue, is what I would like in court, because that's the summary of the evidence, so that's the piece that should inform the decision maker. The second bullet here is: if the forensic expert is testifying about the second quantity, they are usurping that responsibility from the decision maker. They are opining about the hypothesis; they should opine only about the evidence, in my view. [Audience member:] Once presented with discovery, if they did intend to use this evidence, we'd challenge it, and there would be hearings; the trier of fact would then determine whether it offends the rules of evidence, whether it's probative and not outweighed by the prejudicial, whether it would confuse the jury or delay the trial. That's usual. And if it did come in for a trial, you would have learned that particular area and formulated some counter, which is usually the flip side of the statistical argument the individual made, and you'd get up there and intelligently cross-examine them: okay, could it not also be said that probability XYZ holds, is that not correct? And I'd have to say yes. So now we've given the trier of fact both sides to weigh. [Stern:] So, obviously, that's a critical

part of what we do. But what I'm talking about... sorry, to me that's a critical part of what we do; I try not to say "but," because it is a critical part of what we do. The discussion we're having now is somehow a teeny bit different than that, which is: take something like fingerprint evidence, which is very well accepted, so you're not going to challenge fingerprint evidence. I mean, you might, but it's probably coming in under current interpretations. Okay, the discussion I want to have is: there's a committee right now working on what standards we should make for fingerprint examiners testifying about fingerprint evidence, and I want my points to inform that, and I want you to be aware of it, so that when you cross-examine you can say: wait, didn't your committee say that you shouldn't make this statement about posterior probabilities, that you should only make this other statement? So you're correct, it often plays out the way you described, but we're talking here at a different level of the discussion, even once the evidence has been admitted, about what that exchange might look like. We'll talk a lot more about that in the afternoon, so I'm going to put it aside for now. Great questions. Yes, sir. Well, thank you. The question is what

happens in Europe, you know, with respect to this potential source of confusion about the probability of the evidence versus the probability of the hypothesis. The nice thing is the Europeans are very clear on this: they are supporting the likelihood ratio, making statements about the probability of the evidence, as the way to summarize the evidence for the trier of fact, who adds it to their scales and makes their decision. So they're on the same side I am in terms of the likelihood ratio. Sometime later this afternoon we'll talk a little more about what the Europeans say; there are some things they do that I don't necessarily like very much. Right, the question is: where does this likelihood ratio come from? I haven't said much about that yet; I was about to, but questions are good. Yeah, great questions. Great question; so you didn't like mine? Sorry. Nobody knows the answer to that one; we're struggling with that.

One idea that people have floated is a way of saying how strong this evidence is: well, a million is as if I got heads 20 times in a row. You know, your surprise at this evidence should be about the same as your surprise if I got 20 heads in a row. That's one idea people have thrown out, but it is not obvious how this should be done, and anyway, a little bit more on that later.
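That coin-flip calibration is easy to check: 20 heads in a row with a fair coin has probability one in about a million, so a likelihood ratio of a million is "as surprising as" roughly 20 heads in a row:

```python
import math

# Probability of 20 heads in a row with a fair coin.
p_20_heads = 0.5 ** 20
print(f"P(20 heads) = 1 in {1 / p_20_heads:,.0f}")  # 1 in 1,048,576

# How many heads-in-a-row correspond to a likelihood ratio of one million?
flips = math.log2(1_000_000)
print(f"LR of 1,000,000 is like {flips:.1f} heads in a row")  # ~19.9
```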

Yes, at lunch or otherwise. Okay, great questions. I think what's most valuable is that we go to the Skipper case and wrap that up, and then I'll come back in the afternoon to talk a little more about likelihood ratios and some of the issues we were just discussing. So I'm going to go through a few slides about how it works in DNA, but I think it's important to wrap up the case. So we return

to the Skipper case, and again, these are the things I told you before. Okay: the expert reported a paternity index, which is a likelihood ratio, and then said, I can turn this likelihood ratio into a probability of paternity. You now have the mathematical background for doing that: Bayes' theorem was the way we could take a likelihood ratio, multiply by some prior odds, and turn it into a posterior. Okay, so here's what happened in the case. The defendant was convicted, based in part on the testimony of the expert. There was an appeal in this case on several issues; again, most are totally outside my lane, so I stay out of them. The one in my lane, statistically, is that they argued on appeal that the statistical evidence associated with the paternity test was improperly admitted. And again: was the trial court correct in admitting the probability evidence? If not, why not? So take a minute, talk to your neighbor.

Again, the evidence was this probability of paternity of 99.97 percent; the person was found guilty; the defendant appeals, saying the statistical evidence was improperly admitted. What do you think? Alright, I'm going to call us back quickly. I apologize, but I value your break and my break too much to go on too long. So, how many people think the probability evidence was handled correctly? A small number. How many think it was not? Still a small number. I hate to ask the third option: how many people have no idea what we're talking about? Okay, anybody want to explain? No? No?

Anyone want to explain? Yes. So that's the key issue. I disagree on the substance, but it's the key issue, which is, you know, the prior probabilities, right? In this case the expert from the company made a choice, and the court found the application of Bayes' theorem in this case was inconsistent with the presumption of innocence. Why did they say that? Interestingly enough, the expert was not explicit about what they had used, but the Court determined that they had used a 50% prior probability of paternity, and therefore a 50% probability of not being the father, so the prior odds were 1. And the court says, you know, it's antithetical to our criminal justice system to presume anything but innocence at the outset of the trial; assuming that the probability was 50% that he was the father assumed, in effect, that he had committed the sexual assault. There are several pieces here, but in the end the court said, we can't say this didn't influence the jury, so it remanded

for a new trial. Okay, that's a lot you've seen before. The paternity index is the likelihood ratio; that was 3,496. And as the Supreme Court correctly intuited, the expert had put in prior odds of 1 to 1, so the right-hand side was 1 over 1 and the middle was 3,496. You multiply them together, that's an easy one, and you get 3,496 on the left, what are called the posterior odds. The expert then turned that into a probability, which is very easy to do, and ended up with 99.97 percent. So if you start with a 50 percent prior probability, the data changes that to 99.97 percent. As we saw with the gunshot

residue test, if you change the first number you can see what would happen to the last. And so sometimes people say one way to do this is to provide the jury with a table: if your prior opinion was this, and I showed you this likelihood ratio, your posterior probability should now be this. That doesn't happen; David can tell us why later. But that would be one way to think about this, right? To get a posterior probability, which we kind of all agree the jury wants to, needs to, form, we could give them a table that says: if you were here before the evidence, this is where you should be after the evidence.
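The table idea can be sketched directly from Bayes' theorem using the paternity index from the case, 3,496: each prior a juror might hold maps to a posterior. Note what happens at a prior of zero, which is the court's presumption-of-innocence concern:

```python
LR = 3496  # the paternity index reported in the case

def posterior(prior: float, lr: float = LR) -> float:
    """Posterior probability via: posterior odds = prior odds x LR."""
    if prior == 0.0:
        return 0.0  # a zero prior stays zero: no evidence can change it
    prior_odds = prior / (1 - prior)
    post_odds = prior_odds * lr
    return post_odds / (1 + post_odds)

for prior in (0.0, 0.01, 0.10, 0.50, 0.90):
    print(f"prior {prior:4.0%} -> posterior {posterior(prior):8.4%}")
# With a 50% prior the posterior is 3496/3497, about 99.97%.
```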

But not everyone likes that idea. So that's what happened in this case; that's what this slide says, that's exactly what happened in this case, but we talked about this before. The difference here is critical, because it requires the prior odds, and the court was very concerned about that use and found it inconsistent with the presumption of innocence. Now there's

something interesting about this. I think it's on the next slide; no, "good place to break" is on the next slide, but we're not there yet. There's something interesting about this which I did not put on the slide but is in the decision, which is that the Court recognized that this logic fails at some point. Any presumed prior probability that the suspect is the father, one percent, two percent, right: in making that presumption you're allowing for some prior probability that this person committed the crime, and therefore, if you didn't make any such presumption, you'd have a zero prior probability, and no evidence could change that. So there's some piece of logic here that doesn't quite work in the court's reasoning. I actually agree with the decision they reached, that 50 percent is probably not right, or at least it needs to be much more explicit; I would say we shouldn't do it. I'd just like to stop at the likelihood ratio, which is, in effect, where the court would stop, because the likelihood ratio considers the two possible states of the world but doesn't attempt to assign a probability to them. So, it is 10 o'clock. I left out, I skipped

a few slides, which I'll add into the afternoon piece, but this is a good time to break. I thank you for your questions and your attention. I'm going to pick up pretty much where I stopped in the morning, so if you're new here, apologies; I will try to bring you up to speed, but we discussed two cases in the morning. So, a little bit of a review, plus a preview. I mentioned at the start that there were three ways in which forensic evidence gets analyzed and interpreted. One of those is as

expert opinion, and we heard the firearm examiner really talk about that; that's what they do in firearm analysis. The second is some kind of two-stage process, where we see if there's a match and then we try to assess the probative value of the match; the Collins case was an example of that, and much of what happens in DNA can be thought of in that way: we see if there's a match, and then we get a number that says how significant, how important, how probative that match might be. And then the third approach is the likelihood ratio, which I talked about in terms of a particular case. I'm going to go back there to mention one point that I think led to a lot of the questions, a place where I was not as clear as I might have been. I won't say a lot about it, because I know David Kaye and we've talked, and he's going to come back and talk about it in the paternity context again. But you

may remember this slide. This is Bayes' theorem, written in the form it is most commonly written in to apply to forensic evidence, in which the likelihood ratio is the piece in the middle, looking at the probability of the evidence. And I mentioned that to get the left-hand side, the posterior odds of the hypotheses, you multiply by the prior odds; that's the famous formula. What maybe didn't come across as clearly is what I've been advocating, what people who like the likelihood ratio advocate: it's basically about the role of the forensic examiner. The forensic examiner's role is to analyze the evidence, so that, if all goes well and we provide the right tools, the right mathematical or quantitative tools, they would speak to the middle term. They could make statements about the evidence: how likely is the evidence if this suspect was the source, and how likely is the evidence if another random person was the source? And that would be where the forensic examiner would exit the scene. The notion is that both the prior odds and the posterior odds are the domain of the trier of fact, the judge, the jury, to look at based on other evidence that may have come up in the case up to this point; they would take that evidence from the examiner and then update. So I won't say more about it, but I apologize if that was not clear

earlier today. That's all. What I wanted to do, I said, was a recap and a look forward. The things we've looked at so far have been about the likelihood ratio and the two-stage approach. I still have a couple more cases to discuss, and they relate to expert opinion; I think they make a nice complement to the bullet discussion we had and some of the questions that were raised, so maybe we can continue that discussion. Before I do that, though: I think the likelihood ratio is important and interesting, and a lot of the ongoing discussion is about it, so I wanted to elaborate a little on the likelihood ratio, in terms of what exactly we mean and how it might work in some of the more complicated forms of evidence. It's easiest to start with a place where we know the likelihood ratio works, because it's used. So I set up a very simple scenario here, in which there's a crime scene sample

and I have, in parentheses, "known to be from a single source," since people who are familiar with DNA will know that mixtures, DNA samples that contain multiple individual contributors, are more complicated; I don't want that for now. And then we have a sample from the suspect, and it turns out their DNA profile matches. A DNA profile, if you have not seen one before, is something like what you see at the left. There's some data processing that goes on before it gets to this point; someone has to analyze the basic data and call what's known as the genotype, the alleles. So there's a bunch of markers here, I think 16 or 17, and the first row identifies a marker, a spot along the genome. At that spot this particular person has two alleles: one is a 20 and one is a 24; they get one from mom and one from dad. Then you go to the second marker and they have these other alleles. What the numbers are is not so important, other than that there's a small alphabet of alleles that are possible at each location. The evidence in this case is the two

matching profiles, and the numerator is the probability of observing matching profiles if there is a single source, which for the moment we'll just take to be one, approximately. And so the interest in the likelihood ratio in this case turns out to be how you calculate the denominator: the probability of observing matching profiles if in fact they came from different sources. The way that works is that we understand the biology and we have data. I've circled one of the markers, TH01, for which this individual has a 7 and a 9, and we ask the question: how likely is it that a random person would match this 7-9 sample? You can determine that because we have a table here of each allele and how often it occurs in the population. To get a 7 and a 9, you have a 16 percent chance of inheriting a 7 and a roughly 20 percent chance of inheriting a 9, and they get multiplied, and they get multiplied by 2. Why 2? Because you don't know which one came from mom and which from dad: you could have gotten the 7 from mom and the 9 from dad, or vice versa. If you do that multiplication, you find that the probability is 0.064, and the likelihood ratio wants 1 over 0.064, and

so that ends up being 15 or 16. So what's the big deal about DNA, right? That's not a huge likelihood ratio. Well, that's just one marker, and if you have more markers and they're independent, as we discussed this morning, you get to multiply probabilities, and multiplying probabilities in this case turns out to be the same thing as multiplying likelihood ratios. The original CODIS system that was in use had 13 markers, so you take a number like 15 from each marker and multiply it together 13 times; that's a pretty big number. And now they do 20, so that's an even bigger number. Okay, they're not all 15, some are 5, some are 10, whatever, but you end up with very, very large likelihood ratios.
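The single-marker arithmetic, and the multiplication across independent markers, can be sketched like this; the allele frequencies are the ones read off the slide, and the per-marker value of 15 is the talk's round number:

```python
# One marker (TH01): the profile has alleles 7 and 9.
p7, p9 = 0.16, 0.20          # allele frequencies from the table
random_match = 2 * p7 * p9   # times 2: the 7 could come from mom or dad
marker_lr = 1 / random_match
print(f"random-match probability: {random_match:.3f}")  # 0.064
print(f"single-marker LR: {marker_lr:.1f}")             # about 15.6

# Independent markers multiply: 13 CODIS markers at ~15 apiece.
print(f"15**13 = {15 ** 13:,}")  # already in the quadrillions
```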

Obviously. So we talked about this case because it works well and points out what you need to do a likelihood ratio, which is helpful for a lot of the ongoing discussion about, hey, can we quantify firearms evidence or not, and how might we do that. So here are the key points I always think about when I look at the DNA case. It works because we know the biology: we know how we inherit our genes from our mom and our dad, and that biology gives us a framework for probabilities. We have databases that tell us how likely each allele is, and there's been a large peer-reviewed literature. So the likelihood ratio in DNA works very well, although, as we've talked about, mixtures are challenging still to this day, and there are contamination and other issues as well; no evidence is perfect. But DNA is a place where at least we know how the likelihood ratio works. Let's get back; we did the Skipper

case, it was a very nice break, okay, now we're back, we've already done this review. I thought I would finish discussing the likelihood ratio by talking a little bit about likelihood ratios for pattern evidence. As I mentioned, that's the kind of thing that CSAFE, my center, is involved in, and therefore something I've been spending a lot of time on. Lots of things are called pattern evidence, but basically, when a pattern or an impression is left at the crime scene, we call that pattern evidence. That includes latent prints, which we heard a little bit about with the new technology, and it includes shoe prints, of which you see a picture here, a known shoe on the left and a print from a crime scene on the right. You can also include questioned documents, signatures and things like that, and firearms and toolmarks, again, which we heard about. So the likelihood ratio here is what we're aiming for, but

it's very, very hard. So I wanted to say a little bit about why it's so hard, and you can, if you wish, contrast it with the DNA story we were able to tell. The starting point is that the data is very different. The data in this case really is something like this: we have two images, and images are typically tens or hundreds of thousands of pixels, basically, and that's all you have to start with. Okay, so the data is very high-dimensional, and there's a lot of flexibility in defining what features to look at. Most people are familiar with what we mean by features in fingerprints: we have matching minutiae, meaning ridge endings or ridges coming together, and those are the kinds of things. In ballistics, as I mentioned in my update about CSAFE, people look at things like matching striae as a feature. But if your starting point is an image, there's an untold amount of data here that you could call upon, and so how do you do it, and how do you collect data about it? Those are the hard things. So for now I'm just going to take E to be kind of fuzzy, fuzzier than, as a statistician, I'm comfortable with. I like when the E is numbers; I know what to do with numbers. But E in this case is: I have these two images, and I have some ideas about similarities and differences I see in these two images. So that's the first step; now I have to draw some conclusion. I feel like I'm missing a slide, but it's not there... right, there's nothing you can do. I suppose I was going to talk about, um, trace evidence

like glass, in between, but you won't get that; that's why there's a reference to trace evidence here. But to understand likelihood ratios for pattern evidence, we need to understand two things. One is information about how much variation you would expect if you had two images from the same source, because that's one of the hypotheses. These two prints: typically one is relatively pristine, because it was taken from the suspect, and one is from the crime scene, which is kind of messy, typically, right? So under the hypothesis that they came from the same original finger, I need to understand how much variation I should allow; there won't be a perfect match, so how much should I allow? And then I also need some information about how much variation you'll see in features from different fingers. There are a lot of differences you might see, of course, but characterizing that is complicated, and it's even more complicated in some of these cases: with a shoe print you have to know something about manufacturing and distribution to understand how common some features might be. So to do a likelihood ratio for pattern evidence, we would want to measure: how likely am I to get the evidence, these similarities and differences, if it's the same source, and how likely if it's not? And this is really, really hard; that's what people are working on. So I'm going to describe

three approaches, but before I do that I want to say a teeny bit more about why it's so hard, especially thinking about it as a statistician. I read a paper once, written by an academic, so I'm not impugning any examiners or anything, this was written by an academic researcher, and he said: I'd like to demonstrate how the likelihood ratio could work for handwriting. So he had two signatures; it was his own name, written twice, one in the neighborhood of the other. And he said, here's how it would work: I would look at these two signatures and I see certain similarities and certain differences; none of us are machines, we have slight variation when we sign our name again and again. And he said, I look at these two and I say, what's the probability that I would see this much similarity and these kinds of differences if these were two signatures by the same person? And he said, I'm guessing that's 90%. Now, we haven't spent a lot of time talking about probability, but I encourage you to think about it: where is the other 10% going? How do you assign probability to something as amorphous as this? The other 10% is what, a signature that looks like this but has a few more dissimilarities gets 1%, one with a lot more gets 2%? It's impossible. We put probability distributions almost always on numbers, because we can do that: we know how many alleles are

possible, and we can figure out which ones occur more frequently; we can assign numbers to them, so we know how likely certain evidence is. In this case you can't: you can't assign probabilities to images. The space of possible images is huge; every pixel in that image could be dark gray, light gray, black, white, and there are tens of thousands of those, so the number of possible images is enormous, and you can't assign probabilities over images. That's why it's so hard: the likelihood ratio for pattern evidence, as it's framed, is virtually impossible. So what are people doing? Since people don't take impossible for an answer, it's part of what defines us as Americans, we don't take impossible, there are three approaches that I

want to describe. One you've already seen, which is: one way to do this is to reduce the image to a smaller set of features, where you might possibly be able to assign numbers. This is from a paper by a fellow named Cedric Neumann, who's done a ton of work on fingerprints, and here's one of his ideas that he's published. I'm not sure Cedric believes this is ultimately going to yield success, but it was a step along the way. What he did is he said, let's focus in on some of the minutiae, so these red-arrow-type things are minutiae from the print, and he gave them a mathematical representation. That's what's happening in A, B, C, D and E: each feature got three numbers assigned to it. One is the angle at which it sits; another is the type of minutia, is it a ridge ending or not, which isn't a number, but it's a small number of possibilities; and the third was some measure of the size of the triangle that co-located the minutiae. There's a lot of work going on here, but the end result of that work is to try to address the problem I put my finger on, by saying: okay, this fingerprint is hard to think about probabilistically; let's replace it by these seven minutiae. So in his approach that print is replaced by twenty-one numbers, twenty-one characteristics: the seven minutiae, the

angle, type, and size for each minutia. Now, that may not be enough for you to do it, it may not be enough for me, I don't know, but that's what Cedric did. Once you have the numbers, though, you can begin to say: what is the distribution of different minutiae types, what is the distribution of these triangle sizes, across different prints from the same person and across different people? Still very hard, right? There's an enormous data collection; I spoke this morning about the challenge: you don't have a database of multiple prints from the same person, repeated for many different people. If you had such a database, you could begin to say, oh, I could fill in some of these probability distributions, but it's really hard. So that's all I'm going to say; it's not really our problem here, there are people thinking about it, but most of the progress being made is not being made in that way. So what else might one do? One might use

what some people call subjective likelihood ratios, and another approach is score-based likelihood ratios. I'm going to say a little bit more about those last two, or, I guess, only one of them in detail. On subjective likelihood ratios, I kind of already showed you a little of what I would want to say: we had a slide about the European guidelines, guidelines, not standards, that have been written, and the European guidelines say the likelihood ratio is clearly the right way to summarize forensic evidence. Ideally that summary will be informed by collected data, but if not, it's okay for the examiner to assign a likelihood ratio based on their experience. They believe so strongly in the likelihood ratio that they say: that's the right framework, and if we don't have data to fill in those probabilities, that's okay, that's where we'll take the expert opinion. So it's okay in Europe for an expert to say: I've looked at this fingerprint, I see this many matching features, I see, you know, a couple of discrepancies, but they all appear where the fingerprint was in a bloody mess, so they're easily explained away;

I consider this a thousand. Most folks in the U.S. get a little queasy about that, including me: you know, where do these numbers come from, where does the calibration come from, and so on and so forth, if you do that. But the Europeans have made their peace with it; that is, folks there believe so strongly in the likelihood ratio that they have really changed the problem to: how do we best inform the creation of a likelihood ratio? And so they're looking at techniques to do that. Now, the thing that's happening more often in the States are ideas based on scores, and I don't know if anyone has heard of FRStat, which is a fingerprint... alright, FRStat is coming up. So the score-based

idea is the following. The evidence E is complicated; I argued that you can't put probability on the space of signatures, I don't understand what that means. So one approach to avoiding that is to replace the evidence by a single score that takes the two images and says similar or not. Once you have that score, you have a much simpler problem, not a simple problem, by the way, a much simpler problem, but still very hard. What might that look like? I'm going to show you two things. The first is what I would call a score-based likelihood ratio, and the picture to look at is this one over here on the bottom right. This is the distribution of scores for two different kinds of exemplars. The black curve: here my similarity score is actually a difference score, so large values of the score mean not-similar images, and small values mean similar images. The black curve is derived by looking at a set of instances in which I know the two images came from the same source; I compute the score for each pair, and I have this black curve which shows the distribution. Those scores are mostly between 0 and 20, there are some still out here at 20 to 40, very rarely out to 60, and almost never 80 or 100. I made up this data. So that's the black curve. The red curve are known non-matches, which we would expect to be less similar, more different; the red curve as originally plotted was a little hard to read, so I've sketched a red curve over it, and it is more likely to be at 20, 40, 60, 80, 100, or something like that. So you have these two distributions, one of which is

the distribution of scores from known matches, and the other is the distribution of scores from known non-matches. Remember, that's what we wanted: what's the probability of seeing the evidence if it's the same source, and what's the probability of seeing the evidence if it's a different source? One way to use that here: suppose now I have a case. I take the two images from the case and I compute my score; it's s, right here. The relative height of these two curves basically gives me my likelihood ratio. The black height says how likely it is to get a score like that from matching pairs; the red height says how likely it is to get such a score from non-matching pairs; and the ratio is the likelihood ratio, what would be called a score-based likelihood ratio. You can do that, and in this case it looks like it's around five to one or something like that, so that would be one way of summarizing the evidence here. As I said, this is simpler; you can see how it might be done.

work but this is very challenging why is it challenging the score could depend on

lots of things and we don’t even know all of them let’s take fingerprints the

score may for example depend on the number of minutiae and so I said Fri we

talked about if our stats this is from the peer-reviewed journal article that

provides the description and argument for FR stats
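The score-based likelihood ratio mechanics just described, two fitted score distributions evaluated at the case score, can be sketched in a few lines. This is a toy illustration, not FRStat's actual model: the gamma-distributed "scores" are invented to mimic the picture, with same-source difference scores piling up near zero and different-source scores spreading out toward 100.

```python
# Toy score-based likelihood ratio: fit one curve to known-match scores,
# one to known-non-match scores, and take the ratio of heights at the
# case score. The gamma-distributed "scores" are invented to mimic the
# talk's picture (difference scores: matches near 0, non-matches larger).
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
same_source = rng.gamma(shape=2.0, scale=5.0, size=500)   # mostly 0-20
diff_source = rng.gamma(shape=4.0, scale=12.0, size=500)  # mostly 20-100

f_same = gaussian_kde(same_source)  # black curve: known matches
f_diff = gaussian_kde(diff_source)  # red curve: known non-matches

def score_based_lr(s):
    """Relative height of the two fitted curves at the case score s."""
    return f_same(s)[0] / f_diff(s)[0]

print(score_based_lr(15.0))  # > 1 here, i.e. the evidence favors same source
```

The choice of density estimator is itself one of the modeling decisions discussed later in the talk; a different fit would give a somewhat different ratio.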

Henry Swofford, who helped develop it, has a score, and in this case his score is a similarity score, so I apologize for the little switch of direction here: in this picture, high scores mean better match and low scores mean worse match. And to use it, he realized that you need to think about it separately depending on the number of minutiae. Why? Look at the bottom one: if you have 15 minutiae, notice that the matching pairs, in the light color, have scores around 50 to 75, the dark non-matching pairs have scores around negative 75, and there's almost no overlap. Go to the opposite end and look at the 5: suppose you only have 5 minutiae. The known matches actually look kind of similar; they're a little bit smaller, but basically the same. What's happened is that with 5 minutiae the probability of getting a coincidental match has increased, and so it's not unusual with just 5 minutiae to have non-matching pairs show up with good scores. I'm not sure how clear it is on the slide, but there's actually a big hump inside the non-match distribution that belongs to the non-matches, so there's a chunk of overlap here. It was really good that Henry did this, because it's a critical factor: to interpret this evidence you need to know the number of minutiae. I've talked with

Hari about this many times. The issue that is challenging for people who develop score-based likelihood ratios is: this is great, we knew this was going to be an important factor, and it is, but are there other factors that we haven't looked at? Yeah. Does the actual physical size of the latent print matter? Does the substrate that was used to develop it matter? There are reasons to believe the answer to some of these questions is no, but the scientific approach is to say, let's check that.

And a little anecdote, not meant to belittle anyone. A colleague and I were called in to meet with Henry about FRStat. He works at the Defense Forensic Science Center in Georgia; they do forensic support for the DoD. They wanted us to kind of bless it and say, this is great, you should use this. And I said, you know, it's really good. We gave him some feedback, which he was happy to get, about ways of improving his scores and things like that, and we had a really wonderful day and a half there. At the end I said, you know what, to be honest, I really like it; it's the direction I think the field ought to be moving. But I'm not prepared to bless it, in part because I'm not a fingerprint expert and I don't know what factors you should be looking at. You looked at the number of minutiae, that's great, but the attitude was, we've done enough, right? And I said, that's not mine to say; you need people who are fingerprint experts to say that, and you're the fingerprint

experts. So, you know, the way science typically works in the university setting is that the people doing the experiments are their own worst critics. They say, I'm about to submit this paper, what is the referee going to look at? And then they try to address that in advance, to the degree possible. So they would say, I wonder if the size of the print matters, because somebody's going to bring that up; let me test that. I wonder if the substrate matters, you know, whether it was in blood or whatever it was on; I should test that. And they really hadn't done that work. So ultimately I spoke with Henry's supervisor, we had a couple of back-and-forths, and we reached an agreement that, you know, this is for them to determine; they should do some more experimentation. And they started to use it, not exclusively, but in concert with their expert-opinion approach.

So this has some real potential, and in fact the ballistics approach that CSAFE is developing, which I showed you before lunch, looks a little bit like this: it ends up with a score, which is the probability that you're a match or not, and that's what they propose to use. So score-based approaches seem very promising. So there are a couple of last

things to know about likelihood ratios before we bid them goodbye. One, and people may know this because it got a fair bit of publicity: two NIST scientists, Steve Lund and Hari Iyer, who are both statisticians, wrote a journal article that pointed out, you know, one of the things about likelihood ratios is that you need these probabilities, and there's some subjectivity in choosing the probability models you use and the mathematical tools you use to estimate them. And they said, if you change those, you can get radically different answers, and that gave them some pause. Now, believe it or not, Lund and Iyer are actually reasonably supportive of likelihood ratios; they did not set out to destroy likelihood ratios, though that is how a lot of their research was portrayed. But the point they make is a good one, and this picture, for those who are repeat attenders here, was actually shown last year at this same meeting by

Steve Lund. This is an example you can think of as a toolmark examination; again, it's kind of made-up data. The feature we're looking at is the number of consecutively matching striae, and as in most of my pictures there are two distributions here. The numerator is from a series of mated comparisons, that is, where the suspect tool made the mark, and the denominator is from non-mated pairs, where the suspect tool did not make the mark. They counted the number of consecutively matching striae, and you can see that among the mated pairs there typically are a lot, between seven and ten matching striae, while in the denominator, the non-mated pairs, it's very rare to see a large number; most of the time you only get one, two, or three matching striae when there's a mismatch.

But Steve asked: what happens if you saw four? Four is not super likely under either of these distributions; in fact, among the non-mated pairs he never saw a four. But of course we know four is possible. Why do we know four is possible? Well, we saw five; if five happened, presumably a four can happen. In the numerator we saw, I think, one four. But to create a likelihood ratio you need to estimate how likely fours are in these two groups, and Steve went on to show that the way we tend to do this, and by "we" I mean statisticians in this case, is to fit a curve to these counts and use the curve to interpolate how likely fours might be.
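That interpolation step is where the modeling choices sneak in. Here is a minimal sketch with made-up counts: two equally simple fitted models, a Poisson and a geometric, matched to the same sample mean, give noticeably different estimates for the never-observed value four. These two models are stand-ins I've chosen for illustration, not the curves in Lund and Iyer's paper, and any such difference propagates straight into the likelihood ratio.

```python
# Two "plausible" fits to the same made-up non-mated CMS counts, and the
# probability each assigns to the never-observed value 4. Poisson and
# geometric are illustrative stand-ins, not the curves from the paper.
from math import exp, factorial

non_mated_counts = [1] * 60 + [2] * 30 + [3] * 10     # invented data: no 4s
mean = sum(non_mated_counts) / len(non_mated_counts)  # sample mean, 1.5

def poisson_pmf(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

def geometric_pmf(k, p):        # support k = 1, 2, 3, ...
    return (1 - p) ** (k - 1) * p

p_pois = poisson_pmf(4, mean)        # lambda matched to the sample mean
p_geom = geometric_pmf(4, 1 / mean)  # mean matched to the sample mean

# Same data, different curve, roughly a factor-of-two difference at 4,
# and that factor carries straight into a likelihood ratio denominator.
print(p_pois, p_geom)
```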

The problem that Steve and Hari pointed out is that there are different curves you could fit that are all plausible. You have a black, a red, and a green here, and they're all plausible summaries of the data; some may be better than others, but they're all plausible in the way that plausible is defined there, which I have problems with, but we won't discuss that. And it's the same thing out here: they don't even look that different at the bottom, but they're surprisingly different, because remember, the thing we're looking at is the height of the curve right here. They're all near zero, but there's a big difference between 0.01 and 0.001 and 0.0001. Those are really important differences, and these three pictures would give you three different likelihood ratios, and they can be very different. In Steve and Hari's paper, one of the things that freaked everybody out is that they showed that if you take a very generous definition of plausible, you can even have the likelihood ratio flip sides. That is, a likelihood ratio bigger than one supports same source and a likelihood ratio less than one supports different source, and they showed that for the same data, depending on the model, you can get a hundred or a half. That's a little scary for me too, though as I said, there are questions about what plausible models are and what they might mean. But their point is very well taken: likelihood ratios are not a panacea, they're going to have issues too, and we need to think about that. So here's a nice summary of likelihood

ratios, and then I'll leave you alone. I won't leave, but I'll leave you alone; Matt won't let me leave. [Matt: "I won't let you."] I'll leave you alone about likelihood ratios, because as I said, I think it's an important topic, which is why I've spent so much time on it. Likelihood ratios are arguably the logical way to evaluate and summarize forensic evidence. Why do I say that? Some of the advantages: it makes explicit this notion that we have two hypotheses and we try to assess the evidence under those two hypotheses. It provides, when it can be done, a quantitative summary of the evidence; that's good. We don't make a match/non-match decision, which is important. Sometimes with glass we look at two samples and say, they don't match completely, but they kind of match, and I'm going to declare them indistinguishable, and we go from there; better not to make that decision, in my opinion. The likelihood ratio can handle virtually anything you throw at it. And, when done appropriately, it's quite transparent: you lay down what you assumed, whether you used the green or the red or the black, and in our adversarial system, as we discussed this morning, someone else can say, well, why did you use the green? Isn't it true that if you use the black you get a different likelihood ratio? So it's got that advantage.

Now, of course, there are huge challenges. Where the data come from to support these pictures is a big problem and a big issue; we need to find data that tell you how big a difference you might get in a non-matching sample. Henry Swofford had the advantage of working at a DoD lab with a database full of fingerprints to help build his method. If one wanted to do that for shoe prints, where would the data come from? Unknown. And as we just discussed, the likelihood ratio for a shoe is a subjective beast. It's not as subjective as expert opinion, maybe, but it's still subjective; there are choices that will be made, and they must be questioned. So that's the end of the likelihood ratio story. [Audience question, partially inaudible, about how a likelihood ratio would be presented in court.]

it’s a great question I don’t think I’m gonna leave to David the two comments I

would make are one one is you know in an ideal world we would actually show up

and say here’s the likely ratio and here’s what it means that is this

evidence that I’ve seen that I’ve shown you I show you my there’s two

fingerprints maybe some match inside places they differ

you know the models we have tell us that evidence of this type is a thousand

times more likely if the suspect was the source of the latent print then if some

if then a random person was the source of that print that’s the precise way to

say it now that’s hard to understand for us very hard to understand for jurors so

a lot of people are trying to understand to come up with better ways to report

that you know so as I said one way to think about it is to say you know it’s

it’s a measure of support of strength of evidence and so you could try to link it

to things people maybe have a better conceptual understanding of I mentioned

or this morning linking it to for example you know how many heads you’d

have to get in a row that is this evidence is a thousand times more likely

so it provides the same degree of you know surprise if you will as getting ten

coin tosses in a row and that is that the coincidence that would be involved
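Since n heads in a row from a fair coin has probability 1/2^n, the translation from a likelihood ratio to that coin-flip intuition is just a base-2 logarithm. A tiny sketch of the arithmetic, offered only as the intuition aid described above, not a formal equivalence:

```python
# Translate a likelihood ratio into "heads in a row" terms: n heads in a
# row from a fair coin has probability 1/2**n, so an LR of L corresponds
# to roughly log2(L) consecutive heads.
from math import log2

def equivalent_heads_in_a_row(lr):
    return log2(lr)

print(round(equivalent_heads_in_a_row(1000), 2))  # just under 10
```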

so that’s our but as I said I think David intends to talk more about that

and that is part of what it’s it says conveying like the ratios to the Trier

of fact is difficult that is definitely one of the challenges in using these so

there is a precise mathematical thing that we want to say that part’s clear

but it’s definitely not obvious how to do that in a way that juries would would

find easier to work with okay thank you no question time how do

you challenge Omar see the presentation presupposes that tell me no tricks

there are traces everywhere have mind you know criminals think I hate science

I don’t what that seems to be the situation so some idealistic or instance

would have amounted loves or programming put some food on their shoes or whether

it’s a great great question to the extent that they leave no trace if

they’re successful in the limit there is no evidence everything I’m talking about

starts with the evidence so there’s no evidence there’s nothing to do if for

example they you know it smart enough to know hey if I throw some whatever down

my shoe print will look there or you don’t think they march around the room

it doesn’t tell you whatever they do you know again if they change shoes for

example right and you capture them they don’t have shoes you know so you can’t

do anything with that to the degree that masking really adds noise to the data

that is makes it harder to read you know that if a technology existed that they

could wear that wouldn’t avoid leaving a fingerprint but would automatically

smudge the print saying then that would kind of filter into whatever we were

doing it would kind of mess up the score distributions they kind of move together

the evidence would be weaker but a couple of people want to comment and I

need help. [Audience comment, largely inaudible, about cases where the absence of expected traces has contributed to prosecuting the wrong people.] Yes,

excellent, excellent point. Simon Cole, who is a colleague of mine at UC Irvine, has written extensively about this. That is, there are a variety of errors that can happen, and when we talk about errors of evidence analysis we tend to primarily worry about false positives, that is, someone being falsely identified. But even false negatives are potentially harmful, because a false negative means the person was let go and typically someone else got blamed; you know, they just said, it wasn't them, they were wearing gloves, so we don't have fingerprint evidence, but we have this other evidence, and things like that. But again, these are kind of far from my expertise; my expertise really kicks in at the point at which we have evidence: how best to characterize and report on that evidence. [Audience question about whether a likelihood ratio can be negative.] Yeah: a likelihood ratio, if done correctly, is a

positive number. Great question. By the way, I did not say this earlier: it's the ratio of two probabilities, and I should maybe make a little side comment here. The reason we don't call it a probability ratio, and we call it a likelihood ratio, is that if you have a score, or whatever you're measuring, and it's continuous, it no longer makes sense to talk about probability, because what's the probability that the refractive index of my piece of glass is exactly 1.562, you know, as opposed to 1.5621? So the probabilities become likelihoods. That's a technical point; please ignore it, it won't be on the test.

So the things we're working with are basically probabilities, and you have two probabilities, which are numbers between 0 and 1; their ratio is between 0 and infinity. At 1 the likelihood ratio is even: the probability of seeing this evidence is the same under the two hypotheses, so the evidence provides no information to allow me to distinguish between the hypotheses. Less than 1 means the different-source hypothesis is supported by the evidence more than same source; bigger than 1 means same source is supported more. So that's what you end up with. And I once heard someone say it seems not symmetric, because 0 to 1 is a pretty small interval and 1 to infinity is a big one; they're actually the same size, remarkably enough, because you can just flip every point, 1 over it. Right, so that's kind of the answer to your question. I won't say there aren't things people are doing that might show up looking negative, but there really shouldn't be; if it's done correctly, with a score-based or a real probability model, you'll end up with two positive numbers, and the ratio is this kind of relative support. [Audience question comparing these numbers to DNA.] Yeah, so the meaning actually does not

change; you know, the DNA is just that strong as evidence, where you're getting numbers in the millions or billions. Although a strict likelihood ratio would of course account for the probability of contamination and other things, which might change that number; but ignoring that for the moment, if it turned out that fingerprints were 40 or a hundred, that would be good to know.

In fact, I told someone this story over lunch: my first forensics experience was working with the FBI, trying to quantify the strength of bullet lead evidence. The way bullet lead evidence works is that bullets are made primarily out of lead, but the manufacturers add other elements for a variety of reasons, which I don't completely understand, and manufacturers have their own formulas. So the trace element compositions, the amount of silver or aluminum or whatever that is found when you chemically analyze the bullets, maybe helps you identify the manufacturer and perhaps even the time at which a bullet was manufactured. That was the hypothesis, and they asked us to explore it as a research project. And again, data was an issue: they had bullets from four different manufacturers, four boxes from each, I think, approximately. So we looked at the chemical profiles from those bullets and did a number of things to try to figure out what the likelihood ratio might look like. Along the way we learned a little bit about how bullets are made. They make them from a big vat of material, all molten and mixed together with these trace elements, and then they spit it out into long wires and cut it into bullets. And the interesting thing is, the bullets sit around for a while, and then when they need to package them they grab some, and they get packed into these boxes of, I've never bought a box of bullets, twenty or fifty, I think forty bullets maybe. And when we

analyzed them, what we found was that in a single box of bullets you would have what appear to be bullets manufactured at different points in time. And so when we did our best to calculate the likelihood ratio for bullet lead evidence, and I won't go into more detail, we ended up with numbers literally like two. You know, two or three or something like that: seeing a match like this, these two bullets with similar trace element profiles, is twice as likely if they're from the same source, if they were manufactured together, than if they were not, because of the way they were manufactured and distributed. And that was one of the pieces, it's a long story that we won't go into here; the evidence is just not that probative, and the FBI no longer uses bullet lead evidence.

So we don't know what the number is for shoe prints; we may never know, to be completely honest. But if we were able to develop likelihood ratios and find out that they were a hundred or a thousand or ten thousand, that would be great to know, and we would be able to calibrate how much weight to give them in the courtroom relative to DNA. So in cases, especially ones without DNA, you'd take it: okay, ten thousand from the shoes, that's pretty significant. Great question. So I'd like to move us on; let me get through this slide.

I mentioned three ways of analyzing evidence. Everything I've said has been around the second and the third, because they're more quantitative, but I want to return to the first, where expert opinion rules the day, and that's the case for most pattern evidence. So what does case law say about this? Just a reminder, we heard this this morning in the context of the firearms analysis: the examiner analyzes the evidence based on their experience, training, and accepted methods in the field. That's exactly what we said this morning. Then, when that's done, the evidence reflects the examiner's expert opinion, and the conclusions are typically reported in these categories; for firearms, as this morning's speaker said, identification, exclusion, inconclusive, or, sometimes, unsuitable. So that's pretty standard. There may be other categories; you may say, you know, that the shoe matches on size and manufacturer, but I don't have the details.

I'll give one example here: document examiners have their own scale that they use, and they tend to say things like, based on the evidence, the author of the known samples wrote the questioned sample. That's their strongest statement, analogous to identification. But then they have weaker statements: it's highly probable that the author of the known samples wrote the questioned sample; the author of the known samples probably wrote the questioned sample. And so they have a total nine-point scale, with four positive statements, inconclusive, and four negative statements. But notice that if you read those statements carefully, they bring us back to the issues from this morning: the document examiners are not making statements about the evidence here, they're making statements about the writer, the hypothesis. So it opens up the concerns we talked about this morning, and I'm not going to revisit that here; I'm going to focus on places where they're using these smaller scales, identification, exclusion, and so on. You know, the Federal Rules of Evidence, for the

most part, and David will say more about this; I bring it up just because the key issue here, of course, is reliability, both of the method and as applied in the case. So, two cases I want to talk about that raise issues I think are interesting. The first is United States v. Glynn, where the defendant was charged with a number of crimes, including murder through the use of a firearm, and we had questions about this this morning. There ended up being two trials; at the first, the government sought to introduce expert testimony from the firearms analyst that, to a reasonable degree of ballistic certainty, the bullet in the victim and the casings from the scene came from guns owned by the defendant. The defendant objected that ballistics is not based on sufficiently reliable methods, as the governing rules require. At that first trial the judge said that ballistics as a science would not allow testimony to a certainty, a ballistic certainty, but that there is evidence that ballistics analysis is of value in the courtroom, and so he would allow the examiner to say: based on my analysis, it is at least more likely than not that the bullet and casings came from the guns in question. The first trial ended in a deadlocked jury and a mistrial, so a second trial was scheduled, and Judge Rakoff put out an opinion, whatever it's called, in advance of the second trial, saying: I intend to stand by all of my evidentiary rulings in the first case except for one small change. The small change was that he dropped the "at least," so the testimony would be just "more likely than not."

This gets at an important issue in pattern evidence, which is what kinds of statements we allow based on the evidence that's out there. When statisticians think about questions like that, they think about

reliability and consistency. I put those together because reliability is kind of the more mathematical version; when we say something is reliable, we just mean that it can be consistently measured, that if you do it again you'll get a similar answer. So you can ask about reliability and consistency in the measurements being made, say in the consecutively matching striae or in the image analysis. You can also ask about reliability and consistency in the decisions being made: would the same examiner, given the same evidence at another point in time, make the same decision? Great to know. What about different examiners facing the same evidence? These are, for me, reliability questions, measurement questions. And note that neither of those first two reliability topics touches on whether they got it right; they just ask, would you get the same answer? We might all get the same answer and all be wrong; that's not good either. But as the starting point we ask, are we reliable, do we get the same answer, and second we ask, do we get it right. A couple of examples. The first is from a project I was involved

in. When people analyze handwriting, one of their basic principles is: the more complex the handwriting, the stronger the statement the examiner should be able to make. Logical: lots of swirls and curves in your signature make it easier to do the analysis and reach a stronger conclusion. But there's not all that much data on it. So I worked with a colleague at the Los Angeles Police Department: she obtained a bunch of signatures, sent them out to five different examiners, and asked them to rate the signatures for complexity, that is, how easy or hard it would be to simulate each signature. You can see in the first row, for the first signature, the five examiners gave different scores; some gave it a four, some a five, some a three. To be expected; these are subjective scales. The second shows more agreement, everyone agreed it's pretty complex, fours and fives, and so forth.

There are a number of statistical approaches to summarizing data like this. The simplest thing is just to ask how correlated two examiners are. The correlation, if you remember, is a number like a probability, but it's not the same; correlations actually run from negative 1 to 1. The higher the correlation, the more similar the two sets of scores are as a pair, and these correlate at around 0.65, which in the measurement world is considered good. There are some measurements where we'd expect more: if we all measured this table and wrote down our numbers, and we did that for 125 tables, and then I looked at how much we agree, we would expect to agree much more than 0.65. It's important to know we would not agree 100 percent; you'd each be a little bit off, so if the table is, whatever, 60 inches long, I might get 61 inches and you 59, but across a bunch of different tables we would tend to agree at a much higher rate. So that was interesting to me just as a starting point: how repeatable is something like a complexity judgment? Good, but not great.

The second interesting point that came out of the study is that for a small number of signatures, only seven of them, they actually sent the same signatures back to the same person, I think a month later, and compared the ratings each person gave at time one and time two. I don't have the data up here, and there wasn't a lot of it, but what I could discern from it was interesting to me, because it turned out that me-today and me-next-month are no more alike than me and you. That's kind of interesting: this complexity judgment is a fairly subjective call, and there's a lot of wiggle room, plus or minus a point. It just establishes a limitation we should understand, because if we can't all agree on what's complex, it impacts what happens next when we analyze the evidence. So that was the reliability of a measurement that goes into a decision.
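The 0.65 agreement figure quoted above is an ordinary Pearson correlation between two examiners' scores. A small sketch of that computation with invented one-to-five ratings (not the study's data); the invented pair simply illustrates moderate but imperfect agreement.

```python
# Pearson correlation between two raters' scores, computed from scratch.
# The ratings are invented 1-5 complexity scores, not the study's data.
def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

examiner_1 = [3, 5, 2, 4, 1, 4, 2, 5, 3, 1]   # hypothetical ratings
examiner_2 = [4, 5, 1, 3, 2, 3, 3, 4, 2, 2]

print(pearson(examiner_1, examiner_2))  # moderate, well short of 1.0
```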

We can also look at the reliability of decisions, and of course the PCAST report made a strong endorsement of black-box studies. One example that PCAST found valuable, and which I also think was good work, is the study by Ulery et al. on fingerprint decisions. They had examples where they knew the truth, a latent print and a known print, some of which were mated and some not, and, I forget the numbers, but there were I think 169 examiners and 700 prints altogether. Not every examiner did every print, but they had a lot of comparisons. When you do a study like that, you get information about reliability, they could see whether the examiners were giving the same answers, but also, because you know the truth, about whether they're getting it right. And they found, out of six thousand or so comparisons, that the false positive rate was about one in a thousand, there were five false identifications, and the false negative rate was 7.5 percent.

One of the things I personally like about science and statistics and all this stuff is that you learn things; it's just good for you. In this instance, setting aside the false positive rate, I think the examiners themselves were surprised by the false negative rate. The authors report that most of the examiners who participated in the study, they were given a questionnaire, were very confident that they had never made a false negative; yet in this study 80 percent of them made a false negative. So the rate was higher, and the false negatives were pervasive: of the 169 examiners, only four or five made false positives, but almost all of them made false negatives. And that's not necessarily bad, maybe our system of justice is happy this way, but it's important to think about. This is the point where the statistician steps away: what's the right trade-off between false positives and false negatives? I don't know; that's not my call, that's society's call at some level. The

Glynn case was about firearms, so here are results from a study published in 2014 on cartridge cases. This is very similar to the fingerprint study I just described, and it's interesting to see that here the false positive rate was higher, 1.5 percent, and the false negative rate was lower, 0.4 percent. So again I find the data very valuable to know, and it lets us see that the two disciplines appear to have different cultures in terms of the trade-offs.
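The error rates for the two studies are simple proportions. In this sketch, only the five false identifications in roughly five thousand non-mated fingerprint comparisons follow the talk; every other count is invented purely to reproduce the quoted rates, so treat them as placeholders rather than the studies' actual tallies.

```python
# Error rates as simple proportions. Only the 5 false identifications in
# ~5000 non-mated fingerprint comparisons follow the talk; the remaining
# counts are invented to back out the quoted rates (7.5% FNR for prints,
# 1.5% FPR and 0.4% FNR for cartridge cases).
def rates(false_pos, non_mated, false_neg, mated):
    """False positive rate and false negative rate from raw counts."""
    return false_pos / non_mated, false_neg / mated

fpr_prints, fnr_prints = rates(false_pos=5, non_mated=5000,
                               false_neg=450, mated=6000)
fpr_cases, fnr_cases = rates(false_pos=15, non_mated=1000,
                             false_neg=4, mated=1000)

print(f"fingerprints:    FPR={fpr_prints:.4f}  FNR={fnr_prints:.3f}")
print(f"cartridge cases: FPR={fpr_cases:.4f}  FNR={fnr_cases:.3f}")
```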

so so that’s the kind of idea of a black box study a study with known truth which

examples are put into the system a judged and answers assessed as a way of

learning about the process so we can then return to the Lin case

and I mean in terms of coverage don’t want to talk a ton about the PCAST

report so I’m going to leave that aside but say a little bit more about the

judges thinking in the case the definition of match which someone asked

the firearms analyst this morning to talk about that after ye definition of

what an identification is and the half the definition of identification

basically says that there was a match when there’s a sufficient amount of

agreement and the features and more than would be and no more disagreements than

would be expected above of us there’s a bunch of language like that but the

judge in this case decided you know relying on a notion of sufficient

agreement when it’s a subjective judgement for the this judge said he

would not call it science at that point that it’s a opinion and again he found

also the judge also found concerning the lack of standards in ballistics but he

said there’s evidence out there and he cited the 2008 NRC ballistic imaging

report that ballistic images could be useful in identifying the source of guns

and so he said there’s reason to leave it in but not allowed statements that

implied strong science and based on past experience statements about matches

being certain or about error rates of methods being 0 that the court had a

role to play in and learning the jury to the limitations and that’s what led to

his two statements at least more likely than that and then the second trial just

more likely than not I know but I need a reason to talk about this so but I think

the question going forward of course is what the testimony really ought to look

like and obviously quantification would help

maybe bring everyone together but what happens in the interim is an ongoing

discussion the last case I want to discuss is again

one of these special situations, but it's a special situation that's useful for me in terms of talking about what happens when we have expert opinion. This is a case that occurred in the state of North Carolina in which a pizza delivery driver was assaulted. There's a long history to the case, which I'm not going to go through, but ultimately they figured out that the pizzas were being delivered to an abandoned house and that the order had been placed online, from a source not too far away. They went to that address and found the pizza boxes and the chicken wing boxes, got fingerprints off of them, and also fingerprints from the car, and they matched the suspect. Okay, so that's what the case was about. And at this point,

where the McPhaul case shows up is that the Federal Rules of Evidence require reliable principles and methods, but also require that they be reliably applied to the facts of the case, and the McPhaul case focuses attention on part (d) rather than part (c). That is, for fingerprints we have a very long history, and the study I just cited suggests that experts can do very well at identifying fingerprints, so we can leave the method in; but we still have to ask whether it was applied well in the case at hand. So here are all the charges; that's one of the worst acronyms I've ever seen. Once again we had a trial, and the defendant moved at trial to suppress the evidence recovered from the residence, but the court denied the motion and the defendant was found guilty. So the defendant appealed on several issues,

again, as with the other cases we've talked about, most of the issues don't have anything to do with statistics or measurement, so I won't talk about them. But in particular, one of the issues the defendant appealed on is that the court made a mistake by allowing the state's expert to testify on the fingerprint evidence, and the appellate court ultimately agreed on that point, but found the error was not prejudicial, so it did not change the decision. Let me go back and say a tiny bit more about that case and what the judge was saying in the decision. The basic standard is that the trial court can be reversed with respect to evidence only for abuse of discretion, that is, a finding that the ruling was not supported by reason, and in the context of this case the judge cited two pieces of the story. Early on, the expert was brought in and certified as an expert based on experience, training, the passing of proficiency tests, and the like; the defense did not object, and the expert was seated. The fingerprint analyst testified to how fingerprint analysis works, and you see some of the types of language we just talked about: they look at the two prints side by side, and when there is a sufficient number of characteristics and sequences of similarities, that's when they make an identification decision. Then towards the end, as I understand it, they went back as part of the wrap-up to look specifically at the evidence in this case: can you show the jury how the comparison works here for these items? And the analyst says, yes, these match. When asked what that conclusion is based on: my training and experience in looking at the minutiae within the two prints. Is that right? That's correct, the process I explained earlier. What the judge found objectionable was that there was no reference to what exactly in the two prints being compared the analyst was testifying about. In talking to other people about this, to some extent you never quite know what's going on; the analyst may have thought it self-evident that, having been told how the process works, you would understand he was looking at similarities in features. But the judge appears to have been looking for something like: as you can see on this feature here, and this one there. So that's what in this case led the judge to say the method had not been applied appropriately, because there was nothing specific to this case. So

that's basically all I have. I have a conclusion slide in which I attempt to tie together the things I've talked about. In particular, we start with the notion that Daubert makes the judge the gatekeeper, and I tried to pull together today a number of cases that talk about the role of probability and statistics in all of forensics, but especially in the pattern-matching disciplines. In particular, I ask you to think about the three different ways one can think about evidence: expert opinion, which we talked about this afternoon and which is the state of the art in the pattern disciplines; match probabilities or coincidental match probabilities, which you can think of as what happens in DNA and some other places; and the likelihood ratio. To me they all have in common that one should be accounting for these two hypotheses, same source and different source in my world, and that we should be explicit about the modeling and the reasoning we're doing. There should be empirical support, and to me that means things like the black box studies. There tends to be empirical support if you can build a likelihood ratio or a coincidental match probability; for expert opinion, what does empirical support look like? The kinds of studies we looked at are one example. So with that, I'm early, and we have time for questions galore on any of the topics. Thank you. That's a great point. David Kaye has an article on the subject,

and he reviews, I don't know if they're the same decisions, but some of those decisions, and it's certainly more than an n of one. I think the evidence is still predominantly allowed as testified to, but as you point out, and as I tried to point out, I think part of the conversation that ought to be happening now is: what can science say about this, how can science inform this? Thank you. Plan B works, but I don't know; Matt, is there a mechanism? They're now on the computer here. Thank you. I did close with my email, so you can ask me directly if you want. Yes, my memory from last year, when Swofford presented, was that the likelihood ratios looked different, and I think he's using a smaller database. Yes, he's certainly using a smaller database, and he's also, getting a little bit into the weeds of the ratio he's actually computing, I'll show you one thing

which would push in the direction that you're observing. Sorry, it's always nice to have a picture, but it's further back than I thought it was... almost there. So the likelihood ratio, I think of it as the height of these two curves. What FRStat does instead is, in effect, say: if I got this score, let me compare the chances of getting this score or something bigger from the known non-matches with the chances of getting this score or something smaller from the known matches. That was an attempt on his part, I believe, to be somewhat conservative; that is, you're adding in not just what you saw but other evidence that would have been even more in that direction, and that would tend to have, generally speaking, the effect of keeping the likelihood ratios from going astronomically high. He made a few choices to try to do that. Now, it's impossible to make a choice that you know will always go in a certain direction, that's the truth of it, and, as with anything you do, we have techniques for how these curves come to be, and they require choosing a specific distribution to build from. He tried to build one that would have heavy tails rather than light tails. When we're talking about heavy tails, the question is how much probability is out here; a light tail would go down very quickly and have essentially no probability out here. He thought it was important to have heavy tails to allow for things that he hadn't seen, because he has a limited amount of data, for example. But I don't know for sure; I'm not that expert in it, and in particular it's been developed further since I saw it, so I don't know exactly where he ended up. He did not run it off anything like a giant AFIS database; they ran it off I think 50,000 prints or something like that. Good question, thank you. Yeah, I gave the two comparisons of

probabilities as being the exhaustive set of worlds that could happen, and to bring it back to the baseball analogy I used, it's like comparing the National League winning versus one particular team winning; just to have something to point to. Your question is a good one; it happens to be one of the places where some statisticians are more sticklers than others. That is, I've split up the world in the way that you describe: there are only two possibilities, either the suspect is the source, my H sub s, or the suspect is not the source, H sub d, and the terms that are used are "mutually exclusive," that is, if one of these is true the other can't be, and "exhaustive," they cover the entire world. Okay, suppose instead it's an Agatha Christie novel, and there are 12 people in a mansion and a murder happens. So now maybe the hypotheses look a little different. (We know the butler did it, but let's ignore that for the moment.) Instead of having H sub s, the suspect, versus someone else, you might have H sub 1, that it was person one at the dinner party, and H sub 2, that it was person two, and H sub 3, and so on. Now, we're not up to this scientifically, but suppose we were, and we could compute how likely it is that I would see this fingerprint evidence, this degree of match between the known and the questioned print, if it was person one; I could do that for person two, and so on and so forth. You could actually do a number of likelihood ratios: there's a likelihood ratio comparing person 1 to person 2, there's a likelihood ratio comparing person 7 to person 8, and so forth. The problem with those likelihood ratios, for some people, is that they don't look like this; that is, they are still the ratio of two probabilities, but the hypotheses are not mutually exclusive and exhaustive, in particular not exhaustive: both person 7 and person 8 could be extremely unlikely, because we don't know that one of them did it. So some people say you should not use likelihood ratios in that case, because you don't have mutually exclusive and exhaustive hypotheses. I would not say that. The likelihood ratio is what the likelihood ratio is; to sound a little Taoist, it is what it is. If you compare only those two hypotheses, then what you get back is whether the evidence supports seven versus eight. And it could be that the probability of the evidence is one in ten thousand under seven and one in a million under eight, in which case there would still be a likelihood ratio. I should have picked numbers I could divide in my head, but I think that's a hundred; so the likelihood ratio in that case is a hundred. It says the evidence I've seen is a hundred times more likely if seven is the person than if eight is the person. But it obfuscates the notion that in fact neither one of them is very likely. So that's kind of what I'd say about it: if the hypotheses are not mutually exclusive and exhaustive, then you have to be very careful in thinking about what you want to do. In that case we would almost certainly want to look at all twelve possibilities and see where the evidence points. Thank you for asking. Yep.
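The twelve-suspect arithmetic above can be sketched in a few lines. This is purely illustrative, assuming (as the answer does) that we could compute P(evidence | person i) for each person; the probabilities are the made-up one in ten thousand and one in a million figures from the answer, not real data.

```python
# Hypothetical values of P(evidence | H_i): how likely the observed
# fingerprint evidence would be if person i were the source.
p_evidence = {
    "person_7": 1 / 10_000,     # unlikely even if person 7 is the source
    "person_8": 1 / 1_000_000,  # far more unlikely if person 8 is the source
}

def likelihood_ratio(p, a, b):
    """Likelihood ratio comparing hypothesis a to hypothesis b."""
    return p[a] / p[b]

lr = likelihood_ratio(p_evidence, "person_7", "person_8")
print(round(lr))  # 100: the evidence favors person 7 over person 8 by 100x,
                  # even though it is improbable under BOTH hypotheses,
                  # which is the non-exhaustiveness caveat from the answer
```

With all twelve values of P(evidence | H_i) in hand, one could compute every pairwise ratio the same way, which is why the answer suggests looking at all twelve possibilities rather than relying on a single pairwise comparison.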