class: center, middle, inverse, title-slide

.title[
# Probability and inductive reasoning
]

.subtitle[
## How To Think - Week 11
]

.author[
### Fernando Alvear
]

.institute[
### University of Missouri
]

.date[
### Apr 3
]

---

<script type="text/x-mathjax-config">
MathJax.Hub.Config({
  TeX: {
    Macros: {
      And: "{\\mathop{\\&}}",
      Not: "{\\sim}"
    }
  }
});
</script>

# Course overview

1. The psychology of reasoning ✅
2. Deductive reasoning ✅
3. Probabilistic reasoning ⬅️

---
class: center, middle

# Logic

Logic is the study of _correct reasoning_.

---

# Logic: branches

Logic has two main branches:

- Deductive logic
- Inductive logic

--

Think about these chains of reasoning:

.pull-left[
1. James is in Missouri.
2. If James is in Missouri, then James is in the US.
3. Therefore, James is in the US.
]

.pull-right[
1. In a recent poll, 99% of people said they will vote for James as the next mayor.
2. Therefore, James will be the next mayor.
]

---

# Deductive and inductive logic

.pull-left[
1. James is in Missouri.
2. If James is in Missouri, then James is in the US.
3. Therefore, James is in the US.
]

.pull-right[
1. In a recent poll, 99% of people said they will vote for James as the next mayor.
2. Therefore, James will be the next mayor.
]

The reasoning on the left is an example of **deductive logic**.

- The truth of the premises *guarantees* the truth of the conclusion.
- If the premises are true, then the conclusion *must* be true.
- It's *not possible* for the premises to be true and the conclusion false.

The reasoning on the right is an example of **inductive logic**.

- The truth of the premises *does not guarantee* the truth of the conclusion.
- However, under certain conditions, the truth of the premises makes the conclusion *highly probable*.

---

# Validity and strength

Validity is an _all-or-nothing_ notion: either an argument is valid or it is not.

Strength comes in degrees. An argument's premises can make the conclusion:

- somewhat likely,
- very likely,
- almost certain,
- or perfectly certain.

Correspondingly, in terms of strength, arguments range from

- weak,
- to somewhat strong,
- to very strong, etc.

.center[
<img src="assets/probability-line.svg" alt="" height="170"/>
]

---

# Validity and strength

.center[
<img src="assets/probability-line.svg" alt="" height="170"/>
]

A valid argument is a _maximally_ strong argument. If the premises are true, the conclusion must be true.

But notice that an invalid argument can still be a strong argument.

- The premises don't guarantee the truth of the conclusion.
- But they might make the conclusion highly probable.

Example:

1. I've seen 10 ravens and they've all been black.
2. Therefore, all ravens are black.

---

# Three kinds of inductive arguments

#### Generalizing from observed instances

1. Every raven I have ever seen has been black.
2. Therefore, all ravens are black.

#### Inferring an instance from a generalization

1. Most birds can fly.
2. Tweety is a bird.
3. Therefore, Tweety can fly.

#### Inference to the best explanation

1. My car won't start and the gas gauge reads 'empty.'
2. Therefore, my car is out of gas.

---

# Evaluating strength of inductive arguments

Example:

1. I've seen 10 ravens and they've all been black.
2. Therefore, all ravens are black.

--

In deductive logic, we evaluated validity by answering the following question:

> Assuming that the premises are true, is the conclusion true?

We checked this by using truth tables:

- We examined each case (row) in which the premises were true,
- and then checked whether the conclusion in those cases was always true.
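If it helps, here is a minimal sketch of that check in Python for the James argument (the names `P` and `Q` are just labels we are choosing for the two propositions):

```python
from itertools import product

# Truth-table check for: P, (if P then Q), therefore Q
# P = "James is in Missouri", Q = "James is in the US"
valid = True
for P, Q in product([True, False], repeat=2):   # every row of the truth table
    premises_true = P and ((not P) or Q)        # "P" and "if P, then Q"
    if premises_true and not Q:                 # premises true, conclusion false?
        valid = False

print(valid)  # True: no row makes the premises true and the conclusion false
```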
--

In inductive logic, we usually deal with _invalid_ arguments, so to evaluate strength, we need to answer something like this:

> Assuming that the premises are true, is the conclusion highly _probable_?

---

# Probability

1. I've seen 10 ravens and they've all been black.
2. Therefore, all ravens are black.

> Assuming that the premises are true, is the conclusion highly _probable_?

This question involves a _conditional probability_: the probability of an event **given that** other events are true.

- `\(E\)`: I've seen 10 ravens and they've all been black.
- `\(H\)`: All ravens are black.

(Note: `\(H\)` stands for _hypothesis_, `\(E\)` stands for _evidence_.)

.shadow[
.emphasis[
__Argument strength__: An argument with premise `\(E\)` and conclusion `\(H\)` is strong if `\(P(H|E)\)` (the probability of `\(H\)` given `\(E\)`) is high.
]
]

---

# Questions

.shadow[
.emphasis[
__Argument strength__: An argument with premise `\(E\)` and conclusion `\(H\)` is strong if `\(P(H|E)\)` (the probability of `\(H\)` given `\(E\)`) is high.
]
]

- How do we calculate the _conditional probability_ of an event?
- For an argument to be strong, how _high_ does that probability need to be?

These are just some of the reasons why we need to become good at thinking about probabilities.

---

# Independence

My wife’s family keeps having girls. My wife has two sisters and she and her sisters each have two daughters, with no other siblings or children. That’s nine girls in a row! So are they due for a boy next?

Here are three possible answers.

> Answer 1. Yes, the next baby is more likely to be a boy. Ten girls in a row would be a really unlikely outcome.

> Answer 2. No, the next baby is actually more likely to be a girl. Girls run in the family! Something about this family clearly predisposes them to have girls.

> Answer 3. No, the next baby is equally likely to be a boy vs. girl. Each baby’s sex is determined by a purely random event, similar to a coin flip. So it’s equal odds every time. The nine girls so far is just a coincidence.

Which answer is correct?

---

# Independence

- `\(E\)`: My wife's family has had nine girls in a row.
- `\(H\)`: The next baby in the family will be a boy.

Does the probability of `\(H\)` change when `\(E\)` is true?

Suppose that, with no further information, the probability of `\(H\)` is 50% (or .5, or 1/2). In other words, initially, it's equally likely to have a boy as a girl.

Now you learn `\(E\)`. Should the probability of `\(H\)` be different now?

> Answer 1. Yes, the probability of H should increase, so the next baby is more likely to be a boy.

> Answer 2. No, the probability of H should decrease, so the next baby is actually more likely to be a girl.

> Answer 3. No, the probability of H should stay the same, so the next baby is equally likely to be a boy vs. girl.

---

# Probability

$$ \text{Probability of an event happening} = \frac{\text{Number of ways it can happen}}{\text{Total number of outcomes}} $$

Example: Six-sided die .font-big[🎲]

- Six possible outcomes: .font-big[⚀ ⚁ ⚂ ⚃ ⚄ ⚅]
- One way that rolling a 4 can happen: .font-big[⚃]

$$ P(\text{rolling a 4}) = \frac{1}{6} $$

### Three main axioms of probability:

1. __Non-negativity__: For any proposition `\(A\)`, `\(P(A) \geq 0\)`.
2. __Normality__: For any _necessarily true_ proposition `\(A\)`, `\(P(A) = 1\)`.
3. __Finite additivity__: For any mutually exclusive propositions `\(A\)` and `\(B\)`, `\(P(A \vee B) = P(A) + P(B)\)`.
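---

# Probability: counting outcomes

If it helps to see the counting definition and the axioms at work, here is a small Python sketch for one throw of a six-sided die (the helper function `prob` is just ours, for illustration):

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]            # one throw of a six-sided die

def prob(event):
    """Number of ways the event can happen / total number of outcomes."""
    ways = [o for o in outcomes if event(o)]
    return Fraction(len(ways), len(outcomes))

print(prob(lambda o: o == 4))         # 1/6 (one way out of six)
print(prob(lambda o: o % 2 == 0))     # 1/2 (three ways out of six)
print(prob(lambda o: 1 <= o <= 6))    # 1   (normality: true on every throw)
print(prob(lambda o: o == 7))         # 0   (never below zero: non-negativity)
```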
---

# Probability of _what_?

Logicians talk about the probability of __propositions__ (sentences).

$$ P(\text{The result of rolling a six-sided die is 4}) = \frac{1}{6} $$

Statisticians talk about the probability of __events__.

$$ P(\text{Rolling a 4 on one throw of a six-sided die}) = \frac{1}{6} $$

.shadow[
.emphasis[
Propositions are __true__ or __false__. Events __occur__ or __do not occur__.

Most of what we say in terms of propositions can be translated into event-language (and vice versa).
]
]

---

# Useful illustrations

As introductory examples, we will talk about the probability of these kinds of events:

- Results of throws of six-sided dice.
- Results of coin tosses.
- Card draws from a standard 52-card deck.
- Sampling of balls from an urn (or jar).
- Results of standard roulette spins.

.center[
<img src="assets/standard-52-card-deck.jpg" alt="" height="300"/>
]

---

# Mutual exclusivity

Two propositions are _mutually exclusive_ if they can't both be true at the same time.

Consider these two events. Are they mutually exclusive?

- The die lands on 4.
- The die lands on 5.

--

Yes, these events can't both occur at the same time.

--

Consider these two events. Are they mutually exclusive?

- The die lands on 4.
- The die lands with an even number up.

No, these events can both occur at the same time.

---

# Adding probabilities

The probabilities of mutually exclusive propositions or events add up.

.shadow[
.emphasis[
**Probability of disjunction**:<br>
If `\(A\)` and `\(B\)` are mutually exclusive, `\(P(A \vee B) = P(A) + P(B)\)`
]
]

Example:

`\(A\)`: The die lands on 4.
`\(B\)`: The die lands on 5.

What is the probability that the die lands on either 4 or 5?

--

`$$\begin{aligned} P(A \vee B) &= P(A) + P(B) \\ &= 1/6 + 1/6 = 2/6 = 1/3 \end{aligned}$$`

What is the probability that the die lands either on 4 or on an even number?

.font-small[(Answer: These events are not mutually exclusive, so we can't use the formula given above.)]

---

# Independence

Two propositions are _independent_ when the truth of one doesn't make the truth of the other any more or less probable.

Two events are _independent_ when the occurrence of one does not influence the probability of the occurrence of the other.

Consider these pairs of events. Are they independent?

1. Drawing the two of spades, putting the card back in the deck, and then drawing the king of hearts.
2. Drawing the two of spades without putting the card back in the deck, and then drawing the king of hearts.
3. Getting a 2 on a six-sided die and then an even number on a second throw.
4. Getting a 2 on a six-sided die and an even number on the same throw.

--

Answers:

1. Independent events.
2. Non-independent events.
3. Independent events.
4. Non-independent events.

---

# Multiplying probabilities

.shadow[
.emphasis[
**Probability of conjunction**:<br>
If `\(A\)` and `\(B\)` are independent, `\(P(A \And B) = P(A) \times P(B)\)`
]
]

Example:

`\(A\)`: Drawing the two of spades.
`\(B\)`: Drawing the king of hearts.

What is the probability of drawing the two of spades and then the king of hearts (with replacement)?

--

`$$\begin{aligned} P(A \And B) &= P(A) \times P(B) \\ &= 1/52 \times 1/52 = 1/2704 \approx 0.00037 \end{aligned}$$`

--

What is the probability of drawing the two of spades and then the king of hearts (_without_ replacement)?

.font-small[(Answer: These events are not independent, so we can't use the formula given above.)]
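---

# Adding and multiplying: a quick check

Both rules can be double-checked by brute-force counting over the full space of outcomes. Here is a sketch in Python (numbering the cards 0 to 51 is just a stand-in we are choosing for this check):

```python
from fractions import Fraction
from itertools import product

# Disjunction: the die lands on 4 or on 5 (mutually exclusive results)
die = range(1, 7)
print(Fraction(len([o for o in die if o in (4, 5)]), 6))   # 1/3, matching 1/6 + 1/6

# Conjunction: two draws with replacement from a 52-card deck (independent draws)
deck = range(52)                              # card 0 stands in for the two of spades,
two_of_spades, king_of_hearts = 0, 1          # card 1 for the king of hearts
draws = list(product(deck, repeat=2))         # every ordered pair: 52 * 52 outcomes
hits = [d for d in draws if d == (two_of_spades, king_of_hearts)]
print(Fraction(len(hits), len(draws)))        # 1/2704, matching 1/52 * 1/52
```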
---

# Summary

.shadow[
.emphasis[
Two propositions are __mutually exclusive__ if they can't both be true at the same time.

Two events are __mutually exclusive__ if they can't both occur at the same time.
]
]

.shadow[
.emphasis[
Two propositions are __independent__ when the truth of one doesn't make the truth of the other any more or less probable.

Two events are __independent__ when the occurrence of one does not influence the probability of the occurrence of the other.
]
]

---

# Summary

.shadow[
.emphasis[
**Probability of disjunction**:<br>
If `\(A\)` and `\(B\)` are mutually exclusive, `\(P(A \vee B) = P(A) + P(B)\)`
]
]

.shadow[
.emphasis[
**Probability of conjunction**:<br>
If `\(A\)` and `\(B\)` are independent, `\(P(A \And B) = P(A) \times P(B)\)`
]
]

---

# Quiz

- Published on Canvas (as a .pdf file).
- You can grab a physical copy here.
- Due on Monday, at class time.

---

# Activity

1. Take a penny from the jar.
2. We will toss each penny at the same time.
3. For each toss, write down your result (H or T).
4. The person with the longest streak (of either heads or tails) wins!

The winner gets **two extra points** on Exam 3.

--

Questions for discussion:

- Before the coin tosses, how probable was the longest streak produced?
- Is the coin that produced the streak "special"? Is the coin biased towards a certain result?
- Consider these two coin flip sequences. One sequence was the result of actually tossing a fair coin. The other sequence is made up. Which one is real? Which one is made up? Why?

1. T T T T T H H T H H H T T
2. H T H T H H T H T H T H T

---

# Problem

Suppose I am doing coin flips. The coin is fair.

`\(H\)`: The coin lands heads. <br>
`\(T\)`: The coin lands tails.

I toss the coin 9 times and I get this result: `\(H H H H H H H H H\)` (9 heads in a row)

What is the probability of having obtained this result?

--

`\(P(H \And H \And \ldots \And H) = 1/2 \times 1/2 \times \ldots \times 1/2 = (1/2)^9 \approx 0.00195\)`

Now let's think about the tenth toss. Which reasoning is best?

1. The next toss is more likely to be tails, because getting ten heads in a row would be really unlikely: `\(P(10 \text{ heads in a row}) = (1/2)^{10} \approx 0.000977\)`
2. The next toss is more likely to be heads, because something is weird about this coin that makes it fall heads so frequently.
3. It's equally likely to get tails or heads on the next toss: `\(P(H_{10}) = P(T_{10}) = 1/2\)`.

---

# The gambler's fallacy

The best reasoning is this:

> 3: It's equally likely to get tails or heads on the next toss.

This is so because the coin toss in our example is **fair**:

- Tosses are independent of each other: the result of one toss doesn't influence the probability of the next toss.
- The possible results are unbiased (they are equally likely to occur).

.shadow[
.emphasis[
**Fairness**: A chance process is fair if and only if:

1. The outcomes of the process are _independent_ of each other.
2. The process is _unbiased_: the outcomes of the process are equally likely to occur.
]
]

---

# The gambler's fallacy

The gambler's fallacy occurs when an individual believes that the probability of an event depends on the occurrence of other events, in circumstances in which the events in question are independent of each other.

Simply put, the "gambler" fails to consider the _independence_ of the outcomes.

In the case of coin tosses, the previous nine coin tosses don't influence the probability of the result of the tenth toss. So this reasoning is incorrect:

> 1: The next toss is more likely to be tails, because getting ten heads in a row would be really unlikely: `\((1/2)^{10} \approx 0.000977\)`
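---

# The gambler's fallacy: checking independence

We can see this directly by listing every equally likely sequence of ten fair tosses and keeping only those that start with nine heads. A minimal sketch in Python:

```python
from itertools import product

# All 2**10 equally likely sequences of ten fair-coin tosses
sequences = list(product("HT", repeat=10))

# Keep only the sequences whose first nine tosses are all heads
nine_heads = [s for s in sequences if s[:9] == ("H",) * 9]

# Among those, how often is the tenth toss heads?
tenth_heads = [s for s in nine_heads if s[9] == "H"]
print(len(tenth_heads) / len(nine_heads))   # 0.5: the streak doesn't change the odds
```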
---

# The gambler's fallacy

.shadow[
.emphasis[
The **gambler's fallacy** occurs when an individual mistakenly thinks that an event's probability depends on the occurrence of other events, in circumstances in which the events in question are independent of each other.
]
]

> The most famous example of the gambler's fallacy occurred at the Monte Carlo casino in Monaco in 1913. The roulette wheel's ball had fallen on black several times in a row. This led people to believe that it would fall on red soon and they started pushing their chips, betting that the ball would fall in a red square on the next roulette wheel turn. The ball fell on the red square after 27 turns. Accounts state that millions of dollars had been lost by then.

From Investopedia: https://www.investopedia.com/terms/g/gamblersfallacy.asp

---

# The gambler's fallacy and ignorance

What about this answer? Is it wrong?

> 2: The next toss is more likely to be heads, because something is weird about this coin that makes it fall heads so frequently.

--

In this case, the reasoner takes the fact that the coin has landed heads 9 times as grounds to question the information that the coin is fair.

But notice that this reasoner is not questioning or ignoring independence. Rather, the reasoner is questioning whether the process is _unbiased_.

The reasoner thinks that nine heads in a row is evidence that one outcome of the process (heads) is more likely than the other (tails).

---

# Independence

My wife’s family keeps having girls. My wife has two sisters and she and her sisters each have two daughters, with no other siblings or children. That’s nine girls in a row! So are they due for a boy next?

Here are three possible answers.

> Answer 1. Yes, the next baby is more likely to be a boy. Ten girls in a row would be a really unlikely outcome.

> Answer 2. No, the next baby is actually more likely to be a girl. Girls run in the family! Something about this family clearly predisposes them to have girls.

> Answer 3. No, the next baby is equally likely to be a boy vs. girl. Each baby’s sex is determined by a purely random event, similar to a coin flip. So it’s equal odds every time. The nine girls so far is just a coincidence.

Which answer is correct?

---

# Independence

- `\(E\)`: My wife's family has had nine girls in a row.
- `\(H\)`: The next baby in the family will be a boy.

Does the probability of `\(H\)` change when `\(E\)` is true?

Suppose that, with no further information, the probability of `\(H\)` is 50% (or .5, or 1/2). In other words, initially, it's equally likely to have a boy as a girl.

Now you learn `\(E\)`. Should the probability of `\(H\)` be different now?

> Answer 1. Yes, the probability of H should increase, so the next baby is more likely to be a boy.

> Answer 2. No, the probability of H should decrease, so the next baby is actually more likely to be a girl.

> Answer 3. No, the probability of H should stay the same, so the next baby is equally likely to be a boy vs. girl.

---

# Independence and inductive logic

The gambler's fallacy in an argument:

1. My wife's family has had nine girls in a row.
2. Therefore, the next baby in the family will be a boy.

- `\(E\)`: My wife's family has had nine girls in a row.
- `\(H\)`: The next baby in the family will be a boy.

We said that, since these propositions are _independent_, the probability of H should stay the same _despite the fact that E occurred_, so the next baby is equally likely to be a boy vs. girl.

Since `\(E\)` doesn't make `\(H\)` more probable, this inductive argument is rather weak.

In contrast, in strong inductive arguments, the conclusion (hypothesis) should be _dependent_ on the premises (evidence).
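---

# Independence and inductive logic: a check

We can make "E doesn't make H more probable" concrete with the same counting approach, under the assumption that each baby's sex is an independent 50/50 event (a modeling assumption, not a fact about this particular family). A sketch in Python:

```python
from fractions import Fraction
from itertools import product

# Toy model: ten births, each independently girl (G) or boy (B), equally likely
families = list(product("GB", repeat=10))

nine_girls = [f for f in families if f[:9] == ("G",) * 9]   # the evidence E
e_and_h = [f for f in nine_girls if f[9] == "B"]            # E together with H

p_h = Fraction(len([f for f in families if f[9] == "B"]), len(families))
p_h_given_e = Fraction(len(e_and_h), len(nine_girls))
print(p_h, p_h_given_e)   # 1/2 and 1/2: learning E leaves P(H) unchanged
```

Because `\(P(H|E) = P(H)\)`, the premise gives the conclusion no extra support: the argument is weak.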
---

# A good inductive argument

1. In a recent poll, 99% of people said they will vote for James as the next mayor.
2. Therefore, James will be the next mayor.

- `\(E\)`: In a recent poll, 99% of people said they will vote for James as the next mayor.
- `\(H\)`: James will be the next mayor.

In this case, the propositions are dependent, in the sense that the truth of the evidence increases the probability of the conclusion. In a sense, `\(E\)` and `\(H\)` are probabilistically _connected_.

Thus, in a good inductive argument, premises are related to the conclusion in this way: they are **probabilistically dependent statements**.
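---

# A good inductive argument: a check

To see what probabilistic dependence looks like in numbers, here is a sketch with made-up joint probabilities for the poll example (the numbers below are purely illustrative assumptions; nothing in the example fixes them). It uses the standard ratio definition `\(P(H|E) = P(E \And H) / P(E)\)`:

```python
from fractions import Fraction

# Invented joint probabilities, for illustration only:
#   E = "a recent poll shows 99% support for James", H = "James will be the next mayor"
joint = {("E", "H"):         Fraction(60, 100),
         ("E", "not H"):     Fraction(5, 100),
         ("not E", "H"):     Fraction(10, 100),
         ("not E", "not H"): Fraction(25, 100)}

p_h = joint[("E", "H")] + joint[("not E", "H")]        # 7/10
p_e = joint[("E", "H")] + joint[("E", "not H")]        # 13/20
p_h_given_e = joint[("E", "H")] / p_e                  # 12/13
print(p_h, p_h_given_e)   # learning E pushes P(H) from 7/10 up to 12/13
```

Here `\(P(H|E) > P(H)\)`: the evidence makes the conclusion more probable, which is what a strong inductive argument requires.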