BIOS601 AGENDA: Tuesday September 15, 2015
[updated September 07, 2015]
 
  -  Discussion of issues in JH's Notes and assignment on C&H Ch01 [probability models] and Ch02 [conditional probability model]
 
 Answers to be handed in for: 
  Supplementary exercises 1.1, 1.2,  2.1, 2.2, 2.3, 2.4, 2.5
 
 Remarks:
 
 Chapter 1 of C&H introduces some ways of looking at statistical
  entities and concepts that you may not have met, as well as some
  terminology that is used in a more specific way in epidemiology. You might
  want to look at section 1 of JH's notes, from earlier years, on
  Concepts involved in Occurrence Measures in Epidemiology.
  JH has also included the first page of this section (mostly definitions) in
  the notes that annotate the C&H chapters: he has placed it under the heading
  'Important: Concepts and Terms in Epidemiology', after his notes on
  1.2 Binary data and before 1.3 The binary probability model.
 
 JH's notes on Section 1.4 of C&H (and Supp. exercise 1.2)
  are intended to 'shake you up a bit' and force you to think
  outside the box about how you used to estimate the parameters
  of a simple linear regression. This model is usually
  shown as a 2-parameter (slope, intercept) model, but JH has
  deliberately reduced the model to a 1-parameter version,
  with the "line" going through the origin. [Another example
  would be trying to estimate, from error-containing
  measurements of the volumes of 2 spheres of different radii
  (the radii themselves measured without error!),
  the constant in the relation:
  volume of a sphere = "some constant" times the cube of its diameter.]
  The fewer the elements involved, the more chance there is to really
  master the fundamentals and 'join the dots.'
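
 [If it helps to see the 1-parameter idea in numbers, here is a minimal
  Python sketch of the sphere version -- not part of the assignment, and the
  data are made up. The diameters are taken as measured without error, the
  volumes with error, and the single constant k in volume = k x diameter^3
  is estimated by least squares; the true value is pi/6, about 0.5236.]

    import numpy as np

    rng = np.random.default_rng(1)

    # diameters assumed known exactly; volumes measured with error
    d = np.array([2.0, 5.0])            # two spheres of different sizes
    true_k = np.pi / 6                  # volume = (pi/6) * diameter^3
    v_measured = true_k * d**3 + rng.normal(0, 1, size=2)

    # 1-parameter least squares: choose k to minimize sum of (v - k*d^3)^2
    x = d**3
    k_hat = np.sum(x * v_measured) / np.sum(x**2)
    print(k_hat, true_k)                # k_hat should land near 0.5236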
 
 Chapter 2 of C&H is -- to JH at least -- a very elegant and simple
 and graphic way to introduce probabilities, and particularly
 those that are linked to each other in time, or by
 additional pieces of knowledge. And notice how many probabilities
 of interest go from right to left, i.e., from after to before.
 
 It is worthwhile to work through C&H's own exercises and then check
 your answers against the solutions they provide at the end of Ch 2.
 
 Fig 2 in JH's Notes on Ch 2 has several simple but educational
 examples showing the different 'directionalities'. It also
 emphasizes that products of probabilities are like 'fractions of fractions',
 but that sometimes the later probabilities depend on what has gone before,
 and sometimes they do not.
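
 [To make the 'fractions of fractions' point concrete, here is a tiny
 Python sketch using two standard, made-up examples (not taken from Fig 2):
 two aces in a row from a deck, where the second fraction depends on what
 has gone before, versus two heads in a row, where it does not.]

    from fractions import Fraction as F

    # Dependent: two aces in a row, drawing without replacement.
    # The second fraction (3/51) depends on what has gone before.
    p_two_aces = F(4, 52) * F(3, 51)
    print(p_two_aces)              # 1/221

    # Independent: two heads in a row with a fair coin.
    # The second fraction (1/2) does not change.
    p_two_heads = F(1, 2) * F(1, 2)
    print(p_two_heads)             # 1/4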
 
 The 2 stories accompanying the Notes on section 2.2 should serve as a stark
 and frightening reminder that P(theta|data) is a very different 'animal'
 from P(data|theta), and that the consequences of mixing them up can be enormous.
 
 If you want a topical example, think of the difference between
 P(A|B) and P(B|A), where A = the hypothesis that the Higgs boson exists,
 and B = the bump in the curve. Btw, JH likes
 to label the elements in what appears to be the best 'logical' or
 'chronological' or 'causal' order, i.e., A -> B, but notices
 that many textbooks teach the concepts using arbitrary letters.
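
 [If you want to see the 'flip' in numbers, here is a minimal Python sketch;
 the probabilities are invented purely for illustration and have nothing to
 do with the actual Higgs analysis. A = the hypothesis is true, B = the bump
 is seen; Bayes' theorem turns P(B|A) into P(A|B) only once a prior P(A) and
 a false-positive rate P(B|not A) are supplied.]

    # Hypothetical numbers, for illustration only
    p_A       = 0.5     # prior probability that hypothesis A is true
    p_B_if_A  = 0.90    # P(B|A): chance of seeing the bump if A is true
    p_B_if_nA = 0.05    # P(B|not A): chance of a bump arising by chance

    p_B = p_A * p_B_if_A + (1 - p_A) * p_B_if_nA   # total probability of B
    p_A_if_B = p_A * p_B_if_A / p_B                 # Bayes: P(A|B)

    print(p_B_if_A, p_A_if_B)   # 0.90 vs about 0.95 -- different quantities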
 
 JH's notes on Section 2.3 have a genetics (haemophilia) example that is
 still very relevant. But, since he first encountered it 40 years ago,
 medical science has advanced, and so one no longer needs to wait
 until the woman has one or more offspring before learning about her carrier status.
 JH would be grateful for a different example where one would still
 need to wait.
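
 [For the record, here is a minimal Python sketch of the usual
 pre-DNA-testing version of that calculation; the specific numbers are the
 standard textbook ones and may not match JH's notes exactly. A woman whose
 prior probability of being a carrier is 1/2 has 2 sons, both unaffected;
 each unaffected son halves the odds, leaving a posterior probability of 1/5.]

    from fractions import Fraction as F

    prior_carrier = F(1, 2)           # e.g. her mother is a known carrier
    n_unaffected_sons = 2

    # Probability of observing n unaffected sons under each hypothesis
    lik_carrier     = F(1, 2) ** n_unaffected_sons   # each son escapes with prob 1/2
    lik_not_carrier = F(1, 1)                        # sons can never be affected

    posterior = (prior_carrier * lik_carrier) / (
        prior_carrier * lik_carrier + (1 - prior_carrier) * lik_not_carrier)
    print(posterior)                  # 1/5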
 
 At a debate a few years ago, JH came up with the challenge of
  estimating/judging a person's age from various pieces of information.
  You might like to take a quick look at the example & pieces of information provided.
 
 
 Supplementary Exercise 2.2 ('The Monty Hall Problem') can be very frustrating
  and is easily misunderstood. JH has had to break up
  fights between people who are over-confident but under-listening.
  Key is the fact that Monty Hall KNOWS
  what is behind each door: sometimes (how often?)
  he has a choice of 2 doors that he could open
  to reveal nothing, and sometimes (how often?)
  he has only 1 choice.
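
 [When the arguments get heated, a simulation usually settles them faster
  than a proof. Below is a minimal Python sketch -- not part of the exercise --
  in which Monty, who knows where the prize is, always opens an empty,
  non-chosen door; switching wins about 2/3 of the time.]

    import random

    def monty_hall(n_games=100_000):
        stay_wins = switch_wins = 0
        for _ in range(n_games):
            car  = random.randrange(3)     # door hiding the prize
            pick = random.randrange(3)     # contestant's first choice
            # Monty opens an empty door other than the pick; when car != pick
            # he has only 1 choice, otherwise a choice of 2 (how often each?)
            monty = random.choice([d for d in range(3) if d not in (pick, car)])
            switched = next(d for d in range(3) if d not in (pick, monty))
            stay_wins   += (pick == car)
            switch_wins += (switched == car)
        return stay_wins / n_games, switch_wins / n_games

    print(monty_hall())    # roughly (0.33, 0.67)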
 
 In Exercise 2.3, it is equally important to be precise as to the 
  information provided.
 
 In Exercise 2.4, we have another good example of the difference between P(H|data)
  and P(data|H). Notice here that we are not examining a range of possible
  H's, just 2 specific H's. Notice further that in the Bayesian approach we do not consider
  data values that have not been observed; in contrast, the p-value does consider data values
  that have not been observed (we should not call such unobserved values 'data', but
  rather, potential data values).
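
 [A purely illustrative Python sketch of that contrast -- the numbers are
  invented and are not those in Exercise 2.4. Suppose 8 successes are observed
  in 10 trials, and only 2 hypotheses are entertained, H0: p = 0.5 and
  H1: p = 0.8. The Bayesian calculation uses only the probability of the
  observed value under each H, whereas the one-sided p-value also sums over
  the more extreme, unobserved values 9 and 10.]

    from math import comb

    def binom_prob(y, n, p):
        return comb(n, y) * p**y * (1 - p)**(n - y)

    n, y = 10, 8                         # hypothetical data
    p0, p1 = 0.5, 0.8                    # the 2 specific H's

    # Bayesian: only the observed y enters (equal prior weight on H0 and H1)
    lik0, lik1 = binom_prob(y, n, p0), binom_prob(y, n, p1)
    post_H0 = lik0 / (lik0 + lik1)

    # p-value under H0: the observed y AND the unobserved, more extreme values
    p_value = sum(binom_prob(k, n, p0) for k in range(y, n + 1))

    print(post_H0, p_value)              # about 0.13 and 0.055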
 
 JH finds that diagrams, especially 'tree' diagrams, can be very
  helpful in these types of problems, and again when we revisit the Binomial.
 
 Q2.5 is new this year, so the wording hasn't had the same beta-testing as 2.1-2.4.