{"id":90,"date":"2016-09-10T08:13:49","date_gmt":"2016-09-10T08:13:49","guid":{"rendered":"http:\/\/www.aarondefazio.com\/tangentially\/?p=90"},"modified":"2023-10-13T10:32:30","modified_gmt":"2023-10-13T14:32:30","slug":"a-complete-guide-to-the-bayes-factor-test","status":"publish","type":"post","link":"https:\/\/www.aarondefazio.com\/tangentially\/?p=90","title":{"rendered":"A complete guide to the bayes factor test"},"content":{"rendered":"<p>(<a href=\"https:\/\/www.aarondefazio.com\/adefazio-bayesfactor-guide.pdf\">PDF version<\/a> of this article is available)<\/p>\n<p><!--l. 31--><\/p>\n<p class=\"noindent\" >The Bayes factor test is an interesting thing. Some Bayesians advocate it unequivalently, whereas others reject the notion of<br \/>\ntesting altogether, Bayesian or otherwise. This post takes a critical look at the Bayes factor, attempting to tease apart the<br \/>\nideas to get to the core of what it&#x2019;s really doing. If you&#8217;re used to frequentist tests and want to understand Bayes factors, this<br \/>\npost is for you.\n<\/p>\n<p><!--l. 38--><\/p>\n<p class=\"indent\" >   The Bayes factor test goes all the way back to Je\ufb00reys&#x2019; early book on the Bayesian approach to statistics [<a \nhref=\"#Xjeffreys-top\">Je\ufb00reys<\/a>,\u00a0<a \nhref=\"#Xjeffreys-top\">1939<\/a>]. Evidence for an<br \/>\nalternative hypothesis <!--l. 40--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>H<\/mi><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msub \n><\/math> against<br \/>\nthat of the null hypothesis <!--l. 41--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"inline\" ><msub><mrow \n><mi \n>H<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msub \n><\/math><br \/>\nis summarized by a quantity known as the Bayes factor. The Bayes factor is just the ratio of the data likelihoods, under<br \/>\nboth hypotheses and integrating out any nuisance parameters:\n<\/p>\n<div class=\"math-display\"><!--l. 44--><math \n xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\"  \ndisplay=\"block\" ><mrow \n>\n                                                                <msub><mrow \n><mi \n>B<\/mi><\/mrow><mrow \n><mn>1<\/mn><mn>0<\/mn><\/mrow><\/msub \n><mo \nclass=\"MathClass-rel\"> : =<\/mo> <mfrac><mrow \n><mi \n>p<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>D<\/mi><mo \nclass=\"MathClass-rel\">|<\/mo><msub><mrow \n><mi \n>H<\/mi><\/mrow><mrow \n><mn>1<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow> \n<mrow \n><mi \n>p<\/mi><mrow ><mo \nclass=\"MathClass-open\">(<\/mo><mrow><mi \n>D<\/mi><mo \nclass=\"MathClass-rel\">|<\/mo><msub><mrow \n><mi \n>H<\/mi><\/mrow><mrow \n><mn>0<\/mn><\/mrow><\/msub \n><\/mrow><mo \nclass=\"MathClass-close\">)<\/mo><\/mrow><\/mrow><\/mfrac><mo \nclass=\"MathClass-punc\">.<\/mo>\n<\/mrow><\/math><\/div>\n<p><!--l. 46--><\/p>\n<p class=\"nopar\" >\n<p><!--l. 49--><\/p>\n<p class=\"indent\" >   If this ratio is large, we can conclude that there is strong evidence for the alternative hypothesis. In contrast, if the<br \/>\ninverse of this ratio is large, we have evidence supporting the null hypothesis. The de\ufb01nition of what is strong evidence is<br \/>\nsubjective, but usually a Bayes factor of 10 or more is considered su\ufb03cient.\n<\/p>\n<h3 class=\"likesectionHead\"><a \n id=\"x1-1000\"><\/a>The standard Bayes factor test against the point-null<\/h3>\n<p><!--l. 
Although Bayes factors are sometimes used for testing simple linear regression models against more complex ones, by far the most common test in practice is the analogue of the frequentist t-test, the Bayes factor t-test. Under the assumption of normality with unknown variance, it tests a null hypothesis of zero mean against non-zero mean.

This test is implemented in the [BayesFactor](http://bayesfactorpcl.r-forge.r-project.org/) R package as the ttestBF method. For the two-sample case it can be invoked with `bf <- ttestBF(x = xdata, y = ydata)`; for a one-sample test just `x` is passed in.
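For instance, a minimal sketch (`xdata` and `ydata` are hypothetical placeholder vectors; in practice they would be your measurements):

```r
library(BayesFactor)

set.seed(1)
xdata <- rnorm(40, mean = 0.3)  # hypothetical treatment measurements
ydata <- rnorm(40, mean = 0.0)  # hypothetical control measurements

bf <- ttestBF(x = xdata, y = ydata)  # two-sample test of equal means
print(bf)                            # Bayes factor for H1 over H0

bf1 <- ttestBF(x = xdata)            # one-sample test of zero mean
print(bf1)
```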
**Statistical details of the test**

To apply the test we must formally define the two hypotheses in the form of statistical models. Jeffreys' book is very pragmatic, and the modelling choices for this test reflect that. Since this is a t-test, we of course assume a normal distribution for the data under both hypotheses, so it just remains to define the priors on its parameters in both cases.

We have essentially three parameters to place priors on: the mean $\mu$ of the alternative, the variance under the alternative $\sigma_1^2$, and the variance under the null $\sigma_0^2$. The $\mu$ under the null is assumed to be $0$. We have to be extremely careful about the priors we place on these parameters: reasonable-seeming choices can easily lead to divergent integrals or nonsensical results. We discuss this further below.

**Variance**

Because the variance appears in both the numerator and denominator of the Bayes factor (the null and the alternative), we can use a "non-informative" improper prior of the form $p(\sigma^2) = 1/\sigma^2$ for both. This prior causes the marginal likelihood integrals to diverge, but the divergence cancels between the numerator and denominator, giving a finite value.

**Mean**

The choice of a prior for the mean is a little more subjective. We can't use an improper prior here, as the mean prior appears only in the numerator, and without the convenient cancellation against factors in the denominator, the Bayes factor won't be well defined. Jeffreys argues for a Cauchy prior, which is just about the most vague prior you can use that still gives a convergent integral. In his formulation the scale parameter of the Cauchy is simply fixed to the standard deviation. The BayesFactor package uses 0.707 times the standard deviation as its default instead, but this can be overridden.

**What are the downsides compared to a frequentist $t$-test?**

The main downside is that the BF test is not as good by some frequentist measures as the tests designed to be as good as possible with respect to those measures. In particular, consider the standard measure of power at $\gamma$: the probability of picking up an effect of size $\gamma$ under repeated replications of the experiment. Depending on the effect size and the number of samples $n$, a Bayesian $t$-test often requires 2-4 times as much data to match the power of a frequentist $t$-test.

**Implementation details**

The Bayes factor requires computing marginal likelihoods, which is a quite distinct problem from the usual posterior expectations we compute when performing Bayesian estimation rather than hypothesis testing. The marginal likelihood is just an integral over the parameter space, which can be computed using numerical integration when the parameter space is small.
For this t-test example, the BayesFactor package uses Gaussian quadrature for the alternative hypothesis. The null hypothesis doesn't require any integration since it consists of a single point.

Jeffreys lived in the statistical era of tables and hand computation, so he derived several formulas that can be used to approximate the Bayes factor. The following formula in terms of the $t$-statistic $t = \bar{x}\sqrt{(n-1)/s^2}$ works well for $n$ in the hundreds or thousands:

$$B_{10} \approx \sqrt{\frac{2}{\pi n}}\, \exp\!\left(t^2/2\right).$$
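As a quick sanity check, the approximation can be compared against the package's quadrature-based value. This is only a sketch: it sets `rscale = 1` on the assumption that this matches Jeffreys' unit-scale Cauchy, and the agreement is rough for moderate $n$:

```r
library(BayesFactor)

set.seed(2)
n <- 1000
x <- rnorm(n, mean = 0.1)

t_stat <- unname(t.test(x)$statistic)             # the usual t-statistic
approx_bf <- sqrt(2 / (pi * n)) * exp(t_stat^2 / 2)

exact_bf <- extractBF(ttestBF(x, rscale = 1))$bf  # quadrature-based value
c(approx = approx_bf, exact = exact_bf)           # should be of similar size
```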
## The sensitivity to priors

There are two types of priors that appear in the application of Bayes factors: the priors on the hypotheses, $p(H_0)$ and $p(H_1)$, and the priors on the parameters that appear in the marginal likelihoods, $p(\theta_0 \mid H_0)$ and $p(\theta_1 \mid H_1)$. Your prior belief ratio $p(H_1)/p(H_0)$ is updated to form your posterior beliefs using the simple equation:
$$\frac{p(H_1 \mid D)}{p(H_0 \mid D)} = \left[\frac{\int p(D \mid \theta_1)\, p(\theta_1 \mid H_1)\, d\theta_1}{\int p(D \mid \theta_0)\, p(\theta_0 \mid H_0)\, d\theta_0}\right] \cdot \frac{p(H_1)}{p(H_0)}.$$

The ratio $p(H_1)/p(H_0)$ is not formally considered part of the Bayes factor, but it plays an important role: it encodes your prior beliefs about the hypotheses. If you are using Bayes factors in a subjective fashion, to update your beliefs in a hypothesis after seeing data, then it is crucial to properly encode your prior belief about which hypothesis is more likely to be true. However, in practice everybody always takes this ratio to be $1$, following the ideals of objective Bayesian reasoning, where one strives to introduce as little prior knowledge as possible into the problem. If you're writing a scientific paper, it's reasonable to use 1, as anybody reading the paper can apply their own ratio as a correction to your Bayes factor, reflecting their own prior beliefs. (For example, a reported Bayes factor of 10 combined with prior odds of 1:100 gives posterior odds of 10:100, i.e. 1:10.)
The prior ratio of the hypotheses plays an interesting part in the calibration of the Bayes factor test. One way of testing Bayes factors is to simulate data from the priors, and compare the empirical Bayes factors over many simulations with the true ratios. This involves sampling a hypothesis, $H_0$ or $H_1$, then sampling $\theta$, then $D$. If the probability of sampling $H_0$ and $H_1$ is not equal at that first stage, then the observed Bayes factor will be off by the corresponding $p(H_1)/p(H_0)$. This is somewhat of a tautology, as we are basically just numerically verifying the Bayes factor equation as written. Nevertheless, it shows that if the priors on the hypotheses don't reflect reality, then the Bayes factor won't either. Blind application of a Bayes factor test won't automatically give you interpretable "beliefs".
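A minimal sketch of such a calibration check, assuming equal prior odds. For simplicity it fixes $\sigma = 1$, whereas the model actually places an improper prior on the variance (which cannot be sampled from), so the check is only approximate:

```r
library(BayesFactor)

set.seed(3)
n <- 50
reps <- 2000
h  <- rbinom(reps, 1, 0.5)  # choose H0 or H1 with equal prior probability
bf <- numeric(reps)
for (i in 1:reps) {
  # Under H1, draw the effect size from the default Cauchy(0, sqrt(2)/2) prior.
  delta <- if (h[i] == 1) rcauchy(1, 0, sqrt(2) / 2) else 0
  x <- rnorm(n, mean = delta, sd = 1)
  bf[i] <- extractBF(ttestBF(x))$bf
}
# Among runs whose BF lands near some value b, the fraction generated from H1
# should be roughly b / (1 + b). Check in a window around b = 3:
near3 <- abs(log(bf) - log(3)) < 0.25
mean(h[near3])  # should be in the vicinity of 3/4
```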
### Parameter priors

The dependence of Bayes factors on the parameter priors $p(\theta_1 \mid H_1)$ and $p(\theta_0 \mid H_0)$ is really at the core of understanding Bayes factors. It is a far more concrete problem than the hypothesis priors, as there are no clear "objective" choices we can make. Additionally, intuitions from Bayesian estimation problems do not carry over to hypothesis testing.

Consider the simple BF t-test given above. We described the test using the Cauchy prior on the location parameter of the Gaussian. In practice people often use large vague Gaussian priors instead when doing Bayesian estimation (as opposed to hypothesis testing), as they tend to be more numerically stable than the Cauchy. However, it has been noted by several authors [Berger and Pericchi, 2001] that simply replacing the Cauchy with a wide Gaussian in the BF t-test causes serious problems. Suppose we use a standard deviation of $\nu$ in this wide prior ($\nu$ large). It turns out that the Bayes factor scales asymptotically like:

$$\frac{1}{\nu\sqrt{n}}\, \exp\!\left(t^2/2\right),$$

where $t = \bar{x}\sqrt{(n-1)/s^2}$ is the standard t-statistic of the data. The large $\nu$, chosen with the intent that it would be "non-informative" (i.e. minimally affect the result), turns out to directly divide the BF!
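This is easy to see numerically in the known-variance (z-test) version of the problem, where both marginal likelihoods have closed forms under a $N(0, \nu^2)$ prior on the mean. A sketch:

```r
# Data ~ N(mu, 1) with mu ~ N(0, nu^2) under H1 and mu = 0 under H0.
# The marginal of xbar is N(0, 1/n + nu^2) under H1 and N(0, 1/n) under H0,
# so the Bayes factor is a simple ratio of normal densities.
bf_wide_gaussian <- function(xbar, n, nu) {
  dnorm(xbar, 0, sqrt(1 / n + nu^2)) / dnorm(xbar, 0, sqrt(1 / n))
}

n <- 100
xbar <- 2 / sqrt(n)  # corresponds to z = 2
sapply(c(10, 20, 40), function(nu) bf_wide_gaussian(xbar, n, nu))
# About 0.074, 0.037, 0.018: close to (1/(nu*sqrt(n))) * exp(z^2/2),
# and each doubling of nu halves the Bayes factor.
```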
This is the opposite of the situation in estimation, where as $\nu$ increases the prior behaves more and more like the improper flat prior $p(\theta) \propto 1$. We can't use the flat prior in the BF t-test, but if we assume that the sample standard deviation $s$ matches the true standard deviation, we can perform a BF z-test instead. Using a flat prior on $\theta$, we get a BF of:

$$\approx s\sqrt{\frac{2\pi}{n}}\, \exp\!\left(z^2/2\right).$$

Because the flat prior appears in the numerator only, there is no clear scaling choice for it, so this expression is not in any sense calibrated (several approaches to calibration appear in the literature; see Robert [1993]).

It is interesting to see what happens when we apply the Cauchy prior, as used in the standard form of the test. In that case we see approximate scaling like [Robert et al., 2009]:

$$\approx \sqrt{\frac{2}{\pi n}}\, \exp\!\left(t^2/2\right).$$

A similar formula appears when the data standard deviation is assumed to be known:

$$\approx \sqrt{\frac{2}{\pi n}} \left(1 + \frac{t^2}{n}\right) \exp\!\left(t^2/2\right).$$
The $1 + t^2/n$ factor can also be written $1 + \bar{x}^2/s^2$ in terms of the known standard deviation and the empirical mean, and it is clearly asymptotically constant; the expression as a whole scales similarly to the unknown-variance case.

## Lindley's paradox

Lindley's paradox refers to a remarkable difference in the behavior of Bayes factor tests compared to frequentist tests as the sample size increases. Under a frequentist t-test, we compute the $t$ statistic and compare it against a threshold that depends on $n$, and that decreases as $n$ increases (recall the definition of the $t$ statistic: $t = \bar{x}\sqrt{(n-1)/s^2}$). In contrast, the effective threshold used by the BF $t$-test actually increases as $n$ increases, although slowly, following roughly $\sqrt{\log n}$. This difference is illustrated in the following plot lifted from Rouder et al. [2009]:

[Figure: critical $t$ thresholds as a function of sample size (critical-t.png), from Rouder et al. [2009].]
The effect of this difference is that for large sample sizes, a frequentist test can reject the null with small $p$ (say $p = 0.05$), while a BF $t$-test run on the same data can strongly support the null hypothesis! As a concrete example, if we generate $n = 50{,}000$ synthetic data points with mean $2/\sqrt{50000}$ and variance 1, then the BF $t$-test yields 0.028 for the alternative hypothesis (i.e. 36:1 odds for the null being true), whereas a frequentist $t$-test rejects the null with $p = 0.0478$ (two-sided test).
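A sketch of this experiment (the exact numbers will vary with the random draw):

```r
library(BayesFactor)

set.seed(4)
n <- 50000
x <- rnorm(n, mean = 2 / sqrt(n), sd = 1)

t.test(x)       # frequentist t-test: p-value around 0.05
ttestBF(x = x)  # default BF t-test: BF well below 1, favoring the null
```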
It's all well and good to argue the philosophical differences between Bayesian and frequentist approaches, but when they disagree so strongly, it becomes a practical issue. The key issue is the handling of small effects. As $n$ increases and $s^2$ stays fixed, a fixed $t$ value of say 2 represents a smaller and smaller measured effect, namely:

$$\bar{x} \propto \frac{2}{\sqrt{n}}.$$

Detecting small effects requires a large sample size, and the Bayesian $t$-test greatly prefers the null hypothesis over that of a small effect. This is usually described positively as an Occam's razor effect, but it has real consequences when we are attempting to find true small effects. Essentially, the BF test will require much more data to pick up a small effect than a corresponding frequentist test. Often this is viewed from the opposite side: that frequentist tests too easily reject the null hypothesis for large sample sizes. That is more of an issue when testing more complex models such as linear regressions with multiple coefficients: unless the data was actually generated from a linear equation with Gaussian noise, large samples will inevitably reject simple null models. This is in contrast to simple point-null hypothesis testing, where we can appeal to asymptotic normality.

## Possible changes from the default Bayes factor test

Here we attempt to answer the following question: are the properties of the default Bayes factor test indicative of BF tests in general? As we see it, there are two main ways to modify the test: we can use a non-point hypothesis as the null, or we can change the alternative hypothesis in some fashion. We discuss these possibilities and their combination separately.

### Changing the alternative hypothesis

The default Bayes factor t-test is formulated from an objectivist point of view: it's designed so that the data can "speak for itself", with priors that are essentially as vague as possible while still yielding a usable test. What if we instead choose the prior for the alternative that gives the strongest evidence for the alternative? This obviously violates the likelihood principle, as we choose the prior after seeing the data, but it is nevertheless instructive in that it indicates how large the difference between point NHST tests and BF tests is.

A reasonable class of priors to consider is those that are non-increasing in $|\theta|$, as they have a single mode at $\theta = 0$. We discuss other priors below. Table 1 shows a comparison of p-values, likelihood ratios from the Neyman-Pearson likelihood ratio test, and a bound on BFs over all possible priors of this form.
It is immediately obvious that according to this bound, any reasonable choice of prior yields much less evidence for the alternative than that suggested by the p-value. For example, the difference at p = 0.001 is 18-fold, and the difference is larger for smaller $p$.

| t-statistic | 1.645 | 1.960 | 2.576 | 3.291 |
|---|---|---|---|---|
| p-value | 0.1 | 0.05 | 0.01 | 0.001 |
| LR | 0.258:1 | 0.146:1 | 0.036:1 | 0.0044:1 |
| BF bound | 0.644:1 | 0.409:1 | 0.123:1 | 0.018:1 |

Table 1: Comparison of methods for point-null normal data tests, reproduced from Berger and Wolpert [1988].

Although we disregarded multi-modal alternative priors above, they actually form an interesting class of tests. In terms of conservativeness, their BF is easily seen to be bounded by the likelihood ratio obtained from a prior concentrated on the maximum likelihood solution $\hat{\theta}$:
$$\frac{p(D \mid \theta = 0)}{p(D \mid \theta = \hat{\theta})}.$$

This ratio is simply the Neyman-Pearson likelihood ratio for composite hypotheses, and its values are also shown in Table 1. These values are still more conservative than $p$-values, by a factor of 2-5. Note that in a Neyman-Pearson test such as the frequentist $t$-test, this ratio is not read off directly as an odds ratio as we are doing here; that's why this NP odds ratio doesn't agree directly with the p-value.
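For normal data with known variance, the LR row of Table 1 follows directly from this formula, since the ratio reduces to $\exp(-z^2/2)$ when $\hat{\theta} = \bar{x}$. A quick check:

```r
z <- c(1.645, 1.960, 2.576, 3.291)
round(exp(-z^2 / 2), 4)  # 0.2585 0.1465 0.0362 0.0044, matching the LR row
```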
The interesting aspect of multi-modal priors is that they allow you to formulate a test that is more powerful for finding evidence of the null hypothesis. It can be shown that under the default BF t-test, evidence for the null accumulates very slowly when the null is true, whereas evidence for the alternative accumulates exponentially fast when the alternative is true. Multi-modal priors can be formulated that fix this imbalance [Johnson and Rossell, 2010]. One simple case is a normal or Cauchy prior with a notch around the null $\theta = 0$ set to have low probability.

If we go a step further and consider the limits of sequences of improper priors, the Bayes factor can take essentially any value. Some particular "reasonable" sequences actually yield values very similar to the $t$-test's p-values [Robert, 1993] as their posterior probabilities, at least between 0.1 and 0.01. Their construction also places zero probability mass in the neighborhood of the null point in the limit, so it is similar to the multi-modal priors discussed in the previous paragraph. Although this improper prior does avoid Lindley's paradox, it is more of a theoretical construction; it is not recommended in practice, even by its discoverer [Robert, 1993].

#### Consequences for Lindley's paradox

As discussed above, when we restrict ourselves to well-behaved proper priors, "Lindley's paradox" as such cannot be rectified by a careful choice of prior on the alternative hypothesis. When one takes a step back and considers the consequences of this, Lindley's paradox becomes much clearer, as a statement about objective priors rather than about BF tests in general.

The central idea is this: it takes a lot of evidence to detect small effects when you use vague priors. Running a long experiment while using a vague prior is nonsensical from an experimental design point of view: large amounts of data are useful for picking up small effects, and if you expect such a small effect you should encode this sensibly in your prior. The default Cauchy prior used in the BF t-test essentially says that you expect a 50% chance that the absolute effect size (Cohen's $d$) is larger than $1$, which is completely implausible in many settings. For instance, if you are trying to optimize conversion rates on a website with say a 10% base conversion rate, an effect size of "1" is a 4x increase in sales! Some software defaults to 0.707 instead of 1 for the Cauchy scale parameter, but the effect is largely the same.

Experimental design considerations are important in the choice of a prior, and the Bayesian justification of the Bayes factor depends on our priors being reasonable. The choice of a particular BF test is not a full experimental design, just as for a frequentist it is not sufficient to decide on a particular test without regard to the eventual sample size. It is necessary to examine other properties of the test, to determine whether it will behave reasonably in the experimental setting in which we will apply it. So in effect, Lindley's paradox is just the statement that a poorly designed Bayesian experiment won't agree with a less poorly designed frequentist experiment.

Put another way, if the Bayes factor is comparing two hypotheses, both of which are unlikely under the data, you won't get reasonable results. You can't identify this from examining the Bayes factor; you have to look at the experiment holistically. The extreme case of this is when neither hypothesis includes the true value of $\theta$.
In such cases, the BF will generally not converge to any particular value, and may show strong evidence in either direction.

#### An example

It is illustrative to discuss an example of this problem from the literature: a high-profile case where the Bayesian and frequentist results are very different under a default prior, and more reasonable under a subjective prior. One such case is the study by Bem [2011] on precognition. The subject matter is controversial; Bem gives results for 9 experiments, of which 8 show evidence for the existence of precognition at the $p = 0.05$ level! This obviously extraordinary claim led to extensive examination of the results from the Bayesian viewpoint.

In Wagenmakers et al. [2011a], the raw data of Bem [2011] is analyzed using the default Bayesian t-test. They report Bayes factors towards the alternative (in decreasing order) of 5.88, 1.81, 1.63, 1.05, 0.88, 0.58, 0.47, 0.32, 0.29, 0.13. So only one experiment showed reasonable evidence of precognition (5.88:1 for), and if you combine the results you get 19:1 odds against an effect.

In contrast, Bem et al. [2011] performed a Bayes factor analysis using informative priors on effect sizes, with values considered reasonable under non-precognitive circumstances. The 3 largest Bayes factors for the existence of precognition they obtained were 10.1, 5.3 and 4.9, indicating substantial evidence!

The interesting thing here is that the vague prior used by Wagenmakers et al. [2011a] placed a large prior probability on implausibly large effect sizes. Even strong proponents of ESP don't believe in such large effects, as they would have been easily detected in past similar experiments. Somewhat counter-intuitively, the belief that ESP effects are strong if they exist at all yields a weaker posterior belief in the existence of ESP!

It should be noted that Wagenmakers et al. [2011b] responded to this criticism, but somewhat unsatisfactorily in my opinion. Of course, even with the use of correct subjective Bayesian priors on the alternative hypothesis, these experiments are not sufficient to convince a skeptic (such as myself) of the existence of ESP!
<p class=\"indent\">Most reasonable people would consider the chance that a systematic unintentional error was made during the experiments to be greater than 1 in a thousand, and so would not update their beliefs quite as severely. Regardless, several pre-registered replications of Bem&#x2019;s experiments have failed to show any evidence of precognition [<a href=\"#Xbemreplication\">Ritchie et\u00a0al.<\/a>,\u00a0<a href=\"#Xbemreplication\">2012<\/a>].
<\/p>
<h5 class=\"likesubsubsectionHead\"><a id=\"x1-14000\"><\/a>Recommendations when using point-nulls<\/h5>
<p class=\"noindent\">The general consensus in the literature is that a single Bayes factor should not be reported. Instead, the BFs corresponding to several different possible priors for the alternative hypothesis should be reported. Only if the conclusions are stable over a wide range of priors can they be reported as definitive. If the results are not stable, then it must be argued why a particular prior, chosen subjectively, is the best choice.
<\/p>
<p class=\"indent\">Bounds giving the largest or smallest possible Bayes factor over reasonable classes of priors, such as the bounds given in <a href=\"#Xberger-robust\">Berger and Jefferys<\/a>\u00a0[<a href=\"#Xberger-robust\">1991<\/a>], can also be reported. The consideration of multiple possible priors is generally called \u201crobust\u201d Bayesian analysis.
<\/p>
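<p class=\"indent\">With the <a href=\"http:\/\/bayesfactorpcl.r-forge.r-project.org\/\">BayesFactor<\/a> package such a sensitivity report is a short loop, since ttestBF accepts the Cauchy prior scale through its rscale argument. A sketch, on synthetic data standing in for a real experiment:
<\/p>
<div class=\"lstlisting\">library(BayesFactor)<br \/>set.seed(1)<br \/>x &#x003C;- rnorm(100, mean = 0.15)\u00a0\u00a0# hypothetical treatment group<br \/>y &#x003C;- rnorm(100, mean = 0)\u00a0\u00a0\u00a0\u00a0\u00a0# hypothetical control group<br \/># Report the BF under a range of prior scales, not just the default<br \/>scales &#x003C;- c(0.1, 0.3, 0.5, 0.707, 1)<br \/>bfs &#x003C;- sapply(scales, function(r) extractBF(ttestBF(x = x, y = y, rscale = r))$bf)<br \/>names(bfs) &#x003C;- paste(\"rscale =\", scales)<br \/>round(bfs, 2)<\/div>
<p class=\"indent\">Only a conclusion that holds across the whole table deserves to be called stable.
<\/p>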
<h4 class=\"likesubsectionHead\"><a id=\"x1-15000\"><\/a>Non-point nulls<\/h4>
<p class=\"noindent\">As we showed above, under essentially any point-null Bayes factor t-test, the results are much more conservative than frequentist t-tests on the same data. But is this indicative of BF tests in general? It turns out that Bayes factor tests that don&#x2019;t use point-nulls behave much more like their frequentist counterparts [<a href=\"#Xreconcilingonesided\">Casella and Berger<\/a>,\u00a0<a href=\"#Xreconcilingonesided\">1987<\/a>].
<\/p>
<p class=\"indent\">But first, let&#x2019;s point out that point-null hypotheses are fundamentally weird things. When we start assigning positive probability mass to single values, we move from the comfortable realm of simple distribution functions we can plot on a whiteboard into the <span class=\"ecti-1000\">Terra pericolosa<\/span> that is measure theory. As anybody who has taken a measure theory course knows, common-sense reasoning can easily go astray even on simple-sounding problems. The \u201csurprising\u201d Occam&#x2019;s razor effect seen in point-null BF tests is largely attributable to this. It can be argued that any prior that concentrates mass on a single point is not \u201cimpartial\u201d, effectively favoring that point disproportionately [<a href=\"#Xreconcilingonesided\">Casella and Berger<\/a>,\u00a0<a href=\"#Xreconcilingonesided\">1987<\/a>].
<\/p>
<h5 class=\"likesubsubsectionHead\"><a id=\"x1-16000\"><\/a>One-sided tests<\/h5>
<p class=\"noindent\">The simplest alternative to a point null test is a positive versus negative effect test, where the null is taken to be the entire negative (or positive) portion of the real line. This test is very similar in effect to a frequentist one-sided test. Unlike the point-null case, the alternative and null hypothesis spaces have the same dimension; in practice this means we can use the same unnormalized priors on both without running into issues. This test can be performed using the <a href=\"http:\/\/bayesfactorpcl.r-forge.r-project.org\/\">BayesFactor<\/a> R package with the ttestBF method, although somewhat indirectly, by taking the ratio of two separate point-null tests:
<\/p>
<div class=\"lstlisting\" id=\"listing-1\">library(BayesFactor)<br \/># BFs against the point null: [1] effect in (-Inf, 0), [2] its complement (0, Inf)<br \/>bfInterval &#x003C;- ttestBF(x = ldata, y = rdata, nullInterval = c(-Inf, 0))<br \/># Their ratio tests a positive effect against a negative one<br \/>bfNonPoint &#x003C;- bfInterval[2] \/ bfInterval[1]<br \/>print(bfNonPoint)<\/div>
<p class=\"indent\">Applying one-sided tests in the frequentist setting is routine (in fact often advocated [<a href=\"#XCho20131261\">Cho and Abe<\/a>,\u00a0<a href=\"#XCho20131261\">2013<\/a>]); it is surprising, then, that their theoretical properties are very different from those of two-sided tests. If we look at both the Bayesian and frequentist approaches from the outside, using decision theory (with quadratic loss), we see a fundamental difference. In the one-sided case, the Fisher p-value approach obeys a weak kind of optimality known as admissibility [<a href=\"#Xhwang1992\">Hwang et\u00a0al.<\/a>,\u00a0<a href=\"#Xhwang1992\">1992<\/a>]: no approach strictly dominates it. In contrast, in the two-sided case neither the frequentist nor the standard Bayesian approach is admissible, and neither dominates the other!
<\/p>
<p class=\"noindent\"><span class=\"likeparagraphHead\"><a id=\"x1-17000\"><\/a>What does this tell us about sequential testing settings?<\/span><br \/>The use of Bayes factor tests is sometimes advocated in the sequential testing setting, as their interpretation is not affected by early stopping. \u201cSequential testing\u201d is when a test of some form is applied multiple times during the experiment, with the experiment potentially stopping early if the test indicates strong evidence of an effect. Using frequentist tests sequentially is problematic; the false-positive rate is amplified and special considerations must be made to account for this. See my <a href=\"http:\/\/www.aarondefazio.com\/tangentially\/?p=83\">previous post<\/a> for details on how to perform the frequentist tests correctly.
<\/p>
<p class=\"indent\">The interpretation of the Bayes factor, in contrast, is unaffected by early stopping. Naive application of a point-null BF test does seem to perform reasonably in a sequential setting, as its naturally conservative nature results in few false positives being detected. However, as we pointed out above, one-sided non-point null tests are equally valid, and they are not particularly conservative. Just applying one sequentially gives you high false-positive rates, comparable to the frequentist test. The false-positive rate is a frequentist property, and as such it is quite unrelated to the Bayesian interpretation of the posterior distribution.
<\/p>
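<p class=\"indent\">This is easy to check by simulation. The rough sketch below (synthetic null data; the BF threshold of 10 and the look schedule are arbitrary choices, and the loop is slow) estimates how often the one-sided test, applied naively at every interim look, declares an effect that isn&#x2019;t there:
<\/p>
<div class=\"lstlisting\">library(BayesFactor)<br \/>set.seed(2)<br \/>n_sims &#x003C;- 100\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0# Monte Carlo replications, kept small for speed<br \/>looks &#x003C;- seq(50, 500, by = 50)\u00a0\u00a0# interim analyses at these sample sizes<br \/>false_pos &#x003C;- 0<br \/>for (s in 1:n_sims) {<br \/>\u00a0\u00a0x &#x003C;- rnorm(max(looks))\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0# null data: the true effect is exactly zero<br \/>\u00a0\u00a0for (n in looks) {<br \/>\u00a0\u00a0\u00a0\u00a0bf &#x003C;- ttestBF(x = x[1:n], nullInterval = c(-Inf, 0))<br \/>\u00a0\u00a0\u00a0\u00a0oneSided &#x003C;- extractBF(bf[2] \/ bf[1])$bf\u00a0\u00a0# positive vs negative effect<br \/>\u00a0\u00a0\u00a0\u00a0if (oneSided > 10) { false_pos &#x003C;- false_pos + 1; break }<br \/>\u00a0\u00a0}<br \/>}<br \/>false_pos \/ n_sims\u00a0\u00a0# well above the 5% of a single fixed-n frequentist test<\/div>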
<p class=\"indent\">Indeed, in most sequential A\/B testing situations, such as website design tests, the one-sided non-point test seems the most reasonable, as the two hypotheses correspond more directly to the two courses of action that will be taken after the test concludes: switch all customers to design A, or to design B.
<\/p>
<h3 class=\"likesectionHead\"><a id=\"x1-18000\"><\/a>So, should I use a point null hypothesis?<\/h3>
<p class=\"noindent\">Here we move outside the realm of mathematics and meander into the realm of philosophy. You have to consider the purpose of your hypothesis test. Point-nulls enforce a strong Occam&#x2019;s razor effect, and depending on the purpose of your test, this may or may not be appropriate.
<\/p>
<p class=\"indent\">The examples and motivation used for null hypothesis testing in the original formulation of the problem by Jeffreys came from physics. In physics problems, testing for absolute truth is a much clearer proposition: either an exotic particle exists or it doesn&#x2019;t; there is no in-between. This is a far cry from the psychology, marketing and economics problems that are more commonly dealt with today, where we always expect some sort of effect, although potentially a small one, and possibly in the opposite direction from what we guessed a priori.
<\/p>
<p class=\"indent\">You need a genuine belief in the possibility of an exactly zero effect in order to justify a point null test, and that is a real problem in practice. In contrast, by using non-point null tests you will be in closer agreement with frequentist results, and you will need less data to reach conclusions, so there are many advantages to avoiding point null tests.
<\/p>
<p class=\"indent\">It is interesting to contrast modern statistical practice in high energy physics (HEP) with that of other areas where statistics is applied. It is common practice in HEP to seek \u201cfive sigma\u201d of evidence before making a claim of a discovery. Compared with the usual <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>p<\/mi><mo class=\"MathClass-rel\">\u2264<\/mo><mn>0.05<\/mn><\/math> requirement, the difference is vast: five sigma corresponds to a two-sided <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>p<\/mi><\/math> value of roughly 1 in 1.7 million. However, if you view the five sigma approach as a heuristic approximation to the point-null BF test, the results are much more comparable. The equivalent <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>t<\/mi><\/math> statistic needed for rejection of the point null under the Bayesian test scales with the amount of data. HEP experiments often average over millions of data points, and it is in that multiple-million region where the rejection threshold, when rephrased in terms of the <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>t<\/mi><\/math> statistic, is roughly five.
<\/p>
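<p class=\"indent\">This scaling can be sketched with the BayesFactor package&#x2019;s ttest.tstat helper, which computes the default BF t-test directly from a <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>t<\/mi><\/math> statistic and sample size. Here we solve for the <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>t<\/mi><\/math> needed to reach a BF of 10 as <math xmlns=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\"><mi>n<\/mi><\/math> grows (a rough illustration only; the threshold of 10 is just the conventional choice mentioned earlier):
<\/p>
<div class=\"lstlisting\">library(BayesFactor)<br \/># t statistic needed for BF10 = 10 under the default prior, as a function of n<br \/>t_needed &#x003C;- function(n, target = 10) {<br \/>\u00a0\u00a0uniroot(function(t) ttest.tstat(t = t, n1 = n, simple = TRUE) - target,<br \/>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0interval = c(0, 20))$root<br \/>}<br \/>sapply(10^(2:7), t_needed)\u00a0\u00a0# the threshold creeps up towards five as n grows<\/div>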
<h3 class=\"likesectionHead\"><a id=\"x1-19000\"><\/a>Concluding guidelines for applying Bayes factor tests<\/h3>
<ul class=\"itemize1\">
<li class=\"itemize\">Don&#x2019;t use point-null Bayesian tests unless the point null hypothesis is physically plausible.
<\/li>
<li class=\"itemize\">Always perform a proper experimental design, including consideration of the frequentist implications of your Bayesian test.
<\/li>
<li class=\"itemize\">Try to use an informative prior if you are using a point null test; non-informative priors are both controversial and extremely conservative. Ideally, show the results under multiple reasonable priors.
<\/li>
<li class=\"itemize\">Don&#x2019;t blindly apply the Bayes factor test in sequential testing situations. You need to use decision theory.<\/li>
<\/ul>
<h3 class=\"likesectionHead\"><a id=\"x1-21000\"><\/a>References<\/h3>
<div class=\"thebibliography\">
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xbem\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Daryl\u00a0J. Bem. Feeling the future: Experimental evidence for anomalous retroactive influences on cognition and affect. <span class=\"ecti-1000\">Journal of Personality and Social Psychology<\/span>, 2011.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"XBemUttsJohnson2011\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Daryl\u00a0J. Bem, Jessica Utts, and Wesley\u00a0O. Johnson. Must psychologists change the way they analyze their data? A response to Wagenmakers, Wetzels, Borsboom, &#x0026; van der Maas. 2011.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xberger-robust\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>James\u00a0O. Berger and William\u00a0H. Jefferys. The application of robust Bayesian analysis to hypothesis testing and Occam&#x2019;s razor. Technical report, Purdue University, 1991.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xberger-intro\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>James\u00a0O. Berger and Luis\u00a0R. Pericchi. Objective Bayesian methods for model selection: Introduction and comparison. <span class=\"ecti-1000\">IMS Lecture Notes &#8211; Monograph Series<\/span>, 2001.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xberger1988likelihood\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>J.\u00a0O. Berger and R.\u00a0L. Wolpert. <span class=\"ecti-1000\">The Likelihood Principle<\/span>. Institute of Mathematical Statistics Lecture Notes &#8211; Monograph Series. Institute of Mathematical Statistics, 1988. ISBN 9780940600133. URL <a href=\"https:\/\/books.google.com.au\/books?id=7fz8JGLmWbgC\" class=\"url\"><span class=\"ectt-1000\">https:\/\/books.google.com.au\/books?id=7fz8JGLmWbgC<\/span><\/a>.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xreconcilingonesided\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>George Casella and Roger\u00a0L. Berger. Reconciling Bayesian and frequentist evidence in the one-sided testing problem. <span class=\"ecti-1000\">Journal of the American Statistical Association<\/span>, 82(397):106&#8211;111, 1987. ISSN 0162-1459. doi: 10.1080\/01621459.1987.10478396.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"XCho20131261\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Hyun-Chul Cho and Shuzo Abe. Is two-tailed testing for directional research hypotheses tests legitimate? <span class=\"ecti-1000\">Journal of Business Research<\/span>, 66(9):1261&#8211;1266, 2013. ISSN 0148-2963. doi: 10.1016\/j.jbusres.2012.02.023. URL <a href=\"http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0148296312000550\" class=\"url\"><span class=\"ectt-1000\">http:\/\/www.sciencedirect.com\/science\/article\/pii\/S0148296312000550<\/span><\/a>. Advancing Research Methods in Marketing.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xhwang1992\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Jiunn\u00a0Tzon Hwang, George Casella, Christian Robert, Martin\u00a0T. Wells, and Roger\u00a0H. Farrell. Estimation of accuracy in testing. <span class=\"ecti-1000\">Ann. Statist.<\/span>, 20(1):490&#8211;509, 1992. doi: 10.1214\/aos\/1176348534. URL <a href=\"http:\/\/dx.doi.org\/10.1214\/aos\/1176348534\" class=\"url\"><span class=\"ectt-1000\">http:\/\/dx.doi.org\/10.1214\/aos\/1176348534<\/span><\/a>.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xjeffreys-top\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Harold Jeffreys. <span class=\"ecti-1000\">Theory of Probability<\/span>. 1939.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xjr-nonlocal\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>V.\u00a0E. Johnson and D.\u00a0Rossell. On the use of non-local prior densities in Bayesian hypothesis tests. <span class=\"ecti-1000\">Journal of the Royal Statistical Society<\/span>, 2010.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xbemreplication\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Stuart\u00a0J. Ritchie, Richard Wiseman, and Christopher\u00a0C. French. Failing the future: Three unsuccessful attempts to replicate Bem&#x2019;s &#x2018;retroactive facilitation of recall&#x2019; effect. <span class=\"ecti-1000\">PLoS ONE<\/span>, 2012.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xrobert-note\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Christian\u00a0P. Robert. A note on Jeffreys-Lindley paradox. <span class=\"ecti-1000\">Statistica Sinica<\/span>, 1993.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xrevisited\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Christian\u00a0P. Robert, Nicolas Chopin, and Judith Rousseau. Harold Jeffreys&#x2019;s theory of probability revisited. <span class=\"ecti-1000\">Statistical Science<\/span>, 2009.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xrouder2009\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Jeffrey\u00a0N. Rouder, Paul\u00a0L. Speckman, Dongchu Sun, Richard\u00a0D. Morey, and Geoffrey Iverson. Bayesian t tests for accepting and rejecting the null hypothesis. <span class=\"ecti-1000\">Psychonomic Bulletin &#x0026; Review<\/span>, 2009.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xwagenmakers\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Eric-Jan Wagenmakers, Ruud Wetzels, Denny Borsboom, and Han van\u00a0der Maas. Why psychologists must change the way they analyze their data: The case of psi. <span class=\"ecti-1000\">Journal of Personality and Social Psychology<\/span>, 2011a.
<\/p>
<p class=\"bibitem\"><span class=\"biblabel\"><br \/><a id=\"Xwagenmakers-clarif\"><\/a><span class=\"bibsp\">\u00a0\u00a0\u00a0<\/span><\/span>Eric-Jan Wagenmakers, Ruud Wetzels, Denny Borsboom, and Han van\u00a0der Maas. Why psychologists must change the way they analyze their data: The case of psi: Clarifications for Bem, Utts, and Johnson (2011). <span class=\"ecti-1000\">Journal of Personality and Social Psychology<\/span>, 2011b.
<\/p>
<\/div>
","protected":false},"excerpt":{"rendered":"<p>(PDF version of this article is available) The Bayes factor test is an interesting thing. Some Bayesians advocate it unequivocally, whereas others reject the notion of testing altogether, Bayesian or otherwise. 
This post takes a critical look at the Bayes factor, attempting to tease apart the ideas to get to the core of what it&#x2019;s &hellip; <a href=\"https:\/\/www.aarondefazio.com\/tangentially\/?p=90\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">A complete guide to the bayes factor test<\/span> <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/90"}],"collection":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=90"}],"version-history":[{"count":4,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/90\/revisions"}],"predecessor-version":[{"id":132,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=\/wp\/v2\/posts\/90\/revisions\/132"}],"wp:attachment":[{"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=90"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=90"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.aarondefazio.com\/tangentially\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=90"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}