
Yes, Psychologists Must Change the Way They Analyze Their Data: Clarifications for Bem, Utts, and Johnson (2011)

Eric-Jan Wagenmakers, Ruud Wetzels, Denny Borsboom, Rogier Kievit, & Han L. J. van der Maas

University of Amsterdam

Abstract

Does psi exist? In a widely publicized article featuring nine experiments with over one thousand participants, Bem (in press) claimed that future events retroactively affect people's responses. In a response, we pointed out that Bem's analyses were partly exploratory. Moreover, we reanalyzed Bem's data using a default Bayesian t-test and showed that Bem's evidence for psi is weak to nonexistent. A robustness analysis confirmed our skeptical conclusions. Recently, Bem, Utts, and Johnson (2011) questioned several aspects of our analysis. In this brief reply we clarify our analysis procedure and demonstrate that our arguments still hold.

Keywords: Confirmatory Experiments, Bayesian Hypothesis Test, ESP.

The History and the Hype

In a recent article for Journal of Personality and Social Psychology, Bem (in press) presented nine experiments that test for the presence of psi. Specifically, Bem's experiments were designed to assess the hypothesis that future events affect people's thinking and people's behavior in the past (henceforth precognition). Bem argued that in eight out of the nine experiments, the data supported the presence of precognition, that is, one-sided p values were smaller than .05.

Bem's findings (and, perhaps more importantly, the fact that they were going to be published in a major journal) created a storm of media attention. In the New York Times, several researchers voiced strong opinions: Dr. Ray Hyman, a long-time critic of ESP research, questioned the quality of the refereeing process as he believed that the publication of Dr. Bem's article was "(...) pure craziness (...) an embarrassment for the

This version was last updated with minor changes on February 18th, 2011. This research was supported by Vidi grants from the Dutch Organization for Scientific Research (NWO). Correspondence concerning this article may be addressed to Eric-Jan Wagenmakers, University of Amsterdam, Department of Psychology, Roetersstraat 15, 1018 WB Amsterdam, the Netherlands. Email address: ej.wagenmakers@gmail.com.

EXTRASENSORY PERCEPTION 2

entire field",¹ and Dr. Douglas Hofstadter argued for "(...) a cutoff for craziness, and when that threshold is exceeded, then the criteria for publication should get far, far more stringent." Bem's article was also discussed in Science (Miller, 2011) and many other media throughout the world. A Google search on "Bem" and "feeling the future" generates over 50,000 hits.² Bem himself appeared on the popular US television show The Colbert Report, where the host described Bem's work as "extrasensory pornception", referring to the fact that Experiment 1 in Bem (in press) found that precognition was present only for erotic pictures. In the New York Times, Bem was quoted as saying "What I showed was that unselected subjects could sense the erotic photos, but my guess is that if you use more talented people, who are better at this, they could find any of the photos."

Some months before Bem's research started to attract a lot of media attention we wrote a response that criticized Bem's work on several counts. This response was submitted to JPSP and published in the same issue (i.e., Wagenmakers, Wetzels, Borsboom, & van der Maas, in press). In this response, we first noted that the analysis of the experiments had been partly exploratory, whereas the statistical analysis assumed a fully confirmatory approach. That is, we argued that Bem had used the data twice: once to discover an interesting result, and then to test it. In support of our claim, we pointed to several instances where it was clear that the analysis had been exploratory.

Next we used Bayes' theorem to argue that the bar for publishing should be set higher for claims that are outlandish or improbable. Third, we used a default Bayesian t test (Rouder, Speckman, Sun, Morey, & Iverson, 2009) to highlight that the one-sided p values used by Bem overestimate the evidence against the null; in fact, our default test indicated little evidence in favor of precognition: only one of Bem's nine experiments yielded data substantially more likely under H1 (i.e., the hypothesis of precognition) than under H0.

It is important to note that our default Bayesian test does not depend at all on the prior probability that one may assign H1. Therefore, it is certainly not true that our Bayesian analysis simply confirms our initial bias against precognition, as some bloggers mistakenly believed. Instead, the result of our Bayesian test is known as the Bayes factor, and with respect to prior assumptions it only depends on the effect size δ expected under H1 (see also Liang, Paulo, Molina, Clyde, & Berger, 2008). In what follows, we will denote the prior distribution for effect size under H1 as p(δ | H1).
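
The separation between the Bayes factor and the prior probability of H1 can be written out explicitly. The identity below is the standard odds form of Bayes' theorem; the symbol D for the observed data is introduced here for illustration only, and nuisance parameters are omitted for brevity.

```latex
\underbrace{\frac{P(H_1 \mid D)}{P(H_0 \mid D)}}_{\text{posterior odds}}
= \underbrace{\frac{p(D \mid H_1)}{p(D \mid H_0)}}_{\text{Bayes factor } BF_{10}}
\times \underbrace{\frac{P(H_1)}{P(H_0)}}_{\text{prior odds}},
\qquad
p(D \mid H_1) = \int p(D \mid \delta, H_1)\, p(\delta \mid H_1)\, d\delta .
```

The prior probability of H1 appears only in the prior odds. The Bayes factor itself involves prior assumptions only through p(δ | H1), which is why a skeptic and a proponent can agree on the Bayes factor while disagreeing on the posterior odds.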

The default assumption we made about p(δ | H1) was based on a long tradition in Bayesian statistics where prior distributions are constructed from general desiderata (Jeffreys, 1961). The advantage that this brings is that the Bayesian analysis is fully objective (Berger, 2004) and avoids subjective specification of the expected effect sizes under H1. We realized that the default choice leads to a conservative test. Indeed, our abstract stated that "(...) in order to convince a skeptical audience of a controversial claim, one needs to conduct strictly confirmatory studies and analyze the results with statistical tests that are conservative rather than liberal."

Despite the advantages of an objective test, we also realized that the choice of p(δ | H1) could be disputed. We therefore carried out a robustness analysis in which we systematically

¹ Dr. Hyman did not question the publication of a parapsychological article as such. Instead, Dr. Hyman was puzzled that JPSP had accepted an article with so many departures from accepted methodological practice (Dr. Hyman, personal communication).

² Query issued on 15 February 2011.


varied the scale parameter for p(δ | H1), and reported the results in an online appendix.³ These results showed that for a wide range of different, non-default prior distributions on effect size the evidence for precognition is either non-existent or negligible.
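
To make the idea of a robustness analysis concrete, the following Python sketch computes a one-sample Bayes factor in the style of the default test of Rouder et al. (2009) and lets the scale parameter vary. It is an illustration under stated assumptions, not the code used for the online appendix: the function name `jzs_bf01`, the integration limits, the simple trapezoid rule, and the values t = 2.5 and n = 100 in the usage loop are all hypothetical choices made here. The prior on effect size is taken to be a Cauchy distribution with scale r, written as a normal distribution whose variance follows a scaled inverse chi-square distribution with one degree of freedom.

```python
import math


def jzs_bf01(t, n, r=1.0, lo=-10.0, hi=40.0, steps=5000):
    """Approximate BF01 (evidence for H0 over H1) for a one-sample t test.

    Assumes a Cauchy(0, r) prior on effect size under H1, represented as
    delta | g ~ Normal(0, g * r^2) with g ~ inverse-chi-square(1).
    The integral over g is done on u = log(g) with a trapezoid rule.
    """
    nu = n - 1  # degrees of freedom
    # Marginal likelihood under H0, up to a constant shared with H1.
    null_like = (1.0 + t * t / nu) ** (-(nu + 1) / 2.0)

    def integrand(u):
        g = math.exp(u)
        a = 1.0 + n * g * r * r
        like = a ** -0.5 * (1.0 + t * t / (a * nu)) ** (-(nu + 1) / 2.0)
        # Inverse-chi-square(1) density of g times the Jacobian dg = g du.
        prior = (2.0 * math.pi) ** -0.5 * math.exp(-0.5 * u - 0.5 / g)
        return like * prior

    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for i in range(1, steps):
        total += integrand(lo + i * h)
    alt_like = total * h
    return null_like / alt_like


if __name__ == "__main__":
    # Hypothetical t value and sample size, chosen only for illustration.
    for scale in (0.1, 0.25, 0.5, 1.0, 2.0):
        print(f"r = {scale:4.2f}  BF01 = {jzs_bf01(2.5, 100, r=scale):.3f}")
```

A robustness analysis in this spirit repeats the calculation over a grid of scale values and checks whether the qualitative conclusion changes. Working on the log scale for g is a convenience that lets a fixed grid cover both very small and very large variances.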

The penultimate section of our response provided guidelines on confirmatory research. We stressed how important it is that research on precognition is conducted in the context of an adversarial collaboration, that is, a collaboration with a qualified skeptic (e.g., Diaconis, 1991).

Throughout our response, we argued that our critique was not meant to attack research on psi. The last paragraph of our response is particularly clear on the broader consequences of the debate:

"It is easy to blame Bem for presenting results that were obtained in part by exploration; it is also easy to blame Bem for possibly overestimating the evidence in favor of H1 because he used p values instead of a test that considers H0 vis-à-vis H1. However, Bem played by the implicit rules that guide academic publishing: in fact, Bem presented many more studies than would usually be required. It would therefore be mistaken to interpret our assessment of the Bem experiments as an attack on research of unlikely phenomena; instead, our assessment suggests that something is deeply wrong with the way experimental psychologists design their studies and report their statistical results. It is a disturbing thought that many experimental findings, proudly and confidently reported in the literature as real, might in fact be based on statistical tests that are explorative and biased (...). We hope the Bem article will become a signpost for change, a writing on the wall: psychologists must change the way they analyze their data."

The broader impact of our response to Bem has been described as "the Bayesian bomb".⁴ Consistent with this assessment, Wetzels et al. (in press) presented default Bayes factors for all 855 t tests reported in the 2007 volumes of Psychonomic Bulletin & Review and Journal of Experimental Psychology: Learning, Memory, and Cognition. The results showed that for 70% of the data sets for which p values range from .01 to .05, the Bayes factor indicated that the evidence in favor of H1 is "anecdotal" in the sense that the data are less than three times more likely under H1 than under H0.

The Complaints by Bem, Utts, and Johnson (2011)

A recent rebuttal by Bem et al. (2011)⁵ questions several aspects of our response outlined above. We disagree with several of their points, but we also believe that something good may come out of this debate, at least for the field of psi.

Below we discuss the Bem et al. (2011) rebuttal in terms of four central complaints. The first is that Bem (in press) did not explore the data when he analyzed his results. We argue that this general statement fails to address our detailed points of critique, that in earlier work Bem himself argued strongly in favor of exploration, and that the Bem

³ Available on the first author's website or at http://www.ruudwetzels.com/articles/Wagenmakersetal_robust.pdf.

⁴ George van Hal, NWT Magazine.

⁵ Downloaded from http://dbem.ws/ResponsetoWagenmakers.pdf on February 15th, 2011.


experiments show a strong negative correlation between sample size and effect size (as first pointed out by Dr. Hyman, personal communication).

The second complaint is that in Bem's experiments a one-sided test is more appropriate than a two-sided test. Although we generally agree that a Bayesian one-sided test can be entirely appropriate (e.g., Wagenmakers, Lodewyckx, Kuriyal, & Grasman, 2010; Wetzels, Raaijmakers, Jakab, & Wagenmakers, 2009), the danger of a one-sided test is that it can be abused in the absence of strong a priori expectations to create an overly optimistic impression of the true evidence in favor of the hypothesis under consideration. We will illustrate this danger with three experiments reported in Bem (in press).

The third complaint is that our default prior distribution on effect size, p(δ | H1), was too wide and assigned too much weight to implausibly high values of effect size. As indicated above, we had already addressed this issue in our robustness analysis. However, we do appreciate the proposal for a specific prior distribution that can now be used to compute subjective or informed Bayes factors in the field of psi. Perhaps future studies will use this prior to evaluate the evidence in favor or against precognition and psi. We examine a two-sided version of the proposed prior distribution in detail in the penultimate section of this paper.

The fourth complaint is that evidence should be combined across studies. We agree that, in an ideal world, combining information across multiple studies is useful. However, this is not a perfect world, and as stated in our response:

(...) we have assessed the evidential impact of Bem's experiments in isolation. It is certainly possible to combine the information across experiments, for instance by means of a meta-analysis (Storm, Tressoldi, & Di Risio, 2010; Utts, 1991). We are ambivalent about the merits of meta-analyses in the context of psi: one may obtain a significant result by combining the data from many experiments, but this may simply reflect the fact that some proportion of these experiments suffer from experimenter bias and excess exploration. When examining different answers to criticism against research on psi, Price (1955, p. 367) concluded "But the only answer that will impress me is an adequate experiment. Not 1000 experiments with 10 million trials and by 100 separate investigators giving total odds against chance of 10^1000 to 1, but just one good experiment."

We also note that Bem's article would most likely not have been published if it had to back away from the claim that the experiments showed independent evidence for precognition, i.e., when considered in isolation. JPSP does not publish many experiments with 200 participants that yield inconclusive results.

We now deal with each of the complaints in detail. The reader who is bored can safely skip to the Conclusion section.

Complaint 1: There Really Was No Exploration

Bem et al. (2011) deny that there was any exploration in the Bem (in press) experiments. They argue that the hypotheses were all based on prior research, and that even though multiple analyses were conducted, these analyses served to confirm the same point. This statement contrasts sharply with reality.


First of all, Bem et al. (2011) do not address the specific points of concern that we raised in four paragraphs of our response. For example, it is completely unclear why gender effects were tested in the first place, as Bem (in press) explicitly states that "the psi literature does not reveal any systematic sex differences in psi ability". In addition, our experience is that psychologists explore their data at least to some extent. When Bem et al. (2011) claim not to have explored the data at all, they effectively state that the research by Bem (in press) is the pinnacle of confirmatory research. This impression is inconsistent with a painfully detailed analysis of the Bem experiments by James Alcock.⁶ Moreover, this impression is also inconsistent with the quotation from the Bem chapters on writing that we presented in our response:

"The conventional view of the research process is that we first derive a set of hypotheses from a theory, design and conduct a study to test these hypotheses, analyze the data to see if they were confirmed or disconfirmed, and then chronicle this sequence of events in the journal article. (...) But this is not how our enterprise actually proceeds. Psychology is more exciting than that (...)" (Bem, 2000, p. 4).

Unfortunately, Bem et al. (2011) chose not to elaborate on the extent to which the philosophy behind this quotation (and others) discredits the conclusions from all statistical analysis, Bayesian, frequentist, or otherwise.

As a final indication that the results from Bem (in press) were obtained from exploration, Ray Hyman (personal communication) noted that in the Bem study the low effect sizes tended to occur in experiments with many participants. Figure 1 shows this association (see also Hyman, 1985). How can we explain this if the experiments were purely confirmatory?

In sum, Bem et al. (2011) fail to address the questions about exploration that we raised in our response. In addition, the Bem experiments with many participants show smaller effects than those with fewer participants. This strongly suggests that exploration (perhaps through optional stopping) did take place.
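
Why optional stopping can produce a negative association between sample size and effect size is easy to see in a simulation. The Python sketch below is a generic illustration, not a reconstruction of how any of the Bem experiments were run: the batch size of 10, the maximum of 200 participants, the number of simulated experiments, and the fixed one-sided threshold of 1.645 (a normal approximation to the t critical value) are all hypothetical choices. Every simulated experiment has a true effect of zero; the experimenter tests after each batch and stops as soon as the result is "significant".

```python
import math
import random


def run_experiment(rng, batch=10, max_n=200, crit=1.645):
    """Simulate one experiment under H0 (true effect zero) with optional stopping.

    After every batch of participants a one-sided t statistic is computed;
    data collection stops as soon as it exceeds `crit`.
    Returns (n, d) for a "significant" experiment, otherwise None.
    """
    total = 0.0
    total_sq = 0.0
    n = 0
    while n < max_n:
        for _ in range(batch):
            x = rng.gauss(0.0, 1.0)
            total += x
            total_sq += x * x
        n += batch
        mean = total / n
        sd = math.sqrt((total_sq - n * mean * mean) / (n - 1))
        t = mean / (sd / math.sqrt(n))
        if t > crit:
            return n, mean / sd  # sample size and observed effect size d
    return None


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(num_experiments=2000, seed=1):
    """Return (number of 'significant' experiments, correlation of n with d)."""
    rng = random.Random(seed)
    hits = [r for r in (run_experiment(rng) for _ in range(num_experiments)) if r]
    ns = [float(n) for n, _ in hits]
    ds = [d for _, d in hits]
    return len(hits), pearson(ns, ds)


if __name__ == "__main__":
    count, corr = simulate()
    print(f"'significant' experiments: {count}, correlation(n, d): {corr:.2f}")
```

The mechanism is that an experiment stopped at sample size n has an observed effect size just above the threshold divided by the square root of n, so experiments that stop late necessarily report small effects. A negative correlation of this kind is therefore consistent with optional stopping, although it does not by itself prove that optional stopping occurred.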

Complaint 2: A One-Sided Test is More Appropriate Than a Two-Sided Test

Bem et al. (2011) argue that the tests for precognition in the Bem studies should be one-sided, not two-sided. As pointed out above, the main problem with one-sided tests is that they may be used to bias the results. That is, a researcher without strong a priori expectations may await the data and select the one-sided test that produces the most convincing result. In fact, this disadvantage is illustrated in the very paper that Bem et al. (2011) seek to defend.
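
The size of this bias can be stated in one line for the frequentist case. The notation below is introduced here for illustration: p₊ and p₋ are the one-sided p values for the two possible directions, and the test statistic is assumed to have a continuous distribution that is symmetric around zero under H0, with α below one half.

```latex
p_{\text{post hoc}} = \min(p_{+},\, p_{-}) = \tfrac{1}{2}\, p_{\text{two-sided}},
\qquad
P\!\left(p_{\text{post hoc}} < \alpha \mid H_0\right)
= P\!\left(p_{\text{two-sided}} < 2\alpha \mid H_0\right) = 2\alpha .
```

Under these assumptions, a researcher who picks the direction after seeing the data and reports a one-sided p value below .05 is in effect running a two-sided test at the .10 level.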

The problem concerns Experiments 5, 6, and 7 and it is perhaps best illustrated with a comment from Rouder and Morey (2011),⁷ who also advocated the use of a one-sided test but excluded these experiments from consideration:

⁶ Available at http://www.csicop.org/specialarticles/show/back_from_the_future. Bem's response and Alcock's reply can also be found online.

⁷ Downloaded from http://pcl.missouri.edu/sites/default/files/rouder-morey.pdf on February 15th, 2011.
