Surpassing Humans and Computers with JELLYBEAN: Crowd-Vision-Hybrid Counting Algorithms

Akash Das Sarma (Stanford University), Ayush Jain (University of Illinois), Arnab Nandi (The Ohio State University), Aditya Parameswaran (University of Illinois), Jennifer Widom (Stanford University)

Copyright © 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Abstract

Counting objects is a fundamental image processing primitive, and has many scientific, health, surveillance, security, and military applications. Existing supervised computer vision techniques typically require large quantities of labeled training data, and even with that, fail to return accurate results in all but the most stylized settings. Using vanilla crowdsourcing, on the other hand, can lead to significant errors, especially on images with many objects. In this paper, we present our JellyBean suite of algorithms, that combines the best of crowds and computer vision to count objects in images, and uses judicious decomposition of images to greatly improve accuracy at low cost. Our algorithms have several desirable properties: (i) they are theoretically optimal or near-optimal, in that they ask as few questions as possible to humans (under certain intuitively reasonable assumptions that we justify in our paper experimentally); (ii) they operate under stand-alone or hybrid modes, in that they can either work independent of computer vision algorithms, or work in concert with them, depending on whether the computer vision techniques are available or useful for the given setting; (iii) they perform very well in practice, returning accurate counts on images that no individual worker or computer vision algorithm can count correctly, while not incurring a high cost.

1 Introduction

The field of computer vision (Forsyth and Ponce 2003; Szeliski 2010) concerns itself with the understanding and interpretation of the contents of images or videos. Many of the fundamental problems in this field are far from solved, with even the state-of-the-art techniques achieving poor results on benchmark datasets. For example, the recent techniques for image categorization achieve average precision ranging from 19.5% (for the chair class) to 65% (for the airplane class) on a canonical benchmark (Everingham et al. 2014). Another study (Nguyen, Yosinski, and Clune 2015) has shown that even the state-of-the-art deep-network methods are susceptible to high-confidence erroneous labels, recognizing objects within images containing no objects visible to the human eye.

Counting is one such fundamental image understanding problem, and refers to the task of counting the number of items of a particular type within an image or video.

Counting is important. Counting objects in images or videos is a ubiquitous problem with many applications. For instance, biologists are often interested in counting the number of cell colonies in periodically captured photographs of petri dishes; counting the number of individuals at concerts or demonstrations is often essential for surveillance and security (Liu et al. 2005); counting is often necessary in military applications; counting nerve cells or tumors is standard practice in medical applications (Loukas et al. 2003); and counting the number of animals in photographs of ponds or wildlife sanctuaries is often essential for animal conservation (Russell et al. 1996). In many of these scenarios, making errors in counting can have unfavorable consequences. Furthermore, counting is a prerequisite to other, more complex computer vision problems requiring a deeper, more complete understanding of images.

Counting is hard for computers. Unfortunately, current supervised computer vision techniques are typically very poor at counting for all but the most stylized settings, and cannot be relied upon for making strategic decisions. In our conversations with microbiologists, they report that they do not rely on automated counts (Carpenter et al. 2006), and will frequently spend hours painstakingly counting and cataloging photographs of cell colonies.
The computer vision techniques primarily have problems with occlusion, i.e., identifying objects that are partially hidden behind other objects. As an example, consider Figure 1, depicting the performance of a recent pre-trained face detection algorithm (Zhu and Ramanan 2012). The algorithm performs poorly for occluded faces, detecting only 35 out of 59 (59.3%) faces. The average precision for the state-of-the-art person detector is only 46% (Everingham et al. 2014). Furthermore, these techniques are not generalizable; separate models are needed for each new application. For instance, if instead of wanting to count the number of faces in a photograph, we needed to count the number of women, or the number of people wearing hoodies, we would need to start afresh by training an entirely new model.

Figure 1: Challenging image for Machine Learning

Even humans have trouble counting. While humans are much better at counting than automated techniques, and are good at detecting occluded (hidden) objects, and not missing them while counting, as the number of objects in the image increases, they start making mistakes. Psychology studies have estimated that human beings have a memory span of 7 chunks of information at a time (Miller 1956).

Figure 2: Worker Error — average error in count (y-axis) against the number of objects in the image (x-axis, 0 to 80).
While counting objects is a simpler problem than remembering objects, we expect a similar (but possibly higher) threshold for counting—this is because workers may end up counting the same object multiple times or not count them at all. To observe this behavior experimentally, we had workers count the number of cell colonies in simulated fluorescence microscope images with a wide range of counts. We plot the results in Figure 2, displaying the average error in count (on the y-axis) versus the actual count (on the x-axis). As can be seen in the figure, crowd workers make few mistakes until the number of cells hits 20 or 25, after which the average error increases. In fact, when the number of cells reaches 75, the average error in count is as much as 5. (There are in fact many images with even higher errors.) Therefore, simply showing each image to one or more workers and using those counts is not useful if accurate counts are desired (as we will show later).

The need for a hybrid approach. Thus, since both humans and computers have trouble with counting, there is a need for an approach that best combines human and computer capabilities to count accurately while minimizing cost. These techniques would certainly help with the counting problem instance at hand—the alternative of having a biology, security, military, medical, or wildlife expert count objects in each image can be error-prone, tedious, and costly. At the same time, these techniques would also enable the collection of training data at scale, and thereby spur the generation of even more capable computer vision algorithms. To the best of our knowledge, we are the first to articulate and make concrete steps towards solving this important, fundamental vision problem.

Key idea: judicious decomposition. Our approach, inspired by Figure 2, is to judiciously decompose an image into smaller ones, focusing worker attention on the areas that require more careful counting. Since workers have been observed to be more accurate on images with fewer objects, the key idea is to obtain reliable counts on smaller, targeted sub-images, and use them to infer counts for the original image. However, it is not clear when or how we should divide an image, or where to focus our attention by assigning more workers. For example, we cannot tell a priori if all the cell colonies are concentrated in the upper left corner of the image. Figuring out the granularity of this sub-division scheme is another problem. Once it has been decided that a particular image region needs to be divided, how many portions should it be divided into? Another challenge is to divide an image while being cognizant of the fact that you may cut across objects during the division. This could result in double-counting some objects across different sub-images. Lastly, given that there may be computer vision algorithms that can provide us a good starting point for deciding where to focus our attention, yet another challenge is to exploit this information in the best way possible.

Adaptivity to two modes. In the spirit of combining the best of human worker and computer expertise, when available, we develop algorithms that are near-optimal for two separate regimes or modes:

• First, assuming we have no computer vision assistance (i.e., no prior computer vision algorithm that could guide us to where the objects are in the image), we design an algorithm that will allow us to narrow our focus to the right portions of the image requiring special attention. The algorithm, while intuitively simple to describe, is theoretically optimal in that it achieves the best possible competitive ratio, under certain assumptions. At the same time, in practice, on a real crowd-counting dataset, the cost of our algorithm is within 2.75× of the optimal "oracle" algorithm that has perfect information, while still maintaining very high accuracy.

• Second, if we have primitive or preliminary computer vision algorithms that provide segmentation and prior count information, we design algorithms that can use this knowledge to once again identify the regions of the image to focus our resources on, by "fast-forwarding" to the right areas. We formulate the problem as a graph binning problem, known to be NP-COMPLETE, and provide an efficient articulation-point based heuristic for this problem. We show that in practice, on a real biological cell dataset, our algorithm has a very high accuracy, and only incurs 1.3× the cost of the optimal, perfect information "oracle" algorithm.

We dub our algorithms for these two regimes as the JellyBean algorithm suite, as a homage to one of the early applications of crowd counting.¹

¹ Counting or estimating the number of jellybeans in a jar has been a popular activity in fairs since the 1900s, while also serving as an unfortunate vehicle for disenfranchisement (NBC News 2005).

Here is the outline for the rest of the paper as well as our contributions (we describe related work in Section 6):

• We model images as trees with nodes representing image segments and edges representing image-division. Given this model, we present a novel formulation of the counting problem as a search problem over the nodes of the tree (Section 2).
• We present a crowdsourced solution to the problem of counting objects over a given image-tree. We show that under reasonable assumptions, our solution is provably optimal (Section 3).

• We extend the above solution to a hybrid scheme that can work in conjunction with computer vision algorithms, leveraging prior information to reduce the cost of the crowdsourcing component of our algorithm, while significantly improving our count estimates (Section 4).

• We validate the performance of our algorithms against credible baselines using experiments on real data from two different representative applications (Section 5).

2 Preliminaries

In this section, we describe our data model for the input images and our interaction model for worker responses. We provide a list of notations in Table 1 for easy reference.

Table 1: Notation Table
  G = (V, E)        Segmentation tree
  V0                Root node
  Image(Vi)         Image/segment corresponding to node Vi
  F                 Frontier
  TrueCount(Vi)     Actual number of objects in Image(Vi)
  WorkerCount(Vi)   Worker estimate of the number of objects in Image(Vi)
  Ci                Children of node Vi
  b                 Fanout of the segmentation tree
  d*                Counting threshold for worker errors
  GP = (VP, EP)     Partition graph

Figure 3: Segmentation Tree — root node V0 split into {V1, V2}; V1 split into {V3, V4}, and V2 into {V5, V6, V7}.

2.1 Data Model

Given an image with a large number of (possibly heterogenous) objects, our goal is to estimate, with high accuracy, the number of objects present. As noted above in Figure 2, humans can accurately count up to a small number of objects, but make significant errors on images with larger numbers of objects. To reduce human error, we split the image into smaller portions, or segments, and ask workers to estimate the number of objects in each segment. Naturally, there are many ways we may split an image. We discuss our precise algorithms for splitting an image into segments subsequently. For now, we assume that the segmentation is fixed.

We represent a given image and all its segments in the form of a directed tree G = (V, E), called a segmentation tree. The original image is the root node, V0, of the tree. Each node Vi ∈ V, i ∈ {0, 1, 2, ...}, corresponds to a sub-image, denoted by Image(Vi). We call a node Vj a segment of Vi if Image(Vj) is contained in Image(Vi). A directed edge exists between nodes Vi and Vj, (Vi, Vj) ∈ E, if and only if Vi is the lowest node in the tree, i.e., smallest image, such that Vj is a segment of Vi. Note that not all segments need to be materialized to actual images — the tree is conceptual and can be materialized on demand. For brevity, we refer to the set of children of node Vi (denoted as Ci) as the split of Vi. If Ci = {V1, ..., Vs}, we have Image(Vi) = ∪_{j ∈ {1,...,s}} Image(Vj).

For example, consider the segmentation tree in Figure 3. The original image, V0, can be split into the two segments {V1, V2}. V1, in turn, can be split into segments {V3, V4}, and V2 into {V5, V6, V7}. Intuitively, the root image can be thought of as physically segmented into the five leaf nodes {V3, V4, V5, V6, V7}.

We assume that all segments of a node are non-overlapping. That is, given any node Vi and its immediate set of children Ci, we have (1) Image(Vi) = ∪_{Vj ∈ Ci} Image(Vj) and (2) Image(Vj) ∩ Image(Vk) = φ for all Vj, Vk ∈ Ci. We denote the actual number of objects in a segment, Image(Vi), by TrueCount(Vi). A significant challenge is to ensure that each object appears in exactly one segment in Ci. We will describe how we handle this in Section 3.4. Our assumption of non-overlapping splits ensures that TrueCount(Vi) = Σ_{Vj ∈ Ci} TrueCount(Vj). Our first constraint (1) above ensures that each object is counted at least once, while the second constraint (2) ensures that none of the objects are counted more than once.

One of the major challenges of the counting problem is to estimate these TrueCount values with high accuracy, by using elicited worker responses. Given the segmentation tree G for image V0, we can ask workers to count, possibly multiple times, the number of objects in any of the segments. For example, in Figure 3, we can ask workers to count the number of objects in the segments (V3), (V4), (V5), (V6), (V7), (V1 = V3 ∪ V4), (V2 = V5 ∪ V6 ∪ V7), and (V0 = V3 ∪ V4 ∪ V5 ∪ V6 ∪ V7).
While we can obtain counts for different nodes in the segmentation tree, we need to consolidate these counts to a final estimate for V0. To help with this, we introduce the idea of a frontier, which is central to all our algorithms. Intuitively, a frontier F is a set of nodes whose corresponding segments do not overlap, and cover the entire original image, Image(V0), on merging. We formally define this notion below.

Definition 2.1 (Frontier). Let G = (V, E) be a segmentation tree with root node V0. A set of k nodes given by F = {V1, V2, ..., Vk}, where Vi ∈ V for all i ∈ {1, ..., k}, is a frontier of size k if Image(V0) = ∪_{Vi ∈ F} Image(Vi), and Image(Vi) ∩ Image(Vj) = φ for all Vi, Vj ∈ F.

A frontier F is now a set of nodes in the segmentation tree such that taking the sum of TrueCount(·) over these nodes returns the desired count estimate TrueCount(V0). Note that the set containing just the root node, {V0}, is a trivial frontier. Continuing with our example in Figure 3, we have the following five possible frontiers: {V0}, {V1, V2}, {V1, V5, V6, V7}, {V2, V3, V4}, and {V3, V4, V5, V6, V7}.
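To make the data model concrete, here is a minimal Python sketch of a segmentation tree and the frontier property. The node class and the example counts attached to the Figure 3 tree are illustrative assumptions, not code or numbers from the paper.

```python
# Minimal sketch of the segmentation-tree data model (illustrative only).
class SegmentNode:
    def __init__(self, name, true_count, children=None):
        self.name = name                # e.g. "V0", "V1", ...
        self.true_count = true_count    # TrueCount(Vi); unknown to the algorithm in practice
        self.children = children or []  # Ci, the split of Vi

def frontier_count(frontier):
    """Sum TrueCount over a frontier; by the non-overlap assumption this
    equals TrueCount(V0) whenever the frontier covers the root image."""
    return sum(node.true_count for node in frontier)

# The tree of Figure 3, with made-up counts:
# V0 -> {V1, V2}, V1 -> {V3, V4}, V2 -> {V5, V6, V7}.
V3, V4 = SegmentNode("V3", 8), SegmentNode("V4", 12)
V5, V6, V7 = SegmentNode("V5", 5), SegmentNode("V6", 9), SegmentNode("V7", 6)
V1 = SegmentNode("V1", 20, [V3, V4])
V2 = SegmentNode("V2", 20, [V5, V6, V7])
V0 = SegmentNode("V0", 40, [V1, V2])

assert frontier_count([V1, V2]) == frontier_count([V3, V4, V5, V6, V7]) == 40
```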
2.2 Worker Behavior Model

Intuitively, workers estimate the number of objects in an image correctly if the image has a small number of objects. As the number of objects increases, it becomes difficult for humans to keep track of which objects have been counted. As a result, some objects may be counted multiple times, whereas others may not be counted at all. Based on the experimental evidence in Figure 2, we hypothesize that there is a threshold number of objects, above which workers start to make errors, and below which their count estimates are accurate. Let this threshold be d*. So, in our interface, we ask the workers to count the number of objects in the query image. If their estimate, d, is less than d*, they provide that estimate. If not, they simply inform us that the number of objects is greater than d*. This allows us to cap the amount of work done by the workers and ensure that we pay them a minimum wage—the workers can count as far as they are willing to correctly, and if the number of objects is, say, in the thousands, they may just inform us that this is greater than d* without expending too much effort. We denote the worker's estimate of TrueCount(V) by WorkerCount(V):

  WorkerCount(V) = TrueCount(V)   if TrueCount(V) ≤ d*
  WorkerCount(V) = "> d*"          if TrueCount(V) > d*

Based on Figure 2, the threshold d* is 20. We provide further experimental verification for this error model in Appendix B. While we could choose to use more complex error models, we find that the above model is easy to analyze and experimentally valid, and therefore suffices for our purposes.

3 Crowdsourcing-Only Solution

In this section, we consider the case when we do not have a computer vision algorithm at our disposal. Thus, we must use only crowdsourcing to estimate image counts. Since it is often hard to train computer vision algorithms for every new type of object, this is a scenario that often occurs in practice. As hinted at in Section 2, the idea behind our algorithms is simple: we ask workers to estimate the count at nodes of the segmentation tree in a top-down expansion, until we reach a frontier such that we have a high confidence in the worker estimates for all nodes in that frontier.

3.1 Problem Setup

We are given a fixed b-ary segmentation tree, i.e., a tree with each non-leaf node having exactly b children. We also assume that each object is present in exactly one segment across siblings of a node, and that workers follow the behavior model from Section 2.2. Some of these assumptions may not always hold in practice, and we discuss their relaxations later in Section 3.4.

For brevity, we will refer to displaying an image segment (node in the segmentation tree) and asking a worker to estimate the number of objects, as querying the node. Our problem can therefore be restated as that of finding the exact number of objects in an image by querying as few nodes of the segmentation tree as possible. Next, we describe our algorithm for this setting in Section 3.2, and give complexity and optimality guarantees in Section 3.3.

3.2 The Frontier Seeking Algorithm

Our algorithm is based on the simple idea that to estimate the number of objects in the root node, we need to find a frontier with all nodes having fewer than d* objects. This is because the elicited WorkerCounts are trustworthy only at the nodes that meet this criterion. We call such a frontier a terminating frontier. If we query all nodes in such a terminating frontier, then the sum of the worker estimates on those nodes is in fact the correct number of objects in the root node, given our model of worker behavior. Note that terminating frontiers are not necessarily unique for any given image and segmentation tree.

Definition 3.1. A frontier F is said to be terminating if for all V ∈ F, TrueCount(V) ≤ d*.

If a node V has greater than d* objects, then we cannot estimate the number of objects in its parent node, and consequently the root node, without querying V's children. Our Algorithm 1, FrontierSeeking(G), depends on this observation for finding a terminating frontier efficiently, and correspondingly obtaining a count for the root node, V0. The algorithm simply queries nodes in a top-down expansion of the segmentation tree, for example, with a breadth-first or depth-first search. For each node, we query its children if and only if workers report its count as being higher than the threshold d*. We continue querying nodes in the tree, only stopping our expansion at nodes whose counts are reported as smaller than d*, until we have queried all nodes in a terminating frontier. We return the sum of the reported counts of nodes in this terminating frontier as our final estimate.

Algorithm 1 FrontierSeeking(G)
  Require: Segmentation tree G = (V, E)
  Ensure: Terminating frontier F, count estimate count
    F ← φ
    count ← 0
    create a queue Q
    Q.enqueue(V0)    {add root V0 to Q}
    while Q is not empty do
      V ← Q.dequeue()
      Query(V)
      if WorkerCount(V) > d* then
        Q.enqueue(children(V))
      else
        F ← F ∪ {V}
        count ← count + WorkerCount(V)
      end if
    end while
    return F, count
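The sketch below is one possible Python rendering of Algorithm 1 under the worker model of Section 2.2, reusing the SegmentNode tree from the earlier sketch. The query_workers stub stands in for posting a task to a crowdsourcing platform; simulating it from the (normally unknown) TrueCount is purely for illustration.

```python
from collections import deque

D_STAR = 20  # counting threshold d* suggested by Figure 2

def query_workers(node):
    """Stub for a crowd query. Under the worker behavior model it returns the
    exact count when TrueCount(node) <= d*, and the flag '>d*' otherwise."""
    return node.true_count if node.true_count <= D_STAR else ">d*"

def frontier_seeking(root):
    """Top-down expansion of the segmentation tree (Algorithm 1, BFS order)."""
    frontier, count = [], 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        answer = query_workers(node)
        if answer == ">d*":
            queue.extend(node.children)   # report untrustworthy: expand the split
        else:
            frontier.append(node)         # node joins the terminating frontier
            count += answer
    return frontier, count

# On the Figure 3 example above, frontier_seeking(V0) queries V0, V1 and V2
# (three questions) and returns the frontier [V1, V2] with count 40.
```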
Our tation tree, let F∗ be a terminating frontier of the smallest problemcanbethereforerestatedasthatoffindingtheexact possiblesize,havingk nodes.Althoughtherearemanyter- numberofobjectsinanimagebyqueryingasfewnodesof minatingfrontiers,thereexistsauniqueminimalterminating thesegmentationtreeaspossible.Next,wedescribeoural- frontier with k nodes, such that all other terminating fron- gorithm on this setting in Section 3.2, and give complexity tiers,F,have|F| > k.Ourgoalistofindtheminimalter- andoptimalityguaranteesinSection3.3. minatingfrontierwithasfewqueriednodesaspossible. First, we describe the complexity of a baseline optimal tree where our algorithm A queries fewer nodes, and A FS oracle algorithm using the following lemma, whose proof queries at least b k, where k is the size of the minimal b−1 followstriviallyfromtheobservationthatweneedtoquery terminatingfrontier. at least one complete terminating frontier to obtain a count fortherootnodeofthetree. 3.4 PracticalSetup Lemma3.1. Thenumberofquestionsrequiredtogetatrue Inthissectionwediscusssomeofthepracticaldesignchal- countofthenumberofobjectsinthegivenimageis≥k. lenges faced by our algorithm and give a brief overview ofourcurrentmechanismsforaddressingthesechallenges. Thus k is the minimum number of questions that an algo- Whileourcurrentmechanismssufficeforderivingaccurate rithm with prior knowledge of a minimal terminating fron- results,itremainstobeseeniffurtherexplorationintothese tier requires to determine the true count, if it were only al- choices would give greater benefits—this forms an impor- lowedtoaskquestionsatnodesofthetree.Weusethisasthe tantdirectionforourfutureworkinthisspace. Furtherde- baselinetocompareagainstonlinealgorithms,thatis,algo- tailscanbefoundinSection5. rithmswhichhavenopriorknowledgeanddiscovercounts ofsegmentsonlythroughtheprocessofqueryingthem.To Workererror.Sofar,wehaveassumedthathumanworker measure the performance of an algorithm against this opti- countsareaccuratefornodeswithfewerthand∗objectsper- mal baseline, we use a competitive ratio analysis (Borodin mitting us to query each node just a single time. However, 1998).LetA (G)denotethesequenceofquestionsasked, this is not always the case in practice. In reality, workers FS ornodesqueried,byanonlinedeterministicalgorithmAon may make mistakes while counting images with any num- segmentationgraphG.Let|A (G)|bethecorresponding berofobjects(andweseethismanifestinourexperiments FS numberofquestionsasked.Fortheoptimaloraclealgorithm aswell).So,inouralgorithms,weshoweachimageornode OPT, |OPT(G)| = k where k is the size of the mini- to multiple (five in our experiments) workers and aggre- mal terminating frontier of G. For brevity, we also denote gate their answers via median to obtain a count estimate FrontierSeekingalgorithmbyA . for that node. We observe that although individual work- FS ers can make mistakes, our aggregated answers satisfy our Definition 3.2 (Competitive ratio). Let G be the set of all assumptions in general (e.g., that the aggregate is always segmentation trees with fanout b. The competitive ratio of accurate when the count is less than d∗). While we use a analgorithmAisgivenbyCR(A)=max |A(G)| . primitiveaggregationschemeinthiswork,itremainstobe G∈G |OPT(G)| seen if more advanced aggregation schemes, such as those Intuitively,thecompetitiveratioofanalgorithmisamea- in (Karger, Oh, and Shah 2011; Parameswaran et al. 
3.4 Practical Setup

In this section we discuss some of the practical design challenges faced by our algorithm and give a brief overview of our current mechanisms for addressing these challenges. While our current mechanisms suffice for deriving accurate results, it remains to be seen if further exploration into these choices would give greater benefits—this forms an important direction for our future work in this space. Further details can be found in Section 5.

Worker error. So far, we have assumed that human worker counts are accurate for nodes with fewer than d* objects, permitting us to query each node just a single time. However, this is not always the case in practice. In reality, workers may make mistakes while counting images with any number of objects (and we see this manifest in our experiments as well). So, in our algorithms, we show each image or node to multiple (five in our experiments) workers and aggregate their answers via median to obtain a count estimate for that node. We observe that although individual workers can make mistakes, our aggregated answers satisfy our assumptions in general (e.g., that the aggregate is always accurate when the count is less than d*). While we use a primitive aggregation scheme in this work, it remains to be seen if more advanced aggregation schemes, such as those in (Karger, Oh, and Shah 2011; Parameswaran et al. 2012; Sheng, Provost, and Ipeirotis 2008), would lead to better results; we plan to explore these schemes in future work.

Segmentation tree. So far, we have also assumed that a segmentation tree with fanout b is already given to us. In practice, we are often only given the whole image, and have to create the segmentation tree ourselves. In our setup, we create a binary segmentation tree (b = 2) where the children of any node are created by splitting the parent into two halves along its longer dimension. As we will see later on, this choice leads to accurate results. While our algorithms also apply to segmentation trees of any fanout, further investigation is needed to study the effect of b on the cost and accuracy of the results.

Segment boundaries. We have assumed that objects do not cross segmentation boundaries, i.e., each object is present in exactly one leaf node, and cannot be partially present in multiple siblings. Our segmentation does not always guarantee this. To handle this corner case, in our experiments we specify the notion of a "majority" object to workers with the help of an example image, and ask them to only count an object for an image segment if the majority of it is present in that segment. Once again, we find that this leads to accurate results in our present experiments.

Note that this notion of "majority object" is harder to define for segmentation trees with fanout higher than 2, as the same object could be present in 1, 2, 3 or 4 image segments. To check for "majority", the worker would have to look at all segments containing a portion of the object simultaneously and gauge which contains the largest chunk. An alternate way to handle this could be to instead just ask the worker to populate a table for each displayed image containing two columns: (a) number of boundaries crossed, and (b) number of objects crossing these many boundaries. This table could then be used to avoid the double-counting of objects when aggregating counts across different image segments.

As mentioned earlier, while our present design decisions are adequate for high quality results, a complete exploration of the effect of aggregation schemes, fanout, and objects crossing segment boundaries is left for future work. We revisit these design decisions in Section 5.
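As a small illustration of these two design choices — binary splits along the longer dimension and median aggregation over five workers — here is a hedged Python sketch; it operates on NumPy arrays, whereas the actual deployment splits and serves image files.

```python
import numpy as np

def split_longer_dimension(image):
    """Split an image array into two halves along its longer dimension,
    giving the two children of a node in a b = 2 segmentation tree."""
    h, w = image.shape[:2]
    if h >= w:
        return image[: h // 2], image[h // 2 :]
    return image[:, : w // 2], image[:, w // 2 :]

def aggregate_worker_counts(counts):
    """Aggregate redundant worker answers for one node by their median
    (five answers per node in the experiments)."""
    return int(np.median(counts))

# Example: five noisy answers for the same segment.
assert aggregate_worker_counts([18, 20, 19, 25, 19]) == 19
```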
4 Incorporating Computer Vision

Unlike the previous section, where we assumed a fixed segmentation tree, here we use computer vision techniques (when easily available) to help build the segmentation tree, and use crowds to subsequently count segments in this tree. For certain types of images, existing machine learning techniques give two things: (1) a partitioning of the given image such that no object is present in multiple partitions, and (2) a prior count estimate of the number of objects in each partition. While these prior counts are not always accurate and still need to be verified by human workers, they allow us to skip some nodes in the implicit segmentation tree and "fast-forward" to querying lower nodes, thereby requiring fewer human tasks. Note that this is a departure from our top-down search based approach discussed in Section 3. We formulate our idea of fast-forwarding through the segmentation tree as a merging problem and discuss it in greater detail in Section 4.2.

4.1 Partitioning

As a running example, we consider the application of counting cells in biological images. Figure 4a shows one such image, generated using SIMCEP, a tool for simulating fluorescence microscopy images of cell populations (Lehmussola et al. 2007). SIMCEP is the gold standard for testing algorithms in medical imaging, providing many tunable parameters that can simulate real world conditions. We implement one simple partitioning scheme that splits any given such cell population image into many small, disjoint partitions. Applying this partitioning scheme to the image in Figure 4a yields Figure 4b. Combined with the partitioning scheme above, we can leverage existing machine learning (ML) techniques to estimate the number of objects in each of the partitions. We denote these ML-estimated counts on each partition, u, as prior counts or simply priors, du. Note that these priors are only approximate estimates, and still need to be verified by workers. We discuss details of our partitioning algorithm, prior count estimation, and other implementation details later in Section 4.3.

Figure 4: Biological image (a) before and (b) after partitioning

We use these generated partitions and prior counts to define a partition graph as follows:

Definition 4.1 (Partition Graph). Given an image split into the set of partitions, VP, we define its partition graph, GP = (VP, EP), as follows. Each partition, u ∈ VP, is a node in the graph and has a weight associated with it equal to the prior, w(u) = du. Furthermore, an undirected edge exists between two nodes, (u, v) ∈ EP, in the graph if and only if the corresponding partitions, u, v, are adjacent in the original image.

Notice that while we have used one partitioning scheme and one prior count estimation technique for our example here, other machine learning or vision algorithms for this, as well as other settings, provide similar information that will allow us to generate similar partition graphs. Thus, the setting where we begin with a partition graph is general, and applies to other scenarios.

Now, given a partition graph, one approach to counting the number of objects in the corresponding image could be to have workers count each partition individually. The number of partitions in a partition graph is, however, typically very large, making this approach impractical. For instance, most of the 5–6 partitions close to the lower right hand corner of the image above have precisely one cell, and it would be wasteful and expensive to ask a human to count each one individually. Next, we discuss an algorithm to merge these partitions into a smaller number, to minimize the number of human tasks.
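A partition graph of this kind is straightforward to materialize with NetworkX (the same library the paper later uses for articulation points); the toy priors and adjacency list below are invented for illustration, and in the real pipeline both come from the partitioning and prior-count steps of Section 4.3.

```python
import networkx as nx

def build_partition_graph(priors, adjacent_pairs):
    """Build G_P = (V_P, E_P): one node per partition, weighted by its prior
    count d_u, with an edge between partitions that touch in the image."""
    G = nx.Graph()
    for u, d_u in priors.items():
        G.add_node(u, weight=d_u)
    G.add_edges_from(adjacent_pairs)
    return G

# Toy example: four partitions in a row with priors 3, 1, 4 and 2.
G_P = build_partition_graph({0: 3, 1: 1, 2: 4, 3: 2}, [(0, 1), (1, 2), (2, 3)])
```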
4.2 Merging Partitions

Given a partition graph corresponding to an image, we leverage the prior counts on partitions to avoid the top-down expansion of segmentation trees described in Section 3. Instead, we infer the count of the image by merging its partitions together into a small number of bins, each of which can be reasonably counted by workers, and aggregating the counts across bins.

Merging problem. Intuitively, the problem of merging partitions is equivalent to identifying connected components (or bins) of the partition graph, with total weight (or count) at most d*. Since workers are accurate on images with size up to d*, we can then elicit worker counts for our merged components and aggregate them to find the count of the whole image. Overall, we have the following problem:

Problem 4.1 (Merging Partitions). Given a partition graph GP = (VP, EP) of an image, partition the graph into k disjoint connected components in GP, such that the sum of node weights in each component is less than or equal to d*, and k is as small as possible.

In other words, given a partition graph GP = (VP, EP) of an image, we wish to find m disjoint connected components in GP, such that the sum of node weights in each component is close to d*. Enforcing disjoint components ensures that no components overlap over a common object, thereby avoiding double-counting. Furthermore, restricting our search to connected components ensures that our displayed images are contiguous — this is a desirable property for images displayed to workers over most applications, because it provides useful, necessary context to understand the image.

Hardness and reformulation. The solution to the above problem would give us the required merging. However, the problem described above can be shown to be NP-Complete, using a reduction from the NP-Complete problem of partitioning planar bipartite graphs (Dyer and Frieze 1985); our setting uses arbitrary planar graphs, and so our problem is more general. Thus, we have:

Theorem 4.1 (Hardness). Problem 4.1 is NP-COMPLETE.
We consider an alternative formulation for the above balanced partitioning problem based on the idea that given an upper bound on the size of each component, balancing component sizes is intuitively similar to finding the minimum number of covering components satisfying the size constraint. That is, given a total weight on the graph of say D, and an upper bound on component weight d*, the problem of partitioning the graph into roughly even sized components with weights as close to (but not exceeding) d* will yield roughly D/d* components. Conversely, the problem of partitioning the segmentation graph into the smallest set of components, each upper bounded by size d*, will yield roughly D/d* even sized components. We formalize this problem of partitioning the segmentation graph into the smallest number of disjoint connected components bounded by size d* below. Note that while this problem is still NP-COMPLETE, as we see below, it is more convenient to design heuristic algorithms for it.

Problem 4.2 (Modified Merging Problem). Let dmax = max_u du, u ∈ VP, be the maximum partition weight in the partition graph GP = (VP, EP). Split GP into the smallest number of disjoint, connected components such that for each component, the sum of its partition weights is at most k × dmax.

By setting k ≤ d*/dmax in the above problem, we can find connected components whose prior counts are estimated to be at most d*. Observe that here, although we do not start out with a segmentation tree, the partitions provided by the partitioning algorithm can be thought of as leaf nodes of a segmentation tree and our merged components form parents, or ancestors, of the leaf nodes. We comment on the relationship between the partition graphs (discussed in this section) and the segmentation graphs (discussed in Section 3) in greater detail below.

Relationship between partition graphs and segmentation trees. So far we have looked at the problem of counting given a partition graph as an independent merging problem. In this section, we discuss how this problem is related to that of counting on a segmentation tree described in Section 3. First, we note that since partitions are atomic segments that we do not split further, they can be thought of as the leaf nodes of a segmentation tree for a given image. Observe that fixing the partition graph does not necessarily fix a segmentation tree – there can be many segmentation trees with the same set of leaf nodes (partitions). Intuitively, each intermediate (non-leaf) node in any such segmentation tree is the union of a group of adjacent partitions (or leaf nodes), and its prior count estimate is the sum of the priors of its constituent partitions. Note that by only taking the union of adjacent partitions to form intermediate nodes of the segmentation tree, we restrict ourselves to trees where each node, or image, is contiguous.

If the prior count on an intermediate node is greater than d*, we can now directly query a lower node in the segmentation tree, without verifying this image against the crowd. Intuitively, this is exactly what we do when we display our merged components to workers — we are skipping the nodes for which querying them would anyway yield incorrect answers resulting in image splitting. This is equivalent to jumping ahead to a lower layer of the segmentation tree, one that is potentially very close to a terminating frontier. Also note that jumping too far ahead (segments having counts ≪ d*), while yielding accurate counts on each node, will result in the counting of a large number of nodes, which will correspondingly increase the cost of counting. As a compromise, we wish to jump ahead to nodes that ideally each have prior counts just smaller than or equal to d*. This motivates our merging Problem 4.2.

Each component produced via a solution of Problem 4.2 also corresponds to an actual image segment formed by merging its member partitions: if the prior counts are accurate, these image segments together comprise a minimal terminating frontier for some segmentation tree. While in practice they need not necessarily form a minimal terminating frontier, or indeed even a terminating frontier, we observe that they provide very good approximations for one.

Given the hardness of this modified merging problem, we now discuss good heuristics for it, and provide theoretical and experimental evidence in support of our algorithms. Note that while we discuss the complexity of our merging algorithms, in practice their running time is small compared to the latency of human workers. It is still impractical to run a brute force algorithm to solve the merging problem optimally – fortunately, our second, relatively sophisticated heuristic, ArticulationAvoidance, described below, achieves the theoretical optimum for most problem instances and is very close to it for the rest.

FirstCut Algorithm. One simple, natural approach to Problem 4.2, motivated by the first fit approximation to the Bin-Packing problem (Coffman Jr, Garey, and Johnson 1996), is to start a component with one partition, and incrementally add neighboring partitions one-by-one until no more partitions can be added without violating the upper bound, k × dmax, on the sum of vertex weights. We refer to this as the FirstCut algorithm, and detail it in Algorithm 2.
Algorithm 2 FirstCut(GP)
  Require: Partition graph GP = (VP, EP)
  Ensure: Set SP of components
    Vcopy ← VP
    SP ← φ
    while Vcopy is not empty do
      newComponent ← φ
      choose V ∈ Vcopy
      newComponent ← newComponent ∪ {V}
      N ← Vcopy.neighbors(V)
      Vcopy.removeNode(V)
      while sum of vertex weights in newComponent < k × dmax and N is not empty do
        choose V ∈ N
        newComponent ← newComponent ∪ {V}
        N ← N ∪ Vcopy.neighbors(V)
        N.removeNode(V)
        Vcopy.removeNode(V)
      end while
      SP ← SP ∪ {newComponent}
    end while

Consider running this algorithm on a complete partition graph (every partition touches every other partition). By repeating this procedure to termination, (1) each of the merged components should have between (k−1) × dmax and k × dmax objects, and (2) the number of components obtained will be at most (k/(k−1)) × OPT, where OPT is a theoretical lower bound to the minimum number of components possible in Problem 4.2. In practice, however, we find that for several graphs, certain partitions and components get disconnected by the formation of other components during this process, resulting in a sub-optimal merging. For example, consider the partitioning shown in Figure 5.

Figure 5: Articulation Point

Suppose partitions A, ..., G contain 100 objects each and parameter k = 6. The maximum allowed size for a merged component is 6 × dmax = 6 × 100 = 600. Supposing we start a component with A, and incrementally merge in partitions B, ..., F, we end up isolating G as an independent merged component. This causes a significant departure from our first bound on component size described above (the component containing G should have held at least (k−1) × dmax = 500 objects), which in turn will result in a higher final number of merged components than optimal. Note that if we relaxed our requirement of only merging adjacent partitions and allowed arbitrary non-connected components, the above guarantees would hold for our FirstCut algorithm over arbitrary partition graphs.

Theorem 4.2 (Complexity). The worst case complexity of the FirstCut algorithm is O(n²), where n is the number of nodes in the partition graph.

Proof. First, we observe that for each growing component, each partition is examined at most once — if it is not added to the component at that time (because adding it will cause the component to exceed the threshold size), then it need never be looked at for that component again. So each component takes at most O(n) time to form, where n is the total number of partitions. In the worst case, when every component is an individual node (for example, if the partition graph was a star with every node connected to just one common large central node), the number of components formed could at most be O(n), resulting in an overall complexity of O(n²) for the FirstCut algorithm.

While the number of components formed could at most be O(n), in practice, given an upper bound on component size of k·dmax, we observe that this number is close to n/k, resulting in an effective runtime of O(n²/k). Also, if we relax the requirement of only merging adjacent partitions, then the worst case running time of our algorithm is O(n²/k), since (as discussed above) every component will have size lying between (k−1)·dmax and k·dmax, resulting in O(n/k) components.
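A compact Python version of the FirstCut idea is sketched below, operating on the NetworkX partition graph from the earlier sketch. It is a slight variant of Algorithm 2 in that it never lets a component exceed k × dmax (a neighbor that would overflow is skipped and others are tried), which is one reasonable reading of the pseudocode rather than the paper's exact implementation.

```python
def first_cut(G, k, d_max):
    """First-fit style merging (sketch of Algorithm 2): grow one connected
    component at a time, adding neighboring partitions while the summed
    prior weight stays within k * d_max."""
    unmerged = set(G.nodes)
    components = []
    while unmerged:
        seed = next(iter(unmerged))
        unmerged.remove(seed)
        comp, weight = {seed}, G.nodes[seed]["weight"]
        frontier = set(G.neighbors(seed)) & unmerged
        while frontier:
            u = frontier.pop()             # examine each candidate at most once
            if weight + G.nodes[u]["weight"] > k * d_max:
                continue                   # would overflow this bin; skip it
            comp.add(u)
            weight += G.nodes[u]["weight"]
            unmerged.remove(u)
            frontier |= set(G.neighbors(u)) & unmerged
        components.append(comp)
    return components

# e.g. first_cut(G_P, k=2, d_max=4) on the toy graph above merges the four
# partitions into bins whose summed prior weights never exceed 8.
```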
Next, we describe an algorithm that builds on the binning ideas used in Algorithm 2, while also efficiently handling cases like the one shown in Figure 5.

ArticulationAvoidance Algorithm. Recall that applying our first cut procedure to Figure 5 results in poor quality components if we merge partitions B, ..., F to A before G. Intuitively, when adding B to the component containing A, the partition graph is split into two disconnected components: one containing G, and another containing C, ..., F. Given our constraint requiring connected components (contiguous images), this means that partition G can never be part of a reasonably sized component. This indicates that merging articulation partitions like B, i.e., nodes or partitions whose removal from the partition graph splits the graph into disconnected components, potentially results in imbalanced final merged components:

Definition 4.2 (Articulation Partition). A node v in graph GP is defined to be an articulation partition iff removing it (along with the edges through it) disconnects the graph GP.

Since adding articulation partitions early results in the formation of disconnected components or imbalanced islands, we implement our ArticulationAvoidance algorithm, detailed in Algorithm 3, that tries to merge them into growing components as late as possible. We merge partitions as before, growing one component at a time up to an upper bound size of k × dmax, but we prioritize the adding of non-articulation partitions first. With each new partition, u, added to a growing component, we update the graph of the remaining unmerged partitions by removing the node corresponding to u from the remaining partition graph — we also update our list of articulation partitions for the new graph and repeat this process until all partitions have been merged into existing components.

Algorithm 3 ArticulationAvoidance(GP)
  Require: Partition graph GP = (VP, EP)
  Ensure: Set SP of components
    Vcopy ← VP
    SP ← φ
    while Vcopy is not empty do
      newComponent ← φ
      choose V ∈ Vcopy, with priority to non-articulation points
      newComponent ← newComponent ∪ {V}
      N ← Vcopy.neighbors(V)
      Vcopy.removeNode(V)
      update articulation points of Vcopy
      while sum of vertex weights in newComponent < k × dmax and N is not empty do
        choose V ∈ N, with priority to non-articulation points
        newComponent ← newComponent ∪ {V}
        N ← N ∪ Vcopy.neighbors(V)
        N.removeNode(V)
        Vcopy.removeNode(V)
        update articulation points of Vcopy
      end while
      SP ← SP ∪ {newComponent}
    end while
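The sketch below is one way to realize Algorithm 3 in Python on top of NetworkX, recomputing articulation points of the remaining graph after every merge exactly as described above; the tie-breaking details (which fitting non-articulation neighbor to take first) are assumptions, since the pseudocode leaves them open.

```python
import networkx as nx

def articulation_avoidance(G, k, d_max):
    """Sketch of Algorithm 3: like FirstCut, but when seeding or growing a
    component, prefer partitions that are NOT articulation points of the
    graph of still-unmerged partitions."""
    remaining = G.copy()
    components = []
    while remaining.number_of_nodes() > 0:
        art = set(nx.articulation_points(remaining))
        seed = next((v for v in remaining.nodes if v not in art),
                    next(iter(remaining.nodes)))
        comp, weight = {seed}, remaining.nodes[seed]["weight"]
        frontier = set(remaining.neighbors(seed))
        remaining.remove_node(seed)
        while frontier:
            art = set(nx.articulation_points(remaining))
            # candidates that still fit under the k * d_max bound
            fitting = [v for v in frontier
                       if weight + remaining.nodes[v]["weight"] <= k * d_max]
            if not fitting:
                break
            v = min(fitting, key=lambda u: (u in art, u))  # non-articulation first
            comp.add(v)
            weight += remaining.nodes[v]["weight"]
            frontier |= set(remaining.neighbors(v))
            frontier.discard(v)
            remaining.remove_node(v)
            frontier &= set(remaining.nodes)
        components.append(comp)
    return components
```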
Again, we state the worst-case complexity of our algorithm below, and discuss its proof as well as the algorithm's complexity in practice in our technical report.

Theorem 4.3 (Complexity). The worst case complexity of the ArticulationAvoidance algorithm is O(n²(n+m)), where n is the number of nodes and m is the number of edges in the partition graph.

Proof. Each time a partition is added to an existing component, we recompute the set of articulation points. Since the maximum size of a component is k·dmax, we update the set of articulation points at most min(n, k·dmax) times for each component (if each new partition added is of size 1). We use a standard library to compute the set of articulation points, whose running time is O(n+m), where n is the number of partitions and m is the number of edges between partitions in the partition graph. Here, we use the articulation_points function of Python's NetworkX library (Hagberg, Schult, and Swart 2008), which is implemented using a non-recursive depth-first search (DFS) that keeps track of the highest level that back edges reach in the DFS tree. Similar to the FirstCut algorithm, for each growing component, each partition is examined at most once. So each component takes at most O(n) + O((n+m)·min(n, k·dmax)), or O(n(n+m)), time to form. Again, in the worst case, the number of components formed could at most be O(n), resulting in an overall complexity of O(n²(n+m)) for the ArticulationAvoidance algorithm.

While the worst case complexity is O(n²(n+m)), in practice, we observe that the number of components formed using this algorithm is closer to n/k, resulting in an effective runtime of O(n²(n+m)/k).

Evaluation. We perform extensive evaluation of our algorithms on synthetic and real partition graphs on the metric of number of components generated. Since the optimal, minimum number of bins is not known to us, we compare our algorithms against the theoretical minimum. That is, given a partition graph with sum of partition weights N, highest partition weight dmax, and upper bound on component size k·dmax, the theoretical minimum number of components possible is N/(k·dmax). Note that this is a lower bound to the true optimal number of components. We show one representative plot for the variation in the number of components created (y-axis) on the partitioned image shown in Figure 4b, against the maximum component size determined by k (x-axis), in Figure 6. The image has 315 objects with dmax = 5, and the plot is representative of the following general trend — we observe that ArticulationAvoidance performs close to the theoretical optimum; FirstCut, on the other hand, often gets stuck at articulation partitions, unable to meet the theoretical optimum.

Figure 6: Performance of algorithms for merging partitions (number of bins generated vs. k, for FirstCut, ArticulationAvoidance, and Min): Min corresponds to the absolute minimum number of segments required such that each segment has less than k × dmax objects.

4.3 Practical Setup

In this section we discuss some of the implementation details of and challenges faced by our algorithms in practice. Many of the challenges faced in Section 3.4 apply here as well.

Partitioning. The first step of our algorithm is to partition the image into small, non-overlapping partitions. To do this, we use the marker-controlled watershed algorithm (Beucher and Meyer 1992). The foreground markers are obtained by background subtraction using morphological opening (Beucher and Meyer 1992) with a circular disk.

Prior counts. In the example of Figure 4, we learn a model for the biological cells using a simple Support Vector Machine classifier, trained on positive and negative examples extracted from 25 such images. For a test image, every 15 × 15 pixel window in the image is classified as 'cell' or 'not cell' using the learned model to obtain a confidence map of the original image. The confidence value associated with each window is a measure of the probability that the window contains a cell. Simple local maxima search and thresholding of this confidence map provides the locations of the cells. For more details of this approach, we refer the reader to (Nattkemper et al. 2002; Sarma et al. 2015). Empirically we observed that our choice of threshold = 0.5 gave closest estimates to the true counts. Note that this procedure always undercounted, that is, the prior count estimate obtained for any partition was always smaller than the true number of objects in that partition. Setting a lower threshold value can cause overcounting, and we found these estimates to be more erroneous than the ones obtained by higher thresholds. Additionally, if the machine learning algorithm overcounts, then each component will have TrueCount < d*. This implies that the corresponding nodes will lie below the actual minimal terminating frontier in the segmentation tree, and the size of our guess of the frontier will be higher than the minimal frontier. It is thus more costly (in terms of number of nodes queried) to overcount.
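As a rough illustration of the prior-count step, the snippet below counts thresholded local maxima in a confidence map; the maximum-filter neighborhood and the use of SciPy are implementation assumptions — the paper specifies only the SVM classifier, the 15 × 15 windows, and the 0.5 threshold.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def prior_count(confidence_map, threshold=0.5, neighborhood=7):
    """Estimate a partition's prior d_u from its per-pixel confidence map:
    keep local maxima whose confidence exceeds the threshold and count them.
    The neighborhood size is an assumed, tunable detail."""
    is_local_max = confidence_map == maximum_filter(confidence_map, size=neighborhood)
    return int(np.count_nonzero(is_local_max & (confidence_map > threshold)))
```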
Traversing the Segmentation Tree. While Section 4.2 gives us a set of merged components, or unified image segments with prior count estimates, we still need to show these images to human workers to verify the counts. As mentioned earlier in the same section, by setting k = ⌊d*/dmax⌋, we find connected components whose prior count is estimated to be at most d*. Since the prior count estimates are approximate and lower than the true counts, we expect these merged image segments to have true counts close to, and potentially higher than, d*. One option is to have (multiple) workers simply count each of these image components and aggregate the counts to get an estimate for the whole image. Since some of these image components may have counts higher than our set worker threshold of d*, our model tells us that worker answers on the larger components could be inaccurate. So, another option is to use these images as a starting point for an expansion down the segmentation tree, and perform a FrontierSeeking search similar to that in Section 3 by splitting these segments until we reach segments whose counts are all under d*. We compare these two counting algorithms (termed AA and AA-Split) further in Section 5.

5 Experimental Study

We deployed our crowdsourcing solution for counting on two image datasets that are representative of the many applications of our work. We examine the following questions:

• How do the JellyBean algorithms compare with the theoretically best possible "oracle" algorithms on cost?
• How accurate are the JellyBean algorithms relative to machine learning baselines?
• What are the monetary costs of our algorithms, and how do they scale with the number of objects?
• How accurate are the JellyBean algorithms relative to directly asking workers to count on the entire image?
• Do we get any additional benefit from counting nodes below the initially determined terminating frontier (when we work in concert with an ML algorithm)?

Later, in Section 5.4, we provide details of experiments evaluating various answer aggregation schemes beyond median. In our appendix, we experimentally study worker behavior, including the frequency of errors and level of confidence in answers.
5.1 Datasets

Dataset description. Our first dataset is a collection of 12 images from Flickr. These images, shown in Figure 7, depict people in various settings, with the number of people (counts) ranging from 41 to 209. This dataset is representative of the applications in surveillance. This is a challenging dataset, with people looking very different across images — ranging from partially to completely visible, aligned at different angles and sizes, and varying backgrounds. Furthermore, no priors or partitions are available for these images — so we evaluate our solutions from Section 3 on this dataset. For the rest of this section, we refer to this as the crowd dataset.

Figure 7: Images in the crowd dataset

The second dataset consists of 20 simulated images showing biological cells, generated using SIMCEP (Lehmussola et al. 2007). The number of objects in each image was sampled uniformly from the range 150 to 350. The minimum number of objects in an image in our (randomly) generated dataset was 151, and the maximum was 328. The computer vision techniques detailed in Section 4 are applied on each of these images to get prior counts and partitions. For the remainder of this section, we refer to this as the biological dataset.

Segmentation Tree Construction. For the crowd dataset, the segmentation tree was constructed with fanout b = 2 until a depth of 5, for a total of 31 nodes per image. At each stage, the image was split into two equal halves along the longer dimension. This ensures that the aspect ratios of all segments are close to the aspect ratio of the original image. Given a segment, workers were asked to count the number of 'majority' heads (as described in Section 3.4) — if a head crossed the image boundary, it was to be counted only if the worker felt that a majority of it was visible. To aid the worker in judging whether a majority of an object lies within the image, the surrounding region was also shown demarcated by clear lines. One such image is shown in Figure 8b.

For the biological dataset, bins were generated by our ArticulationAvoidance algorithm. As detailed in Section 4, we start our search for the terminating frontier from these nodes. Starting from these bins, subsequent layers of the segmentation tree are constructed by applying ArticulationAvoidance recursively, but with a reduced
