
Debugging Systems-on-Chip: Communication-centric and Abstraction-based Techniques

311 Pages·2014·1.8 MB·English

Preview Debugging Systems-on-Chip: Communication-centric and Abstraction-based Techniques

3 Post-silicon Debugging of Multiple Building Blocks

3.1 Communication Between Two Building Blocks

[Fig. 3.13 Timing diagram. Rows: initiator state (sU_i, sV_i, sW_i, sU_i), target state (sU_t, sV_t, sU_t), phase (1, 2, 3, 4, 1), output data (d), valid, accept, and target register (d); horizontal axis: time t.]

The initiator and target both start in a state that is part of respectively the super-state sU_i and sU_t. The valid and accept control signals are set to logic-0 in these super-states. This marks the start of Phase 1 in the handshake protocol. The initiator subsequently starts Phase 2 of the protocol by transitioning to a state in the super-state sV_i, where it applies the data element d_1 to its data output and toggles its valid output signal once. These actions are synchronous to the initiator clock “clk_i”. The target continuously samples the valid signal of the initiator using the target clock “clk_t”. When the target is in super-state sU_t and observes that the initiator has toggled its valid signal, it responds by transitioning to a state in the super-state sV_t, where it samples the data signals in a local register once using its local clock signal. In the super-state sV_t, the target toggles its accept output signal once, synchronous to its local clock, to signal to the initiator that it has sampled the data. This ends Phase 2 of the protocol. The initiator continuously samples the accept signal of the target on its local clock. When the initiator is in the super-state sV_i and observes that the target has toggled its accept signal as well, it responds by transitioning to a state in the super-state sW_i. At this point the initiator is allowed to change the value on its data output.

The transfer of another data element d_2 occurs in the same way as the first data element d_1, except that the values of the valid and accept signals are inverted. During this second handshake, the initiator transitions from super-state sW_i through super-state sX_i back to a state in super-state sU_i, while the target transitions from super-state sV_t back to a state in super-state sU_t, all under control of the valid and accept signals.

The four-phase handshake protocol is similarly illustrated in Figs. 3.13, 3.14, and 3.15. The initiator and target again start in a state that is part of respectively the super-state sU_i and sU_t. The valid and accept control signals are set to logic-0 in these super-states. This marks the start of Phase 1 in this protocol. The initiator subsequently starts Phase 2 of this protocol by transitioning to a state in the super-state sV_i, where it applies the data element d to its data output and asserts its valid output signal. These actions are synchronous to the initiator clock. The target continuously samples the valid signal of the initiator using the target clock. When the target is in super-state sU_t and observes that the initiator has asserted its valid signal, it responds by transitioning to a state in the super-state sV_t. In super-state sV_t, it samples the data signals in a local register once using its local clock and asserts its accept output signal to indicate to the initiator that it has sampled the data. This ends Phase 2 of this protocol. The initiator continuously samples the accept signal from the target on its local clock. When the initiator is in super-state sV_i and observes that the target has asserted its accept signal, it responds by transitioning to a state in the super-state sW_i, where it deasserts its valid output signal synchronous to its local clock. At this point the initiator is allowed to change the value on its data output. This ends Phase 3 of this protocol. When the target is in the super-state sV_t and observes the deassertion of the valid signal by the initiator, it responds by transitioning to a state in the super-state sU_t, where it deasserts its accept output signal synchronous to its local clock. This ends Phase 4 of this protocol. When the initiator is in the super-state sW_i and observes that the target has deasserted its accept signal, it responds by transitioning to a state in the super-state sU_i, where it can optionally start the transfer of another data element.

[Fig. 3.14 Initiator STG. Super-states sU_i, sV_i, and sW_i; edge labels: -/0,-; -/0,-; 0/1,d; 1/1,d; 0/0,-; 1/0,-.]

[Fig. 3.15 Target STG. Super-states sU_t and sV_t; edge labels: 0,-/0; 0,-/1; 1,d/0; 1,d/1.]

Please note that the target STGs in Figs. 3.12 and 3.15 are simplifications of real implementations to help focus on the operation of the handshake protocols. As both handshake protocols allow the target to delay its acceptance of the data, the target STG can be extended as shown in Fig. 3.16. The STG in Fig. 3.16 is the STG of a target implementing the two-phase handshake protocol, but with two additional super-states sW_t and sX_t. The target does not immediately acknowledge a request from the initiator when it is in the super-state sW_t or sX_t. The initiator stalls in respectively super-state sV_i or sX_i until the target returns to respectively super-state sU_t or sV_t and accepts the data. The super-states sW_t and sX_t can for example be used to implement the uninterrupted processing of previously-received data elements by the target.

[Fig. 3.16 Extended target STG. Super-states sU_t, sV_t, sW_t, and sX_t; edge labels include 0,-/0; -,-/0; 0,d2/1; 1,d1/0; 1,-/1; -,-/1.]

These asynchronous handshake protocols do not prevent meta-stability on the sampled valid and accept control signals. Any meta-stability on these signals is however guaranteed to be eventually resolved, because the initiator stalls and thereby keeps its valid and data signals stable until the target acknowledges that it has accepted the request from the initiator. Meta-stability on the handshake control signals therefore does not cause meta-stable data to be sampled. No meta-stability occurs when the data signals are sampled in a register in the target, because these data signals are held stable by the initiator when the target samples them. Consequently, handshake-based communication techniques ensure the correct transfer of data by design, even when the clocks are asynchronous.

3.1.4 SOC Communication Protocols

The example in Fig. 3.8 shows a unidirectional communication link between an initiator and a target. This link consists of one signal group that comprises a valid handshake signal, an accept handshake signal, and a set of associated data signals. The handshake signals ensure the correct transfer of the data signals without meta-stability problems.
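A link with valid, accept, and data signals that uses the four-phase handshake described earlier in this section can be sketched as two small state machines, each stepped on its own clock. The sketch below is a simplified model under stated assumptions, not the book's implementation: time is counted in integer ticks, the function and parameter names (`four_phase_transfer`, `clk_i_period`, `clk_t_period`, `clk_t_phase`) are hypothetical, and meta-stability is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Signals shared by initiator and target (illustrative names)."""
    valid: int = 0
    accept: int = 0
    data: object = None


def four_phase_transfer(items, clk_i_period=3, clk_t_period=5,
                        clk_t_phase=1, max_time=10_000):
    """Transfer items over a four-phase handshake between two clock domains.

    Assumption: initiator edges occur at multiples of clk_i_period, target
    edges at clk_t_phase plus multiples of clk_t_period.
    """
    link = Link()
    pending = list(items)
    received = []
    i_state, t_state = "sU", "sU"
    for t in range(max_time):
        i_edge = t % clk_i_period == 0
        t_edge = t % clk_t_period == clk_t_phase
        # Sample before updating, so coinciding edges see the old values.
        valid_seen, accept_seen, data_seen = link.valid, link.accept, link.data
        if i_edge:
            if i_state == "sU" and pending:
                # Phase 2: apply data and assert valid.
                link.data, link.valid, i_state = pending[0], 1, "sV"
            elif i_state == "sV" and accept_seen == 1:
                # Phase 3: target has accepted, deassert valid.
                pending.pop(0)
                link.valid, i_state = 0, "sW"
            elif i_state == "sW" and accept_seen == 0:
                # Phase 4 observed: ready for the next element.
                i_state = "sU"
        if t_edge:
            if t_state == "sU" and valid_seen == 1:
                # Sample the data once and assert accept.
                received.append(data_seen)
                link.accept, t_state = 1, "sV"
            elif t_state == "sV" and valid_seen == 0:
                # Phase 4: deassert accept.
                link.accept, t_state = 0, "sU"
        if not pending and i_state == "sU" and t_state == "sU":
            break
    return received
```

In this model each element is sampled exactly once by the target, because the initiator holds its data output stable for as long as valid is asserted, independently of the chosen clock periods.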
Modern communication protocols, such as the AXI protocol [1], the device transaction level (DTL) protocol [14] and the open core protocol (OCP) [15], use a bidirectional communication link between an initiator and a target that comprises a request channel and a response channel (refer to Fig. 3.17). These two communication channels transfer the write and read transactions between the initiator and the target. A transaction comprises a request message from the initiator to the target and an optional response message from the target to the initiator. Each message consists of one or more data elements that are individually transferred between the initiator and the target using a handshake.

A write transaction consists of a write request message and an optional write response message. The write request message contains a write command element and one or more elements with the data to write. The optional write response message contains a write acknowledge element. A read transaction consists of a read request message and a read response message. The read request message contains a read command element, while the read response message contains one or more elements with the data that was read.
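The hierarchy of transactions, messages, and data elements described above can be written down as a small data model. This is an illustrative sketch only; the class and function names (`Message`, `Transaction`, `write_transaction`, `read_transaction`, `handshake_count`) are hypothetical and do not come from any of the cited protocol specifications.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    # Each element is transferred individually, using one handshake.
    elements: List[str]


@dataclass
class Transaction:
    request: Message                     # initiator -> target
    response: Optional[Message] = None   # target -> initiator (optional)


def write_transaction(data, with_ack=True):
    """Write command element plus data elements, and an optional acknowledge."""
    request = Message(["write cmd"] + [f"write data {d}" for d in data])
    response = Message(["write ack"]) if with_ack else None
    return Transaction(request, response)


def read_transaction(read_data):
    """Read command element, answered by one or more read data elements."""
    request = Message(["read cmd"])
    response = Message([f"read data {d}" for d in read_data])
    return Transaction(request, response)


def handshake_count(txn):
    """Number of handshakes needed: one per data element in either message."""
    count = len(txn.request.elements)
    if txn.response is not None:
        count += len(txn.response.elements)
    return count
```

For example, a write transaction with three data elements and an acknowledge needs five handshakes in this model: one command element, three data elements, and one acknowledge element.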
[Fig. 3.17 Communication request and response channels, transactions, messages, and data elements. The request channel (from initiator to target) carries a write request message (a write cmd element followed by write data elements) and a read request message (a read cmd element); the response channel (from target to initiator) carries a write response message (a write ack element) and a read response message (a read data element); horizontal axis: time.]

Table 3.2 Main signals and signal groups of the DTL communication protocol [14]

Name              Source(a)  Description
System group
  clk             S          DTL clock
  rst_an          S          Asynchronous DTL reset
Command group
  cmd_read        I          Command read operation
  cmd_addr        I          Command address
  cmd_block_size  I          Command block size
  cmd_rd_mask     I          Command read mask
  cmd_valid       I          Command valid
  cmd_accept      T          Command accept
Write group
  wr_data         I          Write data
  wr_mask         I          Write data byte mask
  wr_last         I          Write last
  wr_valid        I          Write valid
  wr_accept       T          Write accept
Read group
  rd_data         T          Read data
  rd_last         T          Read last
  rd_valid        T          Read valid
  rd_accept       I          Read accept

(a) S system, I initiator, T target

The operating principles of these protocols are very similar. We therefore illustrate these principles using the DTL protocol below, because we also use this protocol in our case study in Chap. 8. Table 3.2 gives an overview and short description of the main signal groups and signals involved in the DTL communication between an initiator and a target [14]. Table 3.2 also lists the source of each signal.
The DTL protocol is a synchronous communication protocol, i.e., it requires the initiator and target to be part of the same clock domain. This allows the maximum throughput of the communication link, of one data element per clock cycle, to be utilized during data transfers. The protocol however still prescribes the use of handshake control signals to control the data transfer between the initiator and the target. The handshake control signals allow the initiator and target to execute independently from each other when they are not communicating with each other. The initiator will only stall when it has data to communicate to the target and the target is not ready to receive this data yet.

The DTL protocol uses three signal groups to transfer commands and data between the initiator and the target, even though the use of a single signal group for each channel is sufficient. SOC protocols typically use multiple signal groups to support pipelined and concurrent transactions and thereby obtain a higher system performance.

[Fig. 3.18 Example DTL write and read transactions, based on [14]. Waveforms for clk, cmd_read, cmd_addr (address, address), cmd_rd_mask (0xF), cmd_block_size (0x02, 0x00), cmd_valid, cmd_accept, wr_data (d1, d2, d3), wr_mask (mask1, mask2, mask3), wr_last, wr_valid, wr_accept, rd_data (d4), rd_last, rd_valid, and rd_accept; the markers 1 to 4 identify the transfers discussed in the text.]

Figure 3.18 shows an example write transaction and an example read transaction on a DTL communication link, based on the DTL protocol specification in [14]. The initiator starts the transaction in both cases. It sends respectively a write command element or a read command element with command information to the target, using the signals in the command group (indicated with 1 and 2 in Fig. 3.18). This information includes the type of command (“cmd_read”), the start address (“cmd_addr”), and the block size (“cmd_block_size”). The validation of this information by the initiator and its subsequent acceptance by the target is indicated by respectively the “cmd_valid” and the “cmd_accept” handshake signal.
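The throughput and stall behaviour of a synchronous valid/accept link, as described above, can be sketched with a cycle-by-cycle model. The sketch assumes that an element is transferred in every clock cycle in which both the valid and the accept signal are asserted; this transfer rule and the names `synchronous_transfer` and `target_ready` are assumptions made for illustration and are not taken from the DTL specification.

```python
def synchronous_transfer(data, target_ready):
    """Model a single-clock valid/accept link.

    data:         elements the initiator wants to send, in order
    target_ready: one boolean per clock cycle; True if the target can accept
    Returns (received elements, number of cycles the initiator stalled).
    """
    pending = list(data)
    received = []
    stall_cycles = 0
    for ready in target_ready:
        if not pending:
            break                 # nothing left to send: no stall
        valid = True              # initiator has data, so it asserts valid
        accept = bool(ready)      # target asserts accept when it is ready
        if valid and accept:
            received.append(pending.pop(0))   # one element per clock cycle
        else:
            stall_cycles += 1     # initiator holds its data and waits
    return received, stall_cycles
```

With a target that is always ready, three elements take three cycles and the initiator never stalls. Every cycle in which the target is not ready while data is pending adds one stall cycle for the initiator.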
Write transfers take place similarly to the transfer of the command, i.e., the initiator provides the data to write by means of the signals in the write group (indicated as 3 in Fig. 3.18). This information includes the data to write (“wr_data”), a possible byte mask (“wr_mask”), and a flag indicating whether the current data element is the last element in the write request message (“wr_last”). The validation of the write data by the initiator and its subsequent acceptance by the target is indicated by respectively the “wr_valid” and the “wr_accept” handshake signal. In the example write transaction in Fig. 3.18, the command element specifies a write operation (“cmd_read=0”) and a block size of three elements (“cmd_block_size=2”). These three data elements are subsequently transferred from the initiator to the target. The transfer of the last write data element is indicated by the assertion of the “wr_last” signal.

A read response message takes place in the opposite direction compared to a read request message, i.e., the target provides the data that was read by means of the signals in the read group (indicated as 4). This information includes the data that was read (“rd_data”) and a flag indicating whether the current data element is the last element in the read message (“rd_last”). The validation by the target of the data read and its subsequent acceptance by the initiator is controlled by respectively the “rd_valid” and “rd_accept” handshake signals.

To enable communication between asynchronous building blocks using the DTL communication protocol, an SOC design team has to use a so-called clock domain crossing (CDC) module [2]. A CDC module consists of two asynchronous building blocks that communicate with each other using an asynchronous protocol (refer to Fig. 3.19). An initiator block in clock domain a can use the DTL interface on the initiator side of the CDC module to write data into the CDC module. A target block in clock domain b can use the DTL interface on the target side of the CDC block to read this data. A memory inside the CDC module keeps track of the data elements that have been written but not yet read.
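The example write transaction of Fig. 3.18 can be sketched as a command element followed by a list of write elements. The sketch assumes that the block size field holds the number of elements minus one, which is inferred only from the single example in the text (“cmd_block_size=2” for three elements). The function name `write_beats` and the default byte mask `0xF` are hypothetical choices for illustration.

```python
def write_beats(addr, data, masks=None):
    """Build the command element and write elements of one write request.

    Assumption: cmd_block_size encodes (number of elements - 1).
    Assumption: a byte mask of 0xF is used when no masks are supplied.
    """
    if not data:
        raise ValueError("a write request message needs at least one data element")
    if masks is None:
        masks = [0xF] * len(data)
    if len(masks) != len(data):
        raise ValueError("need exactly one mask per data element")
    command = {
        "cmd_read": 0,                      # 0 selects a write operation
        "cmd_addr": addr,
        "cmd_block_size": len(data) - 1,
    }
    last = len(data) - 1
    beats = [
        {"wr_data": d, "wr_mask": m, "wr_last": int(i == last)}
        for i, (d, m) in enumerate(zip(data, masks))
    ]
    return command, beats
```

Calling `write_beats(0x100, ["d1", "d2", "d3"])` yields a command element with a block size of 2 and three write elements, of which only the third has “wr_last” set.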
3.1.5 Variable Communication Duration

In exchange for a correct data transfer between an initiator and a target in different clock domains, asynchronous communication protocols introduce a variation in the duration of the handshakes. This is illustrated in Figs. 3.20 and 3.21 for the two-phase, asynchronous communication protocol. In Fig. 3.20, it takes two active edges on the initiator clock to complete Handshake I, measured from the active edge on which the initiator toggles its valid signal to the active edge on which the initiator samples a toggled accept signal from the target. Handshake II however takes three active edges on the initiator clock to complete. This difference in duration is caused by a difference in the clock periods of the initiator and target, and by the phase difference between the two clocks at the start of the handshakes (refer to Δφ1 and Δφ2 in Fig. 3.20). The handshakes in Fig. 3.20 are examples in which the control signals from the initiator and the target are sampled without meta-stability.

[Fig. 3.19 Block diagram of a CDC module. An initiator in clock domain a (clk_a) connects through a DTL interface (command, write, and read groups) to the initiator side of the CDC module; the target side of the CDC module connects through a DTL interface (command, write, and read groups) to a target in clock domain b (clk_b); the two sides of the CDC module communicate over an asynchronous interface.]

[Fig. 3.20 Non-determinism at clock-cycle level due to clock phase differences. Waveforms for initiator state, target state, initiator clock, target clock, valid, and accept over time t during Handshake I and Handshake II; Δφ1 and Δφ2 mark the clock phase differences at the start of each handshake.]

Figure 3.21 shows another example handshake, where the target samples the valid control signal from the initiator with meta-stability, because the valid signal is asserted by the initiator in the setup-and-hold interval around an active edge on the target clock signal. In this example, it takes one additional clock cycle of the target clock to resolve this meta-stability. This causes a total duration of Handshake III of four active edges on the initiator clock.
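The dependence of the handshake duration on clock periods, phase difference, and meta-stability can be explored with a small timing model. The sketch below is a deliberately simplified model, not a reproduction of Figs. 3.20 and 3.21: it uses integer time ticks, assumes that the target toggles accept on the same target clock edge on which it observes valid, and reports the duration in initiator clock periods rather than in counted active edges. The function and parameter names are hypothetical.

```python
def next_edge_after(t, period, phase):
    """Time of the first active edge strictly after t (edges at phase + k * period)."""
    k = (t - phase) // period + 1
    return phase + k * period


def handshake_duration(t_start, clk_i_period, clk_t_period, clk_t_phase,
                       metastable_cycles=0):
    """Initiator clock periods needed to complete one two-phase handshake.

    t_start:           initiator edge (a multiple of clk_i_period) on which
                       the initiator toggles valid
    metastable_cycles: extra target clock cycles needed to resolve
                       meta-stability on the sampled valid signal
    """
    # Target observes the toggled valid on its next edge, possibly later
    # when meta-stability has to be resolved first.
    t_target = next_edge_after(t_start, clk_t_period, clk_t_phase)
    t_target += metastable_cycles * clk_t_period
    # Initiator observes the toggled accept on its next edge after that.
    t_done = next_edge_after(t_target, clk_i_period, 0)
    return (t_done - t_start) // clk_i_period
```

With an initiator period of 4 ticks and a target period of 6 ticks offset by 1 tick, a handshake started at tick 0 completes in one initiator period in this model, while one started at tick 8 needs two, purely because of the different phase relation at the start. One extra target cycle for meta-stability resolution raises the latter to three.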
This difference in the duration of the handshake is visible as a variable communication latency between the two building blocks involved. This difference has to be taken into account by the SOC applications. We discuss the consequences of this variable delay next, in Sects. 3.2 and 3.3.

[Fig. 3.21 Non-determinism at clock-cycle level due to meta-stability. Waveforms for initiator state, target state, initiator clock, target clock, valid, and accept over time t during Handshake III.]

[Fig. 3.22 Example SOC with three building blocks. Initiator 1 (producer, clock domain A, clk_a) and Initiator 2 (consumer, clock domain C, clk_c) connect to a shared target (shared memory, clock domain B, clk_b) through the signals data_p, request_p, and ack_p, and data_c, request_c, and ack_c, respectively.]

3.2 Resource Sharing Between Building Blocks

It is common in a large SOC for a set of building blocks to require access to the same resource. For example, many of the building blocks in the SOC block diagram in Fig. 1.5 need write and read access to the external memory. The memory controller, shown at the top of Fig. 1.5, arbitrates between the write and read requests from these building blocks, by deciding the order in which they are applied to the off-chip SDRAM. The SDRAM is a shared resource in this SOC. The memory controller is the arbiter for this shared resource.

Figure 3.22 shows an Initiator 1 and an Initiator 2 that both require the services of a shared target. This shared target can however only accept and execute a single request at a time. Because the requests come to this target via separate ports, this target has to decide the order in which it services the requests from these two initiators. This process is called arbitration. Common resource arbitration algorithms include static, first-in/first-out, shortest job first, priority-based, and round-robin arbitration. We do not explore these specific algorithms here further, but instead analyze the possible effects that using a GALS design style has on the arbitration process.

In a GALS SOC, the request signals from initiators in other clock domains first need to be synchronized to the clock domain of the arbiter.
This synchronization process prevents meta-stability problems, but introduces variable latencies in the communication of both requests to the shared target (refer to Sect. 3.1.5). The synchronized requests are subsequently combined with the requests originating from the arbiter’s own clock domain and serviced in the order determined by the arbitration algorithm. Depending on the arbitration algorithm used, the arrival times of the requests at the inputs of the arbiter may have an impact on the order in which these requests are subsequently handled. A difference in this order may in turn influence the SOC execution.

Figures 3.23 and 3.24 illustrate this phenomenon for the two initiators and their shared target; Initiator 1 is a producer of data and Initiator 2 is a consumer of data. Both initiators communicate via separate ports to a shared memory. This shared memory can only accept and execute a single request at a time. In Fig. 3.23, the producer is the first initiator to start a write request, which is soon followed by a read request of the consumer. The producer’s request is the first request to arrive at the shared memory and is therefore executed first. Afterwards the request of the consumer is executed by the shared memory. Another possible sequence of events is shown in Fig. 3.24, in which the transactions execute in a different order. This time, due to different latencies on the communication path between the producer and the shared memory, and between the consumer and the shared memory, the read request of the consumer arrives before the write request of the producer. The read request of the consumer is therefore executed before the write request of the producer. The response message to the consumer may be different from what it was in the scenario in Fig. 3.23, because for example the consumer requested data from the shared memory before the producer was able to write it. This is reflected in the state of the shared memory, when the read request of the consumer comes in. In Fig. 3.23, this state …

[Fig. 3.23 “write before read” scenario. Timelines over time t for the producer (states s1_p, s2_p, s3_p), the shared memory (states s1_m, s2_m, s3_m), and the consumer (states s1_c, s2_c, s3_c).]

[Fig. 3.24 “read before write, and then reread” scenario. Timelines over time t for the producer (states s1_p, s2_p, s3_p), the shared memory (states s1_m, s2_m, s3_m, s4_m), and the consumer (states s1_c, s2_c, s4_c, s2_c, s3_c); the read data returned to the consumer is marked.]
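The effect of variable communication latency on the order of execution at the shared memory can be sketched as follows. This is a minimal model under stated assumptions: the shared memory services requests in order of arrival (first-in/first-out arbitration), the producer issues its write at tick 0 and the consumer its read at tick 1, and the latencies stand for the synchronization delay on each path. All names and values are hypothetical.

```python
def run_scenario(producer_latency, consumer_latency,
                 initial="old", written="new"):
    """Return the value the consumer reads from the shared memory.

    Each request is (arrival time, issue order, initiator, kind); ties in
    arrival time are broken by issue order (an assumption of this sketch).
    """
    requests = [
        (0 + producer_latency, 0, "producer", "write"),
        (1 + consumer_latency, 1, "consumer", "read"),
    ]
    memory = initial
    read_value = None
    # The shared memory executes a single request at a time, in arrival order.
    for _arrival, _order, _initiator, kind in sorted(requests):
        if kind == "write":
            memory = written
        else:
            read_value = memory
    return read_value
```

With equal latencies the write arrives first and the consumer reads the newly written value, as in the “write before read” scenario. When the producer's path is slower, for example `run_scenario(5, 1)`, the read is executed first and the consumer receives the old contents, which corresponds to the start of the “read before write” scenario.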

Description:
This book describes an approach and supporting infrastructure to facilitate debugging the silicon implementation of a System-on-Chip (SOC), allowing its associated product to be introduced into the market more quickly. Readers learn step-by-step the key requirements for debugging a modern, silicon S