《Reed–Solomoncodesforcoders.docx》由会员分享,可在线阅读,更多相关《Reed–Solomoncodesforcoders.docx(70页珍藏版)》请在课桌文档上搜索。
1、Reed-SolomoncodesforcodersJUnIPtonavigationjumptoSearChErrOrCorreCtingCOdeSareasignalprocessingtechniquetocorrecterrors.Theyarenowadaysubiquitous,suchasincommunications(mobilephone,internet),datastorageandarchival(harddrives,opticaldiscsCD/DVD/BluRay,archivaltapes),warehousemanagement(barcodes)andad
2、vertisement(QRcodes).RuUd-SOIOmOnUrrOrCOrreCtionisaspecifictypeoferrorcorrectioncode.Itisoneoftheoldestbutitisstillwidelyused,asitisverywelldefinedandseveralefficientalgorithmsarenowavailableunderthepublicdomain.Usually,errorcorrectioncodesarehiddenandmostusersdonotevenknowaboutthem,norwhentheyareus
3、ed.Yet,theyareacriticalcomponentforsomeapplicationstobeviable,suchascommunicationordatastorage.Indeed,aharddrivethatwouldrandomlylosedataeveryfewdayswouldbeuseless,andaphonebeingabletocallonlyondayswithacloud-lessweatherwouldbeseldomused.Usingerrorcorrectioncodesallowstorecoveracorruptedmessageintot
4、hefulloriginalmessage.BarcodesandQRcodesareinterestingapplicationstostudy,astheyhavethespecificityofdisplayingvisuallytheerrorcorrectioncode,renderingthesecodesreadilyaccessibletothecurioususer.Inthisessay,wewillattempttointroducetheprinciplesofReed-Solomoncodesfromthepointofviewofaprogrammerrathert
5、hanamathematician,whichmeansthatwewillfocusmoreonthepracticethanthetheory,althoughwewillalsoexplainthetheory,butonlythenecessaryknowledgeforintuitionandimplementation.Notablereferencesinthedomainwillbeprovided,sothattheinterestedreadercandigdeeperintothemathematicaltheoryatwill.Wewillprovidereal-wor
6、ldexamplestakenfromthepopularQRCOdebarcodesystemaswellasworkingcodesamples.WechosetousePythonforthesamples(mainlybecauseitlooksprettyandsimilartopseudocode),butwewilltrytoexplainanynon-obviousfeaturesforthosewhoarenotfamiliarwithit.Themathematicsinvolvedisadvancedinthesensethatitisnotusuallytaughtbe
7、lowtheuniversitylevel,butitshouldbeunderstandabletosomeonewithagoodgraspofhigh-schoolalgebra.Wewillfirstgentlyintroducetheintuitionsbehinderrorcorrectioncodesprinciples,theninasecondsectionwewillintroducethestructuraldesignofQRcodes,inotherwordshowinformationisstoredinaQRcodeandhowtoreadandproduceit
8、,andinathirdsectionwewillstudyerrorcorrectioncodesviatheimplementationofaReed-Solomondecoder,withaquickintroductionofthebiggerBCHcodesfamily,inordertoreliablyreaddamagedQRcodes.NoteforthecuriousreadersthatUXtUndedinformationCanbefoundinth。appendixandonthediscussionpage.Contents !Principlesoferrorcor
9、rection 2QRcodestructureo2.IMaskingo2.2Formattinginformationo2.3Messagedatao2.4Decoding 3BCHcodeso3.IBCHerrordetectiono3.2BCHerrorcorrection 4Finitefieldarithmetico4.!Introductiontomathematicalfieldso4.2dditionandSubtractiono4.SMultiplicationo4.4Multiplicationwithlogarithmso4.5Divisiono4.6PowerandIn
10、verseo4.7Polynonials 5Reed-Solomoncodeso5.Hnsightofthecodingtheoryo5.2RSencoding 5.2.!Encodingoutline 5.2.2Exceptionmanagement 5.2.3RSgeneratorpolynomial 5.2.4Polynomialdivision 5.2.5Encodingmainfunctiono5.3RSdecoding 5.3.!Decodingoutline 5.3.2Syndromecalculation 5.3.3Erasurecorrection5.3.4Errorcorr
11、ection5.3.5Erroranderasurecorrectiono5.4Wrappingupwithanexample 6Conclusionandgoingfurther 7Third-partyimplementations 8Externallinks 9ReferencesPrinciplesoferrorCOrreCtion孰editSOlIrCelBeforedetailingthecode,itmightbeusefultounderstandtheintuitionbehinderrorcorrection.Indeed,althougherrorcorrectingc
12、odesmayseemdauntingmathematically-wise,mostofthemathematicaloperationsarehighschoolgrade(withtheexceptionofGaloisFields,butwhichareinfacteasyandcommonforanyprogrammer:it,ssimplydoingoperationsonintegersmoduloanumber).However,thecomplexityofthemathematicalingenuitybehinderrorcorrectioncodeshidethequi
13、teintuitivegoalandmechanismsatplay.Errorcorrectingcodesmightseemlikeadifficultmathematicalconcept,buttheyareinfactbasedonanintuitiveideawithaningeniousmathematicalimplementation:let,smakethedatastructured,inawaythatwecan*guess*whatthedatawasifitgetscorrupted,justbyfixingthestructure.Mathematically-w
14、ise,weusepolynomialsfromtheGaloisFieldtoimplementthisstructure.1.et,stakeamorepracticalanalogy:let,ssayyouwanttocommunicatemessagestosomeoneelse,butthesemessagescangetcorruptedalongtheway.Themaininsightoferrorcorrectingcodesisthat,insteadofusingawholedictionaryofwords,wecanuseasmallersetofcarefullys
15、electedwords,a*reduceddictionary*,sothateachwordisasdifferentasanyother.Thisway,whenwegetamessage,wejusthavetolookupinsideourreduceddictionaryto1)detectwhichwordsarecorrupted(astheyarenotinourreduceddictionary);2)correctcorruptedwordsbyfindingthemostsimilarwordinourdictionary.1.et,stakeasimpleexampl
16、e:wehaveareduceddictionarywithonlythreewordsof4letters:this,thatandcorn.Let,ssaywereceiveacorruptedword:co*,where*isanerasure.Sincewehaveonly3wordsinourdictionary,wecaneasilycompareourreceivedwordwithourdictionarytofindthewordthatistheclosest.Inthiscase,it,scorn.Thusthemissinglettersarern.Nowlet,ssa
17、ywereceivethewordth*.Heretheproblemisthatwehavetwowordsinourdictionarythatmatchthereceivedword:thisandthat.Inthiscase,wecannotbesurewhichoneitis,andthuswecannotdecode.Thismeansthatourdictionaryisnotverygood,andweshouldreplacethatwithanothermoredifferentword,suchasdashtomaximizethedifferencebetweenea
18、chword.Thisdifference,ormorepreciselytheminimumnumberofdifferentlettersbetweenany2wordsofourdictionary,iscalledthemaximumHammingdistanceofourdictionary.Makingsurethatany2wordsofthedictionaryshareonlyaminimumnumberoflettersatthesamepositioniscalledmaximumseparability.Thesameprincipleisusedformosterro
19、rcorrectingcodes:wegenerateonlyareduceddictionarycontainingonlywordswithmaximumseparability(wewilldetailmorehowtodothatinthethirdsection),andthenwecommunicateonlywiththewordsofthisreduceddictionary.WhatGaloisFieldsprovideisthestructure(ie,reduceddictionarybasis),andReed-Solomonisawaytoautomaticallyc
20、reateasuitablestructure(makeareduceddictionarywithmaximumseparabilitytailoredforadataset),aswellasprovidetheautomatedmethodstodetectandcorrecterrors(ie,lookupsinthereduceddictionary).Tobemoreprecise,GaloisFieldsarethestructure(thankstotheircyclicnature,themoduloaninteger)andReed-Solomonisthecodec(en
21、coder/decoder)basedonGaloisFields.Ifawordgetscorruptedinthecommunication,that,snobigdealsincewecaneasilyfixitbylookinginsideourdictionaryandfindtheclosestword,whichisprobablythecorrectone(thereishoweverachanceofchoosingawrongoneiftheinputmessageistooheavilycorrupted,buttheprobabilityisverysmall).Als
22、o,thelongerourwordsare,themoreseparabletheyare,sincemorecharacterscanbecorruptedwithoutanyimpact.Thesimplestwaytogenerateadictionaryofmaximallyseparablewordsistomakewordslongerthantheyreallyare.1.et,stakeagainourexample:thisthatcornAppendauniquesetofcharacterssothattherearenoduplicatedcharactersatan
23、yoftheappendedpositions,andaddonemorewordtohelpwiththeexplanation:thisabcdthatbcdecorncdefNotethateachwordinthisdictionarydiffersfromeveryotherwordbyatleast6characters,sothedistanceis6.Thisallowsupto5errorsinknownpositions(whicharecallederasures),or3errorsinunknownpositions,tobecorrected.Assumethat4
24、erasuresoccur:t*ab*dThenasearchofthedictionaryforthe4non-erasedcharacterscanbedonetofindtheonlyentrythatmatchesthose4characters,sincethedistanceis5.Hereitgives:thisabcdAssumethat2errorsoccurasinoneofthesepatterns:thosbcdeTheissuehereisthelocationoftheerrorsisunknown.Theerasuresmighthavehappenedinany
25、2positionsmeaningthatthereareor28possiblesub-setsof6characters:thosbc*thosb*d*thosb*eIfwedoadictionarysearchoneachofthesesub-sequences,wefindthatthereisonlyonesub-setthatmatches6characters,th*bcdematchesthatbcde.Withtheseexamples,onecanseetheadvantageofredundancyinrecoveringlostinformation:redundant
26、charactershelpyourecoveryouroriginaldata.Thepreviousexamplesshowhowacrudeerrorcorrectingschemecouldwork.Reed-Solomon,scoreideaissimilar,appendredundantdatatoamessagebasedonGaloisFieldmathematics.Theoriginalerrorcorrectingdecoderwassimilartotheerrorexampleabove,searchsub-setsofareceivedmessagethatcor
27、respondtoavalidmessage,andchoosetheonewiththemostmatchesasthecorrectedmessage.Thisisn,tpracticalforlargermessages,somathematicalalgorithmsweredevelopedtoperformerrorcorrectioninareasonabletime.QRcodeStructuretediteditsourceThissectionintroducesthestructureofQRcodes,whichishowdataisstoredinaQRcode.Th
28、einformationinthissectionisdeliberatelyincomplete.Onlythemostcommonfeaturesofthesmall2121sizesymbols(alsoknownasversion1)arepresentedhere,butseetheappendixforadditionalinformation.HereisaQRsymbolthatwillbeusedasanexample.Itconsistsofdarkandlightsquares,knownasmodulesinthebarcodingworld.Thethreesquar
29、elocatorpatternsinthecornersareavisuallydistinctivefeatureofQRsymbols.MaskingfediteditSoUrCelAmaskingprocessisusedtoavoidfeaturesinthesymbolthatmightconfuseascanner,suchasmisleadingshapesthatlooklikethelocatorpatternsandlargeblankareas.Maskinginvertscertainmodules(whitebecomesblackandblackbecomeswhi
30、te)whileleavingothersalone.Inthediagrambelow,theredareasencodeformatinformationanduseafixedmaskingpattern.Thedataarea(inblackandwhite)ismaskedwithavariablepattern.Whenthecodeiscreated,theencodertriesanumberofdifferentmasksandchoosestheonethatminimizesundesirablefeaturesintheresult.Thechosenmaskpatte
31、rnisthenindicatedintheformatinformationsothatthedecoderknowswhichonetouse.Thelightgrayareasarefixedpatternswhichdonotencodeanyinformation.Inadditiontotheobviouslocatorpatterns,therearealsotimingpatternswhichcontainalternatinglightanddarkmodules.Scanned SymbolUnmasked SymbolThemaskingtransformationis
32、easilyapplied(orremoved)usingtheexclusive-。Foperation(denotedbyacaretCinmanyprogramminglanguages).Theunmaskingoftheformatinformationisshownbelow.Readingcounter-clockwisearoundtheupper-leftlocatorpattern,wehavethefollowingsequenceofbits.Whitemodulesrepresent0andblackmodulesrepresent1.Input10110110100
33、1011MaskIoIOloOOoolO(HOOutput000111101011001FormattinginformationediteditSoUrCelTherearetwoidenticalcopiesoftheformattinginformation,sothatthesymbolcanstillbedecodedevenifitisdamaged.Thesecondcopyisbrokenintwopiecesandplacedaroundtheothertwolocators,andisreadinaclockwisedirection(upwardsinthelower-l
34、eftcorner,thenleft-to-rightintheupper-rightcorner).Thefirsttwobitsofformattinginformationgivetheerrorcorrectionlevelusedforthemessagedata.AQRsymbolthissizecontains26bytesofinformation.Someoftheseareusedtostorethemessageandsomeareusedforerrorcorrection,asshowninthetablebelow.Theleft-handcolumnissimpl
35、yanamegiventothatlevel.ErrorCorrectionLevelLevelIndicatorErrorCorrectionBytesMessageDataBytesLOl719M001016Q111313H10179Thenextthreebitsofformatinformationselectthemaskingpatterntobeusedinthedataarea.Thepatternsareillustratedbelow,includingthemathematicalformulathattellswhetheramoduleisblack(iandjare
36、therowandcolumnnumbers,respectively,andstartwith0intheupper-lefthandcorner).Mask IOO (i2 + j3)%2=0Mask 110 (ij)%3+ij)%2=0Mask 111 (i*j)%3+i+j)%2=0同Mask101(i*j)%2+(i*j)%3=0Theremainingtenbitsofformattinginformationareforcorrectingerrorsintheformatitself.ThiswillbeexplainedinaIaterSeCtiorLMessagedatae
37、diteditSoUrCelHereisalargerdiagramshowingtheunmaskedQRcode.Differentregionsofthesymbolareindicated,includingtheboundariesofthemessagedatabytes.FormatInfoErrorCorrectionBytesDuplicateFormat InfoMessageData BytesDatabitsarereadstartingfromthelower-rightcornerandmovingupthetworight-handcolumnsinazig-za
38、gpattern.Thefirstthreebytesare010000001101001001110101.Thenexttwocolumnsarereadinadownwarddirection,sothenextbyteis01000111.Uponreachingthebottom,thetwocolumnsafterthatarereadupward.Proceedinthisup-and-downfashionallthewaytotheleftsideofthesymbol(skippingoverthetimingpatternwherenecessary).Hereisthe
39、completemessageinhexadecimalnotation.Messagedatabytes:40d2754776173206272696c6c69670ecErrorcorrectionbytes:be2a90136bafeffd4beODecodingediteditSOIIrCelThefinalstepistodecodethemessagebytesintosomethingreadable.Thefirstfourbitsindicatehowthemessageisencoded.QRcodesuseseveraldifferentencodingschemes,s
40、othatdifferentkindsofmessagescanbestoredefficiently.Thesearesummarizedinthetablebelow.Afterthemodeindicatorisalengthfield,whichtellshowmanycharactersarestored.Thesizeofthelengthfielddependsonthespecificencoding.ModeNameModeIndicatorLengthBitsDataBitsNumericOOOl1010bitsper3digitsAlphanumeric0010911bi
41、tsper2charactersByte010088bitspercharacterKanji1000813bitspercharacter(ThelengthfieldsizesabovearevalidonlyforsmallerQRcodes.)OursamplemessagestartswithOlOO(hex4),indicatingthatthereare8bitspercharacter.Thenext8bits(hex0d)arethelengthfield,13indecimalnotation.Thebitsafterthatcanberearrangedinbytesre
42、presentingtheactualcharactersofthemessage:2754776173206272696c6c6967,andadditionallyOec.Thefirsttwo,hex27and54aretheASCIlcodesforapostropheandT.Thewholemessageis,z,Twasbillig,z(fromw:JabberWOCky#LeXiCOn).Afterthelastofthedatabitsisanother4-bitmodeindicator.Itcanbedifferentfromthefirstone,allowingdif
43、ferentencodingstobemixedwithinthesameQRsymbol.Whenthereisnomoredatatostore,thespecialend-of-messagecode0000isgiven.(Notethatthestandardallowstheend-of-messagecodetobeomittedifitwouldn,tfitintheavailablenumberofdatabytes.)Atthispoint,weknowhowtodecode,orread,awholeQRcode.However,inreallifeconditions,
44、aQRcodeisrarelywhole:usually,itisscannedviaaphone,scamera,underpotentiallypoorlightingconditions,oronascratchedsurfacewherepartoftheQRcodewasripped,oronastainedsurface,etc.TomakeourQRcodedecoder*reliable*,weneedtobeableto*correct*errors.Thenextpartofthisarticlewilldescribehowtocorrecterrors,byconstr
45、uctingaBCHdecoder,andmorespecificallyaReed-Solomondecoder.BCHCOdeSeditIeditSOUrCelInthissection,weintroduceageneralclassoferrorcorrectioncodes:theBCHcodes,theparentfamilyofmodernReed-Solomoncodes,andthebasicdetectionandcorrectionmechanisms.TheformattinginformationisencodedwithaBeHCOdewhichallowsacer
46、tainnumberofbit-errorstobedetectedandcorrected.BCHcodesareageneralizationofReed-Solomoncodes(modernReed-SolomoncodesareBCHcodes).InthecaseofQRcodes,theBCHcodeusedfortheformatinformationismuchsimplerthantheReed-Solomoncodeusedforthemessagedata,soitmakessensetostartwiththeBCHcodeforformatinformation.B
47、CHerrordetectionediteditSOUrCelTheprocessforcheckingtheencodedinformationissimilartolongdivision,butusesexclusive-orinsteadofsubtraction.Theformatcodeshouldproducearemainderofzerowhenitisdividedbytheso-calledgeneratorofthecode.QRformatcodesusethegenerator10100110111.Thisprocessisdemonstratedforthefo
48、rmatinformationintheexamplecode(000111101011001)below.OoOIl10100110111)000111101011001,IOlOollOlll0101001101111010011011100000000000HereisaPythonfunctionwhichimplementsthiscalculation.defqr_check_format(fmt):g=0x537#二ObioiooilOlllinpython2.6+foriinrange(4,-1,-1):iffmt&(1(i+10):fmt-gireturnfmtPythonnote:Therangefunctionmaynotbecleartonon-Pythonprogrammers.Itproducesalistofnumberscountingdownfrom4to0.InC-derivedlanguages,theforloopmightbewrittenasfor(i=4;i=0;i-)