From ShaderX 2 – Shader Programming Tips and Tricks
Advanced Image Processing with DirectX 9 Pixel Shaders

Jason L. Mitchell, Marwan Y. Ansari and Evan Hart
3D Application Research Group
ATI Research

Introduction

With the introduction of the ps_2_0 pixel shader model in DirectX 9.0, we are able to significantly expand our ability to use consumer graphics hardware to perform image processing operations.
This is due to the longer program length, the ability to sample more times from the input image(s) and the addition of floating point internal data representation. In the first ShaderX book, we used the ps_1_4 pixel shader model in DirectX 8.1 to perform basic image processing techniques such as simple blurs, edge detection, transfer functions and morphological operators [Mitchell02]. In this chapter, we will extend our image processing toolbox to include color space conversion, a better edge detection filter called the Canny filter, separable Gaussian and median filters, and a real-time implementation of the Fast Fourier Transform.

Review

As shown in our original image processing chapter in the first ShaderX book, post-processing of 3D frames is fundamental to producing a variety of interesting effects in game scenes. Image processing is performed on a GPU by using the source image as a texture and drawing a screen-aligned quadrilateral into the back buffer or another texture. A pixel shader is used to process the input image to produce the desired result in the render target.

Figure 1 - Using a pixel shader for image processing by rendering from one image to another

Image processing is especially powerful when the color of the destination pixel is the result of computations done on multiple pixels from the source image. In this case, we sample the source image multiple times and use the pixel shader to combine the data from the multiple samples (or taps) to produce a single output.

Color Space Conversion

Before we get into interesting multi-tap filters, we'll present a pair of shaders which can be used to convert between HSV and RGB color spaces. These shaders perform some relatively complex operations to convert between color spaces even though they are only single-tap filters.

For those who may not be familiar with HSV space, it is a color space which is designed to be intuitive to artists who think of a color's tint, shade and tone [Smith78]. Interpolation in this color space can be more aesthetically pleasing than interpolation in RGB space. Additionally, when comparing colors, it
may be desirable to do so in HSV space. For example, in RGB space, the color (100, 0, 0) is very different from the color (0, 0, 100). However, their V components in HSV space are equal. Colors, represented by (hue, saturation, value) triples, are defined to lie within a hexagonal pyramid as shown in Figure 2 below. The hue of a color is represented by an angle between 0 and 360° around the central axis of the hexagonal cone. A color's saturation is the distance from the central (achromatic) axis and its value is the distance along the axis. Both saturation and value are defined to be between 0 and 1.

We have translated the pseudocode RGB-to-HSV transformation from [Foley90] to the DirectX 9 High Level Shading Language (HLSL) and compiled it for the ps_2_0 target. If you are unfamiliar with HLSL, you can refer back to the introductory chapter "Introduction to the DirectX 9 High Level Shading Language." As described in [Smith78], you can see that the RGB_to_HSV() function in this shader first determines the minimum and maximum channels of the input RGB color. The max channel determines
the value of the HSV color, or how far along the achromatic central axis of the hexagonal cone the HSV color will be. The saturation is then computed as the difference between the max and min RGB channels divided by the max. Hue (the angle around the central achromatic axis) is then a function of which channel had the max magnitude and thus determined the value.

    float4 RGB_to_HSV (float4 color)
    {
        float r, g, b, delta;
        float colorMax, colorMin;
        float h = 0, s = 0, v = 0;
        float4 hsv = 0;

        r = color[0];
        g = color[1];
        b = color[2];

        colorMax = max (r, g);
        colorMax = max (colorMax, b);
        colorMin = min (r, g);
        colorMin = min (colorMin, b);

        v = colorMax;               // this is value

        if (colorMax != 0)
        {
            s = (colorMax - colorMin) / colorMax;
        }

        if (s != 0)                 // if not achromatic
        {
            delta = colorMax - colorMin;
            if (r == colorMax)
            {
                h = (g - b) / delta;
            }
            else if (g == colorMax)
            {
                h = 2.0 + (b - r) / delta;
            }
            else                    // b is max
            {
                h = 4.0 + (r - g) / delta;
            }

            h *= 60;

            if (h

[...]

$$ \theta = \operatorname{atan2}(Q, P) $$

Magnitude and θ are written out to an image so that the next shader can use them to complete the Canny filter operation. The edge direction, θ, is a signed quantity in the range of -π to π and must be packed into the 0 to 1 range in order to prevent loss of data between rendering passes. In order to do this, we will pack it by computing:

$$ A = \frac{|\theta|}{\pi} $$

You've probably noticed that, due to the absolute value, this function is not invertible, hence data is effectively lost. This does not present a problem for this particular application due to symmetries in the following step.

The final pass involves sampling the image to get the Magnitude and the edge direction, θ, at the current location. The edge direction, θ, must now be unpacked into its proper range. Figure 3 below shows a partitioning of all values of θ (in degrees) into four sectors.

Figure 3 - The 360 degrees of an angle partitioned into four sectors

The sectors are symmetric and map to the possible ways a line can pass through a 3×3 set of pixels. In the previous step, we took the absolute value of θ and divided it by π to put it in the 0 to 1 range. Since the packed θ (that is, A) is already between 0 and 1 from the previous step, we are almost done. Because the partitioning is symmetric, taking the absolute value was an excellent way to reduce the number of comparisons needed to find the correct neighbors to sample. Normally, to complete the mapping we would multiply A by 4 and we would be done. However, if you look closely at Figure 3 you will see that the sectors are centered around 0 and 180°. In order to compensate for this, the proper equation is:

$$ \mathrm{Sector} = \left\lfloor \left( A - \frac{\pi}{16} \right) \cdot 4 \right\rfloor $$

Next, we compute the neighboring texel coordinates by checking which sector this edge goes through. Once the neighbors have been sampled, we compare the current texel's magnitude to the magnitudes of its neighbors. If its magnitude is greater than both of its neighbors, then it is the local maximum and the value is kept. If its magnitude is less than either of its neighbors, then this texel's value is set to zero. This process is known as nonmaxima suppression, and its goal is to thin the areas of change so that only the greatest local changes are retained. As a final step, we can threshold the image in order to reduce the number of false edges that might be picked up by this process. The threshold is often set by the user when he or she finds the right balance between true and false edges.

Figure 4 - One-Pixel-Wide Edges from Canny Filter

Figure 5 - Gradient Magnitudes from Sobel Filter (see [Mitchell02])

As you can see in Figure 4, the Canny filter produces one-pixel-wide edges, unlike more basic filters such as a Sobel edge filter.

Implementation Details

This shader is implemented in the VideoShader application on the CD using HLSL and can
be compiled for the ps_2_0 target or higher. In this implementation, the samples are taken from the eight neighbors adjacent to the center of the filter. Looking at the HLSL code, you'll see an array of float two-tuples called sampleOffsets[]. This array defines a set of 2D offsets from the center tap which are used to determine the locations from which to sample the input image. The locations of these samples relative to the center tap are shown in Figure 6.

Figure 6 - Locations of taps as defined in sampleOffsets[]

The four steps of the Canny edge detection filter described above have been collapsed into two rendering passes, requiring the two shaders shown below. The first shader computes the gradients P and Q followed by the Magnitude and direction (θ). After packing θ into the 0 to 1 range, Magnitude and θ are written out to a temporary surface.

    sampler InputImage;
    float2 sampleOffsets[8] : register (c1);

    struct PS_INPUT
    {
        float2 texCoord : TEXCOORD0;
    };

    float4 main (PS_INPUT In) : COLOR
    {
        int i = 0;
        float4 result;
        float Magnitude, Theta;
        float p = 0, q = 0;
        float pKernel[4] = {-1, 1, -1, 1};
        float qKernel[4] = {-1, -1, 1, 1};
        float2 texCoords[4];
        float3 texSamples[4];
        float PI = 3.1415926535897932384626433832795;

        texCoords[0] = In.texCoord + sampleOffsets[1];
        texCoords[1] = In.texCoord + sampleOffsets[2];
        texCoords[2] = In.texCoord;
        texCoords[3] = In.texCoord + sampleOffsets[4];

        for (i = 0; i < 4; i++)
        {
            // Sample the image and reduce each tap to a gray level
            texSamples[i].xyz = tex2D (InputImage, texCoords[i]);
            texSamples[i] = dot (texSamples[i], 0.33333333f);

            p += texSamples[i] * pKernel[i];
            q += texSamples[i] * qKernel[i];
        }

        p /= 2.0;
        q /= 2.0;

        Magnitude = sqrt ((p * p) + (q * q));
        result = Magnitude;

        // Now we compute the direction of the
        // line to prep for Nonmaxima supression.
        //
        // Nonmaxima supression - If this texel isn't the Max,
        // make it 0 (hence, supress it)
        Theta = atan2 (q, p);           // result is -pi to pi

        // Now convert to a 0 to 1 range
        // just so it can be written out.
        result.a = abs (Theta) / PI;

        return result;
    }

In the
second pass of the Canny edge detector, Magnitude and θ are read back from the temporary surface. The edge direction, θ, is classified into one of four sectors and the neighbors along the proper direction are sampled using dependent reads. The Magnitudes of these neighbor samples along with a user-defined threshold are then used to determine whether this pixel is a local maximum or not, resulting in either 0 or 1 being output as the final result.

    sampler InputImage;
    float2 sampleOffsets[8] : register (c1);
    float4 UserInput : register (c24);

    struct PS_INPUT
    {
        float2 texCoord : TEXCOORD0;
    };

    float4 main (PS_INPUT In) : COLOR
    {
        int i = 0;
        float4 result;
        float Magnitude, Theta;
        float2 texCoords[4];
        float4 texSamples[3];
        float PI = 3.1415926535897932384626433832795;

        // Tap the current texel and figure out line direction
        texSamples[0] = tex2D (InputImage, In.texCoord);
        Magnitude = texSamples[0].r;

        // Sample two neighbors that lie in the direction of the line
        // Then find out if ___this___ texel has a greater Magnitude.
        Theta = texSamples[0].a;

        // Must unpack theta. Prior pass made Theta range between 0 and 1
        // But we really want it to
        // be either 0, 1, 2, or 3. See [Jain95]
        // for more details.
        Theta = (Theta - PI / 16) * 4.0;
        Theta = floor (Theta);          // Now theta is an INT.

        if (Theta == 0)
        {
            texCoords[1] = In.texCoord + sampleOffsets[4];
            texCoords[2] = In.texCoord + sampleOffsets[3];
        }
        else if (Theta == 1)
        {
            texCoords[1] = In.texCoord + sampleOffsets[2];
            texCoords[2] = In.texCoord + sampleOffsets[5];
        }
        else if (Theta == 2)
        {
            texCoords[1] = In.texCoord + sampleOffsets[1];
            texCoords[2] = In.texCoord + sampleOffsets[6];
        }
        else
        {
            texCoords[1] = In.texCoord + sampleOffsets[0];
            texCoords[2] = In.texCoord + sampleOffsets[7];
        }

        // Take other two samples
        // Remember they are in the direction of the edge
        for (i = 1; i < 3; i++)
        {
            texSamples[i] = tex2D (InputImage, texCoords[i]);
        }

        // Now it's time for Nonmaxima supression.
        // Nonmaxima supression - If this texel isn't the Max,
        // make it 0 (hence, supress it)
        // This keeps the edges nice and thin.
        result = Magnitude;

        if (Magnitude < texSamples[1].x || Magnitude < texSamples[2].x)
        {
            result = 0;
        }

        // Threshold the result.
        if (result.x < UserInput.z)
        {
            result = 0;
        }
        else
        {
            result = 1;
        }

        return result;
    }

You can see in Figure 4 that this produces one-pixel-wide edges, which may be more desirable for some applications. You may see some gaps in the detected edges and, in some cases, it may be useful to apply a dilation operation to fill in these gaps [Mitchell02].

Separable Techniques

Certain filtering operations have inherent symmetry which allows us to implement them more efficiently in a separable manner. That is, we can perform these 2D image processing operations with a sequence of 1D operations and obtain equivalent results with less computation. Conversely, we can implement a large separable filter kernel with the same amount of computation as a small non-separable filter. This is particularly important when attempting to apply "blooms" to final frames in high dynamic range space to simulate light scattering. In this final section of the chapter, we will discuss three separable filtering operations: the Gaussian blur, a median filter approximation and the Fast Fourier Transform.

Separable Gaussian

A very commonly used separable filter is the Gaussian filter, which can be used to perform blurring of 2D images. The 2D isotropic (i.e. circularly symmetric) Gaussian filter, g2D(x, y), samples a circular neighborhood of pixels from the input image and computes their weighted average, according
to the following equation:

$$ g_{2D}(x, y) = \frac{1}{2\pi\sigma^2} \, e^{-\frac{x^2 + y^2}{2\sigma^2}} $$

where σ is the standard deviation of the Gaussian and x and y are the coordinates of image samples relative to the center of the filter. The standard deviation, σ, determines the size of the filter.

What this means is that we will sample a local area of texels from the input image and weight them according to the above equation. For example, for a Gaussian with σ = 1, we compute the following filter kernel (after normalization).

    0.0037  0.0146  0.0256  0.0146  0.0037
    0.0146  0.0586  0.0952  0.0586  0.0146
    0.0256  0.0952  0.1502  0.0952  0.0256
    0.0146  0.0586  0.0952  0.0586  0.0146
    0.0037  0.0146  0.0256  0.0146  0.0037

In theory, the Gaussian has infinite extent, but the contribution to the final result is insignificant for input texels outside of this 5×5 region.

An extremely important property of the Gaussian is that it is separable. That is, it can be rearranged in the following manner:

$$ g_{2D}(x, y) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{x^2}{2\sigma^2}} \right) \cdot \left( \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{y^2}{2\sigma^2}} \right) = g_{1D}(x) \cdot g_{1D}(y) $$

This means that we can implement a given Gaussian with a series of 1D filtering operations: one horizontal (g1D(x)) and one vertical (g1D(y)). This allows us to implement Gaussians with much larger kernels (larger σ) while performing the same amount of calculations that would be required to implement a smaller non-separable filter kernel. This technique was used in our real-time implementation of Paul Debevec's Rendering with Natural Light animation as seen in Figure 7.

Figure 7 - Frame from Real-Time Rendering with Natural Light

After rendering the scene in high dynamic range space, Debevec performed a number of large Gaussian blurs on his 2D rendered scene to obtain blooms on bright areas of the scene. In order to do this in real-time, we exploited the Gaussian's separability to perform the operation efficiently. In our case, we used σ = 7, which resulted in a 25×25 Gaussian
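
The separability property described above can be checked numerically on the CPU. The following is a minimal Python sketch, not the shader code used in the chapter. The function names, the point sampling of the Gaussian at integer texel offsets, the truncation radius and the clamp addressing at the image borders are all assumptions made for illustration, so the weights it produces need not match the table above digit for digit. It builds normalized 1D and 2D kernels and shows that a horizontal pass followed by a vertical pass gives the same result as the direct 2D filter.

```python
import math


def gaussian_1d(sigma, radius):
    """Normalized 1D Gaussian weights at integer offsets -radius..radius (assumed discretization)."""
    w = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]


def gaussian_2d(sigma, radius):
    """Normalized 2D Gaussian weights on a (2*radius+1) x (2*radius+1) grid."""
    r = range(-radius, radius + 1)
    w = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) for x in r] for y in r]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]


def clamp(i, n):
    """Clamp an index to 0..n-1 (stands in for clamp texture addressing)."""
    return max(0, min(n - 1, i))


def blur_2d(img, k2):
    """Direct 2D filter: (2r+1)^2 taps per output pixel."""
    h, w, r = len(img), len(img[0]), len(k2) // 2
    return [[sum(k2[j + r][i + r] * img[clamp(y + j, h)][clamp(x + i, w)]
                 for j in range(-r, r + 1) for i in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def blur_1d(img, k1, horizontal):
    """One 1D pass: (2r+1) taps per output pixel, along rows or along columns."""
    h, w, r = len(img), len(img[0]), len(k1) // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for t in range(-r, r + 1):
                if horizontal:
                    acc += k1[t + r] * img[y][clamp(x + t, w)]
                else:
                    acc += k1[t + r] * img[clamp(y + t, h)][x]
            row.append(acc)
        out.append(row)
    return out


def blur_separable(img, k1):
    """Horizontal pass followed by vertical pass: 2*(2r+1) taps per output pixel."""
    return blur_1d(blur_1d(img, k1, True), k1, False)


def taps_per_pixel(width):
    """Tap counts for a width x width kernel: (direct 2D, two separable passes)."""
    return width * width, 2 * width
```

For a 25×25 kernel, `taps_per_pixel(25)` gives 625 taps for the direct 2D filter against 50 for the two separable passes, which is why the separable form is the practical choice for large blurs under the sampling limits of a pixel shader.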