School code: 10128
Undergraduate Graduation Project: Translation of Foreign Literature
English title: Software Database: An Object-Oriented Perspective
Chinese title: An Object-Oriented Perspective on Software Databases
Student name: Song Lanlan (宋兰兰)
College: School of Information Engineering
Department: Department of Software Engineering
June 2013

A HISTORICAL PERSPECTIVE

From the earliest days of computers, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It formed the basis for the network data model, which was standardized by the Conference on Data Systems Languages (CODASYL) and strongly influenced database systems through the 1960s. Bachman was the first recipient of ACM's Turing Award (the computer science equivalent of a Nobel
prize) for work in the database area; he received the award in 1973.

In the late 1960s, IBM developed the Information Management System (IMS) DBMS, used even today in many major installations. IMS formed the basis for an alternative data representation framework called the hierarchical data model. The SABRE system for making airline reservations was jointly developed by American Airlines and IBM around the same time, and it allowed several people to access the same data through a computer network. Interestingly, today the same SABRE system is used to power popular Web-based travel services such as Travelocity!

In 1970, Edgar Codd, at IBM's San Jose Research Laboratory, proposed a new data representation
framework called the relational data model. This proved to be a watershed in the development of database systems: it sparked rapid development of several DBMSs based on the relational model, along with a rich body of theoretical results that placed the field on a firm foundation. Codd won the 1981 Turing Award for his seminal work. Database systems matured
as an academic discipline, and the popularity of relational DBMSs changed the commercial landscape. Their benefits were widely recognized, and the use of DBMSs for managing corporate data became standard practice.

In the 1980s, the relational model consolidated its position as the dominant DBMS paradigm, and database systems continued to gain widespread use. The SQL query language for relational databases, developed as part of IBM's System R project, is now the standard query language. SQL was standardized in the late 1980s, and the current standard, SQL-92, was adopted by the American National Standards Institute (ANSI) and International Standards Organization (ISO). Arguably, the most widely used form of concurrent programming is the concurrent execution of database programs (called transactions). Users write programs as if they are to be run by themselves, and the responsibility for running them concurrently is given to the DBMS. James Gray won the 1999 Turing Award for his contributions to the field of transaction management in a DBMS.

In the late 1980s and the 1990s, advances have been made in many areas of database systems. Considerable research has been carried out into more powerful query languages and richer data models, and there has been a big emphasis on supporting complex analysis of data from all parts of an enterprise. Several vendors (e.g., IBM's DB2, Oracle 8, Informix UDS) have extended their systems with the ability to store new data types such as images and text, and with the ability to ask more complex queries. Specialized systems have been developed by numerous vendors for creating data warehouses, consolidating data from several databases, and for carrying out specialized analysis.

An interesting phenomenon is the emergence
of several enterprise resource planning (ERP) and management resource planning (MRP) packages, which add a substantial layer of application-oriented features on top of a DBMS. Widely used packages include systems from Baan, Oracle, PeopleSoft, SAP, and Siebel. These packages identify a set of common tasks (e.g., inventory management, human resources planning, financial analysis) encountered by a large number of organizations and provide a general application layer to carry out these tasks. The data is stored in a relational DBMS, and the application layer can be customized to different companies, leading to lower overall costs for the companies, compared to the cost of building the application layer from scratch.

Most significantly, perhaps, DBMSs have entered the Internet Age. While the first generation of Web sites stored their data exclusively in operating system files, the use of a DBMS to store data that is accessed through a Web browser is becoming widespread. Queries are generated through Web-accessible forms and answers are formatted using a markup language such as HTML, in order to be easily displayed in a browser. All the database vendors are adding features to their DBMS aimed at making it more suitable for deployment over the Internet.

Database management continues to gain importance as more and more data is brought on-line, and made ever more accessible through computer networking. Today the field is being driven by exciting visions such as multimedia databases, interactive video, digital libraries, a host of scientific projects such as the human genome mapping effort and NASA's Earth Observation System project, and the desire of companies to consolidate their decision-making processes and mine their data repositories for useful information about their businesses. Commercially, database management systems represent one of the largest and most vigorous market segments. Thus the study of database systems could prove to be richly rewarding in more ways than one!

INTRODUCTION TO PHYSICAL DATABASE DESIGN

Like all other aspects of database design, physical design must be guided by the nature of the data and its intended use. In particular, it is important to understand the typical workload that the database must support; the workload consists of a mix of queries and updates. Users also have certain requirements about how fast certain queries or updates must run or how many transactions must be processed per second. The workload description and users' performance requirements are the basis on which a number of decisions have to be made during physical database design.

To create a good physical database design and to tune the system for performance in response to evolving user requirements, the designer needs to understand the workings of a DBMS, especially the indexing and query processing techniques supported by the DBMS. If the database is expected to be accessed concurrently by many users, or is a distributed database, the task becomes more complicated, and other features of a DBMS come into play.

DATABASE WORKLOADS

The key to good physical design is arriving at an accurate description of the expected workload. A workload description includes the following elements:

1. A list of queries and their frequencies, as a fraction of all queries and updates.
2. A list of updates and their frequencies.
3. Performance goals for each type of query and update.

For each query in the workload, we must identify:

- Which relations are accessed.
- Which attributes are retained (in the SELECT clause).
- Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.

Similarly, for each update in the workload, we must identify:

- Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these
conditions are likely to be.
- The type of update (INSERT, DELETE, or UPDATE) and the updated relation.
- For UPDATE commands, the fields that are modified by the update.

Remember that queries and updates typically have parameters; for example, a debit or credit operation involves a particular account number. The values of these parameters determine the selectivity of selection and join conditions.

Updates have a query component that is used to find the target tuples. This component can benefit from a good physical design and the presence of indexes. On the other hand, updates typically require additional work to maintain indexes on the attributes that they modify. Thus, while queries can only benefit from the presence of an index, an index may either speed up or slow down a given update. Designers should keep this trade-off in mind when creating indexes.

NEED FOR DATABASE TUNING

Accurate, detailed workload information may be hard to come by while doing the initial design of the system. Consequently, tuning a database after it has been designed and deployed is important: we must refine the initial design in the light of actual usage patterns to obtain the best possible performance.

The distinction between database design and database tuning is somewhat arbitrary. We could consider the design process to be over once an initial conceptual schema is designed and a set of indexing and clustering decisions is made. Any subsequent changes to the conceptual schema or the indexes, say, would then be regarded as a tuning activity. Alternatively, we could consider some refinement of the conceptual schema (and physical design decisions affected by this refinement) to be part of the physical design process. Where we draw the line between design and tuning is not very important.

OVERVIEW OF DATABASE TUNING

After the initial phase of database design, actual use of the database provides a valuable source of detailed information that can be used to refine the initial design. Many of the original assumptions about the expected workload can be replaced by observed usage patterns; in general, some of the initial workload specification will be validated, and some of it will turn out to be wrong. Initial guesses about the size of data can be replaced with actual statistics from the system catalogs (although this information will keep changing as the system evolves). Careful monitoring of queries can reveal unexpected problems; for example, the optimizer may not be using some indexes as intended to produce good plans.

Continued database tuning is important to get the best possible performance.

TUNING THE CONCEPTUAL SCHEMA

In the course of database design, we may realize that our current choice of relation schemas does not enable us to meet our performance objectives for the given workload with any (feasible) set of physical design choices. If so, we may have to redesign our conceptual schema (and re-examine physical design decisions that are affected by the changes that we make).

We may realize that a redesign is necessary during the initial design process or later, after the system has been in use for a while. Once a database has been designed and populated with data, changing the conceptual
schema requires a significant effort in terms of mapping the contents of relations that are affected. Nonetheless, it may sometimes be necessary to revise the conceptual schema in light of experience with the system. We now consider the issues involved in conceptual schema (re)design from the point of view of performance.

Several options must be considered while tuning the conceptual schema:

- We may decide to settle for a 3NF design instead of a BCNF design.
- If there are two ways to decompose a given schema into 3NF or BCNF, our choice should be guided by the workload.
- Sometimes we might decide to further decompose a relation that is already in BCNF.
- In other situations we might denormalize. That is, we might choose to replace a collection of relations obtained by a decomposition from a larger relation with the original (larger) relation, even though it suffers from some redundancy problems. Alternatively, we might choose to add some fields to certain relations to speed up some important queries, even if this leads to redundant storage of some information (and consequently, a schema that is in neither 3NF nor BCNF).
- This discussion of normalization has concentrated on the technique of decomposition, which amounts to vertical partitioning of a relation. Another technique to consider is horizontal partitioning of a relation, which would lead to our having two relations with identical schemas. Note that we are not talking about physically partitioning the tuples of a single relation; rather, we want to create two distinct relations (possibly with different constraints and indexes on each).

Incidentally, when we redesign the conceptual schema, especially if we are tuning an existing database schema, it is worth considering whether we should create views to mask these changes from users for whom the original schema is more natural.

TUNING QUERIES AND VIEWS

If we notice that a query is running much slower than we expected, we have to examine the query carefully to find the problem. Some rewriting of the query, perhaps in conjunction with some index tuning, can often fix the problem. Similar tuning may be called for if queries on some view run slower than expected.

When tuning a query, the first thing to verify is that the system is using the plan that you expect it to use. It may be that the system is not finding the best plan for a variety of reasons. Some common situations that are not handled efficiently by many optimizers follow:

- A selection condition involving null values.
- Selection conditions involving arithmetic or string expressions or conditions using the OR connective. For example, if we have a condition E.age = 2*D.age in the WHERE clause, the optimizer may correctly utilize an available index on E.age but fail to utilize an available index on D.age. Replacing the condition by E.age/2 = D.age would reverse the situation.
- Inability to recognize a sophisticated plan such as an index-only scan for an aggregation query involving a GROUP BY clause.

If the optimizer is not smart enough to find the best plan (using access methods and evaluation strategies supported by the DBMS), some systems allow users to guide the choice of a plan by providing hints to the optimizer; for example, users might be able to force the use of a particular index or choose the join order and join method. A user who wishes to guide optimization in this manner should have a thorough understanding of both optimization and the capabilities of the given DBMS.

OTHER TOPICS

MOBILE DATABASES

The availability of portable computers and wireless communications has created a new breed of nomadic database users. At one level these users are simply accessing a database through a network, which is similar to distributed DBMSs. At another level the network as well as data and user characteristics now have several novel properties, which affect basic assumptions in many components of a DBMS, including the query engine, transaction manager, and recovery manager.

- Users are connected through a wireless link whose bandwidth is ten times less than Ethernet and 100 times less than ATM networks. Communication costs are therefore significantly higher in proportion to I/O and CPU costs.
- Users' locations are constantly changing, and mobile computers have a limited battery life. Therefore, the true communication costs include connection time and battery usage in addition to bytes transferred, and they change constantly depending on location. Data is frequently replicated to minimize the cost of accessing it from different locations.
- As a user moves around, data could be accessed from multiple database servers within a single transaction. The likelihood of losing connections is also much greater than in a traditional network. Centralized transaction management may therefore be impractical, especially if some data is resident at the mobile computers. We may in fact have to give up on ACID transactions and develop alternative notions of consistency for user programs.

MAIN MEMORY DATABASES

The price of main memory is now low enough that we can buy enough main memory to hold the entire database for many applications; with 64-bit addressing, modern CPUs also have very large address spaces. Some commercial systems now have several gigabytes of main memory. This shift prompts a reexamination of some basic DBMS design decisions, since disk
accesses no longer dominate processing time for a memory-resident database:

- Main memory does not survive system crashes, and so we still have to implement logging and recovery to ensure transaction atomicity and durability. Log records must be written to stable storage at commit time, and this process could become a bottleneck. To minimize this problem, rather than commit each transaction as it completes, we can collect completed transactions and commit them in batches; this is called group commit. Recovery algorithms can also be optimized since pages rarely have to be written out to make room for other pages.
- The implementation of in-memory operations has to be optimized carefully since disk accesses
are no longer the limiting factor for performance.
- A new criterion must be considered while optimizing queries, namely the amount of space required to execute a plan. It is important to minimize the space overhead because exceeding available physical memory would lead to swapping pages to disk (through the operating system's virtual memory mechanisms),
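greatly slowing down execution.
- Page-oriented data structures become less important (since pages are no longer the unit of data retrieval), and clustering is not important (since the cost of accessing any region of main memory is uniform).

(1) A REVIEW FROM A HISTORICAL PERSPECTIVE

From the earliest days of databases, storing and manipulating data has been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It laid the foundation for the network data model, which was standardized by the Conference on Data Systems Languages (CODASYL) and strongly influenced database systems throughout the 1960s. For his contributions to the database field, Bachman became the first recipient of the ACM Turing Award (the computer science equivalent of the Nobel Prize)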