Detailed Tutorial on Building a Big Data Platform (CDH)


Contents

1. Introduction
1.1 Purpose
2. Detailed setup steps
2.1 Preparation
2.1.1 Add hostnames
2.1.2 Add service users
2.1.3 Set up passwordless SSH login
2.1.4 Disable SELinux
2.1.5 Disable the firewall
2.1.6 Install the JDK
2.2 Install the Hadoop cluster
2.2.1 Zookeeper
2.2.1.1 Configure Zookeeper
2.2.1.2 Using Zookeeper
2.3.2 Hadoop
2.3.2.1 Configure Hadoop
2.3.2.2 Starting Hadoop for the first time
2.3.3 Spark
2.3.3.1 Install Scala (all nodes)
2.3.3.2 Install Spark
2.3.4 Hive
2.3.4.1 Deploy the MySQL master/slave cluster
2.3.4.2 Configure Hive
2.3.5 Sqoop
2.3.5.1 Configure Sqoop
2.3.5.2 Using Sqoop
2.4 Install the HBase cluster
2.4.1 HBase
2.4.1.2 Deploy the distributed HBase cluster
2.4.1.3 Operating HBase
2.4.2 Kafka
2.4.2.1 Deploy Kafka in distributed mode
2.4.2.2 Using Kafka
2.4.3 KafkaOffsetMonitor
2.4.3.1 Configure KafkaOffsetMonitor
2.5 Environment variables
2.5.1 Environment variables to add on the Hadoop nodes
2.5.2 Environment variables to configure on the HBase cluster nodes

1. Introduction

1.1 Purpose

This tutorial is written for CentOS 7.3 and covers building a big data platform from the following components: Zookeeper, HDFS, YARN, MapReduce2, HBase, Spark, Hive and Sqoop. The system consists of two clusters: one Hadoop cluster and one HBase cluster.

Role                                | Nodes                            | Components
Hadoop cluster management nodes (2) | hadoopManager01, hadoopManager02 | NameNode (hadoop), DFSZKFailoverController (hadoop), ResourceManager (hadoop), Hive (MySQL), Sqoop, MySQL
Hadoop cluster data nodes (3)       | hadoop01, hadoop02, hadoop03     | JournalNode (hadoop), DataNode (hadoop), QuorumPeerMain (Zookeeper), Spark (master, worker), NodeManager (hadoop)
HBase cluster management nodes (2)  | hbaseManager01, hbaseManager02   | NameNode (hadoop), DFSZKFailoverController (hadoop), ResourceManager (hadoop), HMaster (hbase), KafkaOffsetMonitor
HBase cluster data nodes (3)        | hbase01, hbase02, hbase03        | JournalNode (hadoop), DataNode (hadoop), Zookeeper, HRegionServer (hbase), Kafka, NodeManager (hadoop)

Figure 1.1 Components

2. Detailed setup steps

2.1 Preparation
Perform the following on all nodes.

2.1.1 Add hostnames
Set the hostname on each node, and add hostname-to-IP mappings to the /etc/hosts file on every node (this can be skipped if a DNS server is available):

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.19.31 hadoop01
192.168.19.32 hadoop02
192.168.19.33 hadoop03
192.168.19.34 hadoop04
192.168.19.35 hadoop05

Figure 2-1-1 Adding hostnames
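The tutorial does not spell out the hostname change itself; on CentOS 7 this is typically done with hostnamectl. A minimal example, run once per node with that node's own name:

hostnamectl set-hostname hadoop01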

2.1.2 Add service users
Add the service users on all hosts. The Hadoop cluster user is hadoop and the HBase cluster user is hbase:

adduser hadoop
adduser hbase

2.1.3 Set up passwordless SSH login
Generate SSH keys and set up passwordless login between hosts for the service users: collect each host's id_rsa.pub into authorized_keys, copy authorized_keys to all nodes, and change its permissions to 644.

chown -R hadoop:hadoop /home/hadoop
chmod 700 /home/hadoop
chmod 700 /home/hadoop/.ssh
chmod 644 /home/hadoop/.ssh/authorized_keys
chmod 600 /home/hadoop/.ssh/id_rsa

After configuring, verify the setup: it is successful when the hosts can log in to each other without a password.
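A minimal sketch of the key generation and distribution just described, using ssh-copy-id instead of assembling authorized_keys by hand. Run it as the hadoop user on every node so that each host's key reaches all the others (the hbase user on the HBase cluster is handled the same way):

ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for h in hadoop01 hadoop02 hadoop03 hadoop04 hadoop05; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$h   # appends the key to the remote authorized_keys
done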

2.1.4 Disable SELinux
On all nodes, set the value in /etc/selinux/config to disabled and reboot:

SELINUX=disabled

Check with /usr/sbin/sestatus.

2.1.5 Disable the firewall
Disable the firewall on all nodes with the following commands:

systemctl stop firewalld.service
systemctl disable firewalld.service
systemctl status firewalld.service

2.1.6 Install the JDK
All Hadoop components need the JDK, so install it first. This tutorial uses jdk-8u162-linux-x64.rpm. Download the package from the official site, copy it to each node, and install it with:

yum install -y jdk-8u162-linux-x64.rpm

[root@hadoop01 ~]# java -version
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
[root@hadoop01 ~]#

Figure 2-1-5 Installing the JDK

2.2 Install the Hadoop cluster
Install the components in the following order: Zookeeper -> Hadoop -> Spark -> Hive -> Sqoop

2.2.1 Zookeeper
Install and configure Zookeeper on hadoop01, hadoop02 and hadoop03 as the hadoop user.

2.2.1.1 Configure Zookeeper
1. Create the required directories:

mkdir -p /home/hadoop/opt/data/zookeeper
mkdir -p /home/hadoop/opt/data/zookeeper/zookeeper_log

2. Upload the ZK package zookeeper-3.4.5-cdh5.10.0.tar.gz to /home/hadoop, then unpack it:

tar -zxvf zookeeper-3.4.5-cdh5.10.0.tar.gz

3. Create /home/hadoop/zookeeper-3.4.5-cdh5.10.0/conf/zoo.cfg:

[root@hadoop01 conf]# cat zoo.cfg
tickTime=2000
initLimit=5
syncLimit=2
dataDir=/home/hadoop/opt/data/zookeeper
dataLogDir=/home/hadoop/opt/data/zookeeper/zookeeper_log
clientPort=2181
server.33=hadoop01:2888:3888
server.34=hadoop02:2888:3888
server.35=hadoop03:2888:3888

4. On each node, create the file myid in /home/hadoop/opt/data/zookeeper and write the matching value, as scripted below:

hadoop01: myid contains 33
hadoop02: myid contains 34
hadoop03: myid contains 35
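A short sketch that writes the three myid files over SSH, assuming the passwordless login from 2.1.3 is in place; the host:id pairs mirror the server.N lines in zoo.cfg:

for pair in hadoop01:33 hadoop02:34 hadoop03:35; do
  host=${pair%%:*}; id=${pair##*:}
  ssh "$host" "echo $id > /home/hadoop/opt/data/zookeeper/myid"
done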

2.2.1.2 Using Zookeeper
1. Start ZK on each node with:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh start

2. Test a client connection:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkCli.sh -server hadoop01:2181

3. Check the status:

/home/hadoop/zookeeper-3.4.5-cdh5.10.0/bin/zkServer.sh status
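Beyond zkServer.sh status, the ensemble can also be probed with Zookeeper's standard four-letter commands; a quick optional check, assuming nc is installed on the node:

echo ruok | nc hadoop01 2181               # a healthy server answers: imok
echo stat | nc hadoop01 2181 | grep Mode   # reports leader or follower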

2.3.2 Hadoop
Configure Hadoop on all nodes as the hadoop user.

2.3.2.1 Configure Hadoop
1. Unpack hadoop-2.6.0-cdh5.10.0.tar.gz into /home/hadoop:

tar -zxvf hadoop-2.6.0-cdh5.10.0.tar.gz

2. Create the directories:

mkdir -p /home/hadoop/opt/data/hadoop/tmp
mkdir -p /home/hadoop/opt/data/hadoop/hadoop-name
mkdir -p /home/hadoop/opt/data/hadoop/hadoop-data
mkdir -p /home/hadoop/opt/data/hadoop/editsdir/dfs/journalnode
mkdir -p /home/hadoop/opt/data/hadoop/nm-local-dir
mkdir -p /home/hadoop/opt/data/hadoop/hadoop_log
mkdir -p /home/hadoop/opt/data/hadoop/userlogs

3. Edit /home/hadoop/hadoop-2.6.0-cdh5.10.0/etc/hadoop/hadoop-env.sh:

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_162

4. Configure HDFS HA. The configuration files are core-site.xml and hdfs-site.xml; the full contents are provided in the accompanying hadoop folder.

5. Configure YARN HA. The configuration files are yarn-site.xml and mapred-site.xml; the full contents are provided in the accompanying hadoop folder. (On each management node, set yarn.resourcemanager.ha.id to that node's own ID.)

6. On the NodeManager (data) nodes, place the file spark-2.3.0-yarn-shuffle.jar at /home/hadoop/hadoop-2.6.0-cdh5.10.0/share/hadoop/yarn/spark-2.3.0-yarn-shuffle.jar.

2.3.2.2 Starting Hadoop for the first time
1. On namenode1, create the namespace in Zookeeper:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs zkfc -formatZK

Check: the log shows "ha.ActiveStandbyElector: Successfully created /hadoop-ha/bigdatacluster in ZK."

2. Start the journalnodes (hadoop01, hadoop02, hadoop03):

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start journalnode

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-journalnode-hadoop02.log

3. On the primary namenode, format HDFS. Format only on the primary NN; this generates the unique cluster ID:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs namenode -format bigdatacluster

Check: no errors are reported.

4. Start the namenode process on the primary namenode:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start namenode

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-namenode-hadoopmanager01.log

5. On the standby namenode, copy and synchronize the metadata from the primary NN:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/bin/hdfs namenode -bootstrapStandby

Check: "Exiting with status 0"

6. Start the namenode on the standby:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start namenode

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-namenode-hadoopmanager02.log

7. Start the DFSZKFailoverController on both namenode nodes:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/hadoop-daemon.sh start zkfc

Check: /home/hadoop/opt/data/hadoop/hadoop_log/hadoop-hadoop-zkfc-hadoopmanager02.log

8. Start HDFS:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/start-dfs.sh

9. Start YARN:

/home/hadoop/hadoop-2.6.0-cdh5.10.0/sbin/start-yarn.sh

Check: /home/hadoop/hadoop-2.6.0-cdh5.10.0/logs/yarn-hadoop-resourcemanager-hadoopmanager01.log and http://172.16.20.11:8088/cluster/nodes
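After step 9, it is worth confirming that each node runs the daemons listed in Figure 1.1. A hedged check using jps, assuming the JDK bin directory is on each host's PATH (host names as used in the log file names above):

for h in hadoopmanager01 hadoopmanager02 hadoop01 hadoop02 hadoop03; do
  echo "== $h =="
  ssh "$h" jps
done
# management nodes: NameNode, DFSZKFailoverController, ResourceManager
# data nodes: DataNode, JournalNode, QuorumPeerMain, NodeManager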

2.3.3 Spark
Deploy Spark on hadoop01, hadoop02 and hadoop03 as the hadoop user.

2.3.3.1 Install Scala (all nodes)
Install Scala as the root user.
1. Upload the package, unpack it, and move it to /usr/local:

tar -zxvf scala-2.12.5.tgz
mv scala-2.12.5 /usr/local

2. Configure the environment variables and source the file:

vi /etc/profile
export SCALA_HOME=/usr/local/scala-2.12.5
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin
source /etc/profile

3. Check Scala:

scala -version

2.3.3.2 Install Spark
1. Upload spark-2.3.0-bin-hadoop2.6.tgz and unpack it into /home/hadoop:

tar -zxvf spark-2.3.0-bin-hadoop2.6.tgz

2. Configure spark-env.sh:

HADOOP_CONF_DIR=/home/hadoop/hadoop-2.6.0-cdh5.10.0
SPARK_HOME=/home/hadoop/spark-2.3.0-bin-hadoop2.6

3. Start Spark on hadoop01-03:

/home/hadoop/spark-2.3.0-bin-hadoop2.6/sbin/start-all.sh

4. Check the web UI: http://IP:8080/
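A hedged smoke test for the standalone cluster, assuming the master runs on hadoop01 at the default port 7077 (the exact master URL is shown on the web UI):

/home/hadoop/spark-2.3.0-bin-hadoop2.6/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://hadoop01:7077 \
  /home/hadoop/spark-2.3.0-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.3.0.jar 100
# the driver output should contain a line like "Pi is roughly 3.14..."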

2.3.4 Hive
Configure Hive on hadoopManager01 as the hadoop user, then deploy the MySQL master/slave pair on hadoopManager01-02 as root.

2.3.4.1 Deploy the MySQL master/slave cluster
1. Remove the preinstalled MariaDB:

rpm -qa | grep -i mariadb
rpm -e --nodeps mariadb-libs-5.5.52-1.el7.x86_64

2. Upload the MySQL bundle and unpack it:

tar -xvf mysql-5.7.21-1.el7.x86_64.rpm-bundle.tar

3. The packages depend on each other, so the order matters: client depends on libs, and server depends on common and client. Install in the following order:

yum install perl -y && yum install net-tools -y
rpm -ivh mysql-community-common-5.7.21-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.21-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.21-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.21-1.el7.x86_64.rpm

4. To make sure the database directory and files are owned by the mysql user, initialize as follows if you run the mysql service as root:

mysqld --initialize --user=mysql

5. Start the MySQL database:

systemctl start mysqld.service
systemctl status mysqld.service

6. Read the initial password from the log, then log in:

cat /var/log/mysqld.log
mysql -uroot -p

7. Set a new password:

mysql> set password=password('2018');

8. Grant privileges (for remote access):

mysql> grant all privileges on *.* to 'mysql'@'%' identified by '2018';
mysql> flush privileges;

10. Modify the master database configuration (hadoopManager01):

vi /etc/my.cnf
# add the following parameters
log-bin=mysql-bin
server-id=2
binlog-ignore-db=information_schema
binlog-ignore-db=cluster
binlog-ignore-db=mysql
binlog-do-db=test

Restart the database, log in, and run:

grant FILE on *.* to 'mysql'@'172.16.20.12' identified by '2018';
grant replication slave on *.* to 'mysql'@'172.16.20.12' identified by '2018';
flush privileges;
SHOW MASTER STATUS;

11. Modify the slave node:

vi /etc/my.cnf
# add the following parameters
log-bin=mysql-bin
server-id=3
binlog-ignore-db=information_schema
binlog-ignore-db=cluster
binlog-ignore-db=mysql
replicate-do-db=test
replicate-ignore-db=mysql
log-slave-updates
slave-skip-errors=all
slave-net-timeout=60

Restart the database, log in, and run:

CHANGE MASTER TO MASTER_HOST='172.16.20.11', MASTER_USER='mysql', MASTER_PASSWORD='2018', MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=883;
stop slave;
start slave;
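Replication health can then be confirmed on the slave with the standard status query:

mysql -uroot -p -e "SHOW SLAVE STATUS\G" | grep -E "Slave_IO_Running|Slave_SQL_Running|Seconds_Behind_Master"
# both *_Running values should report Yes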

2.3.4.2 Configure Hive
1. Log in to MySQL, create a user, and grant it privileges:

mysql> create user 'hive'@'%' identified by 'hive';
mysql> grant all on *.* to 'hive'@'%' identified by 'hive';
mysql> flush privileges;

2. Create the directories:

mkdir -p /home/hadoop/opt/data/hive
mkdir -p /home/hadoop/opt/data/hive/logs

3. Upload hive-1.1.0-cdh5.10.0 and unpack it into /home/hadoop:

tar -zxvf hive-1.1.0-cdh5.10.0.tar.gz

4. Create the file hive-site.xml with the following properties (name = value, with the property description in parentheses):

javax.jdo.option.ConnectionURL = jdbc:mysql://hadoopmanager01:3306/hive?createDatabaseIfNotExist=true (JDBC connect string for a JDBC metastore)
javax.jdo.option.ConnectionDriverName = com.mysql.jdbc.Driver (Driver class name for a JDBC metastore)
javax.jdo.option.ConnectionUserName = hive (username to use against metastore database)
javax.jdo.option.ConnectionPassword = hive (password to use against metastore database)
hive.hwi.war.file = lib/hive-hwi-1.1.0-cdh5.10.0.jar (This sets the path to the HWI war file, relative to $HIVE_HOME)
hive.hwi.listen.host = 0.0.0.0 (This is the host address the Hive Web Interface will listen on)
hive.hwi.listen.port = 9999 (This is the port the Hive Web Interface will listen on)
hive.exec.scratchdir = /home/hadoop/opt/data/hive/hive-${user.name} (Scratch space for Hive jobs)
hive.exec.local.scratchdir = /home/hadoop/opt/data/hive/${user.name} (Local scratch space for Hive jobs)

5. Create hive-env.sh:

cp hive-env.sh.template hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/hadoop/hive-1.1.0-cdh5.10.0/conf
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/hadoop/hive-1.1.0-cdh5.10.0/lib

6. Upload the MySQL JDBC jar into Hive's lib directory:

tar -zxvf mysql-connector-java-5.1.46.tar.gz
cp mysql-connector-java-5.1.46.jar /home/hadoop/hive-1.1.0-cdh5.10.0/lib
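The metastore schema must exist in MySQL before first use. A hedged initialization and smoke test, assuming the schematool utility that ships with Hive 1.1 (skip the first command if the schema was already auto-created through createDatabaseIfNotExist=true):

/home/hadoop/hive-1.1.0-cdh5.10.0/bin/schematool -dbType mysql -initSchema
/home/hadoop/hive-1.1.0-cdh5.10.0/bin/hive -e "show databases;"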

2.3.5 Sqoop
Configure Sqoop on hadoopManager01 as the hadoop user.

2.3.5.1 Configure Sqoop
1. Upload the Sqoop package sqoop-1.4.6-cdh5.10.0.tar.gz to /home/hadoop, then unpack it:

tar -zxvf sqoop-1.4.6-cdh5.10.0.tar.gz

2. Copy the MySQL JDBC driver mysql-connector-java-5.1.46.jar into Sqoop's lib directory:

cp mysql-connector-java-5.1.46.jar /home/hadoop/sqoop-1.4.6-cdh5.10.0/lib

3. Modify hbase-env.sh:

export JAVA_HOME=/usr/java/jdk1.8.0_162
export HBASE_LOG_DIR=/home/hadoop/data/hbase/logs
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
export HBASE_MANAGES_ZK=false

4. Configure sqoop-env.sh:

export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.6.0-cdh5.10.0
export HIVE_HOME=/home/hadoop/hive-1.1.0-cdh5.10.0

2.3.5.2 Using Sqoop
List all databases in MySQL:

sqoop list-databases --connect jdbc:mysql://localhost:3306/ --username mysql --password 2018
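Beyond list-databases, a typical next step is importing a table into HDFS. A hypothetical example (the database test and table t1 are placeholders):

sqoop import \
  --connect jdbc:mysql://hadoopmanager01:3306/test \
  --username mysql --password 2018 \
  --table t1 --target-dir /user/hadoop/t1 -m 1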

2.4 Install the HBase cluster
Installation order for the HBase cluster: Zookeeper -> Hadoop -> HBase -> Kafka -> KafkaOffsetMonitor
Install Zookeeper and Hadoop as described in 2.2.1 and 2.3.2, taking care to adjust the usernames.

2.4.1 HBase
Configure the HBase cluster on all nodes.

2.4.1.2 Deploy the distributed HBase cluster
1. Upload the HBase package hbase-1.2.0-cdh5.10.0.tar.gz to /home/hbase, then unpack it:

tar -zxvf hbase-1.2.0-cdh5.10.0.tar.gz

2. Create the required directories:

mkdir -p /home/hbase/opt/data/hbase/logs
mkdir -p /home/hbase/opt/data/hbase/zookeeper
mkdir -p /home/hbase/opt/data/hbase/tmp

3. Modify hbase-env.sh:

export JAVA_HOME=/usr/java/jdk1.8.0_162
export HBASE_LOG_DIR=/home/hbase/opt/data/hbase/logs
export HADOOP_HOME=/home/hbase/hadoop-2.6.0-cdh5.10.0
export HBASE_MANAGES_ZK=false

4. Modify hbase-site.xml with the following properties (name = value):

hbase.rootdir = hdfs://bigdatacluster/hbase
hbase.cluster.distributed = true
hbase.master.port = 16000
hbase.zookeeper.quorum = hbase01,hbase02,hbase03
hbase.zookeeper.property.clientPort = 2181
hbase.zookeeper.property.dataDir = /home/hbase/opt/data/hbase/zookeeper
hbase.tmp.dir = /home/hbase/opt/data/hbase/tmp
hbase.coprocessor.user.region.classes = org.apache.hadoop.hbase.coprocessor.AggregateImplementation
hbase.superuser = hbase,root,hadoop
hbase.security.authorization = true
hbase.coprocessor.master.classes = org.apache.hadoop.hbase.security.access.AccessController
hbase.coprocessor.region.classes = org.apache.hadoop.hbase.security.token.TokenProvider,org.apache.hadoop.hbase.security.access.AccessController
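Once hbase-site.xml is distributed to every node, a hedged start-and-check sequence (run as the hbase user on the HMaster node, with HDFS and Zookeeper already running):

/home/hbase/hbase-1.2.0-cdh5.10.0/bin/start-hbase.sh
echo "status" | /home/hbase/hbase-1.2.0-cdh5.10.0/bin/hbase shell
# expect a summary like "1 active master, 1 backup masters, 3 servers, 0 dead, ..."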
