Big Data Training: Scheduling MapReduce Jobs with Oozie

Published: 2020-02-06 | Author: atguigu

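Scheduling a MapReduce Job with Oozie

Goal: use Oozie to schedule a MapReduce job.

Steps:

1) Find a jar that contains a runnable MapReduce job (the official Hadoop examples jar or one you wrote yourself)

2) Copy the official template into oozie-apps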

[atguigu@hadoop102 oozie-4.0.0-cdh5.3.6]$ cp -r /opt/module/cdh/oozie-4.0.0-cdh5.3.6/examples/apps/map-reduce/ oozie-apps/

3) Test-run wordcount on YARN

[atguigu@hadoop102 oozie-4.0.0-cdh5.3.6]$ /opt/module/cdh/hadoop-2.5.0-cdh5.3.6/bin/yarn jar /opt/module/cdh/hadoop-2.5.0-cdh5.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /input/ /output/
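Before handing the job to Oozie, it helps to confirm that the test run actually produced output. The following is a minimal sketch, assuming the job wrote to /output/ as in the command above. The function name check_wordcount_output and the HADOOP_HOME default are illustrative and not part of the original setup.

```shell
# Assumed install location, taken from the command above; override if yours differs.
HADOOP_HOME="${HADOOP_HOME:-/opt/module/cdh/hadoop-2.5.0-cdh5.3.6}"

# Print the first lines of the reducer output if the job finished cleanly.
check_wordcount_output() {
    out_dir="$1"
    # MapReduce writes an empty _SUCCESS marker when a job completes successfully.
    if "$HADOOP_HOME/bin/hdfs" dfs -test -e "$out_dir/_SUCCESS"; then
        # The glob is quoted so that HDFS, not the local shell, expands it.
        "$HADOOP_HOME/bin/hdfs" dfs -cat "$out_dir/part-r-*" | head -n 10
    else
        echo "no _SUCCESS marker under $out_dir" >&2
        return 1
    fi
}
```

Call it as check_wordcount_output /output. Note that running the yarn command a second time fails if /output/ already exists, because MapReduce refuses to overwrite an existing output directory.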

4) Configure job.properties and workflow.xml for the map-reduce job

job.properties

nameNode=hdfs://hadoop102:8020

jobTracker=hadoop103:8032

queueName=default

examplesRoot=oozie-apps

# Resolves to hdfs://hadoop102:8020/user/atguigu/oozie-apps/map-reduce/workflow.xml when submitted as user atguigu

oozie.wf.application.path=${nameNode}/user/${user.name}/${examplesRoot}/map-reduce/workflow.xml

outputDir=map-reduce
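The value of oozie.wf.application.path is assembled from the other properties, and ${user.name} is filled in with the OS user who submits the job. The sketch below only mimics that substitution in the shell to show the resulting path. The variable user_name is a stand-in for ${user.name}, and atguigu is assumed because it is the user shown in the shell prompts.

```shell
# Illustration only: reproduce the ${...} substitution from job.properties by hand.
nameNode="hdfs://hadoop102:8020"
user_name="atguigu"          # stand-in for ${user.name}, the submitting OS user
examplesRoot="oozie-apps"

app_path="${nameNode}/user/${user_name}/${examplesRoot}/map-reduce/workflow.xml"
echo "$app_path"
# prints hdfs://hadoop102:8020/user/atguigu/oozie-apps/map-reduce/workflow.xml
```

The application directory therefore has to exist on HDFS under the home directory of whichever user submits the job.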

workflow.xml

<workflow-app xmlns="uri:oozie:workflow:0.2" name="map-reduce-wf">

    <start to="mr-node"/>

    <action name="mr-node">

        <map-reduce>

            <job-tracker>${jobTracker}</job-tracker>

            <name-node>${nameNode}</name-node>

            <prepare>

                <delete path="${nameNode}/output/"/>

            </prepare>

            <configuration>

                <property>

                    <name>mapred.job.queue.name</name>

                    <value>${queueName}</value>

                </property>

                <!-- Use the new MapReduce API when scheduling the MR job -->

                <property>

                    <name>mapred.mapper.new-api</name>

                    <value>true</value>

                </property>

 

                <property>

                    <name>mapred.reducer.new-api</name>

                    <value>true</value>

                </property>

 

                <!-- Output key class of the job -->

                <property>

                    <name>mapreduce.job.output.key.class</name>

                    <value>org.apache.hadoop.io.Text</value>

                </property>

 

                <!-- Output value class of the job -->

                <property>

                    <name>mapreduce.job.output.value.class</name>

                    <value>org.apache.hadoop.io.IntWritable</value>

                </property>

 

                <!-- Input path -->

                <property>

                    <name>mapred.input.dir</name>

                    <value>/input/</value>

                </property>

 

                <!-- Output path -->

                <property>

                    <name>mapred.output.dir</name>

                    <value>/output/</value>

                </property>

 

                <!-- Mapper class -->

                <property>

                    <name>mapreduce.job.map.class</name>

                    <value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>

                </property>

 

                <!-- Reducer class -->

                <property>

                    <name>mapreduce.job.reduce.class</name>

                    <value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>

                </property>

 

                <property>

                    <name>mapred.map.tasks</name>

                    <value>1</value>

                </property>

            </configuration>

        </map-reduce>

        <ok to="end"/>

        <error to="fail"/>

    </action>

    <kill name="fail">

        <message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>

    </kill>

    <end name="end"/>

</workflow-app>
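A workflow.xml copied from a rendered web page often picks up typographic quotes or dashes, which make the file invalid XML. The sketch below first scans for such characters and then runs a schema check. It assumes it is run from the Oozie home directory and that the installed Oozie CLI provides the validate subcommand. The function names are hypothetical.

```shell
# Succeeds (and prints the offending lines) if typographic characters are present.
find_typographic_chars() {
    # -F matches the literal characters; -n prefixes each hit with its line number.
    grep -nF -e '“' -e '”' -e '″' -e '–' "$1"
}

check_workflow() {
    wf="$1"
    if find_typographic_chars "$wf"; then
        echo "replace the characters above with plain ASCII quotes and hyphens" >&2
        return 1
    fi
    # Validate the file against the Oozie workflow XML schema.
    bin/oozie validate "$wf"
}
```

For example: check_workflow oozie-apps/map-reduce/workflow.xml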

5) Copy the jar to be executed into the lib directory of map-reduce

[atguigu@hadoop102 oozie-4.0.0-cdh5.3.6]$ cp -a /opt/module/cdh/hadoop-2.5.0-cdh5.3.6/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar oozie-apps/map-reduce/lib
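At this point the local application directory should contain everything Oozie needs. A quick sanity check can catch a missing file before anything is sent to the cluster. This is a sketch under the layout described in this article; check_app_dir is a hypothetical helper name.

```shell
# Verify the local layout of an Oozie workflow application directory.
check_app_dir() {
    app_dir="$1"
    [ -f "$app_dir/workflow.xml" ] || { echo "missing workflow.xml" >&2; return 1; }
    [ -f "$app_dir/job.properties" ] || { echo "missing job.properties" >&2; return 1; }
    # Jars placed under lib/ are added to the classpath of the workflow's actions.
    ls "$app_dir"/lib/*.jar >/dev/null 2>&1 || { echo "no jar in lib/" >&2; return 1; }
    echo "ok: $app_dir"
}
```

For example: check_app_dir oozie-apps/map-reduce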

6) Upload the configured app folder to HDFS

[atguigu@hadoop102 oozie-4.0.0-cdh5.3.6]$ /opt/module/cdh/hadoop-2.5.0-cdh5.3.6/bin/hdfs dfs -put oozie-apps/map-reduce/ /user/atguigu/oozie-apps

7) Run the job

[atguigu@hadoop102 oozie-4.0.0-cdh5.3.6]$ bin/oozie job -oozie http://hadoop102:11000/oozie -config oozie-apps/map-reduce/job.properties -run
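After submission, the Oozie CLI prints a line of the form "job: <id>", and that id can be used to query status and logs. The sketch below wraps submission and follow-up into one function. It assumes it is run from the Oozie home directory; the helper names extract_job_id and submit_and_report are hypothetical.

```shell
# The oozie CLI falls back to OOZIE_URL when the -oozie option is omitted.
OOZIE_URL="http://hadoop102:11000/oozie"
export OOZIE_URL

# Keep only the id from a "job: <id>" line read on stdin.
extract_job_id() {
    sed -n 's/^job: *//p'
}

submit_and_report() {
    job_id=$(bin/oozie job -config oozie-apps/map-reduce/job.properties -run | extract_job_id)
    [ -n "$job_id" ] || { echo "submission failed" >&2; return 1; }
    bin/oozie job -info "$job_id"   # status of the workflow and of each action
    bin/oozie job -log "$job_id"    # job log, useful when the status is KILLED or FAILED
}
```

The same information is available in the Oozie web console at the URL above.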
