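Building a package with IDEA + Maven + Scala and running it on Spark on YARN
Configuring the Maven project
In pom.xml, add the dependencies needed for Spark development. Choose the artifacts that match your Spark version (the _2.11 suffix of the artifactId is the Scala version); they can be looked up in the Maven Central Repository.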
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.1</version>
</dependency>
Build methods
Building the package with Artifacts
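Building the package with Maven
- To build the package with Maven, you only need to add the following plugin (maven-shade-plugin) to pom.xml. The mainClass value shown here does not match the InfoOutput class used in the example further down; replace it with your own entry class.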
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.1</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <filters>
                    <filter>
                        <artifact>*:*</artifact>
                        <excludes>
                            <exclude>META-INF/*.SF</exclude>
                            <exclude>META-INF/*.DSA</exclude>
                            <exclude>META-INF/*.RSA</exclude>
                        </excludes>
                    </filter>
                </filters>
                <transformers>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                        <resource>META-INF/spring.handlers</resource>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                        <resource>META-INF/spring.schemas</resource>
                    </transformer>
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                        <mainClass>cn.mucang.sensor.SensorMain</mainClass>
                    </transformer>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>
Example Scala code
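import org.apache.spark.storage.StorageLevel
import org.apache.spark.{SparkConf, SparkContext}

object InfoOutput {
  def main(args: Array[String]): Unit = {
    // The master is not hard-coded: spark-submit supplies it via --master yarn.
    val sparkConf = new SparkConf().setAppName("NginxLog")
    val sc = new SparkContext(sparkConf)
    val fd = sc.textFile("hdfs:///xxx/logs/access.log")
    // Keep lines that mention .baidu.com and split each into space-separated fields.
    val logRDD = fd.filter(_.contains(".baidu.com")).map(_.split(" "))
    logRDD.persist(StorageLevel.DISK_ONLY)
    // countByValue returns an unordered Map on the driver, so sort by count
    // (descending) before taking the 10 most frequent values of field 2.
    val ipTop = logRDD.map(v => v(2)).countByValue().toSeq.sortBy(-_._2).take(10)
    ipTop.foreach(println)
    sc.stop()
  }
}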
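The filter, split, count, and sort steps of the job can be tried out without a cluster. The following is only a minimal sketch in plain Scala collections, not the Spark job itself: the object TopIpSketch, the helper topValues, and the sample lines are all hypothetical and introduced for illustration, and it assumes, as the job above does, that the value of interest is the space-separated field at index 2.

```scala
// Plain-Scala sketch of the counting step; it does not use Spark.
object TopIpSketch {
  // Hypothetical helper: among lines containing `keyword`, return the n most
  // frequent values of the space-separated field at `fieldIndex`.
  def topValues(lines: Seq[String], keyword: String, fieldIndex: Int, n: Int): Seq[(String, Long)] =
    lines
      .filter(_.contains(keyword))
      .map(_.split(" "))
      .filter(_.length > fieldIndex) // skip lines with too few fields
      .groupBy(fields => fields(fieldIndex))
      .map { case (value, hits) => (value, hits.size.toLong) }
      .toSeq
      .sortBy { case (value, count) => (-count, value) } // most frequent first, ties by value
      .take(n)

  def main(args: Array[String]): Unit = {
    // Made-up sample lines; field 2 plays the role of the client IP.
    val sample = Seq(
      "GET www.baidu.com 10.0.0.1",
      "GET www.baidu.com 10.0.0.2",
      "GET www.baidu.com 10.0.0.1",
      "GET www.example.com 10.0.0.3"
    )
    topValues(sample, ".baidu.com", 2, 10).foreach(println)
  }
}
```

Unlike the Spark job, this sketch drops lines that have fewer than three fields instead of failing on them; whether that is the right behavior for real access logs is a design choice left to the reader.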
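Uploading the JAR
- Use scp to upload the JAR to the server where spark-submit runs. The JAR is located in the project's out directory.
- Because the project does not depend on any third-party packages, the resulting JAR is small. Submit the job with spark-submit: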
spark-submit --class InfoOutput --verbose --master yarn --deploy-mode cluster nginxlogs.jar