Data Solution 2019(3) Run Zeppelin in Single Docker


Exception when starting HDFS in Docker:

ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.

Solution:

Adding these variables to the environment solves the problem:

export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
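Exporting these in an interactive shell only helps that shell. A small sketch of persisting the same settings by appending them to hadoop-env.sh instead (the real target in the container would be /tool/hadoop/etc/hadoop/hadoop-env.sh, matching the layout used later in this post; a temp file stands in for it here so the snippet runs anywhere):

```shell
# Append the five user variables to hadoop-env.sh so every Hadoop start
# script sees them. HADOOP_ENV points at a temp file for demonstration.
HADOOP_ENV=$(mktemp)
for v in HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
         YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
  echo "export $v=\"root\"" >> "$HADOOP_ENV"
done
grep -c '^export' "$HADOOP_ENV"
# prints: 5
```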

Exception when starting HDFS in Docker:

Starting namenodes on [0.0.0.0]
0.0.0.0: /tool/hadoop-3.2.0/bin/../libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting datanodes
localhost: /tool/hadoop-3.2.0/bin/../libexec/hadoop-functions.sh: line 982: ssh: command not found
Starting secondarynamenodes [140815a59b06]
140815a59b06: /tool/hadoop-3.2.0/bin/../libexec/hadoop-functions.sh: line 982: ssh: command not found

Solution:

https://stackoverflow.com/questions/40801417/installing-ssh-in-the-docker-containers

Install and start the SSH server:

RUN apt-get install -y openssh-server
RUN mkdir /var/run/sshd
RUN ssh-keygen -q -t rsa -N '' -f /root/.ssh/id_rsa
RUN cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# start ssh service
nohup /usr/sbin/sshd -D >/dev/stdout &
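Once sshd is up, a quick way to confirm that Hadoop's start scripts will actually get through is to probe localhost yourself (a sketch; BatchMode makes ssh fail fast instead of prompting if the key setup is wrong):

```shell
# Probe passwordless ssh to localhost without ever blocking on a prompt.
ssh -o StrictHostKeyChecking=no -o BatchMode=yes localhost true 2>/dev/null \
  && echo "passwordless ssh to localhost: OK" \
  || echo "passwordless ssh to localhost: NOT working yet"
```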

Exception when starting HDFS:

ERROR: JAVA_HOME is not set and could not be found

Solution:

Add JAVA_HOME in hadoop-env.sh:

export JAVA_HOME="/usr/lib/jvm/java-8-oracle"

It seems HDFS is running fine in Docker. But from the UI at http://localhost:9870/dfshealth.html#tab-overview, I get an error like this:

Exception:

Permission denied: user=dr.who, access=WRITE, inode="/": root:supergroup:drwxr-xr-x

Solution:

https://stackoverflow.com/questions/11593374/permission-denied-at-hdfs

Since this is my local Docker, I will just disable the permissions in hdfs-site.xml:

<property>
    <name>dfs.permissions</name>
    <value>false</value>
</property>
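A side note: `dfs.permissions` is the legacy property name and is kept only as a deprecated alias; on Hadoop 3.2.0 the current name in hdfs-default.xml is `dfs.permissions.enabled`, so the equivalent entry would be:

```xml
<property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
</property>
```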

Check Docker Stats

> docker stats

My memory is only 2G, which is too small; maybe the CPU is not powerful enough as well.

CONTAINER ID   NAME               CPU %   MEM USAGE / LIMIT     MEM %    NET I/O         BLOCK I/O        PIDS
382b064708ec   ubuntu-spark-1.0   0.64%   1.442GiB / 1.952GiB   73.89%   216kB / 437kB   255MB / 10.1MB   256

> nproc

4

Maybe the CPU is OK.

I am using a Mac, so the way to increase the memory is to open the tool:

Docker Desktop -> Preferences -> Advanced -> CPUs 4, Memory 2GB, Swap 1.0GB

https://stackoverflow.com/questions/44533319/how-to-assign-more-memory-to-docker-container

Clean up my Docker images which I am not using anymore:

> docker images | grep none | awk '{print $3;}' | xargs docker rmi
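The pipeline keeps only the third column (the IMAGE ID) of every `<none>` line and feeds it to `docker rmi`. Demonstrated here on a captured sample instead of live `docker images` output, so it runs without Docker; on Docker 1.13+ the built-in `docker image prune` performs the same dangling-image cleanup:

```shell
# Simulate two lines of `docker images` output and extract the image ID
# of the dangling (<none>) entry, exactly as the pipeline above does.
sample='myrepo     latest    abc123    1 day ago    100MB
<none>     <none>    def456    2 days ago   200MB'
printf '%s\n' "$sample" | grep none | awk '{print $3;}'
# prints: def456
```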

Official Website

https://hub.docker.com/r/apache/zeppelin/dockerfile

Finally I made it work.

conf/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://0.0.0.0:9000</value>
    </property>
</configuration>

conf/hadoop-env.sh

export JAVA_HOME="/usr/lib/jvm/java-8-oracle"

export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}
case ${HADOOP_OS_TYPE} in
  Darwin*)
    export HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.realm="
    export HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.kdc="
    export HADOOP_OPTS="${HADOOP_OPTS} -Djava.security.krb5.conf="
  ;;
esac

conf/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.permissions</name>
        <value>false</value>
    </property>
</configuration>

conf/spark-env.sh

HADOOP_CONF_DIR=/tool/hadoop/etc/hadoop

We need to keep zeppelin/conf and zeppelin/notebook outside and map them into the Docker application to persist the data.

This is the important Dockerfile:

# Run a Zeppelin/Spark server side
# Prepare the OS
FROM ubuntu:16.04
MAINTAINER Carl Luo <luohuazju@gmail.com>

ENV DEBIAN_FRONTEND noninteractive
ENV JAVA_HOME /usr/lib/jvm/java-8-oracle
ENV LANG en_US.UTF-8
ENV LC_ALL en_US.UTF-8

RUN apt-get -qq update
RUN apt-get -qqy dist-upgrade

# Prepare the dependencies
RUN apt-get install -qy wget unzip vim
RUN apt-get install -qy iputils-ping

# Install SUN JAVA
RUN apt-get update && \
    apt-get install -y --no-install-recommends locales && \
    locale-gen en_US.UTF-8 && \
    apt-get dist-upgrade -y && \
    apt-get --purge remove openjdk* && \
    echo "oracle-java8-installer shared/accepted-oracle-license-v1-1 select true" | debconf-set-selections && \
    echo "deb http://ppa.launchpad.net/webupd8team/java/ubuntu xenial main" > /etc/apt/sources.list.d/webupd8team-java-trusty.list && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys EEA14886 && \
    apt-get update && \
    apt-get install -y --no-install-recommends oracle-java8-installer oracle-java8-set-default && \
    apt-get clean all

# Prepare for hadoop and spark
RUN apt-get install -y openssh-server
RUN mkdir /var/run/sshd
RUN ssh-keygen -q -t rsa -N '' -f /root/.ssh/id_rsa
RUN cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
RUN mkdir /tool/
WORKDIR /tool/

# add the software hadoop
ADD install/hadoop-3.2.0.tar.gz /tool/
RUN ln -s /tool/hadoop-3.2.0 /tool/hadoop
ADD conf/core-site.xml /tool/hadoop/etc/hadoop/
ADD conf/hdfs-site.xml /tool/hadoop/etc/hadoop/
ADD conf/hadoop-env.sh /tool/hadoop/etc/hadoop/

# add the software spark
ADD install/spark-2.4.0-bin-hadoop2.7.tgz /tool/
RUN ln -s /tool/spark-2.4.0-bin-hadoop2.7 /tool/spark
ADD conf/spark-env.sh /tool/spark/conf/

# add the software zeppelin
ADD install/zeppelin-0.8.1-bin-all.tgz /tool/
RUN ln -s /tool/zeppelin-0.8.1-bin-all /tool/zeppelin

# set up the app
EXPOSE 9000 9870 8080 4040
RUN mkdir -p /app/
ADD start.sh /app/
WORKDIR /app/
CMD ["./start.sh"]

This is the Makefile which will make it work:

IMAGE = sillycat/public
TAG   = ubuntu-spark-1.0
NAME  = ubuntu-spark-1.0

prepare:
	wget http://mirror.olnevhost.net/pub/apache/hadoop/common/hadoop-3.2.0/hadoop-3.2.0.tar.gz -P install/
	wget http://ftp.wayne.edu/apache/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz -P install/
	wget http://apache.claz.org/zeppelin/zeppelin-0.8.1/zeppelin-0.8.1-bin-all.tgz -P install/

docker-context:

build: docker-context
	docker build -t $(IMAGE):$(TAG) .

run:
	docker run -d -p 9870:9870 -p 9000:9000 -p 8080:8080 -p 4040:4040 -v $(shell pwd)/zeppelin/notebook:/tool/zeppelin/notebook -v $(shell pwd)/zeppelin/conf:/tool/zeppelin/conf --name $(NAME) $(IMAGE):$(TAG)

debug:
	docker run -ti -p 9870:9870 -p 9000:9000 -p 8080:8080 -p 4040:4040 -v $(shell pwd)/zeppelin/notebook:/tool/zeppelin/notebook -v $(shell pwd)/zeppelin/conf:/tool/zeppelin/conf --name $(NAME) $(IMAGE):$(TAG) /bin/bash

clean:
	docker stop $(NAME)
	docker rm $(NAME)

logs:
	docker logs $(NAME)

publish:
	docker push $(IMAGE)

This is the start.sh to start the application:

#!/bin/sh -ex

# prepare ENV
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"
export SPARK_HOME="/tool/spark"

# start ssh service
nohup /usr/sbin/sshd -D >/dev/stdout &

# start the service
cd /tool/hadoop
bin/hdfs namenode -format
sbin/start-dfs.sh
cd /tool/zeppelin
bin/zeppelin.sh
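One caveat in this start.sh: `hdfs namenode -format` runs on every container start, and re-formatting destroys existing HDFS metadata. A guard like the following would format only on first boot (a sketch; the name directory under /tmp/hadoop-root is the hdfs-default.xml default when running as root and is an assumption here, and the echo stands in for the real format call):

```shell
# Format the namenode only if its storage directory has never been
# initialized; a successful format creates the "current" subdirectory.
NAME_DIR=${NAME_DIR:-/tmp/hadoop-root/dfs/name}
if [ -d "$NAME_DIR/current" ]; then
  echo "namenode already formatted, skipping"
else
  echo "first boot: would run bin/hdfs namenode -format -nonInteractive"
fi
```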

After that, we can visit these 3 UIs to work on our data:

### Hadoop 3.2.0 Spark 2.4.0 Zeppelin 0.8.1

### HDFS
http://localhost:9870/explorer.html#/

### Zeppelin UI
http://localhost:8080/

### After you run the first demo job, the Spark Jobs UI
http://localhost:4040/stages/

References:

https://stackoverflow.com/questions/48129029/hdfs-namenode-user-hdfs-datanode-user-hdfs-secondarynamenode-user-not-defined

https://www.cnblogs.com/sylar5/p/9169090.html

https://www.jianshu.com/p/b49712bbe044

https://stackoverflow.com/questions/40801417/installing-ssh-in-the-docker-containers

https://stackoverflow.com/questions/27504187/ssh-key-generation-using-dockerfile

https://github.com/twang2218/docker-zeppelin
