Notes on a standalone install and test of Spark 1.3.1

1. Download and install Spark and Scala
Download the Hadoop 2.6 build of Spark 1.3.1: spark-1.3.1-bin-hadoop2.6.tgz
Once downloaded, it just needs to be extracted:
helight@helight-xu:/data/spark$ tar zxf spark-1.3.1-bin-hadoop2.6.tgz
Download Scala 2.11.6 and extract it the same way:
helight@helight-xu:/data/spark$ tar zxf scala-2.11.6.tgz

Installing Spark and Scala then comes down to setting environment variables. These can go in the system-wide configuration file /etc/profile, or in the per-user file ~/.bashrc:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
export SCALA_HOME=/data/spark/scala-2.11.6
export PATH=$PATH:$JAVA_HOME/bin:$SCALA_HOME/bin
That completes the basic setup.
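A quick sanity check that the variables actually made it onto PATH after re-sourcing the profile; this is a minimal sketch using the install paths from this note (on a working install, `scala -version` should then report 2.11.6):

```shell
# Re-export the variables from the profile and confirm PATH picks them up.
export SCALA_HOME=/data/spark/scala-2.11.6
export PATH=$PATH:$SCALA_HOME/bin
case ":$PATH:" in
  *":$SCALA_HOME/bin:"*) echo "scala on PATH" ;;
  *)                     echo "scala missing from PATH" ;;
esac
```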
2. Passwordless SSH login to localhost
This is the same mutual-trust setup used for Hadoop.
First install openssh-server and openssh-client on the machine.
helight@helight-xu:~/.ssh$ ssh-keygen
Just press Enter at every prompt; don't type anything.
helight@helight-xu:~/.ssh$ ls
id_rsa id_rsa.pub known_hosts
helight@helight-xu:~/.ssh$ cat id_rsa.pub >authorized_keys
helight@helight-xu:~/.ssh$ ll
total 24
drwx------  2 helight helight 4096 Jun 8 15:06 ./
drwxr-xr-x 23 helight helight 4096 Jun 9 09:59 ../
-rw-------  1 helight helight  400 Jun 8 15:06 authorized_keys
-rw-------  1 helight helight 1679 Jun 8 15:06 id_rsa
-rw-r--r--  1 helight helight  400 Jun 8 15:06 id_rsa.pub
-rw-r--r--  1 helight helight  444 Jun 8 15:21 known_hosts
Set the permissions on authorized_keys to 600, as shown above. You may need to log out and back in once before passwordless login works:
helight@helight-xu:~/.ssh$ ssh localhost
Welcome to Ubuntu 15.04 (GNU/Linux 3.19.0-20-generic x86_64)

* Documentation: https://help.ubuntu.com/

Last login: Mon Jun 8 15:20:51 2015 from localhost
helight@helight-xu:~$
If you get a login like the one above without being prompted for a password, passwordless login to the local machine is working.
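If ssh localhost still asks for a password, the usual culprit is permissions: sshd refuses keys when ~/.ssh or authorized_keys is group/world accessible. The target modes can be sketched like this (demonstrated on a scratch directory rather than the real ~/.ssh):

```shell
# Demonstrate the modes sshd expects; apply the same chmods to ~/.ssh.
mkdir -p /tmp/ssh-perm-demo
touch /tmp/ssh-perm-demo/authorized_keys
chmod 700 /tmp/ssh-perm-demo                    # directory: owner only
chmod 600 /tmp/ssh-perm-demo/authorized_keys    # key file: owner read/write
stat -c '%a %n' /tmp/ssh-perm-demo/authorized_keys
```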
3. Spark startup configuration

3.1 Configure spark-env.sh

Copy spark-env.sh.template to spark-env.sh, and append the following at the end of the file:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
export SCALA_HOME=/data/spark/scala-2.11.6
export SPARK_MASTER_IP=helight-xu
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=512M

As you can see, both JAVA_HOME and SCALA_HOME are hooked in here.

Give spark-env.sh execute permission (777 works, though it is broader than necessary; 755 would be enough):

chmod 777 spark-env.sh

3.2 Configure slaves

Copy slaves.template to slaves and add the machine's hostname (an IP address should also work, though I haven't tried it):

# A Spark Worker will be started on each of the machines listed below.

# localhost
helight-xu
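The hostname listed in slaves has to resolve locally, or the worker startup script cannot reach the machine. On a single-box setup that usually means an /etc/hosts entry along these lines (hostname taken from this note; the exact 127.0.1.1 convention is the Debian/Ubuntu default and may differ elsewhere):

```
127.0.0.1   localhost
127.0.1.1   helight-xu
```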

3.3 Configure spark-defaults.conf

Copy spark-defaults.conf.template to spark-defaults.conf and uncomment the relevant entries (I don't yet know how to use the last one, spark.executor.extraJavaOptions; still to be investigated).

# Default system properties included when running spark-submit.
# This is useful for setting default environmental settings.

# Example:
spark.master            spark://helight-xu:7077
spark.executor.memory   512m
spark.eventLog.enabled  true
spark.eventLog.dir      /data/spark/spark-1.3.1-bin-hadoop2.6/logs/
spark.serializer        org.apache.spark.serializer.KryoSerializer
spark.driver.memory     512m
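One thing to watch with spark.eventLog.enabled: Spark does not create the event-log directory on its own, and a job will fail at startup if the directory named by spark.eventLog.dir is missing, so it should be created up front. A sketch, using a stand-in path (substitute the logs/ directory from the config above):

```shell
# spark.eventLog.dir must exist before the first job runs;
# a stand-in path is used here for illustration.
EVENTLOG_DIR=/tmp/spark-eventlog-demo
mkdir -p "$EVENTLOG_DIR"
ls -d "$EVENTLOG_DIR"
```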

3.4 Configure log4j.properties

Copy log4j.properties.template to log4j.properties; that's all there is to it. Its contents:

# Set everything to be logged to the console