How to install and run .NET Core 2.x with Spark on Ubuntu

This guide describes how to install Apache Spark and run it from .NET Core.
If you only want the commands, skip to the sections below.

Before installing Spark, Java must be installed; it is a prerequisite for every Hadoop installation. In this guide, all work is done under a user named hadoop.

.NET for Apache Spark does not currently support Spark 2.4.2.

The .NET Core 3.0 procedure will be covered later.

Install OpenJDK 8

sudo apt install openjdk-8-jdk

You can verify the installation with the java command. If multiple JDKs are installed and you want to select OpenJDK 8, run:

sudo update-alternatives --config java

Install Apache Maven

To install Apache Maven, download it, unpack it, and register it in your environment variables. Enter the following commands:

mkdir -p ~/bin/maven
cd ~/bin/maven
wget http://apache.tt.co.kr/maven/maven-3/3.6.2/binaries/apache-maven-3.6.2-bin.tar.gz
tar -xvzf apache-maven-3.6.2-bin.tar.gz
ln -s apache-maven-3.6.2 current
export M2_HOME=~/bin/maven/current
export PATH=$M2_HOME/bin:$PATH
source ~/.bashrc

If the mvn command runs, the installation succeeded.
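Note that export commands typed at a prompt only last for the current session, so the source ~/.bashrc step above only has an effect if the export lines are also appended to ~/.bashrc. A minimal sketch of persisting them (a throwaway temporary file stands in for the real ~/.bashrc, so this is safe to run anywhere):

```shell
# Sketch: persist the Maven environment variables across shell sessions.
# A temp file stands in for ~/.bashrc so nothing on the machine is touched.
RC_FILE="$(mktemp)"

cat >> "$RC_FILE" <<'EOF'
export M2_HOME="$HOME/bin/maven/current"
export PATH="$M2_HOME/bin:$PATH"
EOF

# Reload, the same way `source ~/.bashrc` would in a real shell.
. "$RC_FILE"
echo "M2_HOME=$M2_HOME"
rm -f "$RC_FILE"
```

To do this for real, append the same two export lines to ~/.bashrc and re-run source ~/.bashrc.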

Install Apache Spark

Now install Spark.

cd ~/bin/
wget http://apache.tt.co.kr/spark/spark-2.4.4/spark-2.4.4-bin-hadoop2.7.tgz
tar -xvzf spark-2.4.4-bin-hadoop2.7.tgz
export SPARK_HOME=~/bin/spark-2.4.4-bin-hadoop2.7
export PATH="$SPARK_HOME/bin:$PATH"
source ~/.bashrc

If the spark-shell command starts, Spark is installed correctly.
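Before moving on, it can be handy to confirm that every required tool is actually on PATH. A small sketch (the command list assumes the installs above; check_cmd is just a helper name used here):

```shell
# Sketch: confirm each required tool is reachable on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
    fi
}

for cmd in java mvn spark-shell spark-submit; do
    check_cmd "$cmd"
done
```

Any line reporting MISSING points back at the corresponding install step above.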

Spark .NET Build

Now let’s clone the .NET for Apache Spark repository and build it.

git clone https://github.com/dotnet/spark.git ~/dotnet.spark
cd ~/dotnet.spark/src/scala
mvn clean package

After the build, JAR files that support Spark execution appear under the src/scala subdirectories:

microsoft-spark-2.x.x/target/microsoft-spark-2.x.x-<version>.jar
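Since the branch directory and package version vary between releases, one way to locate the built JAR without hard-coding its name is find. A sketch (the clone path is the one used in this guide; adjust if yours differs):

```shell
# Sketch: locate the built microsoft-spark JAR without hard-coding its name.
# SRC_DIR defaults to the clone location used earlier in this guide.
SRC_DIR="${SRC_DIR:-$HOME/dotnet.spark/src/scala}"
if [ -d "$SRC_DIR" ]; then
    find "$SRC_DIR" -path '*/target/*' -name 'microsoft-spark-*.jar'
else
    echo "source tree not found at $SRC_DIR" >&2
fi
```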

.NET Core Install

Install the .NET Core SDK to build and run .NET Core programs.
You can choose from the following versions:

Version                      Status       Latest release   Latest release date  End of support
.NET Core 3.1                Preview      3.1.0-preview3   2019-11-14           -
.NET Core 3.0 (recommended)  Current      3.0.1            2019-11-19           -
.NET Core 2.2                Maintenance  2.2.8            2019-11-19           2019-12-23
.NET Core 2.1                LTS          2.1.14           2019-11-19           -

wget -q https://packages.microsoft.com/config/ubuntu/18.04/packages-microsoft-prod.deb -O packages-microsoft-prod.deb
sudo dpkg -i packages-microsoft-prod.deb
sudo add-apt-repository universe
sudo apt-get update
sudo apt-get install apt-transport-https
sudo apt-get update
sudo apt-get install dotnet-sdk-2.1

.NET Program Build

Let’s build and run the .NET core Spark Worker and Examples.

cd ~/dotnet.spark/src/csharp/Microsoft.Spark.Worker/
dotnet publish -f netcoreapp2.1 -r ubuntu.18.04-x64
cd ~/dotnet.spark/examples/Microsoft.Spark.CSharp.Examples/
dotnet publish -f netcoreapp2.1 -r ubuntu.18.04-x64

Once the build succeeds, you can run the program with spark-submit. The general form is:

spark-submit \
  [--jars <any-jars-your-app-is-dependent-on>] \
  --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master <ip/local> \
  <path-microsoft-spark-work-jar> \
  <path-.netcore-app-binary>

The actual command is shown below; the long paths make it a little hard to read.

spark-submit \
  --class org.apache.spark.deploy.dotnet.DotnetRunner \
  --master local \
  ~/dotnet.spark/src/scala/microsoft-spark-2.4.x/target/microsoft-spark-2.4.x-0.6.0.jar \
  ~/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/netcoreapp2.1/ubuntu.18.04-x64/publish/Microsoft.Spark.CSharp.Examples \
  Sql.Batch.Basic \
  /home/hadoop/bin/spark-2.4.4-bin-hadoop2.7/examples/src/main/resources/people.json
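If you run the example often, one way to keep the invocation manageable is to pull the long paths out into variables and build the command once. This is purely a readability sketch using the same paths as above (they will differ on your machine):

```shell
# Readability sketch: the guide's spark-submit call with paths in variables.
JAR="$HOME/dotnet.spark/src/scala/microsoft-spark-2.4.x/target/microsoft-spark-2.4.x-0.6.0.jar"
APP="$HOME/dotnet.spark/artifacts/bin/Microsoft.Spark.CSharp.Examples/Debug/netcoreapp2.1/ubuntu.18.04-x64/publish/Microsoft.Spark.CSharp.Examples"
DATA="$HOME/bin/spark-2.4.4-bin-hadoop2.7/examples/src/main/resources/people.json"

CMD="spark-submit --class org.apache.spark.deploy.dotnet.DotnetRunner --master local $JAR $APP Sql.Batch.Basic $DATA"
echo "$CMD"   # print as a dry run; execute it directly once the paths check out
```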
