Installing CDSW on HDP Cluster


Steps to install Cloudera Data Science Workbench (CDSW) on an HDP cluster

Install Prerequisites


Cluster requirements:

HDP-2.6.5 or HDP-3.1


Edge node requirements:

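  • Enable memory cgroups on your operating system. (Use RHEL 7.5, where they should be enabled by default.)
  • Disable swap for optimum stability and performance. The command below sets vm.swappiness to 1, which keeps swapping to a minimum without turning swap off, and the setting does not persist across reboots:
sudo sysctl -w vm.swappiness=1
  • Cloudera Data Science Workbench uses uid 8536 for an internal service account. Make sure that this user ID is not assigned to any other service or user account. The following command should print nothing:
getent passwd 8536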
  • JDK 8
  • Disable all pre-existing iptables rules.
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT
sudo iptables -t nat -F
sudo iptables -t mangle -F
sudo iptables -F
sudo iptables -X
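  • Disable SELinux or set it to permissive mode.
To disable it permanently (takes effect after a reboot):
vi /etc/selinux/config
Set "SELINUX=disabled"

(or)

To switch to permissive mode until the next reboot:
setenforce 0
  • Make sure no DNS server is running on port 53 of the CDSW machines (check by running lsof -i:53).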

  • Install bzip2:
yum -y install bzip2

  • Install Anaconda
  • /anaconda/anaconda2
    • Read the license agreement and type "yes" to accept it.
The installer then reports the installation path:
Anaconda2 will now be installed into this location:
/root/anaconda2
Note this path; the ANACONDA_DIR variable in the cdsw.conf file must be set to it later.
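
Some of the edge node requirements above can be checked with small shell helpers before moving on. The following is a minimal sketch, not part of CDSW: the function names are made up for illustration, each function takes the file to inspect as an optional argument so it can be tried against a copy, and the uid check only looks at local accounts in the passwd file.

```shell
#!/bin/sh
# Illustrative preflight helpers; the function names are not part of CDSW.

# Succeeds if the numeric UID ($1) is assigned in the passwd file
# ($2, default /etc/passwd). Only the UID field is compared, so 8536
# does not match 18536 or a GID of 8536.
uid_in_use() {
  awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }' \
    "${2:-/etc/passwd}"
}

# Prints the SELINUX= value (for example "disabled" or "permissive")
# from an SELinux config file ($1, default /etc/selinux/config).
# Comment lines and SELINUXTYPE= are ignored.
selinux_config_mode() {
  sed -n 's/^SELINUX=\([a-z]*\).*$/\1/p' "${1:-/etc/selinux/config}" | tail -n 1
}

# Prints the current swappiness value ($1 overrides the file to read).
current_swappiness() {
  cat "${1:-/proc/sys/vm/swappiness}"
}
```

For example, `uid_in_use 8536 && echo "uid 8536 is already taken"` prints a warning only when a local account already has that uid, and `selinux_config_mode` should print "disabled" or "permissive" on a prepared node.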


Install CDSW:



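  1. Add a new host to the cluster using Ambari. Go to the Hosts page and select Actions > + Add New Hosts.
  2. On the edge node, change to the yum repository directory:
cd /etc/yum.repos.d
  3. Import the Cloudera GPG key:
sudo rpm --import https://archive.cloudera.com/cdsw1/1.5.0/redhat7/yum/RPM-GPG-KEY-cloudera
  4. Install the CDSW package:
sudo yum install cloudera-data-science-workbench
  5. Edit the configuration file and set the values for your environment, for example:
vi /etc/cdsw/config/cdsw.conf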
DOMAIN="rraman-docker-1.openstacklocal"

MASTER_IP="172.26.75.53"

DOCKER_BLOCK_DEVICES="/dev/vdb"

APPLICATION_BLOCK_DEVICE=""

JAVA_HOME="/usr/jdk64/jdk1.8.0_112"

TLS_ENABLE="false"

TLS_CERT=""
TLS_KEY=""

HTTP_PROXY=""

HTTPS_PROXY=""

ALL_PROXY=""

NO_PROXY=""

NVIDIA_GPU_ENABLE=false

NVIDIA_LIBRARY_PATH=

DISTRO="HDP"

DISTRO_DIR=""

ANACONDA_DIR="/root/anaconda2"

  6. Initialize CDSW:
cdsw init
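
Before running cdsw init, it can help to confirm that the settings filled in above are not left empty. The sketch below is an assumption-laden helper, not a CDSW tool: the function name is hypothetical, and the list of "required" keys is simply the set of non-empty values in the example configuration above. It sources the file in a subshell, so it should only be pointed at a trusted cdsw.conf.

```shell
#!/bin/sh
# Prints the name of each expected setting that is missing or empty in a
# cdsw.conf-style file ($1). Returns 0 if all are set, 1 if any are
# missing, 2 if the file cannot be read.
# The key list is this sketch's assumption, taken from the example above.
missing_cdsw_settings() {
  (
    required="DOMAIN MASTER_IP DOCKER_BLOCK_DEVICES JAVA_HOME DISTRO ANACONDA_DIR"
    [ -r "$1" ] || exit 2
    # Clear inherited values (such as JAVA_HOME from the environment) so
    # only what the file itself sets is checked.
    unset $required
    # The file uses KEY="value" shell syntax, so it can be sourced.
    . "$1" || exit 2
    status=0
    for key in $required; do
      eval "value=\${$key:-}"
      if [ -z "$value" ]; then
        echo "$key"
        status=1
      fi
    done
    exit "$status"
  )
}
```

A possible use is `missing_cdsw_settings /etc/cdsw/config/cdsw.conf && sudo cdsw init`, which starts initialization only when none of the listed settings is empty.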


Open the CDSW URL in a browser.

Reference:

Deploying Cloudera Data Science Workbench 1.5.x on Hortonworks Data Platform


