How to update a custom stack version number in Ambari

If you have custom stacks installed in Ambari, for example HAProxy or other custom services, and you want to update the version number shown on the UI, you can follow the steps below.

Here is an example of updating the version in the UI for the Mahout service. You can do the same for your own custom service, such as HAProxy.

Take a backup:-
create table repo_version_bkp as select * from repo_version;

Query to check the data before the update:-


select replace(version_xml,'name="MAHOUT" version="0.9.0"','name="MAHOUT" version="0.9.1"') from repo_version where repo_version_id=1;


Check the output of this select SQL; it should show what the replaced value will be. Here I am replacing the string name="MAHOUT" version="0.9.0" in the column version_xml, and the replaced value will be name="MAHOUT" version="0.9.1".


Update:

Once you confirm the output looks correct, update the table:


update repo_version set version_xml=replace(version_xml,'name="MAHOUT" version="0.9.0"','name="MAHOUT" version="0.9.1"') where repo_version_id=1;


Note: use the correct repo_version_id number in the above query based on your environment.
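If your Ambari backend is the default embedded PostgreSQL database, the whole sequence might look like the following sketch; the database name, user, and repo_version_id here are assumptions to adapt to your environment:

# Hedged sketch: assumes the default embedded PostgreSQL backend (database/user "ambari")
psql -U ambari -d ambari <<'SQL'
-- back up the table first
create table repo_version_bkp as select * from repo_version;
-- preview the replacement
select replace(version_xml,'name="MAHOUT" version="0.9.0"','name="MAHOUT" version="0.9.1"')
  from repo_version where repo_version_id=1;
-- apply it
update repo_version set version_xml=replace(version_xml,'name="MAHOUT" version="0.9.0"','name="MAHOUT" version="0.9.1"')
  where repo_version_id=1;
SQL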

After the table change, restart the Ambari server (ambari-server restart).

Installing CDSW on HDP Cluster


Steps to install Cloudera Data Science Workbench (CDSW) with HDP

Install prerequisites


Cluster Requirement:-

HDP-2.6.5 or HDP-3.1


Edge node requirements:

  • Enable memory cgroups on your operating system (on RHEL 7.5 they should be enabled by default).
  • Disable swap for optimum stability and performance:
sudo sysctl -w vm.swappiness=1
(add vm.swappiness=1 to /etc/sysctl.conf to persist the setting across reboots)
  • Cloudera Data Science Workbench uses uid 8536 for an internal service account. Make sure that this user ID is not assigned to any other service or user account.
cat /etc/passwd |grep -i 8536
  • JDK 8
  • Disable all pre-existing iptables rules.
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT
sudo iptables -t nat -F
sudo iptables -t mangle -F
sudo iptables -F
sudo iptables -X
  • Disable SELinux, or set it to permissive mode.
To disable:-
vi /etc/selinux/config
Change to "SELINUX=disabled" (takes effect after a reboot)

(or, for the current boot only)

setenforce 0
  • No DNS server running on port 53 on the CDSW machines (check by running lsof -i:53).

  • yum -y install bzip2  

  • Install Anaconda2. Run the installer, read the license agreement, and type "yes" to accept it. Note the install location the installer reports, for example:
Anaconda2 will now be installed into this location:
/root/anaconda2
We will need this path later to set the ANACONDA_DIR variable in cdsw.conf.
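A hedged sketch of the installer run (the Anaconda2 release and URL below are illustrative; pick the version you need from the Anaconda archive):

# Download and run an Anaconda2 installer (example release; adjust as needed)
wget https://repo.anaconda.com/archive/Anaconda2-5.3.1-Linux-x86_64.sh
sudo bash Anaconda2-5.3.1-Linux-x86_64.sh
# Accept the license ("yes") and note the install prefix the installer prints
# (e.g. /root/anaconda2); ANACONDA_DIR in cdsw.conf must point to it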


Install CDSW:



  1. Add a new host to the cluster using Ambari. Go to the Hosts page and select Actions > + Add New Hosts.
  2. On the edge node, change to the yum repo directory and download the CDSW repo file (the URL below mirrors the GPG key path; adjust it for your CDSW version):
cd /etc/yum.repos.d
sudo wget https://archive.cloudera.com/cdsw1/1.5.0/redhat7/yum/cloudera-cdsw.repo
  3. sudo rpm --import https://archive.cloudera.com/cdsw1/1.5.0/redhat7/yum/RPM-GPG-KEY-cloudera
  4. sudo yum install cloudera-data-science-workbench
  5. vi /etc/cdsw/config/cdsw.conf
DOMAIN="rraman-docker-1.openstacklocal"
MASTER_IP="172.26.75.53"
DOCKER_BLOCK_DEVICES="/dev/vdb"
APPLICATION_BLOCK_DEVICE=""
JAVA_HOME="/usr/jdk64/jdk1.8.0_112"
TLS_ENABLE="false"
TLS_CERT=""
TLS_KEY=""
HTTP_PROXY=""
HTTPS_PROXY=""
ALL_PROXY=""
NO_PROXY=""
NVIDIA_GPU_ENABLE=false
NVIDIA_LIBRARY_PATH=
DISTRO="HDP"
DISTRO_DIR=""
ANACONDA_DIR="/root/anaconda2"

  6. cdsw init
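Once the init finishes, a quick sanity check before opening the UI (cdsw status is part of the CDSW CLI):

cdsw status
# all pods and services should report healthy before you proceed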


Hit the browser URL:-

Once cdsw init completes, open http://<DOMAIN> (the DOMAIN value from cdsw.conf, here rraman-docker-1.openstacklocal) in a browser and sign up. The first account created becomes the site administrator.
Reference:

Deploying Cloudera Data Science Workbench 1.5.x on Hortonworks Data Platform (Cloudera documentation)


How to rename an existing HDP cluster

To rename an HDP cluster:-

1. Rename the cluster through Manage Ambari --> Rename Cluster.

2. Rename the existing Ranger repositories for the services where the Ranger plugin is enabled. You can do this in the Ranger admin UI, or via the REST API (see the sketch after these steps).

for example:-

Old cluster name is ClusterA and the renamed cluster name ClusterA_NEW

The HDFS Ranger repository name will look like "ClusterA_hadoop"; it needs to be renamed to "ClusterA_NEW_hadoop" to match the new cluster name.

Repeat this for all the service repositories.

3. Restart the relevant services where the changes are made.
for example: HDFS, YARN, Hive, Atlas, Kafka, etc.

4. (Optional) The principals/keytabs can still function with the old cluster name in them. To avoid confusion, if you need all the service principals to carry the new cluster name, you can regenerate the keytabs through Ambari. If you do this, you can skip the restart in step 3 and restart after the new keytabs are generated.
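For step 2, if you prefer to rename the Ranger repositories from the command line, the Ranger admin REST API can do it. A minimal sketch, assuming a Ranger admin at rangerhost.example.com:6080 and a service id of 1 (host, credentials, and id are placeholders to adapt):

# List the repositories (services) and note the id and JSON of "ClusterA_hadoop"
curl -u admin:admin http://rangerhost.example.com:6080/service/public/v2/api/service

# Save the service JSON, change "name" to "ClusterA_NEW_hadoop", then PUT it back by id
curl -u admin:admin -H "Content-Type: application/json" -X PUT \
  -d @updated_service.json \
  http://rangerhost.example.com:6080/service/public/v2/api/service/1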

How to Fix Ranger Usersync Failure on Your HDP / CDP Cluster

Problem:-

If you're setting up a cluster and experiencing issues with Ranger usersync, you may encounter error messages in the /var/log/ranger/usersync/usersync.log file. Specifically, you might see errors like the following:

11 Feb 2022 15:15:46 ERROR CustomSSLSocketFactory [UnixUserSyncThread] - Unable to obtain keystore from file [/usr/hdp/current/ranger-usersync/conf/mytruststore.jks]

11 Feb 2022 15:15:46 ERROR UserGroupSync [UnixUserSyncThread] - Failed to initialize UserGroup source/sink. Will retry after 3600000 milliseconds. Error details: javax.naming.CommunicationException: adhost1.example.com:636 [Root exception is java.lang.NullPointerException]

These errors indicate that usersync cannot open its truststore when connecting to AD over LDAPS. The fix is to extract the Active Directory (AD) certificate, import it into the Ranger usersync truststore, and then update the truststore password through Ambari. By following these steps, you can get Ranger usersync up and running smoothly.


Solution: How to Fix Ranger Usersync Failure on Your Cluster

To resolve this issue, follow these simple steps:

Step 1: Extract the AD cert

To extract the AD cert, use the following command:

echo -n | openssl s_client -connect adhost1.example.com:636 | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > /tmp/ad_cert.cert

Step 2: Import the extracted cert into the Ranger usersync truststore

To import the extracted cert, use the following command:

keytool -import -trustcacerts -alias AD_cert -keystore /usr/hdp/current/ranger-usersync/conf/mytruststore.jks -file /tmp/ad_cert.cert

keytool will prompt for a keystore password; choose the password you want to set for this truststore and note it for step 3.
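Optionally, verify the cert landed in the truststore (you will be prompted for the password you just set):

keytool -list -keystore /usr/hdp/current/ranger-usersync/conf/mytruststore.jks | grep -i ad_cert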

Step 3: Update the Ranger usersync truststore password

To update the password for the Ranger usersync truststore, follow these steps:

  1. Go to Ambari.
  2. Navigate to Ranger --> Configs --> Advanced --> Advanced ranger-ugsync-site --> ranger.usersync.truststore.password.
  3. Update the password.

By following these simple steps, you should be able to fix the Ranger usersync failure on your cluster.

How to install LocalStack - to use S3 APIs on an on-prem cluster

LocalStack install steps:-


curl -sL https://rpm.nodesource.com/setup_10.x | bash -
yum -y install python-pip python-devel gcc gcc+ nodejs maven lsof wget
pip install --upgrade pip
pip install localstack awscli-local


These steps install all the relevant binaries required by LocalStack.
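A quick usage sketch once installed (awslocal comes from the awscli-local package above; the S3 edge port depends on the LocalStack version, 4566 on recent releases):

# Start LocalStack in one shell
localstack start
# In another shell, exercise the S3 API locally
awslocal s3 mb s3://test-bucket
awslocal s3 ls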

Ambari - How to persist the KDC admin credential and regenerate keytabs using an API call


Step 1:

ambari-server setup-security

Choose option [2] "Encrypt passwords stored in ambari.properties file", then set the master key password you want.

Step 2:

curl -H "X-Requested-By:ambari" -u admin:admin -X  POST -d '{ "Credential" : { "principal" : "admin@EXAMPLE.COM", "key" : "paswwd12345", "type" : "persisted" } }' http://ambari-node1.example.com:8080/api/v1/clusters/hdp_cluster1/credentials/kdc.admin.credential

--> Update the principal and key with your actual KDC admin principal name and password.
--> Update admin:admin with your Ambari admin user ID and password.
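As a quick check that the credential was stored as persisted, you can read it back; Ambari should return the credential type but not the key itself:

curl -H "X-Requested-By:ambari" -u admin:admin http://ambari-node1.example.com:8080/api/v1/clusters/hdp_cluster1/credentials/kdc.admin.credential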


Step 3:

 curl -H "X-Requested-By:ambari" -u admin:admin -X PUT -d '{ "Clusters": { "security_type" : "KERBEROS" } }' http://ambari-node1.example.com:8080/api/v1/clusters/hdp_cluster1/?regenerate_keytabs=ALL

OS commands


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
To watch CPU core interrupts:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
/usr/bin/watch -d 'cat /proc/interrupts'

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
To increase WAN transfers, increase txqueuelen in eth interface:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
“For WAN transfers, it was discovered that a setting of 2,000 for the txqueuelen is sufficient to prevent any send stalls from occurring.” The default value for txqueuelen is 1000. I have successfully tested a value of 2500. Type the following command to change it:

Example:

ifconfig eth4 txqueuelen 2500

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# To run the application or commands to a specific core "numactl" 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#Get the CPU physical core info from the file below and use the core id in numactl.
less /proc/cpuinfo

numactl --physcpubind=1

Ex:  numactl --physcpubind=1 top

This will run the top command using physical cpu core id "1"
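To see the available NUMA nodes and the CPUs attached to each before pinning:

numactl --hardware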
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#To Check the ethernet interface driver version:
ethtool -i eth0
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#To get the inode number for a group of files in a folder
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
cd /test
stat * |grep -i -E 'File:'\|'Inode' | awk '{ if($3 == "Inode:") print "   "$3" "$4; else print $0 }'

#Good Format:
stat * |grep -i -E 'File:'\|'Inode' | awk '{ if($3 == "Inode:") print "\t"$3" "$4; else print $1 $2 }' |awk 'NR%2{printf "%s ",$0;next;}1'
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#To process a file line by line in a while loop (example):

filename=/tmp/file1.txt
while read -r line
do
    name="$line"
    echo "Name read from file - $name"
done < "$filename"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#To extract the logs between two timestamps 

Example:

sort hiveserver2.log |sed -n '/2017-04-26 20:40:44/,/2017-04-26 20:43:44/p' >> extracted_file.log

or

sed -n '/2017-04-26 20:40:44/,/2017-04-26 20:43:44/p' hiveserver2.log  >> extracted_file.log

cat hiveserver2_knox_25_april1 |grep -a ''  |sort  |sed -n '/2017-04-25 20:/,/2017-04-25 21:/p' > hiveserver2_8:30PM_9:30PM
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

##Monitor I/O using SAR command:

sar -d -p 1

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Mac OS : converting delimiter $ to ','

perl -pi -w -e 's/\$/,/g;' test1.txt

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
HiveServer2 (HS2) open-files limit and open files grouped by PID:-

grep 'open files' /proc/$(ps aux | grep -i "hiveserver2"|grep -v 'grep'|  awk '{print $2}')/limits

lsof -u hive |awk '{print $2}' |uniq

lsof -u hive |awk '{print $2}' |sort |uniq -c
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
