The Laniakea team has completed several Big Data projects in the past. This project was done for a client with specific requirements. Please reach out to us with any further queries.

Create Directory Service in AWS.

Please use the below parameters for the SimpleAD configuration.

Password – Welcome1234

You can create the access URL; however, it is not necessary.

You can also configure applications with this Directory.
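If you prefer the CLI to the console, the SimpleAD directory can also be created like this; a sketch in which the VPC and subnet IDs are placeholders, while the directory name and password match the parameters used elsewhere in this walkthrough:

```shell
# Sketch: create the SimpleAD directory from the AWS CLI instead of the console.
# The name and password match this walkthrough's parameters; the VPC and
# subnet IDs are placeholders for your environment.
aws ds create-directory \
  --name corp.emr.local \
  --password Welcome1234 \
  --size Small \
  --vpc-settings VpcId=vpc-xxxxxxxx,SubnetIds=subnet-xxxxxxxx,subnet-yyyyyyyy
```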

Ranger Install.

Launch an EC2 instance.


Launch an instance with the below configuration.

Download the Key Pair.
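For repeatability, the launch can also be scripted; a sketch in which the AMI, subnet, and security group IDs are placeholders, and the key name matches the one used in the EMR cluster command further below:

```shell
# Sketch: create the key pair and launch the Ranger EC2 instance via the CLI.
# The AMI, subnet, and security group IDs are placeholders for your
# environment; the key name matches the KeyName used in the create-cluster
# command later in this document.
aws ec2 create-key-pair --key-name emr-amz \
  --query 'KeyMaterial' --output text > emr-amz.pem
chmod 400 emr-amz.pem
aws ec2 run-instances --image-id ami-xxxxxxxx --instance-type m4.large \
  --key-name emr-amz --subnet-id subnet-xxxxxxxx \
  --security-group-ids sg-xxxxxxxx
```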

I tried to create the whole stack with the link provided in the blog; however, it didn't work. It created only the RangerServer, and most of the other tasks either failed or stayed in CREATE_IN_PROGRESS.

The above screen will not show all of the tasks as completed.

Now, log in to the RangerServer using the credentials.

Get the HDP 2.4 repository via the below command.

wget -nv http://public-repo-1.hortonworks.com/HDP/centos6/2.x/updates/2.4.2.0/hdp.repo -O /etc/yum.repos.d/hdp.repo

The below command is required for yum to work.

yum clean all

Install ranger-admin 

yum install ranger-admin

Install Maven

wget http://mirrors.advancedhosters.com/apache/maven/maven-3/3.5.3/binaries/apache-maven-3.5.3-bin.tar.gz

Extract the Maven tar file.

tar -xvf apache-maven-3.5.3-bin.tar.gz

Set environment variables

export M2_HOME=/usr/local/apache-maven-3.5.3

export M2=$M2_HOME/bin

export PATH=$M2:$PATH

Verify the installation:

mvn -version
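To make those exports survive new shells, they can be written to a profile script; a minimal sketch, assuming Maven was extracted to /usr/local/apache-maven-3.5.3 as above (the /etc/profile.d path is a common convention, not something the original procedure specifies):

```shell
# Sketch: persist the Maven environment variables from the exports above.
# M2_HOME matches the extraction location used earlier; adjust if different.
M2_HOME=/usr/local/apache-maven-3.5.3
PROFILE="export M2_HOME=$M2_HOME
export M2=\$M2_HOME/bin
export PATH=\$M2:\$PATH"
# Review the generated script, then install it with:
#   echo "$PROFILE" | sudo tee /etc/profile.d/maven.sh
echo "$PROFILE"
```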

Install git

yum install git

Install GCC

yum install gcc

Install MySQL

rpm -Uvh /home/ec2-user/mysql-community-release-el6-5.noarch.rpm

yum install mysql-community-server

Start MySQL service

service mysqld start

Build Ranger admin source

mkdir -p ~/dev && cd ~/dev

git clone https://github.com/apache/incubator-ranger.git

Under the target directory, there should be several .tar.gz files, including the Ranger admin tar.
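The build step that populates the target directory isn't shown above; a sketch, where the branch name is an assumption chosen to match the ranger-0.5.0 admin tarball installed in the next step:

```shell
# Sketch: build Ranger from the cloned source; this is what produces the
# tar files under target/. The branch name is an assumption matching the
# ranger-0.5.0 admin tarball used below.
cd ~/dev/incubator-ranger
git checkout ranger-0.5
mvn clean package -DskipTests
ls target/*.tar.gz
```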

 

You need to install the Ranger policy admin using the below procedure.

 

cd /usr/local

tar zxvf ~/dev/incubator-ranger/target/ranger-0.5.0-admin.tar.gz

ln -s ranger-0.5.0-admin ranger-admin

cd /usr/local/ranger-admin

Update the install.properties file with the below values.

db_root_user=root

db_root_password=root

db_host=localhost

db_name=ranger

db_user=rangeradmin

db_password=rangeradmin

audit_db_name=ranger

audit_db_user=rangerlogger

audit_db_password=rangerlogger
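Editing those keys by hand is error-prone; a small sketch of scripting the edits with sed, demonstrated on a scratch copy (the helper function and scratch path are mine, not part of the original procedure — point it at /usr/local/ranger-admin/install.properties for the real run):

```shell
# Sketch: set "KEY=VALUE" properties in install.properties with sed.
update_prop() {
  # usage: update_prop KEY VALUE FILE - replaces the line "KEY=..." with "KEY=VALUE"
  sed -i "s|^$1=.*|$1=$2|" "$3"
}

# Demonstrate on a scratch file so nothing real is overwritten by accident:
printf 'db_root_user=\ndb_name=mysql\n' > /tmp/install.properties
update_prop db_root_user root /tmp/install.properties
update_prop db_name ranger /tmp/install.properties
grep '^db_' /tmp/install.properties
```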

 

Start the setup.

./setup.sh

Start ranger admin

ranger-admin start

Log in to MySQL and change the root password.

 

 

mysql -u root -p
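The password-change statement itself isn't shown; a sketch using the SET PASSWORD syntax of the MySQL 5.6 community release installed earlier, with the password that the verification step below logs in with:

```shell
# Sketch: the statement to run inside the mysql prompt (or via -e) to change
# the root password. '$r53dfftR' is the literal password used in the
# verification step; the backslash keeps the shell from expanding the $.
SQL="SET PASSWORD FOR 'root'@'localhost' = PASSWORD('\$r53dfftR'); FLUSH PRIVILEGES;"
# Review, then apply with: mysql -u root -p -e "$SQL"
echo "$SQL"
```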

 

Verify the login credentials by logging in to MySQL.

mysql -u root -p'$r53dfftR'

Download Hive. You can choose to install it on the ranger-admin server; otherwise it will be installed on the EMR servers.

cd dev/

wget http://apache.claz.org/hive/hive-2.3.3/apache-hive-2.3.3-bin.tar.gz

tar -xzf apache-hive-2.3.3-bin.tar.gz

Create S3 bucket which will be used to store logs.

Please configure the AWS CLI on your laptop so that the below commands work.

This command will create default Roles and Profiles.
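The exact commands are not shown above; a sketch, assuming the standard `aws emr create-default-roles` is what is meant, plus creation of the log bucket:

```shell
# Sketch: create the log bucket and the default EMR roles/instance profiles.
# The bucket name matches the s3n://emr-local-log log URI used in the
# create-cluster command below; change it if you use a different bucket.
aws s3 mb s3://emr-local-log --region us-east-1
# Creates EMR_DefaultRole and EMR_EC2_DefaultRole, the defaults referenced later:
aws emr create-default-roles
```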

Create Role – EC2

Launch an EC2 Windows machine for the SimpleAD configuration. Make sure to select the Domain Join Directory option below.

Create a Windows EC2 instance and configure it with the corp.emr.local SimpleAD. Follow the below steps to log in to this EC2 instance.

Install AD tools on the Windows server (Programs and Features).

Log in with the SimpleAD administrator password – corp\administrator.

Create the user Analyst1 using the AD tools.

Log in to Hue using this user.

Create the EMR cluster using the below command. Make sure to go through all the options and change them based on your environment. Some of the locations below, such as s3://aws-bigdata-blog, shouldn't be changed. The InstanceProfile and service-role shouldn't be changed either; they are defaults.

aws emr create-cluster --applications Name=Hive Name=Spark Name=Hue \
  --tags 'Name=EMR-Security' \
  --ec2-attributes '{"KeyName":"emr-amz","InstanceProfile":"EMR_EC2_DefaultRole","SubnetId":"subnet-0bcd2b56","EmrManagedSlaveSecurityGroup":"sg-458d2e0e","EmrManagedMasterSecurityGroup":"sg-828a29c9"}' \
  --release-label emr-5.0.0 \
  --log-uri 's3n://emr-local-log/emrlog/' \
  --steps '[
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/createHiveTables.sh","us-east-2"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://emr-local-log/emrlog/script-runner.jar","Properties":"","Name":"CreateHiveTables"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/createHiveTables.sh","us-east-2"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3a://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"CreateHiveTables"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/createHiveTables.sh","us-east-2"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"CreateHiveTables"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/createHiveTables.sh","us-east-2"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"","Name":"CreateHiveTables"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/createHiveTables.sh","us-east-2"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"=","Name":"CreateHiveTables"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/loadDataIntoHDFS.sh","us-east-1"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"=","Name":"LoadHDFSData"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/install-hive-hdfs-ranger-plugin.sh","34.227.84.73","0.6","s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"=","Name":"InstallRangerPlugin"},
    {"Args":["spark-submit","--deploy-mode","cluster","--class","org.apache.spark.examples.SparkPi","/usr/lib/spark/examples/jars/spark-examples.jar","10"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"command-runner.jar","Properties":"=","Name":"SparkStep"},
    {"Args":["/mnt/tmp/aws-blog-emr-ranger/scripts/emr-steps/install-hive-hdfs-ranger-policies.sh","34.227.84.73","s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger/inputdata"],"Type":"CUSTOM_JAR","ActionOnFailure":"CONTINUE","Jar":"s3://elasticmapreduce/libs/script-runner/script-runner.jar","Properties":"=","Name":"InstallRangerPolicies"}
  ]' \
  --instance-groups '[{"InstanceCount":0,"InstanceGroupType":"TASK","InstanceType":"c1.medium","Name":"Task"},{"InstanceCount":1,"InstanceGroupType":"CORE","InstanceType":"m3.2xlarge","Name":"CORE"},{"InstanceCount":1,"InstanceGroupType":"MASTER","InstanceType":"m3.2xlarge","Name":"MASTER"}]' \
  --configurations '[{"Classification":"hue-ini","Properties":{},"Configurations":[{"Classification":"desktop","Properties":{},"Configurations":[{"Classification":"auth","Properties":{"backend":"desktop.auth.backend.LdapBackend"},"Configurations":[]},{"Classification":"ldap","Properties":{"bind_dn":"binduser","trace_level":"0","search_bind_authentication":"false","debug":"true","base_dn":"dc=corp,dc=emr,dc=local","bind_password":"Welcome1234","ignore_username_case":"true","create_users_on_login":"true","ldap_username_pattern":"uid=usertest1,cn=users,dc=corp,dc=emr,dc=local","force_username_lowercase":"true","ldap_url":"ldap://172.31.84.239","nt_domain":"corp.emr.local"},"Configurations":[{"Classification":"groups","Properties":{"group_filter":"objectclass=*","group_name_attr":"cn"},"Configurations":[]},{"Classification":"users","Properties":{"user_name_attr":"sAMAccountName","user_filter":"objectclass=*"},"Configurations":[]}]}]}]}]' \
  --bootstrap-actions '[{"Path":"s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger/scripts/download-scripts.sh","Args":["s3://aws-bigdata-blog/artifacts/aws-blog-emr-ranger"],"Name":"Download scripts"}]' \
  --service-role EMR_DefaultRole \
  --name 'EMRRangerTest' \
  --scale-down-behavior TERMINATE_AT_TASK_COMPLETION \
  --region us-east-1

Install Atlas-metadata.

yum install atlas-metadata

Create client.properties