The Integrator sandbox virtual machine comes with the following applications pre-installed:
Integrator Hadoop Agent
Apache Hadoop with Spark
Please note that this is not an ideal configuration for production use and is intended for testing purposes only.
Oracle VirtualBox: Download and install the latest version of Oracle VirtualBox from https://www.virtualbox.org/
Integrator Sandbox Image for VirtualBox: Download the latest sandbox image from https://s3.amazonaws.com/Integrator/ELTM_HD_SBX_VBOX.ova
Select [System] on the left panel
On the [Motherboard] tab, set the base memory to 8192 MB or higher
Select [Network] on the left panel
On the [Adapter 1] tab, check [Enable Network Adapter]
Set Attached to: [Bridged Adapter]
Set Name: <Your network adapter name>
The adapter name is the name of the network adapter your computer uses to connect to the network (internet). If you are on a wireless network, select the appropriate wireless device name.
Set MAC Address: 0800271FB789 (Important!)
Check [Cable Connected]
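The same settings can also be applied from the host command line with VBoxManage instead of the GUI. The following is a minimal sketch, not part of the official setup: the VM name ELTM_HD_SBX_VBOX and the host adapter name en0 are assumptions (use the name shown in VirtualBox Manager and your own adapter name), and the option spellings may differ between VirtualBox versions, so check the VBoxManage documentation for your release.

```shell
#!/bin/sh
# Sketch: apply the sandbox VM settings with VBoxManage.
# The VM must be powered off when modifyvm is run.
# $1 = VM name (assumption), $2 = host network adapter name (assumption).
# Set DRY_RUN=1 to print the command instead of executing it.
configure_sandbox_vm() {
    ${DRY_RUN:+echo} VBoxManage modifyvm "$1" \
        --memory 8192 \
        --nic1 bridged \
        --bridgeadapter1 "$2" \
        --macaddress1 0800271FB789 \
        --cableconnected1 on
}

# Dry run with placeholder names; remove DRY_RUN=1 to apply the settings.
DRY_RUN=1 configure_sandbox_vm "ELTM_HD_SBX_VBOX" "en0"
```

Running the dry run first makes it easy to confirm the adapter name and MAC address before the VM configuration is changed.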
On Virtual Machine Console
Note: Click anywhere on the console to start using it. Press the Right-Ctrl key to get mouse control back.
Use the following credentials to log in:
User Name: hadoop
Note down the IP address shown on the console, for example 192.168.1.22 (your server's IP address may differ from the one shown). The address may or may not change after a reboot; if it changes, note down the new address.
We will refer to this IP address as SANDBOX_IP_ADDRESS for the remainder of the documentation.
It may take a few minutes after the server has started for all applications to initialize. A log file is updated as the services come up during startup. For debugging purposes, you can check the contents of the log file /tmp/sandbox.log
The last line of the log file should contain "********* Spark Submit [Found]"
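The readiness check described above can be scripted on the sandbox itself. This is a minimal sketch: the function name sandbox_ready is hypothetical, and it assumes only what is stated here, namely that the last line of /tmp/sandbox.log contains the text "Spark Submit [Found]" once startup has finished.

```shell
#!/bin/sh
# Sketch: succeed (exit status 0) when the last line of the sandbox log
# contains the readiness marker. $1 = log file (defaults to /tmp/sandbox.log).
sandbox_ready() {
    log_file=${1:-/tmp/sandbox.log}
    # A missing or unreadable log means the services have not started yet.
    [ -r "$log_file" ] || return 1
    # -F treats the marker as a fixed string, so the brackets are literal.
    tail -n 1 "$log_file" | grep -qF 'Spark Submit [Found]'
}

if sandbox_ready; then
    echo "Sandbox services are up"
else
    echo "Sandbox is still starting, or the log file is missing"
fi
```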
User Name: Integrator
Click [Login] to start using the Integrator Client
| User Name | Password |
|-----------|------------|
| root | welcome123 |
| hadoop | welcome123 |
You can use an SSH client such as PuTTY to connect to SANDBOX_IP_ADDRESS using the above credentials
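From a machine with a command-line OpenSSH client, the connection can be sketched as follows. The helper name sandbox_ssh_target is hypothetical, and 192.168.1.22 is only the example address used in this guide; substitute the address you noted from the console.

```shell
#!/bin/sh
# Sketch: build the user@host target for an SSH login to the sandbox.
# SANDBOX_IP_ADDRESS defaults to the example address from this guide.
SANDBOX_IP_ADDRESS=${SANDBOX_IP_ADDRESS:-192.168.1.22}

# $1 = sandbox user (defaults to hadoop).
sandbox_ssh_target() {
    printf '%s@%s\n' "${1:-hadoop}" "$SANDBOX_IP_ADDRESS"
}

# Print the command rather than connecting, so the sketch runs anywhere.
echo "ssh $(sandbox_ssh_target hadoop)"
# To actually connect as root: ssh "$(sandbox_ssh_target root)"
```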
| http://SANDBOX_IP_ADDRESS:50090/ | Secondary Name Node |
Log in over SSH as user [root]
Run the following commands:
su - hadoop -c "/home/hadoop/scripts/stop_services.sh"
Integrator and Hadoop related services are set to start automatically when the sandbox virtual machine reboots. It may take a few minutes for all services to come online after the server has started.
This database comes pre-installed on the sandbox PostgreSQL database system and is sourced from http://www.postgresqltutorial.com/postgresql-sample-database/
The ER model for this database is available at http://www.postgresqltutorial.com/wp-content/uploads/2018/03/printable-postgresql-sample-database-diagram.pdf
A pre-designed workflow called LOAD_ACTORS copies the table [actors] from PostgreSQL into Hadoop.
A pre-configured onstage JDBC connection, DVD_RENTAL_DATABASE_LOCAL, is available for loading data into Hadoop.