I am always trying to read about and practice with topics that I want to learn or that catch my attention. I am currently taking some specialization courses on Coursera. You should never give up and stop learning. In this post I will describe how I installed Hadoop and the issues I encountered.
The goal was to install and run Hadoop on a Windows 10 machine. The Hadoop ecosystem is quite extensive. The set of tools to be installed came from Cloudera, which provides a set of virtual machine images to choose from. A virtual machine image needs a virtualization product to run it, in this case VirtualBox.
The suggested steps:
- Download the proper VirtualBox software.
- Install VirtualBox software.
- Download the proper VM image from Cloudera.
- Unzip the VM image.
- Start the VM image using the VirtualBox software.
Using Google Chrome I went to https://www.virtualbox.org/wiki/Downloads and followed the VirtualBox 5.1 builds link. I selected VirtualBox 5.1.38 for Windows hosts and downloaded the installer to c:\temp\VirtualBox-5.1.38-122592-Win.exe. Step 1 completed.
I recalled that I had installed VirtualBox and used it for work a few years ago. In general it is not a good idea to install a newer version of a product over an existing one, so I uninstalled the older 4.x version of VirtualBox. All went well. After uninstalling software on any computer I like to restart the system. I did that and was ready to install VirtualBox on my Windows machine.
I double-clicked the installer that was downloaded in the previous step and accepted all the defaults. I do not recall whether the installer asked to restart my computer, but I always like to restart after a software installation. The machine came up, so at this point we are done with step 2 and ready to move on to step 3.
I downloaded the zipped archive found at the following link to c:\temp: https://downloads.cloudera.com/demo_vm/virtualbox/cloudera-quickstart-vm-5.4.2-0-virtualbox.zip. The file is somewhat large, so it took a couple of minutes to get this task done. So far so good; we are done with the third step.
To unzip files I use 7-Zip. I extracted the VM image to C:\Cloudera\cloudera-quickstart-vm-5.4.2-0-virtualbox on my Windows computer. This completed the fourth step. No issues so far.
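If you would rather script the extraction than use 7-Zip, the following is a minimal sketch using Python's standard `zipfile` module. It is not what I did; the function name `extract_vm_archive` is made up for illustration, and I am assuming the archive is a regular zip file that `zipfile` can read.

```python
import zipfile
from pathlib import Path


def extract_vm_archive(archive: Path, destination: Path) -> list:
    """Extract every member of `archive` into `destination`.

    Returns the list of member names found in the archive.
    """
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        # testzip() reads every member and returns the name of the first
        # one with a bad CRC, or None when all members check out.
        bad_member = zf.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
        zf.extractall(destination)
        return zf.namelist()
```

With the paths from this post, the call would look something like `extract_vm_archive(Path(r"c:\temp\cloudera-quickstart-vm-5.4.2-0-virtualbox.zip"), Path(r"C:\Cloudera"))`. Whether the archive already contains a top-level folder is something I did not verify, so adjust the destination as needed.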
For the fifth and last step, I only had to start the Cloudera image in VirtualBox. I double-clicked the VirtualBox icon on my taskbar and the software came up. I then selected the Cloudera image and clicked on the Start icon. As always, Murphy showed up. A pop-up window with the following message appeared:
Google search: Result Code: E_FAIL (0x80004005) Component: ConsoleWrap Interface: IConsole {872da645-4a9b-1727-bee2-5585105b9eed}
When I run into issues I select what I think is the relevant part of the message, paste it into Chrome and start looking at the suggestions. To keep it short I will not go over each of the links I found and read. Based on experience I decided to pursue the following approaches:
- Verify that virtualization at the BIOS level was enabled.
- Remove any other software that might be interfering with the launch.
- Check and adjust the settings in the VirtualBox software.
You can check if virtualization is enabled on Windows by selecting Task Manager -> Performance and looking for the following line under the CPU utilization graph: "Virtualization: Enabled". In my case it was enabled. Once again, based on experience, I decided to go into the BIOS and make sure that virtualization was ON. After a couple of reboots I confirmed that it was enabled in the BIOS.
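The same check can be done from a script. The sketch below is an assumption on my part, not something I ran for this post: it parses the output of the Windows `systeminfo` command and looks for the "Virtualization Enabled In Firmware" line that English-language Windows prints under "Hyper-V Requirements". The function names are hypothetical. When a hypervisor is already running, `systeminfo` does not print that line, so the function returns `None` in that case.

```python
import subprocess
from typing import Optional

FIRMWARE_LABEL = "Virtualization Enabled In Firmware:"  # English-locale label


def firmware_virtualization_enabled(systeminfo_text: str) -> Optional[bool]:
    """Return True or False based on the firmware virtualization line.

    Returns None when the line is absent, for example when a hypervisor
    is already running and systeminfo hides the individual requirements.
    """
    for line in systeminfo_text.splitlines():
        _, found, value = line.partition(FIRMWARE_LABEL)
        if found:
            return value.strip().lower() == "yes"
    return None


def read_systeminfo() -> str:
    """Run `systeminfo` (Windows only) and return its text output."""
    result = subprocess.run(
        ["systeminfo"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a Windows machine, `firmware_virtualization_enabled(read_systeminfo())` would give the answer. Task Manager remains the quicker way to look at it by hand.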
For the second approach, from what I read and recalled, I had installed VirtualBox 4.x years ago and, a year or two later, Docker for Windows. Since the Docker installation I had not used VirtualBox on this computer. To make the story short, I uninstalled Docker and rebooted my machine. The issue with the Cloudera VM image persisted.
Microsoft has its own virtualization software called Hyper-V, and different virtualization products may conflict with each other. I used "Turn Windows features on or off" to remove support for Hyper-V. I had to restart the machine. The shutdown and restart phases look like a Windows update. That is normal when you are adding or removing features from Windows.
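I used the Windows dialog, but the same feature can be turned off from an elevated command prompt with DISM. The sketch below only builds and runs that command. I am assuming the feature is named `Microsoft-Hyper-V-All`, which is how I understand DISM lists it on Windows 10; the function names are hypothetical. A restart is still required afterwards.

```python
import subprocess

# Assumed feature name; confirm it with `dism.exe /Online /Get-Features`.
HYPER_V_FEATURE = "Microsoft-Hyper-V-All"


def hyper_v_disable_command(feature: str = HYPER_V_FEATURE) -> list:
    """Build the DISM command line that disables a Windows feature."""
    return [
        "dism.exe",
        "/Online",
        "/Disable-Feature",
        f"/FeatureName:{feature}",
        "/NoRestart",  # do not reboot automatically; restart by hand later
    ]


def disable_hyper_v() -> int:
    """Run the command (needs an elevated prompt); return DISM's exit code."""
    return subprocess.run(hyper_v_disable_command()).returncode
```

Keeping the command construction separate from the call that runs it makes it easy to print the command and inspect it before changing anything on the machine.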
I attempted to load and start the Cloudera image and a couple of issues came up. One had to do with the amount of video memory for the display. The VM had not been allocated enough to support my monitors. I increased the video memory in the VM settings and that issue was resolved. I then ran into the following issue with its associated message:
This kernel requires an x86-64 CPU, but only detected an i686 CPU. Unable to boot - please use a kernel appropriate for your CPU.
In the VirtualBox settings for the VM I set the guest OS version to Red Hat (64-bit) and that was it. After a few minutes the VM was up and operational.
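Both settings changes (video memory and guest OS type) can also be made with the `VBoxManage modifyvm` command while the VM is powered off. I made them in the GUI, so treat the following as a sketch under assumptions: the VM name is a placeholder, the 64 MB video memory value is only an example (I did not note the value I used), and `VBoxManage` is assumed to be on the PATH. `RedHat_64` is the OS type ID that, as far as I know, corresponds to "Red Hat (64-bit)" in the GUI.

```python
import subprocess


def modifyvm_command(vm_name: str, ostype: str = "RedHat_64",
                     vram_mb: int = 64) -> list:
    """Build a VBoxManage command that sets the OS type and video memory."""
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--ostype", ostype,       # 64-bit guest type instead of a 32-bit one
        "--vram", str(vram_mb),   # video memory in megabytes
    ]


def apply_vm_settings(vm_name: str) -> int:
    """Run the command against a powered-off VM; return the exit code."""
    return subprocess.run(modifyvm_command(vm_name)).returncode
```

Running `VBoxManage list ostypes` shows the valid OS type IDs, and `VBoxManage list vms` shows the registered VM names to pass in.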
This set of issues seems to be somewhat expected when dealing with a set of software products, especially when they come from different vendors and involve some type of virtualization. In retrospect, I could have used a Linux computer, which is just a button click away on my KVM switch.
If you have comments or questions, please leave me a note at the bottom of this post.
Keep on reading and experimenting. That is the only way to learn.
John
Follow me on Twitter: @john_canessa