I ran into a lot of issues building Hadoop on Windows. I
have to build Hadoop on Windows in order to generate some Windows native
components that are not included in the Hadoop binary distribution.
Without these native components, running a
Hadoop example on Windows fails with this error:
c:\hadoop-2.4.1>hadoop jar share\hadoop\mapreduce\hadoop-mapreduce-examples-2.4.1.jar pi 10 100
14/08/11 15:11:32 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path
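A quick way to tell whether an installation is affected is to look for the binary named in the error. The following is a minimal sketch, assuming the binary is called winutils.exe and is expected in the bin folder of the Hadoop installation directory; the function names are hypothetical helpers, not part of Hadoop.

```python
import os


def winutils_path(hadoop_home):
    """Build the path where winutils.exe is assumed to live."""
    return os.path.join(hadoop_home, "bin", "winutils.exe")


def has_winutils(hadoop_home):
    """Return True if winutils.exe exists under <hadoop_home>/bin."""
    return os.path.isfile(winutils_path(hadoop_home))
```

For example, has_winutils(r"C:\hadoop-2.4.1") should return False for a freshly unzipped binary distribution that lacks the native components.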
This blog post, http://www.srccodes.com/p/article/38/build-install-configure-run-apache-hadoop-2.2.0-microsoft-windows-os,
is the most detailed step-by-step guide I could find, and yet even it
fails to mention some essential steps that I had to take to succeed. So I
wrote this post to fill in the gaps.
My Windows system is:
c:\>systeminfo | findstr /B /C:"OS Name" /C:"OS Version" /C:"System Type"
OS Name: Microsoft Windows 7 Enterprise
OS Version: 6.1.7601 Service Pack 1 Build 7601
System Type: x64-based PC
First of all, carefully read the “Building on Windows” section of hadoop-2.4.1-src\BUILDING.txt.
Hadoop
I downloaded the latest version (2.4.1) of both the binary
distribution and the source distribution.
Unzip the source distribution to C:\hadoop.
It is important to use a short path for the
source code; otherwise you will run into a “path too long” error when building. In fact, C:\hadoop-2.4.1-src (the default name) is apparently already too long to build some classes.
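To get a feel for how close a source tree is to the limit, the longest path under it can be measured. This is a rough sketch, assuming the classic Windows limit of 260 characters (MAX_PATH) is what the build runs into; the helper names are hypothetical. The build creates additional, deeper files under target directories, so the headroom measured before building is optimistic.

```python
import os

# Classic Windows path limit (MAX_PATH); assumed to be the limit the build hits.
WINDOWS_MAX_PATH = 260


def longest_path(root):
    """Return the longest absolute path found under root (root itself if empty)."""
    longest = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidate = os.path.abspath(os.path.join(dirpath, name))
            if len(candidate) > len(longest):
                longest = candidate
    return longest


def headroom(root, limit=WINDOWS_MAX_PATH):
    """Characters left before the longest path under root reaches the limit."""
    return limit - len(longest_path(root))
```

Calling headroom(r"C:\hadoop") after unzipping gives the number of characters to spare; a small or negative value suggests moving the source to an even shorter folder name.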
Cygwin
Add C:\cygwin\bin to the
Windows system variable Path.
Microsoft Windows SDK v7.1
If the Microsoft Windows SDK installation fails with
error 5100, check out this:
Maven
Add MAVEN_HOME system variable:
Protocol Buffers 2.5.0
Add the protoc location to Path.
Zlib
This step is missing from the above-mentioned blog post.
I downloaded zlib 1.2.7 because hadoop-2.4.1-src\BUILDING.txt
mentions it as the version the build was tested with.
Add ZLIB_HOME system variable:
The following is important: copy the
two header files (zconf.h, zlib.h) from %ZLIB_HOME%\include to %ZLIB_HOME%.
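The header copy can also be scripted so that it is not forgotten on a fresh machine. A minimal sketch, assuming ZLIB_HOME points at the unzipped zlib folder and that both headers sit in its include subfolder as described above; copy_zlib_headers is a hypothetical helper name.

```python
import os
import shutil

# The two headers named in the step above.
HEADERS = ("zconf.h", "zlib.h")


def copy_zlib_headers(zlib_home):
    """Copy zconf.h and zlib.h from <zlib_home>/include into <zlib_home>."""
    include_dir = os.path.join(zlib_home, "include")
    copied = []
    for name in HEADERS:
        source = os.path.join(include_dir, name)
        if not os.path.isfile(source):
            raise FileNotFoundError(f"expected header not found: {source}")
        shutil.copy2(source, os.path.join(zlib_home, name))
        copied.append(name)
    return copied
```

A typical call would be copy_zlib_headers(os.environ["ZLIB_HOME"]); it raises an error rather than silently skipping a header that is missing.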
Platform variable
You are almost set now. One last step:
set Platform=x64 (when building on a 64-bit system)
set Platform=Win32 (when building on a 32-bit system)
To avoid typing this every time, you can set Platform up as a Windows system
variable.
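Before starting the build, it can save time to verify that the variables and tools from the previous sections are actually visible. The sketch below is one possible check, not something the Hadoop build provides: find_problems is a hypothetical helper, and the tool list (mvn, protoc, and sh from Cygwin) is an assumption about which commands are worth probing.

```python
import os
import shutil

REQUIRED_VARS = ("MAVEN_HOME", "ZLIB_HOME", "Platform")
# Assumed probe list: Maven, Protocol Buffers compiler, and sh from Cygwin.
REQUIRED_TOOLS = ("mvn", "protoc", "sh")
VALID_PLATFORMS = ("x64", "Win32")


def find_problems(env, find_tool=shutil.which):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    platform_value = env.get("Platform")
    if platform_value and platform_value not in VALID_PLATFORMS:
        problems.append(
            f"Platform is {platform_value!r}, expected one of {VALID_PLATFORMS}"
        )
    for tool in REQUIRED_TOOLS:
        # find_tool returns None when the command is not on the path.
        if find_tool(tool) is None:
            problems.append(f"{tool} was not found on the path")
    return problems
```

Running find_problems(os.environ) from the same command prompt that will run the build prints nothing alarming when the list is empty; each entry otherwise points back to one of the setup sections above.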
Build
Start cmd from Windows SDK:
Run command:
mvn package -Pdist,native-win -DskipTests -Dtar
Now you have the Windows native files, which you can copy into the bin folder of the Hadoop binary distribution, hadoop-2.4.1\bin.
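The copy can be scripted as well. This is a sketch under two assumptions that are mine, not the build's documentation: that the freshly built files end up in a bin folder of the build output (on my understanding, hadoop-dist\target\hadoop-2.4.1\bin under the source folder), and that it is safest to copy only files the binary distribution does not already have. copy_missing_files is a hypothetical helper name.

```python
import os
import shutil


def copy_missing_files(build_bin, dist_bin):
    """Copy files from build_bin that dist_bin does not already contain."""
    copied = []
    for name in sorted(os.listdir(build_bin)):
        source = os.path.join(build_bin, name)
        target = os.path.join(dist_bin, name)
        # Skip subfolders and anything already present in the distribution.
        if os.path.isfile(source) and not os.path.exists(target):
            shutil.copy2(source, target)
            copied.append(name)
    return copied
```

The returned list shows which files were added, so it is easy to confirm that the native components made it into hadoop-2.4.1\bin without overwriting the scripts that were already there.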