Sunday, December 13

Hadoop Eclipse Tip: Lib Dependencies

I'm writing a Hadoop job and ran into a little problem that I wanted to share (and to remind myself of the solution in the future).

I am packaging my Hadoop program into a jar file. It has external dependencies on text parsers. One way to include these with my program is to package the dependency jars inside the job jar, in a lib directory at the jar's root. This ensures the job jar and all of its dependencies get copied to the Hadoop Mappers.

I create my jar file in Eclipse by right-clicking the project and choosing Export > Java > JAR file. I then select my code and the lib directory. However, the problem I had was that my lib directory was not being exported. I learned that this happens if the jars in lib are on the project's build path. To solve this, the jars on the build path need to be referenced as "external" jars or live in a different folder. Then you can export the lib directory as a plain resource.
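Since the export fails silently, it is worth checking the jar before submitting the job. Here is a minimal sketch of how I might do that with the standard java.util.jar API: it lists every entry under lib/ that ends in .jar. The class name JobJarLibCheck and the method listBundledLibs are names I made up for illustration. They are not part of Hadoop or Eclipse.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Hadoop): checks that an exported
// job jar really contains its dependency jars under lib/.
class JobJarLibCheck {

    /** Returns the sorted names of all entries under lib/ that end in .jar. */
    static List<String> listBundledLibs(Path jobJar) throws IOException {
        try (JarFile jar = new JarFile(jobJar.toFile())) {
            // Collect inside the try block so the stream is fully
            // consumed before the jar file is closed.
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.startsWith("lib/") && name.endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JobJarLibCheck <path-to-job-jar>");
            System.exit(1);
        }
        List<String> libs = listBundledLibs(Paths.get(args[0]));
        if (libs.isEmpty()) {
            System.out.println("No jars found under lib/. The export probably skipped them.");
        } else {
            libs.forEach(System.out::println);
        }
    }
}
```

Running it against the exported jar (for example, java JobJarLibCheck myjob.jar, where myjob.jar is a placeholder name) prints the bundled dependency jars, or a warning if the lib directory did not make it into the export.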

Anyone care to share a better solution?