Monday, September 16, 2013

Write and run MapReduce jobs using the Eclipse IDE!!

I have been working in the Hadoop field for the last 1.5 years, and I can see a picture where everyone moves to Hadoop and related big data processing tools. They will suddenly become far more popular than they are now, and at that point everyone will be related to the field in some way. As a developer, I really feel that everyone, or at least every developer, should know how to write a MapReduce job and run it on a Hadoop cluster. I looked on the internet for help with writing MapReduce jobs, but I found very little, and even what I found did not give complete information.

So I felt an urge to write this blog post, so that everybody can learn how to do it quickly and easily.

In this example I am using the latest stable release of Hadoop at the time of writing, which is 1.2.1. If you are using a different version, these steps may not work.

Requirements to run this exercise:

  1. A Hadoop cluster (even a single-node cluster will work). I found Michael G. Noll's website very useful for setting up a single-node cluster. You can follow the steps on that site to set up your cluster for testing.
  2. It is good practice to keep the source files of open-source software, so that you can consult the sources or even change them if required. You can download the sources of Hadoop 1.2.1 from here.
  3. Eclipse IDE 3.7. (I have tested this on Eclipse 3.7. It may not work on higher versions.)

Steps:

  1. Download the MapReduce Eclipse plugin from here.
  2. Copy it into the "plugins" directory under your Eclipse installation directory.
  3. Restart Eclipse.
  4. Open the MapReduce perspective in Eclipse.
  5. Add MapReduce locations.
  6. Click on add new location.
  7. There are a few more advanced parameters, which may need to be changed depending on your environment.
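
To give an idea of what a location holds: as far as I can tell, the DFS Master and Map/Reduce Master host and port fields of a location correspond to the two Hadoop 1.x client properties fs.default.name and mapred.job.tracker. The plain-Java sketch below only illustrates that pairing. The LocationSettings class is a made-up helper, not part of Hadoop or the plugin, and the ports 54310 and 54311 are assumed example values, so use whatever your own core-site.xml and mapred-site.xml say.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Hadoop or the Eclipse plugin.
final class LocationSettings {

    /** Builds the two Hadoop 1.x client properties that a location stands for. */
    static Properties forCluster(String host, int dfsPort, int jobTrackerPort) {
        Properties settings = new Properties();
        // DFS Master: the address where the NameNode listens.
        settings.setProperty("fs.default.name", "hdfs://" + host + ":" + dfsPort);
        // Map/Reduce Master: the address where the JobTracker listens.
        settings.setProperty("mapred.job.tracker", host + ":" + jobTrackerPort);
        return settings;
    }

    public static void main(String[] args) {
        // Assumed ports for a single-node test cluster; adjust to your setup.
        Properties settings = forCluster("localhost", 54310, 54311);
        for (String name : settings.stringPropertyNames()) {
            System.out.println(name + " = " + settings.getProperty(name));
        }
    }
}
```

If the values in the location do not match the ones your cluster was started with, the plugin will most likely not be able to browse HDFS or submit jobs.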

Now the development environment is set up, and you are ready to write and run your MapReduce job. To create a new MapReduce project and run it on the Hadoop cluster, follow the steps below:

  8. Create a new MapReduce project.
  9. Name your project and specify your Hadoop installation directory.
  10. Your project structure may look like this.
  11. Now add a new Mapper class.
  12. It should look like this.

You can write the code for your mapper class here.
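
As an illustration of what could go into the mapper, here is a minimal word-count sketch. It is written in plain Java so that it runs without a cluster; the WordCountMapLogic class and its tokenize method are names I made up for this example. In a real Mapper<LongWritable, Text, Text, IntWritable>, the body of map(key, value, context) would loop over the words in the same way and call context.write(new Text(word), new IntWritable(1)) for each one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper holding the map-side logic of a word count.
final class WordCountMapLogic {

    /** Splits one input line into the words a word-count mapper would emit. */
    static List<String> tokenize(String line) {
        List<String> words = new ArrayList<String>();
        // StringTokenizer splits on whitespace and skips repeated separators.
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            words.add(tokens.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        // Each word is paired with the count 1, as the mapper would emit it.
        for (String word : tokenize("to be or not to be")) {
            System.out.println(word + "\t1");
        }
    }
}
```

Keeping the splitting logic in a small method like this also makes it easy to test before the job is ever submitted.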

  13. Add a Reducer class.
  14. Your Reducer class should look like this.

You can add the code for your reducer class here.
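
Continuing the word-count illustration, the reduce side only has to add up the counts that arrive for one key. The sketch below is again plain Java with made-up names (WordCountReduceLogic, sum). In a real Reducer<Text, IntWritable, Text, IntWritable>, reduce(key, values, context) would iterate over the IntWritable values, add value.get() to a running total, and finish with context.write(key, new IntWritable(total)).

```java
import java.util.Arrays;

// Hypothetical helper holding the reduce-side logic of a word count.
final class WordCountReduceLogic {

    /** Adds up all the counts that were grouped under a single key. */
    static int sum(Iterable<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Three occurrences of the same word were grouped together by the framework.
        System.out.println("be\t" + sum(Arrays.asList(1, 1, 1)));
    }
}
```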

  15. Add your test runner class like this.
  16. To run the program, click Run As -> Run Configurations... Your screen should look like this.

Click on Run and your program should start running.

You can find the output of the program at your configured location.
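
For a word-count style job, the output directory would typically contain a file named like part-r-00000 (part-00000 with the older API), holding one line per key with the key, a tab, and the value, sorted by key. The sketch below simulates the whole map, group, and reduce flow locally in plain Java to show what those lines would look like. LocalWordCount is a made-up class and a simplification: it runs in a single process and sorts keys as Java strings, which matches Hadoop's ordering only for plain ASCII words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local simulation of a word-count job; it does not use Hadoop at all.
final class LocalWordCount {

    /** Returns the lines a word-count job would write for the given input lines. */
    static List<String> run(List<String> lines) {
        // Map and shuffle: collect a 1 for every word under its key.
        // TreeMap keeps the keys sorted, like the keys seen by one reducer.
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // blank input line
                }
                List<Integer> ones = grouped.get(word);
                if (ones == null) {
                    ones = new ArrayList<Integer>();
                    grouped.put(word, ones);
                }
                ones.add(1);
            }
        }
        // Reduce: sum the values of each key and format as key<TAB>value.
        List<String> output = new ArrayList<String>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int total = 0;
            for (int one : entry.getValue()) {
                total += one;
            }
            output.add(entry.getKey() + "\t" + total);
        }
        return output;
    }

    public static void main(String[] args) {
        for (String line : run(Arrays.asList("to be or", "not to be"))) {
            System.out.println(line);
        }
    }
}
```

Comparing a small local run like this with the file produced by the cluster is a quick way to check that the job read the input you intended.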

Please let me know if you run into any difficulty following the steps above.
Thanks



63 comments:

  1. great work....a very simple step by step representation of the facts.

  2. Brilliant man. Very useful and saved a lot of time for me.

    Thanks very much.

  3. Thank you for this step by step tutorial.
    But what if I have Eclipse on a Windows system and the Hadoop cluster on CentOS server machines with no GUI available? How can I run Eclipse with the Hadoop Eclipse plugin on a Windows machine?

  8. I read the above article. It's good, and I tried following it, but I am getting an error saying that the input file/path does not exist. Can you please describe the execution steps in brief, where exactly to find the output, and how to give the input path? The input args given directly at the last step are giving me an error.
    Replies
    1. I am not following your question. Can you send me the code and tell me at which step you are getting this error?
    2. It worked successfully with some corrections. Thanks for the above article. It's really helpful.

  20. Hi,

    I just came across your blog and it's very interesting regarding hadoop

    Thank you
