In this blog post I am going to write about all the difficulties I faced while carrying out my project.
To be honest, I had no knowledge of big data or Hadoop before this project, but I was interested in the topic because last semester I took DAT601, in which our teacher Todd Cochrane talked about big data and how it is becoming an important topic in databases.
The first problem I faced was deciding on the exact topic for my project, one that relates to big data. Over this time even the objective of my project changed to some extent. My first confusion was whether to choose a research-based topic or a practical implementation topic. I decided on a research-based topic in which I would compare different tools for analysing big data, but when I told my supervisor, Lars Dam, about it, he told me that I should also focus on the practical implementation of these data analysis tools. So, under his guidance, I settled on a topic that was both research based and practical: using Hadoop, a tool for analysing big data, by building a one-node cluster on my laptop.
The next problem came when I tried to research the practical implementation of Hadoop. I watched many YouTube videos, which explained Hadoop cluster implementation in different ways, and that left me confused for a couple of weeks. I was taking my time because if I started my project the wrong way, it would be difficult to finish it properly.
In one video I watched a one-node Hadoop implementation, but the virtual machine the presenter used was built for VMware, and he did not say so explicitly in the video. I did not have VMware; I had Oracle VirtualBox on my desktop. Because that virtual machine was not compatible with VirtualBox, it did not work.
When I watched some other YouTube videos, I realized that the virtual machine downloaded from Cloudera (whose CDH is an open source Hadoop distribution) needs to be compatible with the virtualization software installed on my desktop.
This time I wanted to install the Cloudera (CDH) virtual machine that is compatible with Oracle VirtualBox. But while watching the video I noticed that this virtual machine requires 8 GB of RAM, which was more than my laptop could provide.
I told my supervisor, Mr Lars Dam, about this, and he advised me to get help from Mr Mark Caukill (he is a networking specialist) so that I could use the Talos server room as a host for the CDH virtual machine.
I even got permission from him to use the Talos server room as a host.
So I started working on this, but one more hurdle came in my way.
I thought that I could directly export my virtual machine from my desktop into the virtual environment provided by Mr Mark Caukill, but I was wrong.
For the virtual machine to be exported into the virtual environment, it first had to be added to the environment's library, which can be done only by the administrator, that is, Mr Mark Caukill. But he was on vacation at the time, so I could not ask him for help.
So I decided to borrow my flatmate's laptop for the Hadoop cluster demonstration because it has 16 GB of RAM, and finally I was able to run the CDH virtual machine.