CCA-175 Cloudera CCA Spark and Hadoop Developer
This certification looks intimidating at first, but if you have even a third of the knowledge that any halfway decent Big Data developer should have, it's a piece of cake. Red velvet, if you ask me :)
I had gone back and forth on taking the certification ever since I worked on Spark in 2018. Spare time during the pandemic pushed me to get it done. So let's get one myth out of the way before starting this post: CCA-175 is not difficult. You just need to play smart and be confident to clear it.
Exam Focus Area
The exam (since 2020) focuses entirely on Spark (2.x), and there are no questions on Sqoop. If you see Sqoop questions in any online training or practice exam, just ignore them.
The areas you need to focus on for the exam are -
- Reading and writing files to HDFS in every format (text, CSV, Avro, Parquet, etc.), and I mean ALL
- Spark DataFrames
- Spark SQL
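To make those three areas concrete, here is a minimal PySpark sketch of the kind of round trip worth practicing: write a DataFrame in a few formats, read it back, then answer the same question once with the DataFrame API and once with Spark SQL. The sample rows, the column names, and the `expected_totals` and `run_demo` helpers are my own inventions for illustration, not something taken from the exam. It also writes to a local temp directory rather than HDFS, and it leaves out Avro because that format may need an extra package depending on your setup.

```python
import os
import tempfile

# Hypothetical sample data, invented purely for illustration.
SAMPLE_ORDERS = [
    (1, "COMPLETE", 120.5),
    (2, "PENDING", 80.0),
    (3, "COMPLETE", 42.25),
]


def expected_totals(rows):
    """Plain-Python total per status, used to cross-check Spark's answers."""
    totals = {}
    for _order_id, status, amount in rows:
        totals[status] = totals.get(status, 0.0) + amount
    return totals


def run_demo(base_dir=None):
    """Write a small DataFrame in several formats, read it back, and query it."""
    # Imported here so the helper above works even without PySpark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Assumption: a local temp directory stands in for an HDFS path.
    base_dir = base_dir or tempfile.mkdtemp(prefix="cca175_")
    # getOrCreate() reuses an existing session (for example the shell's `spark`).
    spark = (
        SparkSession.builder.master("local[1]")
        .appName("cca175-practice")
        .getOrCreate()
    )
    orders = spark.createDataFrame(SAMPLE_ORDERS, ["order_id", "status", "amount"])

    # CSV: keep the header on write, infer column types on read.
    csv_path = os.path.join(base_dir, "orders_csv")
    orders.write.mode("overwrite").option("header", True).csv(csv_path)
    from_csv = (
        spark.read.option("header", True).option("inferSchema", True).csv(csv_path)
    )

    # Parquet: the schema travels with the files, so no options are needed.
    parquet_path = os.path.join(base_dir, "orders_parquet")
    orders.write.mode("overwrite").parquet(parquet_path)
    from_parquet = spark.read.parquet(parquet_path)

    # Text: the writer accepts exactly one string column, so build it first.
    text_path = os.path.join(base_dir, "orders_text")
    as_lines = orders.select(
        F.concat_ws("|", *[F.col(c).cast("string") for c in orders.columns])
        .alias("value")
    )
    as_lines.write.mode("overwrite").text(text_path)
    line_count = spark.read.text(text_path).count()

    # DataFrame API: total amount per status.
    df_totals = from_csv.groupBy("status").agg(F.sum("amount").alias("total"))

    # Spark SQL: the same question against a temporary view.
    from_parquet.createOrReplaceTempView("orders")
    sql_totals = spark.sql(
        "SELECT status, SUM(amount) AS total FROM orders GROUP BY status"
    )

    def as_dict(df):
        return {row["status"]: row["total"] for row in df.collect()}

    return {
        "dataframe": as_dict(df_totals),
        "sql": as_dict(sql_totals),
        "text_lines": line_count,
    }
```

Calling `run_demo()` on a machine with PySpark installed should return the same per-status totals under both the `dataframe` and `sql` keys, and they should match `expected_totals(SAMPLE_ORDERS)`. Swapping the path for an HDFS location and repeating the exercise with other formats and options is a reasonable way to drill this part.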
If you study all these things smartly then you can crack the certification in 1–2 weeks of preparation.
Exam Preparation
My own preparation was a bit different from what I would recommend, so let me first describe what I studied. It says more about my Spark knowledge than about preparing for the certification.
I did the Cloudera on-demand training in 2018 (sponsored by my organization). When I decided to take the exam, I started a Udemy course by our trusted Frank Kane in September 2020. Due to an engagement at my organization, I did the on-demand training again in October 2020. I completed the Udemy course and took the practice test in November 2020.
I know, I know, this was probably overkill. I wouldn't recommend that anyone do what I did; maybe I was just enjoying exploring Spark. Anyway, here’s what you should do -
If you have relevant Spark experience, you should be able to finish the above in 1–2 weeks. If you are confident enough, perhaps because you work with Spark daily, I would suggest taking the practice test directly. If you score 100% on the test, you will probably clear the exam without breaking a sweat.
Things to take care of before appearing for the Exam
Although the exam is performance-based, it is not at all difficult. The questions are predictable, and you can find free practice tests that are pretty close to the actual exam.
What matters in the exam is timing, because the environment is dead slow. If the examiner does not allow you to connect a second monitor, you will probably have to work on a small laptop screen split between a terminal window and Mozilla (for the questions), on top of a sluggish VM. The VM slows you down considerably, so practice the test mentioned above until you can complete it in 1 hour 30 minutes, which leaves room for the delay in the actual exam.
You don’t have to import any packages; everything is available in the environment.
All the best to everyone appearing for the exam. Feel free to drop a comment or any feedback.