
Cloudera Certified Administrator for Apache Hadoop (CCAH) Questions and Answers

Question 1

You want to understand more about how users browse your public website. For example, you want to know which pages they visit prior to placing an order. You have a server farm of 200 web servers hosting your website. Which is the most efficient process to gather these web server logs into your Hadoop cluster for analysis?

Options:

Sample the web server logs from the web servers and copy them into HDFS using curl

Ingest the web server logs into HDFS using Flume

Channel these clickstreams into Hadoop using Hadoop Streaming

Import all user clicks from your OLTP databases into Hadoop using Sqoop

Write a MapReduce job with the web servers for mappers and the Hadoop cluster nodes for reducers

Question 2

Which YARN daemon or service negotiates map and reduce Containers from the Scheduler, tracking their status and monitoring progress?

Options:

NodeManager

ApplicationMaster

ApplicationManager

ResourceManager

Question 3

You decide to create a cluster which runs HDFS in High Availability mode with automatic failover, using Quorum Storage. What is the purpose of ZooKeeper in such a configuration?

Options:

It only keeps track of which NameNode is Active at any given time

It monitors an NFS mount point and reports if the mount point disappears

It both keeps track of which NameNode is Active at any given time and manages the Edits file, which is a log of changes to the HDFS filesystem

It only manages the Edits file, which is a log of changes to the HDFS filesystem

Clients connect to ZooKeeper to determine which NameNode is Active

Question 4

You use the hadoop fs -put command to add a file “sales.txt” to HDFS. This file is small enough that it fits into a single block, which is replicated to three nodes in your cluster (with a replication factor of 3). One of the nodes holding this file (a single block) fails. How will the cluster handle the replication of the file in this situation?

Options:

The file will remain under-replicated until the administrator brings that node back online

The cluster will re-replicate the file the next time the system administrator reboots the NameNode daemon (as long as the file’s replication factor doesn’t fall below)

The file will be immediately re-replicated and all other HDFS operations on the cluster will halt until the cluster’s replication values are restored

The file will be re-replicated automatically after the NameNode determines it is under-replicated based on the block reports it receives from the DataNodes
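The scenario in Question 4 can be pictured with a toy model of replica bookkeeping. The sketch below is only an illustration under simplifying assumptions: the function names, the block ID, and the node names (dn1 to dn5) are hypothetical, and a real NameNode relies on heartbeats, block reports, and rack-aware target selection rather than this simple lookup.

```python
def under_replicated(block_locations, live_nodes, replication_factor=3):
    """Return {block: missing_count} for blocks with too few live replicas."""
    result = {}
    for block, nodes in block_locations.items():
        live_replicas = [n for n in nodes if n in live_nodes]
        missing = replication_factor - len(live_replicas)
        if missing > 0:
            result[block] = missing
    return result


def schedule_re_replication(block_locations, live_nodes, replication_factor=3):
    """Pick live nodes that do not yet hold the block as copy targets."""
    plan = {}
    shortfalls = under_replicated(block_locations, live_nodes, replication_factor)
    for block, missing in shortfalls.items():
        holders = set(block_locations[block])
        # Sorted only to make this toy example deterministic.
        candidates = sorted(n for n in live_nodes if n not in holders)
        plan[block] = candidates[:missing]
    return plan


# Hypothetical cluster: the single block of sales.txt lived on dn1, dn2, dn3.
locations = {"blk_sales_txt": ["dn1", "dn2", "dn3"]}
# dn3 has failed; dn4 and dn5 are healthy nodes without a copy.
live = {"dn1", "dn2", "dn4", "dn5"}

plan = schedule_re_replication(locations, live)
print(plan)
```

In this model the block is one replica short, so a single new target is chosen without any administrator action and without pausing other operations.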

Question 5

You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the network fabric. Which workloads benefit the most from faster network fabric?

Options:

When your workload generates a large amount of output data, significantly larger than the amount of intermediate data

When your workload consumes a large amount of input data, relative to the entire capacity of HDFS

When your workload consists of processor-intensive tasks

When your workload generates a large amount of intermediate data, on the order of the input data itself

Question 6

Identify two features/issues that YARN is designed to address: (Choose two)

Options:

Standardize on a single MapReduce API

Single point of failure in the NameNode

Reduce complexity of the MapReduce APIs

Resource pressure on the JobTracker

Ability to run frameworks other than MapReduce, such as MPI

HDFS latency

Question 7

Which two are features of Hadoop’s rack topology? (Choose two)

Options:

Configuration of rack awareness is accomplished using a configuration file. You cannot use a rack topology script.

Hadoop gives preference to intra-rack data transfer in order to conserve bandwidth

Rack location is considered in the HDFS block placement policy

HDFS is rack aware but MapReduce daemons are not

Even for small clusters on a single rack, configuring rack awareness will improve performance
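For background on Question 7, the default HDFS placement for three replicas puts the first copy on the writer's node, the second on a node in a different rack, and the third on another node in that same remote rack. The sketch below mimics that rule in a deliberately simplified way: the function name, node names, and rack paths are hypothetical, and the real policy also weighs node load and free space and picks targets randomly.

```python
def place_replicas(writer, topology):
    """Choose three nodes for a block, given a node -> rack mapping.

    Simplified: assumes the writer is a DataNode and that some remote
    rack has at least two nodes. Sorting replaces random selection so
    the example is deterministic.
    """
    first = writer
    remote_nodes = sorted(n for n, rack in topology.items()
                          if rack != topology[first])
    second = remote_nodes[0]
    same_remote_rack = [n for n in remote_nodes
                        if topology[n] == topology[second] and n != second]
    third = same_remote_rack[0]
    return [first, second, third]


# Hypothetical two-rack cluster.
topology = {
    "dn1": "/rack1",
    "dn2": "/rack1",
    "dn3": "/rack2",
    "dn4": "/rack2",
}

replicas = place_replicas("dn1", topology)
racks_used = {topology[n] for n in replicas}
print(replicas, racks_used)
```

The result spans two racks, so losing one rack leaves at least one replica, while only one copy has to cross the inter-rack link during the write.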

Question 8

A slave node in your cluster has four 2 TB hard drives installed (4 x 2 TB). The DataNode is configured to store HDFS blocks on all disks. You set the value of the dfs.datanode.du.reserved parameter to 100 GB. How does this alter HDFS block storage?

Options:

25GB on each hard drive may not be used to store HDFS blocks

100GB on each hard drive may not be used to store HDFS blocks

All hard drives may be used to store HDFS blocks as long as at least 100 GB in total is available on the node

A maximum of 100 GB on each hard drive may be used to store HDFS blocks
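The arithmetic behind Question 8 can be worked through in a few lines. In the Hadoop configuration reference, dfs.datanode.du.reserved is described as reserved space in bytes per volume, so the sketch below applies the reservation to each disk separately. The helper name is hypothetical and the model ignores other non-HDFS usage and filesystem overhead.

```python
GB = 1024 ** 3
TB = 1024 ** 4


def usable_for_hdfs(disk_sizes_bytes, reserved_per_volume_bytes):
    """Return the capacity left for HDFS blocks on each disk.

    The reservation is subtracted from every volume individually,
    never below zero.
    """
    return [max(size - reserved_per_volume_bytes, 0)
            for size in disk_sizes_bytes]


disks = [2 * TB] * 4            # four 2 TB drives, as in the question
reserved = 100 * GB             # dfs.datanode.du.reserved

usable = usable_for_hdfs(disks, reserved)
total_reserved = sum(disks) - sum(usable)

print(usable[0] // GB)          # GB left for HDFS on one disk
print(total_reserved // GB)     # GB held back across the whole node
```

Under this model each drive keeps 100 GB free for non-HDFS use, which adds up to 400 GB across the node.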

Question 9

Your cluster implements HDFS High Availability (HA). Your two NameNodes are named nn01 and nn02. What occurs when you execute the command: hdfs haadmin -failover nn01 nn02?

Options:

nn02 is fenced, and nn01 becomes the active NameNode

nn01 is fenced, and nn02 becomes the active NameNode

nn01 becomes the standby NameNode and nn02 becomes the active NameNode

nn02 becomes the standby NameNode and nn01 becomes the active NameNode

Cloudera CCA-500 Cloudera Certified Administrator for Apache Hadoop (CCAH) Exam Practice Test

Copyright © 2014-2025 Activedumpsnet. All Rights Reserved