Apache Hadoop Cloudera Administrator Training Course
Course Outline:
1. Hadoop and HDFS
- Why Hadoop?
 
- HDFS
 
- MapReduce
 
- Hive, Pig, HBase, and Other Ecosystem Projects
 
2. Planning Your Hadoop Cluster
- General Planning Considerations
 
- Choosing the Right Hardware
 
- Node Topologies
 
- Choosing the Right Software
 
3. Deploying Your Cluster
- Installing Hadoop
 
- Using SCM Express for Easy Installation
 
- Typical Configuration Parameters
 
- Configuring Rack Awareness
 
- Using Configuration Management Tools
 
4. Managing and Scheduling Jobs
- Starting and Stopping MapReduce Jobs
 
- FIFO Scheduler
 
- Fair Scheduler
 
5. Cluster Maintenance
- Checking HDFS with Fsck
 
- Copying Data with Distcp
 
- Rebalancing Cluster Nodes
 
- Adding and Removing Cluster Nodes
 
- Backup and Restore
 
- Upgrading and Migrating
 
- NameNode Metadata
 
6. Cluster Monitoring, Troubleshooting, and Optimizing
- Hadoop Log Files
 
- Using the NameNode and JobTracker Web UIs
 
- Interpreting Job Logs
 
- Monitoring with Ganglia
 
- Other Monitoring Tools
 
- General Optimization Tips
 
- Benchmarking Your Cluster
 
7. Populating HDFS from External Sources
- Using Sqoop
 
- Using Flume
 
- Best Practices for Data Ingestion
 
8. Installing and Managing Other Hadoop Projects