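If you maintain large, complex Hadoop clusters, Hadoop Operations (Reprint Edition) is indispensable. As Hadoop becomes the industry standard for large-scale data processing in the data center, demand for operational guidance has grown sharply. In this book Sammer, a Principal Solution Architect at Cloudera, shows you the details of running Hadoop in production, from planning, installing, and configuring the system to providing ongoing maintenance.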
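This operations guide does not try to enumerate every possible scenario. It takes a pragmatic approach and describes the steps involved in significant deployments.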
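In this book: an overview of HDFS and MapReduce, covering why they exist and how they work; planning a Hadoop deployment, from hardware and OS selection to network requirements; learning setup and configuration details from a list of critical properties; managing resources by sharing a cluster across multiple groups; runbooks for the most common cluster maintenance tasks; monitoring Hadoop clusters and learning troubleshooting from real-world examples; handling backups and catastrophic failures with basic tools and techniques.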
Preface
1. Introduction
2. HDFS: Goals and Motivation; Design; Daemons; Reading and Writing Data; The Read Path; The Write Path; Managing Filesystem Metadata; Namenode High Availability; Namenode Federation; Access and Integration; Command-Line Tools; FUSE; REST Support
3. MapReduce: The Stages of MapReduce; Introducing Hadoop MapReduce; Daemons; When It All Goes Wrong; YARN
4. Planning a Hadoop Cluster: Picking a Distribution and Version of Hadoop; Apache Hadoop; Cloudera's Distribution Including Apache Hadoop; What Should I Use?; Hardware Selection; Master Hardware Selection; Worker Hardware Selection; Cluster Sizing; Blades, SANs, and Virtualization; Operating System Selection and Preparation; Deployment Layout; Software; Hostnames, DNS, and Identification; Users, Groups, and Privileges; Kernel Tuning; vm.swappiness; vm.overcommit_memory; Disk Configuration; Choosing a Filesystem; Mount Options; Network Design; Network Usage in Hadoop: A Review; 1 Gb versus 10 Gb Networks; Typical Network Topologies
5. Installation and Configuration: Installing Hadoop; Apache Hadoop; CDH; Configuration: An Overview; The Hadoop XML Configuration Files; Environment Variables and Shell Scripts; Logging Configuration; HDFS; Identification and Location; Optimization and Tuning; Formatting the Namenode; Creating a /tmp Directory; Namenode High Availability; Fencing Options; Basic Configuration; Automatic Failover Configuration; Format and Bootstrap the Namenodes; Namenode Federation; MapReduce; Identification and Location; Optimization and Tuning; Rack Topology; Security
6. Identity, Authentication, and Authorization: Identity; Kerberos and Hadoop; Kerberos: A Refresher; Kerberos Support in Hadoop; Authorization; HDFS; MapReduce; Other Tools and Systems; Tying It Together
7. Resource Management: What Is Resource Management?; HDFS Quotas; MapReduce Schedulers; The FIFO Scheduler; The Fair Scheduler; The Capacity Scheduler; The Future
8. Cluster Maintenance: Managing Hadoop Processes; Starting and Stopping Processes with Init Scripts; Starting and Stopping Processes Manually; HDFS Maintenance Tasks; Adding a Datanode; Decommissioning a Datanode; Checking Filesystem Integrity with fsck; Balancing HDFS Block Data; Dealing with a Failed Disk; MapReduce Maintenance Tasks; Adding a Tasktracker; Decommissioning a Tasktracker; Killing a MapReduce Job; Killing a MapReduce Task; Dealing with a Blacklisted Tasktracker
9. Troubleshooting: Differential Diagnosis Applied to Systems