Peking University Distributed Systems: FaultTolerance.ppt


Category: IT / Computing | Pages: approx. 68
Fault Tolerance
./~course/cs501/2013
Zhi Yang
School of EECS, Peking University
3/28/2013

Example: Costs
  • Replication, as a scaling technique, may not always be applicable.
  • Process P accesses a replica N times per second.
  • The replica is updated M times per second.
  • What if N << M?

"Failure is not an option. It comes bundled with your software." (unknown)

"You know you have [a distributed system] when the crash of a computer you've never heard of stops you from getting any work done." (Leslie Lamport)

Some real-world datapoints
Sources:
  • Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?, Bianca Schroeder and Garth A. Gibson (FAST '07)
  • Failure Trends in a Large Disk Drive Population, Eduardo Pinheiro, Wolf-Dietrich Weber, and Luiz André Barroso (FAST '07)

Contents
01: Introduction
02: Architectures
03: Processes
04: Communication
05: Naming
06: Synchronization
07: Consistency & Replication
08: Fault Tolerance
09: Security
10: Distributed Object-Based Systems
11: Distributed File Systems
12: Distributed Web-Based Systems
13: Distributed Coordination-Based Systems

Outline
  • Basic concepts
  • Process resilience
  • Reliable client-server communication (++)
  • Reliable group communication
  • Distributed commit (++)
  • Recovery

Fault handling approaches
  • Fault prevention: prevent the occurrence of a fault
  • Fault tolerance: build a component in such a way that it can meet its specifications in the presence of faults (i.e., mask the presence of faults)
  • Fault removal: reduce the presence, number, and seriousness of faults
  • Fault forecasting: estimate the present number, future incidence, and consequences of faults

Design goal (with regard to fault tolerance): design a (distributed) system that can recover from partial failures without affecting correctness or significantly impacting overall performance.

Starting point for distributed system design
  • A process P may depend on services provided by other processes on different computers. If those processes become unreachable because of an error or fault, P cannot run properly.
  • The remote computer may have crashed, the network may be disconnected, or the other side may be too heavily loaded and temporarily unable to ...
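
The "Example: Costs" slide (P reads a replica N times per second while the replica is updated M times per second) can be made concrete with a small calculation. The sketch below is not from the slides. It assumes a simplified model in which each read observes only the latest value, so at most N updates per second are ever seen by P. The function name `wasted_update_fraction` and its parameters are hypothetical.

```python
def wasted_update_fraction(reads_per_sec: float, updates_per_sec: float) -> float:
    """Lower bound on the fraction of replica updates that P never observes.

    Assumed model (not from the slides): every read sees only the most
    recent update, so no more than `reads_per_sec` updates per second
    can ever be observed by the reader.
    """
    if updates_per_sec <= 0:
        return 0.0  # no updates, so nothing can be wasted
    observed = min(reads_per_sec, updates_per_sec)
    return 1.0 - observed / updates_per_sec


if __name__ == "__main__":
    # N << M: one read per second against 100 updates per second.
    print(wasted_update_fraction(1, 100))
    # N >> M: every update can be observed at least once.
    print(wasted_update_fraction(100, 1))
```

Under this model, when N << M almost all of the update traffic sent to the replica is never read, which is why replication may not pay off as a scaling technique in that regime.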

Peking University Distributed Systems FaultTolerance, from 淘豆网 (m.daumloan.com). Please credit the source when reposting.

Document information
  • Pages: 68
  • Uploaded by: xwbjll1
  • File size: 1.49 MB
  • Date: 2017-03-03