: .
DOT: 一个开发处理大数据软件的分析模型
actions ProjectBSP is a Scale-Up Model for HPC
A parallel architecture model
• Key parameters: p, node speed, message
P0 P1 P2 … Pn speed, synch latency
… • Low ratio of computing/message is key
A programming model
• Reducing message/synch latency is key:
i – Overlapping computing and communication
– Exploiting locality
– Load balance to minimize synch latency
Superstep … A cost model
• Combining both hardware/software
parameters, we can predict execution time
BSP does not support
Barrier • Data-intensive applications, big data
• hardware independent performance
• sustained scalability and high throughput
5Scale-out is the Foundation for Big Data Analytics
Scale-out = sustained throughput growth as # nodes grows
• MR processed 1 PB data in 6h 2m on 4000 nodes in 11/2008
• MR processed 10 PB in 6 h 27 m on 8000 nodes 9/2011 (VLDB’11)
• The data is not movable after it is placed in a system
Existing big data software is in a scale-out mode by
• Focusing on scalability and fault-tolerance in large systems
• Provide a easy-to-programming environment
Effectively responds urgent big data challenges
• Parallel databases with limited scalability cannot handle big data
• Big data demands a large scope of data analytics
• Traditional database business mo
DOT一个开发处理大数据软件的分析模型 来自淘豆网m.daumloan.com转载请标明出处.