I just ran across a paper from IBM comparing scaling up (using bigger boxes) to scaling out (using more boxes). They use Nutch search as their workload, and conclude “… that scale-out solutions have an indisputable performance and price/performance advantage over scale-up for search workloads.” Not exactly a big surprise, but it’s good to have objective data. They also conclude that “Scale-out systems are still in a significant disadvantage with respect to scale-up when it comes to systems management.” Hmm. With frameworks like Hadoop, which automatically re-run failed tasks and replicate data across hosts, folks shouldn’t be bothered as much by the more frequent host failures that a scale-out system is prone to.
Scale-up versus Scale-out