Cheap Recovery: A Key to Self-Managing State
Cluster hash tables (CHTs) are a key persistent-storage
component of many large-scale Internet services due to
their high performance and scalability. We show that a
correctly designed CHT can also be as easy to manage
as a farm of stateless servers. Specifically, we trade away
some consistency to obtain reboot-based recovery that
is simple, maintains full data availability, and has only a
modest impact on performance. This simplifies management
in two ways. First, it simplifies failure detection by
lowering the cost of acting on false positives, allowing
us to use simple but aggressive statistical techniques to
quickly detect potential failures and node degradations;
even when a false alarm is raised or when rebooting will
not fix the problem, attempting recovery by rebooting is
relatively non-intrusive to system availability and performance.
Second, it allows us to re-cast online repartitioning
as failure plus recovery, simplifying dynamic scaling
and capacity planning. These properties make it possible
for the system to be continuously self-adjusting, a key
property of self-managing, autonomic systems.
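
The failure-detection argument above can be illustrated with a minimal sketch. The abstract does not say which statistical technique is used, so the median-absolute-deviation test below is an assumption chosen for illustration, and the names nodes_to_reboot, latency_ms, and threshold are hypothetical rather than part of the described system.

```python
from statistics import median


def nodes_to_reboot(latency_ms, threshold=3.0):
    """Return the names of nodes whose latency deviates strongly from the cluster median.

    latency_ms: dict mapping node name -> recent request latency in milliseconds.
    threshold:  number of median absolute deviations tolerated (hypothetical tuning knob).
    """
    values = list(latency_ms.values())
    center = median(values)
    # Median absolute deviation: a robust spread estimate that a single
    # misbehaving node cannot distort much.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # No measurable spread, so this simple test cannot rank nodes; flag nothing.
        return []
    return sorted(
        name for name, v in latency_ms.items()
        if abs(v - center) / mad > threshold
    )


if __name__ == "__main__":
    sample = {"n1": 10, "n2": 11, "n3": 9, "n4": 10, "n5": 50}
    print(nodes_to_reboot(sample))
```

Because recovery by reboot is described as cheap, a detector like this can afford a low threshold: occasionally flagging a healthy node costs little, while a degraded node is caught quickly.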
Management
English
2014