Frugal Computing
Companies care about cheap computing.
Well, the first thing they care about is usability: Is it easy for average programmers to develop solutions using the framework?
And the second thing they care about is cheap computing: Does this break the bank? Can I do this cheaper?
Speed, efficiency, elegance, technical strength... These are often not of interest. People are OK with their analytic jobs returning results hours later. We watched aghast, at the start of the Hadoop epidemic, as people used MapReduce for analytics in a wasteful and tiresome manner.
I was recently thinking about this question: How can we trade off speed against the monetary cost of computing? If the constraint is that the user will not need the results for a couple of hours, but it would be nice to get them within a day, what is the cheapest way to get this analytics job done in the cloud?
While distributed systems may be the answer to a lot of questions (such as providing fault-tolerance, low-latency access for geo-distributed deployments, scalability, etc.), they are not very advantageous for cheap computing. This is because distributed processing often comes with a big overhead for state synchronization, which takes a long time to amortize, as the "Scalability! But at what COST?" paper showed. With frugal computing, we should try to avoid the cost of state synchronization as much as possible. Work should be done on one machine if it is cheaper to do so and the generous time budget is not exceeded. Other machines should be involved only when it is cheaper to involve them, and when there is not a lot of state to synchronize.
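The one-machine-versus-many trade-off can be put in back-of-the-envelope terms. The sketch below uses entirely made-up numbers (hourly price, a fixed state-synchronization overhead) just to illustrate the shape of the decision:

```python
# Back-of-the-envelope cost model: is it cheaper to run on one machine,
# or to fan out to n machines that must synchronize state?
# All prices and overheads here are illustrative assumptions, not benchmarks.

def job_cost(work_hours, n_machines, price_per_hour=0.10, sync_overhead_hours=0.5):
    """Return (dollar_cost, wall_clock_hours) for a job needing `work_hours`
    of compute, split across `n_machines`, each paying a fixed sync overhead."""
    wall_clock = work_hours / n_machines + sync_overhead_hours * (n_machines > 1)
    return n_machines * wall_clock * price_per_hour, wall_clock

# A 20-hour job with a generous 24-hour deadline:
single_cost, single_time = job_cost(20, 1)
multi_cost, multi_time = job_cost(20, 8)

assert single_time <= 24          # one machine still meets the deadline...
assert single_cost < multi_cost   # ...and is cheaper: no sync overhead paid
```

With a tight deadline the inequality flips and fanning out becomes worth paying for; with a generous time budget the single machine wins, which is the point made above.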
Primitives for frugal computing
Memory is expensive, but storage via local disk is not. And time is not pressing. So we can consider out-of-core execution, juggling between memory and disk.
Communication costs money. So batching communication, and trading off computation for communication when possible, would be useful for frugal computing. If something is computationally heavy, we can keep lookup tables on disk or in S3 (if that is still viable monetarily).
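As a minimal sketch of the lookup-table idea, here is a disk-backed memo table using Python's standard `shelve` module. The `expensive` function is a made-up stand-in for real work; in practice the store could be local disk or an S3 bucket:

```python
import os, shelve, tempfile

# Sketch: a disk-backed lookup table, so a computationally heavy result is
# paid for once and then read back from cheap storage instead of being
# recomputed. `expensive` is a placeholder for real heavy compute.

def expensive(n):
    return sum(i * i for i in range(n))   # stand-in for costly work

# Persistent key->value store on local disk (could equally be S3 objects).
table = shelve.open(os.path.join(tempfile.mkdtemp(), "lookup"))

def lookup(n):
    key = str(n)
    if key not in table:                  # compute only on a miss
        table[key] = expensive(n)         # persisted to disk, not RAM
    return table[key]

first = lookup(10_000)                    # computed once, then stored
second = lookup(10_000)                   # served from the disk table
assert first == second
```

The same pattern extends to out-of-core execution: keep the working set in memory and spill everything else to the cheap tier, since the time budget tolerates the slower reads.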
We may then need schemes for data naming (which may be more sophisticated than a simple key), so that a node can locate the result it needs in S3 instead of computing it itself. This can allow nodes to collaborate with other nodes in an asynchronous, offline, or delay-tolerant way. For this, maybe the Linda tuplespaces idea could be a good fit. We can have wait-based synchronization. This may even allow a node or process to collaborate with itself: the node might switch to other threads when it is stuck waiting for a data item, then pick that thread back up when the awaited item becomes ready through the work of the other threads, and continue execution from there.
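The two ingredients above can be sketched together: a deterministic name derived from the task that produces a result (so any node can use it as, say, an S3 key), and a tiny Linda-style space with a blocking read. All names here (`result_name`, `TupleSpace`) are hypothetical, and threads stand in for nodes:

```python
import hashlib, json, threading

# 1) Deterministic data naming: the name of a result is derived from the
#    task that produces it, so any node can look it up instead of recomputing.
def result_name(func_name, args):
    blob = json.dumps([func_name, args], sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()      # stable name, usable as an S3 key

# 2) A minimal Linda-style tuplespace with wait-based synchronization:
#    read() blocks until some producer has put the named item.
class TupleSpace:
    def __init__(self):
        self._items, self._cv = {}, threading.Condition()
    def put(self, name, value):
        with self._cv:
            self._items[name] = value
            self._cv.notify_all()
    def read(self, name):                        # blocks until the item exists
        with self._cv:
            self._cv.wait_for(lambda: name in self._items)
            return self._items[name]

space = TupleSpace()
name = result_name("word_count", {"shard": 7})

producer = threading.Thread(target=lambda: space.put(name, 42))
producer.start()
value = space.read(name)                         # waits, then picks up the result
producer.join()
assert value == 42
```

Because the name is content-derived rather than location-derived, producer and consumer never need to coordinate directly, which is what makes the asynchronous, offline, delay-tolerant collaboration possible.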
In frugal computing, we cannot afford to allocate extra resources for fault-tolerance; we need to provide it in a way commensurate with the risk of a fault and the cost of restarting the computation from scratch. Snapshots saved for offline collaboration may be useful for building frugal fault-tolerance. A self-stabilizing approach can also be useful, because it provides forward fault correction instead of costly roll-back fault correction.
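"Commensurate with the risk of fault" can be made concrete with Young's classic approximation for checkpoint intervals, plus a resume-from-snapshot loop. The failure rate, snapshot cost, and cadence below are illustrative assumptions, not measurements:

```python
import json, os, tempfile

# Frugal checkpointing: snapshot just often enough that the expected cost of
# redoing lost work stays below the cost of taking snapshots.

def checkpoint_interval(snapshot_cost_s, failure_rate_per_s):
    # Young's approximation: optimal interval ~ sqrt(2 * C / lambda)
    return (2 * snapshot_cost_s / failure_rate_per_s) ** 0.5

# A cheap snapshot (5 s) on flaky spot instances (about one fault per 6 hours):
interval = checkpoint_interval(5, 1 / 21600)     # roughly 465 seconds

state_file = os.path.join(tempfile.mkdtemp(), "snapshot.json")

def run(total_steps):
    # Resume from the last snapshot instead of restarting from scratch.
    step = 0
    if os.path.exists(state_file):
        step = json.load(open(state_file))["step"]
    while step < total_steps:
        step += 1                                # one unit of real work
        if step % 100 == 0:                      # snapshot at the chosen cadence
            json.dump({"step": step}, open(state_file, "w"))
    return step

assert run(250) == 250
```

The same snapshots double as the S3-published results discussed above, so the fault-tolerance mechanism and the offline-collaboration mechanism can share one cost.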
This is all raw stuff, and in the abstract. I wonder if it would be possible to start with an open-source data processing framework (such as Spark) and customize it to prioritize frugality over speed. How much work would that entail?