Paper review. Petuum: A New Platform for Distributed Machine Learning on Big Data
First there was big data. Industry saw that big data was good. Industry made big data storage systems, NoSQL datastores, to store and access the big data. Industry saw they were good. Industry made big data processing systems, MapReduce, Spark, etc., to analyze and extract information and insights (CRM, business logistics, etc.) from big data. Industry saw they were good and popular, and thus machine learning libraries were added to these big data processing systems to provide support for machine learning algorithms and techniques.
And here is where this paper makes a case for a redesign of machine learning systems. The big data processing systems produced by industry are general analytics systems, and are not specifically designed for machine learning from the start. They are data analytics frameworks first, with some machine learning libraries as an add-on to tackle machine learning tasks. This paper considers the problem of a clean-slate system design for a big data machine learning system: if we designed and built a big data framework specifically for machine learning, what would it look like?
The paper is from CMU and appeared in KDD 2015.
Machine Learning (ML) objectives and features
Naturally the paper starts by first identifying the objectives and features of ML systems. "ML defines an explicit objective function over data where the goal is to attain optimality of this function in the space defined by the model parameters and other intermediate variables." Thus, the paper argues, ML algorithms have an iterative-convergent structure and share these principles:
- nonuniform convergence: model parameters converge at different speeds
- error tolerance: iterative-convergent algorithms are resilient against errors and converge/heal towards the solution
- dynamic structure dependency: iterative-convergent algorithms often have changing correlation strengths between model parameters during the course of execution
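The first of these principles is easy to see with a toy experiment (my own illustration, not from the paper): run gradient descent on a quadratic whose coordinates have very different curvatures, and watch the coordinates converge at very different rates.

```python
# Toy illustration of nonuniform convergence: gradient descent on
# f(w) = 0.5 * (100*w1^2 + w2^2). With a common step size, the
# high-curvature coordinate w1 converges orders of magnitude faster
# than w2, so a scheduler could stop spending work on w1 early.

curvatures = [100.0, 1.0]
w = [1.0, 1.0]
lr = 0.009  # small enough to remain stable for the stiffest coordinate

for _ in range(200):
    # gradient of the quadratic is curvature * w, coordinate-wise
    w = [wi - lr * c * wi for wi, c in zip(w, curvatures)]

print(w)  # w[0] is essentially 0; w[1] is still ~0.16 away from its optimum
```

A system that tracks per-parameter progress like this can reallocate computation to the slow coordinates, which is exactly the opportunity the paper's scheduler exploits.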
The paper proposes Petuum, a new distributed ML framework, to leverage these principles.
Stale Synchronous Parallel (SSP) consistency
Petuum leverages error tolerance by introducing SSP (Stale Synchronous Parallel) consistency. SSP reduces network synchronization costs among workers, while maintaining bounded-staleness convergence guarantees. While the BSP (Bulk Synchronous Parallel) approach, used in MapReduce, Giraph, etc., requires the workers to synchronize state and exchange messages after each round, SSP cuts some slack. The SSP consistency model guarantees that if a worker reads from the parameter server at iteration c, it receives all updates from all workers computed at least at iteration c-s-1, where s is the staleness threshold. If there is a straggler more than s iterations behind, the reader will block until the straggler catches up and sends its updates. How do you decide the slack, s? That requires ML expertise and experimenting.
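The SSP read rule above can be sketched in a few lines (a minimal sketch of the rule as stated, not Petuum's actual implementation; the function name is mine):

```python
def ssp_read_allowed(c, worker_clocks, s):
    """SSP read rule: a worker reading at iteration c must see all
    updates through iteration c - s - 1, so the slowest worker must
    have completed at least that iteration; otherwise the reader waits.

    worker_clocks holds the last iteration each worker has completed.
    """
    return min(worker_clocks) >= c - s - 1

# With staleness s = 2, a reader at iteration 10 needs every worker to
# have finished iteration 7.
print(ssp_read_allowed(10, [9, 8, 7], s=2))  # slowest worker is at 7: read proceeds
print(ssp_read_allowed(10, [9, 8, 6], s=2))  # straggler at 6: reader must wait
```

Setting s=0 recovers BSP's lockstep behavior, while larger s trades off staleness of reads against less waiting on stragglers.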
Big data, meet Big model!
Another significant idea in the Petuum paper is the dichotomy between big data and big model. Yes, big data is big, on the order of terabytes or petabytes. And, the paper observes, we now also have a big model problem: ML programs for industrial-scale big data problems use big models with 100s of billions of parameters (Figure 1 gives a nice summary). And this big model needs special attention as well.
Here I will use snippets from the paper to give the proper definitions of these concepts.
Essentially, model-parallelism provides the ability to invoke dynamic schedules that reduce model parameter dependencies across workers, leading to faster convergence. So Petuum uses model-parallelism to leverage nonuniform convergence and dynamic structure dependency to improve the efficiency/performance of ML tasks.
Petuum design
Petuum consists of three main components: the scheduler, the parameter server, and the workers. The scheduler is responsible for enabling model parallelism: it sends subsets of the parameters to workers via the parameter exchange channel. The parameter server stores and updates model parameters, which can be accessed via a distributed shared memory API by both workers and the scheduler. Each worker is responsible for performing operations defined by the user on a partitioned data set and on the parameter subset specified by the scheduler.
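To make the division of labor concrete, here is a toy single-process sketch of the three roles (all class and function names are mine, not Petuum's; the "math" in the worker is a dummy placeholder):

```python
class ParameterServer:
    """Stands in for the distributed-shared-memory table of parameters."""
    def __init__(self, n_params):
        self.table = [0.0] * n_params

    def get(self, idx):
        return [self.table[i] for i in idx]

    def inc(self, idx, deltas):
        # apply additive updates to the selected parameters
        for i, d in zip(idx, deltas):
            self.table[i] += d

def scheduler(n_params, round_size):
    """Yield the parameter subset each round's workers should update."""
    for start in range(0, n_params, round_size):
        yield list(range(start, min(start + round_size, n_params)))

def worker(ps, idx, data_shard):
    """Compute partial updates for the scheduled subset on one shard."""
    return [0.1 * sum(data_shard) for _ in idx]

ps = ParameterServer(4)
for idx in scheduler(4, round_size=2):
    deltas = worker(ps, idx, data_shard=[1, 2])
    ps.inc(idx, deltas)
print(ps.table)  # each entry nudged to roughly 0.3
```

In the real system the scheduler, server, and workers run on separate machines, and the parameter table is partitioned and replicated rather than a single list.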
Here is the Petuum API:
schedule: specify the subset of model parameters to be updated in parallel.
push: specify how individual workers compute partial results on those parameters.
pull [optional]: specify how those partial results are aggregated to perform the full parameter update.
To illustrate the use of the Petuum API, the paper presents the code for a data-parallel Distance Metric Learning (DML) algorithm and for a model-parallel Lasso algorithm.
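The general shape of such a program can be sketched as follows (a hypothetical Python rendering of the schedule/push/pull structure, not the paper's actual code; the least-squares math is illustrative only):

```python
import random

def schedule(num_params, k):
    """Pick k coordinates to update in parallel this round.
    A real scheduler would prioritize by dependency/convergence info;
    here we just sample at random."""
    return random.sample(range(num_params), k)

def push(shard, w, idx):
    """One worker's partial result: per-coordinate gradient contributions
    on its data shard for a least-squares objective."""
    return {j: sum((sum(wi * xi for wi, xi in zip(w, x)) - y) * x[j]
                   for x, y in shard)
            for j in idx}

def pull(w, partials, lr=0.01):
    """Aggregate all workers' partial results into the full update."""
    for j in partials[0]:
        w[j] -= lr * sum(p[j] for p in partials)

# One round with two workers over a toy dataset of (features, target):
random.seed(0)
shards = [[([1.0, 0.0], 1.0)], [([0.0, 1.0], 2.0)]]
w = [0.0, 0.0]
idx = schedule(num_params=2, k=2)
pull(w, [push(s, w, idx) for s in shards])
print(w)  # both coordinates move toward their targets
```

The user supplies only these three functions; the runtime handles parameter distribution, the SSP consistency of reads, and the round-by-round loop.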
It looks like writing an efficient/optimized schedule would require significant expertise. This will often require running the algorithm on a test dataset to observe relations between model parameters, convergence speeds, etc.
Evaluation
The paper provides evaluation results on the performance of Petuum.

Petuum versus TensorFlow
How does Petuum compare with Google's TensorFlow? The TensorFlow framework can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models. And yet, TensorFlow has a completely different design approach than Petuum. TensorFlow uses dataflow graphs. Implementing an ML algorithm/recipe on TensorFlow has a more distributed nature: an algorithm/recipe consists of many operations, and TensorFlow maps one or more of these operations to a node/worker. In Petuum, the entire algorithm/recipe is mapped onto each node/worker, and efficiency is achieved via data-parallel and model-parallel partitioning.

There would be advantages/disadvantages to each approach, and it will be interesting to watch how this plays out in the coming years.