IPDPS'13 Day 1: Graph Algorithms

Here are some of my notes from the first day of IPDPS.

Optimizations & analysis of BSP graph processing models on public clouds

MapReduce/Hadoop is not really suitable for graph processing (which requires iterating over and over on the same graph), and this led to the Pregel graph processing framework by Google. Pregel is based on the Bulk Synchronous Parallelism (BSP) model. Here is a summary of Pregel if you are not familiar with it. In short, Pregel uses a vertex-centric graph processing model, where the same code is executed on all vertices concurrently. Pregel uses message passing along edges and barrier synchronization at supersteps (i.e., rounds) to iterate over the graph. This paper looks at optimizations and analysis of BSP graph processing frameworks.
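
To make the vertex-centric model concrete, here is a minimal single-source shortest paths sketch in Pregel style. This is my own illustration in Python with invented names (pregel_sssp and friends), not Pregel's actual C++ API: compute logic runs on each active vertex, messages sent in one superstep are delivered at the next, and the job halts when no messages are in flight.

    from collections import defaultdict

    INF = float("inf")

    def pregel_sssp(edges, source):
        # edges: {vertex: [(neighbor, weight), ...]}
        dist = {v: INF for v in edges}
        inbox = defaultdict(list)
        inbox[source].append(0)            # superstep 0: wake up the source
        while inbox:                       # halt when no messages are in flight
            outbox = defaultdict(list)
            for v, msgs in inbox.items():  # compute() runs on each active vertex
                best = min(msgs)
                if best < dist[v]:         # better distance found: update state
                    dist[v] = best
                    for u, w in edges[v]:  # message passing along out-edges
                        outbox[u].append(best + w)
            inbox = outbox                 # barrier: superstep boundary
        return dist

    # prints {'a': 0, 'b': 1, 'c': 3}
    print(pregel_sssp({"a": [("b", 1)], "b": [("c", 2)], "c": []}, "a"))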

This group had access to the Microsoft Azure cloud computing framework, and they wanted to experiment with Pregel there, so they implemented Pregel (following the description in the Pregel paper) from scratch in the .NET environment. They soon noticed that Pregel gets fairly memory intensive as it holds on to all the messages sent to all the vertices in the worker. They started analyzing it further to see how the memory usage changes over the lifetime of a Pregel program across many supersteps. They discovered that there is a camel hump in the middle of the program lifetime (i.e., in the middle supersteps) for most traditional graph programs, such as all-pairs shortest path and betweenness centrality. This is because these graph programs tend to exchange more messages towards the middle supersteps as the computation spreads out, and the number of messages exchanged subsides again as the computation comes closer to termination. (It turns out this hump is not present for PageRank.) This hump, of course, is important as it has an impact on how many workers you need to provision, because you need to provision for the worst-case memory usage of the workers.
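
The hump is easy to reproduce in a toy setting. The following sketch (my own simulation, not the paper's measurements) runs all-pairs BFS, i.e., a BFS from every vertex at once, under BSP delivery rules on a small random graph and prints the number of buffered messages per superstep; the counts rise toward the middle supersteps and subside toward termination.

    import random
    from collections import defaultdict

    def show_message_hump(n=1000, deg=4, seed=42):
        random.seed(seed)
        adj = {v: random.sample(range(n), deg) for v in range(n)}
        reached = defaultdict(set)             # source ids each vertex has seen
        inbox = {v: {v} for v in range(n)}     # superstep 0: every vertex is a source
        step = 0
        while inbox:
            print("superstep %d: %d messages buffered"
                  % (step, sum(len(s) for s in inbox.values())))
            outbox = defaultdict(set)
            for v, srcs in inbox.items():
                new = srcs - reached[v]        # sources reaching v for the first time
                if not new:
                    continue
                reached[v] |= new
                for u in adj[v]:
                    outbox[u] |= new           # propagate one hop further
            inbox = {v: s for v, s in outbox.items() if s}
            step += 1

    show_message_hump()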

So the group goes on to look into how they can constrain this hump to get a predictable memory usage through all supersteps and to facilitate managing the memory constraints of the workers. To this end, they come up with the swath concept. A swath is a subset of vertices of the entire graph on which the algorithm is initiated. Their goal is to pick the swath size that best fits into the main memory (amplitude) of the workers. They work on identifying swath initiation heuristics (when subsequent swaths of vertices are activated) and swath size heuristics (how many vertices are active concurrently in a swath). They experiment with two approaches, a sampling approach and an adaptive approach, to decide when the next swath is initiated. By breaking computation into swaths of vertices and using their sizing heuristics, they are able to achieve up to 3.5x speedup over the maximum swath size that does not cause a failure. Of course, a limitation of the swath approach is that it assumes that the program execution is embarrassingly parallel and you can execute the program over swaths distributed in time without causing any correctness issues. So this approach is applicable only to those types of graph programs, such as all-pairs shortest path and betweenness centrality.
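
Here is a simplified sketch of how adaptive swath sizing might look. The names and the feedback rule are my own invention, the paper's sampling and initiation heuristics are more involved, and I run swaths strictly one after another for brevity.

    def run_in_swaths(sources, run_batch, peak_memory, budget, init_size=64):
        # run_batch(swath) runs the vertex program to completion for one swath;
        # peak_memory(swath) reports the worst memory usage it observed (the hump).
        # Both are assumed to be supplied by the surrounding framework.
        size = init_size
        i = 0
        while i < len(sources):
            swath = sources[i:i + size]
            run_batch(swath)
            observed = peak_memory(swath)
            if observed < budget // 2:      # comfortably under budget: grow the swath
                size *= 2
            elif observed > budget:         # over budget: back off
                size = max(1, size // 2)
            i += len(swath)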

The hump observation in memory usage of BSP-based graph processing frameworks is a nice and useful one. We have also worked on BSP-based graph processing frameworks and focused on improving the open-source Giraph implementation of Pregel. We provided serializability to Giraph by introducing an optimization: internal vertices in a worker do not message each other but rather read each other's state directly from the memory of the worker they reside in. Our optimization not only provides a stronger serializability to Giraph, but it also prevents this memory camel-hump from haunting BSP programs as well. Our paper has been accepted to EuroPar'13, and I will write a detailed post about our paper soon.
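
A rough sketch of the message-avoidance aspect of that optimization (hypothetical names, Python instead of Giraph's Java, and with the serializability machinery omitted): vertices hosted on the same worker read each other's state directly out of worker memory, and only cross-worker edges generate real messages that need to be buffered.

    def worker_superstep(my_vertices, adj, owner, me, state, inbox, update):
        # One superstep on worker `me`. Intra-worker neighbor values are read
        # directly out of `state`; only cross-worker edges produce messages.
        outbox = []
        for v in my_vertices:              # vertices run one at a time on the worker
            vals = list(inbox.get(v, []))  # messages that arrived from other workers
            vals += [state[u] for u in adj[v] if owner[u] == me]  # direct reads
            state[v] = update(state[v], vals)
            outbox += [(u, state[v]) for u in adj[v] if owner[u] != me]
        return outbox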

Multithreaded graph partitioning 

This paper describes the design and development of an extension to the Metis partitioner to enable multithreading, and while doing so the paper also thoroughly explores the Metis design space.

The Metis partitioner is a tool for splitting a graph into minimally connected and roughly equal parts. While partitioning the graph, the constraint is that the largest partition produced should be smaller than a given size (the worker memory size), and this makes the problem NP-hard. Metis works via a multilevel paradigm of 1) coarsening, 2) initial partitioning, and 3) uncoarsening. In the first two steps an approximate partitioning is made via coarsening, and then Metis does a refinement on the border vertices to find a better partitioning in the final step. Since the coarsening works over all vertices instead of only border vertices, it is generally the bottleneck step.
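
The control flow of the multilevel paradigm fits in a few lines; here is a skeleton in which the three callbacks stand in for Metis's actual heuristics (matching-based coarsening, recursive bisection, border refinement). This is an illustration of the scheme, not Metis code.

    def multilevel_partition(graph, k, coarsen, initial_partition, refine, small=128):
        if len(graph) <= small:
            return initial_partition(graph, k)   # step 2: partition the coarsest graph
        coarse, cmap = coarsen(graph)            # step 1: contract matched vertex pairs
        coarse_part = multilevel_partition(coarse, k, coarsen,
                                           initial_partition, refine, small)
        part = {v: coarse_part[cmap[v]] for v in graph}  # project the partition back
        return refine(graph, part)               # step 3: move only border vertices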

The paper investigated several alternatives for each of the three steps above. For the coarsening step they looked at fine-grain matching (locking based), multi-pass matching, and unprotected matching (which requires a conflict resolution at the end, but this is no problem because only a small percentage of conflicts occurs). For the initial partitioning they tried parallel recursive bisectioning and parallel k-sectioning. For the refinement step they tried coarse-grain and fine-grain approaches. They give an experimental evaluation of all these different approaches on graph datasets (roadmap and VLSI circuit) that consist of millions of vertices. They evaluated for performance and cut quality, and showed that their multithreaded Metis is a clear winner.
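
To make the unprotected matching variant concrete, here is a toy sketch (my own illustration, not the paper's code): threads claim matches with no locks, and a cheap serial pass afterwards discards the few asymmetric claims the races leave behind. Real Metis matching is weight-aware (heavy-edge matching) and far more careful.

    import threading

    def unprotected_matching(adj, nthreads=4):
        match = {v: None for v in adj}
        verts = list(adj)

        def work(tid):
            for v in verts[tid::nthreads]:
                if match[v] is None:
                    for u in adj[v]:
                        if u != v and match[u] is None:  # unprotected read
                            match[v] = u                 # unprotected writes: may race
                            match[u] = v
                            break

        threads = [threading.Thread(target=work, args=(t,)) for t in range(nthreads)]
        for t in threads: t.start()
        for t in threads: t.join()

        for v in verts:                    # conflict resolution: keep only pairs
            u = match[v]                   # that mutually agree
            if u is not None and match[u] != v:
                match[v] = None
        return match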

One of the lessons learned from the multithreaded Metis is that using unprotected operations (for the coarsening step) is not that unsafe or crazy, because cleaning up after race conditions turned out to be faster than preventing them. This group made their code open source at http://www.cs.umn.edu/~metis/

Finally, some ramblings

I never understood the people that go to all the trouble of traveling to a conference only to then sit in the lobby or the room to websurf and do email hunched over their laptops. If there is a sensible explanation for this behavior, can somebody tell me, so I can stop wondering about it? Yes, presenters are not always doing a great job at explaining things, but after all the trouble of traveling to the conference, those people owe it to themselves to get the most out of the conference by being there as a whole and not just physically.

My theory about the low quality of some presentations is that the presenters often give presentations to impress and not to teach/educate/communicate. (Ted Herman once warned me about this when I was a graduate student, and I have tried to do my best not to fall into that trap ever since.) I believe that by just focusing on the message to communicate and being committed to communicating it, most presenters would be able to do a decent job. Instead, presenters seem to feel like they need to show off how technically thorough and how clever they have been, how sophisticated they are, and the result is a dry, defensive, and incomprehensible presentation. Look, you have been thinking about that problem for the last year at least, I am hearing/seeing about it for the first time here, and you expect me to understand all the subtleties in that problem space in 10 minutes? If you are committed to teaching/educating/communicating in your allotted 25-minute time slot, you should focus on explaining the insight and the most important technical results (not all of them) in the simplest of terms. You can mention that the problem has several subtleties and refer the audience to the technical paper for the full details. I am not saying that you shouldn't be technical; you can be technical, but not at the expense of being cryptic or exclusive.
