Joseph E. Gonzalez

Results 15 issues of Joseph E. Gonzalez

@simon-mo pointed out the following great list of additional reading: https://github.com/mcanini/SysML-reading-list/blob/master/README.md Let's try to pull some of these papers into the current list.

For consistency with notebook models like IPython and Mathematica, when a cell is evaluated the output should probably not "overwrite" the code by default but instead be displayed below it....

This is a work-in-progress implementation of the collapsed Gibbs sampler for the LDA model using the GraphX abstraction primitives. While this is based on the (non-ergodic) bulk synchronous...

Add a basic implementation of the variational mean field algorithm to Analytics. In addition, create a synthetic noisy image generator.

enhancement

To help with benchmarking, let's create some synthetic graph generators. The Pregel paper describes a log-normal generator, which is relatively easy to implement.

enhancement
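
A minimal sketch of what such a generator might look like, in plain Scala without any Spark or GraphX dependency. It draws each vertex's out-degree from a log-normal distribution and picks edge targets uniformly at random. The object name `LogNormalGraph`, the method names, and the parameters (`mu`, `sigma`, `seed`) are hypothetical and chosen for illustration. Uniform target selection and the cap on the degree are assumptions, not details taken from the issue.

```scala
import scala.util.Random

// Hypothetical helper for illustration; not part of GraphX.
object LogNormalGraph {

  // Draws an out-degree as round(exp(mu + sigma * Z)) with Z ~ N(0, 1),
  // capped at maxDegree so a single vertex cannot dominate the graph.
  def sampleDegree(rng: Random, mu: Double, sigma: Double, maxDegree: Int): Int = {
    val raw = math.exp(mu + sigma * rng.nextGaussian())
    math.min(maxDegree.toLong, math.round(raw)).toInt
  }

  // Generates a directed edge list over vertex ids 0 until numVertices.
  // Targets are chosen uniformly at random (an assumption), so self-loops
  // and duplicate edges may occur.
  def generate(numVertices: Int, mu: Double, sigma: Double, seed: Long): Seq[(Long, Long)] = {
    val rng = new Random(seed)
    (0 until numVertices).flatMap { src =>
      val degree = sampleDegree(rng, mu, sigma, numVertices)
      Seq.fill(degree)((src.toLong, rng.nextInt(numVertices).toLong))
    }
  }
}
```

Fixing the seed keeps benchmark runs repeatable. The resulting edge list could then be parallelized into an RDD, though that step is left out here.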

The `VertexSetRDD[VD]` stores the vertex attributes as an `IndexedSeq[VD]`. When a `VertexSetRDD` is first constructed from an `RDD[(Vid,VD)]` the attributes are stored in an `Array[VD]`. When `mapValues` is invoked...

enhancement

The Spark `RDD.collect` operation stores the output directly into an array. Since we reuse the iterator values, only a single edge triplet is stored (in duplicate) for each partition.

bug
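
The aliasing behind this bug can be reproduced without Spark. The sketch below uses a plain Scala iterator that mutates and re-emits one shared object, then materializes it into an array. The `Triplet` class and `IteratorReuseDemo` object are hypothetical stand-ins, not the GraphX types. Copying each element before it is stored is one possible workaround, not necessarily the fix that was adopted.

```scala
// Hypothetical stand-in for a mutable edge triplet; not the GraphX class.
final class Triplet(var srcId: Long, var dstId: Long) {
  def copyOf: Triplet = new Triplet(srcId, dstId)
}

object IteratorReuseDemo {

  // Returns an iterator that overwrites and re-emits ONE shared Triplet,
  // mimicking an iterator that reuses its value to avoid allocation.
  def reusingIterator(edges: Seq[(Long, Long)]): Iterator[Triplet] = {
    val shared = new Triplet(0L, 0L)
    edges.iterator.map { case (s, d) =>
      shared.srcId = s
      shared.dstId = d
      shared
    }
  }

  def main(args: Array[String]): Unit = {
    val edges = Seq((1L, 2L), (2L, 3L), (3L, 4L))

    // Every array slot references the same object, so only the last edge survives.
    val aliased = reusingIterator(edges).toArray
    println(aliased.map(t => (t.srcId, t.dstId)).mkString(", "))
    // prints: (3,4), (3,4), (3,4)

    // Copying each element as it is produced preserves the distinct values.
    val copied = reusingIterator(edges).map(_.copyOf).toArray
    println(copied.map(t => (t.srcId, t.dstId)).mkString(", "))
    // prints: (1,2), (2,3), (3,4)
  }
}
```

The copy costs one allocation per element, which is the overhead the reuse was presumably meant to avoid, so it would likely be applied only on paths that materialize the results.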

Based on our discussion today, it seems like it might be helpful to have a function of the form: ``` scala def contractEdges( ePred: EdgeTriplet[VD,ED] => Boolean, contractFun: EdgeTriplet[VD,ED] =>...

enhancement
interface change

Improve graph parsing and load performance.

enhancement