bespin
Reference implementations of data-intensive algorithms in MapReduce and Spark
The current build command `mvn clean package` results in a build failure due to 501 errors. Detailed error:
```
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 33.970...
```
PageRank-related classes still use commons-cli for parsing args. Refactor them to use args4j.
Documentation can be taken from the 2018w iteration of my big data course: https://lintool.github.io/bigdata-2018w/assignment7-451.html
The reference implementation of PageRank throws away accuracy when calculating missing mass (RunPageRankBasic.java:456) by bringing the log-probability back into linear space to compute the missing mass. Obviously, this introduces error...
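One possible way to keep the missing-mass computation in log space is sketched below. It is an illustration and not the code in RunPageRankBasic.java: the class and method names (`LogMissingMass`, `logMissingMass`, `naiveLogMissingMass`) are hypothetical, and it assumes the total mass is available as a natural-log probability `logTotal <= 0`, so that the missing mass is `1 - exp(logTotal)`. The sketch uses the standard `log1p`/`expm1` trick to evaluate `log(1 - exp(logTotal))` without first materializing a value that rounds to 1.

```java
// Hypothetical helper, not part of Bespin: computes the log of the missing
// mass directly from the log of the total mass.
class LogMissingMass {

  // Returns log(1 - exp(logTotal)) for logTotal <= 0.
  // Near zero, -expm1(logTotal) keeps the small difference 1 - exp(logTotal)
  // accurate; further from zero, log1p(-exp(logTotal)) is the accurate form.
  static double logMissingMass(double logTotal) {
    if (logTotal > 0.0) {
      throw new IllegalArgumentException("logTotal must be <= 0: " + logTotal);
    }
    if (logTotal == 0.0) {
      return Double.NEGATIVE_INFINITY; // no mass is missing
    }
    if (logTotal > -Math.log(2.0)) {
      return Math.log(-Math.expm1(logTotal));
    }
    return Math.log1p(-Math.exp(logTotal));
  }

  // Naive version for comparison: goes back to linear space first, so a
  // total mass very close to 1 can lose the missing mass to rounding.
  static double naiveLogMissingMass(double logTotal) {
    double missing = 1.0 - Math.exp(logTotal);
    return Math.log(missing);
  }

  public static void main(String[] args) {
    double logTotal = -1e-17; // total mass extremely close to 1
    System.out.println("log-space: " + logMissingMass(logTotal));
    System.out.println("naive:     " + naiveLogMissingMass(logTotal));
  }
}
```

If the rest of the job needs the missing mass as a linear-space value, `Math.exp(logMissingMass(logTotal))` recovers it; the benefit of the log-space form is largest when the total mass is close to 1 and the missing mass is therefore small.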
Cloud9 has a bunch of integration tests that weren't copied over to Bespin - bring them over.
Cloud9 has a bunch of unit tests that weren't copied over to Bespin - bring them over.
Addition of syntactic sugar in Scala for MapReduce classes, as well as Scala implementations of all Java MapReduce applications. In addition to the implementations, local integration tests were written to...