selfie icon indicating copy to clipboard operation
selfie copied to clipboard

Autograde GC assignment

Open ckirsch opened this issue 5 years ago • 8 comments

@Blazefrost @aemonk Please design an assignment for the gc in selfie. Ideally, I'd like students to bring down its complexity from O(n^2) to O(n) where n is the heap size. This requires students to find a way to determine whether a value is a gced pointer in O(1) rather than O(n). However, checking an algorithms complexity is not easy. Yet mipster knows the number of executed instructions...

ckirsch avatar Sep 13 '20 07:09 ckirsch

There is a paper about deriving the asymptotic complexity of an algorithm by measuring the empirical computational complexity: http://sfg.users.sonic.net/berkeley/trendprof-fse-2007.pdf

Basically, Goldsmith et al suggest that, using different workloads, we could try to create a linear model using least-squares linear regression and rate its performance using a goodness-of-fit statistic.

BlazeFrost avatar Oct 01 '20 14:10 BlazeFrost

I'll try to implement it myself and plot both master and my implementation using different allocation counts to see how much they differ.

BlazeFrost avatar Oct 01 '20 14:10 BlazeFrost

@Blazefrost Very nice! I am looking forward to that. We may be able to use that for other assignments too.

ckirsch avatar Oct 02 '20 11:10 ckirsch

There are two ways we could profile the instruction count:

  • GC in syscall mode: The test is run on a GC-enabled mipster which itself runs on a mipster to profile the executed instructions
  • GC in library mode: The test is compiled with selfie.h and the library GC is enabled at the begin of the test using turn_on_gc_library

I think it would be beneficial to perform the test in library mode because it allows us force a GC collection on each allocation instead of having to allocate n*(1000) objects to provoke n collections.

Furthermore, having to do n*(1000) allocations in syscall mode introduces a lot of overhead, even with precompiled selfie and (preliminary) test binaries. It took my machine 26.10s for the first test, but we will need a few more data points to perform a linear regression. Also, the nested emulation will bloat the instruction count of the outer emulator that would be used to perform the regression.

BlazeFrost avatar Oct 04 '20 19:10 BlazeFrost

@Blazefrost Measuring the GC as library is fine. However, try to design something that can determine the algorithmic complexity in a given parameter of any RISC-U binary, not just GC. Maybe a special mipster that runs a given binary n times and does so by feeding the parameter to the binary as console argument. It can then count the executed instructions and estimate complexity accordingly. Let's call that thing complexr for now.

ckirsch avatar Oct 05 '20 09:10 ckirsch

Now that conqueror is on hiatus in favor for the model checker project, how do we proceed with that issue? Should I implement the asymptotic complexity approximation into Selfie or add a Python tool?

BlazeFrost avatar Nov 17 '20 00:11 BlazeFrost

@Blazefrost Let's discuss this tomorrow. Please remind me.

ckirsch avatar Nov 19 '20 15:11 ckirsch

@Blazefrost By now there is a Boehm GC implementation in the master branch. However, I think we should still explore grader support for measuring time/space complexity of algorithms. Would be really neat to have that. What do you think?

ckirsch avatar Feb 15 '21 09:02 ckirsch