Suggestion for answer to question 7a
For any values of k or n greater than 2 the fastest algorithm is probably to simply use a hash-table which would be amortized O(n). But the answer given suggests doing a O(n^k) grid search which (in my opinion) isn't a good idea.
Another "pure" mathematical approach would be to let the non-missing values be p_1,...,p_{n-k} which are the roots of the polynomial f(x)=(x-p_1)...(x-p_{n-k})=x^{n-k}+s_1x^{n-k-1}+...+s_{n-k-1}x+s_{n-k} and calculate the symmetric homogeneous polynomials s_1,...,s_{n-k} via Newton's identities. Then we plug 1,2,...n into the polynomial f and the missing values are the non-roots. Depending on the value of k this could be faster than the grid search (but still a lot slower than the hash table).
Also a quick note about question 12: you can prove that with b blue and r red balls and 2 urns, the optimal answer is to put 1 blue in one urn and b-1 blue, r red in the other urn by noting that the events of selecting each individual red ball are mutually exclusive and have probability >= 0.5*1/(b+r-1), thus the probability of selecting a blue ball is at most 1- 0.5r/(b+r-1).
Here's an elementary solution to question 38: Let p_i be the probability of reaching n before 0 when starting from i. p_0=0 and p_n=1 and by linearity of expectation, p_i=.5p_{i-1}+.5p_{i+1} (*), so we can solve this system of equations for p_i=i/n. A shortcut for solving this is letting q_i=p_i-p_{i-1} and noting that q_{i+1}=q_i from (*), so q_i=1/n for all i.
These are great, I will try adding them to the next update of the book along with some other suggestions I've been getting.
For any values of k or n greater than 2 the fastest algorithm is probably to simply use a hash-table which would be amortized O(n). But the answer given suggests doing a O(n^k) grid search which (in my opinion) isn't a good idea.
Could you expand a bit on the hash table idea? I think this was suggested by another reader as well (if i understand correctly) and I've added it to a draft that I hope to publish in the next few days. I also used it once during an interview, but the interviewer started debating with me whether constructing the hash table amounts to "sorting" and thus should have O(nlogn) complexity.
This is a very good example, since it shows what I've always believed: There may be a right or best answer, but that is not necessarily the answer that the interviewer wants to hear.
You'll see I pushed some changes to question 7 today (suggested to me a long time ago, I really need to work on this more). Is that solution similar to your hashtable one?
Yes! The boolean array is a better way of implementing the "hash table" since you have a perfect hash function. It's the same idea as what I meant by "hash table" but more refined.
For 7a on page 11 can't we just use counting sort to place the n-k integers in an array of size n, then loop through the array to see which k integers are missing?
For example, if k=2, n=5, and the numbers are {1,4,2}, you use counting sort to place them in the respective slots in the length 5 array [1, 2, nan, 4, nan], then loop through it and see that at positions 3 and 5 it's empty, so 3 and 5 must be the missing numbers. This is O(n).