How to approach this algorithm question?

  • A website has a database of n questions.
  • You click a button and are shown one random question per click. The probability of a particular question showing up at the click event is 1/n.

On average, how many clicks would be required to see all the questions in the database?

What is the approach required for such questions?

Answers


See coupon collector's problem.


This is a mathematical question rather than an algorithmic one. As sdcvvc said, this is the famous coupon collector's problem.

Suppose you have n questions to "gather". Let X be a random variable denoting the required number of clicks. If we define Xi to be the number of clicks from the moment we have (i-1) questions to the moment we have i questions, then:

X = X1 + X2 + ... + Xn

Due to the linearity of the expected value:

E(X) = E(X1 + X2 + ... + Xn) = EX1 + EX2 + ... + EXn

If we inspect Xi, we see that in fact it has geometric distribution with p=(n-i+1)/n, hence a mean value of n/(n-i+1). Therefore:

EX = n * (1/n + 1/(n-1) + ... + 1/2 + 1/1) = n * Hn

Where Hn is the nth Harmonic number, which can be approximated by ln n:

EX ~= n * ln n

You can run a simple simulation and test this approximation.


Why don't we run a simulation and found out?

<?php

function simulate($size) {

    $range = range(1, $size);

    $hits = 0;
    $hit = array();

    while(count($hit) != $size) {
        $rand = array_rand($range);
        $hit[$rand] = 1;
        $hits++;
    }

    return $hits;

}

for ($j=10; $j<101; $j+=10) {
    $res = array();
    for ($i=0; $i<10; $i++) {
        $res[] = simulate($j);
    }
    echo "for size=$j, avg=" . array_sum($res)/10 . "<br />";
}

Output:

for size=10, avg=35.9
for size=20, avg=68.8
for size=30, avg=123.3
for size=40, avg=176.9
for size=50, avg=205.9
for size=60, avg=276.8
for size=70, avg=304.9
for size=80, avg=401.9
for size=90, avg=371
for size=100, avg=617.7

Need Your Help

Minus from slider value in AfterEffects

expression adobe numerical after-effects

I'm making a timer. When the minutes value reaches 60, it has to decrease by 60 and increment the hour. It's tracking a time lapse; the minutes is currently the time of the computation divided by ...

How to combine an image with a mask into one single UIImage with Accelerate Framework?

ios iphone uiimage core-graphics accelerate-framework

This code combines an image and a grayscale mask image into one UIImage. It works but it is slow.