How to scale the number of page hits?

I'll start by describing my problem.

I have n pages, each with its own popularity factor on a scale of 10. I now have the total page hits for each page, and I want to use those totals to recalculate the popularity factor, again on a scale of 10.

The total page hits are absolute numbers, and I have them for only 1,70,000 pages. In total I have 41,00,000 pages.

My problem is that I don't know how to normalize these total page hits across all of the pages.

I tried doing this:

Popularity factor for each page = total page hits for all pages / total number of pages.

I assume that pages with no data have at least 1 page hit. But that way the denominator becomes a really big number, and I get lost when trying to bring the result onto a scale of 10.

Can anyone please help me with how to approach this?

Answers


There are several ways to do it. Here are some examples:

Absolute popularity

Find the number of hits of the most popular page.

Assign a popularity score based on the number of hits compared to the most popular page:

0-10% = popularity 1, 10-20% = popularity 2 and so on.
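As a rough sketch of the absolute approach, assuming the hit counts are held in a plain dict (the function name `absolute_popularity` and the sample page names are made up for illustration). It also assumes a page sitting exactly on a boundary, such as 10% of the maximum, falls into the lower bucket, and that pages with zero hits get the lowest score:

```python
def absolute_popularity(hits, buckets=10):
    """Score each page 1..buckets by its hits relative to the busiest page.

    `hits` maps a page identifier to a non-negative integer hit count.
    """
    max_hits = max(hits.values())
    scores = {}
    for page, count in hits.items():
        if max_hits == 0:
            # No page has any hits, so every page gets the lowest score.
            scores[page] = 1
            continue
        # Integer ceiling of buckets * count / max_hits:
        # (0%, 10%] -> 1, (10%, 20%] -> 2, ..., (90%, 100%] -> 10.
        score = (buckets * count + max_hits - 1) // max_hits
        # Clamp so that a page with zero hits still gets a score of 1.
        scores[page] = min(max(score, 1), buckets)
    return scores


sample = {"home": 1000, "about": 450, "faq": 101, "terms": 100, "old": 0}
print(absolute_popularity(sample))
```

With a few very popular pages, this approach is likely to put most pages into the lowest buckets, because every page is measured against the single largest count.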

Relative popularity

Sort all pages according to number of page hits.

Assign a popularity score based on the position in the list:

0-10% = popularity 1, 10-20% = popularity 2 and so on.
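The relative approach could be sketched like this, again assuming a dict of hit counts (`relative_popularity` is a hypothetical name). The pages are sorted from least to most hits, so the bottom 10% of the list gets score 1 and the top 10% gets score 10:

```python
def relative_popularity(hits, buckets=10):
    """Score each page 1..buckets by its position in the sorted list of pages.

    `hits` maps a page identifier to a hit count.
    """
    # Least popular first, so position 0 is the page with the fewest hits.
    ranked = sorted(hits, key=hits.get)
    total = len(ranked)
    scores = {}
    for position, page in enumerate(ranked):
        # position / total lies in [0, 1); cut that range into equal buckets.
        scores[page] = position * buckets // total + 1
    return scores


sample = {"page%d" % i: i * i for i in range(20)}
print(relative_popularity(sample))
```

Pages with identical hit counts are ordered arbitrarily here (by insertion order, since Python's sort is stable), so two pages with the same count can land in different buckets. If that matters, one option is to give tied pages the score of the highest-ranked page among them.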

Popularity of pages without statistics

I can't give you any advice on how to handle these. If you don't know how many times a page has been accessed, it is really hard to give it a popularity score.

