[Foundation-l] Re: Research access to logs

Jerome Jamnicky jeronimwp at yahoo.com.au
Tue Aug 16 04:55:59 UTC 2005


Tobias Denninger wrote:
> Hello..
> 
> ..I think it could be useful to compute the probability that an article B is
> read, given that the same reader read another article A within a short
> timeframe before. Based on these probabilities, suggestions could be made to
> the reader of a specific article about which other articles might also be
> interesting (maybe a kind of collaborative filtering, or Amazon's "Customers
> who bought this book also bought.."). Subscribed users could be offered
> personalized recommendations based on a computation of how probable it is
> that an article is of interest to the specific user who read those articles.
> I'd be interested in implementing that idea, so as a first step I'd be
> interested in a sample log file a few megabytes in size.
> 
> Tobias
> 
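
To make sure I understand the idea, here is a rough sketch of the
calculation as I read it: for each view of article A, check whether the same
reader viewed article B within a fixed window afterwards, and divide by the
number of views of A. Everything here is an assumption for illustration: the
(reader, timestamp, article) tuples, the function name, and the 600 second
window are made up, and "reader" would have to be some anonymised key rather
than anything taken straight from the logs.

```python
from collections import defaultdict


def follow_probabilities(events, window=600.0):
    """Estimate P(B read within `window` seconds after A | A read).

    `events` is an iterable of (reader, timestamp, article) tuples.
    This input shape is hypothetical; it is not the log format.
    """
    # Group page views by reader so pairs are only formed within one reader.
    by_reader = defaultdict(list)
    for reader, ts, article in events:
        by_reader[reader].append((ts, article))

    views = defaultdict(int)    # number of views of each article A
    follows = defaultdict(int)  # views of A followed by B inside the window

    for visits in by_reader.values():
        visits.sort()  # chronological order per reader
        for i, (ts_a, a) in enumerate(visits):
            views[a] += 1
            seen = set()  # count each B at most once per view of A
            for ts_b, b in visits[i + 1:]:
                if ts_b - ts_a > window:
                    break
                if b != a and b not in seen:
                    seen.add(b)
                    follows[(a, b)] += 1

    return {(a, b): n / views[a] for (a, b), n in follows.items()}


# Made-up example data: three readers, timestamps in seconds.
events = [
    ("r1", 0, "Potato"), ("r1", 60, "Esoterica"),
    ("r2", 0, "Potato"), ("r2", 5000, "Esoterica"),
    ("r3", 10, "Potato"), ("r3", 20, "Tomato"), ("r3", 30, "Esoterica"),
]
probs = follow_probabilities(events)
```

With the made-up data above, two of the three "Potato" views are followed
by "Esoterica" inside the window, so that pair comes out at 2/3. A real
version would need a minimum view count per article before trusting the
ratio.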

If you haven't already, please read the privacy policy carefully, and 
also this thread, where somebody made a similar request for a similar 
purpose:
http://mail.wikipedia.org/pipermail/wikitech-l/2005-July/thread.html#30917

A single line from one of the Squid server log files looks like this 
(wrapped here by the mail client; in the file it is one line):

1124167686.523    210 12.34.56.78 TCP_MISS/200 2962 GET 
http://en.wikipedia.org/wiki/Special:Search?search=Potato&go=Go - 
PARENT_HIT/207.142.131.200 text/html [Host: 
en.wikipedia.org\r\nUser-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; 
en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6\r\nAccept: 
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5\r\nAccept-Language: 
en-us,en;q=0.5\r\nAccept-Encoding: gzip,deflate\r\nAccept-Charset: 
ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nKeep-Alive: 300\r\nConnection: 
keep-alive\r\nReferer: http://en.wikipedia.org/wiki/Esoterica\r\n] 
[HTTP/1.0 200 OK\r\nDate: Tue, 16 Aug 2005 04:48:06 GMT\r\nServer: 
Apache\r\nX-Powered-By: PHP/4.3.11\r\nContent-language: en\r\nVary: 
Accept-Encoding,Cookie\r\nExpires: -1\r\nCache-Control: private, 
must-revalidate, max-age=0\r\nContent-Encoding: gzip\r\nConnection: 
close\r\nContent-Type: text/html; charset=utf-8\r\n\r]
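
In case it helps, here is a minimal sketch of how a line like the one above
could be split into fields. It assumes the layout seen in the sample
(timestamp, elapsed time, client address, result/status, size, method, URL,
ident, hierarchy/peer, content type, then the request and response headers
in square brackets with literal "\r\n" separators). The regex, the function
names, and the field names are my own, and the sample string is an
abbreviated copy of the line above with most headers dropped.

```python
import re
from datetime import datetime, timezone

# Field names are illustrative, not official names for the columns.
LINE_RE = re.compile(
    r"(?P<ts>\d+\.\d+)\s+"
    r"(?P<elapsed_ms>\d+)\s+"
    r"(?P<client>\S+)\s+"
    r"(?P<result>[^/\s]+)/(?P<status>\d{3})\s+"
    r"(?P<size>\d+)\s+"
    r"(?P<method>\S+)\s+"
    r"(?P<url>\S+)\s+"
    r"(?P<ident>\S+)\s+"
    r"(?P<hierarchy>[^/\s]+)/(?P<peer>\S+)\s+"
    r"(?P<mime>\S+)"
    # The two bracketed header blocks are treated as optional.
    r"(?:\s+\[(?P<req_headers>.*?)\]\s+\[(?P<resp_headers>.*)\])?\s*$"
)


def parse_headers(blob):
    """Split a logged header block on the literal four characters \\r\\n."""
    headers = {}
    for part in blob.split(r"\r\n"):
        name, sep, value = part.partition(": ")
        if sep:  # skips the status line and the trailing fragment
            headers[name.lower()] = value
    return headers


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.fromtimestamp(float(rec.pop("ts")), tz=timezone.utc)
    for key in ("elapsed_ms", "status", "size"):
        rec[key] = int(rec[key])
    rec["req_headers"] = parse_headers(rec["req_headers"] or "")
    rec["resp_headers"] = parse_headers(rec["resp_headers"] or "")
    return rec


# Abbreviated copy of the sample line, joined back into a single line.
SAMPLE = (
    r"1124167686.523    210 12.34.56.78 TCP_MISS/200 2962 GET "
    r"http://en.wikipedia.org/wiki/Special:Search?search=Potato&go=Go - "
    r"PARENT_HIT/207.142.131.200 text/html "
    r"[Host: en.wikipedia.org\r\n"
    r"Referer: http://en.wikipedia.org/wiki/Esoterica\r\n] "
    r"[HTTP/1.0 200 OK\r\nContent-language: en\r\n"
    r"Content-Type: text/html; charset=utf-8\r\n\r]"
)

record = parse_line(SAMPLE)
```

For the recommendation idea, the interesting fields would presumably be the
timestamp, the URL and the Referer header. The client address is exactly the
part the privacy policy is concerned with.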

Note that this format may change in the future.  Is there anything else 
you need to know?

-Jerome
