On 7/25/05, tpryor(a)media.mit.edu <tpryor(a)media.mit.edu> wrote:
> For this I need access to user page requests over time, preferably stored
> in a database. I can provide a script that will translate users' IP
> addresses to a unique signature so that the users themselves remain
> anonymous, stuff the data into a reasonably size-efficient MySQL table, etc.
Why would you need Wikipedia's data for the design? Couldn't you set
up a wiki and use that data, or even use pseudo-random data?
> I was told that I might need to talk to Kate about the feasibility of
> doing this. Are there any existing objections to retaining anonymized
> Apache log data for research purposes?
No objections to anonymized data. The trouble is that you can't really
anonymize this data all that easily. Just getting a program/script to
do the anonymizing would be a great project.
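
To make the "IP address to unique signature" step concrete, here is a rough sketch of what such a script might look like. It is only an illustration under some assumptions: the log lines start with an IPv4 client address (as in Apache's common log format), and the names SECRET_KEY, ip_signature and anonymize_line are made up for this example, not taken from any existing tool.

```python
import hashlib
import hmac
import re

# Hypothetical secret; it would need to be generated once and kept private.
SECRET_KEY = b"replace-with-a-private-random-key"

# Assumes each log line begins with an IPv4 client address.
LEADING_IP = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})\b")


def ip_signature(ip: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable, keyed 16-hex-digit signature for an IP address."""
    digest = hmac.new(key, ip.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:16]


def anonymize_line(line: str, key: bytes = SECRET_KEY) -> str:
    """Replace the leading client IP of a log line with its signature."""
    return LEADING_IP.sub(lambda m: ip_signature(m.group(1), key), line, count=1)


if __name__ == "__main__":
    sample = ('192.0.2.10 - - [25/Jul/2005:10:00:00 +0000] '
              '"GET /wiki/Main_Page HTTP/1.0" 200 1234')
    print(anonymize_line(sample))
```

The hash is keyed because the IPv4 address space is small enough that a plain, unkeyed hash could be reversed by trying every address. Even so, this only hides the address itself. The sequence of pages requested under one signature can still point to a particular person, which is why the anonymizing is harder than it looks.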