On 04/09/06, Gregory Maxwell <gmaxwell(a)gmail.com> wrote:
I wish Aaronsw had been a little more open about his
methodology in his results.
While working on bots for automated vandalism I found pure diffwords to
be a poor metric of what has actually changed, because confused new users
often manage to insert big gobs of crud that get reverted.
A similar test, which used the IBM history flow tool to
attribute all the text in the most recent version of an article to its
original author, did not find similar results. This might be because
copyediting can cause the history flow tool to misattribute, or it
might be indicative of a systematic flaw in Aaronsw's research.
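
To make the difference between the two metrics concrete, here is a rough Python sketch. It is not Aaronsw's method and not how the IBM history flow tool works internally; it only assumes a hypothetical list of (author, text) revisions, oldest first, and uses the standard library's difflib to compare word lists. words_added() counts every word an editor inserts, including crud that is later reverted; attribute_words() credits each word that survives in the latest revision to the editor who first introduced it.

```python
from collections import Counter
from difflib import SequenceMatcher


def words_added(revisions):
    """Count words inserted by each author across all edits (a 'diffwords'-style metric)."""
    counts = Counter()
    prev_words = []
    for author, text in revisions:
        words = text.split()
        matcher = SequenceMatcher(a=prev_words, b=words, autojunk=False)
        for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
            if tag in ("insert", "replace"):
                counts[author] += j2 - j1
        prev_words = words
    return counts


def attribute_words(revisions):
    """Return (word, author) pairs for the latest revision, crediting each
    word to the author of the revision where it first appeared."""
    prev_words, prev_owners = [], []
    for author, text in revisions:
        words = text.split()
        # By default every word in this revision belongs to its editor...
        owners = [author] * len(words)
        matcher = SequenceMatcher(a=prev_words, b=words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # ...unless it was carried over unchanged from the previous revision.
            if tag == "equal":
                owners[j1:j2] = prev_owners[i1:i2]
        prev_words, prev_owners = words, owners
    return list(zip(prev_words, prev_owners))


# Hypothetical history: a new user adds crud, which is then reverted.
revisions = [
    ("alice", "the cat sat on the mat"),
    ("newbie", "the cat sat on the mat lots of random crud here"),
    ("alice", "the cat sat on the mat"),
    ("bob", "the black cat sat on the mat"),
]

added = words_added(revisions)
surviving = Counter(owner for _word, owner in attribute_words(revisions))
print("words added:    ", dict(added))
print("words surviving:", dict(surviving))
```

In this made-up history the new user scores well on words added but owns nothing in the final text. The sketch also shows the misattribution risk: it only compares adjacent revisions, so text that is removed and later restored is credited to whoever restored it, and a copyedit that rewrites a sentence takes credit for the whole rewritten span.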
Hop on the page and suggest useful approaches to him. I assume if he's
running for the board because he thinks this is an important issue, he
would probably prefer not to be leading himself up the garden path.
- d.