I would be more than willing to help (keyword: help) compile a list of
common typos. Just let me know and I can start a spreadsheet or something.
On Nov 7, 2017, at 1:26 PM, Chad <innocentkiller(a)gmail.com> wrote:
On Tue, Nov 7, 2017 at 10:45 AM Faidon Liambotis <faidon(a)wikimedia.org> wrote:
On Tue, Nov 07, 2017 at 11:19:57AM -0700, Bryan Davis wrote:
We could probably add checks for some common ones if someone compiled a
list.
Running a full spell check would be difficult because of the number of
false positives there would be based on a "normal" dictionary. Commit
messages often contain technical jargon (maybe something to try and
avoid) and snippets of code (e.g. class names like
TemplatesOnThisPageFormatter) that would not be in any traditional
dictionary that we could count on being on the local host.
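A check against a compiled list could look roughly like the sketch below.
This is only an illustration: the COMMON_TYPOS entries and the find_typos
name are placeholders made up for the example, not an existing list or
tool. Because it only matches words that appear in the list, identifiers
such as TemplatesOnThisPageFormatter are never flagged.

```python
import re

# Hypothetical starter list; a real one would be compiled by volunteers.
COMMON_TYPOS = {
    "teh": "the",
    "recieve": "receive",
    "seperate": "separate",
    "occured": "occurred",
}

# Match runs of ASCII letters; anything else acts as a word boundary.
WORD_RE = re.compile(r"[A-Za-z]+")


def find_typos(message, typos=COMMON_TYPOS):
    """Return (line_number, word, suggestion) for each listed typo found."""
    hits = []
    for lineno, line in enumerate(message.splitlines(), start=1):
        for match in WORD_RE.finditer(line):
            word = match.group(0)
            # Only exact, case-insensitive matches against the list count;
            # unknown words are deliberately ignored.
            suggestion = typos.get(word.lower())
            if suggestion is not None:
                hits.append((lineno, word, suggestion))
    return hits
```

Calling find_typos() on a commit message returns an empty list when
nothing in the list matches, so a caller can decide whether to warn or
stay silent.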
Debian's lintian (lint tool for packages) has a check for common
typos/misspellings in its informational mode. The package ships with
/usr/bin/spellintian which is a simple spellchecker that can run
independently.
The benefit of using spellintian over e.g. aspell is that it addresses
the issues you already identified: a) it only flags known typos rather
than complaining about words it doesn't know, and b) it was built from
typos observed in source code and package descriptions in the wild, so
it's tailored to technical jargon and its misspellings. It could be a
good fit for git commit messages.
That doesn't mean it's free of false positives though, so I wouldn't
recommend using it as a voting check in a CI pipeline.
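As a rough sketch of what a non-voting use might look like, the wrapper
below runs spellintian on a file containing the commit message and prints
whatever it reports, but always returns 0. The function names are
hypothetical, and it assumes spellintian accepts a file path as its
argument and writes its findings to stdout; if the binary is not
installed it silently reports nothing.

```python
import shutil
import subprocess
import sys


def spellintian_report(path, binary="spellintian"):
    """Return the checker's non-empty output lines, or [] if it is missing."""
    if shutil.which(binary) is None:
        # Tool not installed: treat as "nothing to report".
        return []
    result = subprocess.run(
        [binary, path], capture_output=True, text=True, check=False
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def advisory_check(path, binary="spellintian"):
    """Print possible typos to stderr; never fail the commit or build."""
    for line in spellintian_report(path, binary):
        print("possible typo:", line, file=sys.stderr)
    # Always succeed: false positives make this unsuitable as a gate.
    return 0
```

A git commit-msg hook receives the path of the message file as its first
argument, so a hook script could pass that path to advisory_check() and
exit with its return value.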
Plus, you know, intentional misspellings:
"Fix misspelling of wikinedia -> wikimedia"
-Chad
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l