On Tue, Nov 7, 2017 at 10:45 AM Faidon Liambotis <faidon(a)wikimedia.org>
wrote:
On Tue, Nov 07, 2017 at 11:19:57AM -0700, Bryan Davis wrote:
We could probably add checks for some common ones if someone compiled a
list.
Running a full spell check would be difficult because of the number of
false positives there would be based on a "normal" dictionary. Commit
messages often contain technical jargon (maybe something to try and
avoid) and snippets of code (e.g. class names like
TemplatesOnThisPageFormatter) that would not be in any traditional
dictionary that we could count on being on the local host.
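As a rough illustration of the "list of common ones" idea, here is a
minimal Python sketch. The COMMON_TYPOS entries and the function name
find_common_typos are made up for the example; a real list would have to
be compiled from typos actually seen in commit messages. Because it only
looks up whole words in a known-typo list, identifiers such as
TemplatesOnThisPageFormatter are never flagged.

```python
import re

# Hypothetical starter list; a real one would be compiled from observed typos.
COMMON_TYPOS = {
    "teh": "the",
    "recieve": "receive",
    "seperate": "separate",
    "occured": "occurred",
}

# Runs of ASCII letters; code identifiers come out as one long "word".
WORD_RE = re.compile(r"[A-Za-z]+")


def find_common_typos(message):
    """Return (line_number, word, suggestion) for each known typo in message."""
    hits = []
    for lineno, line in enumerate(message.splitlines(), start=1):
        for word in WORD_RE.findall(line):
            suggestion = COMMON_TYPOS.get(word.lower())
            if suggestion is not None:
                hits.append((lineno, word, suggestion))
    return hits
```

A word that is not in the list is simply ignored, which is what keeps the
false-positive rate low compared to a full dictionary check.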
Debian's lintian (lint tool for packages) has a check for common
typos/misspellings in its informational mode. The package ships with
/usr/bin/spellintian which is a simple spellchecker that can run
independently.
The benefit of using spellintian over e.g. aspell is that it addresses
the issues you already identified: a) it only flags known typos rather
than complaining about words it doesn't recognize, b) it's been created
from observing typos in source code and package descriptions in the wild,
so it's tailored to technical jargon and its misspellings. It could be a
good fit for git commit messages.
That doesn't mean it's free of false positives though, so I wouldn't
recommend using it as a voting check in a CI pipeline.
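A non-voting wrapper could look something like the Python sketch below.
It assumes spellintian reads the text to check from standard input when
no file arguments are given, and it treats every non-empty output line
as one warning without relying on the exact output format. The function
names spellintian_warnings and advisory_check are made up for the
example.

```python
import shutil
import subprocess
import sys


def spellintian_warnings(message, executable="spellintian"):
    """Return the checker's output lines for message, or [] if it is not installed."""
    path = shutil.which(executable)
    if path is None:
        return []
    # Assumption: with no file arguments, spellintian checks text from stdin.
    result = subprocess.run(
        [path], input=message, capture_output=True, text=True, check=False
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def advisory_check(message, executable="spellintian"):
    """Print warnings to stderr and always return 0, so the check never blocks."""
    for warning in spellintian_warnings(message, executable):
        print("possible typo: " + warning, file=sys.stderr)
    return 0
```

Returning 0 unconditionally is the point: a false positive shows up as a
comment for the author to look at, not as a failed build.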
Plus, you know, intentional misspellings:
"Fix misspelling of wikinedia -> wikimedia"
-Chad