Re: [Wikitech-l] New diff feature for MediaWiki

9 Jun 2006

On 6/8/06, Roman Nosov <rnosov(a)gmail.com> wrote:
...

 Well it looks like my question about why some quotation marks do break
 words and others don't will remain unanswered ("rareness" of high
 numbered punctuation doesn't make it part of a word) … Anyway if such
 level of supporting UTF-8 is sufficient for Mediawiki then Unicode
 issue is "solved". Unicode über alles. 

I think it was adequately explained: the character isn't detected because
the algorithm doesn't know it's a separator, so the words around it aren't
separated. If the algorithm did know, they would be separated properly.

So perhaps someone, like you, should submit a quick patch to that part of
the diff engine, as outlined by Tim, that makes it properly interpret that
code point. If there's a general rule or table in the Unicode standard then
implementing that might be an even better option.

The Unicode site, by the way, is www.unicode.org and you can find a database
of Unicode character properties here:

http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

with information on interpreting them here:

http://ftp.lanet.lv/ftp/mirror/unicode/3.2-Update/UnicodeData-3.2.0.html
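
To give a rough idea of what a table-driven rule could look like, here is a
minimal sketch. It is in Python rather than MediaWiki's PHP, and it is not
the diff engine's actual code; the function names is_separator and
split_words are made up for illustration. It assumes the general category
field from UnicodeData.txt (exposed in Python via the standard unicodedata
module) is a good enough signal: anything whose category starts with "P"
(punctuation) or "Z" (separator) breaks a word.

```python
import unicodedata


def is_separator(ch):
    """Return True if ch should break a word (hypothetical rule).

    Assumption: whitespace, plus any character whose Unicode general
    category starts with "P" (punctuation, including Pi/Pf typographic
    quotes such as U+201C and U+201D) or "Z" (separators), ends a word.
    """
    return ch.isspace() or unicodedata.category(ch)[0] in ("P", "Z")


def split_words(text):
    """Split text into words, dropping the separator characters."""
    words = []
    current = []
    for ch in text:
        if is_separator(ch):
            # Close the word collected so far, if there is one.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


# Curly quotes are treated like ASCII quotes; "ü" stays inside its word.
print(split_words("\u201crareness\u201d of \u00fcber"))
```

Note that a rule this blunt also splits on apostrophes and hyphens, so a
real patch would probably want something more selective.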

Enjoy!

-- 
Ben Garney
Torque Technologies Director
GarageGames.Com, Inc.
