Re: [Wikitech-l] Re: UTF-8 on English Wikipedia?

22 Jun 2004

On Monday 21 June 2004 17:42, Timwi wrote:
...
> Brion Vibber wrote:
>> Nikola Smolenski wrote:
>>> Quick, dirty, and it seems to work. I know that buffering is slow, but
>>> this would only be a temporary solution. When hairs on your head
>>> settle down, let's talk about it :)
>> That won't work, as you have to be able to link from one page to
>> another. Title management, pulling things from the database, and case
>> folding are all dependent on the character set, long before we output
>> anything.
> If I understood him right, his suggestion was to take the server down
> briefly to convert everything but 'old' to UTF-8 atomically. In
> particular, this would convert 'cur', and we can use UTF-8 for
> everything including title management and case folding.
Well, no, I suggested that articles be converted one at a time, both cur and 
old; each article should be protected from editing during its conversion.
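
A minimal sketch of that per-article step, in Python for brevity. The
`protected` set, `protect` and `convert_article` are hypothetical names for
illustration, not anything in the wiki code, and the revisions are assumed
to be raw ISO-8859-1 byte strings:

```python
from contextlib import contextmanager

# Hypothetical in-memory stand-in for the wiki's protection state.
protected = set()

@contextmanager
def protect(title):
    """Keep the article marked as protected while the block runs."""
    protected.add(title)
    try:
        yield
    finally:
        protected.discard(title)

def convert_article(title, revisions):
    """Re-encode every revision (cur and old) of one article to UTF-8."""
    with protect(title):
        return [rev.decode("iso-8859-1").encode("utf-8") for rev in revisions]

converted = convert_article("Cafe", [b"caf\xe9", b"cafe"])
# converted == [b"caf\xc3\xa9", b"cafe"]
```

Pure-ASCII revisions come out unchanged, so only text with high characters
is actually rewritten.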

Pulling things from the database is not a problem: the database will be happy 
to think that it is still pulling ISO-8859-1 and won't even notice the change.
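
To illustrate why this should hold, here is a small Python sketch. It
assumes the database stores and returns the bytes untouched, with no
transcoding or validation of its own; `roundtrip_through_latin1` is a
hypothetical helper, not a real database call:

```python
def roundtrip_through_latin1(utf8_bytes):
    """Simulate a store that believes every byte is one ISO-8859-1 character."""
    # ISO-8859-1 assigns a character to every byte value 0x00-0xFF,
    # so decoding can never fail and re-encoding restores the same bytes.
    as_seen_by_db = utf8_bytes.decode("iso-8859-1")
    return as_seen_by_db.encode("iso-8859-1")

utf8_bytes = "Gödel".encode("utf-8")
returned = roundtrip_through_latin1(utf8_bytes)
# returned == utf8_bytes, and returned.decode("utf-8") == "Gödel"
```

Anything that interprets the characters instead of just passing the bytes
along (sorting, case folding, length limits) would still see two characters
where there is one, which is the title problem below.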

Title management and case folding could pose a problem; the solution would be 
not to convert articles which link to articles which have high characters in 
their titles. I don't think there are that many of them.
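
A rough sketch of how such articles could be picked out, again in Python.
The link pattern is a simplification of wiki link syntax, "high characters"
is taken to mean anything above 0x7F, and `must_defer` is a hypothetical
name:

```python
import re

# Capture the target of [[Target]], [[Target|label]] or [[Target#section]].
LINK_RE = re.compile(r"\[\[([^\]|#]+)")

def has_high_chars(s):
    """True if the string contains any character above 0x7F."""
    return any(ord(ch) > 127 for ch in s)

def must_defer(wikitext):
    """True if the article links to any title containing high characters."""
    return any(has_high_chars(target) for target in LINK_RE.findall(wikitext))

deferred = must_defer("See [[Kurt Gödel]] and [[Logic]].")
convertible = not must_defer("See [[Logic|lögic]] and [[Set theory]].")
```

Only the link target is checked, so a high character in the label of a piped
link does not hold an article back.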

And when everything is done, convert the titles and the rest of the articles.
