Though I'm not a developer, I might make a suggestion on this topic.
Since I'm helping out with the Open Directory Project, I know they have
been struggling with converting their databases to UTF-8 for quite some
time now, and strange characters keep appearing here and there.
Maybe the MediaWiki devs could get some useful hints from those guys on
what unforeseen problems might arise when converting to UTF-8.
Cheers
Manfred
On Wednesday 16 June 2004 10:16, Nikola Smolenski wrote:
I am thinking about an even simpler solution. Have a server-side script
convert articles and their histories to UTF-8. Have a postprocessor
(written in C) tell if a page is in UTF-8 and change the appropriate meta
tag if it is. It's vastly improbable that a non-UTF-8 page will happen to
be valid UTF-8; it could be checked on a database dump, and I don't
believe that any such page would be found. When all pages are converted,
the site could be switched to UTF-8 and the postprocessor turned off.
This could even be done without a postprocessor; there is PHP's
mb_detect_encoding() function, which does exactly that.
Quick, dirty, and it seems to work. I know that buffering is slow, but
this would only be a temporary solution. When the hairs on your head
settle down, let's talk about it :)