Thank you, Brian, for your suggestions. So I should make a backup of the database and then execute your query, right?

Best regards!

On Sun, Sep 11, 2022 at 22:17, Brian Wolff <bawolff@gmail.com> wrote:
Actually, the $wgLegacyEncoding thing probably won't work: it would help if the text still needed to be converted to UTF-8, but the issue here is (I think) that it has been converted to UTF-8 twice.


--
Brian 

On Sunday, September 11, 2022, Brian Wolff <bawolff@gmail.com> wrote:
Not that I am aware of. If there are any, they are probably from MediaWiki 1.5 and don't work anymore.

One possibility is to set $wgLegacyEncoding to ISO-8859-1 and remove utf-8 from the old_flags field (https://www.mediawiki.org/wiki/Manual:Text_table#old_flags), so that MediaWiki thinks the rows are from before MediaWiki 1.5.

You could also try a query like the following (I have not tested this; use at your own risk and have backups):
UPDATE text SET `old_text`=convert(cast(cast(`old_text` AS CHAR CHARACTER SET latin1) AS BINARY) USING utf8) where old_flags not like '%gzip%';
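
Before running an update like this, it may help to check which rows actually look double-encoded. The following is a minimal Python sketch of such a check, not something from MediaWiki itself: the function name looks_double_encoded is hypothetical, and it assumes the raw old_text bytes have already been fetched and are not gzip-compressed. It uses Python's strict ISO-8859-1 codec, which differs from MySQL's latin1 (closer to windows-1252) in the 0x80-0x9F range, so some affected rows may be missed, and text that legitimately contains sequences such as "Ã¤" would be reported as a false positive.

```python
def looks_double_encoded(raw: bytes) -> bool:
    """Heuristic: True if undoing one UTF-8 layer yields different, valid UTF-8."""
    try:
        once = raw.decode("utf-8")
        # Undo one layer: map each character back to a single byte,
        # then check whether those bytes are themselves valid UTF-8.
        repaired = once.encode("latin-1").decode("utf-8")
    except (UnicodeDecodeError, UnicodeEncodeError):
        # Not UTF-8 at all, or correctly encoded text that cannot be "undone".
        return False
    # Pure ASCII survives the round trip unchanged and needs no repair.
    return repaired != once


# b"\xc3\x83\xc2\xa4" is "ä" after it has been converted to UTF-8 twice.
print(looks_double_encoded(b"\xc3\x83\xc2\xa4"))   # True
print(looks_double_encoded("ä".encode("utf-8")))   # False
```

Rows for which the check returns False are probably best left alone.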

--
Brian

On Sunday, September 11, 2022, Zoran Dori <zorandori4444@gmail.com> wrote:
Hello Brian,
it was previously on version 1.31, I've upgraded it to the latest version, hoping that it will resolve all issues. :)

I was given access to phpMyAdmin and checked there: after I changed the encoding from latin1 to UTF-8, it shows the same article perfectly fine, without any weird characters.

But on the wiki itself, the characters are still shown incorrectly. Is there any way I can trigger MediaWiki to rebuild the database and show things correctly?

Looking forward to your response. :)

Best regards,
Zoran

On Sat, Sep 10, 2022 at 01:57, Brian Wolff <bawolff@gmail.com> wrote:
Ã¤ is what you get when you take ä encoded as UTF-8 and interpret it as ISO-8859-1. So what probably happened is that some text that was encoded as UTF-8 was treated as if it were ISO-8859-1/windows-1252 and (unnecessarily) converted to UTF-8.
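
To make the mechanism concrete, here is a small Python illustration of how a correctly encoded ä ends up double-encoded. The function name double_encode is made up for this example; it only reproduces the misinterpretation described above and is not how MediaWiki or MySQL perform the conversion.

```python
def double_encode(text: str) -> bytes:
    """Reproduce the fault: UTF-8 bytes misread as ISO-8859-1, then re-encoded."""
    utf8_bytes = text.encode("utf-8")            # "ä" -> b"\xc3\xa4"
    misread = utf8_bytes.decode("iso-8859-1")    # each byte becomes one character
    return misread.encode("utf-8")               # converted to UTF-8 a second time


broken = double_encode("ä")
print(broken)                    # b'\xc3\x83\xc2\xa4'
print(broken.decode("utf-8"))    # Ã¤
```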

Common causes of this sort of thing:
- Very very old wiki from before MediaWiki adopted UTF-8 that wasn't upgraded properly. (I think MW adopted UTF-8 before MediaWiki 1.5, so it would have to be truly ancient).
- Restoring a DB from backup with some wrong options related to charset
- Converting the charset of DB columns if they were originally mislabeled.

If it's the entire DB that is broken, I think the easiest fix might be to take a DB dump and use the iconv command-line tool to convert UTF-8 -> windows-1252 (to undo one layer of conversion), and then import the result as if it were UTF-8.
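
The same "undo one layer" step can be sketched in Python for anyone who wants to try it on a small sample before touching a full dump. This is only a sketch under assumptions: the function names and any file paths are hypothetical, the dump is assumed to be valid UTF-8 throughout (compressed or binary blobs would make the decode fail), and Python's cp1252 codec rejects the five byte values that windows-1252 leaves undefined, so some characters may raise an error instead of being converted.

```python
from pathlib import Path


def undo_one_layer(data: bytes) -> bytes:
    """Decode as UTF-8 and re-encode as windows-1252.

    The returned bytes are meant to be treated as UTF-8 afterwards.
    Raises UnicodeDecodeError or UnicodeEncodeError on data that does
    not fit the double-encoding assumption.
    """
    return data.decode("utf-8").encode("cp1252")


def convert_dump(src: Path, dst: Path) -> None:
    """Convert a whole dump file; reads it fully into memory."""
    dst.write_bytes(undo_one_layer(src.read_bytes()))


sample = "Ã¤".encode("utf-8")          # double-encoded "ä"
print(undo_one_layer(sample))          # b'\xc3\xa4'
```

Because the conversion is strict, a failure points at data that was not double-encoded in the assumed way, which is worth investigating before importing anything.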

--
brian.


On Thu, Sep 8, 2022 at 3:13 AM Zoran Dori <zorandori4444@gmail.com> wrote:
Hello,
I'm working on a wiki that shows characters in a weird way. UTF-8 is used for encoding, so I believe that it isn't an issue.

You can take a look at Statik A – Sub Bavaria (sub-bavaria.de) to better understand what I'm talking about.

Could you please point me to something that I should look for, so I can fix this issue?
The wiki was previously on version 1.31; I've upgraded it to 1.38.

Thanks for your help and understanding!

Best regards,
Zoran
_______________________________________________
MediaWiki-l mailing list -- mediawiki-l@lists.wikimedia.org
To unsubscribe send an email to mediawiki-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/mediawiki-l.lists.wikimedia.org/