I think the current handling of interlanguage links is problematic and
not very scalable. If we have n copies of an article, we need n*(n-1)
interlanguage links. For 10 languages, that would be 90 links! All of
these links have to be added to separate pages, by people speaking
different languages, who often don't even have an account on the
Wikipedia in question.
As should be obvious, we are already missing interlanguage links for
many, if not most, of the translations we have.
The scalable solution requires us to have a meta-table for interlanguage
links that can be accessed by all Wikipedias. This table could look like
this:
language1 article1 language2 article2
------------------------------------------------------------
en Main Page de Hauptseite
fr Accueil en Main Page
fr Accueil es Portada
...
Let's call it shared.ilinks for the moment.
Instead of adding interlanguage links on top of articles, we would have
a separate text line below article bodies:
Interlanguage links (syntax: [[<code>:<article name>]])
The syntax would remain the same so that the link line can be cut and
pasted from the body. But this line would not be stored in that form in
the database.
Display of interlanguage links
------------------------------
Say I visit [[Main Page]] on en.wikipedia.org. Now, in order to show the
list of links, the shared.ilinks table is queried:
SELECT * FROM shared.ilinks WHERE (language1='en' AND article1='Main
Page') OR (language2='en' AND article2='Main Page')
That is, a single SELECT allows us to find all translations of "Main
Page". But doesn't this save us only a little time, as we still
have to tell *every* Wikipedia that, say, "Accueil" means "Main Page"
in English? No, because we can now leave this to the code.
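As a hedged sketch of the display-time lookup (assuming a SQLite-style
table with the four columns from the example above; this is
illustrative, not actual wiki code):

```python
import sqlite3

# Build a toy shared.ilinks table; schema and sample rows follow the
# proposal's example, nothing here is an existing MediaWiki schema.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ilinks
    (language1 TEXT, article1 TEXT, language2 TEXT, article2 TEXT)""")
conn.executemany("INSERT INTO ilinks VALUES (?, ?, ?, ?)", [
    ("en", "Main Page", "de", "Hauptseite"),
    ("fr", "Accueil", "en", "Main Page"),
    ("fr", "Accueil", "es", "Portada"),
])

def translations(lang, title):
    """Return direct translations of (lang, title), regardless of which
    side of the row the page was stored on."""
    rows = conn.execute(
        """SELECT language1, article1, language2, article2 FROM ilinks
           WHERE (language1 = ? AND article1 = ?)
              OR (language2 = ? AND article2 = ?)""",
        (lang, title, lang, title)).fetchall()
    out = []
    for l1, a1, l2, a2 in rows:
        out.append((l2, a2) if (l1, a1) == (lang, title) else (l1, a1))
    return out

print(sorted(translations("en", "Main Page")))
# finds the de and fr rows directly; es only appears after the
# discovery pass described below has added an en/es row
```

Note the symmetric WHERE clause: a single query covers both storage
directions, which is what keeps display cheap.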
When a user edits a page, the same list of links is generated, but this
time in the wiki syntax ([[fr:Accueil]] [[de:Hauptseite]] and so on).
This can be edited by anyone. When the list has been edited, and the
page is saved, the following is done:
1)
The same SELECT as above is run:
SELECT * FROM shared.ilinks WHERE (language1='en' AND article1='Main
Page') OR (language2='en' AND article2='Main Page')
2)
Now, for each translation we get, another similar SELECT is run, so that
we find further translations into other languages.
3)
Every new translation we discover is stored in a new table row pairing
English (in our example) with the new translation, so that we can do
the quick, one-time SELECT to display the interlanguage links.
The result: If we have a page in 10 translations, the minimum effort we
have to go to is to add exactly one translation on every language
Wikipedia. That is, a minimum of 9 as opposed to 90 links! The other
translations are automatically discovered.
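The save-time discovery pass (steps 1-3) could be sketched as a
breadth-first walk over the table. Table and column names are taken
from the shared.ilinks example; everything else is an assumption:

```python
import sqlite3

# Toy table: fr and de each linked themselves to en, but not to each other.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ilinks
    (language1 TEXT, article1 TEXT, language2 TEXT, article2 TEXT)""")
conn.executemany("INSERT INTO ilinks VALUES (?, ?, ?, ?)", [
    ("fr", "Phil Collins", "en", "Phil Collins"),
    ("de", "Phil Collins", "en", "Phil Collins"),
])

def neighbours(page):
    """Direct translations of a (lang, title) pair, either side."""
    lang, title = page
    rows = conn.execute(
        """SELECT language1, article1, language2, article2 FROM ilinks
           WHERE (language1 = ? AND article1 = ?)
              OR (language2 = ? AND article2 = ?)""",
        (lang, title, lang, title)).fetchall()
    return {(l1, a1) if (l1, a1) != page else (l2, a2)
            for l1, a1, l2, a2 in rows}

def discover(page):
    """Steps 1-3: repeat the SELECT for each translation found, then
    store every new pair so display stays a single SELECT."""
    seen, queue = {page}, [page]
    while queue:
        current = queue.pop(0)
        for nxt in neighbours(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    for other in seen - {page}:
        if other not in neighbours(page):
            conn.execute("INSERT INTO ilinks VALUES (?, ?, ?, ?)",
                         page + other)
    return seen - {page}

found = discover(("de", "Phil Collins"))
# the French page is discovered via the shared English entry
```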
Example:
Someone creates a new page about Phil Collins on fr.wikipedia.org. This
person knows that there's already an English page about him on
en.wikipedia.org, so they type [[en:]] (suggested short syntax for "same
name as here"). "fr:Phil Collins->en:Phil Collins" is inserted into the
shared.ilinks table. This already means that the link is also shown on
en.wikipedia.org. But it gets better: Now someone on de.wikipedia.org
creates a Phil Collins page as well. He links to the English page with
[[en:]]. Zap! After saving the page, the French translation is
automatically discovered. Now the French translation has a link to the
German page and vice versa as well.
Editing links
-------------
What happens if the folks on fr.wikipedia.org move one of their pages?
The "Move this page" command now needs to automatically change every
instance of the page to something else (e.g. Accueil->Homepage) in the
shared.ilinks table.
What happens if someone on en.wikipedia.org decides that they do not
want to link to a page on nl.wikipedia.org because it contains obsolete
information, or because of "link-vandalism"? To unilaterally remove a
link to one translation, there would have to be a special interlanguage
link, like [[nl::]]. When saved, the link would be cleared and not
"rediscovered" until someone removed the [[nl::]] link. Such empty links
would not be copied.
If [[nl:Hoofdpagina]] is deleted, all instances of it in the
shared.ilinks table are removed as well.
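A minimal sketch of the bookkeeping that moves and deletions would
need, again assuming the shared.ilinks layout from this proposal:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE ilinks
    (language1 TEXT, article1 TEXT, language2 TEXT, article2 TEXT)""")
conn.executemany("INSERT INTO ilinks VALUES (?, ?, ?, ?)", [
    ("fr", "Accueil", "en", "Main Page"),
    ("fr", "Accueil", "es", "Portada"),
])

def move_page(lang, old, new):
    # "Move this page" renames the title on whichever side it appears
    conn.execute("""UPDATE ilinks SET article1 = ?
                    WHERE language1 = ? AND article1 = ?""",
                 (new, lang, old))
    conn.execute("""UPDATE ilinks SET article2 = ?
                    WHERE language2 = ? AND article2 = ?""",
                 (new, lang, old))

def delete_page(lang, title):
    # deletion drops every row that mentions the page
    conn.execute("""DELETE FROM ilinks
                    WHERE (language1 = ? AND article1 = ?)
                       OR (language2 = ? AND article2 = ?)""",
                 (lang, title, lang, title))

move_page("fr", "Accueil", "Homepage")   # Accueil -> Homepage
delete_page("es", "Portada")             # es page deleted
rows = conn.execute("SELECT * FROM ilinks").fetchall()
```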
What about links where there is no 1:1 relationship? Say I have a page
about "evolution" and one about "theory of evolution" on one wiki
(English), and only a single page covering evolution on another
(French). So I add the following link on both pages on en.wikipedia.org:
[[fr:Théorie de l'évolution]]
In the shared.ilinks table, I therefore get entries:
Evolution Théorie de l'évolution
Theory of Evolution Théorie de l'évolution
When I visit the "Evolution" page, I get a clear match: Théorie de
l'évolution. But when I visit "Théorie de l'évolution", I get two
matches. In this case, we could actually show both links on the French
page:
English: [1],[2]
Or in edit mode:
[[en:Evolution]][[en:Theory of Evolution]]
It may not be desirable to autocopy these duplicate links. So, if we
cannot discover an exact match, we may want to wait until someone
specifies a precise translation.
Discussion
----------
The process described above is complex from a technical perspective,
because it has to be respected during all changes to articles (move,
delete, edit, etc.). It also requires us to run a separate database server
specifically for this shared information. There may be scenarios that I
have not yet covered in the above proposal, although I am sure solutions
can be found for every problem.
There are numerous advantages to this approach. Compared with the
current handling, we should quickly get an accurate representation of
interlanguage links on all wikis. We do not have to pick a single
language as "key" language, which would require a key entry in that
language to exist for all pages. [1]
There may be simpler solutions that I cannot see - if so, I would love
to hear about them. But I really think we should consider redesigning
the interlanguage links before the problem grows out of control.
Regards,
Erik
[1] Although that would expose us to charges of anglocentrism, I am open
to discussing this alternative.
--
FOKUS - Fraunhofer Institute for Open Communication Systems
Project BerliOS - http://www.berlios.de
1. move everything to Postgres
2. move everything to common database, with tables foo_cur, foo_old etc.,
where foo are language names
3. make single user table (needs some tweaking to allow slightly different
preferences), single logging system, single recent changes, and all other
nice things we can do with that
4. move everything to UTF-8, so we don't have to use %escapes in English Wikipedia
5. create table interwiki (source_lang, target_lang, source_title, target_title)
6. convert all cur_text by removing interwiki links from page tops,
and add appropriate entries to interwiki.
7. compute the transitive symmetric closure of interwiki, and display that
as interwiki links. If there are many articles of the same language
in it, display it like: "English (Astronomy), English (Astrophysics), German"
In practice it shouldn't be such a big problem.
8. when editing, add interwiki links on top of the page or in a separate box.
add a JavaScript button to add all links from the transitive symmetric closure
of interwiki
9. add some magic functionality that allows changing many interwiki links at
a time. It is needed, as the transitive symmetric closure often contains
many copies of the same link. "Delete all interwiki links to this page" and
"change all interwiki links from X to Y" would probably be enough.
Computing the symmetric transitive closure will require a bit of magic.
I think we should keep the results in the database and only update them
on change - editing may be a bit slower, as long as viewing is not any
worse than now.
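To illustrate point 7: the symmetric transitive closure can be computed
with a union-find over (language, title) pairs. The data and helper
names below are hypothetical, not existing wiki code:

```python
# Union-find over (language, title) pairs; linked pages end up in the
# same set, which is exactly the symmetric transitive closure.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

# Sample interwiki rows: no direct en<->en link exists, but the closure
# connects the two English pages through de and pl.
interwiki = [
    (("en", "Astronomy"), ("de", "Astronomie")),
    (("de", "Astronomie"), ("pl", "Astronomia")),
    (("en", "Astrophysics"), ("pl", "Astronomia")),
]
for a, b in interwiki:
    union(a, b)

def closure_of(page):
    """All pages linked to `page`, directly or transitively."""
    root = find(page)
    return {p for p in parent if find(p) == root and p != page}

links = closure_of(("en", "Astronomy"))
# the closure can contain two articles of the same language, shown as
# "English (Astronomy), English (Astrophysics), German" per point 7
```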
Some of the special pages, like Most Wanted, have been deactivated
during busy times. Would it be a good idea to store the results of the
last generated version as a "regular" article?
Example: I request "Special:Most Wanted" when it is available. The link
list I get is written into "Wikipedia:Most Wanted".
I think the current image handling is slightly messed up, for the
following reasons:
1) There are too many different ways to link small/large versions.
a) There are usability problems with Brion's suggested approach of
including the larger version of an image on the image page:
- The headline may say something like image_small.jpg whereas
the actual image displayed is large
- Clicking through a second time leads to another (usually
empty) page
- Captions effectively have to be entered up to three times
2) Users have to go to too much effort in order to create small versions
of images. This is not something that researchers and authors should
have to waste time with. It also impedes uploading of high resolution
images, which can really hurt us when we start thinking about a printed
edition of Wikipedia.
3) Content of image pages is neglected because it is "hidden" most of
the time. Many people treat image descriptions like changelog entries
(relatively carelessly).
The fact that it even took me a while to understand the current handling
of images doesn't bode well for the usability of the concept.
I propose the following changes:
--------------------------------
1) As suggested earlier, an image page should always display the image
it refers to.
2) Smaller versions of images should be auto-generated in a separate
directory similar to the math/ directory used for texvc's images. The
small versions would be viewed on the article where the [[Image]] tag is
included, whereas the image would link to the original size version.
We could use the GD library functions for creating thumbnails. See, for
example:
http://www.onlinetools.org/articles/creating_thumbnails_all.php
However, auto-determining thumbnail sizes is problematic because a
useful size often depends on context. A proper way to handle this may be
to support the following variants of the [[Image]] tag:
[[Image:foo.jpg width=100 height=100]]
[[Image:foo.jpg width=100]]
[[Image:foo.jpg height=100]]
-> height or width autocalculated as per aspect ratio
[[Image:foo.jpg size=10%]]
The smaller versions would be generated as necessary and stored in a
temporary directory. The matching original image information (date,
size) would be stored in a table so that the thumbnails can be updated
on demand.
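A sketch of how the tag variants above could be resolved into concrete
thumbnail dimensions. The actual resampling would be done by GD; this
only computes the target geometry, and the function name is made up:

```python
# Hypothetical helper: given the original dimensions and whichever of
# width/height/size the [[Image]] tag specified, compute the thumbnail
# dimensions, preserving the aspect ratio when only one is given.
def thumb_size(orig_w, orig_h, width=None, height=None, size=None):
    if size is not None:                    # e.g. size="10%"
        factor = float(size.rstrip("%")) / 100.0
        return round(orig_w * factor), round(orig_h * factor)
    if width is not None and height is not None:
        return width, height                # caller fixed both
    if width is not None:                   # height per aspect ratio
        return width, round(orig_h * width / orig_w)
    if height is not None:                  # width per aspect ratio
        return round(orig_w * height / orig_h), height
    return orig_w, orig_h                   # no scaling requested

# [[Image:foo.jpg width=100]] on a 400x300 original:
print(thumb_size(400, 300, width=100))      # -> (100, 75)
```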
3) The image page content should be included by default below the image
(preceded by a <BR>). That way when you type
[[Image:foo.jpg]]
you get
<img src="http://../foo.jpg"><BR>
<I>This is an ugly photo!</I>
To suppress this and type a manual caption, you would have to do
something like:
[[Image:foo.jpg notext]]
That way, you can have
- the standard case: image with a simple caption; no need to update
twice
- the extended case: image with a short caption on the page where it is
embedded and a longer discussion on its image page.
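A rough sketch of how the implicit caption in point 3 could be
rendered; the tag grammar and the image_pages store are assumptions for
illustration only:

```python
import re

# Stand-in for the image description pages in the database.
image_pages = {"foo.jpg": "This is an ugly photo!"}

def render_image_tag(tag):
    """Expand [[Image:name]] into an <img> tag, appending the image
    page text as a caption unless "notext" is given."""
    m = re.match(r"\[\[Image:([^ \]]+)( notext)?\]\]", tag)
    if not m:
        return tag
    name, notext = m.group(1), m.group(2)
    html = '<img src="http://../%s">' % name
    if not notext and name in image_pages:
        html += "<BR>\n<I>%s</I>" % image_pages[name]
    return html

print(render_image_tag("[[Image:foo.jpg]]"))
```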
Discussion
----------
The approach discussed above has almost no obvious disadvantages. The
following problems may ensue, though:
- Existing image pages will have to be re-edited to remove now redundant
image content. Existing thumbnail images can be deleted.
- It is somewhat counter-intuitive to have the caption rendered
implicitly on a page that includes an [[Image:foo.jpg]] tag. The
alternative would be to do away with image pages as regular
content-pages altogether. (Realistically, having a separate image
namespace may have been a bad idea in the first place.)
However, having lots of redundant (and often neglected) content is
clearly the least preferable choice.
There would, in my opinion, be massive advantages to having
auto-generated small versions of images. This would greatly increase the
usability on many pages, and make the traditional "click to view larger
version" approach be usable almost anywhere.
Is the GD library installed on Wikipedia's server?
I would appreciate feedback on this proposal. I'd be willing to give the
autogeneration a try, if no one else volunteers.
Regards,
Erik
Magnus wrote:
>Erik Moeller wrote:
>>I propose the following changes:
>>--------------------------------
>>
>>1) As suggested earlier, an image page should always
>>display the image it refers to.
>>
>Makes sense.
See my earlier post on this. IMO the only time an
image should be single-click through is when an image
is intentionally displayed on the image description
page. Otherwise a user should have to click twice to
get to the image description page (alt text should
work in both cases though).
>>2) Smaller versions of images should be auto-generated
>>in a separate directory similar to the math/ directory
>>used for texvc's images. The small versions would be
>>viewed on the article where the [[Image]] tag is
>>included, whereas the image would link to the original
>>size version.
>
>Two items with this one:
>1. A thumbnail should be generated upon upload, so we
>don't have to wade through that on every page display,
>2. *if* and *only if* that is necessary. The images
>DW uploaded lately to replace mine don't really need
>a thumbnail ;-)
IMO this isn't the best way to do things. As described
below smaller images should be created on-the-fly by
using markup to specify desired width (and deleted
after a specified period of not being used). Then if
the [[Image:foo.jpg width=100px]] syntax is used
/then/ the thumbnail can be clicked once to get to the
image description page (which contains the full-sized
displayed image). If there is no image displayed on
the image description page then the user would have to
click twice to get there (with no changing of the
mouse pointer to the little hand). This would give
users the greatest amount of control and flexibility.
Doing things automatically upon upload would be a
nightmare (esp. for images that are inserted in
tables; often a very precise image width is needed).
>>However, auto-determining thumbnail sizes is
>>problematic because a useful size often depends on
>>context. A proper way to handle this may be to
>>support the following variants of the
>>[[Image]] tag:
>>
>> [[Image:foo.jpg width=100 height=100]]
>>
>> [[Image:foo.jpg width=100]]
>> [[Image:foo.jpg height=100]]
>> -> height or width autocalculated as per
>>aspect ratio
>>
>> [[Image:foo.jpg size=10%]]
>
>Why not say: *If* we need a thumbnail, it has a width
>of, say, 150 pixel (just to have a number).
>Width is the "problematic" factor, on smaller
>screens. So, for every image wider than this, a
>thumbnail is used, otherwise the original
>image.
Again, doing this automatically will cause a great
deal of trouble. There are many cases where images
larger than even 250 pixels are used and appropriate,
especially if the images are centered or otherwise do
not have text flowing around them. (The optimal range
of widths for images /with/ text flowing around them
is 150-250 pixels, with image detail and type usually
the deciding factors for the resulting width.)
>>....
>...
>>- It is somewhat counter-intuitive to have the
>>caption rendered implicitly on a page that includes
>>an [[Image:foo.jpg]] tag. The alternative would be
>>to do away with image pages as regular content-pages
>>altogether.
>>(Realistically, having a separate image
>>namespace may have been a bad idea in the first
>>place.)
>>
>How about the alt tag thingy I installed at the test
>site?
IMO alt tags are needed for images with and without
larger versions displayed on the image description
page. But please do not get rid of the image
description pages. Very often wiki markup is used
along with external links. These links are not usable
in the form of mouse-over text. However, if for
whatever reason an image is /not/ displayed on its
image description page, then the image displayed in the
article should have to be clicked twice in order to
get to the image description page (with no display of
the little hand when the mouse pointer is over the
image; just the display of the alt text).
>>However, having lots of redundant (and often
>>neglected) content is clearly the least preferable
>>choice.
>>
>>There would, in my opinion, be massive advantages to
>>having auto-generated small versions of images. This
>>would greatly increase the usability on many pages,
>>and make the traditional "click to view larger
>>version" approach be usable almost anywhere.
>
>I agree. We'll have to think about what image to use
>on "printable version" - the thumbnail to keep
>layout, or the large one for resolution?
Don't forget WYSIWYG. The reader should be given the
same article layout in the print version as they see in
the regular article. Using the larger version in
the print version would destroy the layout of tables
that have images in them (not to mention that the
larger versions of images with text flowing around
them would result in pages with two word lines next to
huge images). IMO on the image description pages we
should have "Printable version: [Small image] [Large
image]"
-- Daniel Mayer (aka mav)
> > Current talk pages are extremely unfriendly.
> > I think they should be abandoned and replaced by
> > more "natural" system of posting.
>
> You mean, not a wiki?
> I think not.
I agree talk pages must remain wiki style for three reasons.
First, adding another system for talk pages makes users learn a second
system, and that makes things inherently more complex.
Second, talk pages are intended to help people make articles, and
therefore the requirement to learn wiki syntax to post on talk pages
just keeps out those who wouldn't take time to learn to contribute to
actual articles.
Ever since Ward's Wiki, there has been some minimum requirement to
participate -- namely the ability to figure out wiki syntax. This is
not hard, but it does require a little effort. And many have argued
that this is one of the reasons that wikis can maintain a higher level
of discourse than newsgroups and web-based discussion groups.
Third, an important part of the "wiki way" is refactoring. I have
refactored a number of large talk pages, removing the dross, keeping
substantive contributions, and putting it all together to flow more
naturally. This is only possible because of the flexibility of the wiki
system.
Hi,
I am beginning to be confused about the sequence of events when a new
feature is introduced. It seems that one or two members of wikitech-l
(developers) support a new feature, write the code, put it on
test.wikipedia.org and wait for feedback.
At times this wait is less than a day. The feature is then usually
implemented.
Then, it is announced or noticed by someone on wikipedia-l, and 60 e-mails
follow discussing and arguing over it.
If someone has a new feature they want to see implemented, why don't they
present it to the whole membership first and then allow a few days for
discussion? After everyone has had a chance to think about it and raise
objections, modifications, etc., they could implement it, if most members
want it.
Recently, when the usual flood of e-mails followed the announcement of a
new feature, there have been remarks on a number of changes that it might
have been better to have a broader-based discussion before implementation.
Just because many members of Wikipedia do not have the skills to make such
changes doesn't mean that they don't have an opinion on them.
Tonight I am watching this process proceed on two changes to the
edit/preview pages. All discussion is on wikitech-l. It has been suggested
that at least one change will probably be implemented tomorrow. Meanwhile,
the main membership has no idea that any such changes are about to happen.
As Ever,
Ruth Ifcher