Gregory Szorc wrote:
Please read my
own proposal for reworking the extension interface at:
http://mail.wikipedia.org/pipermail/wikitech-l/2006-July/037035.html
Posted to this list 10 days ago.
Forgive my ignorance. I read the first few paragraphs of the post when it
was originally sent and ignored the rest, just thinking it was another
Wikipedia-only message. Now, having read it...
I do like your proposal for static objects being initialized as-needed.
There is great power in the just-in-time object::getInstance() method.
However, one of my criticisms of MediaWiki's architecture has always been
the overdependence on global objects, which are in some ways like static
classes using the Singleton pattern (see
http://blog.case.edu/gps10/2006/07/22/why_global_variables_in_php_is_bad_pr…
for why I don't like global objects). I would much rather see the Wiki
class contain these "global objects" as static variables which can be
accessed via a just-in-time getObject() static call to the Wiki class. This
sounds like the same approach as the proposed wfGetFoo() methods (it
basically is), but polluting the global symbol table with objects and
functions not belonging to classes is unnecessary when these could all belong
to a master Wiki class. If you don't buy the "don't do it because you'd be
polluting the symbol table" argument, do it for the sake of keeping
everything organized into classes. Do wfGetFoo() functions really belong in
the global namespace, or do they belong to a class representing a wiki?
Hell, if you get rid of all the global functions and attach them to existing
classes, that is one less file to include! </rant on global objects>
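To make the suggestion above concrete, here is a minimal sketch of a class that holds the shared objects as static members and builds each one on first request. The names Wiki, register() and get() are hypothetical and are not MediaWiki API; the ArrayObject is only a stand-in for a real service object.

```php
<?php
// Hypothetical registry class: illustrative only, not part of MediaWiki.
final class Wiki
{
    /** @var callable[] factories keyed by object name */
    private static $factories = [];
    /** @var object[] instances that have been built so far */
    private static $instances = [];

    /** Register how to build an object, without building it yet. */
    public static function register(string $name, callable $factory): void
    {
        self::$factories[$name] = $factory;
        unset(self::$instances[$name]);
    }

    /** Build the object on first use, then keep returning the same one. */
    public static function get(string $name)
    {
        if (!isset(self::$instances[$name])) {
            if (!isset(self::$factories[$name])) {
                throw new InvalidArgumentException("No factory registered for '$name'");
            }
            self::$instances[$name] = call_user_func(self::$factories[$name]);
        }
        return self::$instances[$name];
    }
}

// Usage: the factory runs only when the object is first asked for.
$built = 0;
Wiki::register('parser', function () use (&$built) {
    $built++;
    return new ArrayObject(['kind' => 'parser']); // stand-in object
});
$a = Wiki::get('parser');
$b = Wiki::get('parser');
```

Nothing is constructed at registration time, so a request that never touches the parser never pays for it. Note that this is still a single shared instance per name, with the same testing and flexibility trade-offs as any singleton.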
I'll answer this at the end.
If you are talking about 500us, why are there still require and require_once
calls in the trunk? These both require system calls (require_once actually
requires an additional one and hence is slower). I know work has been done
developing the __autoload function (if you ever commit to 5.1,
spl_autoload_register() is preferred), but at the level of commitment you
give to performance, every require_once has to seem like a monkey on your
back.
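For reference, a minimal sketch of loading classes on demand with spl_autoload_register() instead of up-front require_once calls. The one-class-per-file layout, the registerClassLoader() helper and the DemoTitle class are assumptions made for the demo, not how MediaWiki lays out its files.

```php
<?php
// Assumption: each class lives in <ClassName>.php directly under $baseDir.
function registerClassLoader(string $baseDir): void
{
    spl_autoload_register(function ($class) use ($baseDir) {
        $file = $baseDir . '/' . $class . '.php';
        if (is_file($file)) {
            require $file; // plain require: the loader runs at most once per class
        }
    });
}

// Demo: write a throwaway class file, then use the class with no explicit require.
$dir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($dir);
file_put_contents(
    $dir . '/DemoTitle.php',
    "<?php\nclass DemoTitle { public function text() { return 'Main Page'; } }\n"
);
registerClassLoader($dir);

$wasLoaded = class_exists('DemoTitle', false); // second argument false: do not autoload
$title = new DemoTitle();                      // first use triggers the loader
$text = $title->text();
```

The file is only read when the class is first used, so requests that never touch a class never pay for including it.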
I got rid of about half of the require_once calls from Setup.php in
localisation-work. The remaining ones are mostly for global functions.
Also, how do you accurately profile MediaWiki? I've used xdebug and
KCachegrind to profile scripts before, but it always bothers me because I
cannot use xdebug alongside APC or eAccelerator to get results reflective of
my production deployment. I know APC and eAccelerator completely change the
chokepoints, but it is impossible for me to see what the new chokepoints
are!
http://noc.wikimedia.org/cgi-bin/report.py
Data is generated with ProfilerSimpleUDP. Averaging over a million requests
gives you excellent accuracy, thanks to the central limit theorem. However,
that data may be subject to slight systematic inaccuracies due to the
profiling overhead.
Can you feed MediaWiki's internal profiling output into KCachegrind?
No.
Now, getting back to the topic of extensions. For a base extension class, I
was thinking of an abstract class that has numerous methods,
providesSpecialPage(), providesHook(), providesWhichHooks(),
providesParserTags(), etc. Let's say we establish a defined extensions root
directory. When MediaWiki loads, it periodically checks this directory for
all files representing extensions and loads them (perhaps this is triggered
manually via cron, a special page, filemtime(), etc.). When the extensions are
loaded from the directory, a map is established that records the abilities
of each. This map is serialized for quick retrieval. Whenever MediaWiki
loads, it just goes to the map and loads extensions just-in-time. This
would require an extension manager class that would initialize extensions as
called for by the map. For example, when the parser sees a tag it doesn't
recognize, it would call
ExtensionManager::getExtensionForParserTag($foo)->parse($content). Or, when
a special page is requested, we have
ExtensionManager::executeSpecialPage($foo). For hooks, the same deal.
I like the idea for a map between capabilities and callbacks, etc, but I
don't like the idea of a module specification file. Why should you need to
provide a specification file when the same information can be obtained from
methods inherited from a base extension class? As long as you cache the
output of these methods, there is zero performance overhead and extensions
have the added bonus of being much more structured. Yes, it would break
existing functionality. But if you are already talking about making
just-in-time calls to instantiate global objects like $wgTitle, $wgOut, etc,
then many existing extensions will be broken anyway. Sometimes you just
have to make sacrifices for the sake of progress.
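A rough sketch of the scheme described above: a base class whose methods declare capabilities, a map built from those methods and serialized, and a manager that instantiates an extension only when the map calls for it. Only providesParserTags() and getExtensionForParserTag() come from the text; the return types, buildMap(), loadMap() and ShoutExtension are assumptions for illustration.

```php
<?php
// Illustrative base class; the method set and return shapes are assumptions.
abstract class Extension
{
    /** @return string[] parser tag names this extension handles */
    public function providesParserTags(): array { return []; }

    public function parse(string $content): string { return $content; }
}

// Toy extension: handles a hypothetical <shout> tag.
class ShoutExtension extends Extension
{
    public function providesParserTags(): array { return ['shout']; }

    public function parse(string $content): string { return strtoupper($content); }
}

class ExtensionManager
{
    private static $map = null;  // capability map: ['parserTags' => [tag => class name]]
    private static $loaded = []; // class name => instance, filled lazily

    /** Scan step: ask every extension what it provides. Returns plain, serializable data. */
    public static function buildMap(array $classes): array
    {
        $map = ['parserTags' => []];
        foreach ($classes as $class) {
            $ext = new $class();
            foreach ($ext->providesParserTags() as $tag) {
                $map['parserTags'][$tag] = $class;
            }
        }
        return $map;
    }

    /** Normal request: restore the cached map; no extension is instantiated here. */
    public static function loadMap(string $serialized): void
    {
        self::$map = unserialize($serialized);
        self::$loaded = [];
    }

    /** Instantiate the owning extension just in time, or return null for unknown tags. */
    public static function getExtensionForParserTag(string $tag): ?Extension
    {
        $class = self::$map['parserTags'][$tag] ?? null;
        if ($class === null) {
            return null;
        }
        if (!isset(self::$loaded[$class])) {
            self::$loaded[$class] = new $class();
        }
        return self::$loaded[$class];
    }
}

$cache = serialize(ExtensionManager::buildMap(['ShoutExtension']));
ExtensionManager::loadMap($cache);
$html = ExtensionManager::getExtensionForParserTag('shout')->parse('hello');
$missing = ExtensionManager::getExtensionForParserTag('nope');
```

In a real deployment the serialized map would live in a file or cache and be rebuilt only when the extensions directory changes; here it is kept in a variable to keep the sketch self-contained.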
You obviously also missed my post on stub globals. After I made the first
post, I discovered a method for painless migration to deferred object
initialisation, and I made that the topic of a second post. I share your
concerns about the flexibility of global variables, in fact my discussion of
the issue in phase3/docs/globals.txt mirrors your blog post very closely.
But there doesn't seem to be any pressing need to sacrifice backwards
compatibility while we pursue this goal. A singleton object, though, shares
all the flexibility problems of global variables, so that's not a solution.
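For readers who have not seen the second post, here is one way a stub global could work. This is a sketch under assumptions, not a reproduction of the code from that post: the StubGlobal and DemoUser classes and the $wgDemoUser name are hypothetical. The global initially holds a lightweight stub; the first method call builds the real object, swaps it into the global, and forwards the call.

```php
<?php
// Hypothetical stub: stands in for a global until a method is first called on it.
class StubGlobal
{
    private $globalName;
    private $factory;
    private $real = null;

    public function __construct(string $globalName, callable $factory)
    {
        $this->globalName = $globalName;
        $this->factory = $factory;
    }

    public function __call($method, $args)
    {
        if ($this->real === null) {
            // Build the real object once and overwrite the global, so later
            // code that reads the global gets the real object directly.
            $this->real = call_user_func($this->factory);
            $GLOBALS[$this->globalName] = $this->real;
        }
        // Forward the call, also for callers still holding the stub itself.
        return call_user_func_array([$this->real, $method], $args);
    }
}

// Stand-in for an expensive object; counts how often it is constructed.
class DemoUser
{
    public static $created = 0;

    public function __construct() { self::$created++; }

    public function getName() { return 'Anonymous'; }
}

$GLOBALS['wgDemoUser'] = new StubGlobal('wgDemoUser', function () {
    return new DemoUser();
});

$before = DemoUser::$created;               // nothing constructed yet
$name = $GLOBALS['wgDemoUser']->getName();  // first call unstubs the global
$isReal = $GLOBALS['wgDemoUser'] instanceof DemoUser;
```

Existing code that does `global $wgDemoUser; $wgDemoUser->getName();` keeps working unchanged, which is the point: initialisation is deferred without breaking extensions. Code that inspects the variable without calling a method on it (an instanceof check before first use, for instance) would still see the stub.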
You make a good point about the fact that capabilities can be provided by a
member function and cached. As far as I can see, though, there's still no
need to sacrifice backwards compatibility. We can keep both the ability of
extensions to operate across multiple MediaWiki versions, and the ability
for most old extensions to continue to work properly in new versions of
MediaWiki, if we design the interface carefully enough. If backwards
compatibility for old extensions proves to be too much of a performance
burden, then we can drop it after a couple of releases. What I want to avoid
is the requirement that extensions be simultaneously updated along with the
core. That's a hassle for both site administrators and extension developers,
especially since many extensions are unreleased and their versions unnumbered.
-- Tim Starling