[Foundation-l] Format Conversion

Gregory Maxwell gmaxwell at gmail.com
Sun Jan 20 20:16:01 UTC 2008


On Jan 20, 2008 1:50 PM, Robert Rohde <rarohde at gmail.com> wrote:
[snip]
> Many unfree file formats are poorly supported on Linux and associated free
> software systems; however, it is also true that many free file formats are
> poorly supported on Windows and similar unfree systems.

In fact, it has historically been the case that many free formats have
been supported worse on popular "Linux" systems than non-free formats!
That has gotten a lot better in the last year, though.

This may seem odd, but it's not: a typical "Linux" desktop is
chock-full of non-free software, or quasi-free software which supports
non-free formats by ignoring patents and being too uninteresting for
legal action.

Some vendors of GNU/Linux software directly include non-free software,
others include software which is free outside of the United States
but not inside because of patent restrictions, and in many cases users
just install the software they need from repositories on the internet
(http://rpm.livna.org/rlowiki/ for example).

The viability of free formats has almost nothing to do with what type
of software (free or proprietary) users have on their own systems; it
has everything to do with adoption and popularity.

For example, I considered it a major victory last year when I
convinced the mplayer developers to include Theora support in their
default build!  It really is false to paint this as a free-software
user vs. proprietary-software user issue.

Free formats can be implemented by proprietary systems just fine and
frequently are. In the case of multimedia codecs, the Ogg codecs are
all BSDish licensed, and have already been shipped by major
proprietary software vendors (Microsoft for example) for some
applications.

> For example, one
> may have to install additional readers/converters/etc. to use OGG and
> related formats on Windows that aren't available by default.  That is not an
> insurmountable barrier, but it is a barrier, especially to those people with
> limited technical skills.  In addition, there are many situations, such as
> libraries and public school computers, where the user is not allowed to
> install any software at all.

I think we should treat readers and encoders distinctly, and I'll go
into that below.

As far as software-install for reading goes, if the user has Java
installed we provide a zero-install Java applet player for both audio
and video.

True, not everyone has Java installed, but the adoption of it is still
pretty good (the success numbers I had from the popup player were
fairly high).  (Incidentally, it works on the kiosk machines at the
public libraries here :) )

Can someone tell me if Windows Vista finally includes a current
generation version of Flash?  (Don't answer unless you are sitting at
a fresh install viewing your first webpage or are otherwise absolutely
sure, ... in the past many people claimed 9x did simply because
someone else installed it or they forgot)

In recent history almost all web video playback has required an
install on Windows.  So that really reduces to an issue of one
install or two, and what people already have installed.  If they are
already installing something, making an argument for an additional or
a different install is a much smaller battle.

> I also suspect that there are many people of the YouTube Generation, for
> example, that would be happy to make their content free, but only really
> know how to work with MPEGs.

YouTube supports upload in dozens of formats, and they transcode to
FLV.  We can too. I would support that.

There are also a number of popular formats which are (probably)
mostly-free but are completely unsuitable for web distribution.  For
example, a lot of cameras record video as MJPEG-in-AVI. Until this
year that has been the most common recording format for still cameras
that also support video. (This year it will likely become H.264.)
Accepting those for upload would be very nice.

> Rather than have Wikimedia say "X format is unfree, therefore X is
> forbidden", I would much rather see: "X format is unfree,
> therefore Mediawiki will automatically translate it into these free formats,
> and provide all users with several options for which format they can most
> easily use."

Let's split this issue.

Should we allow lots of upload formats and be able to transcode? There
was a GSOC project to provide transcoding to MediaWiki, but I don't
know if it got anywhere.  I think this is a reasonable compromise; it
makes a lot of sense.
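
As a rough illustration of the "accept many formats, distribute only
free ones" idea, here is a minimal sketch in Python. It only builds an
ffmpeg command line and does not run it. The extension lists, the
function name and the quality defaults are all hypothetical choices
made for the example, not anything MediaWiki actually does, and the
command assumes an ffmpeg build that includes the libtheora and
libvorbis encoders.

```python
from pathlib import Path

# Hypothetical lists for illustration only; a real site would choose its own.
NON_FREE_UPLOAD_EXTENSIONS = {".avi", ".mpg", ".mpeg", ".mov", ".mp4", ".flv"}
FREE_EXTENSIONS = {".ogg", ".ogv"}


def build_transcode_command(source, video_quality=6, audio_quality=4):
    """Return an ffmpeg argument list that converts `source` to Ogg
    Theora/Vorbis, or None if the upload is already in a free format.

    Raises ValueError for extensions we do not accept at all.
    """
    src = Path(source)
    ext = src.suffix.lower()

    if ext in FREE_EXTENSIONS:
        # Already distributable as-is; nothing to transcode.
        return None
    if ext not in NON_FREE_UPLOAD_EXTENSIONS:
        raise ValueError(f"unsupported upload format: {ext or '(none)'}")

    target = src.with_suffix(".ogv")
    return [
        "ffmpeg",
        "-i", str(src),           # the file as uploaded
        "-c:v", "libtheora",      # free video codec
        "-q:v", str(video_quality),
        "-c:a", "libvorbis",      # free audio codec
        "-q:a", str(audio_quality),
        str(target),              # the only file that would be distributed
    ]


if __name__ == "__main__":
    print(build_transcode_command("holiday.AVI"))
```

The point of the sketch is the asymmetry: the uploaded file is only
ever read, and the free-format output is the only thing that would be
offered for download.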

Should we send users files in clearly non-free formats?  I think it
would be very harmful to begin doing that, even as an option, and even
presuming we get it right and don't end up with the less popular
format always being broken.

I've explained this in other posts, but I'll do so again here so people
who don't want to wade through my verbal-vomit elsewhere don't have to:

In cases where a free format is very popular and well supported, JPG
and HTML for example, no one here is suggesting we use proprietary
alternatives.  But for formats that are less popular, like Ogg/Vorbis,
we hear occasional requests: "Won't you also offer AAC?"

This is because an unpopular free format is not, yet, actually free:
It has a cost.  The cost is that if you want to distribute in that
format, all your friends and family will have to be free-software
geeks. Other people will suffer some inconvenience to use it.  The
tools will be less common and harder to find.   Minting more
free-software geeks is nice, but it is not our mission.

So we have a cycle:  Users don't use the free format because their
software doesn't offer it; their software doesn't offer it because the
users don't demand it; the users don't demand it because their
friends/contacts are only giving them the competing proprietary
files...  There is a strong network effect
(http://en.wikipedia.org/wiki/Network_effect) at play.

There is a one-time cost to break the cycle. This cost takes the form
of education, promotion, and sometimes a little development.  The
rights-holders of proprietary formats are well aware of the costs:
When Vorbis was released, MP3 licensing became a fraction of its prior
cost, and the price is constantly adjusted to keep the short-term cost
of using the proprietary formats cheaper than driving adoption of a
free format.

(Although some recent submarine patent lawsuits against MPEG licensees
have thrown that balancing act out of whack, and we're seeing a recent
upswing in interest in unencumbered formats.)

Once that cost is paid, once people are convinced to support the
format in their applications, once users adopt it... The cost is gone,
and no one bothers pushing a proprietary format because a proprietary
format simply cannot compete with the low cost, flexibility, and
resulting ubiquity of a well-adopted free format.

In order to understand why this matters to us you have to stop
thinking in terms of "Can someone on a free software system read the
file?"  ... Forget that... We're not the FSF, and while the adoption
of Free Software is important there is an issue here far more directly
related to our mission:

Can people freely author material in the format, without being forced
to pay royalties, use software which denies them freedom, suffer a
greatly reduced audience, or accept an additional burden in technical
complexity?

Let's imagine that Wikipedia always took the route you propose: You
post a freely licensed video on Wikipedia.  Wikipedia offers it in a
free format as well as many other non-free formats. I download it in
the free format. I make some edits.  I plan on putting it up on my
site.

Now I must choose: Do I use a poorly adopted free format, meaning that
only free-software geeks can view it; do I use a proprietary format
with expensive encoders that deny my rights and have per-download
fees; do I undertake the difficulty of providing two files while still
paying all the costs for the non-free file and knowing few people will
view the free one; or do I ask someone else to host the file who will
pick up the licensing costs and charge me in some other way?

Freedom does not mean the freedom to choose your master, freedom is
the freedom to be your own master.

Wikimedia is in a position to drive free formats without too much
cost, and it has already done so.

Five years ago free multimedia formats were something only
free-software-geeks could read and write.  Today, the majority of
viewers to our site can view these files.  In a short time span my
popup WikiMediaPlayer had successful audio and video playback for well
over 1 million distinct IPs from all over the world, likely
representing millions of people.  Soon, mainstream browsers with
built-in free multimedia format support will be shipping.

Wikimedia's exclusive use of free formats in the past has been
instrumental in getting things this far, and it will be instrumental
in crossing the finish line in the not so distant future.  When free
media formats are as widely adopted as HTML and JPG, I won't have any
more reason to protest offering proprietary media formats, but you'll
have no good reason to ask for them either.

As an educational non-profit we can tolerate the short term costs in
order to obtain the long term benefits, and as a top ten website these
costs are low. Besides, we've already paid the cost of exclusively
using free formats for many years. It would be foolish to stop now
when the media formats issue is closer to resolved than it ever has
been before.

Success in Wikimedia's mission *requires* the existence of free
formats, not merely "free for us and terribly inconvenient for
everyone else", but formats which are free for *everyone*.

Not just free to look at but not touch, but free for authoring,
distribution, etc. ... The formats need to be as free as the content
they contain, otherwise the content isn't as free as we claim it to be.

[snip]
> If the same content could be made available in both a widely-used
> proprietary format and poorly used open source format, would anyone object,
> in principle, to using Wikimedia facilities to convert between the two and
> then distributing both?

Yes.   It's not like the idea of distributing both is new or
revolutionary. But it's expensive to do, hard to get right, and
doesn't result in fulfilling the mission over the long term.

In the case of accepting many formats, I think that makes sense:
There is little need for us to drive authoring in free formats today,
because authoring isn't the limiting factor in the usability of free
formats. Client support is.  Once client support is common the
advantages of free formats will make them ubiquitous, and the
authoring tools should follow. Since the licensing for proprietary
codecs is usually per-coder and per-download-in-that-format, WMF
could legally read the files without great cost.

One of the things I liked about the resolution Florence posted was that
it talked about what Wikimedia distributes on the sites, not what
people submit to us. I think that's the right emphasis today.


