I spent some time between projects today exploring the idea of progressive
image decoding using the VP9 video codec -- sort of a mashup of progressive
JPEG and WebP.
Like a progressive JPEG, each resolution step (a separate frame of the
"video") encodes only the differences from the previous resolution step.
Like WebP, it's more space-efficient than the ancient JPEG codec.
This sort of technique might be useful for lazy-loading images in our
modern internet, where screen densities keep going up and network speeds
can vary by a factor of thousands. On a slow network the user sees
immediate feedback during load, and on a fast network they can reach full
resolution quickly, still using less bandwidth than a JPEG. And since JS would
have control of loading, we can halt cleanly if the image scrolls
offscreen, or pick a maximum resolution based on actual network speed.
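To make the loading-control part concrete, here's a rough sketch (TypeScript,
browser-side) of the kind of loop I have in mind. The frame URLs and
decodeAndPaint() are hypothetical stand-ins for whatever the real
demuxer/decoder interface ends up being:

    // Resolution steps, smallest first. In the current demo they're frames of
    // one .webm; conceptually they could be separate files or byte ranges.
    const steps = ['pic-step1.webm', 'pic-step2.webm', 'pic-step3.webm'];

    // Hypothetical stand-in: demux + VP9-decode one delta frame and paint the
    // accumulated image onto the canvas.
    async function decodeAndPaint(data: ArrayBuffer, canvas: HTMLCanvasElement): Promise<void> {
      // ... real demuxing/decoding would go here ...
    }

    function isOnScreen(el: Element): boolean {
      const r = el.getBoundingClientRect();
      return r.bottom > 0 && r.top < window.innerHeight;
    }

    async function loadProgressively(canvas: HTMLCanvasElement, maxSteps: number): Promise<void> {
      for (const url of steps.slice(0, maxSteps)) {
        if (!isOnScreen(canvas)) {
          return; // halt cleanly if the image scrolled offscreen
        }
        const resp = await fetch(url);
        await decodeAndPaint(await resp.arrayBuffer(), canvas);
      }
    }

    // On a slow connection, stop after the first couple of steps.
    const downlink = (navigator as any).connection?.downlink ?? 10;
    loadProgressively(document.querySelector('canvas')!, downlink < 1 ? 2 : steps.length);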
Detail notes on my blog:
https://brionv.com/log/2016/06/14/exploring-vp9-as-a-progressive-still-imag…
Sample page if you just want to look at some decoded images at various
resolutions (loading not optimized for slow networks yet!):
https://media-streaming.wmflabs.org/pic9/
It looks plausible, and should be able to use native VP9 decoding in
Firefox, Chrome, and eventually MS Edge in some configurations with a
JavaScript fallback for Safari/etc. Currently my demo just plops all the
frames into a single .webm, but to avoid loading unneeded high-resolution
frames they should eventually be in separate files.
-- brion
In the past, we've had a mixture of fixed bitrates and quality-based
settings for producing video transcodes.
Each has its advantages: fixed bitrates are more predictable for watching
while streaming, while fixed quality settings allow for reducing the
bitrate on low-complexity scenes to save bandwidth (and increasing it on
high-complexity scenes to keep quality up!)
Since "download and watch it later" is less of a thing on today's internet
than "stream it right now!", I'd been leaning for a while towards moving
more things to fixed bitrates. However, I'm starting to come down on the
side of a fixed quality setting with a variable bitrate...
Overall, variable-rate encoding should lead to lower bandwidth usage for
most parts of most files, while still maintaining high quality on scenes
that need it.
The downside is that a high-complexity scene encoded at a higher bitrate
might run the playback buffer dry, even though playback had been keeping up
fine on the earlier, lower-bitrate scenes.
Once we support adaptive streaming (using MPEG-DASH, or something like it),
the system should be able to provide a detailed enough manifest[1] to show
which segments of the file are low-bandwidth and which are high-bandwidth.
If a bandwidth limitation stops us from viewing a particular segment at the
current resolution, we can bump the resolution down, then bump it back up
when the bandwidth usage goes back down.
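As a rough sketch of the decision logic I have in mind (not real MPEG-DASH
manifest parsing; the types here are made up for illustration):

    interface Segment { byteLength: number; durationSec: number; }
    interface Representation { height: number; segments: Segment[]; }

    // Pick the largest resolution whose next segment fits within the measured
    // throughput, with some headroom; otherwise fall back to the smallest one.
    function pickRepresentation(reps: Representation[], segIndex: number,
                                throughputBps: number): Representation {
      const headroom = 0.8;
      const byHeightDesc = [...reps].sort((a, b) => b.height - a.height);
      for (const rep of byHeightDesc) {
        const seg = rep.segments[segIndex];
        const neededBps = (seg.byteLength * 8) / seg.durationSec;
        if (neededBps <= throughputBps * headroom) {
          return rep; // this segment can be fetched in time at this size
        }
      }
      return byHeightDesc[byHeightDesc.length - 1];
    }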
If there's no strong objection, I'm going to tinker with the quality
settings for WebM and Ogg Theora video transcodes to try to find quality
settings I'm happy with that result in reasonable bandwidth averages.
[1] An MPEG-DASH manifest (.mpd) specifies a target bitrate on each
resolution representation, but the actual segments can be different sizes.
When they're specified as byte ranges of a source file, the exact segment
size is conveniently available!
-- brion
FYI, this week's presentations, according to the Etherpad, are:
* *Derk-Jan Hartman*: Video.js progress
* *Dmitry Brant*: Wikidata infoboxes in Android app
* *Joaquin Hernandez*: Vicky chat bot
* *Baha*: mobile printing for offline reading
* *Monte*: "smart random" content service endpoint
* *Erik*: Geo boosting search queries
Cheers,
Pine
On Thu, May 12, 2016 at 9:17 AM, Adam Baso <abaso(a)wikimedia.org> wrote:
> Reminder...
>
> On Thu, Apr 14, 2016 at 12:13 AM, Adam Baso <abaso(a)wikimedia.org> wrote:
>
> > Hi all,
> >
> > The next CREDIT showcase will be Thursday, 12-May-2016 at 1800 UTC (1100
> > SF).
> >
> > https://www.mediawiki.org/wiki/CREDIT_showcase
> >
> > For this one we'll use Hangouts on Air for presenters, and the customary
> > YouTube stream for viewers.
> >
> > See you next month!
> > -Adam
> >
> >
> >
For the last decade we've supported uploading SVG vector images to
MediaWiki, but we serve them to browsers as rasterized PNGs. Display
resolutions keep going up and up, but so does concern about low-bandwidth
mobile users.
This means we'd like sharper icons and diagrams on high-density phone
displays, but are leery of adding extra srcset entries with 3x- or 4x-size
PNGs, which could become very large. (In fact, MobileFrontend currently
strips even the 1.5x and 2x renderings we have now, making diagrams very
blurry on many mobile devices. See https://phabricator.wikimedia.org/T133496 -
a fix is in the works.)
Here's the base bug for SVG client side rendering:
https://phabricator.wikimedia.org/T5593
I've turned it into an "epic" story tracking task and hung some blocking
tasks off it; see those for more details.
TL;DR stop reading here. ;)
One of the basic problems in the past was reliably showing them natively in
an <img>, with the same behavior as before, without using JavaScript hacks
or breaking the HTML caching layer. This is neatly resolved for current
browsers by using the "srcset" attribute -- the same one we use to specify
higher-resolution rasterizations. If instead of PNGs at 1.5x and 2x
density, we specify an SVG at 1x, the SVG will be loaded instead of the
default PNG.
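In other words, something along these lines (illustration only -- the paths
are made up, and this isn't the exact thumbnail markup MediaWiki emits):

    // Browsers that understand srcset also render SVG in <img>, so they pick
    // the vector version; older browsers fall back to the rasterized PNG.
    const img = document.createElement('img');
    img.src = '/thumb/Example_diagram.svg/300px-Example_diagram.svg.png'; // PNG fallback
    img.setAttribute('srcset', '/images/Example_diagram.svg 1x');         // native SVG
    img.width = 300;
    document.body.appendChild(img);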
Since all srcset-supporting browsers allow SVG in <img> this should "just
work", and will be more compatible than using the experimental <picture>
element or the classic <object> which deals with events differently. Older
browsers will still see the PNG, and we can tweak the jquery.hidpi srcset
polyfill to test for SVG support to avoid breaking on some older browsers.
This should let us start testing client-side SVG via a beta feature (with
parser cache split on the user pref) at which point we can gather more
real-world feedback on performance and compatibility issues.
Rendering consistency across browser engines is a concern. Supposedly
modern browsers are more consistent than librsvg but we haven't done a
compatibility survey to confirm this or identify problematic constructs.
This is probably worth doing.
Performance is a big question. While clean simple SVGs are often nice and
small and efficient, it's also easy to make a HUGEly detailed SVG that is
much larger than the rasterized PNGs. Or a fairly simple small file may
still render slowly due to use of filters.
So we probably want to provide good tools for our editors and image authors
to help optimize their files: show the renderings and the bandwidth balance
versus rasterization; maybe provide an in-wiki implementation of svgo or
other lossy optimizer tools; warn about things that are large or render
slowly; and maybe provide a switch to always run particular files through
rasterization.
And we'll almost certainly want to strip comments and whitespace to save
bandwidth on page views, while retaining them all in the source file for
download and re-editing.
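As a trivial illustration of the kind of transformation I mean -- svgo does
this and a lot more; this just drops comments and inter-element whitespace
using DOMParser in the browser:

    // Naive minifier: strip comment nodes and whitespace-only text nodes,
    // then reserialize. (Careful: whitespace inside <text> can matter, and a
    // real optimizer handles far more cases than this.)
    function stripCommentsAndWhitespace(source: string): string {
      const doc = new DOMParser().parseFromString(source, 'image/svg+xml');
      const walker = doc.createTreeWalker(
        doc.documentElement, NodeFilter.SHOW_COMMENT | NodeFilter.SHOW_TEXT);
      const junk: Node[] = [];
      for (let n = walker.nextNode(); n; n = walker.nextNode()) {
        if (n.nodeType === Node.COMMENT_NODE ||
            (n.nodeValue !== null && n.nodeValue.trim() === '')) {
          junk.push(n);
        }
      }
      junk.forEach(n => n.parentNode && n.parentNode.removeChild(n));
      return new XMLSerializer().serializeToString(doc);
    }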
Feature parity also needs more work. Localized text in SVGs is supported by
our server-side rendering, but this won't be reliable in the client, which
means we'll want to perform a server-side transformation that creates
per-language "thumbnail" SVGs. Fonts for internationalized text are a big
deal too, and may require similar transformations if we want to serve
them... which may mean additional complications and bandwidth usage.
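For the language part, the idea would be flattening SVG <switch> elements
server-side. A naive sketch of the selection rule (the real systemLanguage
matching rules are fussier than this):

    // Keep only the first child of each <switch> that passes for the target
    // language (children with no systemLanguage act as the fallback).
    function flattenSwitches(doc: Document, lang: string): void {
      doc.querySelectorAll('switch').forEach(sw => {
        const children = Array.from(sw.children);
        const keep = children.find(child => {
          const langs = child.getAttribute('systemLanguage');
          return !langs || langs.split(',').some(l => l.trim().startsWith(lang));
        });
        children.forEach(child => { if (child !== keep) sw.removeChild(child); });
      });
    }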
And then there are the longer-term goals of taking more advantage of SVG's
dynamic nature -- making things animated or interactive. That's a much
bigger question and has implementation and security issues!
-- brion
At Wikimedia Conference in Berlin I met with Felix from Wikimedia Ghana,
who is super interested in getting more immersive media available such as
360-degree panoramic photos ("photo spheres"). I showed him the Tool Labs
widget using Pannellum to do WebGL spherical photo viewing -- see
https://phabricator.wikimedia.org/T70719#2204864 -- and he was very excited
to see that it's something we could probably work out how to integrate in
the nearish term.
That got me thinking more generally about new media types (video, panos,
stereoscopic photos/videos/panos, 3D models, interactive diagrams, etc) and
how we can extend them to support annotations and linking in a way that
could create immersive visual experiences with the same kind of rich
information and interlinking that Wikipedia is famous for in the world of
text articles.
Ladies and gentlemen, I give you: "*Epic saga: immersive hypermedia (Myst
for Wikipedia)*"
https://phabricator.wikimedia.org/T133526
I would be real interested to hear y'all's ideas on medium to long term
feasibility and desirability of this sort of system, and what we can pull
more directly into the short term.
For instance, I would love to get the panoramic / spherical viewers
integrated into MMV, which is much easier than figuring out how to do
clickable annotations in a 3D environment. ;)
Medium term, I would also love to see us look at the annotation system on
Commons that's done in site JS, and see if we can build a
future-extensible system that's more integrated into the wiki and can be
used in MMV.
Longer term, I think it'll just be nice to have these kinds of long-term
goals to work towards.
Thoughts? Ideas? Am I crazy, or just crazy enough? ;)
-- brion
In addition to the smaller bandwidth requirements for VP9 video encoding
versus Theora or VP8, Microsoft is adding support for VP9 video and Opus
audio in WebM to Windows 10 in the summer 2016 update.
Currently in Win10 preview builds this only works in Edge when using Media
Source Extensions, and VP9 is disabled by default if not
hardware-accelerated, but it's coming along. :)
If the final version lands with suitable config, users of Edge version 15
and later shouldn't need the ogv.js JavaScript decoding shim to get media
playback on Wikipedia. Neat!
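On our side the capability check would look roughly like this (the
MIME/codecs strings are what I'd expect to work, not something I've verified
against the Edge preview):

    // Prefer native WebM VP9/Opus playback, either directly in <video> or via
    // Media Source Extensions (how the Win10 Edge preview exposes VP9 today);
    // otherwise fall back to the ogv.js shim.
    const webmVp9 = 'video/webm; codecs="vp9, opus"';

    function vp9PlaybackMode(): 'native' | 'mse' | 'ogvjs' {
      const video = document.createElement('video');
      if (video.canPlayType(webmVp9)) {
        return 'native';
      }
      if (typeof MediaSource !== 'undefined' && MediaSource.isTypeSupported(webmVp9)) {
        return 'mse';
      }
      return 'ogvjs';
    }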
Things still to do in TimedMediaHandler:
* add transcode output for audio-only files as Opus in WebM container
(Brion)
* keep working on the Kaltura->VideoJS front-end switch to make our lives
easier fixing the UI (Derk-Jan & Brion) and to prep for...
* eventually we'll want to use MPEG-DASH manifests and Media Source
Extensions to implement playback that's responsive to network and CPU speed
and can switch resolutions seamlessly. This may or may not be a
prerequisite for Win10 Edge playback if MS sticks with the MSE requirement.
* consider improving the transcode status overview at
Special:TimedMediaHandler; it reports errors in a way that doesn't scale
well.
Things still to do in Wikimedia site config:
* add VP9/Opus transcodes to our config (audio, 240p, 360p, 480p, 720p,
1080p definitely; consider 1440p and 2160p for Ultra-HD videos)
* consider dropping some VP8 sizes (desktop browsers that support VP8
should all support VP9 now; old Android versions that don't grok VP9 might
be the main remaining target for VP8)
Things to consider:
* VP9 is slower to encode than VP8, and a transition will require a lot of
back-running of existing files. We *will* need to assign more video
scalers, at least temporarily.
* I started writing a client-side bot to trigger new transcodes on old
files (rough sketch after this list). Would people prefer I finish that, or
prep a server-side script that someone in ops will have to babysit?
* in future, the ogv.js JS decoder shim will still be used for Safari and
IE 11, but I may be able to shift it from Theora to VP9 after making more
fixes to the WebM demuxer. Decoding is slower per pixel but at a lower
resolution you often get higher quality because of better compression and
handling of motion -- and bandwidth usage is much better, which should make
it a win on iPhones. This means eventually we may be able to reduce or drop
the Ogg output. This will also tie in with MPEG-DASH adaptive streaming, so
we should be able to pick the best size for the available CPU speed more
reliably than the current heuristic.
* longer term, the AOMedia codec will arrive (the initial code drop came out
recently, based on VP10) with definite support from Google, Mozilla, and
Microsoft. This should end up supplementing VP9 in a couple of years, and
should be even more bandwidth-efficient for high resolutions.
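For what it's worth, the core of the transcode bot is tiny; a rough sketch
follows (the action name and parameters here are from memory and would need
checking against TMH's actual API module):

    // Ask TMH to (re)queue one derivative for a file via the API. Assumes the
    // transcodereset action with title/transcodekey parameters; the key format
    // shown here is a guess.
    async function resetTranscode(apiUrl: string, title: string,
                                  transcodeKey: string, csrfToken: string) {
      const body = new URLSearchParams({
        action: 'transcodereset',
        format: 'json',
        title,                       // e.g. 'File:Example.webm'
        transcodekey: transcodeKey,  // e.g. '720p.vp9.webm'
        token: csrfToken,
      });
      const resp = await fetch(apiUrl, { method: 'POST', body });
      return resp.json();
    }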
-- brion
https://scaleyourcode.com/interviews/interview/23
It claims to resize JPEGs of "any size" in 25ms or less on m3.medium AWS instances.
While this is closed source, and is part of a worrying trend of closed SaaS
frameworks one has to pay by the hour, the author reveals enough technical
details to figure out how it works. Namely:
- the inspiration for the code is an unnamed Japanese paper which describes
how to process the Y, U and V components of a JPEG in parallel
- it uses a similar technique to the jpeg:size option of ImageMagick,
whereby only parts of the JPEG are read, instead of every pixel, according
to the needed target thumbnail size
- it leverages "vector math" in the processor, which I assume means AVX
instructions and registers
Essentially, it's parallelized decoding and resizing of JPEGs, using
hardware-specific instructions for optimization.
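For comparison, the "only read what you need" part is already available in
open source: libvips (and the sharp bindings for Node) use libjpeg's DCT
scaling to shrink on load, so something like this never fully decodes most
of the source pixels:

    import sharp from 'sharp';

    // libvips picks a 1/2, 1/4 or 1/8 scale at JPEG decode time when it can,
    // then resamples the rest of the way to the requested width.
    async function thumbnail(input: string, output: string, width: number): Promise<void> {
      await sharp(input)
        .resize(width)           // shrink-on-load happens under the hood
        .jpeg({ quality: 80 })
        .toFile(output);
    }

    thumbnail('original.jpg', 'thumb-320.jpg', 320).catch(console.error);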
Of course writing something similar would be a large undertaking. Let's
hope that the folks who work on ImageMagick/GraphicsMagick take note and
try to do just that :)
I have confirmation from pals at deviantArt (whose infrastructure is on AWS)
who tried it out that it's extremely fast -- to the point that they're
likely getting rid of their storage of intermediate resized images.
I have a feeling that we'll be seeing more of this sort of
hardware-optimized JPEG decoding/transcoding once Intel releases their
first CPUs with integrated FPGAs, which is supposed to happen soon-ish.
Unfortunately these Xeon CPUs will be released "in limited quantities, to
cloud providers first". Here's that annoying trend again...
Hi Multimedia enthusiasts, Commonists, and Wikitechers,
The Multimedia team has been hard at work building a new extension for
editing images on-wiki, and we believe we now have a workable demo
running on Labs! You can find it on our Multimedia Alpha Wiki[0], where
there are also instructions for testing.
Note that we have a list of known bugs and failings on that wiki, and we
are working on getting those fixed before we push the extension into any
kind of deployment - our next steps will likely be to put it on
test2wiki, then to push a BetaFeature to Commons if all goes well. We
will keep you updated with the status of the project as we progress.
If you find more bugs, or have concerns about this extension, you can
share them on the Village Pump[1]. You can also file a Phabricator task
against ImageTweaks[2] if you prefer to be in more direct contact with
the team about a technical issue.
Thanks for helping us test new stuff, and I look forward to getting this
great tool out to you soon!
[0] http://multimedia-alpha.wmflabs.org/wiki/index.php/Main_Page
[1]
https://commons.wikimedia.org/wiki/Commons:Village_pump#ImageTweaks_extensi…
[2]
https://phabricator.wikimedia.org/maniphest/task/create/?projects=ImageTwea…
--
Mark Holmquist
Lead Engineer
Multimedia Team
Wikimedia Foundation
http://marktraceur.info