Discussion:
Inter-spacing improvements
Roger B. Sidje
2005-09-01 03:48:16 UTC
As raised several times in this newsgroup (and privately with some
people), the current inter-space between MathML frames, albeit
context-sensitive, is not always ideal. To many people, the all too
common $dx$ in integrands has so far been showing up as a perplexing
"d x" in Gecko...

These inter-spacing issues started hitting Wikipedia folks particularly
hard, as they are investigating what it will take to move from TeX PNG
images to MathML, with currently over 50,000 equations on the English
Wikipedia alone...

Revisiting the inter-spacing is something that I have always hoped to
do. I have now been able to do so, and have submitted the changes on bug
306543.

[bugzilla]https://bugzilla.mozilla.org/show_bug.cgi?id=306543

You can see the improvements on this PDF document:

[PDF-400K]https://bugzilla.mozilla.org/attachment.cgi?id=194502

The PDF was produced by browsing this MathML Page:

[MathML-180K]http://abel.math.harvard.edu/~dmharvey/blahtex/0.2/page-34.php

I got the PDF from the browser (with the patch) by just printing that
page to a PDF driver (PDFCreator in this case), hence it is as-is. It is
hard to tell the difference with TeX PNGs now. (You will notice that the
maths use TeX fonts across the board instead of the usual Times Roman.
For the purpose of testing the patch, I did set TeX fonts in mathml.css
-- that's all. They look okay on the PDF but don't anti-alias very well
on the browser screen itself. That's why I still haven't set them as the
default in place of Times Roman in the release builds.)

While still not the end of the road, the patch solves most of the cases
that people have been talking about, as can be seen on the PDF. But as
usual, there are edge cases such as $y = b \, \sin t$ or

<mrow>
<mi>y</mi>
<mo>=</mo>
<mi>b</mi>
<mspace width="0.1em"/>
<mi>sin</mi>
<mo>&ApplyFunction;</mo>
<mi>t</mi>
</mrow>

This renders (with the patch) with only an inter-space of "0.1em"
between b and sin (cf. the PDF). By contrast, TeX will put \, in
*addition* to the intrinsic left-space that it knows for its operator
\sin. As \sin (and others) aren't built-in operators for us, I wonder
about special-casing <mspace> to recover TeX behavior here.

Keep a watch on bug 306543 to follow further developments.
Unfortunately, it is a large patch, and I don't feel like fighting to
get it into the upcoming Firefox 1.5 (Gecko1.8) which has already
branched off from the trunk (if somebody wants to fight for that, go
ahead...). In any case, expect these improvements in Firefox 2.0
(Gecko1.9) -- which might be a year or so away. In the meantime, they
will be in the nightly builds soon after the patch is checked into the trunk.
---
RBS

PS: There are some oddities with the spacing values of the factorial
operator "!". The values for form="prefix" and form="postfix" seem to
have been inverted in the sample MathML REC's Operator Dictionary.
Roger B. Sidje
2005-09-01 09:29:24 UTC
Post by Roger B. Sidje
<mrow>
<mi>y</mi>
<mo>=</mo>
<mi>b</mi>
<mspace width="0.1em"/>
<mi>sin</mi>
<mo>&ApplyFunction;</mo>
<mi>t</mi>
</mrow>
This renders (with the patch) with only an inter-space of "0.1em"
between b and sin.
On further thought, I consider this behavior okay.

Let's suppose that a renderer is given the above markup (without the
loaded assumption that it was generated from TeX). Why would the
renderer downplay the fact that the user is specifically asking for
<mspace width="0.1em"/> and do something more?

Conversely, assuming that TeX didn't add the extra left-space of \sin in
"b \, \sin t", its users would have anticipated that and used \,\,
instead -- if they wanted to recover the lost space. So this edge case seems
to be a TeX legacy affair. For now, I am going to leave it as is and not
special-case it, unless people have different ideas. TeX generators are
in a better position to account for the extra space by setting 2
thinspaces (about 0.3em) as the width of the generated <mspace>.
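
To make the generator-side idea concrete, here is a minimal sketch of how a TeX-to-MathML converter might fold TeX's implicit thin space into the explicit <mspace>. It assumes plain TeX's nominal values (18mu per em, 3mu for both \, and the thin space TeX puts between an ordinary symbol and an operator like \sin), which is slightly wider than the 0.1em used in the markup above. The function names are hypothetical and are not taken from blahtex or any other converter.

```python
from fractions import Fraction

# TeX measures math spacing in "mu"; 18mu make 1em of the math font.
MU_PER_EM = 18
# In plain TeX, one thin space (typed as backslash-comma) is 3mu wide.
THIN_MU = 3


def mspace_width_em(explicit_mu, before_operator):
    """Width in em for an <mspace> that replaces TeX spacing.

    explicit_mu     -- space the author typed, e.g. 3 for one thin space
    before_operator -- True when a named operator (sin, log, ...) follows
                       an ordinary symbol, where TeX adds a thin space
                       of its own on top of the typed one
    """
    total_mu = explicit_mu + (THIN_MU if before_operator else 0)
    return Fraction(total_mu, MU_PER_EM)


def mspace(explicit_mu, before_operator):
    """Serialize the computed width as an <mspace> element."""
    width = float(mspace_width_em(explicit_mu, before_operator))
    return '<mspace width="%.2fem"/>' % width
```

With these assumptions, mspace(3, True) gives a width of 0.33em for the "b \, \sin t" case, while mspace(3, False) stays at a single thin space.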
---
RBS
Roger B. Sidje
2005-09-07 23:17:49 UTC
FYI, this has been checked in. Apart from the nicer inter-spacing
itself, people are going to be particularly relieved to be rid of the nasty
spacing caused by fonts that returned a space for invisible operators
(&it;, &af;) instead of returning nothing. If there are still issues,
let me know quickly. There is some momentum towards getting these
inter-spacing improvements into Firefox 1.5 now rather than Firefox 2.0, a
year later. Fingers crossed...

Also of interest, I tweaked the gaps that were happening in stretchy
characters (particularly on Linux/Xft or newish high resolution screens
on Windows). The fix involved jamming the stretchy pieces together even more to
mitigate pixel roundoff issues or curved end-points (cf. bug 307157:
https://bugzilla.mozilla.org/show_bug.cgi?id=307157).

For those curious as to how I got that PDF (on WinXP), here is a diff
for the CSS settings that I made in mathml.css:
https://bugzilla.mozilla.org/attachment.cgi?id=195091

All the diff does is prepend CMR10 to the font-family of the
<math> element and set CMMI10 as the font-family of <mi> when it says
that it wants an italic style. Then, I used PDFCreator
(http://sourceforge.net/projects/pdfcreator/).

Of course this needs the other patches. They have all been checked in now
and will be picked up by those who build Gecko themselves. Otherwise, they
are in the latest nightly binaries:
http://ftp.mozilla.org/pub/mozilla.org/mozilla/nightly/latest-trunk/.

Although it may be tempting, be careful not to use the above CSS without at
least the patch in bug 247151. TeX fonts wrongly advertise themselves to
the GDI font engine as ASCII, but they aren't:
ASCII: http://czyborra.com/charsets/iso646.html
CMR10: [image]

These fonts are old and predate Unicode, so we can blame much BaKoMa and
friends. But they confuse the layout code and glyphs can be mixed up,
see bug 73539. After waiting this long for that bug, I made a provision
in that patch to at least redirect all MathML text to the Unicode path
to use font mapping tables from now on, even on "plain" text. It is a
bit slower, but much safer for MathML.
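
To illustrate why treating such a font as ASCII mixes up glyphs, and what a mapping-table lookup buys, here is a small sketch. The table is a hand-picked slice of the classic TeX text layout (OT1) that CMR10 is commonly described as using. It is an assumption for illustration only: the names are hypothetical, and this is neither Gecko's actual mapping table nor its code.

```python
# Illustrative slice of the classic TeX text layout (OT1) assumed for CMR10:
# slots that mean one thing in ASCII hold a different glyph in the font.
CMR10_SLOT_TO_UNICODE = {
    0x3C: "\u00A1",  # INVERTED EXCLAMATION MARK, where ASCII has "<"
    0x3E: "\u00BF",  # INVERTED QUESTION MARK, where ASCII has ">"
    0x5C: "\u201C",  # LEFT DOUBLE QUOTATION MARK, where ASCII has a backslash
    0x7B: "\u2013",  # EN DASH, where ASCII has "{"
    0x7C: "\u2014",  # EM DASH, where ASCII has "|"
}

# Reverse table: which slot really holds a given Unicode character.
UNICODE_TO_SLOT = {ch: slot for slot, ch in CMR10_SLOT_TO_UNICODE.items()}


def naive_glyph(ch):
    """What gets drawn if the font is trusted to be ASCII (slot = char code)."""
    return CMR10_SLOT_TO_UNICODE.get(ord(ch), ch)


def mapped_slot(ch):
    """Table lookup: the slot that holds ch, or None to try another font."""
    return UNICODE_TO_SLOT.get(ch)
```

Under the naive path, naive_glyph("<") comes back as an inverted exclamation mark, which is the kind of mix-up described above. The table path returns None for "<" (so another font has to supply it) and slot 0x7B for an en dash.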
---
RBS
_______________________________________________
Mozilla-mathml mailing list
http://mail.mozilla.org/listinfo/mozilla-mathml
Roger B. Sidje
2005-09-08 03:12:03 UTC
Oops, bad typo... I meant that as the "fonts are old and predate
Unicode, we can'T blame much BaKoMa and friends".

Plus of course if they were to be re-released as Unicode, they could
also break other legacy applications. So we are caught in the middle and
don't have much choice either way, other than working our way out in
Mozilla's Gecko layout engine as I explained.
---
RBS
Post by Roger B. Sidje
These fonts are old and predate Unicode, so we can blame much BaKoMa and
friends. But they confuse the layout code and glyphs can be mixed up,
see bug 73539. After waiting this long for that bug, I made a provision
in that patch to at least redirect all MathML text to the Unicode path
to use font mapping tables from now on, even on "plain" text. It is a
bit slower, but much safer for MathML.
---
RBS
James Cloos
2005-09-23 15:36:05 UTC
Roger> Plus of course if they were to be re-released as Unicode, they
Roger> could also break other legacy applications.

Most of the CM-derived fontset is available as part of the Unicode
font family Latin Modern. They have both Type 1 and OpenType versions
available.

I'm not sure whether they have all of the math fonts yet, though.

-JimC
--
James H. Cloos, Jr. <***@jhcloos.com>
Roger B. Sidje
2005-09-09 04:56:06 UTC
It was brought to my attention that I gave the wrong link for the
nightly precompiled binaries:

http://ftp.mozilla.org/pub/mozilla.org/mozilla/nightly/latest-trunk/

(these appear to be about two months old, despite the
/nightly/latest-trunk/).

The binaries to go for are here instead:

http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/
---
RBS
Roger B. Sidje
2005-09-17 00:41:18 UTC
I have checked in all the patches I had for improving the inter-space.
Fortunately, Mozilla's drivers were understanding and have approved them
on the Gecko 1.8 branch as well. This means that these improvements will
show up immediately in the upcoming Mozilla Firefox 1.5 and onwards,
starting with its next beta, Firefox 1.5 beta2. Brace yourself for this major
release, BTW. It will come with SVG built-in by default and other goodies.

The inter-space issues were about to overtake fonts as the number 1
complaint in MathML (witness the latest threads on this newsgroup...).
Indeed, more and more people have become knowledgeable about the
installation of fonts (or can at least google to see what others did).
And there were the dreaded &ApplyFunction; and &InvisibleTimes; on Linux
with no good work-around there. The patches have now fixed all these in
Gecko and hopefully these issues are going to become a thing of the past.

For those not familiar with the terminology of Gecko/branch/trunk/etc:

Gecko is the core engine shared by Firefox, Thunderbird (mail), nvu
(composer), Netscape7/8, etc, etc, so as not to re-invent the wheel.
Each of the wrapping applications uses some build flags to enable or
disable what is of interest to them in the core Gecko codebase.

When Gecko branches (a branch is a snapshot backed up somewhere at a
point in time), it gets a number (1.8 for the last branch). Then from
that branch comes the next Firefox (1.5 for the next one) or other
applications, which only get last-minute fixes approved by their
respective drivers.

Meanwhile, life goes on in the live code, a.k.a. the mainline trunk. So my
patches were going there, but not into the branch. That's what the various
comments about "requesting branch approval" were referring to. With the
branch approval granted, I could then check them into the 1.8 branch as
well, and that's why they will automatically show up in Firefox 1.5.
[Otherwise, they would have only been in the next 1.9 branch (since it
will be cut from the current mainline trunk when its time comes), and so
in the next Firefox (2.0) from that branch when its time comes.]

Hope this clarifies things... It can be confusing indeed for those who
catch the train on the way.

There is some info about branch/trunk/etc here:
http://forums.mozillazine.org/viewtopic.php?t=77077

---
RBS