OMML should write dPr in the order (begChr, sepChr, endChr), not (begChr, endChr, sepChr) #243

Closed
refashioned opened this issue Nov 13, 2024 · 1 comment


When I generate a docx file containing math with Pandoc (v3.5) and open it in LibreOffice Writer (v7.4.7.2), every right-hand delimiter turns into a right parenthesis, even when it should be something else, such as a square bracket or a vertical bar. I tracked the problem down to the order in which TeXMath writes the subelements of the dPr element: TeXMath uses the order (begChr, endChr, sepChr), while LibreOffice expects (begChr, sepChr, endChr). The right parenthesis is the default fallback in the LibreOffice code.

This is the code in TeXMath that writes the elements in the order (begChr, endChr, sepChr):
https://github.com/jgm/texmath/blob/0.12.8.11/src/Text/TeXMath/Writers/OMML.hs#L209-L216

   EDelimited start end xs ->
                  [ mnode "d" $ mnode "dPr"
                               [ mnodeA "begChr" (T.unpack start) ()
                               , mnodeA "endChr" (T.unpack end) ()
                               , mnodeA "sepChr" (T.unpack sepchr) ()
                               , mnode "grow" () ]
                              : map (mnode "e" . concatMap (showExp props)) es
                  ]

This is the code in LibreOffice that expects the order (begChr, sepChr, endChr) and supplies ")" as a default:
https://git.libreoffice.org/core/+/refs/tags/libreoffice-24.8.3.2/starmath/source/ooxmlimport.cxx#301

    OUString opening = u"("_ustr;
    OUString closing = u")"_ustr;
    OUString separator = u"|"_ustr;
    if( XmlStream::Tag dPr = m_rStream.checkOpeningTag( M_TOKEN( dPr )))
    {
        if( XmlStream::Tag begChr = m_rStream.checkOpeningTag( M_TOKEN( begChr )))
        {
            opening = begChr.attribute( M_TOKEN( val ), opening );
            m_rStream.ensureClosingTag( M_TOKEN( begChr ));
        }
        if( XmlStream::Tag sepChr = m_rStream.checkOpeningTag( M_TOKEN( sepChr )))
        {
            separator = sepChr.attribute( M_TOKEN( val ), separator );
            m_rStream.ensureClosingTag( M_TOKEN( sepChr ));
        }
        if( XmlStream::Tag endChr = m_rStream.checkOpeningTag( M_TOKEN( endChr )))
        {
            closing = endChr.attribute( M_TOKEN( val ), closing );
            m_rStream.ensureClosingTag( M_TOKEN( endChr ));
        }
        m_rStream.ensureClosingTag( M_TOKEN( dPr ));
    }
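One way to see why the closing delimiter falls back to ")" is to model the reader as a forward-only cursor. The sketch below is a toy model, not LibreOffice code: it assumes that each checkOpeningTag call scans forward from the current position, skips siblings that do not match, and leaves the cursor where it was if nothing matches. The name `read_dpr` and the list-of-pairs input are made up for illustration.

```python
def read_dpr(children):
    """Toy model of the dPr reader.

    children: list of (element name, val attribute) pairs in document order.
    Returns (opening, separator, closing).
    """
    # Same defaults as in the LibreOffice excerpt above.
    values = {"begChr": "(", "sepChr": "|", "endChr": ")"}
    pos = 0
    for wanted in ("begChr", "sepChr", "endChr"):
        # ASSUMPTION: the lookup scans forward and skips non-matching siblings.
        for i in range(pos, len(children)):
            name, val = children[i]
            if name == wanted:
                values[wanted] = val
                pos = i + 1  # everything before this point is now behind the cursor
                break
        # No match: the cursor stays put and the default value survives.
    return values["begChr"], values["sepChr"], values["endChr"]


texmath_order = [("begChr", "["), ("endChr", "]"), ("sepChr", ""), ("grow", None)]
schema_order = [("begChr", "["), ("sepChr", ""), ("endChr", "]"), ("grow", None)]

print(read_dpr(texmath_order))  # ('[', '', ')')
print(read_dpr(schema_order))   # ('[', '', ']')
```

Under that assumption, looking for sepChr moves the cursor past endChr, so the later endChr lookup finds nothing and ")" is kept, which matches the rendering reported below.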

I'm not an expert in OMML or XML Schema, but I found this page, which declares the contents of dPr as a sequence. I take that to mean that sepChr must precede endChr:

<complexType name="CT_DPr">
	<sequence>
		<element name="begChr" type="CT_Char" minOccurs="0"/>
		<element name="sepChr" type="CT_Char" minOccurs="0"/>
		<element name="endChr" type="CT_Char" minOccurs="0"/>
		<element name="grow" type="CT_OnOff" minOccurs="0"/>
		<element name="shp" type="CT_Shp" minOccurs="0"/>
		<element name="ctrlPr" type="CT_CtrlPr" minOccurs="0"/>
	</sequence>
</complexType>
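The ordering rule in that sequence can be checked mechanically. This is a minimal sketch using only the Python standard library; the helper names `dpr_in_schema_order` and `parse_dpr` are made up for illustration, and the check covers only child order (each child at most once, in the listed order), not full schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace normally bound to the "m" prefix in OMML.
M_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"

# Child order copied from the CT_DPr sequence quoted above.
DPR_ORDER = ["begChr", "sepChr", "endChr", "grow", "shp", "ctrlPr"]


def dpr_in_schema_order(dpr):
    """Return True if the known children of an m:dPr element follow CT_DPr order."""
    prefix = "{" + M_NS + "}"
    ranks = [
        DPR_ORDER.index(child.tag[len(prefix):])
        for child in dpr
        if child.tag.startswith(prefix) and child.tag[len(prefix):] in DPR_ORDER
    ]
    # Strictly increasing ranks: correct order and no repeated element.
    return all(a < b for a, b in zip(ranks, ranks[1:]))


def parse_dpr(inner_xml):
    """Wrap a fragment in an m:dPr element so the m prefix is declared."""
    return ET.fromstring('<m:dPr xmlns:m="%s">%s</m:dPr>' % (M_NS, inner_xml))


written = parse_dpr('<m:begChr m:val="["/><m:endChr m:val="]"/><m:sepChr m:val=""/><m:grow/>')
expected = parse_dpr('<m:begChr m:val="["/><m:sepChr m:val=""/><m:endChr m:val="]"/><m:grow/>')

print(dpr_in_schema_order(written))   # False
print(dpr_in_schema_order(expected))  # True
```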

This is a small reproduction recipe:

echo '$|x| = \max[x, -x]$' | pandoc -t docx -o x.docx

In my LibreOffice Writer, it renders like this:

|x)=max[x,−x]

When I double-click the formula, this is how it's serialized in the editor:

left lline x right ) = max left [x , − x right )

This is the OMML output:

<m:oMath>
  <m:d>
    <m:dPr>
      <m:begChr m:val="|" />
      <m:endChr m:val="|" />
      <m:sepChr m:val="" />
      <m:grow />
    </m:dPr>
    <m:e>
      <m:r>
        <m:t>x</m:t>
      </m:r>
    </m:e>
  </m:d>
  <m:r>
    <m:rPr>
      <m:sty m:val="p" />
    </m:rPr>
    <m:t>=</m:t>
  </m:r>
  <m:r>
    <m:rPr>
      <m:sty m:val="p" />
    </m:rPr>
    <m:t>max</m:t>
  </m:r>
  <m:d>
    <m:dPr>
      <m:begChr m:val="[" />
      <m:endChr m:val="]" />
      <m:sepChr m:val="" />
      <m:grow />
    </m:dPr>
    <m:e>
      <m:r>
        <m:t>x</m:t>
      </m:r>
      <m:r>
        <m:rPr>
          <m:sty m:val="p" />
        </m:rPr>
        <m:t>,</m:t>
      </m:r>
      <m:r>
        <m:rPr>
          <m:sty m:val="p" />
        </m:rPr>
        <m:t>−</m:t>
      </m:r>
      <m:r>
        <m:t>x</m:t>
      </m:r>
    </m:e>
  </m:d>
</m:oMath>

If I hack word/document.xml inside the .docx file and swap the order of endChr and sepChr, then the delimiters render correctly.
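That manual swap can be scripted. The following is a rough sketch rather than a robust tool: it assumes both elements are self-closing, use the `m:` prefix, and are separated only by whitespace, exactly as in the output shown above. The function names `swap_end_and_sep` and `fix_docx` are invented for this example.

```python
import re
import zipfile

# Matches <m:endChr .../> immediately followed (modulo whitespace) by <m:sepChr .../>.
# ASSUMPTION: self-closing tags with the "m:" prefix and no literal ">" in attribute values.
_END_THEN_SEP = re.compile(r'(<m:endChr\b[^>]*/>)(\s*)(<m:sepChr\b[^>]*/>)')


def swap_end_and_sep(xml_text):
    """Rewrite (endChr, sepChr) pairs as (sepChr, endChr), keeping the whitespace."""
    return _END_THEN_SEP.sub(r'\3\2\1', xml_text)


def fix_docx(src, dst):
    """Copy the docx at src to dst, patching only word/document.xml."""
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                data = swap_end_and_sep(data.decode("utf-8")).encode("utf-8")
            zout.writestr(item, data)
```

For the recipe above, `fix_docx("x.docx", "x-fixed.docx")` would write a patched copy next to the original. A dPr that is already in schema order is left untouched, because the pattern only matches endChr followed by sepChr.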

jgm (Owner) commented Nov 13, 2024

Thanks for the excellent bug report!

jgm closed this as completed in 95b3f28 on Nov 13, 2024