<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="en"><head><title>Arabic mathematical notation</title><style type="text/css">
code { font-family: monospace; }
div.constraint,
div.issue,
div.note,
div.notice { margin-left: 2em; }
li p { margin-top: 0.3em;
margin-bottom: 0.3em; }
div.exampleInner pre { margin-left: 1em;
margin-top: 0em; margin-bottom: 0em}
div.exampleOuter {border: 4px double gray;
margin: 0em; padding: 0em}
div.exampleInner { background-color: #d5dee3;
border-top-width: 4px;
border-top-style: double;
border-top-color: #d3d3d3;
border-bottom-width: 4px;
border-bottom-style: double;
border-bottom-color: #d3d3d3;
padding: 4px; margin: 0em }
div.exampleWrapper { margin: 4px }
div.exampleHeader { font-weight: bold;
margin: 4px}
</style><link type="text/css" rel="stylesheet" href="http://www.w3.org/StyleSheets/TR/W3C-IG-NOTE.css"></head><body><div class="head"><p><a href="http://www.w3.org/"><img width="72" height="48" alt="W3C" src="http://www.w3.org/Icons/w3c_home"></a></p>
<h1><a id="title" name="title"></a>Arabic mathematical notation</h1>
<h2><a id="w3c-doctype" name="w3c-doctype"></a>W3C Interest Group Note 31 January 2006</h2><dl><dt>This version:</dt><dd>
<a href="http://www.w3.org/TR/2006/NOTE-arabic-math-20060131">http://www.w3.org/TR/2006/NOTE-arabic-math-20060131</a>
</dd><dt>Latest version:</dt><dd><a href="http://www.w3.org/TR/arabic-math/">http://www.w3.org/TR/arabic-math/</a></dd><dt>Previous version:</dt><dd>This is the first version</dd><dt>Editors:</dt><dd>Azzeddine Lazrek, with Mustapha Eddahibi and Khalid Sami, Cadi Ayyad University - Marrakech, Morocco
</dd><dd>Bruce R. Miller, National Institute of Standards and Technology, USA</dd></dl><p>This document is also available in these non-normative formats: <a href="arabic.xhtml">XHTML+MathML version</a>.</p><p class="copyright"><a href="http://www.w3.org/Consortium/Legal/ipr-notice#Copyright"> Copyright</a> ©2006 <a href="http://www.w3.org/"><acronym title="World Wide Web Consortium">W3C</acronym></a><sup>®</sup> (<a href="http://www.csail.mit.edu/"><acronym title="Massachusetts Institute of Technology">MIT</acronym></a>, <a href="http://www.ercim.org/"><acronym title="European Research Consortium for Informatics and Mathematics">ERCIM</acronym></a>, <a href="http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. W3C <a href="http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">liability</a>, <a href="http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">trademark</a> and <a href="http://www.w3.org/Consortium/Legal/copyright-documents">document use</a> rules apply.</p></div><hr><div>
<h2><a id="abstract" name="abstract"></a>Abstract</h2><p>
This Note analyzes potential problems with the use of MathML for the
presentation of mathematics in the notations customarily used with Arabic
and related languages. The goal is to clarify avoidable implementation details that hinder such presentation,
as well as to uncover genuine limitations in the MathML specification.
Such limitations may require extensions in future versions of the specification.</p></div><div>
<h2><a id="status" name="status"></a>Status of this Document</h2><p><em>This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the <a href="http://www.w3.org/TR/">W3C technical reports index</a> at http://www.w3.org/TR/.</em></p><p>This Note is a self-contained discussion of Arabic mathematical notation in
MathML. It provides guidelines for the handling of Arabic mathematical
presentation using MathML 2
Recommendation (2nd Edition) <a href="#MathML22e">[MathML22e]</a>
and suggests extensions for a future revision. </p><p>This Note has been written by participants in the <a href="http://www.w3.org/Math/Group/">Math Interest Group</a> (W3C
members only) which is part of the <a href="http://www.w3.org/Math/Activity">W3C Math activity</a>. Please direct
comments and report errors in this document to <a href="mailto:www-math@w3.org">www-math@w3.org</a>, a mailing list with a public <a href="http://lists.w3.org/Archives/Member/member-math/">archive</a>.
</p><p>Publication as an Interest Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.</p></div><div class="toc">
<h2><a id="contents" name="contents"></a>Table of Contents</h2><p class="toc">1 <a href="#Introduction">Introduction</a><br>
2 <a href="#ArabicScript">Some Features of Arabic Script</a><br>
2.1 <a href="#N100F4">Text Direction</a><br>
2.2 <a href="#GlyphShaping">Glyph Shaping</a><br>
2.3 <a href="#Mirroring">Mirroring</a><br>
2.4 <a href="#NumberSystems">Number Systems</a><br>
3 <a href="#Comparison">Comparison of Mathematical Notations</a><br>
3.1 <a href="#Moroccan">Arabic Notation; Moroccan Style</a><br>
3.2 <a href="#Maghreb">Arabic Notation; Maghreb Style</a><br>
3.3 <a href="#Machrek">Arabic Notation; Machrek Style</a><br>
3.4 <a href="#N106E1">Additional Arabic Notations</a><br>
3.5 <a href="#Persian">Persian</a><br>
4 <a href="#Proposals">Proposals and Clarifications</a><br>
4.1 <a href="#BiDiProposal">Clarification of bidirectional Algorithm for MathML</a><br>
4.2 <a href="#GlyphShapingProposal">Glyph Shaping</a><br>
4.3 <a href="#N10951">Additional Mathvariants</a><br>
4.4 <a href="#MirroringProposal">Mirroring</a><br>
4.5 <a href="#N10A2B">Horizontal Stretchiness</a><br>
4.6 <a href="#N10A3B">Additional Constructs</a><br>
5 <a href="#N10A49">Conclusions and Future Work</a><br>
6 <a href="#N10A58">Acknowledgments</a><br>
7 <a href="#N10A5F">Production Notes</a><br>
</p>
<h3><a id="appendices" name="appendices"></a>Appendices</h3><p class="toc">A <a href="#Localization">Localization Issues</a><br>
A.1 <a href="#NumberSystem2">Number Systems</a><br>
A.2 <a href="#SymbolsChoice">Symbols Choice</a><br>
B <a href="#Implementation">Implementation Issues</a><br>
B.1 <a href="#CharactersEncoding">Character Encoding</a><br>
B.2 <a href="#MathematicalFonts">Mathematical Fonts</a><br>
B.3 <a href="#N10AEE">Symbol Stretching</a><br>
B.4 <a href="#SoftwareTools">Software Tools</a><br>
C <a href="#N10B92">Bibliography</a><br>
</p></div><hr><div class="body"><div class="div1">
<h2><a id="Introduction" name="Introduction"></a>1 Introduction</h2><p>As the World Wide Web becomes more world wide, inclusion of the world's many languages,
scripts and cultures becomes critical. Although the development of the Mathematical
Markup Language (MathML) <a href="#MathML22e">[MathML22e]</a> was neither intentionally nor
explicitly exclusive of non-European languages and scripts,
the focus was on the notational schema used with European languages. Indeed, most of these
notations are used unchanged in many other contexts. However, there are variations introduced
in some languages, either for historical reasons, or to fit within various writing systems,
which MathML should accommodate for improved international support (in particular educational
material requiring these variations, or historical documents).</p><p>While European languages are written left to right (LTR), Arabic, among others, is
written right to left (RTL). We will see that in Arabic mathematical texts many of the
same notational constructs are used, but may be reversed or <a href="#Mirroring">mirrored</a>,
depending on the cultural context; what we will call a <em>mathematical directionality</em>.
The mathematical directionality is not necessarily the same as the text directionality.
Moreover, since the mathematical material may commonly contain text and symbols coming from
both Arabic and European languages, the question of how the Unicode bidirectional algorithm
<a href="#UnicodeBiDi">[UnicodeBiDi]</a> should be applied arises.
Finally, several additional symbols and writing styles may be used in special ways.</p><p><img src="arabic-images/khtout.png" alt="[Arabic Script samples]"></p><p>Arabic calligraphy is enriched by a variety of writing styles,
just as European writing benefits from a variety of fonts. The graphic above illustrates
several Arabic calligraphic styles; each word is the name of the corresponding style.
Just as European mathematics broadens the set of distinct symbols available by
using bold face, Fraktur or other styles, Arabic mathematics does so too, but typically
by varying strokes or adding tails or other extensions.</p><p>A given piece of mathematics marked up in
<a href="http://www.w3.org/TR/MathML2/chapter4.html">Content MathML</a>
(<a href="#MathML22e">[MathML22e]</a>, chapter 4) is generally language-neutral: although the
choice of variable names may imply a cultural context,
the markup is intended to represent the universal meaning of the mathematics.
A given piece of mathematics marked up in
<a href="http://www.w3.org/TR/MathML2/chapter3.html">Presentation MathML</a>
(<a href="#MathML22e">[MathML22e]</a>, chapter 3),
on the other hand, conveys the visual appearance of the expression. That appearance
necessarily targets a specific language and its notational conventions, and even those of
the scientific discipline involved.
In this Note, we amplify and formalize this segregation of concerns:
Presentation MathML should be a fairly literal
representation of the visual notation to be used.</p><p>We relegate all <a href="#Localization">localization</a> issues
— which symbol to use for summation, which name to use for tangent, what
format to use for numbers — to the generator of the Presentation MathML,
rather than the renderer. This avoids guessing, perhaps wrongly, what number is
intended while deciding whether to replace periods by commas, for example. Thus,
localization entails the choice of what
text content to place within MathML's token elements, but that choice is already fixed
within a given piece of Presentation MathML.</p><p>In this Note, we have attempted to examine all notational conventions in current
use with Arabic and languages written using Arabic script, without giving preference
to one form over another.
We aim to clarify the specification of MathML, proposing extensions where needed,
so that MathML has the broadest coverage possible. Nevertheless, an in-depth analysis of issues
affecting other languages, particularly those written top to bottom, is a topic for future study.
The emphasis on Arabic partly reflects an increased interest in, and
usage of, MathML in Arabic-language contexts, which has highlighted the issues described here.
Another topic for future study is how Content MathML might best support the transformation
to appropriately localized Presentation MathML.</p></div><div class="div1">
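<p>As a small illustration of this division of labor, the sketch below shows the same approximation as a generator might emit it for English and for French readers. The token contents (<code>3.141</code> and <code>3,141</code>) are the ones used in the comparison of section 3; the bare <code>mrow</code> wrappers are our own simplification. Under the approach taken in this Note, the renderer displays each token as given and does not convert one form into the other. Function names are handled the same way, for example <code>tan</code> versus <code>tg</code>.</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- English localization: period as the decimal separator -->
<mrow>
<mi>π</mi>
<mo>≃</mo>
<mn>3.141</mn>
</mrow>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- French localization: comma as the decimal separator -->
<mrow>
<mi>π</mi>
<mo>≃</mo>
<mn>3,141</mn>
</mrow>
</math></pre>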
<h2><a id="ArabicScript" name="ArabicScript"></a>2 Some Features of Arabic Script</h2><p>Before delving into mathematical notations, it will help to describe some
of the features of Arabic script, and how Unicode deals with these features.</p><div class="div2">
<h3><a id="N100F4" name="N100F4"></a>2.1 Text Direction</h3><p>While European languages are written from left to right (LTR), Arabic is written from
right to left (RTL). Unicode supports these scripts by not only defining codepoints
for the individual characters of these languages, but by recording the directionality
of each character.</p><p>When a mixture of LTR and RTL characters appears in text (i.e., bidirectional or BiDi text,
such as an English text that includes Arabic words),
Unicode's bidirectional algorithm <a href="#UnicodeBiDi">[UnicodeBiDi]</a> determines the order in which
the characters will be displayed. All adjacent strongly-typed RTL characters (such as those
in a single Arabic word) will be presented in right-to-left order, and vice versa for
strongly-typed LTR characters. A cluster of characters with the same directionality
is called a <em>directional run</em>.</p><p>Within any given "paragraph", directional runs are then ordered according to the
overall <em>directional context</em>. The bidirectional algorithm allows for higher-level
protocols to determine which <em>segments</em> of a structured text constitute "paragraphs"
in this sense. For example, in HTML block-level elements are taken as the
paragraph segments. The top-level <code>html</code> tag determines the directional context
which can be changed on lower-level elements using the <code>dir</code> attribute.</p><p>For a gentle introduction to bidirectional text, see
<a href="#UnicodeBiDiIntro">[UnicodeBiDiIntro]</a>.</p></div><div class="div2">
<h3><a id="GlyphShaping" name="GlyphShaping"></a>2.2 Glyph Shaping</h3><p>As Arabic is a calligraphic script, letters within words are typically joined together.
When text in such calligraphic scripts is specified by character sequences, a
process called <em>shaping</em> is used to blend, or connect the character glyphs.
In Arabic words consisting of a single character, that character is drawn in the "isolated"
style. In multi-character words, alternative shapes are generally used depending on position:
the first (rightmost) character is drawn in its "initial" shape,
the last (leftmost) character gets its "final" shape, and any characters in the middle
are of the "medial" shape.</p><p>Compare the isolated characters غ ي ر
to the result of glyph shaping غير.</p></div><div class="div2">
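<p>In MathML terms, the same contrast can be sketched as follows, assuming a renderer that applies shaping within each token element, as this Note argues it should. In the first fragment the three letters share a single <code>mtext</code>, so they should be joined. In the second each letter is its own <code>mi</code>, so each should keep its isolated shape. The choice of <code>mtext</code> and <code>mi</code> here is ours, for illustration only.</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- one token: the three letters should be shaped and joined -->
<mtext>غير</mtext>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- three tokens: each letter should keep its isolated shape -->
<mrow>
<mi>غ</mi>
<mi>ي</mi>
<mi>ر</mi>
</mrow>
</math></pre>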
<h3><a id="Mirroring" name="Mirroring"></a>2.3 Mirroring</h3><p>Some characters, viewed abstractly, have the same meaning in many languages,
but the forms used in RTL languages are roughly the mirror images of the
forms used in LTR languages. Parentheses and quotation marks are such characters.
Unicode deals with these cases by marking some codepoints as mirrored, meaning
that an alternate glyph will be used for the character if it appears in an RTL context.</p><p>Note that Unicode does not require mirrored symbols (see
<a href="http://www.unicode.org/reports/tr9/#Mirroring">Mirroring</a>
in <a href="#UnicodeBiDi">[UnicodeBiDi]</a>, section 6) to be literally
the exact mirror image. Indeed, it is considered an important point of Arabic calligraphy
that they are not: the nib of the pen (kalam) is a flat rectangle. The writer holds the pen so that
the longest side of the nib makes an angle of approximately 70° with the baseline.
This orientation is kept throughout the process of drawing the character.
Furthermore, as Arabic writing runs from right to left, strokes running
from top left toward bottom right come out bold, whereas
strokes from top right to bottom left come out rather thin.
Thus, the Arabic sum symbol <img src="arabic-images/sigmaa.png" alt="Arabic Sigma">,
for example, is not simply the mirror image
<img src="arabic-images/sigman.png" alt="Mirrored Sigma">
of sigma <img src="arabic-images/sigmal.png" alt="Sigma">.</p></div><div class="div2">
<h3><a id="NumberSystems" name="NumberSystems"></a>2.4 Number Systems</h3><p>There are several decimal numeral systems in use in Arabic: </p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">System</th><th rowspan="1" colspan="1">Unicode</th><th colspan="10" rowspan="1">Digits</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">Regions</th></tr></thead><tbody>
<tr><th rowspan="1" colspan="1">European</th><td rowspan="1" colspan="1">U+0030–U+0039</td><td rowspan="1" colspan="1">0</td><td rowspan="1" colspan="1">1</td><td rowspan="1" colspan="1">2</td><td rowspan="1" colspan="1">3</td><td rowspan="1" colspan="1">4</td><td rowspan="1" colspan="1">5</td><td rowspan="1" colspan="1">6</td><td rowspan="1" colspan="1">7</td><td rowspan="1" colspan="1">8</td><td rowspan="1" colspan="1">9</td><td rowspan="1" colspan="1"></td><td rowspan="1" colspan="1">Maghreb Arab (e.g., Morocco), as well as European</td></tr>
<tr><th rowspan="1" colspan="1">Arabic-Indic</th><td rowspan="1" colspan="1">U+0660–U+0669</td><td rowspan="1" colspan="1">٠</td><td rowspan="1" colspan="1">١</td><td rowspan="1" colspan="1">٢</td><td rowspan="1" colspan="1">٣</td><td rowspan="1" colspan="1">٤</td><td rowspan="1" colspan="1">٥</td><td rowspan="1" colspan="1">٦</td><td rowspan="1" colspan="1">٧</td><td rowspan="1" colspan="1">٨</td><td rowspan="1" colspan="1">٩</td><td rowspan="1" colspan="1"><img src="arabic-images/arind.png" alt="[Image of Arabic-Indic Digits]"></td><td rowspan="1" colspan="1">Machrek Arab (e.g., Egypt)</td></tr>
<tr><th rowspan="1" colspan="1">Eastern Arabic-Indic</th><td rowspan="1" colspan="1">U+06F0–U+06F9</td><td rowspan="1" colspan="1">۰</td><td rowspan="1" colspan="1">۱</td><td rowspan="1" colspan="1">۲</td><td rowspan="1" colspan="1">۳</td><td rowspan="1" colspan="1">۴</td><td rowspan="1" colspan="1">۵</td><td rowspan="1" colspan="1">۶</td><td rowspan="1" colspan="1">۷</td><td rowspan="1" colspan="1">۸</td><td rowspan="1" colspan="1">۹</td><td rowspan="1" colspan="1"><img src="arabic-images/esarind.png" alt="[Image of Eastern Arabic-Indic Digits]"></td><td rowspan="1" colspan="1">Iran</td></tr></tbody></table></div></div><div class="div1">
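<p>For illustration, the number twelve is sketched below as an <code>mn</code> token in each of the three systems listed above. In all three cases the characters are stored with the most significant digit first. Which system to emit is a localization decision taken when the MathML is generated; the choice of the value twelve is ours.</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- European digits: U+0031 U+0032 -->
<mn>12</mn>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- Arabic-Indic digits: U+0661 U+0662 -->
<mn>١٢</mn>
</math>
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<!-- Eastern Arabic-Indic digits: U+06F1 U+06F2 -->
<mn>۱۲</mn>
</math></pre>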
<h2><a id="Comparison" name="Comparison"></a>3 Comparison of Mathematical Notations</h2><p>We will explore the spectrum of notations by choosing some samples of mathematical
content and comparing how they would typically be rendered for different languages and cultures.
We begin with an expression formatted as it might be seen in both English
and French contexts.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"><img src="arabic-images/expren.png" alt="[Image of formula in English style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mrow>
<mi>f</mi>
<mo>⁡</mo>
<mrow>
<mo>(</mo>
<mi>x</mi>
<mo>)</mo>
</mrow>
</mrow>
<mo>=</mo>
<mrow>
<mo>{</mo>
<mtable>
<mtr>
<mtd>
<mrow>
<munderover>
<mo movablelimits="false">∑</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>s</mi>
</munderover>
<mo>⁡</mo>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext> if </mtext>
<mi>x</mi>
<mo>&lt;</mo>
<mn>0</mn>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<msubsup>
<mo>∫</mo>
<mn>1</mn>
<mi>s</mi>
</msubsup>
<mo>⁡</mo>
<mrow>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
<mo>⁢</mo>
<mi>d</mi>
<mo>⁡</mo>
<mi>x</mi>
</mrow>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext> if </mtext>
<mi>x</mi>
<mo>∈</mo>
<mi mathvariant="normal">S</mi>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<mi>tan</mi>
<mo>⁡</mo>
<mi>π</mi>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext> otherwise </mtext>
<mrow>
<mo>(</mo>
<mtext>with </mtext>
<mi>π</mi>
<mo>≃</mo>
<mn>3.141</mn>
<mo>)</mo>
</mrow>
</mrow>
</mtd>
</mtr>
</mtable>
</mrow>
</mrow>
</math></pre>
</td></tr><tr><th rowspan="1" colspan="1">French</th><td rowspan="1" colspan="1"><img src="arabic-images/exprfr.png" alt="[Image of formula in French style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mrow>
<mi>f</mi>
<mo>⁡</mo>
<mrow>
<mo>(</mo>
<mi>x</mi>
<mo>)</mo>
</mrow>
</mrow>
<mo>=</mo>
<mrow>
<mo>{</mo>
<mtable>
<mtr>
<mtd>
<mrow>
<munderover>
<mo movablelimits="false">∑</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>s</mi>
</munderover>
<mo>⁡</mo>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext> si </mtext>
<mi>x</mi>
<mo>&lt;</mo>
<mn>0</mn>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<msubsup>
<mo>∫</mo>
<mn>1</mn>
<mi>s</mi>
</msubsup>
<mo>⁡</mo>
<mrow>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
<mo>⁢</mo>
<mi>d</mi>
<mo>⁡</mo>
<mi>x</mi>
</mrow>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext> si </mtext>
<mi>x</mi>
<mo>∈</mo>
<mi mathvariant="normal">E</mi>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<mi>tg</mi>
<mo>⁡</mo>
<mi>π</mi>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext> sinon </mtext>
<mrow>
<mo>(</mo>
<mtext>avec </mtext>
<mi>π</mi>
<mo>≃</mo>
<mn>3,141</mn>
<mo>)</mo>
</mrow>
</mrow>
</mtd>
</mtr>
</mtable>
</mrow>
</mrow>
</math></pre>
</td></tr></table><p>Structurally, the expressions are identical. The differences in names,
number formatting and of course the language used for the connecting words are all
due to localization. They are effected purely by
differing textual content within the MathML token elements.</p><p>In the following sections, we will examine three common styles used
for mathematics within Arabic texts. The terms Moroccan, Maghreb and Machrek will be
used to indicate the general geographic areas where these styles are used, but
there are no clearly defined borders between the regions.</p><div class="div2">
<h3><a id="Moroccan" name="Moroccan"></a>3.1 Arabic Notation; Moroccan Style</h3><p>The current way of writing mathematical expressions in Morocco
is closely related to the French style:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Moroccan</th><td rowspan="1" colspan="1"><img src="arabic-images/exprfrm.png" alt="[Image of formula in Moroccan style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mrow>
<mi>f</mi>
<mo>⁡</mo>
<mrow>
<mo>(</mo>
<mi>x</mi>
<mo>)</mo>
</mrow>
</mrow>
<mo>=</mo>
<mrow>
<mo>{</mo>
<mtable>
<mtr>
<mtd>
<mrow>
<munderover>
<mo movablelimits="false">∑</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>s</mi>
</munderover>
<mo>⁡</mo>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext>إذاكان </mtext>
<mi>x</mi>
<mo>&lt;</mo>
<mn>0</mn>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<msubsup>
<mo>∫</mo>
<mn>1</mn>
<mi>s</mi>
</msubsup>
<mo>⁡</mo>
<mrow>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
<mo>⁢</mo>
<mi>d</mi>
<mo>⁡</mo>
<mi>x</mi>
</mrow>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext>إذاكان </mtext>
<mi>x</mi>
<mo>∈</mo>
<mi mathvariant="normal">E</mi>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<mi>tg</mi>
<mo>⁡</mo>
<mi>π</mi>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext>غيرذلك </mtext>
<mrow>
<mo>(</mo>
<mi>π</mi>
<mo>≃</mo>
<mn>3,141</mn>
<mtext>مع</mtext>
<mo>)</mo>
</mrow>
</mrow>
</mtd>
</mtr>
</mtable>
</mrow>
</mrow>
</math></pre>
</td></tr></table><p>Although the mathematics would be embedded within an RTL language (Arabic),
its directionality is still LTR. The connecting words and phrases within the math, however,
are RTL Arabic, and <em>should</em> be subject to <a href="#GlyphShaping">glyph shaping</a>
(although some current MathML renderers do not do this).
Thus these phrases should appear as
"إذاكان" (for "if"),
"غيرذلك" (for "otherwise")
and "مع" (for "with").</p><p>This example also indicates that the bidirectional algorithm <a href="#UnicodeBiDi">[UnicodeBiDi]</a> should be
applied to individual text and token elements, rather than at a higher level as in HTML;
that is, the token elements act as paragraph segments.
Even with these considerations, the ordering of phrases within the last clause
(for "otherwise (with pi=3.141)") is problematic. The obvious markup, sandwiching
an <code>mrow</code> for "pi=3.141" between two <code>mtext</code> elements for "otherwise (with" and ")", respectively,
would yield an incorrect ordering. A correct rendering seems to require the possibility
of embedding <code>math</code> within <code>mtext</code>, which is not possible in MathML 2.0.
But even then, the desired ordering would need to be marked up as two separate <code>mtext</code> elements:
one for "otherwise", and one for "(with pi=3.141)". The Math Interest Group is currently
considering the possibilities of such embedding. The example above was marked up by
artificially placing the Arabic word for "with" <em>after</em> the "pi=3.141".</p><p>Given such issues, it is sometimes advantageous to minimize the use of
connecting phrases, with preference to simple punctuation, such as:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Moroccan</th><td rowspan="1" colspan="1"><img src="arabic-images/exprfrn.png" alt="[Image of simplified formula in Moroccan style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mrow>
<mi>f</mi>
<mo>⁡</mo>
<mrow>
<mo>(</mo>
<mi>x</mi>
<mo>)</mo>
</mrow>
</mrow>
<mo>=</mo>
<mrow>
<mo>{</mo>
<mtable>
<mtr>
<mtd>
<mrow>
<munderover>
<mo movablelimits="false">∑</mo>
<mrow>
<mi>i</mi>
<mo>=</mo>
<mn>1</mn>
</mrow>
<mi>s</mi>
</munderover>
<mo>⁡</mo>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext>; </mtext>
<mi>x</mi>
<mo>&lt;</mo>
<mn>0</mn>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<msubsup>
<mo>∫</mo>
<mn>1</mn>
<mi>s</mi>
</msubsup>
<mo>⁡</mo>
<mrow>
<msup>
<mi>x</mi>
<mi>i</mi>
</msup>
<mo>⁢</mo>
<mi>d</mi>
<mo>⁡</mo>
<mi>x</mi>
</mrow>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext>; </mtext>
<mi>x</mi>
<mo>∈</mo>
<mi mathvariant="normal">E</mi>
</mrow>
</mtd>
</mtr>
<mtr>
<mtd>
<mrow>
<mi>tg</mi>
<mo>⁡</mo>
<mi>π</mi>
</mrow>
</mtd>
<mtd>
<mrow>
<mtext>; </mtext>
<mrow>
<mo>(</mo>
<mi>π</mi>
<mo>≃</mo>
<mn>3,141</mn>
<mo>)</mo>
</mrow>
</mrow>
</mtd>
</mtr>
</mtable>
</mrow>
</mrow>
</math></pre>
</td></tr></table></div><div class="div2">
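<p>To make the ordering problem discussed in this section concrete, the sketch below shows the "obvious" markup for the last clause, which we expect to render incorrectly. It assumes, as argued above, that the bidirectional algorithm is applied within each token element, while the children of the enclosing <code>mrow</code> are laid out from left to right as in MathML 2. Under those assumptions the formula is placed to the right of the whole Arabic phrase, rather than immediately to the left of "مع" where an Arabic reader would expect it.</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<!-- "otherwise (with " as a single RTL text token -->
<mtext>غيرذلك (مع </mtext>
<!-- the embedded formula, laid out left to right -->
<mrow>
<mi>π</mi>
<mo>≃</mo>
<mn>3,141</mn>
</mrow>
<!-- the closing parenthesis as a second text token -->
<mtext>)</mtext>
</mrow>
</math></pre>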
<h3><a id="Maghreb" name="Maghreb"></a>3.2 Arabic Notation; Maghreb Style</h3><p>The Maghreb style of notation is widely used in North Africa:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Maghreb</th><td rowspan="1" colspan="1"><img src="arabic-images/exprare.png" alt="[Image of formula in Maghreb style]"></td><td rowspan="1" colspan="1">Not yet attempted</td></tr></table><p>Here, the most striking difference is that the overall mathematical
layout is the mirror image of the preceding examples; that is,
the mathematical directionality is RTL. Further, some symbols
(e.g., ∑, &lt;, ∈) are mirrored as well.
Thus, we need a means of specifying the mathematical directionality,
and of ensuring that the appropriate symbols are available in Unicode and are marked as mirrored.
</p><p>The remaining differences are due to a more pronounced use of Arabic symbols:
DAL <img src="arabic-images/dal.png" alt="DAL"> (as the initial
of <img src="arabic-images/dalt.png" alt="DALT"> = "function" in Arabic);
the Arabic letter BEH <img src="arabic-images/beh.png" alt="BEH">,
and the letters of the function name abbreviation <img src="arabic-images/tah.png" alt="TAH">
for tangent (without dots). Again, these differences fall into the category of localization,
but reinforce the idea that the Unicode bidirectional algorithm, along with glyph shaping, should apply individually
to token elements.</p></div><div class="div2">
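<p>One conceivable way to state an RTL mathematical directionality is a <code>dir</code> attribute on the <code>math</code> element, as written in the Arabic factorial example of section 3.4. This attribute is not part of MathML 2, so the fragment below is only a sketch of the intent and not markup that current renderers can be relied upon to honor. The variable name "س" is our illustrative choice. The tokens are given in logical order; under RTL layout the first token would be expected at the right edge and the less-than sign would be drawn with its mirrored glyph.</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
<mrow>
<!-- first token in logical order; expected at the right edge under RTL layout -->
<mi>س</mi>
<!-- less-than sign; expected to be drawn with its mirrored glyph -->
<mo>&lt;</mo>
<mn>0</mn>
</mrow>
</math></pre>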
<h3><a id="Machrek" name="Machrek"></a>3.3 Arabic Notation; Machrek Style</h3><p>As the final Arabic example, we consider the Machrek style generally used in the Middle East.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Machrek</th><td rowspan="1" colspan="1"><img src="arabic-images/exprarw.png" alt="[Image of formula in Machrek style]"></td><td rowspan="1" colspan="1">Not yet attempted</td></tr></table><p>Most differences between the Machrek and Maghreb styles are essentially due to localization:
a specifically Arabic symbol <img src="arabic-images/mg.png" alt="MG"> is used for the summation
(initial of <img src="arabic-images/mgmue.png" alt="MGMUE"> = "sum" in Arabic);
a different letter <img src="arabic-images/teh.png" alt="TEH"> is used for the function
(initial of <img src="arabic-images/tabet.png" alt="TABET">, also "function" in Arabic);
the letters of the elementary function name abbreviation
<img src="arabic-images/dah.png" alt="DAH"> are written with dots;
and the number format uses Arabic-Indic digits with a comma as the decimal separator (a different
character from the Arabic comma used in text).</p><p>Note that the symbol used for summation should probably be a mathematical symbol
with a codepoint distinct from the Arabic letter, as the European summation symbol is
distinct from the Greek Sigma. This point also applies to the Arabic product.</p></div><div class="div2">
<h3><a id="N106E1" name="N106E1"></a>3.4 Additional Arabic Notations</h3><p>Two additional unique notations involve combinatorics, namely the factorial and
binomial coefficients:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"><img src="arabic-images/drbcen.png" alt="[Image of 12 factorial in english style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow><mn>12</mn><mo>!</mo></mrow>
</math></pre>
</td></tr><tr><th rowspan="1" colspan="1">Arabic</th><td rowspan="1" colspan="1"><img src="arabic-images/drbc.png" alt="[Image of 12 factorial in Arabic style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
<menclose notation="madruwb">
<mn>12</mn>
</menclose>
</math></pre>
</td></tr></table><p>The argument to the factorial must be wrapped in a form similar to the
character LAM (ل), which must
be stretched in both directions to accommodate it. A new <code>menclose</code> notation,
<code>madruwb</code>, is proposed for this case.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"> <img src="arabic-images/arrangaen.png" alt="[Image of binomial(5,12) in English style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mo>(</mo><mtable><mtr><mtd><mn>5</mn></mtd></mtr><mtr><mtd><mn>12</mn></mtd></mtr></mtable><mo>)</mo>
</mrow>
</math></pre>
</td></tr><tr><th rowspan="1" colspan="1">Arabic</th><td rowspan="1" colspan="1"> <img src="arabic-images/arranga.png" alt="[image of binomial(5,12) in Arabic style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
<mmultiscripts><mo>ل</mo>
<mn>12</mn><none/>
<mprescripts/>
<none/><mn>5</mn>
</mmultiscripts>
</math></pre>
</td></tr></table><p>Finally, although stacked fractions are rendered the same way in both European and Arabic notation,
bevelled fractions in RTL Arabic will appear, as one would expect, with the terms in RTL order,
i.e. A divided by B would appear as "B/A".
In some locales, the preference is for the slash to be mirrored as well, as "B\A". For these cases,
we suggest that authors employ explicit markup using the REVERSE SOLIDUS (\), such as
<code><mrow><mi>A</mi><mo>\</mo><mi>B</mi></mrow></code>.</p></div><div class="div2">
<h3><a id="Persian" name="Persian"></a>3.5 Persian</h3><p>Persian is generally written in the Arabic script (RTL), but with
LTR mathematical directionality, similar to the Moroccan style.
We are aware of only one mathematical notation unique to Persian writing, the notation used
for limits:</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">English</th><td rowspan="1" colspan="1"><img src="arabic-images/limw.png" alt="[Image of limit formula in English style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mrow>
<munder>
<mo movablelimits="false">lim</mo>
<mrow>
<mi>x</mi>
<mo>→</mo>
<mfrac bevelled="true">
<mi>π</mi>
<mn>10</mn>
</mfrac>
</mrow>
</munder>
<mo>⁡</mo>
<mrow>
<mi>sin</mi>
<mo>⁡</mo>
<mi>x</mi>
</mrow>
</mrow>
<mo>=</mo>
<mrow>
<mfrac>
<mn>1</mn>
<mn>4</mn>
</mfrac>
<mo>⁢</mo>
<mrow>
<mo>(</mo>
<msqrt>
<mn>5</mn>
</msqrt>
<mo>-</mo>
<mn>1</mn>
<mo>)</mo>
</mrow>
</mrow>
</mrow>
</math></pre>
</td></tr><tr><th rowspan="1" colspan="1">Persian</th><td rowspan="1" colspan="1"> <img src="arabic-images/limf.png" alt="[Image of limit formula in Persian style]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow>
<mrow>
<munder>
<mo movablelimits="false">حد</mo>
<mrow>
<mi>x</mi>
<mo>→</mo>
<mfrac bevelled="true">
<mi>π</mi>
<mn>۱۰</mn>
</mfrac>
</mrow>
</munder>
<mo>⁡</mo>
<mrow>
<mi>sin</mi>
<mo>⁡</mo>
<mi>x</mi>
</mrow>
</mrow>
<mo>=</mo>
<mrow>
<mfrac>
<mn>۱</mn>
<mn>۴</mn>
</mfrac>
<mo>⁢</mo>
<mrow>
<mo>(</mo>
<msqrt>
<mn>۵</mn>
</msqrt>
<mo>-</mo>
<mn>۱</mn>
<mo>)</mo>
</mrow>
</mrow>
</mrow>
</math></pre>
</td></tr></table><p>While the overall notation is similar to the Moroccan model (LTR), it uses the
Eastern Arabic-Indic digits. The word "حد" (for "limit") is
used; this word should not only be affected by <a href="#GlyphShaping">glyph shaping</a>,
but should be stretched horizontally to match the length of the underscript.</p></div></div><div class="div1">
<h2><a id="Proposals" name="Proposals"></a>4 Proposals and Clarifications</h2><div class="div2">
<h3><a id="BiDiProposal" name="BiDiProposal"></a>4.1 Clarification of bidirectional Algorithm for MathML</h3><p>The following summarizes how directionality should be applied to MathML
and, in particular, describes how the bidirectional algorithm should be applied
(it falls into class HL4; see <a href="http://www.unicode.org/reports/tr9/#HL4">Higher Level
Protocols: HL4</a> in <a href="#UnicodeBiDi">[UnicodeBiDi]</a>, section 4.3).</p><ul><li><p>The overall <em>mathematical directionality</em> should be determined by
a (new) <code>dir</code> attribute on the outermost <code>math</code> element
which takes one of the values <code>ltr</code> or <code>rtl</code>;
the default is <code>ltr</code>.
If this attribute is <code>rtl</code>, the layout of all Layout, Script, Limit,
Table and Matrix schemata should proceed from right to left. This includes
such effects as the surd of an <code>mroot</code> starting from the right.
When the mathematical directionality is <code>ltr</code>, the layout should conform
to the current MathML specification.</p></li><li><p>The text content of each Token element should be treated as a separate
directional segment and the bidirectional algorithm should be applied to each independently.
The initial directional context for each Token element is determined
by the mathematical directionality. This latter property should ensure that
individual mirrored symbols are treated correctly.</p></li></ul><p>As an example, consider the MathML fragment:</p><p>
<mn>1</mn>
<mo>+</mo>
<mi>
<img src="arabic-images/behp.png" alt="BEHP">
</mi>
<mo>-</mo>
<mn>2</mn>
</p><p>Some browsers misapply the bidirectional algorithm to the expression as a whole, as in HTML.
Applying the HTML algorithm would set the first two items LTR, but then switch directions upon
encountering the letter <img src="arabic-images/behp.png" alt="BEHP">;
thus the last three items are reversed.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Style</th><th rowspan="1" colspan="1">Image</th><th rowspan="1" colspan="1">MathML</th></tr></thead><tr><th rowspan="1" colspan="1">Right</th><td rowspan="1" colspan="1"> <img src="arabic-images/direction1.png" alt="[Image of expression rendered correctly]"></td><td rowspan="1" colspan="1">
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mn>1</mn><mo>+</mo><mi>ب</mi><mo>-</mo><mn>2</mn>
</math></pre>
</td></tr><tr><th rowspan="1" colspan="1">Wrong</th><td rowspan="1" colspan="1"><img src="arabic-images/direction2.png" alt="[Image of expression rendered incorrectly]"></td><td rowspan="1" colspan="1"></td></tr></table></div><div class="div2">
<h3><a id="GlyphShapingProposal" name="GlyphShapingProposal"></a>4.2 Glyph Shaping</h3><p>Glyph shaping rules apply not only to the textual content of an <code>mtext</code>,
but also to Arabic character sequences used as mathematical symbols (particularly in
<code>mi</code> and <code>mo</code>). This shaping is the visual cue that
distinguishes a single symbol from a sequence of symbols, perhaps representing a product.
This is analogous to the use of roman font in European mathematics, to distinguish for example
<code><mi>sin</mi></code>
from <code><mi>s</mi><mi>i</mi><mi>n</mi></code>.</p>
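<p>As a rough illustration of the same distinction in Arabic, the following sketch reuses the
word "حد" that appears in the Persian example of this Note. The pairing of the two fragments is an
assumption made for illustration and is not taken from an existing document. In the first fragment both
letters sit in a single token, so shaping should join them into one symbol; in the second each letter is
its own <code>mi</code>, so each should keep its isolated shape and the pair would read as a product.</p>
<pre><!-- One token: the letters are shaped together and read as a single symbol -->
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mo>حد</mo>
</math>

<!-- Two tokens: each letter is shaped on its own and the pair reads as a product -->
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mrow><mi>ح</mi><mi>د</mi></mrow>
</math></pre>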
<p>Thus, implementors should apply shaping to each character sequence within the text content of
any token elements.</p><p>Certain Arabic characters (ا د ذ ر ز و)
have no unique initial or medial shapes. Their use in the middle of a mathematical symbol
would tend to make the symbol look like the product of two shorter symbols.
Thus, to avoid confusion, authors should avoid using these characters
in the middle of mathematical symbols.</p></div><div class="div2">
<h3><a id="N10951" name="N10951"></a>4.3 Additional Mathvariants</h3><p>For single character tokens, additional styles, besides isolated, are used
to enlarge the set of available distinct symbols, just as the bold and Fraktur styles are
used in European mathematics. The styles used in Arabic mathematics
are "tailed", "looped" and "stretched", in addition to the "initial" style applied to
the individual character. Furthermore, the "double-struck" style is commonly used.
The following table shows the character JEEM in the various styles, in both
dotted and undotted forms (see below):</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1"></th><th rowspan="1" colspan="1">isolated</th><th rowspan="1" colspan="1">initial</th><th rowspan="1" colspan="1">tailed</th><th rowspan="1" colspan="1">looped</th><th rowspan="1" colspan="1">stretched</th><th rowspan="1" colspan="1">double-struck</th></tr></thead><tbody><tr><th rowspan="1" colspan="1">dotted</th><td rowspan="1" colspan="1"><img src="arabic-images/jeemf.png" alt="Dotted JEEM isolated form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeemi.png" alt="Dotted JEEM initial form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeemt.png" alt="Dotted JEEM tailed form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeeml.png" alt="Dotted JEEM looped form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeems.png" alt="Dotted JEEM stretched form"></td><td rowspan="1" colspan="1"><img src="arabic-images/jeemd.png" alt="Dotted JEEM double-struck"></td></tr><tr><th rowspan="1" colspan="1">undotted</th><td rowspan="1" colspan="1"><img src="arabic-images/hahf.png" alt="Undotted JEEM ISOLATED"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahi.png" alt="Undotted JEEM initial form"></td><td rowspan="1" colspan="1"><img src="arabic-images/haht.png" alt="Undotted JEEM tailed form"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahl.png" alt="Undotted JEEM looped form"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahs.png" alt="Undotted JEEM stretched form"></td><td rowspan="1" colspan="1"><img src="arabic-images/hahd.png" alt="Undotted JEEM double-struck"></td></tr></tbody></table><p>It is proposed to consider the <code>mathvariant</code> "normal",
when applied to Arabic, to mean the result of glyph shaping, and in particular,
the "isolated" style for single character tokens. It is also proposed to
add the following values allowed for <code>mathvariant</code>:
"initial", "tailed", "looped" and "stretched".</p><p>It is not expected to be meaningful to apply the "bold", "italic", "fraktur", "script",
"sans-serif" or "monospace" mathvariants (or combinations) to Arabic (although there is some
sentiment for allowing "bold" and "italic"). Nor is it meaningful to apply any mathvariant
other than "normal" to multicharacter tokens, which should have glyph shaping applied.
The current MathML specification points out that the only combinations of characters and
mathvariant that have an unambiguous interpretation are those that correspond to the
SMP Math Alphanumeric Symbols. An analogous argument can be made for Arabic and the proposed
Arabic Math Alphabetic Symbols <a href="#UnicodeProposition">[UnicodeProposition]</a> (not yet part of Unicode).</p><p>Both dotted and undotted alphabetic symbols are encountered in this Note.
The choice of which type to use is a matter of local preference; however, documents use
either dotted or undotted symbols, but not a mixture, and in particular, the dots are not used
to indicate semantic distinctions. Thus, it is not felt that dotting is a good
candidate for a mathvariant value, but rather should be accommodated by the choice of
symbol fonts available to the user's browser, or possibly through CSS.</p></div><div class="div2">
<h3><a id="MirroringProposal" name="MirroringProposal"></a>4.4 Mirroring</h3><p>The MathML attributes <code>lspace</code>, <code>rspace</code>,
<code>lquote</code> and <code>rquote</code> should be interpreted as opening and closing,
rather than strictly left and right. This historical anomaly is analogous to
the standard Unicode names for the parentheses:
The <code>LEFT PARENTHESIS</code> and <code>RIGHT PARENTHESIS</code>
are marked as <code>mirrored</code> and are taken to represent
<code>OPENING PARENTHESIS</code> and <code>CLOSING PARENTHESIS</code>, respectively.
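</p><p>As a sketch of this reading, the fragment below assumes the <code>dir</code> attribute proposed in this Note.
The identifiers and the lengths are illustrative values only. Under the opening and closing interpretation,
<code>lspace</code> would presumably apply to the side of the operator that is met first in reading order
(its right side in an RTL layout) and <code>rspace</code> to the side met last.</p>
<pre><math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
<mrow>
<mi>س</mi>
<!-- lspace: space before the operator in reading order (to its right when dir="rtl") -->
<!-- rspace: space after the operator in reading order (to its left when dir="rtl") -->
<mo lspace="0em" rspace="0.5em">،</mo>
<mi>ص</mi>
</mrow>
</math></pre><p>With the default LTR directionality, the same attributes keep their current left and right meaning.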
</p><p>The Math Working Group, and other interested parties, should work to ensure
that the necessary codepoints for Arabic mathematics are not only available, but
appropriately marked for mirroring.
It is also to be hoped that suitable fonts will become available, and will
respect the calligraphic qualities regarding mirroring.</p></div><div class="div2">
<h3><a id="N10A2B" name="N10A2B"></a>4.5 Horizontal Stretchiness</h3><p>In Arabic mathematics, the sum, product and limit are commonly stretched horizontally
to the same width as the limits (over or under) that apply to them. Such stretching
does occasionally appear, but is rare, in European mathematics.
In <a href="http://www.w3.org/TR/MathML2/chapter3.html#id.3.2.5.8.3">Horizontal
Stretching Rules of MathML</a>
(<a href="#MathML22e">[MathML22e]</a> section 3.2.5.8.3), the standard allows for such horizontal stretching
of some symbols at the discretion of the rendering agent. In this Note, we
simply encourage developers to implement this feature for the appropriate Arabic symbols.</p></div><div class="div2">
<h3><a id="N10A3B" name="N10A3B"></a>4.6 Additional Constructs</h3><p>The Arabic notation for factorial is a sort of enclosure.
We propose to add an additional allowed value <code>madruwb</code> (transliteration
of the Arabic مضروب for factorial) for
the <code>notation</code> attribute of <code>menclose</code>.</p></div></div><div class="div1">
<h2><a id="N10A49" name="N10A49"></a>5 Conclusions and Future Work</h2><p>This Note describes the notational issues encountered in presenting
mathematics within Arabic and other RTL languages, in particular focusing on
how these notations differ from the model described by MathML2. To the best of
our knowledge, the unique notations described here cover all known differences.</p><p>This Note also proposes enhancements to be considered in a future revision
of the MathML specification. These enhancements would allow Presentation MathML to be
used to conveniently incorporate mathematics into Arabic documents in a style
conventionally used by Arabic speaking authors.</p><p>The successful use of mathematics in Arabic texts will also require,
in addition to the extensions proposed here, that the appropriate codepoints
are included in Unicode, and that those codepoints are correctly marked as
mirrored. Some proposals (<a href="#UnicodeProposition">[UnicodeProposition]</a>, <a href="#ArabicMathUnicode">[ArabicMathUnicode]</a>) have already been made.</p></div><div class="div1">
<h2><a id="N10A58" name="N10A58"></a>6 Acknowledgments</h2><p>This document has been produced by the members of the Math Interest
Group. The chairs of this Interest Group are David Carlisle (invited
expert) and Robert Miner (Design Science, Inc.). Other members of the
Working Group are (at the time of writing): Isam Ayoubi (invited
expert), Laurent Bernardin (Waterloo Maple Inc.), Stephane Dalmas
(Institut National de Recherche en Informatique et en Automatique),
Stan Devitt (invited expert), Max Froumentin (W3C), Patrick D F Ion
(invited expert), Azzeddine LAZREK (invited expert), Paul Libbrecht
(German Research Center for Artificial Intelligence), Manolis Mavrikis
(University of Edinburgh), Bruce Miller (National Institute of
Standards and Technology), Luca Padovani (University of Bologna), Neil
Soiffer (Design Science, Inc.), Stephen Watt (Waterloo Maple Inc.)</p><p>The editors would also like to thank Richard Ishida for initiating
the contacts that led to the writing of this Note, and for many
constructive comments on a draft of it.</p></div><div class="div1">
<h2><a id="N10A5F" name="N10A5F"></a>7 Production Notes</h2><p>The images of Arabic and Persian expressions were composed using the RyDArab
system <a href="#RyDArab">[RyDArab]</a>, and the FarsiTeX system <a href="#FarsiTeX">[FarsiTeX]</a>, respectively.
</p></div></div><div class="back"><div class="div1">
<h2><a id="Localization" name="Localization"></a>A Localization Issues</h2><p>This section discusses some of the localization issues encountered in this Note.
Authors of MathML may want to consider these issues when composing documents.
Additionally, it may be worth parameterizing converters from Content MathML
to Presentation MathML so that they take into account the target language, locale,
and conceivably the scientific discipline involved as well.</p><div class="div2">
<h3><a id="NumberSystem2" name="NumberSystem2"></a>A.1 Number Systems</h3><p>Assuming that the text content of <code>cn</code> elements can be unambiguously
interpreted as a number, the locale selection must be able to choose not only the set of
digits to use, but also which decimal and thousands separators to use.
Generally, the comma is used as a decimal separator with both the European and Arabic-Indic digits,
but note that such a comma is distinct from the
Arabic comma "،"
used to separate items in a list.</p></div><div class="div2">
<h3><a id="SymbolsChoice" name="SymbolsChoice"></a>A.2 Symbols Choice</h3><p>Two kinds of symbols are in use, literal and mirrored, with the choice depending
on the locale:</p>
<ul><li><p>the sum operator is presented in two ways:
<img src="arabic-images/mgmuec.png" alt="[Image of literal summation]"> and
<img src="arabic-images/mgmues.png" alt="[Image of symbolic summation]">;</p></li><li><p>the product operator is presented in two ways:
<img src="arabic-images/gdaac.png" alt="[Image of literal product]"> and
<img src="arabic-images/gdaas.png" alt="[Image of symbolic product]">;</p></li><li><p>the limit operator is presented in two ways:
<img src="arabic-images/nhaytc.png" alt="[Image of literal limit]"> and
<img src="arabic-images/nhaytf.png" alt="[Image of limit in Persian style]">.
This last notation is used in Persian.</p></li><li><p>the factorial operator is presented in two ways:
<img src="arabic-images/drbc.png" alt="[Image of literal factorial]"> and
!12.</p></li></ul>
<p>These stretched operators can be compared to the
mathematical stretchy accents,
except that the roles are reversed. A similar comparison can be made
with the square root construction.
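</p><p>For the factorial, the two presentations listed above could be marked up roughly as follows. This is
only a sketch: it assumes the <code>dir</code> attribute and the <code>madruwb</code> notation proposed in this
Note, neither of which is part of MathML2.</p>
<pre><!-- Literal form: the argument is enclosed by the stretched LAM-like shape -->
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
<menclose notation="madruwb"><mn>12</mn></menclose>
</math>

<!-- Symbolic form: the exclamation mark follows the number in RTL order, appearing as "!12" -->
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block" dir="rtl">
<mrow><mn>12</mn><mo>!</mo></mrow>
</math></pre><p>A converter from Content MathML to Presentation MathML could choose between
the two forms according to the target locale.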
</p></div></div><div class="div1">
<h2><a id="Implementation" name="Implementation"></a>B Implementation Issues</h2><p>This section describes issues that an implementor of an Arabic-enhanced
MathML specification would encounter, and possible strategies for dealing with them.</p><div class="div2">
<h3><a id="CharactersEncoding" name="CharactersEncoding"></a>B.1 Character Encoding</h3><p>Even though some local symbols, used in mathematics written in an Arabic
notation, can be obtained via mirroring of already existing symbols,
there are many symbols found in Arabic mathematical handbooks that are not
yet part of the Unicode Standard and cannot be obtained through a simple mirroring
<a href="#ArabicMathUnicode">[ArabicMathUnicode]</a>.
Some of these special characters have been submitted for inclusion into the Unicode
Standard <a href="#UnicodeProposition">[UnicodeProposition]</a>.</p></div><div class="div2">
<h3><a id="MathematicalFonts" name="MathematicalFonts"></a>B.2 Mathematical Fonts</h3><p>Some font families are designed to meet the requirements of typesetting
mathematical documents in an Arabic notation.
The RamzArab Arabic mathematical font <a href="#RamzArab">[RamzArab]</a> aims to provide a complete and
homogeneous Arabic font family, in the OpenType format, respecting Arabic calligraphy rules.</p><p>Although letters in "tailed" and "stretched" forms are semantically distinct
from the "initial" forms, they can be simulated by connecting with a particular final
form of HEH and the final form of ALEF, respectively, and applying glyph shaping. This technique
may be useful when an insufficient variety of fonts is available.</p><p>Implementors are encouraged to make it feasible for users to
choose dotted or undotted mathematical symbol fonts easily in accord with local tastes.</p></div><div class="div2">
<h3><a id="N10AEE" name="N10AEE"></a>B.3 Symbol Stretching</h3><p>In the cases where operators need to be stretched to match
the width of sub- or superscripts, the lengthening should be done using
curves rather than straight lines.
This curve lengthening is called curved <em>kashida</em>. It is one of the most
important aspects of the Arabic calligraphy.</p><table cellpadding="2" border="2"><thead><tr><th rowspan="1" colspan="1">Good</th><th rowspan="1" colspan="1">Bad</th></tr></thead><tr><td rowspan="1" colspan="1"><img src="arabic-images/mgmuec.png" alt="[Image of properly stretched summation]"></td><td rowspan="1" colspan="1"><img src="arabic-images/mgmuel.png" alt="[Image of poorly stretched summation]"></td></tr><tr><td rowspan="1" colspan="1"><img src="arabic-images/gdaac.png" alt="[Image of properly stretched product]"></td><td rowspan="1" colspan="1"><img src="arabic-images/gdaal.png" alt="[Image of poorly stretched product]"></td></tr><tr><td rowspan="1" colspan="1"><img src="arabic-images/nhaytc.png" alt="[Image of properly stretched limit]"></td><td rowspan="1" colspan="1"><img src="arabic-images/nhaytl.png" alt="[Image of poorly stretched limit]"></td></tr><tr><td rowspan="1" colspan="1"><img src="arabic-images/drbc.png" alt="[Image of properly stretched factorial]"></td><td rowspan="1" colspan="1"><img src="arabic-images/drbl.png" alt="[Image of poorly stretched factorial]"></td></tr></table><p>These curvilinear extensible symbols were generated by the CurExt application
for the system T<sub>E</sub>X with a PostScript font generator <a href="#RamzArab">[RamzArab]</a>.</p><p>Although horizontal stretching of sum and product operators
is rare in European mathematics:
<img src="arabic-images/mgmuegl.png" alt="[Image of stretched summation]"> and
<img src="arabic-images/gdaagl.png" alt="[Image of stretched product]">,
this stretching is more common, and more desired, in Arabic mathematics:
<img src="arabic-images/mgmuega.png" alt="[Image of stretched mirrored summation]"> and
<img src="arabic-images/gdaaga.png" alt="[Image of stretched mirrored product]">.
</p><p>[Note: the broken corner in these symbols
is a known flaw to be repaired in a future version of RyDArab
<a href="#RyDArab">[RyDArab]</a>].</p></div><div class="div2">
<h3><a id="SoftwareTools" name="SoftwareTools"></a>B.4 Software Tools</h3><p>The Dadzilla system, an adapted version of Mozilla, allows using MathML for
Arabic mathematical notation <a href="#Dadzilla">[Dadzilla]</a>.</p></div></div><div class="div1">
<h2><a id="N10B92" name="N10B92"></a>C Bibliography</h2><dl><dt class="label"><a id="MathML22e" name="MathML22e"></a>MathML22e</dt><dd>David Carlisle, Patrick Ion, Robert Miner, Nico Poppelier,
<em>Mathematical Markup Language (MathML) Version 2.0 (2nd Edition)</em>
World Wide Web Consortium Working Draft 19. December 2002
(<a href="http://www.w3.org/TR/MathML2/">http://www.w3.org/TR/MathML2/</a>)
</dd><dt class="label"><a id="UnicodeBiDiIntro" name="UnicodeBiDiIntro"></a>UnicodeBiDiIntro</dt><dd>
Richard Ishida,
<em>What you need to know about the bidi algorithm and inline markup</em>
<a href="http://www.w3.org/International/articles/inline-bidi-markup/">
http://www.w3.org/International/articles/inline-bidi-markup/</a>
</dd><dt class="label"><a id="UnicodeBiDi" name="UnicodeBiDi"></a>UnicodeBiDi</dt><dd>
<a href="http://www.unicode.org/reports/tr9/">
http://www.unicode.org/reports/tr9/</a>
</dd><dt class="label"><a id="UnicodeProposition" name="UnicodeProposition"></a>UnicodeProposition</dt><dd>
<a href="http://www.ucam.ac.ma/fssm/rydarab/english/unicode.htm">
http://www.ucam.ac.ma/fssm/rydarab/english/unicode.htm</a>
</dd><dt class="label"><a id="ArabicMathUnicode" name="ArabicMathUnicode"></a>ArabicMathUnicode</dt><dd>Mohamed Jamal Eddine Benatia, Azzeddine Lazrek, Khalid Sami,
<em>Arabic mathematical symbols in Unicode</em>, IUC 27, Berlin, Germany, April 6-8, 2005.
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/unicodem.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/unicodem.pdf</a>)
</dd><dt class="label"><a id="RyDArab" name="RyDArab"></a>RyDArab</dt><dd>Azzeddine Lazrek,
<em>RyDArab-Typesetting Arabic mathematical expressions</em>,
TUGboat, Volume 25 (2004), No. 2, 2004.
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugryd.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugryd.pdf</a>)
</dd><dt class="label"><a id="RamzArab" name="RamzArab"></a>RamzArab</dt><dd>Mostafa Banouni, Mohamed Elyaakoubi, Azzeddine Lazrek,
<em>Dynamic Arabic mathematical fonts</em>, LNCS, Volume 3130, pp. 149-157, 2004.
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugfontm.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugfontm.pdf</a>)
</dd><dt class="label"><a id="FarsiTeX" name="FarsiTeX"></a>FarsiTeX</dt><dd>Behdad Esfahbod, Roozbeh Pournader,
<em>FarsiTeX and the Iranian TeX community</em>.
(<a href="http://www.tug.org/TUGboat/Articles/tb23-1/farsitex.pdf">http://www.tug.org/TUGboat/Articles/tb23-1/farsitex.pdf</a>)
</dd><dt class="label"><a id="Dadzilla" name="Dadzilla"></a>Dadzilla</dt><dd>Mustapha Eddahibi, Azzeddine Lazrek, Khalid Sami,
<em>Arabic mathematical e-documents</em>, LNCS, Volume 3130, pp. 158-168, 2004.
(<a href="http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugmathm.pdf">http://www.ucam.ac.ma/fssm/rydarab/doc/communic/tugmathm.pdf</a>)
</dd></dl></div></div></body></html>