[texinfo-pretest] texinfo 4.7.90 pretest available

Karl Berry karl at freefriends.org
Sun Dec 5 18:48:01 EST 2004


How does this look for the specification / documentation on the two
points we've discussed?

Thanks,
k

--- texinfo.txi.~1.119.~	2004-11-30 05:29:20.000000000 -0800
+++ texinfo.txi	2004-12-05 15:45:37.000000000 -0800
@@ -16354,6 +16354,10 @@
 @enumerate
 @item
-The standard ASCII letters (a-z and A-z), and numbers (0-9) are not
-modified.  All other characters are changed as specified below.
+The standard ASCII letters (a-z and A-Z) are not modified.  All other
+characters are changed as specified below.
+
+@item
+The standard ASCII numbers (0-9) are not modified except when a number
+is the first character of the node name.  In that case, see below.
 
 @item
@@ -16375,4 +16379,11 @@
 This includes @samp{_}, which is mapped to @samp{_005f}.
 
+@item
+If the node name does not begin with a letter, the literal string
+@samp{g_t} is prefixed to the result.  (Due to the rules above, that
+string can never occur otherwise; it is an arbitrary choice, standing
+for ``GNU Texinfo''.)  This is necessary because XHTML requires that
+identifiers begin with a letter.
+
 @end enumerate
 
@@ -16499,8 +16510,9 @@
 @cindex Expansion of 8-bit characters in HTML cross-references
 
-Characters other than plain 7-bit ASCII are transformed into the
-corresponding Unicode code point(s), in Normalization Form C, which
+Usually, characters other than plain 7-bit ASCII are transformed into
+the corresponding Unicode code point(s) in Normalization Form C, which
 uses precomposed characters where available.  (This is the
-normalization form recommended by the W3C and other bodies.)
+normalization form recommended by the W3C and other bodies.)  This
+holds when that code point is 0xffff or less, as it almost always is.
 
 These will then be further transformed by the rules above into the
@@ -16519,4 +16531,11 @@
 therefore expands to @samp{B_0306} (B with combining breve).
 
+When the Unicode code point is above 0xffff, the transformation is
+@samp{__@var{xxxxxx}}, with two leading underscores followed by six
+hex digits.  Since Unicode has declared that their highest code point
+is 0x10ffff, this is sufficient.  (We felt it was better to define
+this extra escape than to always use six hex digits, since the first
+two would nearly always be zeros.)
+
 For the definition of Unicode Normalization Form C, see Unicode report
 UAX#15, @uref{http://www.unicode.org/reports/tr15/}.  Many related
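
For anyone who wants to sanity-check the wording, here is a rough Python
sketch of the rules as the patch states them.  It is only an illustration,
not the makeinfo or texi2html code.  The function names (escape_char,
node_to_id) are made up for this example.  Two things are assumptions on my
part: the six hex digits are written in lowercase, like the four-digit
@samp{_005f} form, and "begins with a letter" is checked on the transformed
result.  The whitespace rules are outside the hunks above and are not
modeled, so the sketch should only be fed names without spaces.

```python
import unicodedata

ASCII_LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
ASCII_DIGITS = "0123456789"


def escape_char(ch):
    """Escape one character of an NFC-normalized node name."""
    # ASCII letters and digits pass through unmodified.
    if ch in ASCII_LETTERS or ch in ASCII_DIGITS:
        return ch
    code_point = ord(ch)
    # Above 0xffff: two underscores and six hex digits
    # (lowercase is an assumption, by analogy with _005f).
    if code_point > 0xFFFF:
        return "__%06x" % code_point
    # Everything else, including "_" itself: one underscore, four hex digits.
    return "_%04x" % code_point


def node_to_id(name):
    """Hypothetical helper: map a node name (no whitespace) to an identifier."""
    # Normalization Form C uses precomposed characters where available.
    normalized = unicodedata.normalize("NFC", name)
    result = "".join(escape_char(ch) for ch in normalized)
    # XHTML identifiers must begin with a letter, so prefix g_t otherwise.
    # (Checking the transformed result is this sketch's interpretation.)
    if not result or result[0] not in ASCII_LETTERS:
        result = "g_t" + result
    return result


print(node_to_id("a_b"))          # a_005fb
print(node_to_id("2nd"))          # g_t2nd
print(node_to_id("B\u0306"))      # B_0306
print(node_to_id("a\U0001D11E"))  # a__01d11e
```

Since every literal underscore in a node name becomes @samp{_005f}, an
underscore followed by @samp{t} can never come out of escape_char, which is
why the @samp{g_t} prefix is unambiguous.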

