This module provides functions that access information about Unicode code points. The information is obtained from data modules generated from the Unicode Character Database, or derived by rules given in the Unicode Specification. This module and its submodules were copied from the English Wiktionary and then modified; see there for more information.

Parameters and functions


Code point


The code point must be entered as a hexadecimal value. For example, for U+00A9 © COPYRIGHT SIGN:

|A9 (hexadecimal)
|0xA9 (hexadecimal)
|0x00A9 (hexadecimal)
|0x00a9 (hexadecimal)
{{#invoke:Unicode data|lookup|name|0x00A9}} → COPYRIGHT SIGN

Incorrect or unintended results:

169 (decimal): {{#invoke:Unicode data|lookup|name|169}} → LATIN SMALL LETTER U WITH TILDE. U+00A9 © was expected, but the input is read as hexadecimal 0169 (that is, decimal 361).
U+00A9: {{#invoke:Unicode data|lookup|name|U+00A9}}. Do not use the "U+" prefix.
غ: {{#invoke:Unicode data|lookup|name|غ}}. A character cannot be entered as a code point.
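
The parsing rule above can be sketched in Python. This is only an illustration of the documented behavior, not the module's Lua code: `parse_code_point` is a hypothetical helper, and raising an error for rejected input is this sketch's own choice. Character names come from Python's `unicodedata` tables.

```python
import re
import unicodedata

# Optional "0x" prefix followed by one or more hexadecimal digits.
_HEX_CODE_POINT = re.compile(r"(?:0[xX])?[0-9A-Fa-f]+")


def parse_code_point(text: str) -> int:
    """Read a code point parameter as hexadecimal (hypothetical helper).

    The input is always treated as hexadecimal, so "169" means U+0169,
    not decimal 169.
    """
    if not _HEX_CODE_POINT.fullmatch(text):
        raise ValueError(f"not a hexadecimal code point: {text!r}")
    return int(text, 16)


# Accepted spellings, including the decimal-looking "169" pitfall.
for raw in ("A9", "0xA9", "0x00A9", "0x00a9", "169"):
    code_point = parse_code_point(raw)
    print(raw, "->", f"U+{code_point:04X}", unicodedata.name(chr(code_point)))

# Rejected spellings: a "U+" prefix and a literal character.
for raw in ("U+00A9", "غ"):
    try:
        parse_code_point(raw)
    except ValueError as error:
        print(raw, "->", error)
```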

"lookup" and "is" functions

lookup, is
Template-invokable functions that give access to the functions whose names start with lookup and is. For most of the functions, add the code point in hexadecimal as the next parameter. For is|Latin, is|rtl, and is|valid_pagename, add a character string. HTML character references in the text are decoded by the module into code points.
For example, {{#invoke:Unicode data|is|Latin|àzàhàr̃iyyā̀}} → true.
Internally, in modules, these functions are named with an underscore, for example lookup_name.
For U+00A9 ©: {{#invoke:Unicode data|lookup|name|A9}} → COPYRIGHT SIGN
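
As a rough Python sketch of the dispatch described above: the two parts of the call are joined with an underscore to find the internal function, string functions receive text with HTML character references decoded, and all other functions receive a hexadecimal code point. The `invoke` helper and both stand-in functions are assumptions made for illustration. In particular, `is_rtl` here uses Python's bidirectional classes, which may differ from the module's own rule.

```python
import html
import unicodedata


def lookup_name(code_point: int) -> str:
    # Stand-in for the module's name lookup, backed by Python's own tables.
    return unicodedata.name(chr(code_point), "")


def is_rtl(text: str) -> bool:
    # Approximation: true if any character has a right-to-left bidi class.
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


# Internal names use an underscore: lookup|name -> lookup_name.
FUNCTIONS = {"lookup_name": lookup_name, "is_rtl": is_rtl}

# Functions documented as taking a character string instead of a code point.
STRING_FUNCTIONS = {"is_Latin", "is_rtl", "is_valid_pagename"}


def invoke(prefix: str, name: str, argument: str):
    """Hypothetical dispatcher mirroring {{#invoke:Unicode data|prefix|name|argument}}."""
    internal_name = f"{prefix}_{name}"
    function = FUNCTIONS[internal_name]
    if internal_name in STRING_FUNCTIONS:
        # Decode HTML character references such as &#x634; before testing.
        return function(html.unescape(argument))
    return function(int(argument, 16))


print(invoke("lookup", "name", "A9"))
print(invoke("is", "rtl", "&#x634;"))
print(invoke("is", "rtl", "34"))
```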

Overview of functions

In this overview, enter code points as hexadecimal values, for example |0x0061 or |61, not |U+0061.

Each entry gives the topic, the function, and the parameter type (string = one or more characters; code point = hexadecimal value), followed by examples in the form: call → returned value (character). For the printable and whitespace functions, values are shown between > and <.

Unicode character name: |lookup|name, code point
  • {{#invoke:Unicode data|lookup|name|0xA9}} → COPYRIGHT SIGN (©)
  • {{#invoke:Unicode data|lookup|name|0x0007}} → <control-0007> (&#x0007;)
Scripts: |lookup|script, code point
  • {{#invoke:Unicode data|lookup|script|A061}} → Yiii
Blocks: |lookup|block, code point
  • {{#invoke:Unicode data|lookup|block|A061}} → Yi Syllables
Planes: |lookup|plane, code point
  • {{#invoke:Unicode data|lookup|plane|0xA9}} → Basic Multilingual Plane (©)
  • {{#invoke:Unicode data|lookup|plane|0x1F608}} → Supplementary Multilingual Plane (😈)
General Category: |lookup|category, code point
  • {{#invoke:Unicode data|lookup|category|0xA9}} → So (©)
  • {{#invoke:Unicode data|lookup|category|0x002B}} → Sm (+)
Controls: |lookup|control, code point
  • {{#invoke:Unicode data|lookup|control|A9}} → assigned (©)
  • {{#invoke:Unicode data|lookup|control|FFFF}} → unassigned (&#xFFFF;)
Latin script: |is|Latin, string
  • {{#invoke:Unicode data|is|Latin|abcŁíā̀}} → true
  • {{#invoke:Unicode data|is|Latin|abc文xyz}} → false
WP:Article title (WP:NCTR): |is|valid_pagename, string
  • {{#invoke:Unicode data|is|valid_pagename|Main_page}} → true
  • {{#invoke:Unicode data|is|valid_pagename|# (disambiguation)}} → false
Bidirectionality, right-to-left scripts: |is|rtl, string
  • {{#invoke:Unicode data|is|rtl|ش}} → true (ش)
  • {{#invoke:Unicode data|is|rtl|34}} → false (4)
Combining character: |is|combining, code point
  • {{#invoke:Unicode data|is|combining|0300}} → true (̀)
  • {{#invoke:Unicode data|is|combining|64}} → false (d)
Character assignment: |is|assigned, code point
  • {{#invoke:Unicode data|is|assigned|A061}} → true
  • {{#invoke:Unicode data|is|assigned|FFEF}} → false
Printable: |is|printable, code point
  • {{#invoke:Unicode data|is|printable|0061}} → >true< (>a<)
  • {{#invoke:Unicode data|is|printable|0007}} → >false< (>&#x0007;<)
  • {{#invoke:Unicode data|is|printable|FFFF}} → >false< (>&#xFFFF;<)
Whitespace character § Unicode: |is|whitespace, code point
  • {{#invoke:Unicode data|is|whitespace|0x20}} → >true< (> <)
  • {{#invoke:Unicode data|is|whitespace|0xA0}} → >true< (> <, NBSP)
  • {{#invoke:Unicode data|is|whitespace|0x64}} → >false< (>d<)
Hangul: |Hangul [application unknown]
Alias names: |aliases [application unknown]
Combining class: [application unknown]
Age: [application unknown]
get_best_script: |get_best_script [application unknown]
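
For readers who want to cross-check a few of the values above outside the wiki, here is a minimal Python sketch. It relies on Python's bundled `unicodedata` tables rather than on this module's data, so results can differ between Unicode versions. The function names only mirror the module's naming and are not its implementation. Only the two plane names shown above are included, and the fallback label for other planes is an assumption.

```python
import unicodedata

# Only the two plane names that appear in this documentation.
PLANE_NAMES = {
    0: "Basic Multilingual Plane",
    1: "Supplementary Multilingual Plane",
}


def lookup_plane(code_point: int) -> str:
    # Each plane holds 0x10000 code points, so the plane number is the
    # code point shifted right by 16 bits.
    plane = code_point >> 16
    return PLANE_NAMES.get(plane, f"plane {plane}")  # fallback label is made up


def lookup_category(code_point: int) -> str:
    # Two-letter General Category from Python's tables.
    return unicodedata.category(chr(code_point))


def is_combining(code_point: int) -> bool:
    # Approximation: a non-zero canonical combining class.
    return unicodedata.combining(chr(code_point)) != 0


print(lookup_plane(0xA9), "|", lookup_plane(0x1F608))
print(lookup_category(0xA9), lookup_category(0x002B))
print(is_combining(0x0300), is_combining(0x64))
```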

Data modules


The data used by functions in this module is found in submodules. Some are generated by AWK scripts shown at User:Kephir/Unicode on English Wiktionary, others by Lua scripts on the /make subpages of the submodules.

The name data modules (Module:Unicode data/names/xxx) were compiled from UnicodeData.txt. Each one contains, at maximum, code points U+xxx000 to U+xxxFFF.

Character name data modules, organized by the first three digits of the code point in hexadecimal (columns 0 to F give the third digit):
00x, column 0: U+0000–U+0FFF
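
The partitioning of the name data can be sketched as follows. The helper names are hypothetical, and the sketch assumes that "xxx" in the module title is the code point divided by 0x1000, written as three zero-padded uppercase hexadecimal digits.

```python
def names_module_for(code_point: int) -> str:
    """Title of the name data module assumed to cover this code point."""
    prefix = code_point >> 12  # drop the last three hexadecimal digits
    return f"Module:Unicode data/names/{prefix:03X}"


def names_module_range(code_point: int) -> tuple[int, int]:
    """First and last code point of the U+xxx000 to U+xxxFFF block."""
    first = code_point & ~0xFFF  # clear the low 12 bits
    return first, first + 0xFFF


for code_point in (0xA9, 0xA061, 0x1F608):
    first, last = names_module_range(code_point)
    print(f"U+{code_point:04X}: {names_module_for(code_point)} "
          f"(U+{first:04X} to U+{last:04X})")
```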

The Unicode database is released by Unicode Inc. under the following terms:

Copyright © 1991-2018 Unicode, Inc. All rights reserved. Distributed under the Terms of Use in https://www.unicode.org/copyright.html.

Permission is hereby granted, free of charge, to any person obtaining a copy of the Unicode data files and any associated documentation (the "Data Files") or Unicode software and any associated documentation (the "Software") to deal in the Data Files or Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, and/or sell copies of the Data Files or Software, and to permit persons to whom the Data Files or Software are furnished to do so, provided that either (a) this copyright and permission notice appear with all copies of the Data Files or Software, or (b) this copyright and permission notice appear in associated Documentation.

THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THE DATA FILES OR SOFTWARE.

Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in these Data Files or Software without prior written authorization of the copyright holder.

Known issues

  • Reading data such as Module:Unicode data/aliases is neither provided nor documented.
  • Test failure: lookup_category for U+FFFF (<noncharacter-FFFF>) should return Cn but returns nil.
{{#invoke:Unicode data|lookup|category|0xFFFF}} → [Nil]
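
The expected behavior in the failing test can be illustrated with a small Python sketch of a range-table lookup that falls back to Cn. How the module actually stores categories is not described here, so the table layout, its two sample entries, and the function are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical excerpt of a category table: (first, last, category),
# sorted by first code point. A real table would list every assigned range.
CATEGORY_RANGES = [
    (0x002B, 0x002B, "Sm"),
    (0x00A9, 0x00A9, "So"),
]
_RANGE_STARTS = [first for first, _, _ in CATEGORY_RANGES]


def lookup_category(code_point: int) -> str:
    # Find the last range that starts at or before the code point.
    index = bisect_right(_RANGE_STARTS, code_point) - 1
    if index >= 0:
        first, last, category = CATEGORY_RANGES[index]
        if first <= code_point <= last:
            return category
    # Code points in no listed range (noncharacters, unassigned) default
    # to Cn instead of returning nothing.
    return "Cn"


print(lookup_category(0xA9), lookup_category(0x2B), lookup_category(0xFFFF))
```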

See also

  • Named entities: for example, U+22C1 n-ary logical or: {{#invoke:LoadData|Numcr2namecr|0x22C1}} → &bigvee;, &Vee;, &xvee;
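
A comparable reverse lookup can be sketched in Python with the standard `html.entities.html5` table. This is an analogy only: it does not use Module:LoadData or Numcr2namecr, and `entity_names` is a hypothetical helper.

```python
from html.entities import html5


def entity_names(code_point: int) -> list[str]:
    """Named character references that expand to exactly this code point."""
    character = chr(code_point)
    return sorted(
        f"&{name}"
        for name, value in html5.items()
        # Keep only the forms ending in ";" and skip legacy duplicates.
        if value == character and name.endswith(";")
    )


print(entity_names(0x22C1))
```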