README

This directory contains a mechanism for GCC to have its own internal
implementation of wcwidth functionality (cpp_wcwidth () in libcpp/charset.c).

The idea is to produce the necessary lookup table
(../../libcpp/generated_cpp_wcwidth.h) in a reproducible way, starting from the
following files that are distributed by the Unicode Consortium:

ftp://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt
ftp://ftp.unicode.org/Public/UNIDATA/EastAsianWidth.txt
ftp://ftp.unicode.org/Public/UNIDATA/PropList.txt

These three files have been added to source control in this directory;
please see unicode-license.txt for the relevant copyright information.
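
As an illustration of the kind of input involved, the following Python sketch
parses lines in the EastAsianWidth.txt format into code point ranges.  It is
not the glibc or GCC code: the function name parse_east_asian_width and the
sample lines are hypothetical, and treating only the W and F properties as
wide is a simplifying assumption.

```python
def parse_east_asian_width(lines):
    """Yield (first, last, prop) for each data line of EastAsianWidth.txt.

    Hypothetical helper for illustration only; not part of glibc or GCC.
    """
    for raw in lines:
        # Everything after '#' is a comment; blank lines carry no data.
        data = raw.split("#", 1)[0].strip()
        if not data:
            continue
        # A data line is "CODE;PROP" or "FIRST..LAST;PROP".
        code_range, prop = (field.strip() for field in data.split(";", 1))
        first, _, last = code_range.partition("..")
        yield int(first, 16), int(last or first, 16), prop


# Sample lines written in the file's format (not a full copy of the data).
sample = [
    "# EastAsianWidth sample",
    "0020;Na          # Zs         SPACE",
    "1100..115F;W     # Lo    [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER",
    "3000;F           # Zs         IDEOGRAPHIC SPACE",
]

# Assumption: only wide (W) and fullwidth (F) ranges occupy two columns.
wide = [(a, b) for a, b, prop in parse_east_asian_width(sample)
        if prop in ("W", "F")]
```

Here `wide` ends up holding the Hangul range and the ideographic space; the
real processing in from_glibc/ also consults UnicodeData.txt and PropList.txt.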

In order to keep in sync with glibc's wcwidth as much as possible, it is
desirable for the logic that processes the Unicode data to be the same as
glibc's.  To that end, the from_glibc/ directory here contains the glibc
Python code that implements their logic.  This code was copied verbatim from
glibc, and it can be updated at any time from the glibc source code
repository.  The files copied from that repository are:

localedata/unicode-gen/unicode_utils.py
localedata/unicode-gen/utf8_gen.py

And the most recent versions added to GCC are from glibc git commit:
f6032247061fb37d59565f2e9667e242c8a98e76

Finally, the script gen_wcwidth.py found here contains the GCC-specific code to
map glibc's output to the lookup tables we require.  This script should not need
to change, unless there are structural changes to the Unicode data files or to
the glibc code.
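
To make the idea of a width lookup table concrete, here is a minimal Python
sketch of one possible layout: runs of equal width stored as a sorted list of
range end points plus a parallel list of widths, queried by binary search.
This is an assumption for illustration, not a description of what
gen_wcwidth.py emits; build_runs, lookup and the sample ranges are
hypothetical names and data.

```python
from bisect import bisect_left

MAX_CODEPOINT = 0x10FFFF


def build_runs(ranges, default=1):
    """Turn (first, last, width) ranges into parallel run tables.

    Returns (ends, widths): ends[i] is the last code point of run i and
    widths[i] is its width.  Gaps between ranges get the default width.
    Assumes the input ranges do not overlap.
    """
    ends, widths = [], []

    def emit(last, width):
        # Merge with the previous run when the width is unchanged.
        if widths and widths[-1] == width:
            ends[-1] = last
        else:
            ends.append(last)
            widths.append(width)

    next_cp = 0
    for first, last, width in sorted(ranges):
        if first > next_cp:
            emit(first - 1, default)
        emit(last, width)
        next_cp = last + 1
    if next_cp <= MAX_CODEPOINT:
        emit(MAX_CODEPOINT, default)
    return ends, widths


def lookup(ends, widths, codepoint):
    """Return the width of a code point in 0..MAX_CODEPOINT."""
    return widths[bisect_left(ends, codepoint)]


# Hypothetical input: one zero-width range and one double-width range.
ends, widths = build_runs([(0x0300, 0x036F, 0), (0x1100, 0x115F, 2)])
```

With this layout a lookup costs one binary search over the run end points,
and the table size depends on the number of runs rather than on the number
of code points.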

The procedure to update GCC's wcwidth tables is the following:

1.  Update the three Unicode data files from the above URLs.

2.  Update the two glibc files in from_glibc/ from glibc's git.  Update
    the commit hash above in this README.

3.  Run ./gen_wcwidth.py X.Y > ../../libcpp/generated_cpp_wcwidth.h
    (where X.Y is the version of the Unicode standard corresponding to the
    Unicode data files being used, most recently, 13.0.0).

After that, GCC's wcwidth will match the most recent glibc.
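
Step 3 of the procedure can also be driven from Python, as in the minimal
sketch below.  The command and output path are the ones given in the
procedure; the regenerate function and its run parameter (which exists only
so the command can be exercised without executing it) are hypothetical.

```python
import subprocess


def regenerate(version,
               script="./gen_wcwidth.py",
               output="../../libcpp/generated_cpp_wcwidth.h",
               run=subprocess.run):
    """Run the generator and redirect its standard output to the header.

    Equivalent to: ./gen_wcwidth.py VERSION > OUTPUT
    Returns the command that was run.
    """
    command = [script, version]
    # Like the shell redirection, this truncates the header before the
    # script runs; check=True raises CalledProcessError on failure.
    with open(output, "w") as out:
        run(command, stdout=out, check=True)
    return command
```

Calling regenerate("13.0.0") from this directory would then correspond to
the command shown in step 3.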