nexmon – Blame information for rev 1
Rev | Author | Line No. | Line |
---|---|---|---|
1 | office | 1 | While some other iconv(3) implementations - like FreeBSD iconv(3) - choose |
2 | the "many small shared libraries" and dlopen(3) approach, this implementation |
3 | packs everything into a single shared library. Here is a comparison of the |
4 | two designs. |
5 | |
6 | * Run-time efficiency |
7 | 1. A dlopen() based approach needs a cache of loaded shared libraries. |
8 | Otherwise, every iconv_open() call will result in a call to dlopen() |
9 | and thus to file system related system calls - which is prohibitive |
10 | because some applications use the iconv_open/iconv/iconv_close sequence |
11 | for every single filename, string, or piece of text. |
12 | 2. In terms of virtual memory use, both approaches are on par. Being shared |
13 | libraries, the tables are shared between any processes that use them. |
14 | And because of the demand loading used by Unix systems (and because libiconv |
15 | does not have initialization functions), only those parts of the tables |
16 | which are needed (typically only a few kilobytes) will be read from disk and |
17 | paged into main memory. |
18 | 3. Even with a cache of loaded shared libraries, the dlopen() based approach |
19 | makes more system calls, because it has to load one or two shared libraries |
20 | for every encoding in use. |
21 | |||
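To make the per-string usage pattern from item 1 concrete, here is a minimal sketch of the iconv_open/iconv/iconv_close sequence using the standard iconv(3) interface. The helper name convert_string and its buffer conventions are assumptions made for illustration; they are not part of libiconv.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string `in` from encoding
 * `from` to encoding `to`, storing a NUL-terminated result in `out`
 * (capacity `outcap` bytes). Returns 0 on success, -1 on failure. */
int convert_string(const char *from, const char *to,
                   const char *in, char *out, size_t outcap)
{
    if (outcap == 0)
        return -1;

    /* Note the argument order: target encoding first, source second. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    char *inptr = (char *)in;       /* iconv() takes a non-const pointer */
    size_t inleft = strlen(in);
    char *outptr = out;
    size_t outleft = outcap - 1;    /* keep one byte for the terminator */
    int status = 0;

    /* Convert the whole input, then flush any pending shift sequence. */
    if (iconv(cd, &inptr, &inleft, &outptr, &outleft) == (size_t)-1)
        status = -1;
    else if (iconv(cd, NULL, NULL, &outptr, &outleft) == (size_t)-1)
        status = -1;

    *outptr = '\0';
    iconv_close(cd);
    return status;
}
```

An application that calls a helper like this once per filename or string pays the iconv_open() cost every time, which is why the cost of opening a converter matters in the comparison above.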
22 | * Total size |
23 | In the dlopen(3) approach, every shared library has a symbol table and |
24 | relocation offset. All together, FreeBSD iconv installs more than 200 shared |
25 | libraries with a total size of 2.3 MB, whereas libiconv installs 0.45 MB. |
26 | |
27 | * Extensibility |
28 | The dlopen(3) approach is good for guaranteeing extensibility if the iconv |
29 | implementation is distributed without source. (Or when, as in glibc, you |
30 | cannot rebuild iconv without rebuilding your libc, thus possibly |
31 | destabilizing your system.) |
32 | The libiconv package achieves extensibility through the LGPL license: |
33 | Every user has access to the source of the package and can extend and |
34 | replace just libiconv.so. |
35 | The places which have to be modified when a new encoding is added are as |
36 | follows: add an #include statement in iconv.c, add an entry in the table in |
37 | iconv.c, and of course, update the README and iconv_open.3 manual page. |
38 | |||
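The "add an entry in the table" step can be pictured with a small sketch. Everything below is hypothetical: the struct layout, the demo_ names and the simplified function signature are assumptions for illustration and do not reproduce the actual table in iconv.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical decoder signature: read one character from s[0..n-1], store
 * its Unicode code point in *pwc, and return the number of bytes consumed,
 * or -1 on error. */
typedef int (*demo_mbtowc_fn)(unsigned int *pwc,
                              const unsigned char *s, size_t n);

static int demo_ascii_mbtowc(unsigned int *pwc,
                             const unsigned char *s, size_t n)
{
    if (n < 1 || s[0] >= 0x80)
        return -1;                  /* only 7-bit values are valid ASCII */
    *pwc = s[0];
    return 1;
}

static int demo_latin1_mbtowc(unsigned int *pwc,
                              const unsigned char *s, size_t n)
{
    if (n < 1)
        return -1;
    *pwc = s[0];                    /* ISO-8859-1 bytes equal code points */
    return 1;
}

struct demo_encoding {
    const char *name;
    demo_mbtowc_fn mbtowc;
};

/* Adding an encoding means adding its converter and one more row here. */
static const struct demo_encoding demo_encodings[] = {
    { "ASCII",      demo_ascii_mbtowc  },
    { "ISO-8859-1", demo_latin1_mbtowc },
};

/* Look up an encoding by exact name; returns NULL if it is not in the table. */
const struct demo_encoding *demo_find_encoding(const char *name)
{
    size_t count = sizeof demo_encodings / sizeof demo_encodings[0];
    for (size_t i = 0; i < count; i++)
        if (strcmp(demo_encodings[i].name, name) == 0)
            return &demo_encodings[i];
    return NULL;
}
```

Because the whole table lives in one library, extending it is a source change followed by a rebuild of that library, with no additional files to install.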
39 | * Use within other packages |
40 | If you want to incorporate an iconv implementation into another package |
41 | (such as a mail user agent or web browser), the single library approach |
42 | is easier, because: |
43 | 1. In the dlopen(3) approach you have to provide the right directory |
44 | prefix which will be used at run time. |
45 | 2. Incorporating iconv as a static library into the executable is easy - |
46 | it won't need dynamic loading. (This assumes that your package is under |
47 | the LGPL or GPL license.) |
48 | |
49 | |
50 | All conversions go through Unicode. This is possible because most of the |
51 | world's characters have already been allocated in the Unicode standard. |
52 | Therefore we have for each encoding two functions: |
53 | - For conversion from the encoding to Unicode, a function called xxx_mbtowc. |
54 | - For conversion from Unicode to the encoding, a function called xxx_wctomb, |
55 | and for stateful encodings, a function called xxx_reset which returns to |
56 | the initial shift state. |
57 | |||
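The Unicode pivot described above can be sketched as follows: one function decodes a character of the source encoding to a code point, another encodes that code point in the target encoding, and a loop handles one character at a time. The demo_ names, the error codes and the simplified signatures are assumptions for illustration; the real xxx_mbtowc and xxx_wctomb functions may take different parameters.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_ILSEQ    (-1)  /* hypothetical: character cannot be converted */
#define DEMO_TOOSMALL (-2)  /* hypothetical: not enough input or output room */

/* ISO-8859-1 to Unicode: every byte is the code point of the same value. */
static int demo_latin1_mbtowc(uint32_t *pwc, const unsigned char *s, size_t n)
{
    if (n < 1)
        return DEMO_TOOSMALL;
    *pwc = s[0];
    return 1;
}

/* Unicode to UTF-8: returns the number of bytes written to r[0..n-1]. */
static int demo_utf8_wctomb(unsigned char *r, uint32_t wc, size_t n)
{
    int count;
    if (wc < 0x80) count = 1;
    else if (wc < 0x800) count = 2;
    else if (wc < 0x10000) count = 3;
    else if (wc < 0x110000) count = 4;
    else return DEMO_ILSEQ;
    if (wc >= 0xD800 && wc < 0xE000)
        return DEMO_ILSEQ;          /* surrogates are not characters */
    if (n < (size_t)count)
        return DEMO_TOOSMALL;
    if (count == 1) {
        r[0] = (unsigned char)wc;
        return 1;
    }
    /* Fill the continuation bytes from the end, six bits at a time. */
    for (int i = count - 1; i > 0; i--) {
        r[i] = (unsigned char)(0x80 | (wc & 0x3F));
        wc >>= 6;
    }
    /* Lead byte: 0xC0, 0xE0 or 0xF0 marker plus the remaining high bits. */
    r[0] = (unsigned char)((0xF00 >> count) | wc);
    return count;
}

/* Convert a buffer one character at a time, always going through Unicode.
 * Returns the number of bytes written, or a negative DEMO_ error code. */
long demo_latin1_to_utf8(const unsigned char *in, size_t inlen,
                         unsigned char *out, size_t outcap)
{
    size_t inpos = 0, outpos = 0;
    while (inpos < inlen) {
        uint32_t wc;
        int used = demo_latin1_mbtowc(&wc, in + inpos, inlen - inpos);
        if (used < 0)
            return used;
        int made = demo_utf8_wctomb(out + outpos, wc, outcap - outpos);
        if (made < 0)
            return made;
        inpos += (size_t)used;
        outpos += (size_t)made;
    }
    return (long)outpos;
}
```

With this split, supporting N encodings takes N decoder/encoder pairs rather than a dedicated converter for every pair of encodings.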
58 | |
59 | All our functions operate on a single Unicode character at a time. This is |
60 | obviously less efficient than operating on an entire buffer of characters at |
61 | a time, but it makes the coding considerably easier and less bug-prone. Those |
62 | who want the best performance should install the Real Thing (TM): GNU libc 2.1 |
63 | or newer. |
64 |