Author: Robert Dewar, Vasiliy Fofanov, Franco Gasperoni, Yang Zhang
Abstract: Ada Gem #22 — Ada speaks many languages. How many? All those languages that you can write with a computer keyboard. Read on to learn about this Ada gem.
The only characters allowed in an Ada 83 program (for strings, comments, and identifiers) were 7-bit ASCII symbols. That was annoying. In Ada 83 we could not write:
S : String := "à la carte";
and even less write:
À_La_Carte : Boolean := False;
That was, how shall we put it, a “set menu” view of things :)
An amendment changed this situation to 8-bit characters during the lifetime of Ada 83. Ada 95 made this 8-bit change clearer and more official, designating ISO Latin-1 as the character set. So the above are both legal in Ada 95. Ada 95 also introduced 16-bit ISO 10646 support in the form of Wide_Character. One can write in Ada 95:
My_Favorite_Pie : Wide_String := "π"; -- :)
but an implementation did not have to allow 16-bit characters in identifiers and comments, and Ada 95 did not mandate the acceptance of the following:
π : constant := 3.14159_26535_89793_23846_26433_83279_50288_41971_69399_37511;
-- Can't eat this one :)
although the GNAT technology for Ada 95 allows full 16-bit characters in identifiers and comments.
Ada 2005 is the ultimate in terms of openness: the full 32-bit ISO 10646 character set is supported, and use of π (for instance) in identifiers, comments, and strings is allowed in an Ada 2005 program. As a matter of fact, the package Ada.Numerics in Ada 2005 now contains:
Pi : constant := 3.14159_26535_89793_23846_26433_83279_50288_41971_69399_37511;
π : constant := Pi;
To demonstrate the use of the full 32-bit ISO 10646 character set in Ada 2005 programs, we have written a couple of programs in English, Russian, and Chinese. These programs take an ISO date ranging from 1983 to 2019 and print the date in the local format. Each program wishes you a happy new year if the date entered matches the local new year date. All file names are of the form:
happy_new_year_{locale}_{lang}.adb
where {locale} is a two-letter sequence indicating the country for which the program has been written (e.g. “us” for the US, “cn” for China, “ru” for Russia) and {lang} is the language in which the program has been written (e.g. “en” for English, “cn” for Chinese, “ru” for Russian). For instance, “happy_new_year_cn_en.adb” is the program for China written in English, while “happy_new_year_cn_cn.adb” is the program for China written in Chinese. Having a program in English for China allows non-Chinese speakers to understand what the program does and perhaps learn some Chinese :)
To compile the Ada programs provided with this gem we suggest that you use options -gnatW8 -gnat05.
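For readers who want to try these options before downloading the example files, here is a minimal sketch of a program that uses non-ASCII identifiers and a non-ASCII string literal. It is not one of the gem's example files: the procedure name Circle, the file name circle.adb, and the Cyrillic identifiers are made up for illustration, and the sketch assumes a GNAT installation with the source file saved as UTF-8.

```ada
--  Illustrative sketch only, not one of the gem's example files.
--  Assumes the source is saved as UTF-8 in a file named circle.adb
--  and built with: gnatmake -gnatW8 -gnat05 circle.adb

with Ada.Numerics;
with Ada.Wide_Wide_Text_IO; use Ada.Wide_Wide_Text_IO;

procedure Circle is
   --  Cyrillic identifiers meaning "radius" and "area" (hypothetical names).
   Радиус  : constant Float := 2.0;
   --  Ada.Numerics.π is the Ada 2005 named number mentioned above.
   Площадь : constant Float := Ada.Numerics.π * Радиус ** 2;
begin
   --  The literal resolves to Wide_Wide_String, so it can hold Cyrillic text.
   Put_Line ("Площадь круга: " & Float'Wide_Wide_Image (Площадь));
end Circle;
```

Whether the output displays correctly depends on the terminal being able to show UTF-8 text, which is independent of the compiler.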
Enjoy… Happy Holidays and Happy New Year :)
Ada Gems example files are distributed by AdaCore and may be used or modified for any purpose without restrictions.
happy_new_year_cn_cn.adb
If you have an idea for a Gem you would like to contribute, please feel free to contact us at gems@adacore.com.
Stanislaw Goldstein said:
I am very happy to see this gem, as I have been fighting for a long time to see non-ASCII characters in Ada. I have compiled the programs and run them. Unfortunately, the results are not very satisfactory, as I have seen neither Russian nor Chinese text in the output window or the console. Is the problem already solved? (I have the academic version 2007-2 of GPS and the compiler, on Windows.) Has anybody been able to see the output as it should look?
Stefan Bellon said:
I do not think this is a problem with the compiler, but with the setting of your console. Of course you need a console which is capable of displaying UTF-8 encoded text and have the necessary fonts available. I just checked with gnome-terminal (set to UTF-8) and uxterm, and the happy_new_year_cn_cn outputs nicely formatted Chinese dates (and the new year message for 2007-02-18 and 2008-02-07, so it looks like it works). Not that I understand Chinese of course … ;-)
Stanislaw Goldstein said:
You are right, it is not a problem of the compiler. Fortunately, the problem was solved for me by Robert Dewar and Nicolas Setton for the output window of GPS: it turned out that one has to set the environment variable CHARSET to UTF-8. I still have a problem with the Windows console (started with cmd). Although the console can show UTF-8 characters using the Lucida Console font (from a fairly restricted subset of Unicode), and one can change the code page to UTF-8 (code page 65001), the programs show a run-time error. I will give the details in the ticket GB04-005.