The Globalization of Language in Oracle - Terminology

James Koopmann
Keywords: Globalization,Language,Oracle,Terminology
From: http://databasejournal.com/features/oracle/article.php/3483431

With today's exploding world economy, multi-national communication is essential. Databases must not only store different character sets but also present information in a comfortable format and order for individuals from every locale. This series explores how to globalize your databases and communicate effectively across the globe.

Let's begin learning how sorting and character comparisons happen in a global marketplace by first understanding some of the terminology.

Listing 1

Example on sorting different characters: Create table to hold some characters

CREATE TABLE char_table (xDecimal NUMBER, xBinary NUMBER, xGraphic CHAR(4));
INSERT INTO char_table VALUES (65   ,01000001   ,'A');
INSERT INTO char_table VALUES (66   ,01000010   ,'B');
INSERT INTO char_table VALUES (67   ,01000011   ,'C');
INSERT INTO char_table VALUES (68   ,01000100   ,'D');
INSERT INTO char_table VALUES (79   ,01001111   ,'O');
INSERT INTO char_table VALUES (85   ,01010101   ,'U');
INSERT INTO char_table VALUES (97   ,01100001   ,'a');
INSERT INTO char_table VALUES (98   ,01100010   ,'b');
INSERT INTO char_table VALUES (99   ,01100011   ,'c');
INSERT INTO char_table VALUES (100  ,01100100   ,'d');
INSERT INTO char_table VALUES (111  ,01101111   ,'o');
INSERT INTO char_table VALUES (117  ,01110101   ,'u');
INSERT INTO char_table VALUES (196  ,11000100   ,'Ä'); 
INSERT INTO char_table VALUES (214  ,11010110   ,'Ö');
INSERT INTO char_table VALUES (220  ,11011100   ,'Ü');
INSERT INTO char_table VALUES (228  ,11100100   ,'ä');
INSERT INTO char_table VALUES (246  ,11110110   ,'ö');
INSERT INTO char_table VALUES (252  ,11111100   ,'ü');

Select the Characters and order them on the Decimal, Binary, and Character value:

SELECT * FROM char_table ORDER BY xDecimal;
SELECT * FROM char_table ORDER BY xBinary;
SELECT * FROM char_table ORDER BY xGraphic;

ORDER BY xDecimal

XDECIMAL    XBINARY XGRA
-------- ---------- ----
      65    1000001 A
      66    1000010 B
      67    1000011 C
      68    1000100 D
      79    1001111 O
      85    1010101 U
      97    1100001 a
      98    1100010 b
      99    1100011 c
     100    1100100 d
     111    1101111 o
     117    1110101 u
     196   11000100 Ä
     214   11010110 Ö
     220   11011100 Ü
     228   11100100 ä
     246   11110110 ö
     252   11111100 ü

ORDER BY xBinary

XDECIMAL    XBINARY XGRA
-------- ---------- ----
      65    1000001 A
      66    1000010 B
      67    1000011 C
      68    1000100 D
      79    1001111 O
      85    1010101 U
      97    1100001 a
      98    1100010 b
      99    1100011 c
     100    1100100 d
     111    1101111 o
     117    1110101 u
     196   11000100 Ä
     214   11010110 Ö
     220   11011100 Ü
     228   11100100 ä
     246   11110110 ö
     252   11111100 ü

ORDER BY xGraphic

XDECIMAL    XBINARY XGRA
-------- ---------- ----
      65    1000001 A
      66    1000010 B
      67    1000011 C
      68    1000100 D
      79    1001111 O
      85    1010101 U
      97    1100001 a
      98    1100010 b
      99    1100011 c
     100    1100100 d
     111    1101111 o
     117    1110101 u
     252   11111100 ü
     220   11011100 Ü
     196   11000100 Ä
     246   11110110 ö
     228   11100100 ä
     214   11010110 Ö


QUICK TIP: Creating special characters from the keyboard
On Windows, hold the <ALT>-key down while punching in the character's decimal code (the xDecimal value in Listing 1) on the number pad.
You will need to prefix the decimal number with a zero.
So if you wanted to type in a Ü, you would hold down the <ALT>-key and key in 0220 on the number pad.
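
When the keyboard route is not convenient, the same characters can be produced inside SQL. The following is only a sketch: it assumes Oracle's UNISTR function and a client and database character set able to represent the characters, and the column aliases are made up for illustration.

```sql
-- UNISTR takes Unicode code points written as a backslash plus four hex digits.
-- Decimal 220 is hexadecimal DC and decimal 252 is hexadecimal FC.
SELECT UNISTR('\00DC') AS upper_u_umlaut,
       UNISTR('\00FC') AS lower_u_umlaut
FROM   dual;
```

UNISTR returns a value in the national character set, so Oracle converts it when it is stored into a CHAR column such as xGraphic.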

It is not usually my style to begin with an example, but I think the example in Listing 1 brings a few items to the forefront. In Listing 1, I created a table called CHAR_TABLE and inserted some characters along with their decimal and binary equivalents. I then ordered these rows on the decimal, binary, and graphical representations. Since the binary number is generated from the decimal number, we would assume the sort order to be the same, and it was. However, when we sort on the graphical character, we get a somewhat different order for the non-American characters. We must then ask ourselves why this happens. What is different about the graphical character representation that makes the sort work differently from the sort on the decimal value?
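
One way to start investigating that question is to look at what is actually stored for each character. The query below is a minimal sketch that assumes the CHAR_TABLE from Listing 1; the stored_bytes alias is invented here, and the bytes you see depend on your database character set.

```sql
-- Show the database character set that governs how characters are encoded.
SELECT value AS db_characterset
FROM   nls_database_parameters
WHERE  parameter = 'NLS_CHARACTERSET';

-- Show the bytes stored for each character.
-- Format 1016 = hexadecimal output (16) plus the character set name (1000).
-- RTRIM strips the blank padding that CHAR(4) adds.
SELECT xDecimal,
       xGraphic,
       DUMP(RTRIM(xGraphic), 1016) AS stored_bytes
FROM   char_table
ORDER  BY xDecimal;
```

If the stored bytes for the accented characters do not match the decimal values in the xDecimal column, a byte-by-byte sort on xGraphic has no reason to agree with a sort on xDecimal.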

While we might all wish everyone spoke the same language, it surely isn't going to happen in most of our lifetimes. While we are moving closer and closer to a real one-world economy, we will most surely still need to converse in multiple languages. This requires us to be able to store, retrieve, and manipulate these languages within our databases. The issue lies not so much in storage and retrieval, although there are a few minor issues there, but in how to provide an environment that gives linguistic meaning for the part of the globe that is viewing the information. This is most apparent when trying to perform equality comparisons and sorting on characters to retrieve the information that the user needs.
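
To make the sorting and equality point concrete, here is a hedged sketch using Oracle's NLSSORT function against the CHAR_TABLE from Listing 1. The sort names GERMAN and GENERIC_M_AI are chosen purely for illustration and assume a release that supports them (the _AI suffix requires Oracle 10g or later); this is not necessarily the approach the rest of the series takes.

```sql
-- Binary sort: rows are ordered by the encoded byte values of the characters.
SELECT xGraphic
FROM   char_table
ORDER  BY NLSSORT(xGraphic, 'NLS_SORT = BINARY');

-- Linguistic sort: rows are ordered by the rules of a German sort.
SELECT xGraphic
FROM   char_table
ORDER  BY NLSSORT(xGraphic, 'NLS_SORT = GERMAN');

-- Linguistic equality: an accent- and case-insensitive comparison
-- that treats variants of the letter a as equal.
-- RTRIM removes the CHAR(4) blank padding so the sort keys are comparable.
SELECT xGraphic
FROM   char_table
WHERE  NLSSORT(RTRIM(xGraphic), 'NLS_SORT = GENERIC_M_AI')
     = NLSSORT('a', 'NLS_SORT = GENERIC_M_AI');
```

Naming the sort explicitly in each query keeps the result independent of whatever the session happens to be set to, which is handy when comparing orders side by side.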

In subsequent articles, I will take us on a journey through handling and configuring our databases so that they can converse within a global economy and provide a local flavor of the data to whoever is viewing it. As there are many definitions revolving around language and data comparison, I thought it only proper to explore the terminology around the globalization of language before getting much deeper into this topic. I am positive that after reviewing these definitions you will gain some insight into why this is such a difficult task for a database vendor to overcome.

Terminology

Locale

The environment from which your database system is being accessed, and whose users want information displayed and handled in their native format.

Language

Signifies a part of the world and dictates specific conventions on how character data is displayed, sorted, and compared.

Territory
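
The terms above correspond to settings that each Oracle session carries. As a minimal sketch, assuming the standard NLS_SESSION_PARAMETERS view is available to your user, the values currently in effect can be listed like this:

```sql
-- List the language, territory, sort, and comparison settings for this session.
SELECT parameter, value
FROM   nls_session_parameters
WHERE  parameter IN ('NLS_LANGUAGE', 'NLS_TERRITORY', 'NLS_SORT', 'NLS_COMP')
ORDER  BY parameter;
```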

