CEN/TC304 N633R
SC22/WG20 N499R
Date: 1996-10-12


Title: Minutes of the 1st Joint Meeting of JTC1/SC22/WG20, CEN/TC304/WG1, and CEN/TC304/WG2

Source: Michael Everson
Status: Adopted Minutes


ÖNORM, Heinestraße, Wien, Österreich
Wednesday 1996-09-11

AT Gerhard Budin
CA Alain LaBonté
DK Keld Simonsen
IE Michael Everson
IS Þorvarður Kári Ólafsson
JP Takayuki Sato
JP Akio Kido
PL Elżbieta Broma-Wrzesień
US Arnold Winkler
US Michael Kung

Chair: Arnold Winkler
Secretary: Michael Everson

Arnold Winkler: Wants to collect all the disharmonies regarding the ordering standard between ISO and CEN.

Alain LaBonté: It's an API for ordering strings. The default tables are aimed to be adaptable and modifiable. In the default we have looked at the tables; many things are identical, the order of accents for instance. We have fewer things than the input from Michael Everson. Yesterday Michael and I looked at the ordering of specials; Michael did a good job of ordering specials by shape and categories. Three simple things are not yet agreed: (1) caps before smalls or smalls before caps, (2) the order of scanning of accents at the second phase of scanning, (3) the effect of spaces on ordering.

Keld Simonsen: CEN has decided to do forward ordering on the second level.

Table by Þorvarður Kári Ólafsson:

Accents forward                 | Accents backward
multilingual                    | monolingual
                                | correct for French culture
                                | Canadian multilingual standard
does matter for small languages | doesn't matter for big languages
no tradition                    | does matter for big languages
L2 not used in bibliogr.        | incorrect for others
                                | conflict with DIN
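
The difference between forward and backward scanning of accents on the second level can be illustrated with a minimal Python sketch. It is not taken from any draft of the standard: the helper names `split_levels` and `sort_key` are hypothetical, the first level is simply the lowercased base letters, and accents are compared by the code point of their combining marks, which is an assumption made only to keep the example short.

```python
import unicodedata

def split_levels(word):
    """Split a word into base letters (level 1) and per-letter accents (level 2)."""
    base, accents = [], []
    for ch in unicodedata.normalize("NFD", word.lower()):
        if unicodedata.combining(ch) and accents:
            accents[-1] += ch      # attach the mark to the preceding base letter
        else:
            base.append(ch)
            accents.append("")     # no accent seen yet for this letter
    return base, accents

def sort_key(word, backward_accents=False):
    """Two-level key; the accent level is scanned from the end if requested."""
    base, accents = split_levels(word)
    if backward_accents:
        accents = accents[::-1]
    return (base, accents)

words = ["côté", "cote", "coté", "côte"]
forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backward_accents=True))
print(forward)
print(backward)
```

With forward scanning the first accent difference from the left decides, so "coté" precedes "côte"; with backward scanning the last accent difference decides, so "côte" precedes "coté". Words without any accent difference are unaffected by the choice.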

Arnold Winkler: It seems that a few countries have very specific rules, in a default situation, we can either go with what people want or what most people expect.

John Clews: In libraries, there has been a tradition of equating all accented letters with their unaccented forms, ß with ss, etc. This is well established.

Arnold Winkler: Default: should we help the French and make everyone else change if they want to?

Takayuki Sato: Viewpoint of people not dealing with accents, more than 50%. If I see something with an accent, I expect Left-to-Right normally -- it's predictable. If we open the same argument for CJK ideographs, we can argue this forever. But this IS is for the world; ideograph ordering should be friendly to Europeans and Americans, and likewise the international default order cannot be culturally correct for everyone. Backward is only culturally correct for French.

Keld Simonsen: I agree with Sato-san. This is a minor problem but it matters more that we agree than what we agree on. In a language as well-defined as Danish we don't have the rules. The major thing is that we agree, and not what we agree on.

John Clews: International standard should absorb what CEN uses. The French way should be a reference to a National standard, it is for one language.

Arnold Winkler: How many occurrences might this be?

John Clews: In practical situations, there are implementers, who are affected, and there are end users, who are expecting it or not expecting it; in many cases it won't matter to them.

Keld Simonsen: This is not an implementation issue; they need to implement for both directions in any case.

Evangelos Melagrakis: Remark: normalization, that is changing a letter like thorn to something else, was not well accepted in the last WG1 meeting. Are we talking about multiscript ordering? If we decide on Left-to-Right or Right-to-Left passing of diacritical marks, will this affect scripts other than Latin?

Alain LaBonté: No.

Evangelos Melagrakis: Right-to-Left made some problems for us when we tried to implement it. It is a little bit less effective than Left-to-Right. In Greece we have no specific rules regarding this; after long discussions a small majority decided for Left-to-Right ordering.

Arnold Winkler: I don't have the feeling that we are any closer to agreement on it. We have two countries with many words in which these occur. Also they are heavy users of computers. On the other side we have a much larger group of countries which have fewer words for which this is an issue at all. For whom do we make the default? It's only the default we can look at. We have to decide which direction we do this for either supporting what is supported today for other languages

Michael Everson: Is the dialogue box, the way options are presented to the end user, specified?

Þorvarður Kári Ólafsson: Reminds me of the conformance table we made for the MES.

Michael Everson: For multiscript, default, multilingual sorting, I want to see all scripts work the same way.

John Clews: Information is more and more global; we shouldn't want a different rule to apply to Latin, Greek, Cyrillic, Armenian, Georgian. There's nothing wrong with taking a rule that applies to all the scripts.

Keld Simonsen: From an implementer's viewpoint, they just do what the tables tell them. We have a decision to say that we order each of the scripts according to what the users of the script want. Latin for Latin, Greek for Greek, Cyrillic for Cyrillic. That way the people who know these scripts. This also maximizes the use of our standard.

Evangelos Melagrakis: Easily accepts an IS with options, 2, 4, 8, whatever. What I would not like to see is a registry having 15 or 20 or 50 with 50 different ordering schemes. The Registry can only be a registry of 2 or 3 options.

Keld Simonsen: You always sort one way at a time.

Arnold Winkler: The standard allows the possibility to tailor.

Much discussion.

Evangelos Melagrakis: I have reservations regarding the SP/NBSP, I think we can handle this in normalization before the sorting process.

Arnold Winkler: From all this we need the toggles, and we need to agree to disagree. Then I think we are in good shape; let's do this.

AI9609-J1 on Michael Everson to provide text of an informative annex regarding the interface for the toggles, how these are presented to the end user.

Þorvarður Kári Ólafsson: What are we doing about NBSP? Should it be ignored on first level? Do we toggle SP and NBSP against each other? One being a special at level and one being a character at level 1?

Keld Simonsen: Resolution that we need 3 toggle switches in the Standard (8 options).

Michael Everson presented the additions to the basic constituents of the Latin alphabet.


Discussion ensued. Every character has its own right.

Gerhard Budin: The philosophy of 26 letters is 646 thinking. This is 10646 thinking.

Michael Everson: Basic letters added to the Latin alphabet come in three kinds: deformations of existing characters, borrowings from foreign alphabets, and pure inventions.

Keld Simonsen: OI should be a G. Could be a G. SIGMA should be S.

Michael Everson: It isn't OI, it is GHA. It isn't SIGMA, it is ESH.

Michael Kung: If you insert GHA as a basic Latin character it is not backwards compatible to weight.

John Clews: I am concerned about the open-endedness of this. If we extend it, it will just run and run.

Michael Everson: What is the harm of that? In the history of our alphabet, letters (basic forms) have been added to the alphabet, inserted in some instances and tagged on to the end in some instances. I have followed the established fact, the habits of the users of our alphabet over centuries.

Alain LaBonté: If we have no expectation, we should take the expectation of those who use this letter.

Gerhard Budin: What is the consequence of inserting letters into the basic ordering?

Michael Kung: For people who have done those weight assignments it could be a problem. It breaks existing assumptions.

Michael Everson: I could try harder to merge GHA and ESH, maybe WYNN, but GLOTTAL and CLICK can't be. But my view is that it would be wrong to force this on any of these letters, and against traditional practice of the users of our alphabet.

Arnold Winkler: Michael Everson will think harder about some of these and about the problem of implementation. AI9609-J2 Michael Everson, Alain LaBonté, Michael Kung to discuss this further on the web.

Keld Simonsen: If you have a web page, you can index, you go to the first letter, you choose, click, go to your letter. This kind of link is tailorable.

Michael Kung: In sorting (?), you'll have trouble if your software was done with binary values where G was 10 and H was 11, but now H is 12 because you have added GHA. That is what happens when you insert a base character.

Alain LaBonté: But doing it by binary values is the programmer's mistake, that is not recommended.
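
The concern about hard-coded weights can be made concrete with a short sketch. It is only an illustration under stated assumptions: the names `build_weights` and `sort_key` are hypothetical, the alphabet is reduced to the 26 basic letters, weights are simply positions in an ordered list (so G is 6 and H is 7 rather than the 10 and 11 of the spoken example), and GHA (U+01A3) is placed directly after G as in the discussion above.

```python
import string

BASE_ORDER = list(string.ascii_lowercase)   # assumed 26-letter starting table
GHA = "\u01a3"                              # LATIN SMALL LETTER OI (gha)

def build_weights(order):
    """Derive first-level weights from an ordered list of base letters."""
    return {letter: rank for rank, letter in enumerate(order)}

def sort_key(word, weights):
    """Look every letter up in the table instead of hard-coding numbers."""
    return [weights[ch] for ch in word]

old_weights = build_weights(BASE_ORDER)

# Insert GHA as a new base letter between G and H and rebuild the table.
extended_order = BASE_ORDER.copy()
extended_order.insert(extended_order.index("g") + 1, GHA)
new_weights = build_weights(extended_order)

words = ["ha", GHA + "a", "ga"]
ordered = sorted(words, key=lambda w: sort_key(w, new_weights))
print(ordered)
print(old_weights["h"], new_weights[GHA], new_weights["h"])
```

Software that looks weights up in the table keeps working after the insertion, because the table is rebuilt as a whole. Software that stored the old number for H now holds the number that belongs to GHA, which is the breakage described above.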

Arnold Winkler: We've discussed this; it should remain open at this time. Michael Everson will work on this.

Michael Everson: Order of scripts is a new issue.

John Clews: 1988, took a particular order in his automated La El Cy, Ka Hy, He Ar, going basically from West to East. Taking oldest first.

Alain LaBonté: There is a religious problem. IBM had 4 volumes in their NLTP requirements. They had to have 2 volumes for Arabic and Hebrew because the Israelis accept that Hebrew be sorted after Arabic but the Arabs do not accept the reverse.

Keld Simonsen: We adopted a sequence of scripts. We would like to have the 2 standards, we don't have this 10646 requirement. This is a WG20 matter, we don't have a resolution for the 10646 script order.

Michael Everson: La El Cy Hy Ka, Ar He, Indic, CJK.

Keld Simonsen: CEN perspective, I would like ISO to at least use La El Cy.

Evangelos Melagrakis: We don't need a resolution; the minutes are sufficient.
AI9609-J3 Everyone to provide input on ordering of scripts if there are strong feelings.

Lunch. Viennese.

John Clews: Is chair to TC46/SC2, Conversion of written languages (mostly transliteration). Reported in general. Wants to make SC2 standards more relevant to JTC1 as a whole. Keyboard layout and locales might also be associated with transliteration.

Keld Simonsen: The proposed API standard already addresses information on transliteration and transcription. We are very much aware of the need and are trying to accommodate that. Also WG15, POSIX, is working in this area.

Evangelos Melagrakis: We are preparing a new work program in our committee, we should have some requirements given to us, would like revision of current standards, etc.

Takayuki Sato: 14652, both have projects open, per Vienna agreement, we need to decide how to cooperate, who shall do work, who shall do what. WG20 needs a resolution to report to SC22, saying that we want to work with TC304.

Keld Simonsen: No, this is an internal ISO cooperation between TC46/SC2 and WG20.

Evangelos Melagrakis: We are dealing with transformation of Character repertoires, not coding.

Michael Everson: There are TC304 WG4 projects related to this.

Þorvarður Kári Ólafsson: Registration standard. ENV 12005, we have this as a NWI to make it into an EN; TC304 have encouraged DS to fasttrack this.

Keld Simonsen: First DS will implement this as a Danish Standard and then fasttrack it through ISO. This is the procedure and it is underway.

Michael Everson: As a WG20 member I think that if we get this we should make sure it is published bilingually in English and French.

Michael Kung: 10646 mappings should be mentioned for language requirements.

Keld Simonsen: The repertoiremap, defined in the X/Open registry and part of the ENV 12005, defines the repertoire in terms of 10646. The proposal from DS will have the changes necessary for ISO.

Michael Kung: You should have a normative Annex E.

Þorvarður Kári Ólafsson: Annex E is normative but contains informative clauses.

Akio Kido: This standard is already published. DS wants to fasttrack it. Can we modify its contents if fasttracking is employed?

Arnold Winkler: Fasttracking does not allow modification of the standard. If DS makes modification I don't know what happens.

Takayuki Sato: Only editorial mistakes can be corrected.

Keld Simonsen: A DIS vote is like this. For fasttracking a few technical changes can be made.

Arnold Winkler: Will check this out in the SC22 meeting.

Evangelos Melagrakis: In TC304 meeting GR voted against the adoption of this standard as an IS, because many things have to be changed technically to go from EN to IS. This should be a very good base document to be an ISO standard but as it is it cannot be fasttracked. For procedural reasons we objected.

Keld Simonsen: The changes necessary are accounted for by the ISO rules. DS will fasttrack it with proposed text for changes.

Þorvarður Kári Ólafsson: If it is approved, is it just a matter of editing?

Keld Simonsen: I believe so, it is like the TR 11017, a few modifications might be made as a disposition of comments, but there is only 1 vote in fasttracking. If it fails it would need an NP in JTC1.

Michael Everson: If fasttracking does not allow changes, we could make the changes as part of the CEN revision.

Keld Simonsen: Or DS could forward the ENV as it is, and the European-specific text could be altered in amendments; the main technical specifications do not need to be changed at all.

Arnold Winkler: In Preparation and Adoption of ISs, you can submit without modification directly for vote. May request that the doc. be submitted to 1 or more SC22, their comments can be taken into account before the formal submission.

Keld Simonsen: We will consider this.

Michael Everson: Presented P11.

Þorvarður Kári Ólafsson: P17: European Default Locale.

Keld Simonsen: Michael Everson has had some input to it, Olle Järnefors has produced some input. For Cooperation with WG20; WG20 is specifying a default locale in 14652, covering all of 10646.

Akio Kido: Is it just POSIX?

Keld Simonsen: Yes, that is what we are registering in ENV 12005.

Michael Everson: No, it also has a narrative version of the thing.

Akio Kido: I think that our cultural specification could be a superset of the POSIX locale. What is the expected cooperation required?

Keld Simonsen: We need to align our specifications, with no unintentional deviation.

Takayuki Sato: If that is the case I have 2 questions. WG15 said that they are giving up cooperating; what do you want to do about WG15? WG20 will if necessary develop a generic standard; if you want to maintain some kind of consistency, this will bind WG20 activities.

Þorvarður Kári Ólafsson: As a general comment, cooperation between ISO and CEN is intended to stop this kind of problem, and to avoid doing the same work at the same time with different results. The Vienna agreement says that ISO and CEN shall cooperate. But CEN can go its own way and ISO can go its own way.

Keld Simonsen: There is no cooperation with WG15 because they are not making any locales. But TC304 and WG20 are making locales, and they are both expected to be the basis for their member bodies to make their own locales.

Þorvarður Kári Ólafsson: The desire is to go toward international harmonization, which is why we try not to work on things ISO is doing. But also we expect ISO won't start working on things we are working on.

Takayuki Sato: Since there is no locale project in ISO, that is in WG20, right now, what are we cooperating on? And WG15, which did locales, has said to forget about CEN. In general, we don't like to dream separate dreams using the same words. We would like to agree very precisely now; the WG20 project is meant to be generic. How can we achieve the ideal goal, what is the actual plan?

Þorvarður Kári Ólafsson: The way to do it is to ask the European experts in WG20 to speak for the European specifications. Over the last few months those experts have made some headway and they have a fairly good idea what the default will be looking like.

Evangelos Melagrakis: I begin to lose track, it seems that WG20 people are talking to each other... Is there a WI in WG20 that will produce something like what we are reproducing?

Takayuki Sato: No, there is a generic project. But the two projects do not have the same scope. Similar, but with different specification technique.

Þorvarður Kári Ólafsson: I would like to see WG20 ask TC304 to provide data.

Keld Simonsen: It is not intended to fasttrack the CEN standard. We will just be putting it into our CEN standard on European Locale.

Evangelos Melagrakis: Is EN12005 related to any current WG20 WI?

Takayuki Sato: No.

Akio Kido: WG20 has no project to develop an international locale. The sorting standard does have a default table. It seems to me that we are already focused on cooperation.

Specification method (POSIX in CEN, WG20 is specifying a more generic method)
Registration procedure referring to narrative and to POSIX
The contents of the locale, national, supranational, international, default

Þorvarður Kári Ólafsson: Well, if WG20 wants to work on this, would it base the work on the European locale?

Arnold Winkler: I never reject good input, of course.

Þorvarður Kári Ólafsson: Well, if and when WG20 starts on this, they will take our input.

Keld Simonsen: It's WG20 Resolution 9509-09. It will refer to 14651.

Resolutions: We need a cooperation plan. One way to deal with it would be to base WG2 locale work

Arnold Winkler: A language-independent specification, with the premise that the POSIX specification will be a subset of it, is sufficiently different from the POSIX registration.

Keld Simonsen: On the sorting issues and the default locale issues, the resolutions are much the same: WG20 is requested to take CEN input, CEN is requested to take WG20 input, and the two groups are to try to harmonize.

Akio Kido: As for the locale itself, the POSIX default locale is WG15's; WG20 cannot edit ISO's default POSIX locale.

Takayuki Sato: Even if WG15 refused, CEN has options.

Þorvarður Kári Ólafsson: CEN can force the cooperation, even if WG15 is reluctant and has no resources.

Keld Simonsen: POSIX doesn't have a rapporteur group on internationalization work, this request on the currency is being proposed

Michael Everson: Can we see the proposed syntax?

Keld Simonsen: Yes, but it is only proposed. AI9609-J4 Keld Simonsen to provide the proposed syntax.

Akio Kido: POSIX resolution "internationalization matters will be handled by WG20" is what I saw. 2. If POSIX asks WG20 to modify some POSIX default locale elements, we need to have something from WG15. 3. Our WG20 scope is internationalization, worldwide. If we develop a default locale, the date time format should be ISO standard, numeric, collation, everything should be ISO standard.

Discussion of Joint Meeting Doc 1 (WG2 things).

Gerhard Budin: TC304 should not be too interested in Parole and Speechdat, but Glossasoft might prove fruitful.

Keld Simonsen reported on the API.

Michael Everson: WG2 is likely to do this tomorrow.

Combined Resolution 9609-01 Options of International Standard Ordering
SC22/WG20, CEN/TC304/WG1, and CEN/TC304/WG2 resolve that IS 14651 will specify up to 8 options for the combinations of the following toggles with respect to the Latin script: whether capitals or smalls be ordered first, whether accents are ordered from the beginning or end of the string on the second level, whether space is considered on the first level or not.
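
The three toggles of the resolution, and the 8 combinations they yield, can be sketched as follows. This is only an illustration, not text from IS 14651: the function `make_key` and its parameter names are hypothetical, case is placed on a third level, and a space that is not considered on the first level is demoted to a final tie-breaking level, which is an assumption of this sketch.

```python
import unicodedata
from itertools import product

def make_key(caps_first, backward_accents, space_on_level1):
    """Build a sort key function for one combination of the three toggles."""
    def key(text):
        base, accents, case, specials = [], [], [], []
        for pos, ch in enumerate(unicodedata.normalize("NFD", text)):
            if unicodedata.combining(ch) and accents:
                accents[-1] += ch              # accent belongs to previous letter
                continue
            if ch == " " and not space_on_level1:
                specials.append((pos, ch))     # space only breaks ties
                continue
            base.append(ch.lower())
            accents.append("")
            case.append(0 if ch.isupper() == caps_first else 1)
        if backward_accents:
            accents.reverse()                  # scan accents from the end
        return (base, accents, case, specials)
    return key

# Three on/off toggles give 2 * 2 * 2 = 8 possible orderings.
OPTIONS = list(product((False, True), repeat=3))

caps_then_smalls = sorted(["a", "A"], key=make_key(True, False, True))
smalls_then_caps = sorted(["a", "A"], key=make_key(False, False, True))
space_counts = sorted(["dell", "de luca"], key=make_key(True, False, True))
space_ignored = sorted(["dell", "de luca"], key=make_key(True, False, False))
print(len(OPTIONS), caps_then_smalls, smalls_then_caps, space_counts, space_ignored)
```

When space counts on the first level, "de luca" sorts before "dell" because the space precedes every letter; when it is ignored, the strings compare as "deluca" and "dell", so "dell" comes first.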

Actions:

AI9609-J1 on Michael Everson to provide text of an informative annex regarding the interface for the toggles, how these are presented to the end user.
AI9609-J2 Michael Everson, Alain LaBonté, Michael Kung to discuss the insertion of Latin letters into ordering further on the web.
AI9609-J3 Everyone to provide input on ordering of scripts if there are strong feelings.
AI9609-J4 Keld Simonsen to provide the proposed POSIX syntax for the date extensions.


HTML Michael Everson, [email protected], Baile Átha Cliath, 1997-09-16