Bug Database
Bug ID: 4766311
Votes 9
Synopsis REGRESSION: 1.4.1 no longer has character encodings for JIS201,JIS208,JIS212
Category java:char_encodings
Reported Against 1.4.1
Release Fixed 1.5(tiger)
State 10-Fix Delivered, bug
Priority: 3-Medium
Related Bugs 4867083
Submit Date 22-OCT-2002
Description


FULL PRODUCT VERSION :
java version "1.4.1"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1-b21)
Java HotSpot(TM) Client VM (build 1.4.1-b21, mixed mode)

FULL OPERATING SYSTEM VERSION : windows 2000


A DESCRIPTION OF THE PROBLEM :
Character encodings of "JIS201", "JIS208", "JIS212" worked
fine in 1.3.1. Now they fail in 1.4.x, and we notice that
these encodings are no longer listed in the list of
supported encodings, and our application is completely broken.

Oddly however, when we try to use these encodings we are not
getting an UnsupportedEncodingException, they just aren't
working properly.

We are doing this because we need to support the peculiar
encoding of various international character sets in DICOM
which uses ISO 2022 escapes in a fairly general fashion.

To achieve this, we extract the runs of bytes between escape
sequences that use a specific encoding (like JIS 201 or 208
or 212), then decode each run with the encodings previously
supported under the names "JIS201", "JIS208", "JIS212",
letting them do the hard work via "new
String(bytes, offset, length, useEncoding);".
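
The per-segment decoding described above might look like the following minimal sketch. The class name SegmentDecode, the helper decodeSegment, and the sample buffer with its offsets are illustrative assumptions, not the reporter's code. The charset name "x-JIS0208" is the canonical name given in the evaluation for later releases and may not exist in every runtime, so the sketch checks for it first.

```java
import java.nio.charset.Charset;

public class SegmentDecode {
    // Decode one run of bytes (already isolated between ISO 2022 escape
    // sequences) with the named charset. Fails loudly if the name is unknown.
    static String decodeSegment(byte[] bytes, int offset, int length, String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName);
        }
        return new String(bytes, offset, length, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Hypothetical sample buffer: ESC $ B, the four bytes used in the
        // test case attached to this report, then ESC ( B.
        byte[] data = {0x1b, 0x24, 0x42, 0x3b, 0x33, 0x45, 0x44, 0x1b, 0x28, 0x42};
        String name = "x-JIS0208"; // assumption: present in this runtime
        if (Charset.isSupported(name)) {
            String s = decodeSegment(data, 3, 4, name);
            // Print the decoded UTF-16 code units in hex.
            for (char c : s.toCharArray()) {
                System.out.print(Integer.toHexString(c) + " ");
            }
            System.out.println();
        } else {
            System.out.println(name + " is not supported by this runtime");
        }
    }
}
```

Checking Charset.isSupported up front only catches an unknown name; it would not have detected the silent pass-through described in this report, where the name resolves but no conversion happens.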


REGRESSION.  Last worked in version 1.3.1

REPRODUCIBILITY :
This bug can be reproduced always.

CUSTOMER WORKAROUND :
Haven't been able to find one despite trying all the  xxxxx 
relevant supported encodings listed.

Release Regression From : 1.4.0_02
The above release value was the last known release where this 
bug was known to work. Since then there has been a regression.

(Review ID: 165702) 
======================================================================
Work Around
N/A
Evaluation
Please provide a test case. The named aliases "JIS201", "JIS208", "JIS212"
(as opposed to "JIS0201", "JIS0208", "JIS0212") were never supported
based on a careful survey of past J2SE releases. "JIS0208", etc. work
fine in 1.4.1.
 xxxxx@xxxxx  2002-10-22

The 1.4.1 coders are correctly aliased but the JIS-X-0208 and JIS-X-0212
CharsetDecoder implementations subclass sun.nio.cs.ext.DoubleByteDecoder
and don't provide their own implementation of the protected method:

	protected char decodeSingle(int inputByte)

Note also that in 1.4.1 the canonical name "JIS0208" is incorrect since
there are no registered JIS X 0208 charsets within the IANA charset
registry. For 1.4.2 the canonical name for JIS X 0208:1997 is
x-JIS0208 and the x-JIS0208 Charset provides a historical "JIS0208"
alias.
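
One way to see which of the names discussed here a given runtime resolves, and to which canonical name, is to ask java.nio.charset.Charset directly. The following is an illustrative sketch only: the class CharsetProbe and the helper describe are made-up names, and the output depends entirely on the runtime it is run on.

```java
import java.nio.charset.Charset;

public class CharsetProbe {
    // Report whether a name resolves, and if so its canonical name and aliases.
    static String describe(String name) {
        try {
            if (!Charset.isSupported(name)) {
                return name + ": not supported";
            }
            Charset cs = Charset.forName(name);
            return name + " -> " + cs.name() + " aliases=" + cs.aliases();
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException is a subclass of IllegalArgumentException.
            return name + ": illegal charset name";
        }
    }

    public static void main(String[] args) {
        String[] names = {"JIS201", "JIS208", "JIS212",
                          "JIS0201", "JIS0208", "JIS0212", "x-JIS0208"};
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}
```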

 xxxxx@xxxxx  2002-11-29

Cause of problem known, will address this problem in the next J2SE feature
release.
 xxxxx@xxxxx  2002-12-09
Comments
  

Submitted On 22-OCT-2002
dclunie
The following small example illustrates the problem.

When run on 1.3.1 one gets:

java JISBug
src:
3b 33 45 44
dst:
e5 b1 b1 e7 94 b0

(which is correct)

and on 1.4.1-b21 one gets:

java JISBug
src:
3b 33 45 44
dst:
3b 33 45 44

(which is wrong ... no conversion has been done at all, but
neither is any exception thrown to indicate the encoding is
not supported - the code to do the work is presumably just
stubbed out or something)

The example:

public class JISBug {
    static private void dumpBytes(byte[] bytes) {
        for (int i=0; i<bytes.length; ++i)
            System.out.print(Integer.toHexString(((int)bytes[i])&0xff)+" ");
        System.out.println();
    }

    static public void main(String args[]) {
        try {
            byte[] jis0208bytes = {
                (byte)0x3b,(byte)0x33,(byte)0x45,(byte)0x44
            };

            String string = new String(jis0208bytes,"JIS0208");
            byte[] utf8Bytes = string.getBytes("UTF8");

            System.out.println("src: ");
            dumpBytes(jis0208bytes);
            System.out.println("dst: ");
            dumpBytes(utf8Bytes);
        }
        catch (java.io.UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}


Submitted On 24-OCT-2002
dclunie
My apologies for the typo in the original bug report; I
meant to say "JIS0208" as opposed to "JIS208", as
illustrated in the test case that I subsequently supplied,
which demonstrates that "JIS0208" etc. do NOT work fine at
all in 1.4.1, and what is more are NOT listed in those that
are supported (which is worrying).


Submitted On 27-FEB-2003
jwortmann
Got a work-around, although it does take a bit of work:)

Fortunately, the developer who wrote the 
sun.nio.cs.ext.JIS_X_0208 class left the class public and the 
inner decoder class protected.  This allows for the following 
work-around:

1)  Create a new class called JIS_X_0208_Fix in your own 
package of choice, extending sun.nio.cs.ext.JIS_X_0208.
2)  Inside JIS_X_0208_Fix, create an inner class (scoping is 
irrelevant) called DecoderFix, and extend 
sun.nio.cs.ext.JIS_X_0208.Decoder.
3)  Other than the required constructor, add the following
method:
      protected char decodeSingle(int inputByte) { return '\uFFFD'; }
4)  Create your own CharsetProvider that returns your new
JIS_X_0208_Fix when the charsetName is "JIS0208Fix" (or
whatever your heart desires), and register that provider the
normal way.
5)  Whenever you want to decode JIS0208, use the
charsetName that you chose in your provider.

Voila - now JIS 0208 works (as will JIS 0212, following similar
procedures).
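
The provider part of the steps above can be sketched roughly as follows. This is not the commenter's code: it deliberately avoids subclassing the internal sun.nio.cs.ext classes (which are not a stable API), and instead uses a hypothetical DelegatingCharset as a stand-in for JIS_X_0208_Fix that simply exposes an existing charset under a new name. The names FixCharsetProvider, DelegatingCharset, and "x-JIS0208Fix" are all assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.spi.CharsetProvider;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class FixCharsetProvider extends CharsetProvider {
    static final String FIX_NAME = "x-JIS0208Fix"; // hypothetical charset name

    // Stand-in for JIS_X_0208_Fix: a charset with its own name whose
    // decoder and encoder come from an existing charset.
    static final class DelegatingCharset extends Charset {
        private final Charset delegate;

        DelegatingCharset(String name, Charset delegate) {
            super(name, null); // null means "no aliases"
            this.delegate = delegate;
        }

        @Override public boolean contains(Charset cs) { return delegate.contains(cs); }
        // A real fix would return a decoder subclass that overrides decodeSingle.
        @Override public CharsetDecoder newDecoder() { return delegate.newDecoder(); }
        @Override public CharsetEncoder newEncoder() { return delegate.newEncoder(); }
    }

    private final String baseName;

    // Providers are instantiated through a public no-argument constructor.
    public FixCharsetProvider() { this("x-JIS0208"); }

    public FixCharsetProvider(String baseName) { this.baseName = baseName; }

    @Override
    public Charset charsetForName(String charsetName) {
        // Answer only for our own name, and only if the base charset exists.
        if (!FIX_NAME.equalsIgnoreCase(charsetName) || !Charset.isSupported(baseName)) {
            return null;
        }
        return new DelegatingCharset(FIX_NAME, Charset.forName(baseName));
    }

    @Override
    public Iterator<Charset> charsets() {
        List<Charset> list = new ArrayList<Charset>();
        Charset cs = charsetForName(FIX_NAME);
        if (cs != null) {
            list.add(cs);
        }
        return list.iterator();
    }
}
```

Registering "the normal way" means listing the provider's fully qualified class name in a file named META-INF/services/java.nio.charset.spi.CharsetProvider on the class path, after which Charset.forName("x-JIS0208Fix") should find it.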

As a side note - SUN - please fix this quick.  The fix is oh-so-
easy for you guys.  It simply involves renaming the 
incorrectly named convSingle to decodeSingle in 
sun.nio.cs.ext.JIS_X_0208.Decoder.  PLEASE...


Submitted On 27-FEB-2003
jwortmann
I have found the same thing in 1.4.1_01.  No conversion is 
done at all for JIS0208 or JIS0212, even though specific 
support for these character sets is provided by charsets.jar.


Submitted On 02-MAR-2003
dclunie
Just tested it with the latest 1.4.1_02 on Windows; the bug
is still present - I guess they really meant "feature
release" (1.4.2 ? 1.5 ?).

Disappointing.

Hi Joe

Thanks for the information about the workaround; I also
implemented a work around but that involved adding my own
JIS 0208 tables and bypassing all this stuff altogether ...
yours sounds much easier (faster and more compact).

david


