Submitted On 07-NOV-2003
bluppie4
Ok, we've tracked down a requirement for this problem to occur:
If the LANG environment variable is set to en_US the problem
occurs. If set to en_US.UTF-8 the problem doesn't occur.
Cheers!
Submitted On 01-DEC-2003
mpultz
Just wanted to let you all know that the customer-submitted
workaround worked for me on SUSE 8.2 (JDK 1.4.2) and Windows
NT (JDK 1.4.2). Both platforms experienced the problem with
getBytes().
Submitted On 23-DEC-2003
solidmat
I am getting this bug as well on Windows XP running
with JRE 1.4.2-b28, after running the exploit code given
here.
Submitted On 23-DEC-2003
solidmat
SOURCE CODE TRACE
We took a look at the source code of the JVM. The
problem stems from the fact that a float value is used
to hold the maximum number of bytes per character
in java.nio.charset.CharsetEncoder.maxBytesPerChar.
The issue is that a float cannot exactly represent every
integer above 2^24, which is equal to 16,777,216.
Beyond that value, the encoding operation in
the character set classes can round down the
amount of memory needed for the buffer. The correct
solution would be to use doubles instead, or to account
for the round-off problem by increasing the buffer size.
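To illustrate the precision limit described above, here is a minimal sketch (the class and method names are made up for this example and are not part of the JDK). It only shows how an int length changes when it passes through a float; it does not reproduce the JDK's internal buffer calculation.

```java
public class FloatPrecisionDemo {
    // Hypothetical helper: converts an int to float and back,
    // the same kind of round trip a float-based size calculation implies.
    static int roundTrip(int n) {
        return (int) (float) n;
    }

    public static void main(String[] args) {
        int limit = 1 << 24; // 16,777,216
        System.out.println(roundTrip(limit));     // prints 16777216
        System.out.println(roundTrip(limit + 1)); // prints 16777216 (one unit lost)
        System.out.println(roundTrip(limit + 2)); // prints 16777218
    }
}
```

A length of 16,777,217 therefore yields a size that is one short, which matches the smallest failing length reported in this thread.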
SUGGESTED WORKAROUND
The workaround that we are using is to call
getBytes() on substrings that are smaller than 16MB
and combine the results using either a
ByteArrayOutputStream or a ByteBuffer.
NOTE: If you are planning on using multi-byte
character sets, then you have to make sure that your
buffer is sized accordingly.
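A rough sketch of the chunking workaround, assuming a ByteArrayOutputStream is used to join the pieces. The class name, method name, and chunk size are invented for this example, and this is not the code the poster actually used. It avoids cutting a surrogate pair in half, and it sticks to APIs that exist in JDK 1.4. For stateful encodings such as ISO-2022-JP the joined output may contain extra escape sequences compared to encoding the whole string at once, so treat it as a starting point only.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;

public class ChunkedGetBytes {
    // Hypothetical default chunk size in chars, well below 2^24.
    static final int DEFAULT_CHUNK_CHARS = 1 << 20;

    static byte[] getBytesChunked(String s, String charsetName, int chunkChars)
            throws UnsupportedEncodingException {
        if (chunkChars < 2) {
            throw new IllegalArgumentException("chunkChars must be at least 2");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int start = 0;
        while (start < s.length()) {
            int end = Math.min(start + chunkChars, s.length());
            // If the chunk would end on a high surrogate, stop one char
            // earlier so the pair is encoded together in the next chunk.
            if (end < s.length()) {
                char last = s.charAt(end - 1);
                if (last >= '\uD800' && last <= '\uDBFF') {
                    end--;
                }
            }
            byte[] part = s.substring(start, end).getBytes(charsetName);
            out.write(part, 0, part.length);
            start = end;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = getBytesChunked("hello world", "UTF-8", 4);
        System.out.println(bytes.length); // prints 11
    }
}
```

For a real large string you would call getBytesChunked(text, "UTF-8", ChunkedGetBytes.DEFAULT_CHUNK_CHARS) in place of text.getBytes("UTF-8").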
Submitted On 13-APR-2004
ridesmet
Can someone explain to me why FileOutputStream.open()
calls into String.getBytes()? I also
encounter this bug, but with the following stack trace:
java.nio.BufferOverflowException
at java.nio.charset.CoderResult.throwException(CoderResult.java:259)
at java.lang.StringCoding$CharsetSE.encode(StringCoding.java:338)
at java.lang.StringCoding.encode(StringCoding.java:372)
at java.lang.StringCoding.encode(StringCoding.java:378)
at java.lang.String.getBytes(String.java:608)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:176)
at java.io.FileOutputStream.<init>(FileOutputStream.java:131)
Submitted On 26-MAY-2004
Yasushi.Umezaki.Kana
I found that the following code reproduces the same error.
---------- BEGIN SOURCE ----------
import javax.mail.*;
import javax.mail.internet.*;
public class MimeTest
{
    public static void main(String[] args)
    {
        try{
            System.out.println("The string 'އ ' can be encoded successfully...");
            System.out.println(MimeUtility.encodeText("އ ", "ISO-2022-JP", "B"));
            System.out.println("But, the string 'އ' can not...");
            System.out.println(MimeUtility.encodeText("އ", "ISO-2022-JP", "B"));
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}
---------- END SOURCE ----------
======= Results ======================
The string 'އ ' can be encoded successfully...
=?ISO-2022-JP?B?GyRCO2cbKEIg?=
But, the string 'އ' can not...
java.nio.BufferOverflowException
at java.nio.charset.CoderResult.throwException(CoderResult.java:259)
at java.lang.StringCoding$CharsetSE.encode(StringCoding.java:343)
at java.lang.StringCoding.encode(StringCoding.java:374)
at java.lang.String.getBytes(String.java:573)
at javax.mail.internet.MimeUtility.doEncode(MimeUtility.java:635)
at javax.mail.internet.MimeUtility.encodeWord(MimeUtility.java:617)
at javax.mail.internet.MimeUtility.encodeText(MimeUtility.java:418)
at MimeTest.main(MimeTest.java:15)
Submitted On 26-MAY-2004
Yasushi.Umezaki.Kana
Sorry... the particular characters appear as garbage in my previous comments. Originally I wrote the particular Japanese character, 0x8e87.
Submitted On 02-JUL-2004
swisstom
I had the same problem with String.getBytes() throwing a java.nio.BufferOverflowException.
THANKS for the workaround! It works for me too!
Cheers!
Submitted On 02-JUL-2004
swisstom
PS: BTW, for the workaround... does it work for the border case of length 16777217? (I would guess NO)
shouldn't it be
if (output.length() > 16777216 && output.length() % 4 == 1)
instead of
if (output.length() > 16777217 && output.length() % 4 == 1)
??
(or >= instead of >)
Submitted On 13-JUL-2004
jacklty
It is a horrible bug.... took me few weeks to track it down..... Next time, I will check the bug database before digging into the code.........
Submitted On 30-JUL-2004
yoda22281
The same thing happens with StringBuffer.append when the size of the string buffer hits the aforementioned limit (16777217).
Submitted On 20-AUG-2004
jarouch
Simple new String(new byte[16777217]) causes exception too.. I tried it with last snapshot build (b60) and bug is still there..
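For anyone who wants to check a particular JVM, here is a minimal test harness along the lines of the reports above. The class and method names are invented for this sketch. It builds a string of the given length, calls getBytes() with the platform default charset, and reports whether a BufferOverflowException was thrown. On a fixed JVM it should simply report false; running it with the full length needs a few tens of megabytes of heap.

```java
import java.nio.BufferOverflowException;
import java.util.Arrays;

public class LargeStringRepro {
    // Smallest length reported as failing in this thread: 2^24 + 1.
    static final int FAILING_LENGTH = (1 << 24) + 1;

    // Returns true if getBytes() throws BufferOverflowException
    // for a string of the given length.
    static boolean overflows(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        String s = new String(chars);
        try {
            s.getBytes();
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow at " + FAILING_LENGTH + " chars: "
                + overflows(FAILING_LENGTH));
    }
}
```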
Submitted On 09-OCT-2005
oberserk
Thank you.
Submitted On 14-MAR-2006
moizd
Is there a plan to bring this fix to a Java 5 update release?
Submitted On 14-MAR-2006
tflora
It really sucks that this bug has not been fixed in 1.4.2. How can Sun justify leaving a bug like this in a critical release?
Is there any known charset that does not exhibit this problem?
Thanks,
Todd
Submitted On 25-APR-2006
/**
* String's getBytes() method instantiates the byte array
* with a size equal to the integer equivalent of the
* floating point equivalent of the length of the string.
* Similar to doing the following:
* <code>byte[] b = new byte[(int)((float) foo.length())]</code>
*
* Primitive floats only keep track of the 24 most significant bits.
* In order to avoid round off problems which could create
* the BufferOverflowException with extremely large strings,
* additional characters can be added so that the lost least significant
* bits are all 0's.
*
* This method takes a string, and returns the same string with as
* few as 0, and no more than 128 copies of c appended to the end,
* thus converting it into a getBytes() compatible string.
*
* @param foo A string, usually with length greater than 16777216
* @param c The char to add at the end of the string if required.
* @return A new string which won't cause a BufferOverflowException
* when getBytes() is called, with up to 128 copies of c appended.
* @throws Exception When the string is too long to allow any more
* characters to be added, and thus cannot be made getBytes() compatible.
*/
private String bug4949631(String foo, char c) throws Exception {
    if (foo.length() <= (int) Math.pow(2, 24)) return foo;
    if (foo.length() > (int) (Math.pow(2, 31) - 129))
        throw new Exception("The string is too long to make getBytes() compatible");
    // determine how many bits are being chopped off
    // on conversion to float
    int numberLSBLost = 7; // assume worst case
    int msbMask = (int) Math.pow(2, 30);
    int i = foo.length();
    while ((i & msbMask) == 0) {
        numberLSBLost--;
        i = i << 1;
    }
    // we want to add just enough chars to avoid rounding
    int lostBitsMask = (int) Math.pow(2, numberLSBLost) - 1;
    int lostBitsValue = foo.length() & lostBitsMask;
    // the lost bits are already all 0's, so nothing needs to be added
    if (lostBitsValue == 0) return foo;
    int numCharsToAdd = (lostBitsMask + 1) - lostBitsValue;
    char[] bar = new char[numCharsToAdd];
    // fill the padding with copies of c
    for (int j = 0; j < numCharsToAdd; j++) {
        bar[j] = c;
    }
    return foo + new String(bar);
}
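As a cross-check on the padding arithmetic above, the target length can also be derived from Math.ulp, which returns the spacing between adjacent float values (available since Java 5). This is a sketch under that assumption, with made-up class and method names. It is not a replacement for the method above and it only computes the padded length, not the padded string.

```java
public class FloatSafeLength {
    // Returns the smallest length >= n that survives an int -> float -> int
    // round trip unchanged. Returned as a long because the result can
    // exceed Integer.MAX_VALUE for lengths close to 2^31.
    static long paddedLength(int n) {
        if (n <= (1 << 24)) {
            return n; // every int up to 2^24 is exactly representable
        }
        // Spacing between neighbouring floats near n (2, 4, ..., 256).
        long spacing = (long) Math.ulp((float) n);
        // Round n up to the next multiple of that spacing.
        return ((n + spacing - 1) / spacing) * spacing;
    }

    public static void main(String[] args) {
        System.out.println(paddedLength(16777217)); // prints 16777218
        System.out.println(paddedLength(33554433)); // prints 33554436
    }
}
```

The number of characters to append is then paddedLength(n) - n, which should agree with what the method above adds.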
PLEASE NOTE: JDK6 is formerly known as Project Mustang