有没有办法从StringBuilder
创建byte[]
?
我希望使用StringBuilder
来提高内存使用率,但我首先使用的是byte[]
,因此我必须从String
创建byte[]
,然后创建来自StringBuilder
的{{1}}我并未将此解决方案视为最佳解决方案。
由于
答案 0 :(得分:13)
基本上,您最好的选择似乎是直接使用CharsetDecoder。
以下是:
byte[] srcBytes = getYourSrcBytes();
//Whatever charset your bytes are endoded in
Charset charset = Charset.forName("UTF-8");
CharsetDecoder decoder = charset.newDecoder();
//ByteBuffer.wrap simply wraps the byte array, it does not allocate new memory for it
ByteBuffer srcBuffer = ByteBuffer.wrap(srcBytes);
//Now, we decode our srcBuffer into a new CharBuffer (yes, new memory allocated here, no can do)
CharBuffer resBuffer = decoder.decode(srcBuffer);
//CharBuffer implements CharSequence interface, which StringBuilder fully support in it's methods
StringBuilder yourStringBuilder = new StringBuilder(resBuffer);
<强>增加:强>
经过一些测试后,似乎简单的new String(bytes)
要快得多,似乎没有简单的方法可以让它更快。这是我跑的测试:
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.text.ParseException;
public class ConsoleMain {
public static void main(String[] args) throws IOException, ParseException {
StringBuilder sb1 = new StringBuilder("abcdefghijklmnopqrstuvwxyz");
for (int i=0;i<19;i++) {
sb1.append(sb1);
}
System.out.println("Size of buffer: "+sb1.length());
byte[] src = sb1.toString().getBytes("UTF-8");
StringBuilder res;
long startTime = System.currentTimeMillis();
res = testStringConvert(src);
System.out.println("Conversion using String time (msec): "+(System.currentTimeMillis()-startTime));
if (!res.toString().equals(sb1.toString())) {
System.err.println("Conversion error");
}
startTime = System.currentTimeMillis();
res = testCBConvert(src);
System.out.println("Conversion using CharBuffer time (msec): "+(System.currentTimeMillis()-startTime));
if (!res.toString().equals(sb1.toString())) {
System.err.println("Conversion error");
}
}
private static StringBuilder testStringConvert(byte[] src) throws UnsupportedEncodingException {
String s = new String(src, "UTF-8");
StringBuilder b = new StringBuilder(s);
return b;
}
private static StringBuilder testCBConvert(byte[] src) throws CharacterCodingException {
Charset charset = Charset.forName("UTF-8");
CharsetDecoder decoder = charset.newDecoder();
ByteBuffer srcBuffer = ByteBuffer.wrap(src);
CharBuffer resBuffer = decoder.decode(srcBuffer);
StringBuilder b = new StringBuilder(resBuffer);
return b;
}
}
结果:
Size of buffer: 13631488
Conversion using String time (msec): 91
Conversion using CharBuffer time (msec): 252
在IDEONE上修改(耗费更少内存)版本:Here。
答案 1 :(得分:4)
如果它是您想要的简短陈述,则无法绕过String之间的步骤。为了方便起见,String构造函数混合了转换和对象构造,但是在StringBuilder中没有这样的便利构造函数。
如果它是您感兴趣的性能,那么您可以通过使用以下内容来避免使用中间String对象:
new StringBuilder(Charset.forName(charsetName).decode(ByteBuffer.wrap(inBytes)))
如果您希望能够微调性能,可以自己控制解码过程。例如,您可能希望避免使用太多内存,方法是使用averageCharsPerByte作为对所需内存量的估计。如果估计值太短,则可以使用生成的StringBuilder来累积所有部分,而不是调整缓冲区的大小。
CharsetDecoder cd = Charset.forName(charsetName).newDecoder();
cd.onMalformedInput(CodingErrorAction.REPLACE);
cd.onUnmappableCharacter(CodingErrorAction.REPLACE);
int lengthEstimate = Math.ceil(cd.averageCharsPerByte()*inBytes.length) + 1;
ByteBuffer inBuf = ByteBuffer.wrap(inBytes);
CharBuffer outBuf = CharBuffer.allocate(lengthEstimate);
StringBuilder out = new StringBuilder(lengthEstimate);
CoderResult cr;
while (true) {
cr = cd.decode(inBuf, outBuf, true);
out.append(outBuf);
outBuf.clear();
if (cr.isUnderflow()) break;
if (!cr.isOverflow()) cr.throwException();
}
cr = cd.flush(outBuf);
if (!cr.isUnderflow()) cr.throwException();
out.append(outBuf);
我怀疑上面的代码在大多数应用程序中都是值得的。如果应用程序对性能感兴趣,它可能不应该处理StringBuilder,而是处理缓冲区级别的所有内容。