Base64 variants
Base64 encoding is an algorithm that converts binary data into ASCII text. The resulting string consists of the characters A-Z, a-z, 0-9, the two extra characters '+' (plus) and '/' (slash), and the padding character '=' (equals). The conversion is not done the same way every time; there are a few variants of it.
- Simple (basic) encoding creates a single longlonglonglonglonglonglonglonglonglonglonglooooong= Base64-encoded line.
- Fixed line length encoding, sometimes also called MIME Base64 encoding, chunked encoding, or simply line folding. Instead of a single long line, it produces multiple lines, usually 76 characters long. This matters because it is mandatory in some scenarios (binary attachments) while harmful in others (the Basic authentication header).
- URL (safe) encoding produces a string that can be used as a parameter value in a URL. Because '+' and '/' have a special meaning there, they are replaced by '-' and '_', while the '=' padding character is usually removed.
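The three variants above can be sketched with java.util.Base64 from Java 8 (one of the encoders listed further down). This is only an illustration: the class name, the method names and the sample bytes are made up for the example, and the two-byte input was picked so that both extra characters show up.

```java
import java.util.Base64;

class Base64VariantsDemo {
    // Basic variant: one unbroken line, '+' and '/' alphabet, '=' padding.
    static String basic(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // MIME variant: same alphabet, folded into lines of at most 76 characters
    // separated by CR/LF.
    static String mime(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    // URL-safe variant: '-' and '_' replace '+' and '/'.
    // Padding is dropped explicitly here; the encoder keeps it by default.
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    public static void main(String[] args) {
        byte[] tricky = {(byte) 0xfb, (byte) 0xff}; // sample bytes, chosen for the example
        System.out.println(basic(tricky));   // +/8=
        System.out.println(urlSafe(tricky)); // -_8

        byte[] longInput = new byte[100];    // 100 bytes encode to 136 characters
        System.out.println(mime(longInput).split("\r\n").length); // 2
    }
}
```

The same 100 bytes pushed through basic() come back as one 136-character line, which is exactly the difference between the first two variants.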
Now back to the initial question... How many Base64 encoders/decoders are present in the Oracle (Sun) JDK?
I found 6 of them:
sun.misc.BASE64Encoder Since Java 1.0? Well, we all know that we should not touch anything from the sun.* or com.sun.* packages. So we don't.
javax.xml.bind.DatatypeConverter Since Java 1.6 - This one actually works, but it offers only basic encoding. No MIME or URL encoding.
java.util.Base64 Since Java 1.8 - Finally, a generally usable Base64 encoder/decoder that supports basic, MIME and URL-safe encoding.
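The decoders in java.util.Base64 come in the same three flavours, and they are not interchangeable. A small sketch, assuming Java 8 or later (the class name, method names and sample text are invented for the example), shows the basic decoder rejecting line-folded input that the MIME decoder reads without complaint:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64DecoderDemo {
    // Returns true if the basic decoder accepts the text, false if it rejects it.
    static boolean basicDecoderAccepts(String encoded) {
        try {
            Base64.getDecoder().decode(encoded);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown for characters outside the Base64 alphabet, such as CR or LF.
            return false;
        }
    }

    // The MIME decoder skips line separators and other non-alphabet characters.
    static String decodeMime(String encoded) {
        return new String(Base64.getMimeDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Hello, World!" with a CR/LF inserted in the middle of the encoded text
        String folded = "SGVsbG8s\r\nIFdvcmxkIQ==";
        System.out.println(basicDecoderAccepts(folded)); // false
        System.out.println(decodeMime(folded));          // Hello, World!
    }
}
```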
And finally, some curiosities illustrating how even Sun/Oracle JDK/JRE contributors were missing a Base64 encoder, so they created their own.
java.util.prefs.Base64 Since Java 1.4, but it has default (package) visibility and is therefore not usable.
com.sun.net.httpserver.Base64 Since Java 1.6, but it has default (package) visibility and is therefore not usable.
com.sun.org.apache.xml.internal.security.utils.Base64 - A similar story to sun.misc.BASE64Encoder; it also internally uses XMLUtils.ignoreLineBreaks() to perform line folding...
Let's see some encoding results
Both commons-codec 1.6+ and the Java 8 java.util.Base64 can produce and consume any of the mentioned Base64 variants, but beware of quite different encoding results. I think a lot of headaches come from that.
In the following test, commons-codec 1.9 and Java 8u5 are used.
Mime (chunked) encoding

import org.apache.commons.codec.binary.Base64;

String string = "This string encoded will be longer that 76 characters and cause MIME base64 line folding";
byte[] encodeBase64Chunked = Base64.encodeBase64Chunked(string.getBytes());
System.out.println("commons-codec Base64.encodeBase64Chunked\n" + new String(encodeBase64Chunked));
String encodeMimeToString = java.util.Base64.getMimeEncoder().encodeToString(string.getBytes());
System.out.println("java.util.Base64.getMimeEncoder().encodeToString\n" + encodeMimeToString);

prints
commons-codec Base64.encodeBase64Chunked
VGhpcyBzdHJpbmcgZW5jb2RlZCB3aWxsIGJlIGxvbmdlciB0aGF0IDc2IGNoYXJhY3RlcnMgYW5k
IGNhdXNlIE1JTUUgYmFzZTY0IGxpbmUgZm9sZGluZw==

java.util.Base64.getMimeEncoder().encodeToString
VGhpcyBzdHJpbmcgZW5jb2RlZCB3aWxsIGJlIGxvbmdlciB0aGF0IDc2IGNoYXJhY3RlcnMgYW5k
IGNhdXNlIE1JTUUgYmFzZTY0IGxpbmUgZm9sZGluZw==
The Java 8 MIME encoder output ends with the '==' padding and, unlike the commons-codec output, has no final line break (CR/LF) after it!
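If a consumer insists on that final line break, one possible workaround is to append it by hand. The sketch below assumes Java 8 or later; mimeEncodeWithTrailingCrlf is a hypothetical helper written for this example, not part of any library, and leaving empty output untouched is simply a choice made here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class MimeTrailingNewlineDemo {
    static final String CRLF = "\r\n";

    // Plain java.util.Base64 MIME output: folded at 76 characters,
    // with no line break after the last line.
    static String mimeEncode(byte[] data) {
        return Base64.getMimeEncoder().encodeToString(data);
    }

    // Hypothetical helper: appends a final CR/LF to non-empty output.
    static String mimeEncodeWithTrailingCrlf(byte[] data) {
        String encoded = mimeEncode(data);
        return encoded.isEmpty() ? encoded : encoded + CRLF;
    }

    public static void main(String[] args) {
        byte[] data = "This string encoded will be longer that 76 characters and cause MIME base64 line folding"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(mimeEncode(data).endsWith(CRLF));                 // false
        System.out.println(mimeEncodeWithTrailingCrlf(data).endsWith(CRLF)); // true
    }
}
```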
URL (safe) encoding

String string = "ůůůůů";
String encodeUrlToString = java.util.Base64.getUrlEncoder().encodeToString(string.getBytes());
System.out.println("java.util.Base64.getUrlEncoder().encodeToString\n" + encodeUrlToString);
String encodeBase64URLSafeString = Base64.encodeBase64URLSafeString(string.getBytes());
System.out.println("commons-codec Base64.encodeBase64URLSafeString\n" + encodeBase64URLSafeString);

prints
java.util.Base64.getUrlEncoder().encodeToString
xa_Fr8Wvxa_Frxc=
commons-codec Base64.encodeBase64URLSafeString
xa_Fr8Wvxa_Frxc

The Java 8 URL encoder leaves the '=' padding at the end of the result, which makes it unusable as a URL parameter value as-is!
UPDATE: This was reported a while ago, and it turned out that any Encoder can be switched to non-padding mode using the withoutPadding() method.
String string = "ůůůůů";
String encodeUrlToString = java.util.Base64.getUrlEncoder().withoutPadding().encodeToString(string.getBytes());
System.out.println("java.util.Base64.getUrlEncoder().withoutPadding().encodeToString\n" + encodeUrlToString);

prints
java.util.Base64.getUrlEncoder().withoutPadding().encodeToString
xa_Fr8Wvxa_Frw
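Dropping the padding does not get in the way of decoding: the java.util.Base64 decoders accept the '=' characters but do not require them. Here is a round-trip sketch, assuming Java 8 or later and an explicit UTF-8 charset (the class and method names are invented for the example, and the sample string is written with \u escapes so the source file encoding does not matter):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class UrlSafeRoundTripDemo {
    // Encodes text as URL-safe Base64 without trailing '=' padding.
    static String encode(String text) {
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes URL-safe Base64; works for both padded and unpadded input.
    static String decode(String token) {
        return new String(Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u016f\u016f\u016f\u016f\u016f"; // five times 'ů'
        String token = encode(original);
        System.out.println(token);                          // xa_Fr8Wvxa_Frw
        System.out.println(decode(token).equals(original)); // true
    }
}
```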
Note: In the quite old commons-codec 1.4, chunking was inconsistently turned on by default for the encode() method, resulting in nasty surprises. See Jira ticket.
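A Basic authentication header is the typical place where such a surprise hurts. One way to guard against it is to build the header with the basic encoder of java.util.Base64 and check the result for line breaks anyway. This is only a sketch under those assumptions: basicAuthHeader is a hypothetical helper, and the credentials are the well-known sample pair from the HTTP authentication RFCs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicAuthHeaderDemo {
    // Builds the value of an HTTP "Authorization" header for Basic authentication.
    static String basicAuthHeader(String user, String password) {
        byte[] credentials = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        String encoded = Base64.getEncoder().encodeToString(credentials);
        // Defensive check: a folded (chunked) value would corrupt the header.
        if (encoded.indexOf('\r') >= 0 || encoded.indexOf('\n') >= 0) {
            throw new IllegalStateException("Base64 output unexpectedly contains a line break");
        }
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        System.out.println(basicAuthHeader("Aladdin", "open sesame"));
        // Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
    }
}
```

With the basic encoder the check should never fire, even for credentials longer than 57 bytes; it is there to catch the case where someone later swaps in a chunking encoder.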
Happy Base64 encoding