Java – Convert String to Binary

This article shows five examples of converting a string into its binary string representation, and vice versa.

  1. Convert String to Binary – Integer.toBinaryString
  2. Convert String to Binary – Bit Masking
  3. Convert Binary to String – Integer.parseInt
  4. Convert Unicode String to Binary.
  5. Convert Binary to Unicode String.

1. Convert String to Binary – Integer.toBinaryString

The steps to convert a string to its binary format:

  1. Convert the string to a char[].
  2. Loop over the char[].
  3. Call Integer.toBinaryString(aChar) to convert each char to a binary string.
  4. Use String.format to left-pad the result with zeros if needed.
StringToBinaryExample1.java

package com.mkyong.crypto.bytes;

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class StringToBinaryExample1 {

    public static void main(String[] args) {

        String input = "Hello";
        String result = convertStringToBinary(input);

        System.out.println(result);

        // pretty print the binary format
        System.out.println(prettyBinary(result, 8, " "));

    }

    public static String convertStringToBinary(String input) {

        StringBuilder result = new StringBuilder();
        char[] chars = input.toCharArray();
        for (char aChar : chars) {
            result.append(
                    String.format("%8s", Integer.toBinaryString(aChar))   // char -> int, auto-cast
                            .replaceAll(" ", "0")                         // zero pads
            );
        }
        return result.toString();

    }

    public static String prettyBinary(String binary, int blockSize, String separator) {

        List<String> result = new ArrayList<>();
        int index = 0;
        while (index < binary.length()) {
            result.add(binary.substring(index, Math.min(index + blockSize, binary.length())));
            index += blockSize;
        }

        return result.stream().collect(Collectors.joining(separator));
    }
}

Output

Terminal

0100100001100101011011000110110001101111
01001000 01100101 01101100 01101100 01101111
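
Note that Integer.toBinaryString(aChar) fits into 8 digits only when the char value is below 256; a character such as 你 produces 15 digits, so the 8-digit padding no longer gives fixed-width blocks. Below is a minimal sketch, not part of the original example, that pads every char to 16 digits instead. The class name CharTo16BitBinarySketch and the helper toBinary16 are hypothetical names chosen for illustration.

```java
class CharTo16BitBinarySketch {

    // Hypothetical helper: pads every UTF-16 code unit (char) to 16 binary digits.
    static String toBinary16(String input) {
        StringBuilder result = new StringBuilder();
        for (char c : input.toCharArray()) {
            String bits = Integer.toBinaryString(c);      // char -> int, no leading zeros
            for (int i = bits.length(); i < 16; i++) {
                result.append('0');                       // left-pad with zeros
            }
            result.append(bits);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary16("a"));        // 0000000001100001
        System.out.println(toBinary16("\u4F60"));   // 0100111101100000 (the character 你)
    }
}
```

The result describes Java's internal UTF-16 code units, not UTF-8 bytes, so it differs from the byte-based examples later in this article.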

2. Convert String to Binary – Bit Masking.

2.1 This Java example uses the bit masking technique to generate the binary format of each 8-bit byte.

StringToBinaryExample2.java

package com.mkyong.crypto.bytes;

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class StringToBinaryExample2 {

    public static void main(String[] args) {

        String input = "a";
        String result = convertByteArraysToBinary(input.getBytes(StandardCharsets.UTF_8));
        System.out.println(prettyBinary(result, 8, " "));

    }

    public static String convertByteArraysToBinary(byte[] input) {

        StringBuilder result = new StringBuilder();
        for (byte b : input) {
            int val = b;
            for (int i = 0; i < 8; i++) {
                result.append((val & 128) == 0 ? 0 : 1);      // 128 = 1000 0000
                val <<= 1;
            }
        }
        return result.toString();

    }

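    public static String prettyBinary(String binary, int blockSize, String separator) {

        // same as example 1: split into blocks of blockSize, then join with the separator
        List<String> result = new ArrayList<>();
        int index = 0;
        while (index < binary.length()) {
            result.add(binary.substring(index, Math.min(index + blockSize, binary.length())));
            index += blockSize;
        }

        return result.stream().collect(Collectors.joining(separator));
    }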
}

Output

Terminal

  01100001

The hard part is the following loop. The idea is similar to the one in the article Java – Convert Integer to Binary, which also uses bit masking. In Java, a byte is 8 bits and an int is 32 bits; the integer 128 is 1000 0000 in binary.


for (byte b : input) {
    int val = b;                                      // byte -> int
    for (int i = 0; i < 8; i++) {
        result.append((val & 128) == 0 ? 0 : 1);      // 128 = 1000 0000
        val <<= 1;                                    // val = val << 1
    }
}

The & is the bitwise AND operator; only 1 & 1 is 1, all other combinations are 0.


1 & 1 = 1
1 & 0 = 0
0 & 1 = 0
0 & 0 = 0

The val <<= 1 is shorthand for val = val << 1. The << is the left shift operator; it moves all bits to the left by 1 bit.

Review the following trace; let’s assume val is an int holding the byte that represents the character a.


00000000 | 00000000 | 00000000 | 01100001   # val  = a in binary
00000000 | 00000000 | 00000000 | 10000000   # 128
&                                           # bitwise AND
00000000 | 00000000 | 00000000 | 00000000   # (val & 128) == 0 ? 0 : 1, result = 0

00000000 | 00000000 | 00000000 | 11000010   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&                                           # bitwise AND
00000000 | 00000000 | 00000000 | 10000000   # (val & 128) == 0 ? 0 : 1, result = 1

00000000 | 00000000 | 00000001 | 10000100   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&
00000000 | 00000000 | 00000000 | 10000000   # result = 1

00000000 | 00000000 | 00000011 | 00001000   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&
00000000 | 00000000 | 00000000 | 00000000   # result = 0

00000000 | 00000000 | 00000110 | 00010000   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&
00000000 | 00000000 | 00000000 | 00000000   # result = 0

00000000 | 00000000 | 00001100 | 00100000   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&
00000000 | 00000000 | 00000000 | 00000000   # result = 0

00000000 | 00000000 | 00011000 | 01000000   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&
00000000 | 00000000 | 00000000 | 00000000   # result = 0

00000000 | 00000000 | 00110000 | 10000000   # val << 1
00000000 | 00000000 | 00000000 | 10000000   # 128
&
00000000 | 00000000 | 00000000 | 10000000   # result = 1

# collect all bits                          # 01100001

For string a the binary string is 01100001.
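
The same bits can also be read the other way around: instead of shifting the value left and testing against 128, shift the value right and test against 1. The following is a minimal sketch of that variant, not the code used in this article; the class name ShiftRightMaskSketch and the method toBinary are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ShiftRightMaskSketch {

    // Hypothetical variant: reads each byte from bit 7 (most significant) down to bit 0.
    static String toBinary(byte[] input) {
        StringBuilder result = new StringBuilder();
        for (byte b : input) {
            for (int i = 7; i >= 0; i--) {
                result.append((b >> i) & 1);   // move bit i to position 0, keep only that bit
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary("a".getBytes(StandardCharsets.UTF_8)));   // 01100001
    }
}
```

Both variants look only at the lowest 8 bits of the int, so they also work for negative byte values such as the bytes of a multi-byte UTF-8 character.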

3. Convert Binary to String.

In Java, we can use Integer.parseInt(str, 2) to convert each 8-bit binary block to an int, and then turn that int into a character.

StringToBinaryExample3.java

package com.mkyong.crypto.bytes;

import java.util.Arrays;
import java.util.stream.Collectors;

public class StringToBinaryExample3 {

    public static void main(String[] args) {

        String input = "01001000 01100101 01101100 01101100 01101111";

        // Java 8 streams make life easier; note that Character.toString(int) needs Java 11+
        String raw = Arrays.stream(input.split(" "))
                .map(binary -> Integer.parseInt(binary, 2))
                .map(Character::toString)
                .collect(Collectors.joining()); // cut the space

        System.out.println(raw);

    }

}

Output

Terminal

  Hello
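
The example above expects the blocks to be separated by spaces. When the binary string has no separators, as in the first line of output in example 1, it can be cut into blocks of 8 digits instead. The following is a minimal sketch under that assumption; the class name BinaryChunksToStringSketch and the helper fromBinary are hypothetical, and every block is assumed to hold one character below 256.

```java
class BinaryChunksToStringSketch {

    // Hypothetical helper: reads the input 8 digits at a time.
    // Trailing digits that do not fill a whole block are ignored.
    static String fromBinary(String binary) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i + 8 <= binary.length(); i += 8) {
            int code = Integer.parseInt(binary.substring(i, i + 8), 2);   // binary -> int
            result.append((char) code);                                   // int -> char
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(fromBinary("0100100001100101011011000110110001101111"));   // Hello
    }
}
```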

4. Convert Unicode String to Binary.

Java String supports Unicode, so it can represent non-English characters, and we can use the same bit masking technique to convert a Unicode string to a binary string.

This example converts a single Chinese character 你 (it means you in English) to a binary string.

UnicodeToBinary1.java

package com.mkyong.crypto.bytes;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class UnicodeToBinary1 {

    public static void main(String[] args) {

        byte[] input = "你".getBytes(StandardCharsets.UTF_8);
        System.out.println(input.length);                       // 3, 1 Chinese character = 3 bytes
        String binary = convertByteArraysToBinary(input);
        System.out.println(binary);
        System.out.println(prettyBinary(binary, 8, " "));

    }

    public static String convertByteArraysToBinary(byte[] input) {

        StringBuilder result = new StringBuilder();
        for (byte b : input) {
            int val = b;
            for (int i = 0; i < 8; i++) {
                result.append((val & 128) == 0 ? 0 : 1);      // 128 = 1000 0000
                val <<= 1;
            }
        }
        return result.toString();

    }

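    public static String prettyBinary(String binary, int blockSize, String separator) {

        // same as example 1: split into blocks of blockSize, then join with the separator
        List<String> result = new ArrayList<>();
        int index = 0;
        while (index < binary.length()) {
            result.add(binary.substring(index, Math.min(index + blockSize, binary.length())));
            index += blockSize;
        }

        return result.stream().collect(Collectors.joining(separator));
    }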
}

Output

Terminal

3
111001001011110110100000
11100100 10111101 10100000

Different Unicode characters need a different number of bytes. In UTF-8 a character takes 1 to 4 bytes; common Chinese characters such as 你 take 3 bytes, while some rarer ones take 4.
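
To see how the UTF-8 byte count varies, here is a minimal sketch that prints the length of a few sample characters. The class name Utf8ByteCountSketch and the helper utf8Length are hypothetical; the characters are written as Unicode escapes so the result does not depend on the source file encoding.

```java
import java.nio.charset.StandardCharsets;

class Utf8ByteCountSketch {

    // Hypothetical helper: number of bytes the string occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("a"));              // 1, ASCII letter
        System.out.println(utf8Length("\u00E9"));         // 2, Latin letter e with acute accent
        System.out.println(utf8Length("\u4F60"));         // 3, the character 你
        System.out.println(utf8Length("\uD840\uDC00"));   // 4, U+20000, a rarer Chinese character
    }
}
```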

5. Convert Binary to Unicode String.

The comments in the code explain each step.

UnicodeToBinary2.java

package com.mkyong.crypto.bytes;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class UnicodeToBinary2 {

    public static void main(String[] args) {

        String binary = "111001001011110110100000";     // 你, Chinese character
        String result = binaryUnicodeToString(binary);
        System.out.println(result.trim());

    }

    // the input must fit in an int (4 bytes); Integer.parseInt also rejects 32-bit input that starts with 1
    public static String binaryUnicodeToString(String binary) {

        byte[] array = ByteBuffer.allocate(4).putInt(   // 4 bytes byte[]
                Integer.parseInt(binary, 2)
        ).array();

        return new String(array, StandardCharsets.UTF_8);
    }

}

Output

Terminal

你
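
The ByteBuffer approach goes through a single int, so it cannot handle more than 4 bytes of input. A minimal sketch of an alternative, assuming the input contains only 0 and 1 and its length is a multiple of 8: turn every 8-digit block into one byte and decode the whole byte array as UTF-8. The class name BinaryToUtf8StringSketch and the helper fromBinary are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class BinaryToUtf8StringSketch {

    // Hypothetical helper: one byte per 8-digit block, then a single UTF-8 decode.
    static String fromBinary(String binary) {
        byte[] bytes = new byte[binary.length() / 8];
        for (int i = 0; i < bytes.length; i++) {
            String block = binary.substring(i * 8, i * 8 + 8);
            bytes[i] = (byte) Integer.parseInt(block, 2);   // 0..255, narrowed to a signed byte
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 你 (3 bytes), then a space and the letter b (1 byte each)
        String binary = "111001001011110110100000" + "00100000" + "01100010";
        System.out.println(fromBinary(binary));   // prints 你, a space, then b
    }
}
```

Because no padding bytes are added, there is nothing to trim afterwards, so a space in the input survives the round trip.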

About Author

Founder of Mkyong.com, love Java and open source stuff. Follow him on Twitter. If you like my tutorials, consider making a donation to these charities.

Comments

Wolkenfarmer
3 years ago

Your 5th code example just works for every String that is encoded in 4 bytes and I can’t quite figure out how to decode Strings that are encoded in 5 or more bytes. Help appreciated.

But still: Thank you for your great explanation overall – really helped me a lot so far! 🙂

Wolkenfarmer
3 years ago
Reply to  mkyong

No, but I want to convert a normal Unicode String with multiple characters into binary and then back. Therefore, I need to be able to read more than 4 bytes. However, when I allocate 5 or more bytes in the Byte buffer, it will still just work for 4 Bytes (like “abcd” or “你b”) or else will give a NumberFormatException at the parseInt().

I had the idea to first read in the first 4 Bytes, then decode the first character and then fill the 4 Bytes up again until every character is translated back. However, for this method I would need to know how many bytes one translated character used or a way of even only decoding just one character.

Do you have any suggestions?

Vincent
3 years ago
Reply to  Wolkenfarmer

Read this wolkenfarmer.
https://www.futurelearn.com/courses/representing-data-with-images-and-sound/0/steps/53135
It tells you about byte structures and how to recognize them. In short, bytes starting with 110 (2 bytes), 1110 (3 bytes) and 11110 (4 bytes) are the things to look for. Following bytes (belonging to the first byte) all start with 10.
I suggest making a method that splits your binary data into array elements where every element is one byte (8 bits) in string format. You pass this as an argument to the function binaryUnicodeToString.
However, while splitting your binary data into chunks of 8, you have to look for these byte patterns (110, 1110 and 11110). As soon as you see these, you have to add these byte (strings) together before feeding them to binaryUnicodeToString. Also don’t forget to make the adjustment I suggested in my first response, otherwise you cannot convert a space.
Good luck.

Wolkenfarmer
3 years ago
Reply to  Vincent

A big “Thank you” towards you two!
It’s finally up and running.. here my so far working code for decoding binary String to Unicode String:

String[] stringBA = stringBinary.split("-");
String stringBConvert = new String();

for (int i = 0; i < stringBA.length; i++) {
   System.out.println("____LoopDieLoop");
   if (stringBA[i].startsWith("110")) {
      stringBConvert = stringBA[i] + stringBA[i + 1];
      i++;
   } else if (stringBA[i].startsWith("1110")) {
      stringBConvert = stringBA[i] + stringBA[i + 1] + stringBA[i + 2];
      i += 2;
   } else if (stringBA[i].startsWith("11110")) {
      stringBConvert = stringBA[i] + stringBA[i + 1] + stringBA[i + 2] + stringBA[i + 3];
      i += 3;
   } else {
      stringBConvert = stringBA[i];
   }
   byte[] array = ByteBuffer.allocate(4).putInt(Integer.parseInt(stringBConvert, 2)).array();

   if (stringUnicode != null) {
      stringUnicode = stringUnicode + new String(array, StandardCharsets.UTF_8).substring(4 - stringBConvert.length() / 8);
   } else {
      stringUnicode = new String(array, StandardCharsets.UTF_8).substring(4 - stringBConvert.length() / 8);
   }
}

Vincent
3 years ago
Reply to  Wolkenfarmer

Glad to be of help! I did exactly the same as you btw, but there is an out of bound error possible. There could be situations where the data stream is corrupted and this will cause a crash because the program will try to access array elements which aren’t there. Something to solve when it happens 😁

Vincent
3 years ago

Great explanation, thanks! But your code (at 5, binaryUnicodeToString) fails on the conversion of a “00100000” (space). 
It will return an empty string due to the required trim function needed after the call to that method.
Any suggestions to improve this would be very appreciated! 

Vincent
3 years ago
Reply to  Vincent

Found a solution, just replace the last line in binaryUnicodeToString with this:

return new String(array, StandardCharsets.UTF_8).substring(4-binary.length()/8);

No need to trim the return value (so remove the trim function from the method(s) calling binaryUnicodeToString) and if there is a space to convert, it will not be trimmed.