Main Tutorials

Java Sign Extension

Java multicast problem

In Java, Sign extension will kick in if we are doing the following stuff:

  • Widening primitive conversion – Cast from one type to another, which involves increasing the bits of the original type, for example, cast byte (8 bits) to int (32 bits).
  • Bitwise right shift by n bit >> n.

The sign extension is an operation of increasing the bits while preserves the number’s sign (positive and negative). The Sign extension fills the increased bits with the original of the most significant bit (msb) or the leftmost bit, except one condition – casting an unsigned char to other types, please scroll down to the 3. Java Multicast below for explanation.

This article also explains the famous Java multicast problem in the below code. Can you answer it?


  byte b = -1;                    // twos complement
  char c = (char) (b);            // byte -> int -> char (widening and narrowing type) (sign extension)
  System.out.println((int)c);     // char -> int (zero extension), output = 65535

  c = (char) (b & 0xff);          // byte -> int (sign extension)
  System.out.println((int) c);    // char -> int (zero extension), output = 255

1. Java Sign Extension – Widening Primitive Conversion

For example, we cast a byte (8 bits) to int (32 bits).

Positive Number
1.1 Here is a byte 10 in binary format, 8 bits (1 byte).


0000 1010 = 10

If we cast/widen the byte (8 bits) to an int (32 bits, 4 bytes), what are the values (1 or 0) for the extra bits?


# byte, 1 byte, 8 bits
0000 1010 = 10

# int, 4 bytes, 32 bits
???? ???? | ???? ???? | ???? ???? | 0000 1010 = 10

The answer is sign extension; it depends on the "original" most significant bit (msb) or the leftmost bit. In the above example, the msb of byte 10 or 0000 1010 is 0, and the sign extension fills all the extra unknown bits with zero.


{0}000 1010 , {0} = most significant bit or leftmost bit.

After we cast the byte 10 to int, the result is still 10.


0000 0000 | 0000 0000 | 0000 0000 | 0000 1010 = 10

  byte aByte = 10;
  int aByte1 = (int) aByte;
  System.out.println(aByte1);  // output = 10

Negative Number
1.2 For byte -10, Java uses two’s complement to represent the negative value.


# two's complement formula

0000 1010 = 10
1111 0101 (invert bits)
1111 0110 (add 1)
1111 0110 = -10

If we cast/widen the byte -10 (8 bits) to int (32 bits, 4 bytes), what are the values (1 or 0) for the extra bits?


???? ???? | ???? ???? | ???? ???? | 1111 0110 = -10

For byte -10 or 1111 0110, the most significant bit of 1111 0110 is 1, and the Sign extension fills all the unknown bits with one.


{1}111 0110 , {1} = most significant bit or leftmost bit.

After we cast the byte -10 to int, the result is still -10.


1111 1111 | 1111 1111 | 1111 1111 | 1111 0110

For two’s complement (leftmost 1 is negative, 0 is positive)


# two's complement formula

1111 1111 | 1111 1111 | 1111 1111 | 1111 0110 (this is a negative number)
1111 1111 | 1111 1111 | 1111 1111 | 1111 0101 (sub 1)
0000 0000 | 0000 0000 | 0000 0000 | 0000 1010 (invert bits)
0000 0000 | 0000 0000 | 0000 0000 | 0000 1010 = 10
1111 1111 | 1111 1111 | 1111 1111 | 1111 0110 = -10

  byte aByte = -10;
  int aByte1 = (int) aByte;
  System.out.println(aByte1); // output = -10

The hard part is understand the two’s complement.

2. Java Sign Extension – Bitwise Right Shift >>

The bitwise shift operator >>, also called arithmetic right shift, it preserves the sign (positive or negative numbers) after bits shift.

For example, here is an integer 10. In Java, int is 32 bits, 4 bytes.


  0000 0000 | 0000 0000 | 0000 0000 | 0000 1010

And we perform a right shift by 3 bits, 10 >> 3. For integer 10 or {0=msb}0001010, the most significant bit is zero, and the sign extension fills all the unknown bits with zero.


  ???0 0000 | 0000 0000 | 0000 0000 | 0000 0001 | 010 >> 3

The answer to 10 >> 3 is 1.


  0000 0000 | 0000 0000 | 0000 0000 | 0000 0001 = 1

  System.out.println(10>>3); // output = 1

3. Java Multicast – Sign Extension and Zero Extension.

So far, everything looks simple and straightforward. The real challenge is multicast, and it makes everything complicated.

Review the following Java multicast example (I found this code snippet from a forum if I do not remember wrong, the original is from the Java Puzzle book).


  byte b = -1;                    // twos complement
  char c = (char) (b);            // byte -> int -> char (widening and narrowing type) (sign extension)
  System.out.println((int)c);     // char -> int (zero extension), output = 65535

  c = (char) (b & 0xff);          // byte -> int (sign extension)
  System.out.println((int) c);    // char -> int (zero extension), output = 255

The above example involved multicast, from byte -> int -> char -> int, two’s complement, widening primitive conversion, narrowing primitive conversion, sign extension, zero extension, and also bitwise masking. Let break it down one by one.

3.1 Java uses two’s complement to represent negative.


  0000 0001 = 1
  1111 1110 = (invert bits)
  1111 1111 = (add 1)
  1111 1111 = -1 in twos complement

  byte b = 1111 1111

3.1 In Java byte is a signed byte, 8 bits; while char is an unsigned 2 bytes 16 bits. Java unable to cast a byte to a char directly (unable to cast a signed type to an unsigned type), so it cast/widen the byte to an int first and narrow down to char.


  byte b = -1;  
  char c = (char) (b);                           = byte -> int -> char

  1111 1111                                      = byte (8 bits)
  1111 1111 | 1111 1111 | 1111 1111 | 1111 1111  = int  (32 bits, sign extension)
  1111 1111 | 1111 1111                          = char (16 bits)

  char c = 1111 1111 | 1111 1111

3.2 This (int)c is interesting; it cast a char 2 bytes to an int 4 bytes. First, we may think the Sign extension will perform and fills the increased bit with the original most significant bit, which is one.


  char c = 1111 1111 | 1111 1111
  int(c) = ???? ???? | ???? ???? | 1111 1111 | 1111 1111
  int(c) = 1111 1111 | 1111 1111 | 1111 1111 | 1111 1111 (sign extension) ??? but why the output is 65535?

The answer is no, in the above case, Java uses zero extension to fill the increased bit with ZERO.

Char -> Any type = Zero Extension
Please remember char is an unsigned type, if we cast an unsigned type char to any other types, zero extension is used, not "Sign extension".

What is Zero Extension?
The zero extension fills all the increased bits with ZERO. This zero extension also applied to the Logical right shift operator >>>.

Below is the correct answer to int(c).


char c = 1111 1111 | 1111 1111
int(c) = ???? ???? | ???? ???? | 1111 1111 | 1111 1111
int(c) = 0000 0000 | 0000 0000 | 1111 1111 | 1111 1111 (zero extension on unsigned char), the output is 65535.

3.4 Now, we look at the following statement.


  c = (char) (b & 0xff);          // mask, int -> char
  System.out.println((int) c);    // char -> int, output: 255, but why?

Rewrite it to


  int i = (b & 0xff)
  c = (char)i
  System.out.println((int) c);

The b & 0xff will return a int. Java cast the byte to int and performs a bitwise & with this 0xff, aka masking.


  byte b      = 1111 1111

  int         = 1111 1111 | 1111 1111 | 1111 1111 | 1111 1111  (sign extension)
                &
  0xff        = 0000 0000 | 0000 0000 | 0000 0000 | 1111 1111

  int i       = 0000 0000 | 0000 0000 | 0000 0000 | 1111 1111

  c = (char)i = 0000 0000 | 1111 1111                          (narrow down, int -> char)

The last one (int) c, cast an unsigned char to int, zero extension.


  c           = 0000 0000 | 1111 1111

  (int)c      = ???? ???? | ???? ???? | 0000 0000 | 1111 1111  (zero extension)

  (int)c      = 0000 0000 | 0000 0000 | 0000 0000 | 1111 1111  

In two’s complement, first leftmost defined signed, 1 is a negative number, 0 is a positive number.


  0000 0000 | 0000 0000 | 0000 0000 | 1111 1111
  = 1x2^0 +... 1x2^8
    1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255

Done. Thanks for reading this long article.

Note
Let me know if you found any typos, errors, or wrong explanation.

References

About Author

author image
Founder of Mkyong.com, love Java and open source stuff. Follow him on Twitter. If you like my tutorials, consider make a donation to these charities.

Comments

Subscribe
Notify of
1 Comment
Most Voted
Newest Oldest
Inline Feedbacks
View all comments
Panos G
3 years ago

its 1×2^0 +… 1×2^7