
Endianness – Little or Big: Learn Byte Organization in Memory

Introduction

When I was first introduced to the term endianness, it sounded more like a human trait than a property of computers. So I looked it up and found that the term comes from an old novel, Gulliver’s Travels, published in 1726. I was curious enough to read the novel to see where this funny term came from. It turns out that the people in the story were fighting over which end of an egg is the proper one to crack, the big end or the little end. Society was divided into two sects: those who cracked their eggs at the big end and those who preferred the little end.

 

In computer memory, endianness is the order in which the bytes of a multi-byte value are stored. If you are not clear about bits, bytes and words, please read our last post, Bits, Bytes and Memory – Basics of Computer Storage Unit, before moving ahead. Trust me, it’ll help. Still, here is a quick revision for those who have already read that post or already understand the basics of the computer storage unit:

Bit – It is a binary digit and a basic unit of information in computing. It’s represented by either 0 or 1.

Byte – It’s the smallest addressable unit in memory on most machines. 8 bits together make a byte.

Word – It’s the biggest chunk of bits that a processor can operate on (for example, add or subtract) at a time.

So the main question still remains….

How are words stored in memory?

In a 32-bit architecture, a word is 32 bits, which means 4 bytes. So storing a 32-bit integer in memory requires 4 bytes of space. But as we know, each memory address holds a single byte, not 4 bytes. The answer is simple: we break the value up. Split the 32-bit binary value of the integer into four 8-bit parts (1 byte each) and store them at successive addresses.

For example, the number 305419896, whose binary form is 10010001101000101011001111000 (padded with three leading zeros to make 32 bits), looks like this when divided into 4 parts:

00010010
00110100
01010110
01111000

Converted to their corresponding hex values, these become 0x12, 0x34, 0x56 and 0x78. You know how, right? If not, please read how to convert binary to hex here: Hex to Binary
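The split described above can be sketched in C with shifts and masks. This is only an illustration: the function names split_bytes and print_parts are made up for this post, and the code works on the value itself, so it gives the same result on any machine regardless of how that machine stores integers.

```c
#include <stdint.h>
#include <stdio.h>

/* Split a 32-bit value into four 8-bit parts, most significant part first.
   (split_bytes is a hypothetical helper name, not a library function.) */
void split_bytes(uint32_t value, uint8_t parts[4])
{
    parts[0] = (uint8_t)((value >> 24) & 0xFF); /* bits 31..24 */
    parts[1] = (uint8_t)((value >> 16) & 0xFF); /* bits 23..16 */
    parts[2] = (uint8_t)((value >> 8) & 0xFF);  /* bits 15..8  */
    parts[3] = (uint8_t)(value & 0xFF);         /* bits 7..0   */
}

/* Print the four parts of a value as hex bytes on one line. */
void print_parts(uint32_t value)
{
    uint8_t parts[4];
    int i;

    split_bytes(value, parts);
    for (i = 0; i < 4; i++)
        printf("0x%02X ", (unsigned)parts[i]);
    printf("\n");
}
```

Calling print_parts(305419896) prints 0x12 0x34 0x56 0x78, the same four parts as listed above.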

So we see that every 8 bits convert to 2 hex digits. It’s time to allocate some memory to this integer (0x12345678). Well, it turns out there are two ways of storing this integer in memory.

Big Endian

In big endian, you store the most significant byte at the smallest address. Let’s say the memory allocated for the above integer (0x12345678) starts at address 1000; then here’s how it would look:

Address    Hex Value (0x__)
1000       12
1001       34
1002       56
1003       78

Remember that memory is byte-addressable, so the address of a word is the address of its first byte. So, what is the address of the above word? Yes, it’s 1000, which holds the most significant byte (MSB).

Big-endian is the most common format in data networking; fields in the protocols of the Internet protocol suite, such as IPv4, IPv6, TCP, and UDP, are transmitted in big-endian order.
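To make the big-endian order concrete, here is a minimal sketch of writing and reading a 32-bit value in big-endian order, the way a network protocol field would be laid out. The names store_be32 and load_be32 are hypothetical helpers invented for this example; because they use shifts instead of copying raw memory, they behave the same on little- and big-endian hosts.

```c
#include <stdint.h>

/* Store value into buf[0..3] with the most significant byte first
   (big-endian order). store_be32 is a hypothetical helper name. */
void store_be32(uint8_t *buf, uint32_t value)
{
    buf[0] = (uint8_t)(value >> 24); /* most significant byte, lowest address */
    buf[1] = (uint8_t)(value >> 16);
    buf[2] = (uint8_t)(value >> 8);
    buf[3] = (uint8_t)value;         /* least significant byte, highest address */
}

/* Rebuild the value from four bytes that are in big-endian order. */
uint32_t load_be32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

After store_be32(buf, 0x12345678), the buffer holds 12 34 56 78 in that order, matching the table above, and load_be32(buf) returns 0x12345678 again.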

Little Endian

In little endian, you store the least significant byte at the smallest address. Little-endian storage is popular for microprocessors, in part due to the significant influence of Intel Corporation on microprocessor designs. Here’s how the same integer would look in memory:

Address    Hex Value (0x__)
1000       78
1001       56
1002       34
1003       12

In this case, the address of this word contains the least significant byte(LSB).
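The little-endian layout can be simulated the same way, without depending on the machine you run it on. In this sketch, store_le32 and print_layout are hypothetical names, and the base address 1000 is just the pretend address used in this post, not a real memory location.

```c
#include <stdint.h>
#include <stdio.h>

/* Store value into buf[0..3] with the least significant byte first
   (little-endian order). store_le32 is a hypothetical helper name. */
void store_le32(uint8_t *buf, uint32_t value)
{
    buf[0] = (uint8_t)value;         /* least significant byte, lowest address */
    buf[1] = (uint8_t)(value >> 8);
    buf[2] = (uint8_t)(value >> 16);
    buf[3] = (uint8_t)(value >> 24); /* most significant byte, highest address */
}

/* Print an address/value table for n bytes, pretending that the
   first byte lives at address base. */
void print_layout(unsigned base, const uint8_t *buf, int n)
{
    int i;

    for (i = 0; i < n; i++)
        printf("%u  %02x\n", base + (unsigned)i, (unsigned)buf[i]);
}
```

Filling a 4-byte buffer with store_le32(buf, 0x12345678) and calling print_layout(1000, buf, 4) prints the rows 1000 78, 1001 56, 1002 34 and 1003 12, the same as the table above.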

Don’t believe me? Check it yourself:

Read through the following sample C code and run it on your machine. It prints the byte representation of an int; the same function works for a float or a pointer.

#include <stdio.h>

/* Print the n bytes in memory from location start to start+n, lowest address first. */
void print_mem(const unsigned char *start, int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf(" %.2x", start[i]); /* unsigned char avoids sign extension of bytes >= 0x80 */
    printf("\n");
}

/* The main function, which calls the above function for 0x12345678. */
int main(void)
{
    int i = 0x12345678;
    print_mem((const unsigned char *)&i, sizeof(i));
    getchar(); /* wait for Enter so the console window stays open */
    return 0;
}
When the above program is run on a little-endian machine, it prints “78 56 34 12”; on a big-endian machine, it prints “12 34 56 78”.

Which type of machine are you running?

Curious whether your own machine is the little or the big one? It’s simple. Run the following program on your machine:

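#include <stdio.h>

int main(void)
{
    int x = 1;
    char *y = (char *)&x;    /* points at the byte of x with the lowest address */
    printf("%c\n", *y + 48); /* 48 is the ASCII code of '0' */
    return 0;
}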

If it’s little endian it will print 1. If it’s big endian it will print 0. How?

Let’s suppose that you are on a 32-bit machine. If it is little endian, the x in the memory looks something like this:

Higher memory address
    ----->
+----+----+----+----+
|0x01|0x00|0x00|0x00|
+----+----+----+----+
  ^
  |
 &x

so *(char*)&x == 1 (that is, *y == 1), and *y+48 == ‘1’.

If your machine is big endian, it looks something like this:

Higher memory address
    ----->
+----+----+----+----+
|0x00|0x00|0x00|0x01|
+----+----+----+----+
  ^
  |
 &x

so *y == 0, and this one prints ‘0’.
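The same check can be wrapped in a small reusable function. This is a sketch under a couple of assumptions: is_little_endian is a made-up name, and memcpy is used here in place of the pointer cast simply to copy out the byte at the lowest address.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if this machine stores the least significant byte at the
   lowest address (little endian), 0 otherwise.
   is_little_endian is a hypothetical helper name. */
int is_little_endian(void)
{
    uint32_t x = 1;
    unsigned char first;

    memcpy(&first, &x, 1); /* copy the byte at the lowest address of x */
    return first == 1;     /* 0x01 comes first only on little endian */
}
```

A program can then branch on is_little_endian() once at startup instead of repeating the pointer trick wherever the byte order matters.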

Is there something to worry about?

Yes, there is. What if two systems with different endianness try to transfer data to each other?

I mean, there is a good chance that your friend’s system sends a file to your machine, which uses the opposite endianness and reads the values in as they are. Of course, you’ll run into problems because of endianness: multi-byte values come out byte-reversed and won’t make any sense.

It becomes a major issue when sending numbers over the network, because you might find yourself helpless to determine the endianness of the machine that sent you the data. That is why network protocols fix the byte order on the wire, as noted in the Big Endian section above.

Well, it’s no big deal to flip the bytes to translate numbers from a format that differs from the one the current computer understands. You just have to be sure you know which way the data is stored to begin with, so you don’t flip it the wrong way.
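Flipping the bytes of a 32-bit number takes only a few shifts and masks. The function below is a minimal sketch; swap32 is a hypothetical name chosen for this post, not a standard library function.

```c
#include <stdint.h>

/* Reverse the order of the four bytes in a 32-bit value.
   swap32 is a hypothetical helper name. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) | /* lowest byte moves to the top   */
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);  /* highest byte moves to the bottom */
}
```

For example, swap32(0x12345678) gives 0x78563412, and swapping twice returns the original value. The function flips unconditionally, so the caller still has to know whether the incoming data needs flipping at all.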

Also, endianness makes no difference at the byte level, so you don’t need to worry about it for strings of 8-bit characters or anything else where the data is just a stream of bytes.

For anything where the elements are larger than a byte, e.g. integers of 2 or more bytes, floating-point values, etc., you do need to worry about endianness, or alternatively use a text-based format for data interchange.
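As a rough illustration of the text-based alternative, a number can be sent as its decimal digits, which are plain 8-bit characters and so look the same on every machine. The names encode_text and decode_text are hypothetical, and decimal text is just one possible choice of format.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write value as decimal text into out (at most out_size bytes,
   including the terminating '\0'). Returns what snprintf returns:
   the number of characters the full text needs.
   encode_text is a hypothetical helper name. */
int encode_text(uint32_t value, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%lu", (unsigned long)value);
}

/* Parse decimal text back into a 32-bit value. */
uint32_t decode_text(const char *text)
{
    return (uint32_t)strtoul(text, NULL, 10);
}
```

With a 16-byte buffer, encode_text(305419896, buf, sizeof buf) produces the string "305419896", and decode_text(buf) gives back 305419896 whatever the endianness of the sender or receiver.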

 
