What exactly are big-endian and little-endian in a communication protocol?

2024.04.06


A communication protocol must explicitly specify which endianness it uses so that the sender and receiver interpret the data in the same way. Otherwise, if the sender uses big-endian and the receiver uses little-endian (or vice versa), the data will be misinterpreted during transfer.

When developing IoT applications, the protocol specifications handed over by embedded engineers often mention a big-endian or little-endian mode. So what exactly do big-endian and little-endian mean?

Big-endian and little-endian byte order

Endianness is an important concept in communication protocols. It describes how the bytes of multi-byte data (integers, floating-point numbers, and so on) are ordered in memory. The byte order determines whether the most significant byte (MSB) or the least significant byte (LSB) of a value is placed at the lowest memory address. This is critical for cross-platform communication and data exchange, because different hardware platforms may use different byte orders.

  1. "Big Endian": The most significant byte is stored at the lowest memory address, and the least significant byte is stored at the highest memory address. For example, the 16-bit integer 0x1234 is laid out in memory as 0x12 0x34 in big-endian mode.
  2. "Little Endian": The least significant byte is stored at the lowest memory address, and the most significant byte is stored at the highest memory address. The same 16-bit integer 0x1234 is laid out in memory as 0x34 0x12 in little-endian mode.
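
The two layouts described above can be reproduced with a small Java sketch. The class name ByteLayoutDemo and its helper methods are illustrative names chosen for this example only; the sketch simply asks ByteBuffer to write 0x1234 in each byte order and prints the resulting bytes, lowest address first.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteLayoutDemo {
    // Returns the two bytes of a 16-bit value as laid out in the given byte order.
    static byte[] layout16(short value, ByteOrder order) {
        return ByteBuffer.allocate(2).order(order).putShort(value).array();
    }

    // Formats bytes as space-separated hex, lowest address first.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        short value = 0x1234;
        // prints "big-endian:    0x12 0x34"
        System.out.println("big-endian:    " + hex(layout16(value, ByteOrder.BIG_ENDIAN)));
        // prints "little-endian: 0x34 0x12"
        System.out.println("little-endian: " + hex(layout16(value, ByteOrder.LITTLE_ENDIAN)));
    }
}
```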



A common practice is to state the endianness explicitly in the protocol specification so that data is interpreted correctly regardless of the sender's and receiver's software and hardware platforms. A protocol may also carry a tag or metadata field that indicates the endianness of the data, so that the receiver can adjust its interpretation dynamically based on that field.
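
As a rough illustration of such a tag, the sketch below uses a made-up frame layout: one flag byte followed by a 32-bit payload. The flag values 0x00 and 0x01, the frame layout, and the class name TaggedFrame are assumptions for this example and are not taken from any real protocol.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class TaggedFrame {
    // Hypothetical flag values for this sketch only.
    static final byte FLAG_BIG_ENDIAN = 0x00;
    static final byte FLAG_LITTLE_ENDIAN = 0x01;

    // Builds a 5-byte frame: 1 flag byte, then the payload in the flagged byte order.
    static byte[] encode(int payload, ByteOrder order) {
        byte flag = (order == ByteOrder.BIG_ENDIAN) ? FLAG_BIG_ENDIAN : FLAG_LITTLE_ENDIAN;
        return ByteBuffer.allocate(5).put(flag).order(order).putInt(payload).array();
    }

    // Reads the flag first, then interprets the payload in the order it announces.
    static int decode(byte[] frame) {
        ByteBuffer buf = ByteBuffer.wrap(frame);
        byte flag = buf.get();
        if (flag != FLAG_BIG_ENDIAN && flag != FLAG_LITTLE_ENDIAN) {
            throw new IllegalArgumentException("unknown endianness flag: " + flag);
        }
        buf.order(flag == FLAG_LITTLE_ENDIAN ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
        return buf.getInt();
    }

    public static void main(String[] args) {
        byte[] frame = encode(0x12345678, ByteOrder.LITTLE_ENDIAN);
        // The receiver recovers the original value without knowing the order in advance.
        System.out.println(Integer.toHexString(decode(frame)));
    }
}
```

The cost of this design is one extra byte per frame and a branch on the receiving side, which is why many protocols simply fix one byte order in the specification instead.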


For example, many network protocols, such as TCP/IP, use big-endian ordering, while Intel architectures such as x86 and x86_64 use little-endian. In cross-platform communication, an endianness mismatch can cause problems, so the byte order often has to be converted when sending and receiving data.
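
One way to sidestep the host's byte order entirely is to assemble and split values with shifts, since shifts operate on the numeric value rather than on its memory layout. The sketch below reads and writes a 32-bit field in network (big-endian) order; the class and method names are illustrative only.

```java
final class NetworkOrder {
    // Assembles a 32-bit value from four bytes in network (big-endian) order.
    static int readInt32(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 24)
             | ((bytes[offset + 1] & 0xFF) << 16)
             | ((bytes[offset + 2] & 0xFF) << 8)
             | (bytes[offset + 3] & 0xFF);
    }

    // Writes a 32-bit value as four bytes, most significant byte first.
    static void writeInt32(byte[] bytes, int offset, int value) {
        bytes[offset] = (byte) (value >>> 24);
        bytes[offset + 1] = (byte) (value >>> 16);
        bytes[offset + 2] = (byte) (value >>> 8);
        bytes[offset + 3] = (byte) value;
    }

    public static void main(String[] args) {
        byte[] wire = new byte[4];
        writeInt32(wire, 0, 0x12345678);
        // prints "12 34 56 78"
        System.out.printf("%02x %02x %02x %02x%n",
                wire[0] & 0xFF, wire[1] & 0xFF, wire[2] & 0xFF, wire[3] & 0xFF);
        // prints "12345678"
        System.out.println(Integer.toHexString(readInt32(wire, 0)));
    }
}
```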

In programming, it is sometimes necessary to write specific code that converts the byte order so that data is interpreted correctly. In C, for example, functions such as htonl, ntohl, htons, and ntohs convert between network byte order and host byte order. In these function names, "h" stands for host, "n" for network, "s" for short, and "l" for long.

Byte-order conversion

In Java, you can use the ByteBuffer class directly for byte-order conversion. ByteBuffer supports both big-endian and little-endian, and its byte order can be changed at runtime.

For integer types (such as int and short), you can set the byte order with ByteBuffer's order() method and then read and write data with methods such as putInt() and getShort().

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianConversion {
    public static void main(String[] args) {
        int data1 = 0x12345678;
        short data2 = 0x1234;

        // Use ByteBuffer for byte-order conversion
        ByteBuffer buffer1 = ByteBuffer.allocate(6); // allocate enough space
        ByteBuffer buffer2 = ByteBuffer.allocate(6);
        // Set little-endian order and write the data
        buffer1.order(ByteOrder.LITTLE_ENDIAN);
        buffer1.putInt(data1);

        buffer2.order(ByteOrder.LITTLE_ENDIAN);
        buffer2.putShort(data2);

        // Switch to big-endian order and read the same bytes back
        buffer1.flip(); // prepare to read from the buffer
        buffer1.order(ByteOrder.BIG_ENDIAN);
        int bigEndian1 = buffer1.getInt();

        buffer2.flip();
        buffer2.order(ByteOrder.BIG_ENDIAN);
        short bigEndian2 = buffer2.getShort();

        System.out.println("Original int value: " + Integer.toHexString(data1));
        System.out.println("int read as big-endian: " + Integer.toHexString(bigEndian1));
        System.out.println("Original short value: " + Integer.toHexString(data2 & 0xFFFF));
        System.out.println("short read as big-endian: " + Integer.toHexString(bigEndian2 & 0xFFFF));
    }
}

For floating-point types (such as float and double), you can also use ByteBuffer for byte-order conversion.

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FloatEndianConversion {
    public static void main(String[] args) {
        float data = 123.45f;

        // Use ByteBuffer for byte-order conversion
        ByteBuffer buffer = ByteBuffer.allocate(4); // allocate enough space

        // Set little-endian order and write the data
        buffer.order(ByteOrder.LITTLE_ENDIAN);
        buffer.putFloat(data);

        // Switch to big-endian order and read the same bytes back
        buffer.flip(); // prepare to read from the buffer
        buffer.order(ByteOrder.BIG_ENDIAN);
        float bigEndian = buffer.getFloat();

        System.out.println("Original float value: " + data);
        System.out.println("float read as big-endian: " + bigEndian);
    }
}

The IEEE 754 standard defines the bit-level format of floating-point numbers, so a value is interpreted consistently on any platform that follows the standard. The standard does not, however, fix the order in which those bytes are stored or transmitted. If you need to store or transmit floating-point numbers as bytes and the receiver expects a different byte order, you can use ByteBuffer for the conversion.
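
To see what actually changes, it can help to look at the IEEE 754 bit pattern directly. The sketch below is one possible approach using Float.floatToRawIntBits, Integer.reverseBytes, and Float.intBitsToFloat from the standard library; the class and method names are illustrative. The byte-swapped pattern is deliberately kept as an int, because a reversed pattern may not be a meaningful float.

```java
final class FloatByteSwap {
    // Returns the float's IEEE 754 bit pattern with its four bytes reversed.
    static int swappedBits(float value) {
        return Integer.reverseBytes(Float.floatToRawIntBits(value));
    }

    // Restores a float from a byte-reversed bit pattern.
    static float fromSwappedBits(int swapped) {
        return Float.intBitsToFloat(Integer.reverseBytes(swapped));
    }

    public static void main(String[] args) {
        float original = 123.45f;
        int bits = Float.floatToRawIntBits(original);
        int swapped = swappedBits(original);

        System.out.println("bit pattern:    " + Integer.toHexString(bits));
        System.out.println("bytes reversed: " + Integer.toHexString(swapped));
        System.out.println("restored value: " + fromSwappedBits(swapped));
    }
}
```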

In C, byte-order conversion is typically done with library functions (declared in <arpa/inet.h> on POSIX systems) that convert between network byte order and host byte order. Network byte order is big-endian, while host byte order depends on the specific hardware architecture (which may be big-endian or little-endian).

16-bit and 32-bit integers can be converted with the htons (host to network short), ntohs (network to host short), htonl (host to network long), and ntohl (network to host long) functions.

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>

int main(void) {
    uint16_t short_host_order = 0x1234;
    uint32_t long_host_order = 0x12345678;

    // Convert to network byte order (big-endian)
    uint16_t short_net_order = htons(short_host_order);
    uint32_t long_net_order = htonl(long_host_order);

    // Convert back to host byte order
    uint16_t short_back_to_host = ntohs(short_net_order);
    uint32_t long_back_to_host = ntohl(long_net_order);

    printf("Host order short: %04x\n", short_host_order);
    printf("Network order short: %04x\n", short_net_order);
    printf("Back to host order short: %04x\n", short_back_to_host);

    printf("Host order long: %08x\n", long_host_order);
    printf("Network order long: %08x\n", long_net_order);
    printf("Back to host order long: %08x\n", long_back_to_host);

    return 0;
}

For floating-point numbers there is no dedicated byte-order conversion function, because a float is made up of sign, exponent, and mantissa fields rather than a plain integer value. A common solution is to copy the float's bits into an integer type of the same size (such as uint32_t), convert the byte order of that integer, and then copy the bits back into a float.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

int main(void) {
    float f = 123.45f;
    uint32_t int_val;
    float f_net, f_back;

    // Copy the float's bits into an integer
    memcpy(&int_val, &f, sizeof(f));

    // Convert to network byte order
    int_val = htonl(int_val);

    // Copy the converted bits back into a float
    memcpy(&f_net, &int_val, sizeof(f_net));

    // Convert back to host byte order
    int_val = ntohl(int_val);
    memcpy(&f_back, &int_val, sizeof(f_back));

    printf("Original float: %f\n", f);
    printf("Network order float: %f\n", f_net);
    printf("Back to host order float: %f\n", f_back);

    return 0;
}

This example assumes that the host is little-endian. In practice, GCC's predefined __BYTE_ORDER__ macro can be used to check the host byte order so that the conversion is performed only when needed. When converting the byte order of floating-point numbers, also keep in mind how the IEEE 754 standard represents them, as well as possible differences between platforms and compilers.
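
The same "convert only when needed" idea can be sketched in Java, where ByteOrder.nativeOrder() reports the byte order of the underlying platform. The toNetworkOrder helper below is a hypothetical analogue of htonl written for this example, not a standard API: it swaps bytes only when the given host order is little-endian.

```java
import java.nio.ByteOrder;

final class HostOrder {
    // Hypothetical htonl-like helper: swaps bytes only for little-endian hosts.
    static int toNetworkOrder(int hostValue, ByteOrder hostOrder) {
        return hostOrder == ByteOrder.LITTLE_ENDIAN
                ? Integer.reverseBytes(hostValue)
                : hostValue;
    }

    public static void main(String[] args) {
        ByteOrder host = ByteOrder.nativeOrder();
        System.out.println("host byte order: " + host);
        // On a little-endian host the bytes are swapped; on a big-endian host they are not.
        System.out.println(Integer.toHexString(toNetworkOrder(0x12345678, host)));
    }
}
```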