I'm confused, if I want to save the IP address, what data type is better?

2023.04.27

I'm confused, if I want to save the IP address, what data type is better?


The essence of an IP address is a 32-bit unsigned integer, similar to 192.168.0.1​ This dotted decimal string notation is just to help people understand and remember. The decimal representation corresponding to 192.168.0.1​ is an unsigned inte ger 3232235521.

When it comes to IP addresses (IPv4), common IP addresses like 192.168.0.1 and 127.0.0.1 will immediately appear in everyone's minds, and then combined with the question "What data type is used to store IP addresses in MySQL?", so Almost blurred out with char string type storage.

The interviewer looked indifferent, and you immediately realized that something was wrong, and thought about it carefully.

Then I found that the length of this IP address is variable, the shortest can be 0.0.0.0 which only needs 7 digits, and the longest can be 255.255.255.255 which requires 15 digits, so I confidently answered using varchar(15) to store the IP address, and for I was secretly delighted to be able to think of this level.

Unexpectedly, the interviewer smiled contemptuously and asked you "Are you sure?" You felt that the interviewer was testing you, so you answered "Yes" firmly.

Then start the next question

 …

People often use varchar(15) columns to store IP addresses, but in fact this is not optimal.

The essence of an IP address is a 32-bit unsigned integer. The dotted decimal string notation similar to 192.168.0.1 is just to help people understand and remember. The decimal representation corresponding to 192.168.0.1 is an unsigned integer 3 232235521.

Therefore, the person who uses the string type to store the IP address actually subconsciously thinks that the IP address is a string, and the stored string is a dotted decimal string, but the correct one should be a 32-bit unsigned integer

The so-called signed number actually uses the highest bit as the sign bit, such as a 32-bit signed INT, the highest bit is the sign bit, and the remaining 31 bits are the real value, so the value range of the signed INT is:

The value range of unsigned INT is:

The following table lists the signed and unsigned value ranges of each integer type in MySQL. When defining a table, you can add the keyword UNSIGNED after the data type to define an unsigned integer, otherwise it defaults to a signed integer:

type

Signed value range

Unsigned value range

TINYINT (1 byte, 8 bits)

-128 ~ 127

0~255

SMALLINT (2 bytes, 16 bits)

-32768 ~ 32767

0~65535

MEDIUMINT (3 bytes, 24 bits)

-8388608 ~ 8388607

0~16777215

INT (4 bytes, 32 bits)

-2147483648 ~ 2147483647

0~4294967295

BIGINT (8 bytes, 64 bits)

-9223372036854775808 ~ 9223372036854775807

0~18446744073709551615

Combining the above table, it can be seen that the 32-bit unsigned INT can just hold the IPv4 address. The following is a comparison between the two data types of INT UNSIGNED and VARCHAR(15):

  1. Storage space: 4 bytes of INT type and 15 bytes of VARCHAR(15) save more storage space. In addition, in addition to saving the required number of characters, VARCHAR will add another byte to record the length (if the length of the column statement exceeds 255, use two bytes to record the length), so VARCHAR(15) actually takes up 16 byte.
  2. Retrieval speed: If we want to build an index on the IP address, then for the string index, the retrieval speed of the integer index is simply a blow, because the comparison of the string type needs to be traversed from the first character in order Yes, slower.

MySQL considerately provides conversion functions between IPv4 address dotted decimal and unsigned integer, inet_aton and inet_ntoa (the bottom layer is a binary shift operation, which is very fast):

picture

Of course, you should perform these conversions in your business to reduce the pressure on MySQL.