Detailed Distributed ID Solution
In a distributed system, generating globally unique IDs is essential: multiple nodes generate IDs at the same time, and without coordination their IDs can collide. Several commonly used distributed ID solutions are introduced below.
UUID
A UUID (Universally Unique Identifier) is a 128-bit identifier designed to be unique on a global scale without central coordination. Depending on the version, it is derived from a timestamp and node ID (version 1), from a name hash (versions 3 and 5), or from random numbers (version 4). In Java, the built-in UUID class generates a random version 4 UUID, as follows:
```java
import java.util.UUID;

public class UuidGenerator {
    public static void main(String[] args) {
        // randomUUID() returns a version 4 (random) UUID
        UUID uuid = UUID.randomUUID();
        System.out.println(uuid.toString());
    }
}
```
The advantage of UUID is that it is easy to use: it needs no additional configuration or management and can be generated directly with Java's built-in UUID class. However, a UUID is long (128 bits, or 36 characters in its canonical text form), and random UUIDs carry no ordering, so they make poor primary keys for a database table: they are awkward to sort, and inserts scatter across the index.
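To make the size concern concrete, the following is a minimal sketch of storing a UUID as 16 raw bytes instead of its 36-character text form. The class and method names (UuidBytes, toBytes, fromBytes) are illustrative assumptions, not part of any library.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidBytes {
    /** Packs the two 64-bit halves of a UUID into a 16-byte array. */
    static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    /** Rebuilds the UUID from the 16 bytes produced by toBytes. */
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        UUID uuid = UUID.randomUUID();
        System.out.println(uuid.version());            // 4 for randomUUID()
        System.out.println(uuid.toString().length());  // 36 characters as text
        System.out.println(toBytes(uuid).length);      // 16 bytes in binary form
        System.out.println(fromBytes(toBytes(uuid)).equals(uuid)); // true
    }
}
```

The binary form halves the storage of the hyphen-free hex string, but it does not fix the ordering problem: random UUIDs still arrive in no particular order.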
Snowflake
Snowflake is a distributed ID generation algorithm open sourced by Twitter. It generates a 64-bit unique ID that packs a timestamp, a data center ID, a machine ID, and a per-millisecond sequence number. The Java code for the Snowflake algorithm is as follows:
```java
public class SnowflakeGenerator {
    // Custom epoch in milliseconds; timestamps are stored relative to it
    private final static long START_STMP = 1480166465631L;

    // Number of bits occupied by each part
    private final static long SEQUENCE_BIT = 12;
    private final static long MACHINE_BIT = 5;
    private final static long DATACENTER_BIT = 5;

    // Maximum value of each part (all bits of that part set to 1)
    private final static long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
    private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);

    // Left shift of each part within the 64-bit ID
    private final static long MACHINE_LEFT = SEQUENCE_BIT;
    private final static long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private final static long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    private final long datacenterId;
    private final long machineId;
    private long sequence = 0L;   // sequence number within the current millisecond
    private long lastStmp = -1L;  // timestamp of the last generated ID

    public SnowflakeGenerator(long datacenterId, long machineId) {
        if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    public synchronized long nextId() {
        long currStmp = getNewstmp();
        if (currStmp < lastStmp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }
        if (currStmp == lastStmp) {
            // Same millisecond: increment the sequence, wrapping to 0 on overflow
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // Sequence exhausted for this millisecond: wait for the next one
                currStmp = getNextMill();
            }
        } else {
            // New millisecond: restart the sequence
            sequence = 0L;
        }
        lastStmp = currStmp;
        return (currStmp - START_STMP) << TIMESTMP_LEFT  // timestamp part
                | datacenterId << DATACENTER_LEFT        // data center part
                | machineId << MACHINE_LEFT              // machine part
                | sequence;                              // sequence part
    }

    // Busy-waits until the clock reaches a millisecond later than lastStmp
    private long getNextMill() {
        long mill = getNewstmp();
        while (mill <= lastStmp) {
            mill = getNewstmp();
        }
        return mill;
    }

    private long getNewstmp() {
        return System.currentTimeMillis();
    }
}
```
The advantage of the Snowflake algorithm is high generation performance and a short ID (64 bits) that works well as a database primary key and is convenient to sort and index, because IDs increase roughly with time. Its limits come from the fixed bit widths: with 5 + 5 bits for the data center and machine IDs, a cluster can have at most 1024 nodes, each node can issue at most 4096 IDs per millisecond, and the timestamp field eventually runs out. It also depends on the system clock: if the clock moves backwards, the generator refuses to issue IDs. If these limits do not fit, other distributed ID generation algorithms need to be considered.
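The capacity limits of this layout follow directly from the bit widths. The sketch below works through the arithmetic, assuming the usual Snowflake split of 41 timestamp bits (the sign bit stays 0), 5 + 5 node bits, and 12 sequence bits; the class name SnowflakeCapacity is hypothetical.

```java
import java.time.Duration;

final class SnowflakeCapacity {
    static final int TIMESTAMP_BITS = 41;   // assumed: 64 - 1 sign bit - 22 lower bits
    static final int DATACENTER_BITS = 5;
    static final int MACHINE_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    /** Number of distinct (data center, machine) pairs. */
    static long maxNodes() {
        return 1L << (DATACENTER_BITS + MACHINE_BITS);
    }

    /** Number of IDs one node can issue within a single millisecond. */
    static long idsPerMillisecondPerNode() {
        return 1L << SEQUENCE_BITS;
    }

    /** Whole years (of 365 days) until the timestamp field overflows. */
    static long lifetimeYears() {
        long millis = 1L << TIMESTAMP_BITS;
        return Duration.ofMillis(millis).toDays() / 365;
    }

    public static void main(String[] args) {
        System.out.println(maxNodes());                 // 1024
        System.out.println(idsPerMillisecondPerNode()); // 4096
        System.out.println(lifetimeYears());            // 69
    }
}
```

The lifetime is counted from the custom epoch, not from 1970, which is why implementations pick an epoch close to their launch date.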
Leaf
Leaf is a distributed ID generation service open sourced by Meituan Dianping that generates globally unique 64-bit IDs. It offers a database-backed segment mode and a Snowflake-style mode. The Java code below is a simplified sketch of the Snowflake-style mode, not Leaf's actual source:
```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LeafGenerator {
    private static final Logger logger = LoggerFactory.getLogger(LeafGenerator.class);

    // JVM system properties used for configuration, with their defaults
    private static final String WORKER_ID_KEY = "leaf.worker.id";
    private static final String PORT_KEY = "leaf.port";
    private static final int DEFAULT_PORT = 8080;
    private static final int DEFAULT_WORKER_ID = 0;

    // Bit layout: timestamp | 10-bit worker ID | 12-bit sequence
    private static final int WORKER_ID_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final int MAX_WORKER_ID = (1 << WORKER_ID_BITS) - 1;
    private static final int MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1;
    private static final long EPOCH = 1514736000000L;

    private final SnowflakeIdWorker idWorker;

    public LeafGenerator() {
        // Integer.getInteger reads a system property (e.g. -Dleaf.worker.id=3)
        // and falls back to the default when it is missing or not a number
        int workerId = Integer.getInteger(WORKER_ID_KEY, DEFAULT_WORKER_ID);
        int port = Integer.getInteger(PORT_KEY, DEFAULT_PORT);
        this.idWorker = new SnowflakeIdWorker(workerId, port);
        logger.info("Initialized LeafGenerator with workerId={}, port={}", workerId, port);
    }

    public long nextId() {
        return idWorker.nextId();
    }

    private static class SnowflakeIdWorker {
        private final long workerId;
        private final long port;      // kept for logging only; not encoded in the ID
        private long sequence = 0L;
        private long lastTimestamp = -1L;

        SnowflakeIdWorker(long workerId, long port) {
            if (workerId < 0 || workerId > MAX_WORKER_ID) {
                throw new IllegalArgumentException(
                        String.format("workerId must be between %d and %d", 0, MAX_WORKER_ID));
            }
            this.workerId = workerId;
            this.port = port;
        }

        synchronized long nextId() {
            long timestamp = System.currentTimeMillis();
            if (timestamp < lastTimestamp) {
                throw new RuntimeException("Clock moved backwards. Refusing to generate id");
            }
            if (timestamp == lastTimestamp) {
                // Same millisecond: increment the sequence, wrapping to 0 on overflow
                sequence = (sequence + 1) & MAX_SEQUENCE;
                if (sequence == 0L) {
                    // Sequence exhausted: wait for the next millisecond
                    timestamp = tilNextMillis(lastTimestamp);
                }
            } else {
                sequence = 0L;
            }
            lastTimestamp = timestamp;
            return ((timestamp - EPOCH) << (WORKER_ID_BITS + SEQUENCE_BITS))
                    | (workerId << SEQUENCE_BITS)
                    | sequence;
        }

        // Busy-waits until the clock passes lastTimestamp
        private long tilNextMillis(long lastTimestamp) {
            long timestamp = System.currentTimeMillis();
            while (timestamp <= lastTimestamp) {
                timestamp = System.currentTimeMillis();
            }
            return timestamp;
        }
    }
}
```
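The ID generated by the code above consists of three parts: timestamp, worker ID, and sequence number. The timestamp occupies the upper 42 bits (a sign bit that stays 0 plus 41 bits of milliseconds), the worker ID occupies 10 bits, and the sequence number occupies 12 bits, totaling 64 bits. The single 10-bit worker ID replaces Snowflake's separate 5-bit data center and machine IDs, so the limit is the same 1024 workers. Because Leaf is deployed as a standalone service, obtaining an ID from it involves a network call and is therefore somewhat slower than calling an in-process Snowflake generator.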
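The three-part layout can be checked by packing the fields into an ID and unpacking them again. This is a minimal sketch that reuses the EPOCH, WORKER_ID_BITS, and SEQUENCE_BITS values from the code above; the class LeafIdLayout and its method names are hypothetical.

```java
final class LeafIdLayout {
    static final int WORKER_ID_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long EPOCH = 1514736000000L;

    /** Packs the three fields into one 64-bit ID. */
    static long compose(long timestampMillis, long workerId, long sequence) {
        return ((timestampMillis - EPOCH) << (WORKER_ID_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }

    /** Recovers the absolute timestamp in milliseconds. */
    static long timestampMillis(long id) {
        return (id >> (WORKER_ID_BITS + SEQUENCE_BITS)) + EPOCH;
    }

    /** Recovers the 10-bit worker ID. */
    static long workerId(long id) {
        return (id >> SEQUENCE_BITS) & ((1L << WORKER_ID_BITS) - 1);
    }

    /** Recovers the 12-bit sequence number. */
    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }

    public static void main(String[] args) {
        // One second after the epoch, worker 7, sequence 42
        long id = compose(EPOCH + 1000, 7, 42);
        System.out.println(id);                           // 4194332714
        System.out.println(timestampMillis(id) - EPOCH);  // 1000
        System.out.println(workerId(id));                 // 7
        System.out.println(sequence(id));                 // 42
    }
}
```

Because the timestamp sits in the highest bits, comparing two IDs numerically compares their creation times first, which is what makes these IDs roughly time-ordered.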
The above are common distributed ID generation algorithms. There are other solutions as well, such as MongoDB's ObjectId. Different solutions suit different business scenarios and differ in implementation details and performance, so choose according to the actual situation.
Beyond the algorithms introduced above, other schemes are in use. Flickr's ticket servers, for example, take a different approach from Snowflake: dedicated MySQL instances hand out auto-increment values, and running two servers with different offsets (one issuing odd numbers, the other even) avoids a single point of failure. ID generation can also be run as a standalone service, separate from storage, as in the ID Generation Service (IGS) attributed to Facebook; this is more flexible and scalable, but requires more complex architecture design and implementation.
Depending on business needs, several other distributed ID generation schemes can be used. Here are some personal suggestions:
- Database auto-increment ID: Using a database auto-increment column as the globally unique ID guarantees uniqueness and is easy to implement, but every ID costs a database write, so the database becomes a performance bottleneck under high concurrency. It is therefore not recommended for high-concurrency scenarios.
- UUID: Using a UUID as the globally unique ID guarantees uniqueness for practical purposes (the probability of a duplicate is very small but not zero), but the ID is long (128 bits), which is inconvenient for storage and transmission. Weigh the ID length and the storage and transfer cost before using it in a distributed system.
- Redis: The atomic INCR operation of Redis guarantees uniqueness and is very fast, so it suits high-concurrency scenarios. However, if Redis goes down or is overloaded, ID generation becomes slow or unavailable, and the counter must be persisted so that it does not restart from an old value after a failure.
- ZooKeeper: ZooKeeper's sequential nodes guarantee uniqueness and the implementation is relatively simple, but it introduces additional dependencies and resources, and every ID is a write to the ZooKeeper ensemble, which can become a performance bottleneck.
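The centralized options in this list all pay one round trip per ID. A common mitigation, sketched below under stated assumptions, is to reserve a whole block of IDs per round trip and hand them out locally. The class RangeIdAllocator is hypothetical, and the reserve function stands in for whatever atomic "add n and return the new value" operation the backing store offers (an in-memory AtomicLong plays that role here).

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

final class RangeIdAllocator {
    // Assumed contract: atomically adds n to the shared counter and returns the new value
    private final LongUnaryOperator reserve;
    private final long batchSize;
    private long next = 1;   // next ID to hand out
    private long limit = 0;  // last ID of the current block; 0 means no block yet

    RangeIdAllocator(LongUnaryOperator reserve, long batchSize) {
        this.reserve = reserve;
        this.batchSize = batchSize;
    }

    synchronized long nextId() {
        if (next > limit) {
            // One round trip to the shared counter reserves batchSize IDs at once
            limit = reserve.applyAsLong(batchSize);
            next = limit - batchSize + 1;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong shared = new AtomicLong(); // stands in for the remote counter
        RangeIdAllocator a = new RangeIdAllocator(shared::addAndGet, 3);
        RangeIdAllocator b = new RangeIdAllocator(shared::addAndGet, 3);
        System.out.println(a.nextId()); // 1 (a reserves 1..3)
        System.out.println(b.nextId()); // 4 (b reserves 4..6)
        System.out.println(a.nextId()); // 2
        System.out.println(a.nextId()); // 3
        System.out.println(a.nextId()); // 7 (a reserves 7..9)
    }
}
```

The trade-off is that IDs are no longer issued in global order across nodes, and the unused remainder of a block is lost when a node restarts.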
To choose a distributed ID generation solution for your business scenario, weigh ID uniqueness, generation speed, length, storage cost, scalability, and availability together. The implementation details and performance of the solutions differ, so trade-offs have to be made for the actual situation.
The detailed code demo of each scheme is given below:
Generated based on database auto-increment ID
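```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class IdGenerator {
    private static final String JDBC_URL = "jdbc:mysql://localhost:3306/test";
    private static final String JDBC_USER = "root";
    private static final String JDBC_PASSWORD = "password";

    // Assumes a table such as:
    //   CREATE TABLE id_generator (id BIGINT AUTO_INCREMENT PRIMARY KEY, stub CHAR(1) NULL)
    private static final String INSERT_SQL = "INSERT INTO id_generator (stub) VALUES (null)";

    public long generateId() {
        // JDBC 4 drivers register themselves, so Class.forName is not needed.
        // try-with-resources closes the result set, statement, and connection.
        try (Connection conn = DriverManager.getConnection(JDBC_URL, JDBC_USER, JDBC_PASSWORD);
             PreparedStatement pstmt = conn.prepareStatement(INSERT_SQL, Statement.RETURN_GENERATED_KEYS)) {
            pstmt.executeUpdate();
            try (ResultSet rs = pstmt.getGeneratedKeys()) {
                if (rs.next()) {
                    // The auto-increment value assigned to the inserted row is the new ID
                    return rs.getLong(1);
                }
            }
            throw new IllegalStateException("Database returned no generated key");
        } catch (SQLException e) {
            // Fail loudly: returning a placeholder such as 0 would hand out a duplicate ID
            throw new IllegalStateException("Failed to generate ID", e);
        }
    }
}
```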
Generated based on UUID
```java
import java.util.UUID;

public class IdGenerator {
    public String generateId() {
        // Remove the hyphens to get a 32-character hexadecimal string
        return UUID.randomUUID().toString().replace("-", "");
    }
}
```
Generated based on Redis
```java
import redis.clients.jedis.Jedis;

public class IdGenerator {
    private static final String REDIS_HOST = "localhost";
    private static final int REDIS_PORT = 6379;
    private static final String REDIS_PASSWORD = "password";
    private static final String ID_GENERATOR_KEY = "id_generator";

    public long generateId() {
        // A new connection per call keeps the demo short; a connection pool is the usual choice in production
        try (Jedis jedis = new Jedis(REDIS_HOST, REDIS_PORT)) {
            jedis.auth(REDIS_PASSWORD);
            // INCR is atomic, so concurrent callers never receive the same value.
            // The key must not expire: if it did, the counter would restart at 1 and repeat old IDs.
            return jedis.incr(ID_GENERATOR_KEY);
        } catch (Exception e) {
            // Fail loudly: returning a placeholder such as 0 would hand out a duplicate ID
            throw new IllegalStateException("Failed to generate ID", e);
        }
    }
}
```
Generated based on ZooKeeper
```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

public class IdGenerator implements Watcher, AutoCloseable {
    private static final String ZK_HOST = "localhost";
    private static final int ZK_PORT = 2181;
    private static final int SESSION_TIMEOUT = 5000;
    private static final String ID_GENERATOR_NODE = "/id_generator";
    private static final String ID_PREFIX = ID_GENERATOR_NODE + "/id_";

    // Shared by the constructor and the watcher callback, so that await() is actually released
    private final CountDownLatch connected = new CountDownLatch(1);
    private final ZooKeeper zk;

    public IdGenerator() throws Exception {
        zk = new ZooKeeper(ZK_HOST + ":" + ZK_PORT, SESSION_TIMEOUT, this);
        // Wait until process() reports that the session is connected
        if (!connected.await(SESSION_TIMEOUT, TimeUnit.MILLISECONDS)) {
            zk.close();
            throw new IllegalStateException("Timed out connecting to ZooKeeper");
        }
        if (zk.exists(ID_GENERATOR_NODE, false) == null) {
            try {
                zk.create(ID_GENERATOR_NODE, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            } catch (KeeperException.NodeExistsException ignored) {
                // Another worker created the parent node first
            }
        }
    }

    public long generateId() throws Exception {
        // ZooKeeper appends a 10-digit, monotonically increasing counter to the path
        String path = zk.create(ID_PREFIX, new byte[0], Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        long id = Long.parseLong(path.substring(ID_PREFIX.length()));
        // The counter is kept by the parent node, so the child can be removed right away
        zk.delete(path, -1);
        return id;
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getState() == Event.KeeperState.SyncConnected) {
            System.out.println("Connected to ZooKeeper");
            connected.countDown();
        }
    }

    @Override
    public void close() throws InterruptedException {
        zk.close();
    }
}
```
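Note that ZooKeeper's ephemeral (temporary) nodes are used here. An ephemeral node is deleted automatically when the session of the worker that created it ends, so a worker that goes down leaves no stale nodes behind. Uniqueness comes from the counter that ZooKeeper appends to sequential nodes: under a given parent node, it never hands out the same sequence number twice.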
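When ZooKeeper creates a sequential node, it appends a zero-padded 10-digit counter to the requested path and returns the full name. Turning that name into a numeric ID can be sketched as follows; the class SequentialNodeName and the method sequenceOf are hypothetical helpers, and the path is an example value rather than output from a real ensemble.

```java
final class SequentialNodeName {
    /** Extracts the numeric suffix that ZooKeeper appended to the given prefix. */
    static long sequenceOf(String createdPath, String prefix) {
        if (!createdPath.startsWith(prefix)) {
            throw new IllegalArgumentException("Path " + createdPath + " does not start with " + prefix);
        }
        // Long.parseLong accepts the leading zeros of the padded counter
        return Long.parseLong(createdPath.substring(prefix.length()));
    }

    public static void main(String[] args) {
        String prefix = "/id_generator/id_";
        System.out.println(sequenceOf("/id_generator/id_0000000042", prefix)); // 42
    }
}
```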
The above are the detailed code demos of various distributed ID generation schemes. In fact, each scheme has its advantages and disadvantages, and the appropriate scheme should be selected according to the specific business scenario and system architecture.