Deadlock Occurs When Different Businesses Use the Same Thread Pool
In day-to-day development I have seen many projects register a single custom thread pool globally and share it across the whole application (custom or not, creating thread pools directly through Executors is not recommended). Perhaps the business volume is low, or there are other reasons; either way, this global thread pool ends up being used wherever a pool is needed.
1. Look at the code
Business logic code:
The custom thread pool BizThreadPool code is as follows:
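The original listings are not reproduced here, so the following is only a minimal sketch of the pattern the article describes, under stated assumptions: a single shared pool with 1 core thread, 1 maximum thread and an unbounded queue, and a parent task that submits three child tasks to the same pool and waits on a CountDownLatch. The class name `SharedPoolDeadlockSketch` and the methods `newBizPool` and `runParent` are hypothetical stand-ins for BizThreadPool and the business code. The article's code waits indefinitely; the sketch waits with a timeout so that it terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SharedPoolDeadlockSketch {

    // Hypothetical stand-in for the global BizThreadPool:
    // 1 core thread, 1 max thread, unbounded queue (demonstration only).
    static ThreadPoolExecutor newBizPool() {
        return new ThreadPoolExecutor(1, 1, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    // Runs one parent task that submits three child tasks to the SAME pool and
    // waits for them. Returns true only if the children finished within the wait.
    static boolean runParent(ThreadPoolExecutor pool, long waitMillis) {
        try {
            return pool.submit(() -> {
                System.out.println("parent task running on " + Thread.currentThread().getName());
                CountDownLatch childrenDone = new CountDownLatch(3);
                for (int i = 0; i < 3; i++) {
                    final int id = i;
                    pool.execute(() -> {
                        System.out.println("child task " + id + " running");
                        childrenDone.countDown();
                    });
                }
                // Bounded wait so the sketch terminates; the article waits forever here.
                return childrenDone.await(waitMillis, TimeUnit.MILLISECONDS);
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newBizPool();
        try {
            System.out.println("children finished in time: " + runParent(pool, 500));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

With this configuration the parent holds the only worker thread while it waits, so the children stay in the queue and `runParent` reports `false`.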
If you still don’t see the problem with the code example above, you can stop for a few seconds and think about it.
Pay attention to how the custom thread pool is created and which queue it uses. Never configure a pool like this in real work; it is only for demonstration here.
If you have already seen the problem, I hope you can continue reading to verify whether we think the same way.
2. What is the problem
After a few seconds of thinking, I decided to run the demo to see the phenomenon.
Wrap the logic in a controller and start the Spring Boot application directly.
After successful startup, call GET http://localhost:8080/test/test, and the output is as follows.
We expected the subtasks to appear in the log as well. Why is there no output from the subtasks that were created? It looks as if they were never executed.
Let's first execute the jstack command to look at the thread-related information. One section of the output information is as follows.
From the stack information above, we can see that after the thread running the parent task finishes the parent's own work, it creates a CountDownLatch and waits for the three child tasks to complete.
This is the problem. The wait never ends, which is why the result we saw at the beginning shows only the parent task being executed and none of the child tasks.
Because each call never returns, repeated calls will eventually push the server to its resource limit and crash the system.
So why was the subtask not executed?
3. Try it out
First, let's start from the beginning and look at the configuration of the thread pool.
When we created the custom thread pool, both the core pool size and the maximum pool size were set to 1. Can't we just raise the maximum number of threads so that the pool has threads available to execute the subtasks?
In production, the core threads and maximum threads are generally not set to 1, but even if you set them to 10, 100, or 1000, the problems described later in this article may occur in extreme cases.
Let's do it now. The code for creating a custom thread pool becomes as follows.
Full of confidence, you restart the program and call the interface, only to be dumbfounded. Why has nothing changed?
If you expected that raising only the maximum number of threads would help, you have forgotten how the thread pool works!
Okay, you are forgiven. Just don't forget it again. Let's find out the reason together.
4. Thread pool workflow
Here is the workflow of the thread pool.
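In outline, ThreadPoolExecutor handles a submitted task in this order: if fewer than corePoolSize threads exist, it starts a new thread for the task; otherwise it tries to put the task in the queue; only if the queue is full does it start a thread beyond the core size, up to maximumPoolSize; and if that is not possible either, the task is rejected. The following is a small sketch of that order, assuming a pool with core=1, max=2 and a queue of capacity 1 (values chosen only for illustration; the class name `PoolWorkflowSketch` is hypothetical).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolWorkflowSketch {

    // Submits four blocking tasks to a pool with core=1, max=2, queue capacity=1
    // and records what the pool did with each of them.
    static List<String> trace() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1)); // default AbortPolicy
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until the trace is complete
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> steps = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    steps.add("task" + i + ": poolSize=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    steps.add("task" + i + ": rejected");
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return steps;
    }

    public static void main(String[] args) {
        trace().forEach(System.out::println);
    }
}
```

The first task gets the core thread, the second is queued, the third forces a second thread only because the queue is already full, and the fourth is rejected. With an unbounded queue the second step always succeeds, so the third and fourth steps are never reached.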
Once you know how the thread pool works, it is clear that in the code above the subtasks will still never be executed even if the pool's maximum number of threads is increased. We can print the current thread pool status to help observe this. (The printThreadPoolStatus() method in the code above prints the current status of the thread pool.)
Call the GET http://localhost:8080/test/info method to view the current status of the thread pool.
You can see that there are 3 tasks in the queue, waiting for the thread pool to assign threads to execute them. This is why changing the maximum number of threads had no effect: threads beyond the core size are created only when the queue is full, and an unbounded queue never fills up.
Of course, if tasks keep arriving, the queue keeps growing until the server hits its limit and an OOM occurs. (This is why the Alibaba development specification does not recommend unbounded queues.)
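The status output itself is not reproduced here. As a rough sketch of the same observation, the code below builds a pool with core=1, max=10 and an unbounded queue, starts one parent that submits three children, and then reads the pool's counters. The class `PoolStatusSketch` and its `status` method are hypothetical and only loosely mirror what printThreadPoolStatus() is described as doing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolStatusSketch {

    // Hypothetical status line built from the executor's own accessors.
    static String status(ThreadPoolExecutor pool) {
        return "poolSize=" + pool.getPoolSize()
                + " active=" + pool.getActiveCount()
                + " queued=" + pool.getQueue().size()
                + " max=" + pool.getMaximumPoolSize();
    }

    // Starts one parent that submits three children to the same pool,
    // then reports the pool status while the parent is still waiting.
    static String observe() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 10, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch childrenSubmitted = new CountDownLatch(1);
        CountDownLatch childrenDone = new CountDownLatch(3);
        try {
            pool.execute(() -> {
                for (int i = 0; i < 3; i++) {
                    pool.execute(childrenDone::countDown);
                }
                childrenSubmitted.countDown();
                try {
                    childrenDone.await(); // blocks until shutdownNow() interrupts it
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            childrenSubmitted.await(5, TimeUnit.SECONDS);
            return status(pool);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupt the waiting parent so the sketch can exit
        }
    }

    public static void main(String[] args) {
        System.out.println(observe());
    }
}
```

Even though the maximum is 10, the pool size stays at 1 and the three children sit in the queue, because the unbounded queue always accepts them.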
5. Modify the number of core threads
Then let's modify the number of core threads directly. Is it enough to make the number of core threads larger than the number of tasks?
Answer: No.
For our example above, increasing the number of core threads does give the pool threads that can execute the subtasks, so it solves the current scenario.
However, when concurrency increases, or whenever all threads in the pool are occupied by parent tasks, the subtasks will again be unable to obtain a thread to run on.
Here we change the core thread to 10 and see the output results.
By changing the number of core threads, the problem of subtasks piling up in the queue is solved.
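As a rough illustration of this step, the sketch below runs a single parent with three children against a pool whose core size is a parameter. The class `LargerCorePoolSketch` and its `runParent` method are hypothetical, and the timeout is only there so that the sketch terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class LargerCorePoolSketch {

    // Runs one parent with three children in a single pool of the given core size.
    // Returns true if the children finished before the wait expired.
    static boolean runParent(int coreThreads, long waitMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, coreThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        try {
            return pool.submit(() -> {
                CountDownLatch childrenDone = new CountDownLatch(3);
                for (int i = 0; i < 3; i++) {
                    // While fewer than coreThreads workers exist, each child gets a new thread.
                    pool.execute(childrenDone::countDown);
                }
                return childrenDone.await(waitMillis, TimeUnit.MILLISECONDS);
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("core=10: " + runParent(10, 2000));
        System.out.println("core=1:  " + runParent(1, 300));
    }
}
```

With 10 core threads a lone parent completes; with 1 it does not. This only shows that the single-request case works, not that the design is safe under load.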
So through the above code, everyone should know how the deadlock occurs. Let me summarize it here.
6. Summary
Core threads = 1, maximum threads = 1, unbounded queue: the parent task occupies the only thread while it waits for the child tasks to signal completion, and the child tasks sit in the pool's task queue waiting for the pool to give them a thread.
Core threads = 1, maximum threads = n, unbounded queue: setting the maximum to n makes no difference compared with 1 unless a different queue is used. As long as the queue is unbounded, newly arriving parent tasks keep accumulating in the task queue until resources are exhausted and the service crashes.
Core threads = n, maximum threads = n, unbounded queue: with n core threads, parent tasks will most likely get a thread, and the child tasks they create are queued in the task queue to wait for execution.
When concurrency increases, or the core threads are all occupied by parent tasks, the thread pool ends up in the following situation, with every other task piled up in the task queue:
All core threads are running parent tasks, every task created afterwards also piles up in the task queue, and eventually the server hits its limit and the system runs out of memory (OOM).
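The saturated state can be reproduced on a small scale. The sketch below is an assumption-laden illustration, not the article's code: it starts exactly as many parents as there are core threads, uses a latch so that every parent holds a thread before any child is submitted, and then reads the pool's counters. The class `SaturatedPoolSketch` and the method `snapshot` are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class SaturatedPoolSketch {

    // Starts one parent per core thread, each submitting three children to the
    // same pool, and reports the pool counters once all children are queued.
    static String snapshot(int coreThreads) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, coreThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch allParentsRunning = new CountDownLatch(coreThreads);
        CountDownLatch allChildrenSubmitted = new CountDownLatch(coreThreads);
        try {
            for (int p = 0; p < coreThreads; p++) {
                pool.execute(() -> {
                    try {
                        // Wait until every core thread is held by a parent.
                        allParentsRunning.countDown();
                        allParentsRunning.await();
                        CountDownLatch childrenDone = new CountDownLatch(3);
                        for (int i = 0; i < 3; i++) {
                            pool.execute(childrenDone::countDown);
                        }
                        allChildrenSubmitted.countDown();
                        childrenDone.await(); // never satisfied: no thread is free for the children
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allChildrenSubmitted.await(5, TimeUnit.SECONDS);
            return "active=" + pool.getActiveCount()
                    + " queued=" + pool.getQueue().size()
                    + " completed=" + pool.getCompletedTaskCount();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // interrupt the stuck parents so the sketch can exit
        }
    }

    public static void main(String[] args) {
        System.out.println(snapshot(2));
    }
}
```

For two core threads the snapshot shows both threads active, six children queued and nothing completed. The same shape appears for any n once n parents are in flight at the same time.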
7. Final Solution
From the code example above, the root cause of the deadlock is that the parent task creates multiple child tasks and waits for them to finish, while parent and child tasks share the same thread pool. When every thread in the pool is executing a parent task, all child tasks are left waiting in the task queue, and deadlock occurs.
The core threads are never released, so the task queue keeps growing until an OOM occurs.
So the solution is to isolate the thread pool.
Different businesses use different thread pools, and a new thread pool is used to process subtasks to avoid deadlock.
The modified code is as follows.
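The modified listing is not reproduced here. A minimal sketch of the isolation idea, assuming two separate pools that each keep the demo's single thread, could look like this (`IsolatedPoolsSketch`, `parentPool` and `childPool` are hypothetical names, and the bounded queue capacity of 100 is an arbitrary choice for illustration):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class IsolatedPoolsSketch {

    // One single-threaded pool with a bounded queue, used for both roles below.
    static ThreadPoolExecutor newSingleThreadPool() {
        return new ThreadPoolExecutor(
                1, 1, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(100));
    }

    // The parent runs in parentPool and submits its children to childPool,
    // so a waiting parent can never occupy a thread the children need.
    static boolean runParent(long waitMillis) {
        ThreadPoolExecutor parentPool = newSingleThreadPool();
        ThreadPoolExecutor childPool = newSingleThreadPool();
        try {
            return parentPool.submit(() -> {
                CountDownLatch childrenDone = new CountDownLatch(3);
                for (int i = 0; i < 3; i++) {
                    childPool.execute(childrenDone::countDown);
                }
                return childrenDone.await(waitMillis, TimeUnit.MILLISECONDS);
            }).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            parentPool.shutdownNow();
            childPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("children finished in time: " + runParent(2000));
    }
}
```

Here the children complete even though each pool has only one thread, because the thread the parent is blocking belongs to a different pool.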
The log output shows that once the thread pools are isolated, the business logic executes normally even with the core thread count set to 1, and no tasks accumulate in the task queue.
8. Conclusion
Based on the demo and the fix above, here are some suggestions for everyday work:
- Do not use Executors to create thread pools. When creating a pool with ThreadPoolExecutor, understand what each parameter means so that you avoid the risk of resource exhaustion.
- Use bounded queues for thread pools and avoid unbounded queues.
- For scenarios with parent and child tasks, you can use thread pools or MQ. Once you use bounded queues, define a reasonable rejection policy, and consider MQ for retries.
- Give different businesses different thread pools, and never let parent and child tasks share the same thread pool.
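As one possible reading of the bounded-queue and rejection-policy suggestions, the sketch below uses ArrayBlockingQueue together with the built-in ThreadPoolExecutor.CallerRunsPolicy, which runs an overflowing task on the submitting thread instead of dropping it. The sizes are arbitrary and the class name `BoundedPoolSketch` is hypothetical; which policy is reasonable depends on the business, and an MQ-based retry is not shown.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class BoundedPoolSketch {

    // Fills a pool with 1 thread and a queue of capacity 1, then submits one
    // more task. Returns true if that overflow task ran on the calling thread.
    static boolean overflowRunsOnCaller() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        AtomicReference<Thread> overflowThread = new AtomicReference<>();
        try {
            pool.execute(() -> {          // occupies the single worker thread
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            pool.execute(() -> { });      // fills the queue
            // Queue full and pool at max: the policy runs this task on the caller.
            pool.execute(() -> overflowThread.set(Thread.currentThread()));
            return overflowThread.get() == Thread.currentThread();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow ran on caller: " + overflowRunsOnCaller());
    }
}
```

CallerRunsPolicy applies back-pressure to the submitter rather than letting the queue grow without bound. Other built-in policies (AbortPolicy, DiscardPolicy, DiscardOldestPolicy) or a custom RejectedExecutionHandler may fit better depending on whether tasks can be dropped or must be retried.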