NFS protocol end-to-end example analysis: the data write process

2023.10.02



NFS supports three write modes: synchronous, asynchronous, and direct. The mode is selected by the flags passed when the file is opened. Space does not permit covering every mode one by one, so this article walks through one core path.
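From user space, the choice of mode comes down to the flags given to open(2). The following is a minimal sketch, assuming the standard Linux flags O_SYNC and O_DIRECT; the enum and the function open_flags_for are hypothetical names used only for illustration.

```c
#define _GNU_SOURCE /* glibc exposes O_DIRECT only when this is defined */
#include <fcntl.h>

/* Hypothetical names for the three write modes discussed above. */
enum write_mode { WRITE_ASYNC, WRITE_SYNC, WRITE_DIRECT };

/* Return the open(2) flags that select the given write mode. */
int open_flags_for(enum write_mode mode)
{
    int flags = O_WRONLY | O_CREAT;

    switch (mode) {
    case WRITE_SYNC:
        flags |= O_SYNC;   /* write() returns only after the data is flushed */
        break;
    case WRITE_DIRECT:
#ifdef O_DIRECT            /* Linux extension; guarded so the sketch builds elsewhere */
        flags |= O_DIRECT; /* bypass the client page cache */
#endif
        break;
    case WRITE_ASYNC:
        break;             /* default: write() returns once data is cached */
    }
    return flags;
}
```

A caller would then open the file with something like `open(path, open_flags_for(WRITE_SYNC), 0644)`.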

As a file system under Linux, NFS must implement a set of function pointer interfaces in order to plug into VFS. Taking file-related operations as an example, NFS fills in a table of function pointers, one per operation. For writing data, VFS calls the NFS function nfs_file_write.
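The function pointer mechanism can be illustrated with a small user-space model. This is a sketch, not kernel code: the real table is struct file_operations (where, in recent kernel versions, the write entry is the write_iter member), and every model_ name below is hypothetical.

```c
#include <stddef.h>
#include <sys/types.h>

struct model_file;

/* Simplified stand-in for the kernel's struct file_operations. */
struct model_file_ops {
    ssize_t (*write)(struct model_file *f, const char *buf, size_t len);
    int (*fsync)(struct model_file *f);
};

struct model_file {
    const struct model_file_ops *ops; /* set by the file system at open time */
    size_t cached_bytes;              /* bytes sitting in the model page cache */
    size_t flushed_bytes;             /* bytes already pushed to the server */
};

/* Model of nfs_file_write: put the data in the cache and report success. */
static ssize_t model_nfs_file_write(struct model_file *f, const char *buf, size_t len)
{
    (void)buf;
    f->cached_bytes += len;
    return (ssize_t)len;
}

/* Model of nfs_file_fsync: push everything cached to the server. */
static int model_nfs_file_fsync(struct model_file *f)
{
    f->flushed_bytes += f->cached_bytes;
    f->cached_bytes = 0;
    return 0;
}

/* The table NFS hands to VFS. */
static const struct model_file_ops model_nfs_file_ops = {
    .write = model_nfs_file_write,
    .fsync = model_nfs_file_fsync,
};

/* Model of the VFS write path: it knows nothing about NFS, only the table. */
ssize_t model_vfs_write(struct model_file *f, const char *buf, size_t len)
{
    if (!f->ops || !f->ops->write)
        return -1;
    return f->ops->write(f, buf, len);
}
```

The point of the design is that model_vfs_write never names an NFS function; swapping the table is enough to route the same call to a different file system.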


In this function, if the SYNC flag is set, the synchronous write process is triggered; otherwise the function returns to the caller as soon as the data has been written to the cache. This section focuses on the synchronous path, that is, how data travels from the NFS file system to the server.
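The decision described here can be sketched as follows. This is a simplified model under the assumption that a single boolean stands in for the SYNC flag; the sketch_ names are hypothetical and the real nfs_file_write does considerably more.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one open NFS file on the client. */
struct sketch_file {
    bool sync;            /* true if the file carries a SYNC flag */
    size_t dirty_pages;   /* pages written to the cache but not yet sent */
    size_t sent_pages;    /* pages already transmitted to the server */
};

/* Stand-in for nfs_file_fsync: send every dirty page to the server. */
static int sketch_fsync(struct sketch_file *f)
{
    f->sent_pages += f->dirty_pages;
    f->dirty_pages = 0;
    return 0;
}

/* Stand-in for nfs_file_write: cache first, flush only when SYNC is set. */
int sketch_write(struct sketch_file *f, size_t npages)
{
    f->dirty_pages += npages;      /* step 1: data lands in the page cache */
    if (f->sync)
        return sketch_fsync(f);    /* step 2: synchronous write-through */
    return 0;                      /* buffered: return to the caller now */
}
```

With sync unset, the pages stay dirty in the cache after sketch_write returns; with sync set, they have already been counted as sent.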

Both direct and synchronous writes send data to the server. This section uses synchronous writing as the example. When a synchronous flush is triggered, nfs_file_fsync is called; it is the entry point for transmitting cached data to the server. The main flow from this function down to the back-end interface nfs_do_writepage is shown in the figure below.


nfs_file_fsync main line process
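The intermediate kernel functions in this flow are not reproduced here. The sketch below only assumes the general pattern in which a flush walks the file's dirty pages and hands each one to a per-page callback, the role nfs_do_writepage plays. All demo_ names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_page {
    unsigned long index;
    bool dirty;
};

/* Per-page callback type; nfs_do_writepage plays this role in NFS. */
typedef int (*demo_writepage_t)(struct demo_page *page, void *data);

/* Walk the cache and hand each dirty page to the callback. */
int demo_write_cache_pages(struct demo_page *pages, size_t n,
                           demo_writepage_t writepage, void *data)
{
    for (size_t i = 0; i < n; i++) {
        if (!pages[i].dirty)
            continue;
        int err = writepage(&pages[i], data);
        if (err)
            return err;          /* stop at the first failure */
        pages[i].dirty = false;
    }
    return 0;
}

/* Stand-in for nfs_do_writepage: here it only counts the pages it sees. */
int demo_do_writepage(struct demo_page *page, void *data)
{
    (void)page;
    *(size_t *)data += 1;
    return 0;
}

/* Stand-in for nfs_file_fsync: flush every dirty page of the file. */
int demo_file_fsync(struct demo_page *pages, size_t n, size_t *sent)
{
    return demo_write_cache_pages(pages, n, demo_do_writepage, sent);
}
```

Clean pages are skipped, so the callback runs once per dirty page and never for pages that need no transmission.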

Here nfs_do_writepage sends one cache page to the server. Most of its work is done by the nfs_page_async_flush function. The important parameter is pgio (of type struct nfs_pageio_descriptor), which carries the function pointers used for page data transmission. For the detailed definition of this type, refer to the kernel source code.


Next, we start from the nfs_page_async_flush function and look at its main flow, as shown in the figure below. The function nfs_generic_pg_pgios is a function pointer initialized in pgio and is called from nfs_pageio_doio. The main flow ends in the nfs_initiate_pgio function, which packages the RPC message and its parameters and then calls the API of the RPC service to issue the request.

nfs_page_async_flush main line process
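The chain of calls above can be modeled in user space as follows. This is a sketch under simplifying assumptions: the batch size is arbitrary, the descriptor holds only page indexes, and the "RPC" is a counter. The member name pg_doio mirrors the kernel's, but every sketch_ name is hypothetical.

```c
#include <stddef.h>

#define SKETCH_MAX_REQS 4   /* arbitrary batch size for the sketch */

struct sketch_pgio;

/* Function pointers carried by the descriptor, as pgio does in the kernel. */
struct sketch_pgio_ops {
    int (*pg_doio)(struct sketch_pgio *desc);
};

/* Simplified page I/O descriptor: a batch of queued page indexes. */
struct sketch_pgio {
    const struct sketch_pgio_ops *ops;
    unsigned long queued[SKETCH_MAX_REQS];
    size_t nqueued;
    size_t rpcs_sent;    /* number of model WRITE RPCs issued */
    size_t pages_sent;   /* number of pages carried by those RPCs */
};

/* Stand-in for nfs_initiate_pgio: package the batch and hand it to RPC. */
static int sketch_initiate_pgio(struct sketch_pgio *desc)
{
    desc->rpcs_sent += 1;
    desc->pages_sent += desc->nqueued;
    desc->nqueued = 0;
    return 0;
}

/* Stand-in for nfs_generic_pg_pgios, installed as the pg_doio pointer. */
static int sketch_generic_pg_pgios(struct sketch_pgio *desc)
{
    return sketch_initiate_pgio(desc);
}

static const struct sketch_pgio_ops sketch_write_ops = {
    .pg_doio = sketch_generic_pg_pgios,
};

/* Stand-in for nfs_pageio_doio: flush the batch through the pointer. */
int sketch_pageio_doio(struct sketch_pgio *desc)
{
    if (desc->nqueued == 0)
        return 0;
    return desc->ops->pg_doio(desc);
}

/* Stand-in for nfs_page_async_flush: queue one page, flushing when full. */
int sketch_page_async_flush(struct sketch_pgio *desc, unsigned long index)
{
    if (desc->nqueued == SKETCH_MAX_REQS) {
        int err = sketch_pageio_doio(desc);
        if (err)
            return err;
    }
    desc->queued[desc->nqueued++] = index;
    return 0;
}
```

Because the flush goes through desc->ops, the same queuing code could drive a different transmission routine simply by installing another ops table.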

When nfs_initiate_pgio calls the rpc_run_task function, the flow enters the RPC service, that is, the RPC service state machine. For an introduction to how the RPC state machine processes a request, please refer to the related content in this series.
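As a rough illustration of what a state machine driven by function pointers looks like, here is a toy model. It assumes only four states; the kernel's RPC client has more (reserving a slot, binding, connecting, checking status, and so on) and its states can sleep and retry. The kernel keeps the next step in the task's tk_action pointer; the toy_ names below are hypothetical.

```c
#include <stddef.h>

struct toy_task;
typedef void (*toy_action_t)(struct toy_task *task);

/* Toy RPC task: the next step to run is stored as a function pointer. */
struct toy_task {
    toy_action_t action;   /* NULL means the task has finished */
    int steps;             /* how many states were executed */
    int status;            /* final result, 0 on success */
};

static void toy_call_decode(struct toy_task *t)
{
    t->steps++;
    t->status = 0;
    t->action = NULL;              /* reply decoded: done */
}

static void toy_call_transmit(struct toy_task *t)
{
    t->steps++;
    t->action = toy_call_decode;   /* request sent: decode the reply next */
}

static void toy_call_encode(struct toy_task *t)
{
    t->steps++;
    t->action = toy_call_transmit; /* arguments marshalled: send them next */
}

static void toy_call_start(struct toy_task *t)
{
    t->steps++;
    t->action = toy_call_encode;
}

/* Toy counterpart of rpc_run_task: start the task and drive it to the end. */
int toy_run_task(struct toy_task *t)
{
    t->steps = 0;
    t->status = -1;
    t->action = toy_call_start;
    while (t->action)
        t->action(t);              /* each state picks its successor */
    return t->status;
}
```

Each state function decides which state runs next, so the driver loop stays trivial and new states can be added without changing it.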

Finally, here is a simplified diagram of the entire write process, covering both the function call flow on the client and the processing on the server. Some function calls are omitted from the client side.

Network file system access diagram

On the server side, various callback functions have been registered with RPC. When a client request arrives, the callback matching that request is invoked to process it. In this example, that is the nfsd3_proc_write function. It ultimately calls the data write function of the VFS layer, which in turn calls the write function of the specific file system (such as Ext4) to complete the final write.
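The server-side dispatch can be sketched in the same spirit. This model assumes a callback table indexed by NFSv3 procedure number (WRITE is procedure 7) and a small in-memory buffer in place of a real file system. The toy_ names are hypothetical, and the real nfsd code handles file handles, credentials, and reply encoding, all of which are left out.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define TOY_NFS3PROC_WRITE 7   /* WRITE is procedure 7 in NFSv3 */
#define TOY_NPROCS 22          /* NFSv3 defines procedures 0..21 */

/* Toy "disk" owned by the local file system. */
static char toy_disk[64];

/* Stand-in for a concrete file system's write (such as Ext4). */
static ssize_t toy_fs_write(size_t off, const char *buf, size_t len)
{
    if (off > sizeof(toy_disk) || len > sizeof(toy_disk) - off)
        return -1;
    memcpy(toy_disk + off, buf, len);
    return (ssize_t)len;
}

/* Stand-in for the VFS write entry point, which forwards to the file system. */
static ssize_t toy_vfs_write(size_t off, const char *buf, size_t len)
{
    return toy_fs_write(off, buf, len);
}

/* Stand-in for nfsd3_proc_write: unpack the request and call into VFS. */
static ssize_t toy_proc_write(size_t off, const char *buf, size_t len)
{
    return toy_vfs_write(off, buf, len);
}

typedef ssize_t (*toy_proc_t)(size_t off, const char *buf, size_t len);

/* Callback table registered with the toy RPC layer, indexed by procedure. */
static const toy_proc_t toy_procs[TOY_NPROCS] = {
    [TOY_NFS3PROC_WRITE] = toy_proc_write,
};

/* Toy RPC dispatch: look up the procedure number and run its callback. */
ssize_t toy_dispatch(unsigned int proc, size_t off, const char *buf, size_t len)
{
    if (proc >= TOY_NPROCS || !toy_procs[proc])
        return -1;
    return toy_procs[proc](off, buf, len);
}
```

Calling `toy_dispatch(TOY_NFS3PROC_WRITE, 0, "hello", 5)` follows the same three hops as the text: the registered callback, the VFS layer, and the concrete file system.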