
2. Document-Programming

2.1 What is File I/O

I/O stands for Input/Output, and in the Linux system, all input and output operations are performed through files, which is also referred to as File I/O.

Linux File I/O refers to the process of reading from and writing to external devices (e.g., hard disks, USB drives) in the Linux operating system. The data transfer from a device to memory is known as input, while the process of transferring data from memory to a device is known as output.

2.2 Basic Approaches to Linux File I/O Programming

  • Standard I/O (stdio): The Standard I/O function library provides a series of high-level file I/O functions such as fopen, fclose, fread, fwrite, etc., which assist programmers in performing file operations conveniently. The Standard I/O function library also includes a buffering mechanism to enhance file I/O efficiency.

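  • File I/O (syscall): System call I/O performs file operations by calling the kernel directly through system calls such as open, read, write, and close. Unlike the Standard I/O function library, these functions add no user-space buffering, so every call enters the kernel. This gives precise control over when data is transferred, but many small reads or writes can be slower than the same operations done through buffered Standard I/O.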

  • Raw I/O: Raw I/O is a low-level approach that uses the read and write functions directly on device files, allowing more fine-grained control over I/O operations. Raw I/O is typically used for operations on special devices, such as disk partitions and network devices.

All three approaches can be employed in the Linux system, and the choice depends on the specific application scenarios and requirements. Standard I/O is suitable for most file I/O operations and enhances development efficiency, while system call I/O suits lower-level file operations where the program needs direct control over each transfer. Raw I/O is appropriate for operations on special devices.

2.3 File Descriptors

  • For File I/O, everything is accomplished through file descriptors. In the Linux system, when a file is opened or created, the kernel returns a corresponding file descriptor (a non-negative integer) to the process. By convention, when a process starts, 0 represents standard input, 1 represents standard output, and 2 represents standard error. The kernel always hands out the lowest-numbered descriptor that is not in use, so the first file a new process opens normally gets descriptor 3, the next one 4, and so on.

  • POSIX defines STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO to replace 0, 1, and 2, respectively. The definitions of these three symbolic constants are located in the header file unistd.h.

2.4 Opening or Creating Files

  • Linux provides the open function to open or create a file. The function is declared as follows:

    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>

    int open(const char *pathname, int flags);
    int open(const char *pathname, int flags, mode_t mode);
  • Parameter Explanation:

    • pathname: Represents the file name, which can include (absolute or relative) paths
    • flags: Represents the file opening mode
    • mode: Specifies the access permissions for the owner, group, and other users of the file. This parameter is only used when creating a file
    • If the function is successful, it returns the file descriptor; otherwise, it returns -1
  • The file opening mode flags can use the following macros (when multiple options are present, they can be combined using the "|" operator):

    • O_RDONLY: Open a file for reading only
    • O_WRONLY: Open a file for writing only
    • O_RDWR: Open a file for both reading and writing
    • O_APPEND: All data written to the file will be appended to the end
    • O_CREAT: Create the file if it does not exist
    • O_EXCL: If O_CREAT is set and the file already exists, the open call will fail
    • O_TRUNC: If the file exists, its contents will be truncated when opened
  • The mode parameter is used only when creating a file and specifies the file's access permissions. The modes are:

    • S_IRUSR: Read permission for the owner.
    • S_IWUSR: Write permission for the owner.
    • S_IXUSR: Execute permission for the owner.
    • S_IRWXU: S_IRUSR | S_IWUSR | S_IXUSR.
    • S_IRGRP: Read permission for the group.
    • S_IWGRP: Write permission for the group.
    • S_IXGRP: Execute permission for the group.
    • S_IRWXG: S_IRGRP | S_IWGRP | S_IXGRP.
    • S_IROTH: Read permission for others.
    • S_IWOTH: Write permission for others.
    • S_IXOTH: Execute permission for others.
    • S_IRWXO: S_IROTH | S_IWOTH | S_IXOTH.
  • The final access permissions of a new file are calculated as mode & ~umask.

    • Every newly created file gets default permissions, and the umask controls them: it is a permission mask whose set bits are removed from the mode requested when the file is created. Programs conventionally request at most 666 for regular files (execute permission is not granted by default and must be added manually) and 777 for directories, so these are the maximum default permissions before the umask is applied.

    • Calculation method:

      • For example, if the umask value is 022:

        linaro@linaro-alip:~$ umask
        0022
      • Then the default file permission will be: 110 110 110 & (~000 010 010) = 110 110 110 & 111 101 101 = 110 100 100

      • Thus, the default permission is 644.

  • To open the file test.txt in the luckfox subdirectory of the current working directory:

    int fd = open("./luckfox/test.txt", O_RDWR);
  • To open a file in the parent directory of the current directory with read and write permissions:

    int fd = open("../luckfox.txt", O_RDWR);
  • To create the file if it does not exist and open it with read and write permissions (a mode argument must be supplied whenever O_CREAT is used):

    int fd = open("./luckfox/test.txt", O_CREAT | O_RDWR, 0644);
  • Create a file: Create a new open.cpp file and input the following code in open.cpp:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int fd;

        /* Create luckfox.txt if it does not exist and open it read/write. */
        fd = open("luckfox.txt", O_CREAT | O_RDWR, 0666);
        if (fd == -1)
        {
            perror("open");
            return -1;
        }
        printf("fd is %d\n", fd);
        close(fd);
        return 0;
    }
    • If the call succeeds, the program prints the new file descriptor, which is typically 3 because descriptors 0, 1, and 2 are already in use; if it fails, open returns -1.

2.5 Reading Data from a File

  • Linux provides the read function to read data from an opened file, and its declaration is as follows:

    #include <unistd.h>
    ssize_t read(int fd, void *buf, size_t count);
  • Parameter meanings:

    • fd is the file descriptor
    • buf represents the address of the data buffer for reading
    • count indicates the number of bytes to read
  • Function's purpose: It transfers up to count bytes of data from the file referred to by fd into the memory pointed to by buf.

  • Return value: If the read is successful, it returns the actual number of bytes read. If it fails, it returns -1. If the end of the file is reached or there is no data to be read, it returns 0. Note that the number of bytes read may be less than the value of count.

  • Reading data from a file:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int fd;
        char buf[32] = {0};
        ssize_t ret;

        fd = open("luckfox.txt", O_RDWR);
        if (fd == -1)
        {
            perror("open");
            return -1;
        }
        printf("fd is %d\n", fd);

        /* Leave room for the terminating '\0' so buf can be printed with %s. */
        ret = read(fd, buf, sizeof(buf) - 1);
        if (ret == -1)
        {
            perror("read");
            close(fd);
            return -2;
        }
        printf("buf is %s\n", buf);
        printf("ret is %zd\n", ret);
        close(fd);
        return 0;
    }

2.6 Writing Data to a File

  • Linux provides the write function to write data into an already opened file, and its declaration is as follows:

    #include <unistd.h>
    ssize_t write(int fd, const void *buf, size_t count);
  • Parameter meanings:

    • fd is the file descriptor
    • buf represents the address of the data buffer to be written
    • count indicates the number of bytes to write
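  • Function's purpose: It writes up to count bytes of data from the buffer pointed to by buf into the file specified by fd.

  • Return value: If the write is successful, it returns the number of bytes actually written, which may be less than count. If it fails, it returns -1.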

  • Writing data to a file:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        int fd;
        ssize_t ret;
        char buf[] = "boys and girls\n hi,children!";
        char filename[] = "luckfox.txt";

        /* O_APPEND makes every write land at the end of the file. */
        fd = open(filename, O_RDWR | O_APPEND);
        if (fd == -1)
        {
            perror("open");
            return -1;
        }
        printf("fd is %d\n", fd);

        ret = write(fd, buf, strlen(buf));
        if (ret == -1)
        {
            perror("write");
            close(fd);
            return -2;
        }
        printf("write %zd bytes to file %s\n", ret, filename);
        close(fd);
        return 0;
    }

2.7 File Offset

  • In practical applications, there might be a need to read or write from a specific position in the file. In such cases, the file read/write position needs to be moved to the new position. In Linux, the lseek function can be used to modify the file offset (read/write position). The function is declared as follows:

    #include <sys/types.h>
    #include <unistd.h>

    off_t lseek(int fd, off_t offset, int whence);
  • Parameter meanings:

    • fd is the file descriptor
    • offset: The offset in bytes. It can be positive or negative. A positive value moves forward (toward the end of the file), and a negative value moves backward (toward the beginning).
    • whence: This parameter is used in conjunction with the offset parameter to specify the reference point for the offset. It can have the following three values:
      • SEEK_SET: The offset is relative to the beginning of the file.
      • SEEK_CUR: The offset is relative to the current file read/write position.
      • SEEK_END: The offset is relative to the end of the file.
  • Return value: If the call is successful, it returns the resulting offset, measured in bytes from the beginning of the file. If it fails, it returns -1.
  • Setting the file pointer to position 100:

    lseek(fd, 100, SEEK_SET);
  • Setting the file pointer to the end of the file:

    lseek(fd, 0, SEEK_END);
  • Determining the current file position:

    lseek(fd, 0, SEEK_CUR);
  • Moving back to the beginning of the file and reading it again:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int fd;
        char buf[32] = {0};
        ssize_t ret;

        fd = open("luckfox.txt", O_CREAT | O_RDWR, 0666);
        if (fd == -1)
        {
            perror("open");
            return -1;
        }

        /* First read: starts at offset 0 and advances the file offset. */
        ret = read(fd, buf, sizeof(buf) - 1);
        if (ret < 0)
        {
            perror("read");
            close(fd);
            return -2;
        }
        printf("buf is %s\n", buf);
        printf("ret is %zd\n", ret);

        /* Rewind to the beginning so the second read sees the same data. */
        lseek(fd, 0, SEEK_SET);
        ret = read(fd, buf, sizeof(buf) - 1);
        if (ret < 0)
        {
            perror("read");
            close(fd);
            return -2;
        }
        printf("ret is %zd\n", ret);
        close(fd);
        return 0;
    }
  • Checking the current file position:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int fd;
        char buf[32] = {0};
        ssize_t ret;
        off_t pos;

        fd = open("luckfox.txt", O_CREAT | O_RDWR, 0666);
        if (fd == -1)
        {
            perror("open");
            return -1;
        }

        /* Read 8 bytes so the file offset moves away from 0. */
        ret = read(fd, buf, 8);
        if (ret < 0)
        {
            perror("read");
            close(fd);
            return -2;
        }
        printf("buf is %s\n", buf);
        printf("ret is %zd\n", ret);

        /* A zero-byte seek from SEEK_CUR returns the current offset without moving it. */
        pos = lseek(fd, 0, SEEK_CUR);
        printf("current offset is %lld\n", (long long)pos);
        close(fd);
        return 0;
    }
  • Checking the file size (in bytes):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        int fd;
        char buf[32] = {0};
        ssize_t ret;
        off_t size;

        fd = open("luckfox.txt", O_CREAT | O_RDWR, 0666);
        if (fd == -1)
        {
            perror("open");
            return -1;
        }

        ret = read(fd, buf, 2);
        if (ret < 0)
        {
            perror("read");
            close(fd);
            return -2;
        }
        printf("buf is %s\n", buf);
        printf("ret is %zd\n", ret);

        /* Seeking to the end returns the file size in bytes. */
        size = lseek(fd, 0, SEEK_END);
        printf("file size is %lld\n", (long long)size);
        close(fd);
        return 0;
    }