# A multi-level analysis of how stdin, stdout, and stderr work

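Java level

At the Java level, we can see the following:

- The initializeSystemClass method initializes stdin, stdout, and stderr (the VM's C++ code invokes this method during startup, as described in detail later).
- stdin, stdout, and stderr are each backed by a FileDescriptor object created with a special fd index: 0, 1, and 2.
- These objects are used to create the corresponding FileInputStream and FileOutputStream objects (later calls such as print and read end up, through native methods, in the C write and read functions; readers can check the source themselves).
- The input stream is wrapped in a BufferedInputStream, while each output stream is wrapped in a BufferedOutputStream and then in a PrintStream that uses the configured encoding.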

public final class System {
    public final static PrintStream out = null;
    public final static PrintStream err = null;
    public final static InputStream in = null;

    private static void initializeSystemClass() {
        ...
        FileInputStream fdIn = new FileInputStream(FileDescriptor.in);     // raw input stream on fd 0
        FileOutputStream fdOut = new FileOutputStream(FileDescriptor.out); // raw output stream on fd 1
        FileOutputStream fdErr = new FileOutputStream(FileDescriptor.err); // raw error stream on fd 2
        setIn0(new BufferedInputStream(fdIn)); // wrap the input stream in a buffered input stream
        setOut0(newPrintStream(fdOut, props.getProperty("sun.stdout.encoding"))); // wrap the output stream in a PrintStream; the encoding comes from a system property
        setErr0(newPrintStream(fdErr, props.getProperty("sun.stderr.encoding"))); // wrap the error stream in a PrintStream; the encoding comes from a system property
        ...
    }

    // The buffer is 128 bytes long
    private static PrintStream newPrintStream(FileOutputStream fos, String enc) {
        if (enc != null) { // create the PrintStream with the given encoding
            try {
                return new PrintStream(new BufferedOutputStream(fos, 128), true, enc);
            } catch (UnsupportedEncodingException uee) {}
        }
        return new PrintStream(new BufferedOutputStream(fos, 128), true);
    }
}

// Wrapper class for a file descriptor
public final class FileDescriptor {
    public static final FileDescriptor in = new FileDescriptor(0);
    public static final FileDescriptor out = new FileDescriptor(1);
    public static final FileDescriptor err = new FileDescriptor(2);

    private int fd;

    private FileDescriptor(int fd) {
        this.fd = fd;
    }
}


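JNI level

At the JNI level, we can see that when the JVM starts, it loads and initializes the System class and then calls its initializeSystemClass method to initialize the standard input and output streams.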

// VM creation code
jint Threads::create_vm(JavaVMInitArgs* args, bool* canTryAgain) {
    ...
    // Load and initialize the System class
    initialize_class(vmSymbols::java_lang_System(), CHECK_0);
    ...
    // Call the initializeSystemClass method of the System class
    call_initializeSystemClass(CHECK_0);
}

static void call_initializeSystemClass(TRAPS) {
    // Get the Klass object (the metadata object) of the System class
    Klass* k = SystemDictionary::resolve_or_fail(vmSymbols::java_lang_System(), true, CHECK);
    instanceKlassHandle klass (THREAD, k);
    JavaValue result(T_VOID);
    JavaCalls::call_static(&result, klass, vmSymbols::initializeSystemClass_name(),
                           vmSymbols::void_method_signature(), CHECK); // call the static method to perform the initialization
}


C level

At the C level, printf stands for output and scanf for input. The source shows that, just as in Java, stdin, stdout, and stderr are defined with the fd values 0, 1, and 2.

int __printf (const char *format, ...)
{
    ...
    done = vfprintf (stdout, format, arg);
    ...
}

FILE *stdin = (FILE *) &_IO_2_1_stdin_;
FILE *stdout = (FILE *) &_IO_2_1_stdout_;
FILE *stderr = (FILE *) &_IO_2_1_stderr_;

DEF_STDFILE(_IO_2_1_stdin_, 0, 0, _IO_NO_WRITES);
DEF_STDFILE(_IO_2_1_stdout_, 1, &_IO_2_1_stdin_, _IO_NO_READS);
DEF_STDFILE(_IO_2_1_stderr_, 2, &_IO_2_1_stdout_, _IO_NO_READS+_IO_UNBUFFERED);


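Kernel implementation

The description above shows that in user space both Java and C use the special fd values 0, 1, and 2 to denote input, output, and error output. So what is an fd inside the kernel, and what do these special values refer to? Let us start with the C write function. Its definition is shown next: given fd, the start address buf of the data to output, and the length count of that data, it writes the data to the file descriptor fd. Readers can set fd to 1 and 2 and observe the result.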

ssize_t write(int fd, const void *buf, size_t count);

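Next, let us look at the implementation of the system call that write invokes and see what the fd file descriptor really refers to. From the source we learn:

- The sys_write system call handles write requests passed in from user space.
- It obtains the file that fd represents by calling fget_light.
- The core of fget_light is fcheck(fd), which takes the file at index fd from the fd array of the files pointer via file = files->fd[fd]. From this we can infer that fd is an array index, and that the array holds pointers to the files the current process has open.
- The object that actually handles reads and writes is the file pointer.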

asmlinkage ssize_t sys_write(unsigned int fd, const char __user * buf, size_t count)
{
    struct file *file;
    ssize_t ret = -EBADF;
    int fput_needed;

    file = fget_light(fd, &fput_needed); // get the file pointer from the file descriptor
    if (file) { // the file object exists, so perform the write
        ret = vfs_write(file, buf, count, &file->f_pos);
        fput_light(file, fput_needed);
    }
    return ret;
}

struct file *fget_light(unsigned int fd, int *fput_needed)
{
    struct file *file;
    struct files_struct *files = current->files; // the open-files structure of the current process, which holds its open file objects

    *fput_needed = 0;
    if (likely((atomic_read(&files->count) == 1))) { // only the current process uses this files structure, so no spin lock is needed: look up the file by fd directly
        file = fcheck(fd);
    } else { // otherwise lock before the lookup: threads are processes that share data, and this files structure may be shared among them
        spin_lock(&files->file_lock);
        file = fcheck(fd);
        if (file) {
            get_file(file);
            *fput_needed = 1;
        }
        spin_unlock(&files->file_lock);
    }
    return file;
}

// Macro definition
#define fcheck(fd) fcheck_files(current->files, fd)

// Get the file that fd denotes from the given files_struct
static inline struct file * fcheck_files(struct files_struct *files, unsigned int fd)
{
    struct file * file = NULL;
    if (fd < files->max_fds)
        file = files->fd[fd]; // fd is simply an index into the files->fd array; that slot holds the file pointer
    return file;
}


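Conclusion

The description above leaves one question to investigate: who placed the file pointers at indexes 0, 1, and 2 of the fd array in the files structure, when were they placed there, and what exactly was placed? The Linux startup source gives the answer. During kernel startup, the very first process, init (process 1), opens one file in its files structure: /dev/console. The file pointer for that file is stored at index 0. The dup system call is then used twice to duplicate the file at fd 0, so the files array now holds a file at indexes 0, 1, and 2. These are the stdin, stdout, and stderr we saw earlier.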

asmlinkage void __init start_kernel(void){ // kernel startup code
    ...
    rest_init(); // create the init process
}

static void rest_init(void)
{
    kernel_thread(init, NULL, CLONE_KERNEL); // create the init process as a kernel thread (a general idea is enough); init is the function the thread executes
    ...
}

// Executed by the init kernel thread
static int init(void * unused){
    ...
    if (open("/dev/console", O_RDWR, 0) < 0) // open the file named /dev/console; it becomes fd 0
        printk("Warning: unable to open an initial console.\n");
    (void) dup(0); // duplicate the file at fd 0; the copy becomes fd 1
    (void) dup(0); // duplicate the file at fd 0; the copy becomes fd 2
    ...
}


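So what is /dev/console? In Linux, /dev/console refers to the current console. It is a virtual terminal that has to be attached to a real tty. What is a tty? The name is short for Teletype and denotes a computer terminal device. As the figure shows, a user process communicates with the tty, the tty driver talks to the terminal emulator, and the terminal emulator works with the concrete drivers that interact with the hardware. Readers only need a general idea of this.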

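We can now summarize:

- When the kernel starts, the three files at 0, 1, and 2 are created. When a process forks, the parent's files are copied, so every process created afterwards contains files 0, 1, and 2.
- When we use stdin, stdout, or stderr, specifying fd 0, 1, or 2 operates on that file directly.
- Readers can type echo "hello" >> /dev/tty in a Linux terminal (that is, in a shell) and observe the result, because this tty denotes the current terminal. Then run echo "hello" >> /dev/console and observe again. Note that the concrete terminal behind /dev/console differs from machine to machine. The author uses Ubuntu, where Ctrl + Alt + F1 through FN switches between ttys to see the result.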
