理解JS中的Module

Posted on 2014-06-26 | In Javascript

require函数

Node.js采用CommonJS规范，每个文件就是一个模块，有自己的作用域。在一个文件里面定义的变量、函数、类，都是私有的，对其他文件不可见。每个模块内部，module变量代表当前模块。这个变量是一个对象，它的exports属性（即module.exports）是对外的接口。加载某个模块，其实是加载该模块的module.exports属性。

//module1.js
function greet(){
    consolo.log('greet!')
}
module.exports = greet;

//app.js
var greet = require('./module1.js');
greet(); //greet!

上面例子中，module.exports返回的是一个函数对象

module对象

每个JS文件内部都有一个module对象，代表当前模块，module它的原型是Module。猜想module对象创建的过程为：

当编译require时，找到.js文件，Module会为其生成一个id并注册到一个全局对象中维护
创建module对象和该id绑定
编译该.js文件时，从id索引到module对象

我们先写一个空的module，观察其所有属性：

console.log(module);

/*
Module {
  id: '.',
  exports: {},
  parent: null,
  filename: '/home/user/path/to/module.js',
  loaded: false,
  children: [],
  paths:[...]
}
*/

各属性其含义如下： (1) module.id: 模块的识别符，通常是带有绝对路径的模块文件名。 (2) module.filename 模块的文件名，带有绝对路径。 (3) module.loaded 返回一个布尔值，表示模块是否已经完成加载。 (4) module.parent 返回一个对象，表示调用该模块的模块，当该模块被require时，parent指向require该模块的模块 (5) module.children 返回一个数组，表示该模块要用到的其他模块。例如，当前模块require了3个其它模块，那么children数组的值为这3个模块对象 (6) module.exports 表示模块对外输出的值。

Require Behind Sence

有了前面的铺垫，接下来我们可以分析Node.js中实现require函数的源码

//module1.js
function greet(){
    consolo.log('greet!')
}
module.exports = greet;

//app.js
var greet = require('./module1.js');
greet(); 

接下来从app.js的第一句开始，前面已经提到require并非是一个编译器的预处理符号，而是一个函数，需要被解释器解释执行，可以在require处打一个断点，观察Nodejs的执行过程。

> var greet = require('./module1');
  greet();

断点后，首先进入内部的require函数，传入path为当前文件的路径

  function require(path) {
    try {
      exports.requireDepth += 1;
      return mod.require(path); //返回mod.require,path是当前文件路径
    } finally {
      exports.requireDepth -= 1;
    }
  }

我们最终得到的var greet就是mod.require(path);这个函数的返回值，继续看看这个函数干了什么事情

// Loads a module at the given file path. Returns that module's
// `exports` property.
Module.prototype.require = function(path) {
  assert(path, 'missing path');
  assert(typeof path === 'string', 'path must be a string');
  return Module._load(path, this, /* isMain */ false);
};

从这个函数的注释中可知，该函数会加载path文件中所有的module，并返回它们的exports成员，因此var greet的值间接来自Module._load的返回值。除此之外，这个函数并未做其它事情，而是直接调用了Module._load函数

// Check the cache for the requested file.
// 1. If a module already exists in the cache: return its exports object.
// 2. If the module is native: call `NativeModule.require()` with the
//    filename and return the result.
// 3. Otherwise, create a new module for the file and save it to the cache.
//    Then have it load  the file contents before returning its exports
//    object.
Module._load = function(request, parent, isMain) {
    //1. check cache
    //2. check native
    //3. create a new module
    //filename : "xxxx/module1.js"
    //parent: "xxxx/app.js"
    var module = new Module(filename, parent);
    //...
    Module._cache[filename] = module;
    tryModuleLoad(module, filename);
    return module.exports;
}

省略掉一些代码，注释上看，这个函数只做三件事，检查缓存中是否有这个module，检查该module是否是NativeModule，所谓NativeModule是Node.js中V8部分的C++代码，显然，这里的module1.js不是NativeModule，于是走到了第三步，创建一个新的JS Module，两个参数分别为module1.js的路径和当前app.js文件的路径。

这个函数很关键，从这里开始，便有了module对象，而在创建完module后，将其放入了全局的Module._cache中，但此时创建的module对象是个空壳，如果访问module.exports返回的是一个空对象，因此接下来需要执行load module的操作即tryModuleLoad函数，最后将制作好的module.exports对象返回给var greet对象。

到这里基本流程就走完了，但实际上tryModuleLoad的实现非常复杂，其大概思路为读取module1.js中的代码，调用JavasScript Core的Native代码(V8)进行求值后返回。对此感兴趣的同学可继续阅读附录中的内容，不想了解细节的同学，看到这里也就足够了。

More on Require

如果被require的文件过多，可以将他们放到文件夹中，并创建一个index.js文件用来索引其它被require的文件。假设文件结构如下

├── app.js
├── greet
│   ├── English.js
│   ├── Spanish.js
│   ├── config.json
│   └── index.js

在app.js中可以通过require './greet'来间接应用English.js和Spanish.js两个Module，require './greet'会调用默认require './greet/index.js'：

//index.js
var e = require ('./English')
var s = require ('./Spanish')
module.exports = {
    english:e,
    spanish:s
}
//app.js
var greet = require('./greet')
greet.english();
greet.spanish();

Native Modules

上面我们分析了require的内部实现，其中在module.load的函数中，我们发现Node.js可以load native的module，其源码为：

Module._load = function(request, parent, isMain) {
  //...
  if (NativeModule.nonInternalExists(filename)) {
    debug('load native module %s', request);
    return NativeModule.require(filename);
  }
  ///....
}

因此我们可以尝试在Node.js中require Native的Module，可以到Node.js的文档中找到所有的Native Module，可以随便尝试一个

var util = require('utl');
var name = "Tony";
//看起来像C的print
var greeting = util.format('Hello, %s',name);
util.log(greeting); //10 Mar 01:56:23 - Hello, Tony

需要注意的是，在require native module时，不需要指定路径和后缀名，如果有同名的js module，在require的时候需要加上路径，例如require('path/util')

小结

所有代码都运行在模块作用域，不会污染全局作用域。
模块可以多次加载，但是只会在第一次加载时运行一次，然后运行结果就被缓存了，以后再加载，就直接读取缓存结果。要想让模块再次运行，必须清除缓存。
模块加载的顺序，按照其在代码中出现的顺序。

ES6 Syntax

在ES6中，require和module.exports变成了export和import，例如

//module1.js
export function greet(){
    console.log("Greeting!");
}
//app.js
import * as greeter from 'greet';
greeter.greet();

附录 tryLoadModule的实现

tryLoadModule首先会执行module.load函数，如下

Module.prototype.load = function(filename) {
  //....
  var extension = path.extname(filename) || '.js';
  if (!Module._extensions[extension]) extension = '.js';
  Module._extensions[extension](this, filename);
  this.loaded = true;
  //...
};

extension为Module文件类型，如果没有指定，则默认为.js，另外运行时可以看到Module._extensions可支持的文件类型有三种，.js/.json/.node。每种类型的文件，对应一个解析的API，接下来会执行Module.extensions[.js](this,filename)

// Native extension for .js
Module._extensions['.js'] = function(module, filename) {
  var content = fs.readFileSync(filename, 'utf8');
  module._compile(internalModule.stripBOM(content), filename);
};

这个函数首先读取了module1.js中的代码，接下来调用module._compile对这部分代码做了一些预处理，具体为

NativeModule.wrap = function(script) {
    return NativeModule.wrapper[0] + script + NativeModule.wrapper[1];
};

NativeModule.wrapper = [
'(function (exports, require, module, __filename, __dirname) { ',
'\n});'
];

这里很有意思，所谓的module._compile实际上是将module中的代码包装了一层变为了IIFE的形式

(function (exports, require, module, __filename, __dirname){
    function greet(){
        consolo.log('greet!')
    }
    module.exports = greet;
})

接着将包装后的代码丢给 vm.js中的runInThisContext来执行，该函数用来和V8通信，返回一个compilerWrapper的对象，该对象会来执行module1.js中的代码, 即module.exports = greet

//参数this.exports即为module.exports
result = compiledWrapper.call(this.exports, this.exports, require, this,filename, dirname);

//module1.js
function greet(){
    consolo.log('greet!')
}
> module.exports = greet; //执行这一句