# fastparse

A very simple and stupid parser, based on a state machine and regular expressions.

It is not intended for complex languages. Its purpose is to make it easy to write a simple parser for a simple language.



## Usage

Pass a description of the state machine to the constructor. The description must have this form:

``` javascript
new Parser(description)

description is {
	// The key is the name of the state
	// The value is an object containing possible transitions
	"state-name": {
		// The key is a regular expression
		// If the regular expression matches, the transition is executed
		// The value can be true, another state name or a function

		"a": true,
		// true will make the parser stay in the current state
		
		"b": "other-state-name",
		// a string will make the parser transit to a new state
		
		"[cde]": function(match, index, matchLength) {
			// "match" will be the matched string
			// "index" will be the position in the complete string
			// "matchLength" will be "match.length"
			
			// "this" will be the "context" passed to the "parse" method
			
			// A new state name (string) can be returned
			return "other-state-name";
		},
		
		"([0-9]+)(\\.[0-9]+)?": function(match, first, second, index, matchLength) {
			// groups can be used in the regular expression
			// they will match to arguments "first", "second"
		},
		
		// the parser stops when it cannot match the string anymore
		
		// the order of keys is the order in which regular expressions are tried,
		// provided the javascript runtime preserves the order of keys in an object
		// (current engines do so for ordinary string keys; keys that look like
		// non-negative integers, e.g. "1", are enumerated first)
	}
}
```
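
The key-order note in the description above can be checked in plain JavaScript. The snippet below is only an illustration and does not use fastparse; it assumes that patterns are enumerated with something like `Object.keys` and tried as alternatives of one regular expression, which this README suggests but does not spell out.

```javascript
// Ordinary string keys come back in insertion order in current engines...
var ordinary = Object.keys({ "b": true, "a": true, "[cde]": true });

// ...but keys that look like non-negative integers are listed first, ascending.
var withIntegerKeys = Object.keys({ "b": true, "1": true, "0": true });

// Why the order matters: in an alternation, the first alternative that
// matches at a position wins, even if a later one would match more text.
var shortFirst = /(a)|(ab)/.exec("ab")[0];
var longFirst = /(ab)|(a)/.exec("ab")[0];

console.log(ordinary, withIntegerKeys, shortFirst, longFirst);
```

If two patterns in a state can match at the same position, list the more specific one first, and avoid patterns that consist only of digits.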

The state machine is compiled down to a single regular expression per state, so the actual parsing work is delegated to the (native) regular expression engine of the javascript runtime.
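
To make the "one regular expression per state" idea concrete, here is a minimal sketch of how such a compilation step could work. It is not fastparse's actual source: `countGroups`, `compileState` and `matchOnce` are hypothetical names, and the approach (wrap every pattern in its own group, join with `|`, remember group offsets) is an assumption about the general technique.

```javascript
// Count the capturing groups of a pattern: appending an empty alternative
// makes the expression match "", and the result has one slot per group.
function countGroups(pattern) {
	return new RegExp(pattern + "|").exec("").length - 1;
}

// Combine all patterns of one state into a single alternation and remember
// which group range belongs to which transition.
function compileState(transitions) {
	var offset = 1; // group index of the wrapper group of the next pattern
	var actions = [];
	var parts = Object.keys(transitions).map(function (pattern) {
		var groups = countGroups(pattern);
		actions.push({
			value: transitions[pattern],
			start: offset,
			end: offset + groups + 1
		});
		offset += groups + 1;
		return "(" + pattern + ")";
	});
	return { regExp: new RegExp(parts.join("|"), "g"), actions: actions };
}

// Run the combined expression once, starting the search at "index", and
// report which transition fired plus the arguments a transition function
// would receive: match, groups..., index, matchLength.
function matchOnce(compiled, input, index) {
	compiled.regExp.lastIndex = index;
	var m = compiled.regExp.exec(input);
	if (!m) return null;
	for (var i = 0; i < compiled.actions.length; i++) {
		var action = compiled.actions[i];
		if (m[action.start] !== undefined) {
			var args = m.slice(action.start, action.end).concat([m.index, m[0].length]);
			return { value: action.value, args: args };
		}
	}
	return null;
}

var compiled = compileState({
	"a": true,
	"b": "other-state-name",
	"([0-9]+)(\\.[0-9]+)?": "number"
});
var result = matchOnce(compiled, "x 3.14", 0);
console.log(compiled.regExp.source, result);
```

Because every pattern is wrapped in its own capturing group, the first defined wrapper group identifies the transition, and the groups that follow it are the pattern's own captures.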


``` javascript
Parser.prototype.parse(initialState: String, parsedString: String, context: Object)
```

`initialState`: the state in which the parser starts.

`parsedString`: the string to parse.

`context`: an object which can be used to save state and results. Available as `this` in transition functions.

Returns `context`.
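
The contract of `parse` (start in a state, call transition functions with `context` as `this`, switch state when a string is returned, stop when nothing matches, return `context`) can be sketched without the library. The loop below is a hypothetical stand-in, not fastparse's implementation: `runStateMachine`, the hand-written `states` table and its `onMatch` handlers are names invented for this illustration.

```javascript
// Drive a table of { regExp, onMatch } states over the input.
function runStateMachine(states, initialState, input, context) {
	var current = initialState;
	var index = 0;
	for (;;) {
		var state = states[current];
		state.regExp.lastIndex = index;
		var match = state.regExp.exec(input);
		if (!match) return context; // nothing matches anymore: stop
		index = state.regExp.lastIndex;
		if (match[0].length === 0) index++; // never loop on an empty match
		// "this" inside the handler is the context object
		var next = state.onMatch.call(context, match[0], match.index, match[0].length);
		if (typeof next === "string") current = next;
	}
}

// Two states that collect the contents of double-quoted strings.
var states = {
	"outside": {
		regExp: /"/g,
		onMatch: function () {
			return "inside";
		}
	},
	"inside": {
		regExp: /[^"]*"/g,
		onMatch: function (match) {
			this.strings.push(match.slice(0, -1)); // drop the closing quote
			return "outside";
		}
	}
};

var collected = runStateMachine(states, "outside", 'say "hi" and "bye"', { strings: [] });
console.log(collected.strings);
```

Keeping all results on the context object means the transition functions need no shared outer variables, and the same state table can be reused for several inputs.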




## Example

``` javascript
var Parser = require("fastparse");

// A simple parser that extracts @license / @licence text from comments in a JS file
var parser = new Parser({
	// The "source" state
	"source": {
		// matches comment start
		"/\\*": "comment",
		"//": "linecomment",
		
		// this would be necessary for a complex language like JS
		// but omitted here for simplicity
		// "\"": "string1",
		// "\'": "string2",
		// "\/": "regexp"
		
	},
	// The "comment" state
	"comment": {
		"\\*/": "source",
		"@licen[cs]e\\s((?:[^*\n]|\\*+[^*/\n])*)": function(match, licenseText) {
			this.licences.push(licenseText.trim());
		}
	},
	// The "linecomment" state
	"linecomment": {
		"\n": "source",
		"@licen[cs]e\\s(.*)": function(match, licenseText) {
			this.licences.push(licenseText.trim());
		}
	}
});

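// sample input, made up for illustration; replace it with the source you want to scan
var sourceCode = "/* @license MIT */\nvar x = 1; // @licence Apache-2.0\n";

var licences = parser.parse("source", sourceCode, { licences: [] }).licences;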

console.log(licences);
```



## License

MIT (http://www.opensource.org/licenses/mit-license.php)