Hardly any programmer escapes the need to use regular expressions in one form or another from time to time. For many, the pattern syntax can seem cryptic and forbidding. This tutorial will introduce a new pattern-matching engine, apg-exp—a feature-rich alternative to RegExp with an ABNF pattern syntax that is a little easier on the eyes.
A Quick Comparison
Have you ever needed to verify an email address and come across something like this?
^[\w!#$%&'*+/=?^_`{|}~-]+(?:\.[\w!#$%&'*+/=?^_`{|}~-]+)*@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}$
A pattern-matching engine is the right tool for the job. This is a well-designed, well-written regular expression. It works great. So what's not to like?
Well, if you are an expert with regular expressions, nothing at all. But for the rest of us, they can be:
- Hard to read
- Even harder to write
- Hard to maintain
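For reference, here is a minimal sketch of how the regular expression above might be used with JavaScript's built-in RegExp. The function name looksLikeEmail is hypothetical, and the i flag is my assumption: the pattern as written lists only A-Z in the domain part, so without it lower-case domains would not match. The / characters are escaped only because the pattern is written as a regex literal.

```javascript
// The email pattern from above, written as a regex literal.
// Assumption: the "i" flag is added so that [A-Z] also accepts lower-case letters.
const emailRegExp =
  /^[\w!#$%&'*+\/=?^_`{|}~-]+(?:\.[\w!#$%&'*+\/=?^_`{|}~-]+)*@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}$/i;

// Hypothetical helper: true when the whole string matches the pattern.
function looksLikeEmail(text) {
  return emailRegExp.test(text);
}
```

It works, but changing the rules later (say, allowing a longer top domain) means finding the right few characters in that one long line.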
The regular expression syntax has a long, time-honored history and is deeply integrated into many of the tools and languages that we, as programmers, use every day.
There is, however, an alternative syntax that has been around almost as long, is very popular with writers and users of Internet technical specifications, and has all the power of regular expressions, yet is seldom used in the world of JavaScript programming: the Augmented Backus-Naur Form, or ABNF, formally defined by the IETF in RFC 5234 and RFC 7405.
Let's see what that same email address might look like in ABNF.
email-address = local "@" domain
local = local-word *("." local-word)
domain = 1*(sub-domain ".") top-domain
local-word = 1*local-char
sub-domain = 1*sub-domain-char
top-domain = 2*6top-domain-char
local-char = alpha / num / special
sub-domain-char = alpha / num / "-"
top-domain-char = alpha
alpha = %d65-90 / %d97-122
num = %d48-57
special = %d33 / %d35 / %d36-39 / %d42-43 / %d45 / %d47
        / %d61 / %d63 / %d94-96 / %d123-126
Not as compact, for sure, but like HTML and XML it is designed to be read by humans as well as machines. I'm guessing that with nothing more than a passing knowledge of wild card search patterns, you can just about read what is going on here in "plain English".
- the email address is defined as a local part and a domain separated by @
- the local part is one word followed by optional dot-separated words
- the domain is one or more dot-separated sub-domains followed by a single top domain
- the only things you might not know here, but can probably guess, are:
  - just as the wild card character * means "zero or more", 1* means "one or more" and 2*6 means min 2 and max 6 repetitions
  - / separates alternate choices
  - %d defines decimal character codes and character code ranges
    - for example, %d35 represents #, ASCII decimal 35
    - %d65-90 represents any character in the range A-Z, ASCII decimals 65-90
RegExp and apg-exp are compared for this email address in example 1.
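To show how directly the ABNF rules above translate into logic, here is a small hand-written matcher in plain JavaScript. It is only an illustration of reading the grammar rule by rule, not how apg-exp works internally, and every function name in it is my own invention.

```javascript
// alpha, num and special, taken from the character-code ranges in the grammar.
const isAlpha = (c) => (c >= 65 && c <= 90) || (c >= 97 && c <= 122);
const isNum = (c) => c >= 48 && c <= 57;
const isSpecial = (c) =>
  c === 33 || c === 35 || (c >= 36 && c <= 39) || (c >= 42 && c <= 43) ||
  c === 45 || c === 47 || c === 61 || c === 63 ||
  (c >= 94 && c <= 96) || (c >= 123 && c <= 126);

// local-char = alpha / num / special ; sub-domain-char = alpha / num / "-"
const isLocalChar = (c) => isAlpha(c) || isNum(c) || isSpecial(c);
const isSubDomainChar = (c) => isAlpha(c) || isNum(c) || c === 45;

// Greedy min*max repetition: returns the position after the match, or -1.
function repeat(text, pos, accepts, min, max) {
  let end = pos;
  while (end < text.length && end - pos < max && accepts(text.charCodeAt(end))) end++;
  return end - pos >= min ? end : -1;
}

// Single literal character: returns the position after it, or -1.
function literal(text, pos, ch) {
  return text[pos] === ch ? pos + 1 : -1;
}

// local = local-word *("." local-word)
function matchLocal(text, pos) {
  let end = repeat(text, pos, isLocalChar, 1, Infinity);
  if (end < 0) return -1;
  for (;;) {
    const dot = literal(text, end, ".");
    if (dot < 0) return end;
    const word = repeat(text, dot, isLocalChar, 1, Infinity);
    if (word < 0) return end;
    end = word;
  }
}

// domain = 1*(sub-domain ".") top-domain
function matchDomain(text, pos) {
  let end = pos;
  let count = 0;
  for (;;) {
    const sub = repeat(text, end, isSubDomainChar, 1, Infinity);
    if (sub < 0) break;
    const dot = literal(text, sub, ".");
    if (dot < 0) break;
    end = dot;
    count++;
  }
  if (count < 1) return -1;
  return repeat(text, end, isAlpha, 2, 6); // top-domain = 2*6top-domain-char
}

// email-address = local "@" domain, required to cover the whole string.
function matchEmail(text) {
  const local = matchLocal(text, 0);
  if (local < 0) return false;
  const at = literal(text, local, "@");
  if (at < 0) return false;
  return matchDomain(text, at) === text.length;
}
```

Each function corresponds to one rule, which is the point of the exercise: the grammar names its parts, so the code (and the reader) can too.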
apg-exp is a pattern-matching engine designed to have the look and feel of RegExp but to use the ABNF syntax for pattern definitions. In the next few sections I'll walk you through:
- How to get apg-exp into your app
- A short guide to the ABNF syntax
- Working with apg-exp—a few examples
- Where to go next—more details, advanced examples
Up and Running—How to Get It
npm
If you are working in a Node.js environment, from your project directory run:
npm install apg-exp --save
You can then access it in your code with require(). For example:
var ApgExp = require("apg-exp");
var exp = new ApgExp(pattern, flags);
var result = exp.exec(stringToMatch);
GitHub
To get a copy of the code from GitHub, you can clone the repository to your project directory:
git clone http://ift.tt/2av5CgH apg-exp
Then in page.html:
<!-- optional stylesheet used in tutorial examples -->
<link rel="stylesheet" href="./apg-exp/apgexp.css">
<script src="./apg-exp/apgexp-min.js"></script>
<script>
  var useApgExp = function(){
    var exp = new ApgExp(pattern, flags);
    var result = exp.exec(stringToMatch);
    /* do something with the result */
  };
</script>
CDN
You can also create a CDN version directly from the GitHub source using RawGit. However, be sure to read the notice about no uptime or support guarantees (in fact, be sure to read the entire FAQ).
The following are used in all of the examples in this tutorial.
<link rel="stylesheet"
href="http://ift.tt/2av6bXO">
<script
src="http://ift.tt/29WalZT"
charset="utf-8"></script>
These files are cached on the MaxCDN servers and you are free to use them for testing as long as they remain available. However, for production, you should place copies of apgexp-min.js and apgexp.css on your own servers for guaranteed access and include them in your pages as best suited to your application.
A Short Guide to ABNF
ABNF is a syntax to describe phrases, a phrase being any string. As you saw in the email example above, it allows you to break down complex phrases into a collection of simpler phrases. A phrase definition has the form:
name = elements LF
where LF is a line feed (newline, \n) character.
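Because every phrase definition has to end with a line feed, it can be convenient to keep the rules in an array and join them into one pattern string. The helper below is a sketch of that idea; buildPattern is a hypothetical name of my own, not part of apg-exp, and the two-rule grammar is just a small example.

```javascript
// Hypothetical helper: turn an array of ABNF rule lines into one pattern string,
// making sure each line ends with exactly one line feed.
function buildPattern(ruleLines) {
  return ruleLines
    .map((line) => line.replace(/[\r\n]+$/, "") + "\n")
    .join("");
}

// A tiny two-rule grammar: a word is one or more ASCII letters.
const wordPattern = buildPattern([
  "word = 1*alpha",
  "alpha = %d65-90 / %d97-122",
]);
```

The resulting string is the kind of value that would be passed as the pattern argument to new ApgExp(pattern, flags).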
The table below is a short guide to the elements (see SABNF for the full guide).
Continue reading "An Alternative to Regular Expressions: apg-exp" by Lowell D. Thomas via SitePoint