The script system\packages\regex\regbuild.ijs contains definitions for building regular expression patterns.
Many of the verbs below may enclose their argument in parentheses (to make it a subexpression). For example, anyof 'abc' returns '(abc)*'.
The argument is put in parentheses only if necessary. For example, a set is already a single unit, so anyof set 'abc' is '[abc]*'.
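A short session sketch of this rule, using only the results stated above. It assumes regbuild.ijs is already loaded; the load line shown in the comment uses the path given at the top of this section and may need adjusting for a particular installation. The names p1 and p2 are illustrative.

```j
NB. Assumption: regbuild.ijs is loaded, for example with
NB.   load 'system\packages\regex\regbuild.ijs'
NB. (path as given above; adjust for your installation).

p1 =. anyof 'abc'       NB. multi-character argument is wrapped: (abc)*
p2 =. anyof set 'abc'   NB. a set is already one unit:           [abc]*

NB. Both comparisons should give 1 if the verbs behave as described.
p1 -: '(abc)*'
p2 -: '[abc]*'
```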
The following verbs correspond directly to a feature of the regular expression notation:
set chars | returns set construction for chars |
set 'abc' | |
[abc] |
not chars | set of non-matching chars |
set not 'abc' | |
[^abc] |
sub pat | make a subexpression |
sub 'abc' | |
(abc) |
someof pat | pattern matching 1 or more pat |
someof 'abc' | |
(abc)+ |
(min,max) of pat | pattern matching min up to max of pat |
2 4 of 'abc' | |
(abc){2,4} |
pat1 or pat2 | pattern matching either pat1 or pat2 |
'abc' or 'd' | |
abc|d |
pat1 by pat2 | pattern matching pat1 immediately followed by pat2 |
'action=' by 'move' or 'copy' | |
action=(move|copy) |
sub pat | makes pat a subexpression |
sub 'abc' | |
(abc) |
bkref refnum | back-reference to a previous subexpression |
bkref 1 | |
\1 |
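Because J evaluates from right to left, these verbs compose without extra punctuation, and larger patterns can be built from small pieces. The sketch below assumes regbuild.ijs is loaded and that the main regex library (which supplies rxfirst) is available through require 'regex'. The sample text and the name pat are made up for illustration.

```j
require 'regex'     NB. assumption: the regex library is installed under this name

NB. 'move' or 'copy' is evaluated first, then by joins it to 'action='.
pat =. 'action=' by 'move' or 'copy'
pat -: 'action=(move|copy)'     NB. 1 if the result matches the table above

NB. Apply the built pattern to some (made-up) text; rxfirst returns the first match.
pat rxfirst 'request: action=copy file1 file2'
```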
Some nouns can be used as parts of regular expressions:
white | pattern matching one or more whitespace characters |
owhite | pattern matching optional whitespace |
sol | pattern matching the start of a line |
eol | pattern matching the end of a line |
any | pattern matching any character |
Finally, some miscellaneous verbs:
plain text | returns a regular expression matching the plain text |
plain 'dir j.*' | |
dir j\.\* |
pat1 between y | result is elements of y catenated together with pat1 between each |
' *' between 'abc' | |
a *b *c |
' *' between 'p1';'p2';'p3|p4' | |
p1 *p2 *(p3|p4) |
comment nb pattern | add comment to pattern |
Interpretation of a pattern always stops at the first null character (0{a.). The nb verb makes use of this by catenating a null character and comment at the end of a pattern.
   p=. rxcomp 'some digits' nb '[[:digit:]]+'
   rxinfo p
+-+----------------------------+
|1|[[:digit:]]+ NB. some digits|
+-+----------------------------+
setchars setpat | returns list of characters matching a set pattern |
setchars '[a-d[:digit:]]' | |
0123456789abcd |
Character classes
The following nouns are strings which are used within sets to specify a character class:
alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit
For example,
alpha=. '[:alpha:]'
Corresponding nouns, named with a leading uppercase letter, are patterns specifying a set of the character class. For example,
Alpha=. '[[:alpha:]]' NB. (same as set alpha)
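Since the lowercase nouns are plain strings, they can be catenated with other set members before applying set. This is a minimal sketch, assuming regbuild.ijs is loaded. The names ident and chars are illustrative, and the result noted in the comment follows from the definitions above (assuming set simply encloses its argument in brackets) rather than from a recorded session.

```j
NB. Assumption: regbuild.ijs is loaded, so alpha, digit, lower,
NB. set, someof and setchars are defined.

NB. One or more letters, digits or underscores.
NB. Should be [[:alpha:][:digit:]_]+ if set only adds the brackets.
ident =. someof set alpha , digit , '_'

NB. The characters matched by the set [[:lower:]].
chars =. setchars set lower
```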
J patterns
The following nouns, defined in packages\regex\regj.ijs, are patterns which match elements of J code:
Jname | matches a J name |
Jnumitem, Jnum | matches a J numeric item or array (constant) |
Jchar | matches a J character string |
Jconst | matches a J numeric or character constant, including a. and a: |
Jgassign, Jlassign, Jassign | matches J global, local, or either assignment |
Jlpar, Jrpar | match J's left and right parentheses |
Jsol, Jeol | match the start or end of a J sentence |
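These nouns combine with the building verbs like any other pattern. The sketch below is illustrative only: it assumes regbuild.ijs and regj.ijs are loaded along with the regex library (which supplies rxall), and the sample sentence and the name asg are invented for the example.

```j
require 'regex'     NB. assumption: supplies rxall

NB. A name, optional whitespace, then either kind of assignment.
asg =. Jname by owhite by Jassign

NB. List the assignment heads found in a sample line of J code.
asg rxall 'total =: +/ y [ n =. # y'
```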