pub struct Regex { /* private fields */ }
This struct is a wrapper around an Oniguruma regular expression pointer. This represents a compiled regex which can be used in search and match operations.
Implementations§
impl Regex
pub fn captures<'t>(&self, text: &'t str) -> Option<Captures<'t>>
Returns the capture groups corresponding to the leftmost-first match in text. Capture group 0 always corresponds to the entire match. If no match is found, then None is returned.
pub fn find_iter<'r, 't>(&'r self, text: &'t str) -> FindMatches<'r, 't>
Returns an iterator for each successive non-overlapping match in text, returning the start and end byte indices with respect to text.
§Example
Find the start and end location of every word with exactly 13 characters:
let text = "Retroactively relinquishing remunerations is reprehensible.";
for pos in Regex::new(r"\b\w{13}\b").unwrap().find_iter(text) {
    println!("{:?}", pos);
}
// Output:
// (0, 13)
// (14, 27)
// (28, 41)
// (45, 58)
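Because the pairs are byte indices into text, they can be used to slice the original string directly. The following is a minimal sketch using only the standard library: the ranges are copied from the output above rather than computed, and slice_spans is a hypothetical helper, not part of this crate.

```rust
/// Slice `text` at each `(start, end)` byte range.
/// Panics if a range is out of bounds or not on a char boundary.
fn slice_spans<'t>(text: &'t str, spans: &[(usize, usize)]) -> Vec<&'t str> {
    spans.iter().map(|&(start, end)| &text[start..end]).collect()
}

fn main() {
    let text = "Retroactively relinquishing remunerations is reprehensible.";
    // Byte ranges copied from the example output above; they are not
    // computed here, since this sketch does not call the regex engine.
    let spans = [(0, 13), (14, 27), (28, 41), (45, 58)];
    for word in slice_spans(text, &spans) {
        println!("{}", word);
    }
}
```

In real code the ranges would come from find_iter itself, so the slicing step is the only part this sketch demonstrates.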
pub fn captures_iter<'r, 't>(&'r self, text: &'t str) -> FindCaptures<'r, 't>
Returns an iterator over all the non-overlapping capture groups matched in text. This is operationally the same as find_iter (except it yields information about submatches).
§Example
We can use this to find all movie titles and their release years in some text, where the movie is formatted like “‘Title’ (xxxx)”:
let re = Regex::new(r"'([^']+)'\s+\((\d{4})\)")
    .unwrap();
let text = "'Citizen Kane' (1941), 'The Wizard of Oz' (1939), 'M' (1931).";
for caps in re.captures_iter(text) {
    // at() returns an Option<&str>, so unwrap it before printing.
    println!("Movie: {}, Released: {}",
             caps.at(1).unwrap_or(""), caps.at(2).unwrap_or(""));
}
// Output:
// Movie: Citizen Kane, Released: 1941
// Movie: The Wizard of Oz, Released: 1939
// Movie: M, Released: 1931
pub fn split<'r, 't>(&'r self, text: &'t str) -> RegexSplits<'r, 't>
Returns an iterator of substrings of text delimited by a match of the regular expression. Namely, each element of the iterator corresponds to text that isn’t matched by the regular expression.
This method will not copy the text given.
§Example
To split a string delimited by arbitrary amounts of spaces or tabs:
let re = Regex::new(r"[ \t]+").unwrap();
let fields: Vec<&str> = re.split("a b \t c\td e").collect();
assert_eq!(fields, vec!["a", "b", "c", "d", "e"]);
pub fn splitn<'r, 't>(
    &'r self,
    text: &'t str,
    limit: usize,
) -> RegexSplitsN<'r, 't>
Returns an iterator of at most limit substrings of text delimited by a match of the regular expression. (A limit of 0 will return no substrings.) Namely, each element of the iterator corresponds to text that isn’t matched by the regular expression. The remainder of the string that is not split will be the last element in the iterator.
This method will not copy the text given.
§Example
Get the first two words in some text:
let re = Regex::new(r"\W+").unwrap();
let fields: Vec<&str> = re.splitn("Hey! How are you?", 3).collect();
assert_eq!(fields, vec!["Hey", "How", "are you?"]);
pub fn scan_with_region<F>(
    &self,
    to_search: &str,
    region: &mut Region,
    options: SearchOptions,
    callback: F,
) -> i32
Scan the given slice, capturing into the given region and executing a callback for each match.
impl Regex
pub fn capture_names_len(&self) -> usize
Returns the number of named groups in the regex.
impl Regex
pub fn replace<R: Replacer>(&self, text: &str, rep: R) -> String
Replaces the leftmost-first match with the replacement provided. The replacement can be a regular string or a function that takes the match’s Captures and returns the replaced string.
If no match is found, then a copy of the string is returned unchanged.
§Examples
Note that this function is polymorphic with respect to the replacement. In typical usage, this can just be a normal string:
let re = Regex::new("[^01]+").unwrap();
assert_eq!(re.replace("1078910", ""), "1010");
But anything satisfying the Replacer trait will work. For example, a closure of type |&Captures| -> String provides direct access to the captures corresponding to a match. This allows one to access submatches easily:
let re = Regex::new(r"([^,\s]+),\s+(\S+)").unwrap();
let result = re.replace("Springsteen, Bruce", |caps: &Captures| {
    format!("{} {}", caps.at(2).unwrap_or(""), caps.at(1).unwrap_or(""))
});
assert_eq!(result, "Bruce Springsteen");
pub fn replace_all<R: Replacer>(&self, text: &str, rep: R) -> String
Replaces all non-overlapping matches in text with the replacement provided. This is the same as calling replacen with limit set to 0.
See the documentation for replace for details on how to access submatches in the replacement string.
pub fn replacen<R: Replacer>(&self, text: &str, limit: usize, rep: R) -> String
Replaces at most limit non-overlapping matches in text with the replacement provided. If limit is 0, then all non-overlapping matches are replaced.
See the documentation for replace for details on how to access submatches in the replacement string.
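The "limit of 0 means replace everything" convention is easy to confuse with the standard library's str::replacen, where a count of 0 replaces nothing. The sketch below is an illustration with literal patterns only; replacen_literal is a hypothetical helper that mimics the convention described here and does not call this crate.

```rust
/// Hypothetical helper: literal-pattern replacement that follows the
/// "limit of 0 means replace every match" convention described above.
fn replacen_literal(text: &str, pat: &str, limit: usize, rep: &str) -> String {
    if limit == 0 {
        // 0 is treated as "no limit".
        text.replace(pat, rep)
    } else {
        text.replacen(pat, rep, limit)
    }
}

fn main() {
    // A limit of 0 replaces every occurrence.
    assert_eq!(replacen_literal("a-b-c-d", "-", 0, "+"), "a+b+c+d");
    // A positive limit replaces at most that many occurrences.
    assert_eq!(replacen_literal("a-b-c-d", "-", 2, "+"), "a+b+c-d");
    // By contrast, std's str::replacen with a count of 0 replaces nothing.
    assert_eq!("a-b-c-d".replacen("-", "+", 0), "a-b-c-d");
}
```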
impl Regex
pub fn new(pattern: &str) -> Result<Self, Error>
Create a Regex
Simple regular expression constructor. Compiles a new regular expression with the default options using the Ruby syntax. Once compiled, it can be used repeatedly to search in a string. If an invalid expression is given, then an error is returned.
§Arguments
pattern - The regex pattern to compile.
§Examples
use onig::Regex;
let r = Regex::new(r#"hello (\w+)"#);
assert!(r.is_ok());
pub fn with_encoding<T>(pattern: T) -> Result<Regex, Error>
where
    T: EncodedChars,
Create a Regex, Specifying an Encoding
Attempts to compile pattern into a new Regex instance. Instead of assuming UTF-8 as the encoding scheme, the encoding is inferred from the pattern buffer.
§Arguments
pattern - The regex pattern to compile.
§Examples
use onig::{Regex, EncodedBytes};
let utf8 = Regex::with_encoding("hello");
assert!(utf8.is_ok());
let ascii = Regex::with_encoding(EncodedBytes::ascii(b"world"));
assert!(ascii.is_ok());
pub fn with_options(
    pattern: &str,
    option: RegexOptions,
    syntax: &Syntax,
) -> Result<Regex, Error>
Create a new Regex
Attempts to compile a pattern into a new Regex instance. Once compiled, it can be used repeatedly to search in a string. If an invalid expression is given, then an error is returned.
See onig_sys::onig_new for more information.
§Arguments
pattern - The regex pattern to compile.
option - The regex compilation options.
syntax - The syntax which the regex is written in.
§Examples
use onig::{Regex, Syntax, RegexOptions};
let r = Regex::with_options("hello.*world",
                            RegexOptions::REGEX_OPTION_NONE,
                            Syntax::default());
assert!(r.is_ok());
pub fn with_options_and_encoding<T>(
    pattern: T,
    option: RegexOptions,
    syntax: &Syntax,
) -> Result<Self, Error>
where
    T: EncodedChars,
Create a new Regex, Specifying Options and Encoding
Attempts to compile the given pattern into a new Regex instance. Instead of assuming UTF-8 as the encoding scheme, the encoding is inferred from the pattern buffer. If the regex fails to compile, the returned Error value from onig_new contains more information.
§Arguments
pattern - The regex pattern to compile.
option - The regex compilation options.
syntax - The syntax which the regex is written in.
§Examples
use onig::{Regex, Syntax, EncodedBytes, RegexOptions};
let pattern = EncodedBytes::ascii(b"hello");
let r = Regex::with_options_and_encoding(pattern,
                                         RegexOptions::REGEX_OPTION_SINGLELINE,
                                         Syntax::default());
assert!(r.is_ok());
pub fn match_with_options(
    &self,
    str: &str,
    at: usize,
    options: SearchOptions,
    region: Option<&mut Region>,
) -> Option<usize>
Match String
Try to match the regex against the given string slice, starting at a given offset. This method works the same way as match_with_encoding, but the encoding is always UTF-8.
For more information see Match vs Search.
§Arguments
str - The string slice to match against.
at - The byte index in the passed slice to start matching.
options - The regex match options.
region - The region for return group match range info.
§Returns
Some(len) if the regex matched, with len being the number of bytes matched. None if the regex doesn’t match.
§Examples
use onig::{Regex, SearchOptions};
let r = Regex::new(".*").unwrap();
let res = r.match_with_options("hello", 0, SearchOptions::SEARCH_OPTION_NONE, None);
assert!(res.is_some()); // it matches
assert!(res.unwrap() == 5); // 5 bytes matched
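Since at is a byte index and the returned length is counted in bytes, the numbers only coincide with character counts for ASCII input such as "hello". The standard-library sketch below illustrates the difference for UTF-8 text; byte_and_char_len is a hypothetical helper and the regex engine is not called. Based on the documented return value, a full match of "héllo" would be expected to report 6 rather than 5, though this sketch does not verify that.

```rust
/// Returns (length in bytes, length in chars) of a string slice.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // ASCII text: one byte per character.
    assert_eq!(byte_and_char_len("hello"), (5, 5));
    // 'é' occupies two bytes in UTF-8, so the byte length is larger.
    assert_eq!(byte_and_char_len("héllo"), (6, 5));

    // Byte offsets 1 and 3 fall on character boundaries; offset 2 is in
    // the middle of 'é' and is not a valid place to slice a &str.
    let s = "héllo";
    assert!(s.is_char_boundary(1));
    assert!(!s.is_char_boundary(2));
    assert!(s.is_char_boundary(3));
}
```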
pub fn match_with_encoding<T>(
    &self,
    chars: T,
    at: usize,
    options: SearchOptions,
    region: Option<&mut Region>,
) -> Option<usize>
where
    T: EncodedChars,
Match String with Encoding
Match the regex against a string. This method will start at the offset at into the string and try to match the regex. If the regex matches, the return value is the number of bytes that matched. If the regex doesn’t match, the return value is None.
For more information see Match vs Search.
The contents of chars must have the same encoding that was used to construct the regex.
§Arguments
chars - The buffer to match against.
at - The byte index in the passed buffer to start matching.
options - The regex match options.
region - The region for return group match range info.
§Returns
Some(len) if the regex matched, with len being the number of bytes matched. None if the regex doesn’t match.
§Examples
use onig::{Regex, EncodedBytes, SearchOptions};
let r = Regex::with_encoding(EncodedBytes::ascii(b".*")).unwrap();
let res = r.match_with_encoding(EncodedBytes::ascii(b"world"),
                                0, SearchOptions::SEARCH_OPTION_NONE, None);
assert!(res.is_some()); // it matches
assert!(res.unwrap() == 5); // 5 bytes matched
pub fn match_with_param<T>(
    &self,
    chars: T,
    at: usize,
    options: SearchOptions,
    region: Option<&mut Region>,
    match_param: MatchParam,
) -> Result<Option<usize>, Error>
where
    T: EncodedChars,
Match string with encoding and match param
Match the regex against a string. This method will start at the offset at into the string and try to match the regex. If the regex matches, the return value is the number of bytes that matched. If the regex doesn’t match, the return value is None.
For more information see Match vs Search.
The contents of chars must have the same encoding that was used to construct the regex.
§Arguments
chars - The buffer to match against.
at - The byte index in the passed buffer to start matching.
options - The regex match options.
region - The region for return group match range info.
match_param - The match parameters.
§Returns
Ok(Some(len)) if the regex matched, with len being the number of bytes matched. Ok(None) if the regex doesn’t match. Err with an Error if an error occurred (e.g. retry-limit-in-match exceeded).
§Examples
use onig::{Regex, EncodedBytes, MatchParam, SearchOptions};
let r = Regex::with_encoding(EncodedBytes::ascii(b".*")).unwrap();
let res = r.match_with_param(EncodedBytes::ascii(b"world"),
                             0, SearchOptions::SEARCH_OPTION_NONE,
                             None, MatchParam::default());
assert!(res.is_ok()); // matching did not error
assert!(res.unwrap() == Some(5)); // 5 bytes matched
pub fn search_with_options(
    &self,
    str: &str,
    from: usize,
    to: usize,
    options: SearchOptions,
    region: Option<&mut Region>,
) -> Option<usize>
Search pattern in string
Search for matches of the regex in a string. This method will return the index of the first match of the regex within the string, if there is one. If from is less than to, then the search is performed in forward order; otherwise it is performed in backward order.
For more information see Match vs Search.
§Arguments
str - The string to search in.
from - The byte index in the passed slice at which to start the search.
to - The byte index in the passed slice at which to finish the search.
options - The options for the search.
region - The region for return group match range info.
§Returns
Some(pos) if the regex matches, where pos is the byte position of the start of the match. None if the regex doesn’t match anywhere in str.
§Examples
use onig::{Regex, SearchOptions};
let r = Regex::new("l{1,2}").unwrap();
let res = r.search_with_options("hello", 0, 5, SearchOptions::SEARCH_OPTION_NONE, None);
assert!(res.is_some()); // it matches
assert!(res.unwrap() == 2); // match starts at byte index 2 (the third character)
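The match methods return a byte length measured from a fixed offset, while the search methods return the byte position where a match starts. As a rough analogy only, the sketch below contrasts the two conventions using a literal substring in place of a regex. The helpers match_at and search_forward are hypothetical, use only the standard library, and cover only the forward direction. Confining the whole match to from..to is a simplification and may differ from how the engine treats to.

```rust
/// Stand-in for a "match": succeeds only if `pat` occurs exactly at byte
/// offset `at`, and returns the number of bytes matched.
fn match_at(text: &str, at: usize, pat: &str) -> Option<usize> {
    // `get` returns None for out-of-range or non-boundary offsets.
    let rest = text.get(at..)?;
    if rest.starts_with(pat) {
        Some(pat.len())
    } else {
        None
    }
}

/// Stand-in for a forward "search": scans the byte range `from..to` and
/// returns the byte position where the first occurrence starts.
fn search_forward(text: &str, from: usize, to: usize, pat: &str) -> Option<usize> {
    let window = text.get(from..to)?;
    // `find` is relative to the window, so shift back by `from`.
    window.find(pat).map(|i| i + from)
}

fn main() {
    // Matching is anchored at the offset: "ll" is not at offset 0.
    assert_eq!(match_at("hello", 0, "ll"), None);
    // At offset 2 it matches, and the result is a length (2 bytes).
    assert_eq!(match_at("hello", 2, "ll"), Some(2));
    // Searching scans forward, and the result is a position (byte 2).
    assert_eq!(search_forward("hello", 0, 5, "ll"), Some(2));
}
```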
pub fn search_with_encoding<T>(
    &self,
    chars: T,
    from: usize,
    to: usize,
    options: SearchOptions,
    region: Option<&mut Region>,
) -> Option<usize>
where
    T: EncodedChars,
Search for a Pattern in a String with an Encoding
Search for matches of the regex in a string. This method will return the index of the first match of the regex within the string, if there is one. If from is less than to, then the search is performed in forward order; otherwise it is performed in backward order.
For more information see Match vs Search.
The encoding of the buffer passed to search in must match the encoding of the regex.
§Arguments
chars - The character buffer to search in.
from - The byte index in the passed slice at which to start the search.
to - The byte index in the passed slice at which to finish the search.
options - The options for the search.
region - The region for return group match range info.
§Returns
Some(pos) if the regex matches, where pos is the byte position of the start of the match. None if the regex doesn’t match anywhere in chars.
§Examples
use onig::{Regex, EncodedBytes, SearchOptions};
let r = Regex::with_encoding(EncodedBytes::ascii(b"l{1,2}")).unwrap();
let res = r.search_with_encoding(EncodedBytes::ascii(b"hello"),
                                 0, 5, SearchOptions::SEARCH_OPTION_NONE, None);
assert!(res.is_some()); // it matches
assert!(res.unwrap() == 2); // match starts at byte index 2 (the third character)
pub fn search_with_param<T>(
    &self,
    chars: T,
    from: usize,
    to: usize,
    options: SearchOptions,
    region: Option<&mut Region>,
    match_param: MatchParam,
) -> Result<Option<usize>, Error>
where
    T: EncodedChars,
Search pattern in string with encoding and match param
Search for matches of the regex in a string. This method will return the index of the first match of the regex within the string, if there is one. If from is less than to, then the search is performed in forward order; otherwise it is performed in backward order.
For more information see Match vs Search.
The encoding of the buffer passed to search in must match the encoding of the regex.
§Arguments
chars - The character buffer to search in.
from - The byte index in the passed slice at which to start the search.
to - The byte index in the passed slice at which to finish the search.
options - The options for the search.
region - The region for return group match range info.
match_param - The match parameters.
§Returns
Ok(Some(pos)) if the regex matches, where pos is the byte position of the start of the match. Ok(None) if the regex doesn’t match anywhere in chars. Err with an Error if an error occurred (e.g. retry-limit-in-match exceeded).
§Examples
use onig::{Regex, EncodedBytes, MatchParam, SearchOptions};
let r = Regex::with_encoding(EncodedBytes::ascii(b"l{1,2}")).unwrap();
let res = r.search_with_param(EncodedBytes::ascii(b"hello"),
                              0, 5, SearchOptions::SEARCH_OPTION_NONE,
                              None, MatchParam::default());
assert!(res.is_ok()); // matching did not error
assert!(res.unwrap() == Some(2)); // match starts at byte index 2 (the third character)
pub fn is_match(&self, text: &str) -> bool
Returns true if and only if the regex matches the string given.
For more information see Match vs Search
§Arguments
text - The string slice to test against the pattern.
§Returns
true if the pattern matches the whole of text, false otherwise.
pub fn find(&self, text: &str) -> Option<(usize, usize)>
Find a Match in a String
Finds the first match of the regular expression within the string.
Note that this should only be used if you want to discover the position of the match within a string. Testing if a pattern matches the whole string is faster if you use is_match. For more information see Match vs Search.
§Arguments
text - The text to search in.
§Returns
The offset of the start and end of the first match. If no match exists, None is returned.
pub fn find_with_encoding<T>(&self, text: T) -> Option<(usize, usize)>
where
    T: EncodedChars,
Find a Match in a Buffer, With Encoding
Finds the first match of the regular expression within the buffer.
For more information see Match vs Search
§Arguments
text - The text to search in.
§Returns
The offset of the start and end of the first match. If no match exists, None is returned.
pub fn encoding(&self) -> OnigEncoding
Get the Encoding of the Regex
§Returns
Returns a reference to an oniguruma encoding which was used when this regex was created.
pub fn captures_len(&self) -> usize
Get the Number of Capture Groups in this Pattern
pub fn capture_histories_len(&self) -> usize
Get the Size of the Capture Histories for this Pattern