A collection of commonly used regular expressions

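We have collected regular expressions that come up often in program development, so you can use them quickly, save time, and work more efficiently. The expressions below have been tested repeatedly, and the list keeps growing. Because regular expression syntax differs slightly between programs and tools, adapt them as needed.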

Description	Regular expression
URL	[a-zA-Z]+://[^\s]*
IP Address	((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)
Email address	\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
QQ number	[1-9]\d{4,}
HTML markup (containing content or self-closing)	<(.*)(.*)>.*<\/\1>|<(.*) \/>
Password (must contain digits, uppercase letters, lowercase letters, and punctuation marks, at least 8 characters)	(?=^.{8,}$)(?=.*\d)(?=.*\W+)(?=.*[A-Z])(?=.*[a-z])(?!.*\n).*$
Date (year-month-day)	(\d{4}|\d{2})-((1[0-2])|(0?[1-9]))-(([12][0-9])|(3[01])|(0?[1-9]))
Date (month/day/year)	((1[0-2])|(0?[1-9]))/(([12][0-9])|(3[01])|(0?[1-9]))/(\d{4}|\d{2})
Time (hour:minute, 24-hour format)	((1|0?)[0-9]|2[0-3]):([0-5][0-9])
Chinese characters	[\u4e00-\u9fa5]
Chinese and full-width punctuation marks (characters)	[\u3000-\u301e\ufe10-\ufe19\ufe30-\ufe44\ufe50-\ufe6b\uff01-\uffee]
Mainland China landline phone number	(\d{4}-|\d{3}-)?(\d{8}|\d{7})
Mainland China mobile phone number	1\d{10}
Mainland China postal code	[1-9]\d{5}
Mainland China ID number (15 or 18 digits)	\d{15}(\d\d[0-9xX])?
non-negative integer (positive integer or zero)	\d+
positive integer	[0-9]*[1-9][0-9]*
negative integer	-[0-9]*[1-9][0-9]*
integer	-?\d+
decimal	(-?\d+)(\.\d+)?
Words that do not contain abc	\b((?!abc)\w)+\b
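
As a rough illustration of how expressions from the table above might be used for validation, here is a minimal Python sketch. It assumes Python's `re` module and uses `fullmatch` so that the whole input has to match. The constant names and the `is_valid` helper are hypothetical, and the quantifiers are written out in full.

```python
import re

# Patterns adapted from the table above; the names are illustrative only.
IPV4 = re.compile(r"((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)")
TIME_24H = re.compile(r"((1|0?)[0-9]|2[0-3]):([0-5][0-9])")
PASSWORD = re.compile(
    r"(?=^.{8,}$)(?=.*\d)(?=.*\W+)(?=.*[A-Z])(?=.*[a-z])(?!.*\n).*$"
)


def is_valid(pattern, text):
    """Return True only if the entire text matches the pattern."""
    return pattern.fullmatch(text) is not None


print(is_valid(IPV4, "192.168.0.1"))   # every octet is in 0-255
print(is_valid(IPV4, "256.1.1.1"))     # 256 is out of range
print(is_valid(TIME_24H, "23:59"))
print(is_valid(TIME_24H, "24:00"))     # hour 24 is not allowed
print(is_valid(PASSWORD, "Abcdef1!"))  # digit, symbol, upper and lower case
print(is_valid(PASSWORD, "abcdefgh"))  # no digit, symbol, or upper case
```

Using `search` instead of `fullmatch` would accept any input that merely contains a match, which is usually what you want for extraction but not for form validation.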

Regular expressions are used for string processing, form validation, and similar tasks, where they are practical and efficient. Some commonly used expressions are collected here for quick reference.

Description	Regular expression
username	/^[a-z0-9_-]{3,16}$/
password	/^[a-z0-9_-]{6,18}$/
hexadecimal value	/^#?([a-f0-9]{6}|[a-f0-9]{3})$/
E-mail	/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/
URL	/^(https?:\/\/)?([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$/
IP address	/^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/
HTML tag	/^<([a-z]+)([^<]+)*(?:>(.*)<\/\1>|\s+\/>)$/
Chinese character range in Unicode encoding	/^[\u4e00-\u9fa5]{0,}$/
Regular expression to match Chinese characters	[\u4e00-\u9fa5]
Comment: Matching Chinese is really a headache. With this expression, it will be easier.
Match double-byte characters (including Chinese characters)	[^\x00-\xff]
Comment: Can be used to calculate the length of a string (the length of a double-byte character counts as 2, and the length of an ASCII character counts as 1)
Regular expression to match blank lines	\n\s*\r
Comment: Can be used to delete blank lines
Regular expression to match HTML tags	<(\S*?)[^>]*>.*?</\1>|<.*?/>
Comment: The versions circulating online are poor. The one above matches only some cases and still cannot handle complex nested tags.
Regular expression to match leading and trailing whitespace characters	^\s*|\s*$
Comment: It can be used to delete whitespace characters (including spaces, tabs, form feeds, etc.) at the beginning and end of the line. It is a very useful expression.
Regular expression to match email addresses	\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
Comment: Very useful for form validation
Regular expression to match URL	[a-zA-Z]+://[^\s]*
Comment: The version circulating on the Internet has very limited functions. The above one can basically meet the needs.
Match a valid account name (starts with a letter, 5-16 characters, letters, digits, and underscores allowed)	^[a-zA-Z][a-zA-Z0-9_]{4,15}$
Comment: Very useful for form validation
Match domestic phone numbers	\d{3}-\d{8}|\d{4}-\d{7}
Comment: Matching format such as 0511-4405222 or 021-87888822
Match Tencent QQ account	[1-9][0-9]{4,}
Comment: Tencent QQ account starts from 10000
Match mainland China postal code	[1-9]\d{5}(?!\d)
Comment: Postal codes in mainland China are 6 digits
Match ID card	\d{17}[\dXx]|\d{15}
Comment: Mainland China’s ID card has 15 or 18 digits
Match IP address	\d+\.\d+\.\d+\.\d+
Comment: Useful when extracting IP address
Match specific numbers:
^[1-9]\d*$	//Match positive integers
^-[1-9]\d*$	//Match negative integers
^(-?[1-9]\d*|0)$	//Match integers
^([1-9]\d*|0)$	//Match non-negative integers (positive integers + 0)
^(-[1-9]\d*|0)$	//Match non-positive integers (negative integers + 0)
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$	//Match positive floating point numbers
^-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$	//Match negative floating point numbers
^-?([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$	//Match floating point numbers
^([1-9]\d*\.\d*|0\.\d*[1-9]\d*|0?\.0+|0)$	//Match non-negative floating point numbers (positive floating point numbers + 0)
^(-([1-9]\d*\.\d*|0\.\d*[1-9]\d*)|0?\.0+|0)$	//Match non-positive floating point numbers (negative floating point numbers + 0)
Comment: Useful when processing large amounts of data, please pay attention to corrections when applying it.
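
One detail worth checking when adapting the numeric patterns above: `|` binds more loosely than the `^` and `$` anchors, so an alternation normally needs to sit inside a group between them. The following Python sketch, with hypothetical variable names, shows the difference for the non-negative integer and positive floating point cases.

```python
import re

# Without a group, "^" applies only to the first alternative and "$" only
# to the second, so partial matches slip through.
UNGROUPED = re.compile(r"^[1-9]\d*|0$")
# With a group, both anchors apply to every alternative.
GROUPED = re.compile(r"^([1-9]\d*|0)$")

for text in ["0", "42", "12abc", "abc0", "007"]:
    print(text, bool(UNGROUPED.search(text)), bool(GROUPED.search(text)))

# The same grouping applied to the positive floating point pattern.
POSITIVE_FLOAT = re.compile(r"^([1-9]\d*\.\d*|0\.\d*[1-9]\d*)$")

for text in ["3.14", "0.05", "0.0", "12"]:
    print(text, bool(POSITIVE_FLOAT.search(text)))
```

Here the ungrouped pattern accepts "12abc" and "abc0", while the grouped one accepts only "0" and "42". Note that the floating point pattern also accepts a trailing dot such as "3.", because `\d*` may match nothing.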
Match specific strings:
^[A-Za-z]+$	//Match a string consisting of 26 English letters
^[A-Z]+$	//Match a string consisting of 26 uppercase English letters
^[a-z]+$	//Match a string consisting of 26 lowercase English letters
^[A-Za-z0-9]+$	//Match a string consisting of numbers and 26 English letters
^\w+$	//Match a string consisting of numbers, 26 English letters, or underscores
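
Several expressions in this section are written as JavaScript-style literals between `/` delimiters. As a small sketch of one way to reuse them elsewhere, the Python snippet below strips the delimiters before compiling. The helper name `compile_js_literal` is hypothetical, and it assumes the literal carries no trailing flags such as `i` or `g`.

```python
import re


def compile_js_literal(literal: str) -> re.Pattern:
    """Compile a /.../ style literal (without flags) as a Python pattern."""
    if len(literal) < 2 or not (literal.startswith("/") and literal.endswith("/")):
        raise ValueError("expected a literal of the form /pattern/")
    # Python's re accepts "\/" as an escaped slash, so the body can be used as is.
    return re.compile(literal[1:-1])


USERNAME = compile_js_literal(r"/^[a-z0-9_-]{3,16}$/")
HEX_VALUE = compile_js_literal(r"/^#?([a-f0-9]{6}|[a-f0-9]{3})$/")
EMAIL = compile_js_literal(r"/^([a-z0-9_\.-]+)@([\da-z\.-]+)\.([a-z\.]{2,6})$/")

print(bool(USERNAME.search("my-us3r_n4m3")))      # 12 allowed characters
print(bool(USERNAME.search("ab")))                # shorter than 3 characters
print(bool(HEX_VALUE.search("#a3c113")))
print(bool(HEX_VALUE.search("#4d82h4")))          # "h" is not a hex digit
print(EMAIL.search("john@doe.com").groups())      # local part, domain, TLD
```

In Python, `$` also matches just before a trailing newline, so `fullmatch` or `\Z` may be a better fit where strict validation matters.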

Regular expression metacharacters: Regular expressions come in many different flavors. The following table lists the metacharacters in PCRE and their behavior in the context of regular expressions:

Character	Description
\	Marks the next character as a special character, or a literal character, or a backreference, or an octal escape character. For example, "n" matches the character "n". "\n" matches a newline character. The sequence "\\" matches "\" and "\(" matches "(".
^	Matches the beginning of the input string. If the Multiline property of the RegExp object is set, ^ also matches the position after "\n" or "\r".
$	Matches the end of the input string. If the Multiline property of the RegExp object is set, $ also matches the position before "\n" or "\r".
*	Matches the preceding subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}.
+	Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}.
?	Matches the preceding subexpression zero or one time. For example, "do(es)?" would match the "do" in "do" or "does". ? is equivalent to {0,1}.
{n}	n is a nonnegative integer. Matches exactly n times. For example, "o{2}" cannot match the "o" in "Bob", but it can match the two o's in "food".
{n,}	n is a nonnegative integer. Match at least n times. For example, "o{2,}" cannot match the "o" in "Bob", but it can match all o's in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*".
{n,m}	m and n are both nonnegative integers, where n<=m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?". Note that there must be no space between the comma and the two numbers.
?	When this character immediately follows any of the other quantifiers (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is non-greedy. Non-greedy mode matches as little of the searched string as possible, while the default greedy mode matches as much of the searched string as possible. For example, for the string "oooo", "o+?" will match a single "o", while "o+" will match all "o"s.
.	Matches any single character except "\n". To match any character including "\n", use a pattern like "[\s\S]".
(pattern)	Matches pattern and captures the match. The captured match can be retrieved from the resulting Matches collection, using the SubMatches collection in VBScript or the $0...$9 properties in JScript. To match literal parenthesis characters, use "\(" or "\)".
(?:pattern)	Matches pattern but does not capture the match, that is, it is a non-capturing match and is not stored for later use. This is useful when combining parts of a pattern using the or character "(|)". For example, "industr(?:y|ies)" is a simpler expression than "industry|industries".
(?=pattern)	Positive lookahead: matches the search string at any point where a string matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, "Windows(?=95|98|NT|2000)" can match "Windows" in "Windows2000", but cannot match "Windows" in "Windows3.1". Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately after the last match, rather than after the characters that made up the lookahead.
(?!pattern)	Negative lookahead: matches the search string at any point where a string not matching pattern begins. This is a non-capturing match, that is, the match is not stored for later use. For example, "Windows(?!95|98|NT|2000)" can match "Windows" in "Windows3.1", but cannot match "Windows" in "Windows2000". Lookaheads do not consume characters, that is, after a match occurs, the search for the next match begins immediately after the last match, rather than after the characters that made up the lookahead.
x|y	Matches x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food".
[xyz]	Character collection. Matches any one of the characters contained. For example, "[abc]" would match the "a" in "plain".
[^xyz]	A collection of negative characters. Matches any character not included. For example, "[^abc]" would match the "p" in "plain".
[a-z]	Character range. Matches any character within the specified range. For example, "[a-z]" matches any lowercase alphabetic character in the range "a" through "z".
[^a-z]	Negative character range. Matches any character not within the specified range. For example, "[^a-z]" matches any character that is not in the range "a" through "z".
\b	Matches a word boundary, which is the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".
\B	Match non-word boundaries. "er\B" can match the "er" in "verb", but not the "er" in "never".
\cx	Matches the control character specified by x. For example, \cM matches a Control-M or carriage return character. The value of x must be one of A-Z or a-z. Otherwise, treat c as a literal "c" character.
\d	Matches a numeric character. Equivalent to [0-9].
\D	Matches a non-numeric character. Equivalent to [^0-9].
\f	Matches a form feed. Equivalent to \x0c and \cL.
\n	Matches a newline character. Equivalent to \x0a and \cJ.
\r	Matches a carriage return character. Equivalent to \x0d and \cM.
\s	Matches any whitespace character, including spaces, tabs, form feeds, and so on. Equivalent to [ \f\n\r\t\v].
\S	Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t	Matches a tab character. Equivalent to \x09 and \cI.
\v	Matches a vertical tab character. Equivalent to \x0b and \cK.
\w	Matches any word character including an underscore. Equivalent to "[A-Za-z0-9_]".
\W	Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".
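\xn	Matches n, where n is a hexadecimal escape value. Hexadecimal escape values must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". ASCII codes can be used in regular expressions.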
\num	Matches num, where num is a positive integer. A reference to the match obtained. For example, "(.)\1" matches two consecutive identical characters.
\n	Identifies an octal escape value or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), then n is an octal escape value.
\nm	Identifies an octal escape value or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captured subexpressions, then n is a backreference followed by the literal m. If neither of the previous conditions is met, and n and m are both octal digits (0-7), then \nm matches the octal escape value nm.
\nml	If n is an octal number (0-3), and m and l are both octal digits (0-7), then the octal escape value nml is matched.
\un	Matches n, where n is a Unicode character represented by four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
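
Several of the behaviors described in the table can be checked directly. The short Python sketch below reuses the table's own examples for greedy and non-greedy quantifiers, lookahead, backreferences, word boundaries, and the dot. It assumes Python's `re` module, whose syntax agrees with the table for these constructs, and the variable names are illustrative.

```python
import re

# Greedy vs. non-greedy quantifiers on the string "oooo".
greedy = re.search(r"o+", "oooo").group()
lazy = re.search(r"o+?", "oooo").group()
print(greedy, lazy)

# Lookahead checks what follows without consuming it.
ahead = re.search(r"Windows(?=95|98|NT|2000)", "Windows2000")
print(ahead.group(), ahead.end())
print(re.search(r"Windows(?=95|98|NT|2000)", "Windows3.1"))
print(re.search(r"Windows(?!95|98|NT|2000)", "Windows3.1").group())

# Non-capturing group: findall returns whole matches, not group contents.
words = re.findall(r"industr(?:y|ies)", "industry and industries")
print(words)

# Backreference: \1 repeats whatever group 1 captured.
doubled = re.search(r"(.)\1", "hello").group()
print(doubled)

# Word boundary vs. non-boundary.
print(re.search(r"er\b", "never").start())
print(re.search(r"er\b", "verb"))
print(re.search(r"er\B", "verb").group())

# "." does not match a newline by default; [\s\S] does.
print(re.search(r"a.b", "a\nb"))
print(re.search(r"a[\s\S]b", "a\nb") is not None)
```

The lookahead match ends at offset 7, right after "Windows", which shows that the "2000" was inspected but not consumed.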
