Python modules: the re module - 白红宇

Published: 2019-06-21

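. matches any character except a newline

[] matches any one character from the listed set (a character class)

^ matches at the start of the string

$ matches at the end of the string

[^] matches any one character not in the listed set

* repeats the preceding item 0 or more times

+ repeats the preceding item 1 or more times

? repeats the preceding item 0 or 1 times

| matches either the expression on its left or the one on its right

{m,n} repeats the preceding item m to n times

{m} repeats the preceding item exactly m times

\d matches any decimal digit; for ASCII text, equivalent to [0-9]

\D matches any non-digit character; for ASCII text, equivalent to [^0-9]

\s matches any whitespace character; for ASCII text, equivalent to [ \t\n\r\f\v]

\S matches any non-whitespace character; for ASCII text, equivalent to [^ \t\n\r\f\v]

\w matches any letter, digit, or underscore; for ASCII text, equivalent to [a-zA-Z0-9_]

\W matches any character that is not a letter, digit, or underscore; for ASCII text, equivalent to [^a-zA-Z0-9_]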
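
As a rough illustration of the metacharacters listed above, here is a minimal sketch using re.findall and re.search. The sample strings and variable names are made up for this example and do not come from the original post.

```python
import re

# '.' matches any single character except a newline (by default).
dot_hits = re.findall(r'h.t', 'hat hot h\nt')

# '[...]' is a character class, '[^...]' is its negation.
vowels = re.findall(r'[aeiou]', 'regex')
non_vowels = re.findall(r'[^aeiou]', 'regex')

# '*' is zero or more, '+' is one or more, '?' is zero or one.
star = re.findall(r'ab*', 'a ab abb')
plus = re.findall(r'ab+', 'a ab abb')
optional = re.findall(r'colou?r', 'color colour')

# '{m,n}' bounds the repeat count; '|' is alternation.
two_to_three = re.findall(r'\d{2,3}', '1 22 333 4444')
cat_or_dog = re.findall(r'cat|dog', 'cat, dog, bird')

# '^' and '$' anchor the pattern to the start and end of the string.
anchored = bool(re.search(r'^\d+$', '2019'))

# '\w' also matches the underscore; '\s' matches whitespace such as a tab.
word_chars = re.findall(r'\w+', 'snake_case name')
whitespace = re.findall(r'\s', 'a\tb c')

print(dot_hits, star, plus, two_to_three, word_chars)
```

Note that `\d{2,3}` is greedy, so '4444' contributes '444' and the leftover '4' is too short to match on its own.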

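Example 1: defining a simple regular expression

Signature: re.compile(strPattern[, flag])

strPattern: the pattern, as a string

flag: optional matching flags, combined with | when more than one is needed

re.I (re.IGNORECASE): ignore case (the full name is given in parentheses, here and below)

re.M (re.MULTILINE): multiline mode; changes the behavior of '^' and '$' so they also match at the start and end of each line

re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.' so it also matches a newline

re.L (re.LOCALE): makes the predefined classes \w \W \b \B depend on the current locale (Python 2 also listed \s \S; in Python 3 this flag is allowed only with bytes patterns)

re.U (re.UNICODE): makes the predefined classes \w \W \b \B \s \S \d \D follow the Unicode character properties (already the default for str patterns in Python 3)

re.X (re.VERBOSE): verbose mode; the pattern may span several lines, whitespace in it is ignored, and comments may be added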

import re

pattern = re.compile(r'heLLow', re.I)  # build a Pattern object
match = pattern.match('hellow world')  # match the pattern against the text
if match:
    print(match.group())  # print the matched text if there is a match
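
The effect of the flags is easier to see side by side. The following is a small sketch, not taken from the original post; the sample text and the date pattern are assumptions chosen only to show each flag changing a result.

```python
import re

text = 'first line\nSecond line'

# re.I: case-insensitive matching.
ignore_case = re.match(r'FIRST', text, re.I) is not None

# re.M: '^' and '$' also match at the start and end of each line.
line_starts = re.findall(r'^\w+', text, re.M)
without_m = re.findall(r'^\w+', text)

# re.S: '.' also matches the newline character.
with_s = re.match(r'first.*line$', text, re.S) is not None
without_s = re.match(r'first.*line$', text) is not None

# re.X: whitespace in the pattern is ignored and '#' starts a comment.
date_pattern = re.compile(r'''
    (\d{4})  # year
    -
    (\d{2})  # month
    -
    (\d{2})  # day
''', re.X)
date_parts = date_pattern.match('2019-06-21').groups()

# Flags are combined with '|'.
combined = re.findall(r'^second', text, re.I | re.M)

print(line_starts, without_m, with_s, without_s, date_parts, combined)
```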

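# Example 2: attributes and methods of the Match object
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
import re

# Two words separated by a space, followed by a named group for the rest
match = re.match(r'(\w+) (\w+)(?P<sign>.*)', 'hellow world!')

print("match.string:", match.string)        # the text that was matched against
print("match.re:", match.re)                # the Pattern object that was used
print("match.pos:", match.pos)              # index where the search started
print("match.endpos:", match.endpos)        # index where the search stopped
print("match.lastindex:", match.lastindex)  # group number of the last captured group
print("match.lastgroup:", match.lastgroup)  # name of the last captured group
print("match.group(1,2):", match.group(1, 2))    # strings captured by one or more groups
print("match.groups():", match.groups())         # all captured strings, as a tuple
print("match.groupdict():", match.groupdict())   # dict of named groups, keyed by group name
print("match.start(2):", match.start(2))    # start index of the substring captured by group 2
print("match.end(2):", match.end(2))        # end index of the substring captured by group 2
print("match.span(2):", match.span(2))      # (start(2), end(2))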
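
To make the values concrete, here is a hedged sketch that stores the same attributes in variables. It assumes the intended pattern has a single space between the first two groups, so that the two words are captured separately; the variable names are illustrative only.

```python
import re

# Assumption: the two words are separated by one space in the pattern.
m = re.match(r'(\w+) (\w+)(?P<sign>.*)', 'hellow world!')

whole = m.group()            # the entire match
first_two = m.group(1, 2)    # several groups at once, as a tuple
all_groups = m.groups()
named = m.groupdict()        # only named groups appear here
sign = m.group('sign')       # a named group can be read by name

# lastindex is a group number, not a position in the text.
last_index = m.lastindex
last_name = m.lastgroup

# Positions are reported per group.
span_2 = m.span(2)
span_sign = m.span('sign')

# pos and endpos delimit the part of the string that was searched.
searched = (m.pos, m.endpos)

print(first_two, named, last_index, last_name, span_2, searched)
```

Here `span_2` is (7, 12) because 'world' occupies indices 7 through 11 of the text, and `last_index` is 3 because the named group is the third group in the pattern.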

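# Example 3: functions of the re module
re.compile   # compiles a pattern string into a Pattern object
re.match     # matches a regular expression at the start of the string
re.search    # scans the string for the first match
re.split     # splits the string on matches
re.findall   # searches the string and returns all matching substrings as a list
re.finditer  # searches the string and returns an iterator over the Match object of each match
re.sub       # replaces matches in the string
re.subn      # like re.sub, but returns (new string, number of replacements)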
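
A short sketch of the functions that the later examples do not exercise, namely the difference between re.match and re.search, plus re.split, re.sub with a count, and re.subn. The sample text is an assumption made for this illustration.

```python
import re

text = 'one1two2three3'

# match only succeeds at the start of the string; search scans forward.
at_start = re.match(r'\d', text)
anywhere = re.search(r'\d', text)

# split cuts the string at every match; a match at the very end
# leaves a trailing empty string.
pieces = re.split(r'\d', text)

# sub replaces every match unless count limits it.
replaced = re.sub(r'\d', '-', text)
first_only = re.sub(r'\d', '-', text, count=1)

# subn returns the new string together with the number of replacements.
replaced_n = re.subn(r'\d', '-', text)

print(at_start, anywhere.group(), pieces, replaced, first_only, replaced_n)
```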

# Example 4: searching a string
a = re.compile(r'hello')
b = a.search('hello world')  # search the text for 'hello'
if b:
    print(b.group())

# Example 5: finding all matches
p = re.compile(r'\d')
print(p.findall('one1two2three3four4five5'))  # ['1', '2', '3', '4', '5']

# Example 6: iterating over every match
w = re.compile(r'\d')
for m in w.finditer('one1two2three3four4five5'):
    print(m.group())
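
The two approaches differ in what they hand back: findall returns strings (or tuples when the pattern has several groups), while finditer yields Match objects, so positions remain available. A minimal sketch, with a grouped pattern added purely for illustration:

```python
import re

text = 'one1two2three3four4five5'

# Without groups, findall returns the matched substrings.
digits = re.findall(r'\d', text)

# With more than one group, findall returns a tuple per match.
pairs = re.findall(r'([a-z]+)(\d)', text)

# finditer yields Match objects lazily, so start positions can be read too.
positions = [(m.group(), m.start()) for m in re.finditer(r'\d', text)]

print(digits)
print(pairs)
print(positions)
```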

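# Example 7: substitution with a backreference and with a function
e = re.compile(r'(\w+) (\w+)')  # two words separated by a space
s = 'This is, tong cheng'
print(e.sub(r'\2 \1', s))  # swap the two words of each pair

def func(m):
    # capitalize both words of the pair
    return m.group(1).title() + ' ' + m.group(2).title()

print(e.sub(func, s))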
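
For reference, here is a sketch of what such substitutions produce. It assumes the pattern is meant to capture two words separated by one space, and it adds a named-group variant and a count limit that are not in the original post; `title_pair` and the other names are hypothetical.

```python
import re

s = 'This is, tong cheng'

# Assumption: two words separated by a single space.
pair = re.compile(r'(\w+) (\w+)')

# Numbered backreferences swap the two words of each pair.
swapped = pair.sub(r'\2 \1', s)

# The same swap with named groups, referenced as \g<name>.
named_pair = re.compile(r'(?P<first>\w+) (?P<second>\w+)')
swapped_named = named_pair.sub(r'\g<second> \g<first>', s)

# A function receives each Match object and returns the replacement text.
def title_pair(m):
    return m.group(1).title() + ' ' + m.group(2).title()

titled = pair.sub(title_pair, s)

# count limits how many matches are replaced.
only_first = pair.sub(r'\2 \1', s, count=1)

print(swapped)
print(titled)
print(only_first)
```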

Reposted from: http://lsuna.baihongyu.com/
