PyXURLs

A regular-expression-based extractor that finds URLs in plain text.

Thanks to Daniel Martí for creating the mvdan/xurls project. This Python project is built on the same concept as the Go version.

Installing

# if installing re2 is troublesome, regex can be used as the engine instead
pip install google-re2 pyxurls
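
Since the engine package is interchangeable, it can be handy to check which candidate is importable in the current environment. The snippet below is a minimal sketch using only the standard library. The helper name `available_engines` is hypothetical and not part of pyxurls, and the import names `re2` and `regex` are assumed to be the ones exposed by the `google-re2` and `regex` packages. It does not show how pyxurls itself picks its engine.

```python
import importlib.util


def available_engines(candidates=("re2", "regex")):
    """Return the candidate module names that can be imported here."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


if __name__ == "__main__":
    # Prints a list such as ['re2', 'regex'], depending on what is installed.
    print(available_engines())
```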

Usage

Extract URLs by strict strategy

import xurls

extractor = xurls.Strict()

url = extractor.findfirst('we have the link with scheme https://www.python.org and https://www.github.com')
#  https://www.python.org

urls = extractor.findall('we have the link with scheme https://www.python.org and https://github.com')
#  ['https://www.python.org', 'https://github.com']
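
To make the idea of strict matching concrete, here is a simplified stand-in built on the standard `re` module. It is not the pattern pyxurls uses. It only assumes that "strict" means a match must begin with a scheme followed by `://`. The names `STRICT_SKETCH`, `find_first` and `find_all` are hypothetical.

```python
import re

# Simplified assumption: a URL is a scheme, "://", then any run of non-whitespace.
STRICT_SKETCH = re.compile(r"[a-zA-Z][a-zA-Z0-9+.-]*://\S+")


def find_first(text):
    """Return the first scheme-prefixed URL in text, or None if there is none."""
    match = STRICT_SKETCH.search(text)
    return match.group(0) if match else None


def find_all(text):
    """Return every scheme-prefixed URL in text, in order of appearance."""
    return STRICT_SKETCH.findall(text)


if __name__ == "__main__":
    sample = "we have the link with scheme https://www.python.org and https://github.com"
    print(find_first(sample))
    print(find_all(sample))
```

A host without a scheme, such as `www.python.org`, is ignored by this sketch, which is the behaviour the strict strategy is meant to illustrate.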

Extract URLs by relaxed strategy

import xurls

extractor = xurls.Relaxed()

url = extractor.findfirst('we have the link with scheme www.python.org and https://www.github.com')
#  www.python.org

urls = extractor.findall('we have the link with scheme www.python.org and https://github.com')
#  ['www.python.org', 'https://github.com']
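
The relaxed strategy also picks up hosts that carry no scheme. The sketch below illustrates one way such matching could work with the standard `re` module. It is an assumption-laden simplification, not the pyxurls implementation: the tiny `TLDS` tuple, the pattern `RELAXED_SKETCH` and the helper `find_all_relaxed` are hypothetical, and a real extractor would need a far larger list of top-level domains.

```python
import re

# Hypothetical, deliberately tiny TLD list used only for this illustration.
TLDS = ("com", "org", "net")
tld_group = "|".join(TLDS)

RELAXED_SKETCH = re.compile(
    # Alternative 1: anything that starts with a scheme and "://".
    r"[a-zA-Z][a-zA-Z0-9+.-]*://\S+"
    # Alternative 2: a bare host ending in a listed TLD, with an optional path.
    r"|\b(?:[a-zA-Z0-9-]+\.)+(?:" + tld_group + r")\b(?:/\S*)?"
)


def find_all_relaxed(text):
    """Return scheme-prefixed URLs and bare hosts with a listed TLD."""
    return RELAXED_SKETCH.findall(text)


if __name__ == "__main__":
    sample = "we have the link with scheme www.python.org and https://github.com"
    print(find_all_relaxed(sample))
```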

Extract URLs limited to a scheme

import xurls

# limit to https
extractor = xurls.StrictScheme('https://')

url = extractor.findfirst('we have the link with scheme custom://domain.com and https://www.python.org noscheme.com')
#  https://www.python.org

# accept any scheme, not only the standard ones
extractor = xurls.StrictScheme(xurls.express.ANY_SCHEME)
urls = extractor.findall('we have the link with scheme custom://domain.com and https://www.python.org noscheme.com')
#  ['custom://domain.com', 'https://www.python.org']
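
Restricting matches to a scheme can be pictured as prefixing a generic URL body with a scheme pattern. The following standard-library sketch shows that idea under stated assumptions: `strict_scheme_sketch` and `ANY_SCHEME_SKETCH` are hypothetical names, the helper takes a regular-expression fragment (so a literal scheme is passed through `re.escape`), and none of this is claimed to mirror how `xurls.StrictScheme` or `xurls.express.ANY_SCHEME` are implemented.

```python
import re

# Assumed shape of "any scheme": a letter, then letters, digits, "+", "." or "-", then "://".
ANY_SCHEME_SKETCH = r"[a-zA-Z][a-zA-Z0-9+.-]*://"


def strict_scheme_sketch(scheme_pattern):
    """Compile a pattern matching the given scheme fragment plus non-whitespace."""
    return re.compile(r"(?:" + scheme_pattern + r")\S+")


if __name__ == "__main__":
    sample = (
        "we have the link with scheme custom://domain.com "
        "and https://www.python.org noscheme.com"
    )
    https_only = strict_scheme_sketch(re.escape("https://"))
    any_scheme = strict_scheme_sketch(ANY_SCHEME_SKETCH)
    print(https_only.findall(sample))
    print(any_scheme.findall(sample))
```

In both cases `noscheme.com` is skipped because it has no scheme at all.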

andytzeng/pyxurls

Folders and files

Latest commit

History

Repository files navigation

PyXURLs

Installing

Usage

Extract URLs by strict strategy

Extract URLs by relaxed strategy

Extract URLs by limit scheme

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Languages

Packages