Skip to content
GitLab
Projects Groups Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in / Register
  • A awesome-python
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributors
    • Graph
    • Compare
  • Issues 13
    • Issues 13
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 317
    • Merge requests 317
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Packages and registries
    • Packages and registries
    • Package Registry
    • Infrastructure Registry
  • Monitor
    • Monitor
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • Vinta Chen
  • awesome-python
  • Merge requests
  • !527

add a link for pygrok

  • Review changes

  • Download
  • Email patches
  • Plain diff
Closed Administrator requested to merge github/fork/garyelephant/master into master Dec 17, 2015
  • Overview 0
  • Commits 1
  • Pipelines 0
  • Changes 1

Created by: garyelephant

Pygrok is a python library to parse strings and extract information from structured/unstructured data.It's a implementation of jordansissel's grok regular expression library.

With pygrok, you can easily:

  • parsing and matching patterns in a string(log, message etc.)
  • relieving from complex regular expressions.
  • extracting information from structured/unstructured data

Getting Started

>>> import pygrok
>>> text = 'gary is male, 25 years old and weighs 68.5 kilograms'
>>> pattern = '%{WORD:name} is %{WORD:gender}, %{NUMBER:age} years old and weighs %{NUMBER:weight} kilograms'
>>> print pygrok.grok_match(text, pattern)
{'gender': 'male', 'age': '25', 'name': 'gary', 'weight': '68.5'}

Pretty Cool ! Some of the pattern you can use are listed here:

`WORD` means \b\w+\b in regular expression.
`NUMBER` means (?:%{BASE10NUM})
`BASE10NUM` means (?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+)))

other patterns such as `IP`, `HOSTNAME`, `URIPATH`, `DATE`, `TIMESTAMP_ISO8601`, `COMMONAPACHELOG`..

See All patterns here.

Assignee
Assign to
Reviewers
Request review from
Time tracking
Source branch: github/fork/garyelephant/master