
Nested repetitions parsed potentially incorrectly #3

Closed
@calfeld-zz


Summary: /o{2}{5}/ matches 10 o's in Ruby, but the {2} quantifier is lost in the parse tree: only the {5} interval is recorded on the literal.

Example code:

#!/usr/bin/env ruby

require 'rubygems'
require 'regexp_parser'
require 'pp'

re = /o{2}{5}/
pp Regexp::Parser.parse(re)

puts "o" if "o" =~ re
puts "2 o" if "o"*2 =~ re
puts "5 o" if "o"*5 =~ re
puts "10 o" if "o"*10 =~ re

Output:

#<Regexp::Expression::Root:0x105fd9d40
 @expressions=
  [#<Regexp::Expression::Literal:0x105fd6398
    @expressions=[],
    @options=nil,
    @quantifier=
     #<Regexp::Expression::Quantifier:0x105fd5e20
      @max=5,
      @min=5,
      @mode=:greedy,
      @text="{5}",
      @token=:interval>,
    @text="o",
    @token=:literal,
    @type=:literal>],
 @options=nil,
 @text="",
 @token=:root,
 @type=:expression>
10 o

Comments:

As far as I can tell, the nested quantifier syntax isn't documented in Ruby and is illegal in PCRE. grep, for example, will not match any number of o's for the given regexp. As such, I'd be content with a will-not-fix verdict, but I thought you might be interested.
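
For reference, here is a minimal sketch of the semantics Ruby appears to apply, assuming the second interval repeats the result of the first, i.e. that /o{2}{5}/ behaves like /(?:o{2}){5}/. It uses only core Ruby, not regexp_parser; `accepted_lengths` is a hypothetical helper written for this illustration.

```ruby
# Hypothetical helper (not part of regexp_parser): returns the run lengths
# of "o" between 1 and max that the pattern accepts when fully anchored.
def accepted_lengths(pattern, max = 12)
  anchored = /\A(?:#{pattern.source})\z/
  (1..max).select { |n| anchored.match?("o" * n) }
end

nested   = /o{2}{5}/      # the form from this report
explicit = /(?:o{2}){5}/  # assumed equivalent, written with an explicit group

p accepted_lengths(nested)
p accepted_lengths(explicit)
```

If both calls print the same list, a parse tree that keeps only {5} would describe a different pattern from the one Ruby actually matches.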

Thank you for your time.
