unofficial mirror of emacs-devel@gnu.org 
* Documentation on debugging regexp performance
From: Clément Pit--Claudel @ 2016-01-21  5:29 UTC
  To: Emacs developers



Hi emacs-devel,

I'm running into a surprising regular expression performance issue. I have attached a file (~50 kB) in which (re-search-forward "   +[^:=]+ +:=?") is extremely slow: I killed it after 30 seconds. Truncating the file to its first 20 lines brings the search time down to about a second, which is still extremely slow.

Are there good resources on how to rewrite regexps to make them Emacs-friendly? I couldn't find any such documentation, and I'm puzzled as to what makes the regexp above so expensive for re-search-forward.

Cheers,
Clément.
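
One way to investigate a slowdown like the one described above is to time the search on small synthetic inputs and watch how the cost grows with input size. The following is a minimal sketch, not a definitive diagnosis: the helpers `my/time-regexp-search` and `my/padded-lines` are hypothetical names introduced here, and the synthetic text (lines of leading spaces followed by "(String") only loosely imitates the attachment. Note that `[^:=]+` also matches spaces and newlines, so it overlaps with the `   +` and ` +` on either side of it; that overlap is one plausible source of backtracking, though the sketch only measures time and does not prove the cause.

```elisp
;; Sketch only: `my/time-regexp-search' and `my/padded-lines' are
;; hypothetical helpers, not part of Emacs.
(require 'benchmark)

(defun my/time-regexp-search (regexp text)
  "Search for REGEXP once in a temporary buffer holding TEXT.
Return (FOUND . SECONDS), where FOUND is the buffer position just
after the match (or nil if there is none) and SECONDS is the elapsed
time reported by `benchmark-run'."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let* ((found nil)
           (timing (benchmark-run 1
                     (setq found (re-search-forward regexp nil t)))))
      (cons found (car timing)))))

(defun my/padded-lines (n)
  "Return N lines, each 20 spaces followed by \"(String\", joined by newlines."
  (mapconcat #'identity
             (make-list n (concat (make-string 20 ?\s) "(String"))
             "\n"))

;; Time the regexp from the message on inputs of growing size.
;; The inputs are kept tiny on purpose so that this runs quickly.
(dolist (n '(1 2 4))
  (message "%d lines: %S" n
           (my/time-regexp-search "   +[^:=]+ +:=?" (my/padded-lines n))))
```

If the reported time grows much faster than the number of lines, the pattern is probably backtracking heavily on that kind of input. The same helper can then be used to compare candidate rewrites of the regexp on identical text.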

[-- Attachment #1.2: large-goal --]
[-- Type: text/plain, Size: 48957 bytes --]

Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (.. -> ..) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig .. => Fin.t n0 -> methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons _ n0 t0 => fun p1 : Fin.t n0 => nth_fix n0 t0 p1
                                       end p'0
                                   end v') n t p0
                            end p'
                        end
                          (VectorDef.cons methSig
                             (Build_methSig
                                (String (Ascii false false true false true true true false)
                                   (String (Ascii true true true true false true true false)
                                      (String (Ascii true true true true true false true false)
                                         (String (Ascii true true false false true true true false)
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false) EmptyString)))))))))
                                (@nil Type)
                                (@Some Type
                                   match HSLM return Type with
                                   | Build_StringLikeMin String0 _ _ => String0
                                   end)) (S (S (S (S (S (S O))))))
                             (VectorDef.cons methSig
                                (Build_methSig
                                   (String (Ascii true true false false false true true false)
                                      (String (Ascii false false false true false true true false)
                                         (String (Ascii true false false false false true true false)
                                            (String 
                                               (Ascii false true false false true true true false)
                                               (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false true true false true true false)
                                                  (String (..) (..)))))))))))
                                   (@cons Type nat (@cons Type (ascii -> bool) (@nil Type))) 
                                   (@Some Type bool)) 
                                (S (S (S (S (S O)))))
                                (VectorDef.cons methSig
                                   (Build_methSig
                                      (String (Ascii true true true false false true true false)
                                         (String (Ascii true false true false false true true false)
                                            (String (Ascii false false true false true true true false) EmptyString)))
                                      (@cons Type nat (@nil Type)) 
                                      (@Some Type ascii)) 
                                   (S (S (S (S O))))
                                   (VectorDef.cons methSig
                                      (Build_methSig
                                         (String (Ascii false false true true false true true false)
                                            (String 
                                               (Ascii true false true false false true true false)
                                               (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii false false false true false true true false) EmptyString))))))
                                         (@nil Type) 
                                         (@Some Type nat)) 
                                      (S (S (S O)))
                                      (VectorDef.cons methSig
                                         (Build_methSig
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii true true false true false true true false)
                                                  (String 
                                                  (Ascii true false true false false true true false) EmptyString))))
                                            (@cons Type nat (@nil Type)) 
                                            (@None Type)) 
                                         (S (S O))
                                         (VectorDef.cons methSig
                                            (Build_methSig
                                               (String 
                                                  (Ascii false false true false false true true false)
                                                  (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true true true true false true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false) EmptyString))))
                                               (@cons Type nat (@nil Type)) 
                                               (@None Type)) 
                                            (S O)
                                            (VectorDef.cons methSig
                                               (Build_methSig
                                                  (String 
                                                  (Ascii true true false false true true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false)
                                                  (String 
                                                  (Ascii false false true true false true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String (..) (..))))))
                                                  (@cons 
                                                  Type 
                                                  (prod nat (prod nat nat))
                                                  (@cons Type nat (@cons Type nat (@nil Type))))
                                                  (@Some Type (list nat))) O 
                                               (VectorDef.nil methSig)))))))) 
                        return (list Type)
                      with
                      | Build_methSig _ methDom _ => methDom
                      end
                      match
                        match idx in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                        | Fin.F1 q =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons h n _ => h
                            end
                        | Fin.FS q p' =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => Fin.t n -> methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons _ n t =>
                                fun p0 : Fin.t n =>
                                (fix nth_fix (m : nat) (v' : Vector.t methSig m) (p : Fin.t m) {struct v'} :
                                   methSig :=
                                   match p in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                                   | Fin.F1 q0 =>
                                       fun v0 : Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (Vector.t methSig m1 -> Type) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig (..) => methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons h n0 _ => h
                                       end
                                   | Fin.FS q0 p'0 =>
                                       fun v0 : Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (.. -> ..) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig .. => Fin.t n0 -> methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons _ n0 t0 => fun p1 : Fin.t n0 => nth_fix n0 t0 p1
                                       end p'0
                                   end v') n t p0
                            end p'
                        end
                          (VectorDef.cons methSig
                             (Build_methSig
                                (String (Ascii false false true false true true true false)
                                   (String (Ascii true true true true false true true false)
                                      (String (Ascii true true true true true false true false)
                                         (String (Ascii true true false false true true true false)
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false) EmptyString)))))))))
                                (@nil Type)
                                (@Some Type
                                   match HSLM return Type with
                                   | Build_StringLikeMin String0 _ _ => String0
                                   end)) (S (S (S (S (S (S O))))))
                             (VectorDef.cons methSig
                                (Build_methSig
                                   (String (Ascii true true false false false true true false)
                                      (String (Ascii false false false true false true true false)
                                         (String (Ascii true false false false false true true false)
                                            (String 
                                               (Ascii false true false false true true true false)
                                               (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false true true false true true false)
                                                  (String (..) (..)))))))))))
                                   (@cons Type nat (@cons Type (ascii -> bool) (@nil Type))) 
                                   (@Some Type bool)) 
                                (S (S (S (S (S O)))))
                                (VectorDef.cons methSig
                                   (Build_methSig
                                      (String (Ascii true true true false false true true false)
                                         (String (Ascii true false true false false true true false)
                                            (String (Ascii false false true false true true true false) EmptyString)))
                                      (@cons Type nat (@nil Type)) 
                                      (@Some Type ascii)) 
                                   (S (S (S (S O))))
                                   (VectorDef.cons methSig
                                      (Build_methSig
                                         (String (Ascii false false true true false true true false)
                                            (String 
                                               (Ascii true false true false false true true false)
                                               (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii false false false true false true true false) EmptyString))))))
                                         (@nil Type) 
                                         (@Some Type nat)) 
                                      (S (S (S O)))
                                      (VectorDef.cons methSig
                                         (Build_methSig
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii true true false true false true true false)
                                                  (String 
                                                  (Ascii true false true false false true true false) EmptyString))))
                                            (@cons Type nat (@nil Type)) 
                                            (@None Type)) 
                                         (S (S O))
                                         (VectorDef.cons methSig
                                            (Build_methSig
                                               (String 
                                                  (Ascii false false true false false true true false)
                                                  (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true true true true false true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false) EmptyString))))
                                               (@cons Type nat (@nil Type)) 
                                               (@None Type)) 
                                            (S O)
                                            (VectorDef.cons methSig
                                               (Build_methSig
                                                  (String 
                                                  (Ascii true true false false true true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false)
                                                  (String 
                                                  (Ascii false false true true false true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String (..) (..))))))
                                                  (@cons 
                                                  Type 
                                                  (prod nat (prod nat nat))
                                                  (@cons Type nat (@cons Type nat (@nil Type))))
                                                  (@Some Type (list nat))) O 
                                               (VectorDef.nil methSig)))))))) 
                        return (option Type)
                      with
                      | Build_methSig _ _ methCod => methCod
                      end)
                     (@Fin.FS (S (S (S (S (S (S O))))))
                        (@Fin.FS (S (S (S (S (S O)))))
                           (@Fin.FS (S (S (S (S O))))
                              (@Fin.FS (S (S (S O))) (@Fin.FS (S (S O)) (@Fin.FS (S O) (@Fin.F1 O)))))))))
               (@snd (list Type) (option Type)
                  ((fun idx : Fin.t (S (S (S (S (S (S (S O))))))) =>
                    @pair (list Type) (option Type)
                      match
                        match idx in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                        | Fin.F1 q =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons h n _ => h
                            end
                        | Fin.FS q p' =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => Fin.t n -> methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons _ n t =>
                                fun p0 : Fin.t n =>
                                (fix nth_fix (m : nat) (v' : Vector.t methSig m) (p : Fin.t m) {struct v'} :
                                   methSig :=
                                   match p in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                                   | Fin.F1 q0 =>
                                       fun v0 : Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (Vector.t methSig m1 -> Type) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig (..) => methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons h n0 _ => h
                                       end
                                   | Fin.FS q0 p'0 =>
                                       fun v0 : Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (.. -> ..) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig .. => Fin.t n0 -> methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons _ n0 t0 => fun p1 : Fin.t n0 => nth_fix n0 t0 p1
                                       end p'0
                                   end v') n t p0
                            end p'
                        end
                          (VectorDef.cons methSig
                             (Build_methSig
                                (String (Ascii false false true false true true true false)
                                   (String (Ascii true true true true false true true false)
                                      (String (Ascii true true true true true false true false)
                                         (String (Ascii true true false false true true true false)
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false) EmptyString)))))))))
                                (@nil Type)
                                (@Some Type
                                   match HSLM return Type with
                                   | Build_StringLikeMin String0 _ _ => String0
                                   end)) (S (S (S (S (S (S O))))))
                             (VectorDef.cons methSig
                                (Build_methSig
                                   (String (Ascii true true false false false true true false)
                                      (String (Ascii false false false true false true true false)
                                         (String (Ascii true false false false false true true false)
                                            (String 
                                               (Ascii false true false false true true true false)
                                               (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false true true false true true false)
                                                  (String (..) (..)))))))))))
                                   (@cons Type nat (@cons Type (ascii -> bool) (@nil Type))) 
                                   (@Some Type bool)) 
                                (S (S (S (S (S O)))))
                                (VectorDef.cons methSig
                                   (Build_methSig
                                      (String (Ascii true true true false false true true false)
                                         (String (Ascii true false true false false true true false)
                                            (String (Ascii false false true false true true true false) EmptyString)))
                                      (@cons Type nat (@nil Type)) 
                                      (@Some Type ascii)) 
                                   (S (S (S (S O))))
                                   (VectorDef.cons methSig
                                      (Build_methSig
                                         (String (Ascii false false true true false true true false)
                                            (String 
                                               (Ascii true false true false false true true false)
                                               (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii false false false true false true true false) EmptyString))))))
                                         (@nil Type) 
                                         (@Some Type nat)) 
                                      (S (S (S O)))
                                      (VectorDef.cons methSig
                                         (Build_methSig
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii true true false true false true true false)
                                                  (String 
                                                  (Ascii true false true false false true true false) EmptyString))))
                                            (@cons Type nat (@nil Type)) 
                                            (@None Type)) 
                                         (S (S O))
                                         (VectorDef.cons methSig
                                            (Build_methSig
                                               (String 
                                                  (Ascii false false true false false true true false)
                                                  (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true true true true false true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false) EmptyString))))
                                               (@cons Type nat (@nil Type)) 
                                               (@None Type)) 
                                            (S O)
                                            (VectorDef.cons methSig
                                               (Build_methSig
                                                  (String 
                                                  (Ascii true true false false true true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false)
                                                  (String 
                                                  (Ascii false false true true false true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String (..) (..))))))
                                                  (@cons 
                                                  Type 
                                                  (prod nat (prod nat nat))
                                                  (@cons Type nat (@cons Type nat (@nil Type))))
                                                  (@Some Type (list nat))) O 
                                               (VectorDef.nil methSig)))))))) 
                        return (list Type)
                      with
                      | Build_methSig _ methDom _ => methDom
                      end
                      match
                        match idx in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                        | Fin.F1 q =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons h n _ => h
                            end
                        | Fin.FS q p' =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => Fin.t n -> methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons _ n t =>
                                fun p0 : Fin.t n =>
                                (fix nth_fix (m : nat) (v' : Vector.t methSig m) (p : Fin.t m) {struct v'} :
                                   methSig :=
                                   match p in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                                   | Fin.F1 q0 =>
                                       fun v0 : Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (Vector.t methSig m1 -> Type) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig (..) => methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons h n0 _ => h
                                       end
                                   | Fin.FS q0 p'0 =>
                                       fun v0 : Vector.t methSig (S q0) =>
                                       match
                                         v0 as v'0 in (Vector.t _ m0)
                                         return
                                           (match m0 as m1 return (.. -> ..) with
                                            | O => fun _ : Vector.t methSig O => False -> True
                                            | S n0 => fun _ : Vector.t methSig .. => Fin.t n0 -> methSig
                                            end v'0)
                                       with
                                       | Vector.nil => fun devil : False => match devil return True with
                                                  end
                                       | Vector.cons _ n0 t0 => fun p1 : Fin.t n0 => nth_fix n0 t0 p1
                                       end p'0
                                   end v') n t p0
                            end p'
                        end
                          (VectorDef.cons methSig
                             (Build_methSig
                                (String (Ascii false false true false true true true false)
                                   (String (Ascii true true true true false true true false)
                                      (String (Ascii true true true true true false true false)
                                         (String (Ascii true true false false true true true false)
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false) EmptyString)))))))))
                                (@nil Type)
                                (@Some Type
                                   match HSLM return Type with
                                   | Build_StringLikeMin String0 _ _ => String0
                                   end)) (S (S (S (S (S (S O))))))
                             (VectorDef.cons methSig
                                (Build_methSig
                                   (String (Ascii true true false false false true true false)
                                      (String (Ascii false false false true false true true false)
                                         (String (Ascii true false false false false true true false)
                                            (String 
                                               (Ascii false true false false true true true false)
                                               (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii true true true true true false true false)
                                                  (String 
                                                  (Ascii true false true true false true true false)
                                                  (String (..) (..)))))))))))
                                   (@cons Type nat (@cons Type (ascii -> bool) (@nil Type))) 
                                   (@Some Type bool)) 
                                (S (S (S (S (S O)))))
                                (VectorDef.cons methSig
                                   (Build_methSig
                                      (String (Ascii true true true false false true true false)
                                         (String (Ascii true false true false false true true false)
                                            (String (Ascii false false true false true true true false) EmptyString)))
                                      (@cons Type nat (@nil Type)) 
                                      (@Some Type ascii)) 
                                   (S (S (S (S O))))
                                   (VectorDef.cons methSig
                                      (Build_methSig
                                         (String (Ascii false false true true false true true false)
                                            (String 
                                               (Ascii true false true false false true true false)
                                               (String 
                                                  (Ascii false true true true false true true false)
                                                  (String 
                                                  (Ascii true true true false false true true false)
                                                  (String 
                                                  (Ascii false false true false true true true false)
                                                  (String 
                                                  (Ascii false false false true false true true false) EmptyString))))))
                                         (@nil Type) 
                                         (@Some Type nat)) 
                                      (S (S (S O)))
                                      (VectorDef.cons methSig
                                         (Build_methSig
                                            (String 
                                               (Ascii false false true false true true true false)
                                               (String 
                                                  (Ascii true false false false false true true false)
                                                  (String 
                                                  (Ascii true true false true false true true false)
                                                  (String 
                                                  (Ascii true false true false false true true false) EmptyString))))
                                            (@cons Type nat (@nil Type)) 
                                            (@None Type)) 
                                         (S (S O))
                                         (VectorDef.cons methSig
                                            (Build_methSig
                                               (String 
                                                  (Ascii false false true false false true true false)
                                                  (String 
                                                  (Ascii false true false false true true true false)
                                                  (String 
                                                  (Ascii true true true true false true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false) EmptyString))))
                                               (@cons Type nat (@nil Type)) 
                                               (@None Type)) 
                                            (S O)
                                            (VectorDef.cons methSig
                                               (Build_methSig
                                                  (String 
                                                  (Ascii true true false false true true true false)
                                                  (String 
                                                  (Ascii false false false false true true true false)
                                                  (String 
                                                  (Ascii false false true true false true true false)
                                                  (String 
                                                  (Ascii true false false true false true true false)
                                                  (String (..) (..))))))
                                                  (@cons 
                                                  Type 
                                                  (prod nat (prod nat nat))
                                                  (@cons Type nat (@cons Type nat (@nil Type))))
                                                  (@Some Type (list nat))) O 
                                               (VectorDef.nil methSig)))))))) 
                        return (option Type)
                      with
                      | Build_methSig _ _ methCod => methCod
                      end)
                     (@Fin.FS (S (S (S (S (S (S O))))))
                        (@Fin.FS (S (S (S (S (S O)))))
                           (@Fin.FS (S (S (S (S O))))
                              (@Fin.FS (S (S (S O))) (@Fin.FS (S (S O)) (@Fin.FS (S O) (@Fin.F1 O)))))))))))
         (methCod
            (Build_methSig
               (String (Ascii true true false false true true true false)
                  (String (Ascii false false false false true true true false)
                     (String (Ascii false false true true false true true false)
                        (String (Ascii true false false true false true true false)
                           (String (Ascii false false true false true true true false)
                              (String (Ascii true true false false true true true false) EmptyString))))))
               (@fst (list Type) (option Type)
                  ((fun idx : Fin.t (S (S (S (S (S (S (S O))))))) =>
                    @pair (list Type) (option Type)
                      match
                        match idx in (Fin.t m') return (Vector.t methSig m' -> methSig) with
                        | Fin.F1 q =>
                            fun v : Vector.t methSig (S q) =>
                            match
                              v as v' in (Vector.t _ m)
                              return
                                (match m as m0 return (Vector.t methSig m0 -> Type) with
                                 | O => fun _ : Vector.t methSig O => False -> True
                                 | S n => fun _ : Vector.t methSig (S n) => methSig
                                 end v')
                            with
                            | Vector.nil => fun devil : False => match devil return True with
                                                  end
                            | Vector.cons h n _ => h
                            end
                        | Fin.FS q p' =>

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Documentation on debugging regexp performance
  2016-01-21  5:29 Documentation on debugging regexp performance Clément Pit--Claudel
@ 2016-01-21  6:36 ` Yuri Khan
  2016-01-21  9:39   ` Alexis
  2016-01-21 11:42 ` Wolfgang Jenkner
  2016-01-21 15:27 ` Alan Mackenzie
  2 siblings, 1 reply; 13+ messages in thread
From: Yuri Khan @ 2016-01-21  6:36 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: Emacs developers

On Thu, Jan 21, 2016 at 11:29 AM, Clément Pit--Claudel
<clement.pit@gmail.com> wrote:
> Hi emacs-devel,
>
> I'm running into a surprising regular expressions issue. I have attached a file (~50k) in which (re-search-forward "   +[^:=]+ +:=?") seems to be extremely slow. (I killed it after 30 seconds). Truncating the file to its first 20 lines reduces the time for re-search-forward to about a second, which is still extremely slow.

I’m no expert on the Emacs regexp implementation, but this part is
ambiguous: "[^:=]+ +". The engine will have to backtrack at least once
because the first part will greedily slurp all spaces, then the second
part will not match. You might want to add the space to the exclusion
character class: "[^:= ]+ +".
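
To make the overlap concrete, here is a small sketch using Python's `re` module rather than Emacs. That choice is an assumption made so the snippet is easy to run: Python writes groups as `(...)` where Emacs uses `\\(...\\)`, and the sample strings are made up rather than taken from the attached file. It shows the greedy class first swallowing the trailing spaces and then giving one back, and the tighter class avoiding that overlap.

```python
import re

sample = "   foo   :="  # hypothetical input, not taken from the attached file

# Original shape: the class [^:=] also matches spaces, so it first grabs
# "foo" plus all three spaces and must backtrack one character so that
# " +" has a space left to match.
loose = re.search(r"   +([^:=]+) +:=?", sample)

# Suggested shape: with the space excluded, the class stops after "foo"
# and " +" takes the spaces, with no overlap between the two pieces.
tight = re.search(r"   +([^:= ]+) +:=?", sample)

print(repr(loose.group(1)))
print(repr(tight.group(1)))

# The tighter class is not equivalent: a name with an internal space
# no longer fits into the group.
spaced = re.search(r"   +([^:= ]+) +:=?", "   foo bar :=")
print(spaced)
```

Whether that change in meaning is acceptable depends on whether the text before `:` may itself contain spaces.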




* Re: Documentation on debugging regexp performance
  2016-01-21  6:36 ` Yuri Khan
@ 2016-01-21  9:39   ` Alexis
  2016-01-21 13:22     ` Clément Pit--Claudel
  2016-01-21 22:10     ` Marcin Borkowski
  0 siblings, 2 replies; 13+ messages in thread
From: Alexis @ 2016-01-21  9:39 UTC (permalink / raw)
  To: emacs-devel


Yuri Khan <yuri.v.khan@gmail.com> writes:

>> I'm running into a surprising regular expressions issue. I have 
>> attached a file (~50k) in which (re-search-forward
>> "   +[^:=]+ +:=?") seems to be extremely slow. (I killed it after 30 
>> seconds). Truncating the file to its first 20 lines reduces the 
>> time for re-search-forward to about a second, which is still 
>> extremely slow.
>
> I’m no expert on the Emacs regexp implementation, but this part 
> is ambiguous: "[^:=]+ +". The engine will have to backtrack at 
> least once because the first part will greedily slurp all 
> spaces, then the second part will not match. You might want to 
> add the space to the exclusion character class: "[^:= ]+ +".

More generally, i highly recommend Jeffrey Friedl's book 
"Mastering Regular Expressions". It's not Emacs-specific, but it 
provides in-depth explanations of why certain regexen are time- 
and/or space-hungry.


Alexis.




* Re: Documentation on debugging regexp performance
  2016-01-21  5:29 Documentation on debugging regexp performance Clément Pit--Claudel
  2016-01-21  6:36 ` Yuri Khan
@ 2016-01-21 11:42 ` Wolfgang Jenkner
  2016-01-21 16:38   ` Clément Pit--Claudel
  2016-01-21 15:27 ` Alan Mackenzie
  2 siblings, 1 reply; 13+ messages in thread
From: Wolfgang Jenkner @ 2016-01-21 11:42 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: Emacs developers

On Thu, Jan 21 2016, Clément Pit--Claudel wrote:

> I'm running into a surprising regular expressions issue. I have attached a file (~50k) in which (re-search-forward "   +[^:=]+ +:=?") seems to be extremely slow. (I killed it after 30 seconds). Truncating the file to its first 20 lines reduces the time for re-search-forward to about a second, which is still extremely slow. 

Perhaps you meant

(re-search-forward "   +[^:=\n]+ +:=?")

Cf. (info "(elisp) Regexp Special"), in particular the section about
"complemented character alternative".
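
As a quick illustration of that point, sketched in Python's `re` (an assumption for runnability; Python's negated classes also match newlines, which is the behaviour the manual section describes for Emacs), a complemented class crosses line boundaries unless the newline is listed in it. The sample text is invented.

```python
import re

text = "   foo\n   bar :="  # hypothetical two-line input

# Without \n in the class, the match can run across the line break.
across = re.search(r"   +([^:=]+) +:=?", text)

# With \n excluded, the candidate name has to sit on a single line.
single = re.search(r"   +([^:=\n]+) +:=?", text)

print(repr(across.group(1)))
print(repr(single.group(1)))
```

Excluding the newline also limits how far the engine can scan before giving up on a starting position, which is only a benefit if names never span lines.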




* Re: Documentation on debugging regexp performance
  2016-01-21  9:39   ` Alexis
@ 2016-01-21 13:22     ` Clément Pit--Claudel
  2016-01-21 22:10     ` Marcin Borkowski
  1 sibling, 0 replies; 13+ messages in thread
From: Clément Pit--Claudel @ 2016-01-21 13:22 UTC (permalink / raw)
  To: emacs-devel

[-- Attachment #1: Type: text/plain, Size: 436 bytes --]

On 01/21/2016 04:39 AM, Alexis wrote:
> More generally, i highly recommend Jeffrey Friedl's book "Mastering
> Regular Expressions". It's not Emacs-specific, but it provides
> in-depth explanations of why certain regexen are time- and/or
> space-hungry.

Thanks for the suggestion. I think I do need something Emacs-specific, however: Python's regexp engine has no trouble at all with the example provided; neither does grep's.



* Re: Documentation on debugging regexp performance
  2016-01-21  5:29 Documentation on debugging regexp performance Clément Pit--Claudel
  2016-01-21  6:36 ` Yuri Khan
  2016-01-21 11:42 ` Wolfgang Jenkner
@ 2016-01-21 15:27 ` Alan Mackenzie
  2016-01-21 16:37   ` Clément Pit--Claudel
  2 siblings, 1 reply; 13+ messages in thread
From: Alan Mackenzie @ 2016-01-21 15:27 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: Emacs developers

Hello, Clément.

On Thu, Jan 21, 2016 at 12:29:58AM -0500, Clément Pit--Claudel wrote:
> Hi emacs-devel,

> I'm running into a surprising regular expressions issue. I have
> attached a file (~50k) in which (re-search-forward "   +[^:=]+ +:=?")
> seems to be extremely slow. (I killed it after 30 seconds). Truncating
> the file to its first 20 lines reduces the time for re-search-forward
> to about a second, which is still extremely slow. 

> Are there good resources on how to rewrite regexps to make them
> Emacs-friendly? I didn't find such documentation, and I'm puzzled as to
> what could make the regexp above hard to re-search-forward for.

> Cheers,
> Clément.

"   +[^:=]+ +:=?" is an ill-formed regexp - if you get lots of spaces in
a non-match, the Emacs regexp engine will try all possible ways of
matching these spaces before giving up.  You have three concatenated
sub-expressions, all of which match any number of spaces, namely:

   " +[^:=]+ +"
    1122222233

I would suggest reformulating it thus:

   " +[^:= ][^:=]+ "
    112222223333334

Subexpression 1 matches ALL the leading spaces.  Subexp 2 is exactly one
character which can't be a space.  Subexp 3 matches almost anything,
including spaces, and subexp 4 matches a single space at the end (to make
sure there is at least one space there).
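
The "all possible ways" can be counted in a simplified model. The sketch below is in Python, chosen only for illustration, and `count_splits` is a hypothetical helper, not anything from Emacs. It enumerates how a run of n spaces can be divided among the three overlapping pieces ` +`, `[^:=]+` and ` +`, each of which needs at least one character. A real engine does not necessarily visit exactly these states, but the count gives a feel for the growth.

```python
from math import comb

def count_splits(n: int) -> int:
    """Count ways to cut a run of n spaces into three non-empty parts,
    one for each of the sub-expressions ' +', '[^:=]+' and ' +'."""
    ways = 0
    for first in range(1, n - 1):           # spaces taken by the leading ' +'
        for second in range(1, n - first):  # spaces taken by '[^:=]+'
            # The remaining n - first - second spaces (always at least one
            # here) go to the trailing ' +'.
            ways += 1
    return ways

for n in (5, 10, 20, 40):
    # The brute-force count agrees with the closed form C(n-1, 2).
    assert count_splits(n) == comb(n - 1, 2)
    print(n, count_splits(n))
```

The count is quadratic in n for a single starting position, and a failing search retries from later starting positions as well, which is consistent with deeply indented text being a bad case.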

All the best with your regexp!

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Documentation on debugging regexp performance
  2016-01-21 15:27 ` Alan Mackenzie
@ 2016-01-21 16:37   ` Clément Pit--Claudel
  2016-01-21 17:16     ` Alan Mackenzie
  0 siblings, 1 reply; 13+ messages in thread
From: Clément Pit--Claudel @ 2016-01-21 16:37 UTC (permalink / raw)
  To: Alan Mackenzie; +Cc: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 1405 bytes --]

On 01/21/2016 10:27 AM, Alan Mackenzie wrote:
> Hello, Clément.

Hi Alan!

> "   +[^:=]+ +:=?" is an ill-formed regexp - if you get lots of spaces in
> a non-match, the Emacs regexp engine will try all possible ways of
> matching these spaces before giving up.  You have three concatenated
> sub-expressions, all of which match any number of spaces, namely:
> 
>    " +[^:=]+ +"
>     1122222233
> 
> I would suggest reformulating it thus:
> 
>    " +[^:= ][^:=]+ "
>     112222223333334

I think this has different semantics: my original regexp requires at least three spaces. But I think prepending spaces to yours fixes that.

> 
> Subexpression 1 matches ALL the leading spaces.
> Subexp 2 is exactly one
> character which can't be a space.  Subexp 3 matches almost anything,
> including spaces, and subexp 4 matches a single space at the end (to make
> sure there is at least one space there).

This is helpful, thanks! I realize however that maybe I oversimplified. The issue is that what I really want is something like this:

"   +\\([^:=]+\\) +:=?"

IOW, I want to capture that first group.

> All the best with your regexp!

Thanks. Your points about backtracking were helpful as well. Do you know if there are technical reasons why Emacs chooses a backtracking implementation for this regexp (instead of compiling it to a linear-time matcher)?

Clément.



* Re: Documentation on debugging regexp performance
  2016-01-21 11:42 ` Wolfgang Jenkner
@ 2016-01-21 16:38   ` Clément Pit--Claudel
  0 siblings, 0 replies; 13+ messages in thread
From: Clément Pit--Claudel @ 2016-01-21 16:38 UTC (permalink / raw)
  To: Emacs developers

[-- Attachment #1: Type: text/plain, Size: 560 bytes --]

On 01/21/2016 06:42 AM, Wolfgang Jenkner wrote:
> On Thu, Jan 21 2016, Clément Pit--Claudel wrote:
> 
>> I'm running into a surprising regular expressions issue. I have attached a file (~50k) in which (re-search-forward "   +[^:=]+ +:=?") seems to be extremely slow. (I killed it after 30 seconds). Truncating the file to its first 20 lines reduces the time for re-search-forward to about a second, which is still extremely slow. 
> 
> Perhaps you meant
> 
> (re-search-forward "   +[^:=\n]+ +:=?")

I don't think so; I do want newlines in there.



* Re: Documentation on debugging regexp performance
  2016-01-21 16:37   ` Clément Pit--Claudel
@ 2016-01-21 17:16     ` Alan Mackenzie
  2016-01-23  6:12       ` Stefan Monnier
  0 siblings, 1 reply; 13+ messages in thread
From: Alan Mackenzie @ 2016-01-21 17:16 UTC (permalink / raw)
  To: Clément Pit--Claudel; +Cc: Emacs developers

Hello again Clément.

On Thu, Jan 21, 2016 at 11:37:48AM -0500, Clément Pit--Claudel wrote:
> On 01/21/2016 10:27 AM, Alan Mackenzie wrote:
> Hi Alan!

> > "   +[^:=]+ +:=?" is an ill-formed regexp - if you get lots of spaces in
> > a non-match, the Emacs regexp engine will try all possible ways of
> > matching these spaces before giving up.  You have three concatenated
> > sub-expressions, all of which match any number of spaces, namely:

> >    " +[^:=]+ +"
> >     1122222233

> > I would suggest reformulating it thus:

> >    " +[^:= ][^:=]+ "
> >     112222223333334

> I think this has different semantics: my original regexp requires at
> least three spaces. But I think prepending spaces to yours fixes that.

Sorry, yes, I'd extracted the interesting bit of your regexp, and forgot
that I'd done so.

> > Subexpression 1 matches ALL the leading spaces.
> > Subexp 2 is exactly one
> > character which can't be a space.  Subexp 3 matches almost anything,
> > including spaces, and subexp 4 matches a single space at the end (to make
> > sure there is at least one space there).

> This is helpful, thanks! I realize however that maybe I
> oversimplified. The issue is that what I really want is something like
> this:

> "   +\\([^:=]+\\) +:=?"

> IOW, I want to capture that first group.

That is ambiguous.  But if we can assume that the first group always
begins with a non-space, and always ends with a non-space, then we can
reformulate the above as:

    "   +\\([^:= ]\\([^:=]+[^:= ]\\)?\\) +:=?"
                                    ^

(or something similar - I've not actually tested it).  The ? inside the
first expression is to cope with there just being 1 single character
matched by the group.
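
Since the regexp above is untested, here is one way to try it out, sketched with Python's `re` instead of Emacs. This is an assumption for convenience: the pattern is transliterated by writing `\\(...\\)` as `(...)` and making the inner group non-capturing, and the sample strings are invented. In this transliteration a two-character name does not match, because `[^:=]+[^:= ]` needs at least two more characters after the first one. Replacing the inner `+` with `*` is one possible adjustment; it is not something proposed in the thread.

```python
import re

# Transliteration of the suggested Emacs regexp (inner group non-capturing).
suggested = re.compile(r"   +([^:= ](?:[^:=]+[^:= ])?) +:=?")
# Hypothetical variant: '*' lets the optional tail be a single character.
adjusted = re.compile(r"   +([^:= ](?:[^:=]*[^:= ])?) +:=?")

def name(pattern, text):
    """Return the captured name, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(1) if m else None

for text in ["   a :=", "   ab :=", "   abc :=", "   foo bar  :="]:
    print(repr(text), name(suggested, text), name(adjusted, text))
```

Both forms return the name without surrounding spaces, which the original capture group `\\([^:=]+\\)` does not guarantee.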

> > All the best with your regexp!

> Thanks. Your points about backtracking were helpful as well. Do you
> know if there are technical reasons why Emacs chooses a backtracking
> implementation for this regexp (instead of compiling it to a
> linear-time matcher)?

I'm afraid I don't know.  It might be that compiling a regexp for a
linear-time matcher would be slower.  Or, possibly, nobody has sat down
and hacked out a better regexp engine.

> Clément.

-- 
Alan Mackenzie (Nuremberg, Germany).




* Re: Documentation on debugging regexp performance
  2016-01-21  9:39   ` Alexis
  2016-01-21 13:22     ` Clément Pit--Claudel
@ 2016-01-21 22:10     ` Marcin Borkowski
  2016-01-22  7:02       ` Alexis
  2016-01-22 14:32       ` Clément Pit--Claudel
  1 sibling, 2 replies; 13+ messages in thread
From: Marcin Borkowski @ 2016-01-21 22:10 UTC (permalink / raw)
  To: Alexis; +Cc: emacs-devel


On 2016-01-21, at 10:39, Alexis <flexibeast@gmail.com> wrote:

> More generally, i highly recommend Jeffrey Friedl's book 
> "Mastering Regular Expressions". It's not Emacs-specific, but it 
> provides in-depth explanations of why certain regexen are time- 
> and/or space-hungry.

Also, this: https://swtch.com/~rsc/regexp/regexp1.html .  (Btw, the author
criticizes Friedl very strongly at the end; I am not sure whether this
is deserved.  Still, a very good read it is.)

> Alexis.

Hth,

-- 
Marcin Borkowski
http://octd.wmi.amu.edu.pl/en/Marcin_Borkowski
Faculty of Mathematics and Computer Science
Adam Mickiewicz University




* Re: Documentation on debugging regexp performance
  2016-01-21 22:10     ` Marcin Borkowski
@ 2016-01-22  7:02       ` Alexis
  2016-01-22 14:32       ` Clément Pit--Claudel
  1 sibling, 0 replies; 13+ messages in thread
From: Alexis @ 2016-01-22  7:02 UTC (permalink / raw)
  To: Marcin Borkowski; +Cc: emacs-devel


Marcin Borkowski <mbork@mbork.pl> writes:

> Also, this: https://swtch.com/~rsc/regexp/regexp1.html .  (Btw, 
> the author criticizes Friedl very strongly at the end; I am not 
> sure whether this is deserved.  Still, a very good read it is.)

That looks very interesting indeed - thanks!


Alexis.




* Re: Documentation on debugging regexp performance
  2016-01-21 22:10     ` Marcin Borkowski
  2016-01-22  7:02       ` Alexis
@ 2016-01-22 14:32       ` Clément Pit--Claudel
  1 sibling, 0 replies; 13+ messages in thread
From: Clément Pit--Claudel @ 2016-01-22 14:32 UTC (permalink / raw)
  To: emacs-devel


On 01/21/2016 05:10 PM, Marcin Borkowski wrote:
> 
> On 2016-01-21, at 10:39, Alexis <flexibeast@gmail.com> wrote:
> 
>> More generally, i highly recommend Jeffrey Friedl's book 
>> "Mastering Regular Expressions". It's not Emacs-specific, but it 
>> provides in-depth explanations of why certain regexen are time- 
>> and/or space-hungry.
> 
> Also, this: https://swtch.com/~rsc/regexp/regexp1.html .  (Btw, the author
> criticizes Friedl very strongly at the end; I am not sure whether this
> is deserved.  Still, a very good read it is.)

Indeed, a great read!



* Re: Documentation on debugging regexp performance
  2016-01-21 17:16     ` Alan Mackenzie
@ 2016-01-23  6:12       ` Stefan Monnier
  0 siblings, 0 replies; 13+ messages in thread
From: Stefan Monnier @ 2016-01-23  6:12 UTC (permalink / raw)
  To: emacs-devel

> I'm afraid I don't know.  It might be that compiling a regexp for a
> linear-time matcher would be slower.  Or, possibly, nobody has sat down
> and hacked out a better regexp engine.

That's about right.  I'd love to use some newer linear-time
regexp-engine.


        Stefan






Thread overview: 13+ messages
2016-01-21  5:29 Documentation on debugging regexp performance Clément Pit--Claudel
2016-01-21  6:36 ` Yuri Khan
2016-01-21  9:39   ` Alexis
2016-01-21 13:22     ` Clément Pit--Claudel
2016-01-21 22:10     ` Marcin Borkowski
2016-01-22  7:02       ` Alexis
2016-01-22 14:32       ` Clément Pit--Claudel
2016-01-21 11:42 ` Wolfgang Jenkner
2016-01-21 16:38   ` Clément Pit--Claudel
2016-01-21 15:27 ` Alan Mackenzie
2016-01-21 16:37   ` Clément Pit--Claudel
2016-01-21 17:16     ` Alan Mackenzie
2016-01-23  6:12       ` Stefan Monnier
