On Sat, Jan 23, 2021 at 07:38:49AM +0100, moasenwood--- via Users list for the GNU Emacs text editor wrote:
> moasenwood--- via Users list for the GNU Emacs text editor wrote:
>
> > Can I parse/split a string into sentences based on
> > human-language punctuation?
> >
> > Did anyone do that already?
>
> I mean very mechanically is fine, no linguistics or anything.
>
> So this
>
>   "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played
>   through amazon.com alexa speakers?"
>
> would be
>
>   ("'" "This sentence is spoken by Mr" "." "W" "." "E" "." "B
>   Dubois" "," "Esq" "." "!" "'" "played through amazon" "."
>   "com" "alexa "speakers" "?")

Not exactly your result, but this comes close:

  (split-string "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played through amazon.com alexa speakers?"
                "[[:punct:]][[:space:]]*")

  => ("" "This sentence is spoken by Mr" "W" "E" "B Dubois" "Esq" "" ""
      "played through amazon" "com alexa speakers" "")

You can adjust the results by tweaking the regexp (try word boundaries like '\<' and '\>' if you want to keep punctuation) or by using split-string's other optional parameters (e.g. OMIT-NULLS to drop the empty matches).

Cheers
- t
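
For illustration, here is a minimal sketch of the two adjustments mentioned above. The first passes a non-nil OMIT-NULLS argument to split-string to drop the empty strings. The second keeps each punctuation character as its own element; it does so with a plain string-match loop rather than the word-boundary regexp suggested above, so treat it as one possible approach, not the one the reply had in mind. The names my/sample and my/split-keeping-punct are hypothetical and exist only for this example.

```elisp
;; Hypothetical names for illustration only: my/sample, my/split-keeping-punct.
(defconst my/sample
  "'This sentence is spoken by Mr. W. E. B Dubois, Esq.!' played through amazon.com alexa speakers?")

;; 1. Same regexp as above, with OMIT-NULLS non-nil to drop empty strings.
(split-string my/sample "[[:punct:]][[:space:]]*" t)
;; => ("This sentence is spoken by Mr" "W" "E" "B Dubois" "Esq"
;;     "played through amazon" "com alexa speakers")

;; 2. Keep every punctuation character as a separate element.
(defun my/split-keeping-punct (string)
  "Split STRING at punctuation, returning text runs and punctuation marks.
Whitespace directly around a punctuation character is discarded."
  (let ((start 0)
        (parts '()))
    (while (string-match "[[:space:]]*\\([[:punct:]]\\)[[:space:]]*"
                         string start)
      ;; Read the match data once, before calling anything else.
      (let ((beg (match-beginning 0))
            (end (match-end 0))
            (punct (match-string 1 string)))
        ;; Text between the previous match and this one, if any.
        (when (> beg start)
          (push (substring string start beg) parts))
        (push punct parts)
        (setq start end)))
    ;; Trailing text after the last punctuation character, if any.
    (when (< start (length string))
      (push (substring string start) parts))
    (nreverse parts)))

(my/split-keeping-punct my/sample)
;; => ("'" "This sentence is spoken by Mr" "." "W" "." "E" "." "B Dubois"
;;     "," "Esq" "." "!" "'" "played through amazon" "."
;;     "com alexa speakers" "?")
```

Like the split-string call, this is purely mechanical: it splits "amazon.com" and "Mr." at the period just as the original request asked, and it leaves "com alexa speakers" as a single element.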