
Why don't word-boundary searches with \b work for me?


Two common misconceptions are that \b is a synonym for \s+, and that it's the edge between whitespace characters and non-whitespace characters. Neither is correct. \b is the place between a \w character and a \W character (that is, \b is the edge of a ``word''); the positions just before the start and just after the end of the string count as \W for this purpose. It's a zero-width assertion, just like ^, $, and all the other anchors, so it doesn't consume any characters. The perlre manpage describes the behaviour of all the regexp metacharacters.
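
To make the zero-width nature of \b concrete, here is a minimal sketch that lists the offsets at which \b matches in a string. The helper name boundary_offsets is hypothetical and not part of the FAQ; it relies only on the standard behaviour of m//g and pos().

```perl
use strict;
use warnings;

# Hypothetical helper: returns every offset at which \b matches in $text.
sub boundary_offsets {
    my ($text) = @_;
    my @offsets;
    # \b is zero-width, so each match is empty and pos() reports the
    # position of the boundary itself. Perl will not match an empty
    # string twice at the same position, so the loop always advances.
    while ($text =~ /\b/g) {
        push @offsets, pos($text);
    }
    return @offsets;
}

print join(",", boundary_offsets("two words")), "\n";          # prints 0,3,4,9
print join(",", boundary_offsets(" =matchless= text")), "\n";  # prints 2,11,13,17
```

Note that in the second string there is no boundary at offset 1, between the leading space and the first ``='', because both characters are \W.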

Here are examples of the incorrect application of \b, with fixes:

    "two words" =~ /(\w+)\b(\w+)/;          # WRONG
    "two words" =~ /(\w+)\s+(\w+)/;         # right

    " =matchless= text" =~ /\b=(\w+)=\b/;   # WRONG
    " =matchless= text" =~ /=(\w+)=/;       # right

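The four patterns above can be checked directly. The following is a small sketch, with a hypothetical helper named captures, that runs each match in list context and reports what was captured; a failed match yields an empty list.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the captured groups, or an empty list
# if the pattern does not match.
sub captures {
    my ($text, $re) = @_;
    my @groups = $text =~ $re;
    return @groups;
}

# \b consumes nothing, so the second \w+ would have to start at the space.
my @wrong1 = captures("two words", qr/(\w+)\b(\w+)/);
my @right1 = captures("two words", qr/(\w+)\s+(\w+)/);

# There is no boundary between a space and "=", since both are \W.
my @wrong2 = captures(" =matchless= text", qr/\b=(\w+)=\b/);
my @right2 = captures(" =matchless= text", qr/=(\w+)=/);

print scalar(@wrong1), " captures\n";   # prints 0 captures
print "@right1\n";                      # prints two words
print scalar(@wrong2), " captures\n";   # prints 0 captures
print "@right2\n";                      # prints matchless
```
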
Although they may not do what you thought they did, \b and \B can still be quite useful. For an example of the correct use of \b, see the example of matching duplicate words over multiple lines.

An example of using \B is the pattern \Bis\B. This will find occurrences of ``is'' on the insides of words only, as in ``thistle'', but not ``this'' or ``island''.
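
As a quick illustration of the \Bis\B pattern, this sketch filters the three words mentioned above. The \bis\b pattern and the word ``is'' are added here only for contrast and are assumptions of this example, not part of the FAQ text.

```perl
use strict;
use warnings;

my @words = qw(thistle this island is);

# \Bis\B requires a \w character on both sides of "is".
my @inside = grep { /\Bis\B/ } @words;

# For contrast: \bis\b requires "is" to stand alone as a whole word.
my @whole = grep { /\bis\b/ } @words;

print "inside: @inside\n";   # prints inside: thistle
print "whole: @whole\n";     # prints whole: is
```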


Source: Perl FAQ: Regexps
Copyright: Copyright (c) 1997 Tom Christiansen and Nathan Torkington.



(Corrections, notes, and links courtesy of RocketAware.com)


RocketAware.com is a service of Mib Software
Copyright 2000, Forrest J. Cavalier III. All Rights Reserved.