RegEx Corner: Phone Number Formatting

We recently encountered a requirement on a project that the phone number collected in a webform be formatted to the E.164 standard before being passed along to an external system via a custom handler. To do this, we'll need to construct a regular expression and then use it within a PHP function to format the number.

The basic task is to recognize a phone number regardless of a variety of valid user inputs: Are there parentheses around the area code or not? Spaces or dashes? Country code or no country code? The regex needs to be agnostic to these concerns and still produce a match without false positives or negatives.

(?:\d{1}-?\s?)?\(?(\d{3})\)?-?\s?(\d{3})-?\s?(\d{4})

This is the final regular expression we came up with. Let's break it down, piece by piece!

(?:\d{1}-?\s?)?

This first section is optional because of the trailing ? character. It is a non-capturing group that recognizes a single-digit country code, optionally followed by a dash or a space. What's a non-capturing group? It is signified by the (?: at the front, and just means that we want to match this text but don't need to refer to it later. This lets us accept phone numbers with or without a country code prefix: when the prefix is present it is consumed as part of the overall match, but it is not stored in a numbered group. Note that the group matches the digit only, not a leading + sign.
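To make the difference between matching and capturing concrete, here is a minimal sketch using preg_match(). The sample numbers and variable names are ours, chosen only for illustration.

```php
<?php
// The article's pattern; only the / delimiters are added.
$regex = '/(?:\d{1}-?\s?)?\(?(\d{3})\)?-?\s?(\d{3})-?\s?(\d{4})/';

preg_match($regex, '1-555-123-4567', $withCode);
preg_match($regex, '555-123-4567', $withoutCode);

// Index 0 holds the full match, which includes the country code.
echo $withCode[0], PHP_EOL;

// Indexes 1-3 hold the captured groups. The country code is not among them,
// so both inputs yield the same three pieces.
echo implode(' ', array_slice($withCode, 1)), PHP_EOL;
echo implode(' ', array_slice($withoutCode, 1)), PHP_EOL;
```

The country code shows up in the full match but never in the numbered groups, which is exactly what lets us rebuild the number from groups 1 to 3 alone.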

\(?(\d{3})\)?

The second section is a normal (capturing) group that matches the three-digit area code of the phone number. We optionally match opening and/or closing parentheses.
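As a quick illustration of this fragment on its own, the sketch below runs it against a few hypothetical inputs. Because each parenthesis is optional independently, unbalanced input such as "(555" is accepted too.

```php
<?php
// Only the area-code fragment of the pattern, wrapped in / delimiters.
$areaCode = '/\(?(\d{3})\)?/';

$captured = [];
foreach (['(555)', '555', '(555', '555)'] as $input) {
    preg_match($areaCode, $input, $m);
    // Group 1 contains the digits only; the parentheses are never captured.
    $captured[$input] = $m[1];
}

print_r($captured);
```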

-?\s?(\d{3})

In this bit, we capture the three-digit prefix of the phone number, potentially separated from the area code by a dash (-?), a whitespace character (\s?), or a dash followed by whitespace.

-?\s?(\d{4})

Finally, we do the same thing again, this time capturing the remaining four digits of the number (again with an optional dash or space).
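With all four pieces in place, a short sketch can confirm that differently formatted inputs produce the same three captured groups. The helper name phoneParts, the constant PHONE_REGEX, and the sample numbers are hypothetical and not part of the project code.

```php
<?php
// The article's pattern; only the / delimiters are added.
const PHONE_REGEX = '/(?:\d{1}-?\s?)?\(?(\d{3})\)?-?\s?(\d{3})-?\s?(\d{4})/';

// Hypothetical helper: returns [area code, prefix, line number], or null
// when the pattern finds no match in the input.
function phoneParts(string $input): ?array {
    if (preg_match(PHONE_REGEX, $input, $m) !== 1) {
        return null;
    }
    return [$m[1], $m[2], $m[3]];
}

$samples = ['(555) 123-4567', '555 123 4567', '5551234567', '1 555-123-4567', '555-1234'];
foreach ($samples as $sample) {
    $parts = phoneParts($sample);
    echo $sample, ' => ', $parts === null ? 'no match' : implode(' ', $parts), PHP_EOL;
}
```

The first four samples all produce 555, 123, and 4567. The last one has only seven digits, so the pattern does not match it.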

The webform in question only collects information from users in the US and Canada. This allowed us to be pretty specific about the makeup of that phone number, and also means that the country code prefix should always be a +1. This, coupled with our regex, gives us the pieces we need to format the final number in a PHP function.

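  private function e164Convert(string $tel): string {
    // Groups 1-3 capture the area code, prefix, and line number; the optional
    // country code is matched but not captured.
    $regex = '/(?:\d{1}-?\s?)?\(?(\d{3})\)?-?\s?(\d{3})-?\s?(\d{4})/';
    $replacement = '+1$1$2$3';
    // The pattern does not match a leading "+", so strip one first; otherwise
    // "+1 555..." would come back as "++1555...".
    $result = preg_replace($regex, $replacement, ltrim(trim($tel), '+'));
    // preg_replace() returns null on error; fall back to the original input.
    return $result ?? $tel;
  }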

Since the country code in our regex is a non-capturing group, it isn't part of the groups we're concatenating. Instead, the replacement string starts with a literal +1 followed by the three captured groups, giving us our nicely E.164-formatted phone number!
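One thing to keep in mind is that the pattern is unanchored, so preg_replace() rewrites the phone number wherever it appears and leaves any surrounding text in place. If the handler should instead reject input that is not entirely a phone number, a stricter variant might look like the sketch below. The function name toE164OrNull, the ^ and $ anchors, and the optional leading \+ are our assumptions, not part of the original regex.

```php
<?php
/**
 * Hypothetical stricter variant: returns an E.164 string, or null when the
 * whole input is not a single US/Canada-style phone number.
 */
function toE164OrNull(string $tel): ?string {
    // The article's pattern wrapped in ^...$ with an optional leading "+"
    // (both additions are assumptions made for this sketch).
    $regex = '/^\+?(?:\d{1}-?\s?)?\(?(\d{3})\)?-?\s?(\d{3})-?\s?(\d{4})$/';
    if (preg_match($regex, trim($tel), $m) !== 1) {
        return null;
    }
    // Same idea as the replacement string: a literal +1 plus groups 1-3.
    return '+1' . $m[1] . $m[2] . $m[3];
}

foreach (['+1 (555) 123-4567', '555-123-4567', '555-123-4567 ext. 9', '555-1234'] as $input) {
    var_dump(toE164OrNull($input));
}
```

Returning null for anything that is not a clean match leaves it to the caller to decide whether to show a validation error or pass the raw value along.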