Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bahrain (BH) postcode regex incorrect #207

Open
BenWalters opened this issue Jan 30, 2021 · 6 comments
Open

Bahrain (BH) postcode regex incorrect #207

BenWalters opened this issue Jan 30, 2021 · 6 comments
Assignees

Comments

@BenWalters
Copy link

Hi,

I believe the postcode regex for Bahrain to be incorrect.
https://www.gstatic.com/chrome/autofill/libaddressinput/chromium-i18n/ssl-aggregate-address/data/BH

It is currently (?:\d|1[0-2])\d{2} however I've found that postcodes are:

Valid post code numbers are 101 to 1216 with gaps in the range. Known as block number (Arabic: رقم المجمع‎) formally. The first digit in NNN format and the first two digits in NNNN format refer to one of the 12 municipalities of the country. PO Box address doesn't need a block number or city name, just the PO Box number followed by the name of the country, Bahrain.

And https://en.youbianku.com/Bahrain

@roelofr
Copy link

roelofr commented Jan 31, 2021

I made a quick script to find all numbers on the page, the gist is here.

The table on https://en.youbianku.com/Bahrain does seem to match this 1-12 range, so the following regexp would suffice:

(?:\d|1[0-2])\d{2}

@BenWalters
Copy link
Author

@roelofr When I initially looked at the RegEx I too thought it would be okay but I had a play around and it seems to be ineffective for postcodes greater than 999: https://regex101.com/r/39cZU1/1/

I believe ^(1[0-1][0-9]{2}|12[0-1][0-6]|[1-9][0-9]{2}|10[1-9])$ might be a better solution: https://regex101.com/r/9nASya/1

@roelofr
Copy link

roelofr commented Feb 2, 2021

I believe ^(1[0-1][0-9]{2}|12[0-1][0-6]|[1-9][0-9]{2}|10[1-9])$ might be a better solution

This is very expressive for a regex, but mostly, it seems the Google API doesn't return start or end markers, possibly to allow finding postal codes in longer strings.

The pattern I posted matches 2103 matches 103, and ignores the 2. How could I miss that, thanks for the correction 👍

So, to fix this, we probably need to check for word markers or EOLs.
Since the last two digits are always required, I tried keeping them out of the alternate match.

So, my v2 would be:

(?:^|\b)(?:1[0-2]|[1-9])\d{2}(?:$|\b)

I ran it again your pattern, and added a 012, which should be invalid too. Results are here.

Small edit: made all groups non-capturing groups.

@BenWalters
Copy link
Author

(?:^|\b)(?:1[0-2]|[1-9])\d{2}(?:$|\b)

Looks good to me! I'd be more than happy with that being implemented.

@BenWalters
Copy link
Author

@sockix @sebsg are you able to assist with this change?

@sockix
Copy link
Contributor

sockix commented Feb 25, 2021

Hi Ben. Yep, I will take a look and see if I'm able to make the change.

@sockix sockix self-assigned this Feb 25, 2021
bojanz added a commit to bojanz/address that referenced this issue Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants