Beider-Morse + Daitch-Mokotoff Phonetic Matching (soundex) Algorithm
aurek/BMDMSoundex
Beider-Morse + Daitch-Mokotoff Soundex Algorithm

This is a fork of the algorithm developed by Alexander Beider and Stephen P. Morse for phonetic matching of names and words. The algorithm produces fewer false hits than soundex() and metaphone(). It can also be used with some non-Latin alphabets without transliteration.
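
To illustrate the kind of false hit mentioned above, here is a minimal sketch that uses only PHP's built-in soundex() and no code from this project. The names are the usual textbook examples, chosen for illustration; the sketch does not show what BMPM returns for them.

```php
<?php

// Classic Soundex keeps the first letter and encodes the following consonants,
// so distinct names can collapse onto one code.
$names = array('Robert', 'Rupert', 'Rubin');

// Group the names by their Soundex code to expose collisions.
$groups = array();
foreach ($names as $name) {
    $groups[soundex($name)][] = $name;
}

print_r($groups);
// Robert and Rupert share the code R163; Rubin gets R150.
```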

Credits

Authors: Alexander Beider, Paris and Stephen P. Morse, San Francisco
Website: http://stevemorse.org/phoneticinfo.htm (source download, information and contacts)

Information

17 languages are currently supported: Arabic, Czech, Dutch, English, French, German, Greek, Hebrew, Hungarian, Italian, Latvian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish. For Russian and Greek, both the native and Latin alphabets are supported. BMPM (Beider-Morse Phonetic Matching) can also parse Hebrew names using Ashkenazic or Sephardic rules.

Differences

The goal of this fork is to remove deprecated and global functions and global variables, and to present the algorithm in an OOP-like style. Some fixes and modifications were also made for the sake of unification. Because the code is no longer purely procedural, the algorithm can be included in frameworks and third-party applications without a headache. Experimental support for Latvian has been added.

Requirements

PHP 5+; mbstring extension
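
The requirements can be checked at runtime before loading the library. The sketch below is only an illustration: the helper name bmdmMissingRequirements() is hypothetical and is not part of this project; it relies solely on PHP's built-in version_compare() and extension_loaded().

```php
<?php

// Hypothetical helper (not part of the library): lists unmet requirements.
function bmdmMissingRequirements()
{
    $missing = array();

    // The README asks for PHP 5 or newer.
    if (version_compare(PHP_VERSION, '5.0.0', '<')) {
        $missing[] = 'PHP 5 or newer';
    }

    // Multibyte string functions come from the mbstring extension.
    if (!extension_loaded('mbstring')) {
        $missing[] = 'mbstring extension';
    }

    return $missing;
}

$missing = bmdmMissingRequirements();
if ($missing) {
    echo 'Missing: ' . implode(', ', $missing) . PHP_EOL;
} else {
    echo 'All requirements met.' . PHP_EOL;
}
```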

Usage

<?php

require_once 'phonetic/Phonetic.php';
$phonetic = Phonetic::app()->run();

// Process a string with the Beider-Morse algorithm and retrieve the BM phonetic keys
$p = $phonetic->BMSoundex->getPhoneticKeys('Hello world');

// Try to guess the string's language
$l = $phonetic->BMSoundex->getPossibleLanguages('Grzegorz');

// Retrieve all supported languages
$g = $phonetic->BMSoundex->getLanguages();

// Process a string with the Beider-Morse algorithm, then with Daitch-Mokotoff Soundex
$b = $phonetic->BMSoundex->getNumericKeys('ברצלונה');

Ashkenazic and Sephardic support

<?php

require_once 'phonetic/Phonetic.php';

// Use 'sep' instead of 'ash' to initialize the Sephardic module
$phonetic = Phonetic::app()->run('ash');

Multiple languages in one string

<?php

require_once 'phonetic/Phonetic.php';
$phonetic = Phonetic::app()->run();

$p = $phonetic->BMSoundex->getPhoneticKeys('This is Спарта!');
$n = $phonetic->BMSoundex->getNumericKeys('This is Спарта!');
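
When debugging mixed input like the string above, it can help to see which script each word uses before passing it to the library. The following is a rough sketch under stated assumptions: wordsByScript() is a hypothetical helper, not part of this project, and it does not reflect how the library detects languages internally. It uses PCRE Unicode script properties and expects UTF-8 input.

```php
<?php

// Hypothetical helper (not part of the library): label each word with its script.
function wordsByScript($text)
{
    // Split on every run of characters that are not Unicode letters.
    $words = preg_split('/[^\p{L}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);

    $result = array();
    foreach ($words as $word) {
        if (preg_match('/\p{Cyrillic}/u', $word)) {
            $script = 'Cyrillic';
        } elseif (preg_match('/\p{Latin}/u', $word)) {
            $script = 'Latin';
        } else {
            $script = 'other';
        }
        $result[] = array($word, $script);
    }

    return $result;
}

print_r(wordsByScript('This is Спарта!'));
// "This" and "is" are labelled Latin, "Спарта" is labelled Cyrillic.
```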

Different languages matching

<?php

require_once 'phonetic/Phonetic.php';
$phonetic = Phonetic::app()->run();

// Words with the same pronunciation in different languages
// usually produce overlapping results.

print_r($phonetic->BMSoundex->getNumericKeys('Zelinska'));
print_r($phonetic->BMSoundex->getNumericKeys('Зелинска'));

// Array
// (
//     [0] => Array
//         (
//             [0] => 486450
//         )
//
// )
//
// Array
// (
//     [0] => Array
//         (
//             [0] => 486450
//         )
//
// )
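
To turn the comparison above into a yes/no match, the two results can be flattened and intersected. This is a minimal sketch, assuming the nested array shape shown in the output above; keysIntersect() is a hypothetical helper and not part of this project. It uses closures, so it needs PHP 5.3 or newer.

```php
<?php

// Hypothetical helper (not part of the library): true when two key sets share a key.
function keysIntersect(array $a, array $b)
{
    // Collect every leaf value of a nested array as a string.
    $flatten = function (array $nested) {
        $flat = array();
        array_walk_recursive($nested, function ($value) use (&$flat) {
            $flat[] = (string) $value;
        });
        return $flat;
    };

    return count(array_intersect($flatten($a), $flatten($b))) > 0;
}

// Literal copies of the output printed for 'Zelinska' and 'Зелинска'.
$latin = array(array('486450'));
$cyrillic = array(array('486450'));

var_dump(keysIntersect($latin, $cyrillic)); // bool(true)
```

Treating any shared key as a match is a deliberately loose criterion; a stricter application might require a minimum number of shared keys.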

License

This project is distributed under the GNU GPL v3 in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

Copyright (c) 2013 Olegs Capligins
