First, I would like to say that I am impressed by the quality of Coppermine and by the amount of work it represents.
Living in a country where 3 different languages are spoken, I paid special attention to the automatic language detection based on the Accept-Language and User-Agent HTTP headers.
GENERAL REMARK
MY SUGGESTION
The code below is faster and has more features. It is faster because it uses the PCRE regex functions, which are *much* faster than the POSIX ones. In a small benchmark (100 loops), the new code is 3 times faster when an Accept-Language string is present and up to 5 times faster on the User-Agent string.
As for the new feature: in its definition of the HTTP Accept-Language header, the W3C copy of RFC 2616 says:
"Each language-range MAY be given an associated quality value which represents an estimate of the user's preference for the languages specified by that range. The quality value defaults to "q=1"."
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4
My code below takes the user's preferences into account by sorting the language tokens by their weight (q=0.x).
For example, if the Accept-Language string looks like
ww,ww-zz,de=0.2;q=0.1,it;q=0.5,en;q=0.3
the code will disregard the non-existent ww and ww-zz tags and will pick the available language tag that has the highest q factor, it in this case.
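To see the ordering step in isolation, here is a minimal sketch. The helper name sort_tags_by_q is hypothetical (it is not part of Coppermine), and the closure assumes PHP 5.3 or later. It only splits the header and sorts the tags by q factor; it does not check them against the available languages.

```php
<?php
// Hypothetical helper, for illustration only: returns the language tags of an
// Accept-Language value ordered from highest to lowest q factor.
function sort_tags_by_q($header) {
    $entries = array();
    foreach (explode(',', $header) as $position => $token) {
        $parts = explode(';q=', trim($token));
        // A token without an explicit q factor defaults to q=1.
        $q = isset($parts[1]) ? (float) $parts[1] : 1.0;
        $entries[] = array($parts[0], $q, $position);
    }
    // Sort on q, highest first; equal q factors keep their header order.
    usort($entries, function ($a, $b) {
        if ($a[1] == $b[1]) {
            return $a[2] - $b[2];
        }
        return ($a[1] < $b[1]) ? 1 : -1;
    });
    $tags = array();
    foreach ($entries as $entry) {
        $tags[] = $entry[0];
    }
    return $tags;
}

print_r(sort_tags_by_q('ww,ww-zz,de=0.2;q=0.1,it;q=0.5,en;q=0.3'));
```

With the example string, the tags come out in the order ww, ww-zz, it, en, de=0.2. Since ww and ww-zz match no available language, it is the first tag that can match.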
function lang_detect_q($available_languages) {
    if (!empty($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
        $language_tokens = explode(',', $_SERVER['HTTP_ACCEPT_LANGUAGE']);
        // loop through each Accept-Language token and find its quality level (e.g. q=0.8)
        $lang_tag = $quality_tag = array();
        foreach ($language_tokens as $language_token) {
            // split on ";q="; trim first, because browsers often send a space after the comma
            $q_explode = explode(';q=', trim($language_token));
            // if the token has no q factor, the default q value is 1
            $q = isset($q_explode[1]) ? (float) $q_explode[1] : 1.0;
            // add language_tag and quality_tag to their arrays
            $lang_tag[] = $q_explode[0];
            $quality_tag[] = $q;
        }
        // sort the array on value in reverse order (higher quality first), keeping the keys
        // array_multisort was too slow
        arsort($quality_tag);
        // loop through every entry of the quality_tag array
        foreach ($quality_tag as $q_key => $q_val) {
            // loop through each of the available_languages
            foreach ($available_languages as $key => $language) {
                if (preg_match('#^(?:' . $language[0] . ')#i', $lang_tag[$q_key])) {
                    // exit function on first match
                    return $available_languages[$key][1];
                }
            }
        }
    // if Accept-Language is not present in the client's HTTP header, we try the User-Agent string
    } elseif (!empty($_SERVER['HTTP_USER_AGENT'])) {
        // once again, loop through each of the available_languages
        foreach ($available_languages as $key => $language) {
            if (preg_match('#[(,; [](?:' . $language[0] . ')[]),;]#i', $_SERVER['HTTP_USER_AGENT'])) {
                // exit function on first match
                return $available_languages[$key][1];
            }
        }
    }
    // if nothing was found, exit the function with false (or a default language value if necessary)
    return false;
}

$lang = lang_detect_q($available_languages);
// if we caught a valid language, configure it
if ($lang) {
    $USER['lang'] = $lang;
}
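The User-Agent branch can be tried on its own with a small sketch like the one below. The helper name ua_has_language is hypothetical, and the sample User-Agent strings are made up for illustration; only the regular expression is taken from the function above.

```php
<?php
// Hypothetical helper, for illustration only: applies the same User-Agent
// test as lang_detect_q() to a single language pattern.
function ua_has_language($pattern, $user_agent) {
    // The tag must follow one of "(", ",", ";", space or "["
    // and be followed by one of "]", ")", "," or ";".
    return preg_match('#[(,; [](?:' . $pattern . ')[]),;]#i', $user_agent) === 1;
}

$fr = 'fr(?:-[[:alpha:]]{2})?|french';

// Illustrative sample strings, not taken from real traffic.
var_dump(ua_has_language($fr, 'Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; fr)'));
var_dump(ua_has_language($fr, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; fr-CH; rv:1.8)'));
var_dump(ua_has_language($fr, 'Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0'));
```

The first two strings carry a delimited fr or fr-CH tag and match; the third has no language tag, so the function would fall through and return false.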
As for the $available_languages array, the PCRE functions run slightly faster when the grouping parentheses (option1|option2) are made non-capturing, as in (?:option1|option2). So an entry becomes, for example:
'fr' => array('fr(?:-[[:alpha:]]{2})?|french', 'french', 'fr'),
Let me know if something needs to be changed.