Author Topic: Danish Characters Issues - UTF-8  (Read 27636 times)


mboesen

  • Coppermine frequent poster
  • ***
  • Offline Offline
  • Gender: Male
  • Posts: 104
    • Boesen Photography & Design
Danish Characters Issues - UTF-8
« on: June 20, 2011, 11:06:26 am »

Hello

I have just installed 1.5.12 and am testing it (I am still using 1.4.x until 1.5.12 works properly for my needs), and unfortunately I have run into some problems with UTF-8 and special Danish characters. The problem isn't only in 1.5 but also in 1.4. It actually started when Adobe Lightroom was upgraded from 3.3 to 3.4. I know this is not a support forum for LR, but after several discussions on the Adobe forum I will also try here.

The problem occurs when exporting images from LR and then adding them to Coppermine. In LR 3.3 there was no problem, but now in LR 3.4 the Danish characters øæå are not displayed correctly. I get this:

FC Nordsjælland mod Randers FC
FC Nordsjælland mod Randers FC i Farum Park. Resultat 2-1. Mål af ....

I started out thinking that it is LR that has a problem. But according to Adobe, this is how the new version of LR is meant to work. They say:

"I think what you're seeing is by design. Lightroom is now following the Metadata Working Group guidelines for IPTC character set encoding, which requires always using  UTF-8 character set encoding.  This allows files to be exchanged correctly between systems that use different local character set encodings."

I tried to explain that Coppermine was indeed using UTF-8. Then I was directed to this....

Have a look at your image using Jeffrey's Exif Viewer (a WebInterface to ExifTool) and check the content there.

And in Jeffrey's viewer my data, coming from LR, is correct.

So now I really don't know where to start.

Can anyone help with this issue?

My gallery is at http://boesenfoto.dk/gallery/index.php

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #1 on: June 21, 2011, 08:47:11 am »

Morning guys

I have now run some tests and the problem seems to be in Coppermine.

I made an image and entered øæå. It is fine in Bridge (and other places where I can check the metadata), but not in Coppermine. The really weird thing is that when I tried ÆØÅ.øæå, the only thing that showed up in Coppermine was the ".". If I add text before it, then the øæå is imported, but in weird forms.
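One possible explanation for the lone "." is the `trim()` call in include/iptc.inc.php that is quoted later in this thread: its character list covers every byte from 0x7f to 0xff, so all UTF-8 multi-byte characters at the start and end of a string are stripped. The following is only a sketch under the assumption that the uploaded text passes through that call unchanged; the variable names are illustrative.

```php
<?php
// Character list copied from the trim() call quoted later in the thread.
$charlist = "\x7f..\xff\x0..\x1f";

// "ÆØÅ.øæå" written out as UTF-8 bytes so the example does not depend on file encoding.
$input = "\xC3\x86\xC3\x98\xC3\x85.\xC3\xB8\xC3\xA6\xC3\xA5";

// Every byte of the Danish letters is >= 0x7f, so trim() removes them from both ends.
$result = trim($input, $charlist);

var_dump($result); // string(1) "."
```

Non-ASCII characters in the middle of a string are not touched by `trim()`, which would fit the observation that øæå survives (in garbled form) when other text surrounds it.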

I am trying to get someone to explain to me what exactly the difference between LR 3.3 and 3.4.x is, since it suddenly doesn't work in Coppermine. I will return once I get the explanation.

Thanks for your time

Αndré

  • Administrator
  • Coppermine addict
  • *****
  • Country: de
  • Offline Offline
  • Gender: Male
  • Posts: 15764
Re: Danish Characters Issues - UTF-8
« Reply #2 on: June 22, 2011, 12:11:48 pm »

We're talking about EXIF/IPTC data, right? Afaik a user (also from the Nordic countries) already reported a similar issue. Please read this thread and check whether the suggestions fix your issues (at least partly).

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #3 on: June 24, 2011, 07:17:45 am »

Morning André

I did find that thread and started replying, but I believe the system said it was an old thread and asked whether I really wanted to reply, so I decided to start a new thread. After reading the thread I got the feeling that he still had the same problems and corrected them manually. That would be really hard for me when adding more than 100 images from soccer matches at the same time.

I got this from the Adobe thread where I discuss if it could be a problem in Lightroom....

"In Lightroom 3.4 we updated the reading and writing of metadata to conform to the guidelines of the Metadata Working Group specification. As part of the MWG spec, some changes were made to text encoding for reading/writing various metadata fields. See http://www.metadataworkinggroup.org/p... (page 32, 33). My guess is that the Coppermine might not follow the MWG spec."

Does that help at all? I don't know much about coding, so it doesn't help me much ;-)

Thanks for your time...

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #4 on: June 24, 2011, 07:32:58 am »

I just tried adding the config line. It changed the Danish characters again, including the already screwed-up ones :-/ But not into the correct letters. I tried to upload a new image with the mentioned characters, but still no luck...

Αndré

Re: Danish Characters Issues - UTF-8
« Reply #5 on: June 24, 2011, 10:51:10 am »

It's quite hard to test this with existing files. If you can, install a second test gallery on your server. Make sure the database character set is set to utf8 and check whether it makes a difference to use the additional dbcharset config line or not.

Can you please attach one or more images with metadata that causes trouble, so I can run some tests of my own? Thanks.

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #6 on: June 30, 2011, 02:14:11 pm »

Hi André

Sorry, I hadn't seen your answer. Apparently I don't get an email notification... gotta check that out ;-)

I did add the line and uploaded new photos without luck. I will look into making a new DB, but I am not that hardcore in all this DB stuff, so I am afraid to mess up the whole thing...

But until then I will add a few files for you to use for testing.

Thanks a lot

Michael

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #7 on: August 17, 2011, 11:02:58 am »

I am still having huge problems with this. I tried running a PHP check in Coppermine and found this:

iconv
iconv support             enabled
iconv implementation      glibc
iconv library version     1.11

Directive                 Local Value   Master Value
iconv.input_encoding      ISO-8859-1    ISO-8859-1
iconv.internal_encoding   ISO-8859-1    ISO-8859-1
iconv.output_encoding     ISO-8859-1    ISO-8859-1

Could that be the problem? And if so, is that something I have to change on my web host's server?

Michael

Αndré

Re: Danish Characters Issues - UTF-8
« Reply #8 on: August 17, 2011, 11:26:34 am »

Sorry, I overlooked this thread. I just did some testing. The IPTC data in your files is htmlentities-encoded, which seems to be the cause.

The following works for me. Open include/picmgmt.inc.php, find
Code:
                $title = (isset($iptc['Headline'])) ? $iptc['Headline'] : $title;
                $caption = (isset($iptc['Caption'])) ? $iptc['Caption'] : $caption;
                $keywords = (isset($iptc['Keywords'])) ? implode($CONFIG['keyword_separator'], $iptc['Keywords']) : $keywords;
and replace with
Code:
                $title = (isset($iptc['Headline'])) ? html_entity_decode($iptc['Headline']) : $title;
                $caption = (isset($iptc['Caption'])) ? html_entity_decode($iptc['Caption']) : $caption;
                $keywords = (isset($iptc['Keywords'])) ? implode($CONFIG['keyword_separator'], html_entity_decode($iptc['Keywords'])) : $keywords;
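One caveat with the keywords line above: `html_entity_decode()` expects a string, while the `implode()` call suggests that `$iptc['Keywords']` is an array, so passing the whole array is unlikely to produce usable keywords. A minimal sketch of decoding each keyword separately follows. `decode_iptc_keywords` is a hypothetical helper, not part of Coppermine, and the sample keywords and separator are made up for illustration.

```php
<?php
// Hypothetical helper (not Coppermine code): decode HTML entities in every
// keyword, then join the keywords with the configured separator.
function decode_iptc_keywords(array $keywords, $separator)
{
    $decoded = array_map(function ($keyword) {
        // The charset is passed explicitly because the default differs between PHP versions.
        return html_entity_decode($keyword, ENT_QUOTES, 'UTF-8');
    }, $keywords);

    return implode($separator, $decoded);
}

// Made-up sample data: entity-encoded keywords as they might come out of the IPTC reader.
$joined = decode_iptc_keywords(array('M&aring;l', 'Nordsj&aelig;lland'), ' ');
echo $joined, "\n";
```

In picmgmt.inc.php the same idea would mean applying the decoding per keyword before `implode()`, rather than to the array as a whole.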

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #9 on: August 17, 2011, 11:37:29 am »

Fantastic... that actually worked.....

except for one little thing. Now it doesn't include keywords at all?!

But thank you so much.... many many hours of searching the internet and trying out stuff has ended ;-)

Αndré

Re: Danish Characters Issues - UTF-8
« Reply #10 on: August 17, 2011, 11:38:17 am »

I just had a closer look at the IPTC data handling. The data will be sanitized and the function htmlentities is used to do that. Please disregard/undo my last code change suggestion.

Instead, open include/iptc.inc.php, find
Code:
$data=htmlentities(strip_tags(trim($data,"\x7f..\xff\x0..\x1f")),ENT_QUOTES); //sanitize data against sql/html injection; trim any nongraphical non-ASCII character:
and replace with
Code:
$data=htmlspecialchars(strip_tags(trim($data,"\x7f..\xff\x0..\x1f")),ENT_QUOTES); //sanitize data against sql/html injection; trim any nongraphical non-ASCII character:

I'm currently not sure if it may cause a security vulnerability.
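To illustrate why swapping the function can matter, here is a small sketch. It assumes the server runs a PHP version older than 5.4, where `htmlentities()` called without a charset argument treats its input as ISO-8859-1; the charset is passed explicitly below to reproduce that behaviour on any version. The sample strings are made up, and the `strip_tags()` and `trim()` steps of the original line are left out.

```php
<?php
// "Nordsjælland" as UTF-8 bytes, which is how Lightroom 3.4 is said to write IPTC text.
$iptc = "Nordsj\xC3\xA6lland";

// Reading the two UTF-8 bytes of "æ" as two ISO-8859-1 characters yields two entities.
$broken = htmlentities($iptc, ENT_QUOTES, 'ISO-8859-1');
echo $broken, "\n"; // Nordsj&Atilde;&brvbar;lland

// htmlspecialchars() only escapes markup characters and leaves the UTF-8 bytes alone.
$kept = htmlspecialchars($iptc, ENT_QUOTES, 'UTF-8');
echo ($kept === $iptc) ? "unchanged\n" : "changed\n"; // unchanged

// Markup characters are still escaped with ENT_QUOTES.
$escaped = htmlspecialchars("<b>\"M\xC3\xA5l\" & more</b>", ENT_QUOTES, 'UTF-8');
echo $escaped, "\n"; // &lt;b&gt;&quot;Mål&quot; &amp; more&lt;/b&gt;
```

A browser renders the two entities in the first result as "Ã¦", which matches the garbled titles reported at the start of the thread. The last call shows only that `<`, `>`, `&` and quotes are still escaped; it does not by itself settle the security question.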

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #11 on: August 17, 2011, 12:10:59 pm »

With that change I get an upload error...

The first solution worked well, except for the keywords, but I can live with that ;-)

But if you figure out a better solution, I am all ears!

Αndré

Re: Danish Characters Issues - UTF-8
« Reply #12 on: August 17, 2011, 12:24:32 pm »

"With that change I get an upload error..."

Please enable debug mode and post the error message.

mboesen

Re: Danish Characters Issues - UTF-8
« Reply #13 on: August 17, 2011, 01:28:41 pm »

I just checked all the files and removed the dbcharset line (suggested in the Swedish thread) from the config... and now it works perfectly.

I don't know if I missed something before or just uploaded a wrong version of a file.

But it works! Great.. finally it seems that I can upgrade to 1.5.x from 1.4

Thanks a lot

nemo12

  • Coppermine newbie
  • Offline Offline
  • Posts: 1
Re: Danish Characters Issues - UTF-8
« Reply #14 on: April 24, 2014, 07:50:14 am »

My solution for äöõü in Estonian:

function strip_IPTC($data) {
    if (is_array($data)) {
        foreach ($data as $key => $item) {
            $data[$key] = strip_IPTC($item);
        }
    } else {
        // line added by nemo12: convert ISO-8859-1 input to UTF-8 (rest of the function not shown)
        $data = mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1');

I added the last line.
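Note that an unconditional conversion like this only fits files whose IPTC text really is ISO-8859-1; text that is already UTF-8 (as described for Lightroom 3.4 earlier in the thread) would be encoded twice. A hedged sketch of a guarded variant follows. `iptc_to_utf8` is a hypothetical name, the mbstring extension is assumed to be available, and the validity check is a heuristic, since a few ISO-8859-1 byte sequences also happen to be valid UTF-8.

```php
<?php
// Hypothetical helper (not Coppermine code): convert to UTF-8 only when the
// input is not already valid UTF-8. Requires the mbstring extension.
function iptc_to_utf8($data)
{
    if (mb_check_encoding($data, 'UTF-8')) {
        // Already valid UTF-8 (this includes plain ASCII): leave it alone.
        return $data;
    }
    // Assumption: anything else is ISO-8859-1, as in the post above.
    return mb_convert_encoding($data, 'UTF-8', 'ISO-8859-1');
}

// "äöõü" as ISO-8859-1 bytes is converted to UTF-8.
$fromLatin1 = iptc_to_utf8("\xE4\xF6\xF5\xFC");

// "æ" that is already UTF-8 passes through without being double-encoded.
$alreadyUtf8 = iptc_to_utf8("\xC3\xA6");
```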