Title: Securing a gallery from search engines
Post by: M3 on September 19, 2004, 11:06:01 pm
My last host ran into a problem when I installed a Coppermine gallery on their system. I only got about half of my pictures uploaded (about 2,500) when things started to crash (with little to no visitors, I might add). I was told later that it brought down the entire MySQL database on the webhost and everyone lost their galleries and message boards (I'm not sure whether this is true; the host proved to be a liar on other things).

I was told that the biggest problem was that search engines were caching my images, and they were going to modify a new installation to prevent that. They were also going to modify the code so that only registered visitors would see a full size image.

Does this sound normal to you guys? Have you heard of this happening before? Search engines?!

I know there is a mod for the "registered visitors only" in the FAQ which I will apply myself. However, my old hosts have since thrown a tantrum, cancelled my services, and refused to even tell me what work they'd done on my galleries to modify them.

I'm wary of installing Coppermine on my new host for fear this happens again. I can cut down on the usage with the FAQ mod, but what can I do about the search engines?

And I know you guys like to have people "figure it out for themselves" but I'm not techy in any way. It's like handing a gentile something in Hebrew and then telling them to find the answers in there. Being told to try and work it out yourself is incredibly frustrating, especially when someone has come so far and learned so much already but is stumped on something so technical that it's completely over their heads. Any specific, detailed, dumbed down responses or code snippets would be MUCH appreciated, even if they are sent in a private message. I'm determined to get my site functional again, and I hate to abandon Coppermine and move to another gallery system. 

In fact, would it be easier for me overall just to offer someone on this board money to modify and install a gallery for me to ensure success?
Title: Re: Securing a gallery from search engines
Post by: Casper on September 20, 2004, 12:00:37 am
Try this to prevent search engines eating your bandwidth.

Open your theme/template.html, and then place this code in the head section of the HTML:

Code:
Do this in every theme/template.html you have.
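The one-line snippet itself is not reproduced in this post. A minimal sketch of "do this in every theme/template.html", assuming the line in question is a standard robots meta tag and that the themes live in a `themes/<name>/template.html` layout (both are assumptions, and the function names are made up for illustration), might look like this:

```python
import re
from pathlib import Path

# Assumption: a standard robots meta tag. The exact line from the post is not shown.
ROBOTS_META = '<meta name="robots" content="noindex, nofollow">'


def add_robots_meta(html: str) -> str:
    """Return html with ROBOTS_META inserted right after the opening <head> tag."""
    # Leave the text alone if it already carries a robots meta tag.
    if re.search(r'<meta\s+name=["\']robots["\']', html, re.IGNORECASE):
        return html
    # Match <head> or <head ...attributes>, but not <header>.
    return re.sub(
        r'(<head(?:\s[^>]*)?>)',
        lambda m: m.group(1) + "\n" + ROBOTS_META,
        html,
        count=1,
        flags=re.IGNORECASE,
    )


def patch_all_themes(themes_dir: str = "themes") -> list:
    """Patch every <themes_dir>/<name>/template.html and return the changed paths."""
    changed = []
    for template in sorted(Path(themes_dir).glob("*/template.html")):
        # latin-1 round-trips every byte, so files in other encodings are not mangled.
        original = template.read_text(encoding="latin-1")
        patched = add_robots_meta(original)
        if patched != original:
            template.write_text(patched, encoding="latin-1")
            changed.append(template)
    return changed
```

Calling `patch_all_themes("themes")` from the gallery folder would rewrite the files in place, so it is worth backing up the themes first. Editing each template.html by hand, as described above, gives the same result.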
Title: Re: Securing a gallery from search engines
Post by: M3 on September 20, 2004, 03:48:05 am
Excellent! I didn't expect it to be a one line, easy type of thing.

I may scatter some pictures on a separate html page in the site to allow search engines to demonstrate (on a strictly picture-only search, à la Google) that I've got really good stuff.

Throw a few bread crumbs out there but lock up the bakery.

Thank you very much, I can't wait to try this out! :)
Title: Re: Securing a gallery from search engines
Post by: Joachim Müller on September 20, 2004, 06:02:40 am
I can't see how a spider robot could bring down a gallery with approx. 5000 pics. I tried this myself several times by letting loose on my page an offline website copier called "httrack", which does a similar thing to a search engine crawler. IMO your webhost lied to you, or the webserver your gallery was running on was overloaded with too many virtual hosts on one machine.

There are other methods that keep crawlers (search engine spiders) from indexing your page. The best method is IMO to create a file called "robots.txt" in the webroot (only there!), with detailed instructions telling the crawlers what to index and what not.

Here are some examples of what you could put into the robots.txt file:
User-agent: *
Disallow: /your_coppermine_folder/
This will keep crawlers from indexing anything that resides within your Coppermine install.

Disallow: /
Placed under a User-agent line that names one specific crawler (in this case the offline caching mechanism of MS IE), this will prevent that crawler from spidering the page.

To stop Google's spider from indexing pics, you can use this one:
User-agent: Googlebot
Disallow: /*.jpg$

If you don't want your pics to turn up in Google's pic search index, use
User-Agent: Googlebot-Image
Disallow: /
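To see what rules like these actually permit before uploading them, they can be checked locally. The sketch below uses Python's standard `urllib.robotparser`; the `example.com` URLs, the "SomeBot" name, and the `is_allowed` helper are placeholders for illustration. As far as I know this parser only does plain prefix matching, so the `/*.jpg$` wildcard rule above (an extension that Googlebot understands) is left out of the check.

```python
from urllib.robotparser import RobotFileParser

# Two of the rule sets from the post, combined into one robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /your_coppermine_folder/

User-agent: Googlebot-Image
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An ordinary crawler is kept out of the gallery folder only.
print(is_allowed(ROBOTS_TXT, "SomeBot", "http://example.com/your_coppermine_folder/index.php"))
print(is_allowed(ROBOTS_TXT, "SomeBot", "http://example.com/index.html"))
# Google's image crawler is kept out of the whole site.
print(is_allowed(ROBOTS_TXT, "Googlebot-Image", "http://example.com/index.html"))
```

Keep in mind that robots.txt is only a request: well-behaved crawlers honour it, but nothing forces a badly behaved bot to do so.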


Title: Re: Securing a gallery from search engines
Post by: Casper on September 20, 2004, 02:45:45 pm
GauGau's solution is the more thorough one. When my site had a problem, it was not crashing, just using huge amounts of bandwidth. It was caused by msnbot, which for some reason was crawling my site many times a day.
Title: Re: Securing a gallery from search engines
Post by: M3 on September 22, 2004, 08:13:31 pm
Deleted. Never mind! Uploaded everything, found some more specific bots to block, and I wish I had known about this a few years ago! Now I know why my bandwidth usage was so terribly high on the first incarnations of my site.

I'm off to tell my friends about this. A teaser page, with my galleries locked up tighter, is sounding more and more like what I should be doing! Thanks a million!!