@Tarique
I'm working on a project where some tables went from 1 million to 15 million rows in one year.
The same queries that used to take 2 min to run suddenly (really suddenly, I mean the next day) started taking 45 min.
Some of those queries had nested NOT IN subqueries in them.
From what I learned: depending on the size of the table, the SQL engine chooses whether or not to use the table's index when running subqueries.
Once you pass a certain point (I'm not a DBA so I won't go into more detail), the engine decides not to use the index anymore and does a full table scan instead --> that's when you start to get in trouble.
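You can actually watch the planner make that choice. A minimal sketch using Python's built-in sqlite3 module (SQLite, not MySQL, so the thresholds and plan output differ, but the index-vs-scan distinction is the same; in MySQL you'd use EXPLAIN):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a (id INTEGER, val TEXT)")
con.execute("CREATE INDEX idx_a_id ON a (id)")
con.executemany("INSERT INTO a VALUES (?, ?)", [(i, "x") for i in range(1000)])

# Equality lookup on an indexed column: the planner picks the index.
plan = con.execute("EXPLAIN QUERY PLAN SELECT * FROM a WHERE id = 42").fetchall()
print(plan[0][3])   # e.g. "SEARCH a USING INDEX idx_a_id (id=?)"

# Lookup on an unindexed column: the planner falls back to a full scan.
plan = con.execute("EXPLAIN QUERY PLAN SELECT * FROM a WHERE val = 'x'").fetchall()
print(plan[0][3])   # e.g. "SCAN a"
```

The table and index names here are made up for the demo; the point is just that the same query shape can flip from an index search to a scan, which is where the sudden slowdown comes from.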
To solve that and get back (much) faster-running queries, depending on the need, we did one of the two following things:
- replaced queries of the kind
select * from a where a.id not in ( select id from b where etc...)
with
select b.id into #tmptable_b from b where ... ( the MySQL syntax being "create temporary table tmptable_b as select ...", I think )
select * from a where a.id not in ( select id from #tmptable_b )
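That rewrite, sketched end-to-end with Python's sqlite3 module (the tables a and b and the flag column are hypothetical; in MySQL the temp-table step would be CREATE TEMPORARY TABLE ... AS SELECT):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE a (id INTEGER)")
con.execute("CREATE TABLE b (id INTEGER, flag INTEGER)")
con.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(10)])
con.executemany("INSERT INTO b VALUES (?, ?)", [(i, i % 2) for i in range(10)])

# Step 1: materialize the subquery result once, into a small temp table.
con.execute("CREATE TEMPORARY TABLE tmptable_b AS SELECT id FROM b WHERE flag = 1")

# Step 2: the outer query now runs NOT IN against the temp table,
# instead of re-evaluating the nested subquery against the big table.
rows = con.execute(
    "SELECT id FROM a WHERE id NOT IN (SELECT id FROM tmptable_b) ORDER BY id"
).fetchall()
print([r[0] for r in rows])   # the even ids: [0, 2, 4, 6, 8]
```

The win is that the expensive filtering of b happens exactly once, up front, rather than inside the outer query's evaluation.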
OR ( and I think that example fits better with cpg )
- replaced the same query
select * from a where a.id not in ( ... ) and a.another_column = 'another_value'
with
select * from a where a.another_column = 'another_value'
then in the script/program,
after fetching the results
( pseudo code here )
forbidden_set = list( 4, 7, 9, 12 ) // list also fetched via SQL
if ( forbidden_set.contains( mysql_result[$i]['aid'] ) )
then
skip // NOT IN replacement
else
process it
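The pseudocode above as a runnable Python sketch (the forbidden ids and row shape are made up for the demo; in practice forbidden_set would come from a second, cheap SELECT). Using a set rather than a list makes each membership check O(1):

```python
# Ids to exclude -- in practice fetched with a separate, simple SELECT.
forbidden_set = {4, 7, 9, 12}

# Rows as they come back from the driver (hypothetical shape).
mysql_result = [{"aid": 3}, {"aid": 4}, {"aid": 10}, {"aid": 12}]

processed = []
for row in mysql_result:
    if row["aid"] in forbidden_set:
        continue           # NOT IN replacement: skip forbidden ids
    processed.append(row)  # process it

print([r["aid"] for r in processed])   # [3, 10]
```

This trades one big NOT IN query for two simple queries plus a cheap in-memory filter, which is exactly the point of the rewrite.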
Those are just some thoughts,
but even if the numbers are different, I think the concept stays the same.
-matt-