Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 24-08-2012, 15:08
NEW STATISTICS ON A 10-YEAR and 30-YEAR PERIOD ARE AVAILABLE.
SEE Two-leg statistics based on the first leg result 2012.
ALTHOUGH THE NUMBERS IN THIS TOPIC ARE CORRECT THERE IS A DISCUSSION ON THE TYPE OF MATCHES AND THE PERIOD USED.

Updated two-leg statistics based on the first leg result since 1999 (the modern era). Now including statistics on penalty shootouts (it is assumed that the outcome of a penalty shootout does not depend on the first leg result and thus needs to be treated separately). Only matches in qualifying rounds and knock-out rounds of the Champions League and Europa League/UEFA Cup are counted (matches that do have a 2nd leg).

See also 2008 statistics and original JPV statistics.

The table shows:
a. 1st leg result (home team vs. away team)
b. chance that the home team advances based on 1st leg result
c. chance that tie is decided with a penalty shootout
d. chance that the away team advances based on 1st leg result
e. number of times that the home team advances based on 1st leg result
f. number of times that tie is decided with a penalty shootout
g. number of times that the away team advances based on 1st leg result
h. total number of matches with the same 1st leg result
6-0   100% /  0% /   0%     19 -  0 -   0  ( 19)
5-0   100% /  0% /   0%     35 -  0 -   0  ( 35)
5-1   100% /  0% /   0%     26 -  0 -   0  ( 26)
4-0   100% /  0% /   0%     70 -  0 -   0  ( 70)
3-0    97% /  1% /   2%    146 -  1 -   3  (150)
4-1    91% /  4% /   4%     42 -  2 -   2  ( 46)
2-0    89% /  3% /   8%    175 -  6 -  15  (196)
3-1    85% /  0% /  15%     94 -  0 -  17  (111)
4-2    73% /  0% /  27%     11 -  0 -   4  ( 15)
1-0    63% /  5% /  32%    203 - 17 - 102  (322)
2-1    52% /  6% /  43%    102 - 11 -  85  (198)
3-2    51% /  2% /  47%     24 -  1 -  22  ( 47)
0-0    35% /  3% /  61%     92 -  9 - 160  (261)
1-1    28% /  5% /  67%     80 - 15 - 190  (285)
2-2    20% /  1% /  79%     22 -  1 -  85  (108)
0-1    13% /  4% /  83%     28 - 10 - 186  (224)
3-3    13% /  0% /  88%      2 -  0 -  14  ( 16)
1-2     6% /  1% /  93%      8 -  1 - 127  (136)
2-3     5% /  0% /  95%      2 -  0 -  38  ( 40)
0-2     4% /  1% /  96%      5 -  1 - 136  (142)
1-3     1% /  0% /  99%      1 -  0 -  67  ( 68)
0-3     0% /  1% /  99%      0 -  1 -  78  ( 79)
1-4     0% /  0% / 100%      0 -  0 -  24  ( 24)
0-4     0% /  0% / 100%      0 -  0 -  28  ( 28)
1-5     0% /  0% / 100%      0 -  0 -  12  ( 12)
0-5     0% /  0% / 100%      0 -  0 -  20  ( 20)
                          ----   --  ----  ----
                          1187   76  1415 (2678)

                           44%   3%   53%

Only first-leg results that occurred in at least 10 ties are included. That excludes 77 matches from a grand total of 2755 matches in the period 1999-2012. Please note that match results with a low number of total matches usually have a large error margin.
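
To give a feel for the error margin mentioned above, here is a minimal sketch that puts an approximate 95% interval around the home-advance percentage of two rows from the table. The Wilson score interval is an assumption chosen for illustration; it is not necessarily how the margins were judged in this topic.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# (ties where the home team advanced, ties with that 1st leg result),
# copied from the table above
rows = {"4-2": (11, 15), "1-0": (203, 322)}

intervals = {}
for score, (home, total) in rows.items():
    low, high = wilson_interval(home, total)
    intervals[score] = (low, high)
    print(f"{score}: {home / total:.0%} home advance, "
          f"roughly {low:.0%} to {high:.0%} (n={total})")
```

The row with only 15 ties gets a much wider interval than the row with 322 ties, which is the point of the warning.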

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 24-08-2012, 15:24
And to add something to the discussion between Ricardo and mavano at Forum 2 on the chances of Dutch teams to advance to the group stage:

Feyenoord (2-2 at home): 20%
AZ (lost 1-0 away): 35%
Twente (lost 3-1 away): 15%
Heerenveen (lost 2-0 away): 10%
PSV (won 5-0 away): 100%

So, the chance of at least 2 teams out of 5 is only about 60%.
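
The 60% figure can be reproduced with a short sketch, assuming the five ties are independent of each other (an assumption, not something stated in the post). The function name `count_distribution` is made up for this example.

```python
def count_distribution(probs):
    """Distribution of the number of successes among independent events."""
    dist = [1.0]  # dist[k] = probability that exactly k teams advance
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)   # this team is eliminated
            new[k + 1] += mass * p     # this team advances
        dist = new
    return dist

# Percentages quoted in the post above
chances = {"Feyenoord": 0.20, "AZ": 0.35, "Twente": 0.15,
           "Heerenveen": 0.10, "PSV": 1.00}

dist = count_distribution(chances.values())
at_least_two = sum(dist[2:])
print(f"P(at least 2 of 5 advance) = {at_least_two:.1%}")
```

Since PSV is taken as certain, this is the same as one minus the chance that all four other teams go out: 1 - 0.80 x 0.65 x 0.85 x 0.90.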

Re: Two-leg statistics based on first result 1999-2012
Author: Metallica
Date: 24-08-2012, 16:04
thanks Bert
Dnipro(2-2 away) 79%
Metalist(2-0 away) 96%

not bad =)

Re: Two-leg statistics based on first result 1999-2012
Author: reisanibal
Date: 24-08-2012, 16:20
Thanks, wonderful statistics. And it is also interesting to note that if you are playing the first match away, you are more likely to advance: 53% vs 44%.

Re: Two-leg statistics based on first result 1999-2012
Author: SirHenri
Date: 24-08-2012, 16:28
Edited by: SirHenri
at: 24-08-2012, 16:39
Very nice update of the old data! (now altogether 1955 to 2012)

All we need to know are the special cases 1-3 (hoping Gladbach adds a 2nd miracle) and 0-3 and maybe also 3-0 and 4-1. Which teams were included? I can remember Werder Bremen turning a 3-0 vs. Lyon around, but that's it :-s

Re: Two-leg statistics based on first result 1999-2012
Author: nemesys
Date: 24-08-2012, 16:37
Great work, as always, Bert!

Cheers!

- nemesys

Re: Two-leg statistics based on first result 1999-2012
Author: Lorric
Date: 24-08-2012, 16:41
Edited by: Lorric
at: 24-08-2012, 16:45
Very nice Bert. It's funny I was thinking about this kind of thing yesterday. It's nice to see you getting creative, first that graph, and now this.

And yes, clear proof that the 2nd leg is an advantage. Although I'd like to see how the stats on that shape up if you remove CL last 16 and EL last 32 ties, due to the bias that exists there.

EDIT: The away draw is an extremely good result in a 2 leg tie, isn't it.

A 3-2 at home is only a very marginal advantage. Watch out Tromso...

Re: Two-leg statistics based on first result 1999-2012
Author: Forza-AZ
Date: 24-08-2012, 16:56
"Thanks, wonderful statistics. And it is also interesting to note that if you are playing the first match away, you are more likely to advance: 53% vs 44%."

That might have something to do with the first knock-out round after the group stage, since then the better teams in the groups always play away first. However, the share of those matches in the total number of matches shouldn't be high enough to account for a 9% difference. So this indeed shows that playing away first is an advantage.

Re: Two-leg statistics based on first result 1999-2012
Author: Lorric
Date: 24-08-2012, 17:07
Edited by: Lorric
at: 24-08-2012, 17:08
So, English analysis:

Liverpool: 85%
Newcastle: 69.5%

And the battle for 12th, Cyprus:

AEL Limassol: 55%
APOEL: 69.5%

Turkey:

Fenerbahce: 45%
Trabzonspor: 36.5%
Bursaspor: 85%


Are you going to update these stats after Q4 is over?
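
The half-point values such as 69.5% and 36.5% are consistent with adding half of the shootout percentage to a team's direct chance, i.e. treating a shootout as a coin flip. That reading is an assumption; the post does not say how the numbers were derived. A minimal sketch with a made-up helper name:

```python
def advance_chance(direct_pct, shootout_pct):
    """Chance to advance if a shootout is treated as a 50/50 coin flip (assumption)."""
    return direct_pct + shootout_pct / 2

# (direct chance, shootout chance) read from the table in the first post
examples = {
    "away team after a 1-1 first leg": (67, 5),
    "home team after a 0-0 first leg": (35, 3),
    "away team after a 1-0 first leg": (32, 5),
}

for label, (direct, shootout) in examples.items():
    print(f"{label}: {advance_chance(direct, shootout)}%")
```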

Re: Two-leg statistics based on first result 1999-2012
Author: Forza-AZ
Date: 24-08-2012, 19:23
Any special reason why you chose 1999 as a starting point, and not all European results ever (or at least since the away goals rule has been in place; I'm not sure if they always counted away goals)?
Then you should get an even more accurate %, also for scores that happened fewer than 10 times since 1999.

Re: Two-leg statistics based on first result 1999-2012
Author: Overgame
Date: 24-08-2012, 19:24
SirHenri: I remember Deportivo - Milan (4-1 0-4). The 0-3 is coming from a team from Bucharest, after losing 0-3 on paper.

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 24-08-2012, 20:36
Some remarks or questions, some answers:

@Forza

1) If only qualifying matches are counted (total 1561 matches) the chances for 1st leg home/away hardly change: 1st leg at home 45%, 1st leg away 53%.

2) Choosing 1999 as a starting point (that means first season used is 1998/99) is done because that was the start of the modern era with two cups and without the Cup Winners' Cup. Which means in general more qualifying matches, evolving into the current format with 4 qualifying rounds. But the choice is arbitrary. In general you are right that more matches increase the accuracy, but since the format (and maybe the statistics themselves) change over time I prefer not to go back too far.

@Lorric
I have no plans to update these stats mid season. There are too many arbitrary choices to make that relevant.


To get some idea of the differences I took some samples with only qualifying match results:

0-0 gives 32% / 65% for qualifying only vs. 35% / 61% for all matches with 2nd leg,
1-0 gives 66% / 30% for qualifying only vs. 63% / 32% for all matches with 2nd leg,
1-1 gives 25% / 69% for qualifying only vs. 28% / 67% for all matches with 2nd leg,
3-1 gives 84% / 16% for qualifying only vs. 85% / 15% for all matches with 2nd leg,
0-1 gives 9% / 86% for qualifying only vs. 13% / 83% for all matches with 2nd leg.

Hard to draw conclusions. It seems more like statistical noise than a systematic difference.
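
A rough way to look at the noise question is to compare each difference with the binomial standard error of the all-matches percentage. This is only a yardstick under stated assumptions: the qualifying-only counts per score are not given in this topic, and the qualifying ties are a subset of all ties, so this is not a formal test.

```python
import math

# (home team advanced, ties) for all matches, from the table in the first post,
# plus the qualifying-only home percentage quoted above
samples = {
    "0-0": (92, 261, 32),
    "1-0": (203, 322, 66),
    "1-1": (80, 285, 25),
    "3-1": (94, 111, 84),
    "0-1": (28, 224, 9),
}

ratios = {}
for score, (home, total, qual_pct) in samples.items():
    p = home / total
    std_err = math.sqrt(p * (1 - p) / total)   # binomial standard error
    diff = qual_pct / 100 - p
    ratios[score] = abs(diff) / std_err
    print(f"{score}: all {p:.1%}, qualifying {qual_pct}%, "
          f"difference {diff:+.1%}, about {ratios[score]:.1f} standard errors")
```

None of the five differences reaches two standard errors on this yardstick, which fits the noise reading.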

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 24-08-2012, 21:09
@SirHenri, at your service:

1) 1999 UC R1 VfB Stuttgart - Feyenoord, 1-3 and 0-3
2) 2010 EL Q4 Dinamo Bucuresti - Slovan Liberec, 0-3 and 0-3, penalty shootout 9-8
3)
a) 2000 UC R3 Olympique Lyon - Werder Bremen, 3-0 and 4-0
b) 2002 UC Q1 Trans Narva - IF Elfsborg, 3-0 and 5-0
c) 2004 CL Q1 Dinamo Tbilisi - SK Tirana, 3-0 and 3-0, penalty shootout 2-4
d) 2005 UC Q2 AEK Larnaca - Maccabi Petah-Tikva, 3-0 and 4-0
4)
a) 2000 CL Q2 Litex Lovech - Widzew Lódz, 4-1 and 4-1, penalty shootout 2-3
b) 2001 UC R1 Real Zaragoza - Wisla Kraków, 4-1 and 4-1, penalty shootout 3-4
c) 2002 UC R3 Club Brugge - Olympique Lyon, 4-1 and 3-0
d) 2004 CL QF AC Milan - Deportivo La Coruña, 4-1 and 4-0

Re: Two-leg statistics based on first result 1999-2012
Author: Lorric
Date: 24-08-2012, 21:45
Edited by: Lorric
at: 24-08-2012, 21:45
Now this is interesting:

a) 2000 CL Q2 Litex Lovech - Widzew Lódz, 4-1 and 4-1, penalty shootout 2-3
b) 2001 UC R1 Real Zaragoza - Wisla Kraków, 4-1 and 4-1, penalty shootout 3-4

Wisla Krakow emulating fellow Polish side Widzew Lodz.

Re: Two-leg statistics based on first result 1999-2012
Author: AlanK
Date: 24-08-2012, 23:05
@bert:

What is probably meant by "1999" is the 1999-2000 season: in 1998-1999, the Cup Winners' Cup ("Recopa") was played, Lazio beating Mallorca 2-1 in the final, with the winning goal by Pavel Nedved. Just thought you should know.

Re: Two-leg statistics based on first result 1999-2012
Author: gogu
Date: 25-08-2012, 01:06
@bert

"To get some idea of the differences I took some samples with only qualifying match results:

0-0 gives 32% / 65% for qualifying only vs. 35% / 61% for all matches with 2nd leg,
1-0 gives 66% / 30% for qualifying only vs. 63% / 32% for all matches with 2nd leg,
1-1 gives 25% / 69% for qualifying only vs. 28% / 67% for all matches with 2nd leg,
3-1 gives 84% / 16% for qualifying only vs. 85% / 15% for all matches with 2nd leg,
0-1 gives 9% / 86% for qualifying only vs. 13% / 83% for all matches with 2nd leg.

Hard to draw conclusions. It seems more like statistical noise than a systematic difference."

You can only see the difference in the general case and not for a particular result. If the better team plays away first, the chance that the away team gets a good result is higher. After that I don't think there is a statistical difference for the 2nd leg. I hope I was clear.

Re: Two-leg statistics based on first result 1999-2012
Author: SirHenri
Date: 25-08-2012, 07:47
@ bert

thanks A LOT again for those 10 special cases!!!

I couldn't remember that Stuttgart matchup vs. Feyenoord (maybe because it was 1st round), but now it's in my head again!

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 25-08-2012, 07:55
@AlanK (and Forza and Lorric)

Darn, you are right. I messed it up. Knowing that 1999 was the last year with CWC, I looked into the database section and saw 1999 as a split for the coefficient calculation (qualifying matches only counted for 50% and completely new team coefficients). And then I went on, and included the 1998/99 season.

So, that makes the choice for 1999 even more arbitrary, haha. Maybe I should do a recalc, also taking into consideration the remarks of Forza and Lorric, e.g. by taking a 10-year and a 30-year moving window and publishing them each year. By comparing those we might be able to see a trend. Though it will be difficult (like proving climate change), it could be some nice stuff for discussions.

And there is also the fact which Forza mentioned that the round of 16 after the group stage in CL has the group winners playing away in the 1st leg (since the new format was introduced in 2003/04). Though it is just a small amount of matches it is not good to include a systematic bias. So, it might be better to exclude these matches.

The CL round of 16 tie results since 2003/04 are: (1st leg home wins / shootout / 1st leg away wins)
2003/04:   2  /  0  /   6
2004/05:   3  /  0  /   5
2005/06:   1  /  0  /   7
2006/07:   2  /  0  /   6
2007/08:   3  /  2  /   3
2008/09:   2  /  1  /   5
2009/10:   4  /  0  /   4
2010/11:   1  /  0  /   7
2011/12:   2  /  1  /   5
-------   --    --     --
total     20  /  4  /  48
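
To see how much these ties could move the overall 44% / 3% / 53% split, the round of 16 totals can be subtracted from the totals of the main table. This is a sketch under one loud assumption: that all 72 round of 16 ties are among the 2678 ties counted in the table (some may belong to the 77 excluded ties with rare first-leg results).

```python
# Totals from the main table: home advances, shootout, away advances
all_ties = (1187, 76, 1415)
# CL round of 16 totals since 2003/04, from the list above
round_of_16 = (20, 4, 48)

# Assumption: every round of 16 tie is included in the main table totals
remaining = tuple(a - r for a, r in zip(all_ties, round_of_16))
total = sum(remaining)
shares = [count / total for count in remaining]

print("ties left:", total)
print("home / shootout / away:", " / ".join(f"{s:.1%}" for s in shares))
```

Under that assumption the gap between away-first and home-first narrows a little but does not disappear.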

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 25-08-2012, 07:57
@gogu

Sorry but I don't understand what you are saying.

Re: Two-leg statistics based on first result 1999-2012
Author: Lorric
Date: 25-08-2012, 14:20
Don't forget the Europa League/UEFA Cup first KO round Bert. That's 1st vs 2nd in Europa and 1st vs 3rd in UEFA. And 2nd vs CL dropouts.

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 25-08-2012, 21:54
Sure Lorric, the same applies to the UC/EL round of 32 since 2004/05. Tomorrow I hope to find some time to do the updated statistics.

Re: Two-leg statistics based on first result 1999-2012
Author: bollie76
Date: 25-08-2012, 22:17
@Bert: In all qualifying matches, how often did the seeded team play the first match away from home?
And how often does an unseeded team beat a seeded team?

I ask this, because I think being seeded/unseeded is much more a factor than playing away/home in the first game.
So we have to take a look at the seeded/unseeded statistics.
Remember EL-Q2 draw last year. All seeded teams played the second leg at home...

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 25-08-2012, 22:28
@bollie: I don't think that will be an issue here. That's just luck of the draw. Since UEFA used a draw by numbered balls for groups of 5 seeded vs. 5 unseeded teams on average it will happen once in 32 years that all seeded teams play the first leg at home. That's no systematic bias.
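
The "once in 32" figure follows from five independent coin flips, one per tie in a group of 5 seeded vs. 5 unseeded teams. A minimal sketch, assuming each tie's home/away order is a fair 50/50 draw:

```python
from fractions import Fraction

pairs = 5                      # seeded vs. unseeded ties in one draw group
p_home_first = Fraction(1, 2)  # assumption: each tie's order is a fair coin flip

# Chance that all five seeded teams get the same leg order
p_all_same_order = p_home_first ** pairs
print(p_all_same_order)
```

The same value applies whether one asks for all seeded teams at home in the first leg or all at home in the second leg.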

Of course statistics on how often unseeded teams beat seeded teams could also be interesting, but that's a different subject.

Re: Two-leg statistics based on first result 1999-2012
Author: bollie76
Date: 25-08-2012, 22:36
What I wanted to say: statistically it is certainly not unlikely that (over a sample of 1500 matches) the seeded teams played the first game away (or at home) in 53% of the matches.
And this would put the home/away advantage in another spotlight (either one direction or the other).
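
How likely such a 53% share would be can be checked with an exact binomial tail. This is a sketch that assumes each of the 1500 ties is an independent fair coin flip for who starts away, which may not match how the draws actually work; the function name is made up.

```python
from math import comb

def tail_probability(n, k_min):
    """P(X >= k_min) for X ~ Binomial(n, 1/2), computed exactly."""
    favourable = sum(comb(n, k) for k in range(k_min, n + 1))
    return favourable / 2 ** n

ties = 1500          # sample size mentioned in the post
away_first = 795     # 53% of 1500

p_one_sided = tail_probability(ties, away_first)
print(f"P(seeded team away first in at least 53% of ties) = {p_one_sided:.3f}")
```

Under these assumptions the one-sided chance comes out on the order of 1%, and about twice that if a 53% share in either direction is counted.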

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 25-08-2012, 22:43
True, but I tend to see that as one of the factors that's within the statistical noise.

And finally there is also a practical problem: my database of match results doesn't have seeding info available.

Re: Two-leg statistics based on first result 1999-2012
Author: mavano
Date: 26-08-2012, 11:59
Edited by: mavano
at: 26-08-2012, 12:28
I'm not disputing the numbers, but I will dispute their relevance. They give an indication, but have a lot of loose ends.

First off, those stats literally throw away half their own relevance. They discard the results of the 2nd match. Say Heerenveen wins 5-0 next week. What was the anomaly then? The 5-0? The 2-0? Heerenveen lost 1-0 away in the last round, but went through. Why can't I see that stat in the list? Why doesn't that count when assessing AZ's chances in this round? And why does some result 10 years ago between teams from Portugal and Georgia have more relevance? I know you open up a can of worms by including results from the 2nd match in the same list, but you can't just ignore them because they don't fit the model and then just act like they're not there.

The list doesn't differentiate between seeded and unseeded. Let's say 3-1 happened 100x. 85x the home team went through the next week. That would make 85% in the list. Now we look at seeded/unseeded. We assume the draws are random so 50% of the time the seeded team starts at home. 49x of 50 the seeded team indeed went through. That leaves 36 of 50 for unseeded teams that started at home. So the argument that (according to this example) Twente has a 28% chance of progressing is just as valid as saying it's 15%.

Even better. 25x the gap in coefficient between the teams was bigger than 15 points and 25x it wasn't. Of the times it wasn't, the seeded team went through 1x out of 25. The times it was, the seeded team went through 13x out of 25.

So, if this list was complete Twente could perhaps be a 52% favorite to make it to the next round. Of course they're not, but I hope you get my point.
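
The arithmetic of this hypothetical example can be written out as conditional rates. All counts below are the made-up numbers from the example, not real data.

```python
# Hypothetical counts from the example above (not real data)
ties_3_1 = 100                 # ties with a 3-1 first leg
home_advanced = 85             # first-leg home team went through
seeded_home, seeded_home_advanced = 50, 49

# Ties where the unseeded team played the first leg at home
unseeded_home = ties_3_1 - seeded_home
unseeded_home_advanced = home_advanced - seeded_home_advanced

# Chance for a seeded away team after losing the first leg 3-1
seeded_away_rate = 1 - unseeded_home_advanced / unseeded_home

# Further split of those ties by coefficient gap: (seeded team advanced, ties)
big_gap = (13, 25)
small_gap = (1, 25)

print(f"any away team:                  {1 - home_advanced / ties_3_1:.0%}")
print(f"seeded away team:               {seeded_away_rate:.0%}")
print(f"seeded away team, gap above 15: {big_gap[0] / big_gap[1]:.0%}")
print(f"seeded away team, gap up to 15: {small_gap[0] / small_gap[1]:.0%}")
```

The same 100 ties give 15%, 28% or 52% depending on how much is known about the teams, which is the point of the example.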



I applaud the work that many of you do, and the list gives an indication, but there is still a lack of sufficient data (imho) and, more importantly, there are some serious holes in its setup.

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 26-08-2012, 13:37
mavano,

The percentages in this topic are based on average results. If you think there is reason to believe that the Dutch teams involved in the play-offs are special, then these percentages do not apply.

However, the conclusion that besides PSV just one other Dutch club will qualify for the EL group stage is by far the most probable outcome. Even tuning the percentages a bit will not change much. The chance that no clubs besides PSV qualify is considerably bigger than the chance that 2 other clubs and PSV qualify.

Of course, together with you, I hope we beat the stats.

Re: Two-leg statistics based on first result 1999-2012
Author: mavano
Date: 26-08-2012, 16:57
It wasn't meant as reasoning for why Dutch teams will or won't win. And I don't understand why you imply that it was... I used the Dutch results as examples for the flaws of the list. I'm not saying the flaws in the list will mean that the Dutch teams will win and are special. You kind of twisted my post 180 degrees...

Let me ask again outside of Dutch results:

- How many teams won 2-0 at home and lost 3-1 in the 2nd match? Why does that go in the list under 2-0 and not under 3-1? Wouldn't the list be much more precise when both results are put in a Matrix?

This list discards the scoreline of 50% of all matches; you can't argue that. But despite missing half of all relevant data I'm supposed to look at it as a good predictor for all future results?

- We need the numbers for seeded/unseeded. It has a big influence on the draw. But in the list it doesn't exist. It could easily make a 10% difference or more.

Re: Two-leg statistics based on first result 1999-2012
Author: reisanibal
Date: 26-08-2012, 19:07
Edited by: reisanibal
at: 26-08-2012, 19:13
@ gogu
I think you misunderstood what Bert wrote. He compares the qualifying rounds vs. all the matches. 2nd legs are included in all of them.

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 26-08-2012, 19:19
Sorry mavano, I had no intention to twist your post in any way. My answer was based on your reaction on Forum 2 to a post by Ricardo who was referring to the percentages that I used in the 1st reply of this topic.

I ignored your remarks on not using 2nd leg results because I don't understand them. This topic is on two leg statistics based on the 1st leg result. By definition there is no knowledge of the 2nd leg result. The whole meaning is to use only the result of the 1st leg to say something on the result of the complete tie.

About the use of seeded/unseeded data see my remark to bollie above.

Re: Two-leg statistics based on first result 1999-2012
Author: Forza-AZ
Date: 26-08-2012, 20:39
So mavano wants to have a list of all teams that won at home 2-0, for instance (no matter if it was the 1st or 2nd leg), and the percentage of those teams that qualified for the next round.

That can be very interesting stats, but it's a completely different one from the one we are discussing here.
It would be interesting however to know if the percentages are very different between the 2 stats. I guess they won't be.

Re: Two-leg statistics based on first result 1999-2012
Author: mavano
Date: 26-08-2012, 22:19
Edited by: mavano
at: 26-08-2012, 22:20
I'll switch to Dutch for this one post to explain what I meant to Bert (translated here into English):

What I meant to say is that the result of the return match is ignored in this list. Suppose a team wins its home match 2-0 and goes through to the next round; whether the 2nd match then ended 2-0 or 1-3, we don't know.

89% of the teams went through after a 2-0 home win in the first match. But maybe (purely theoretically) 70% of those teams lost the away match 3-1... Shouldn't those 3-1s then be weighed in when calculating Twente's chances?

Maybe all results could be put in an Excel sheet. From left to right the results of the first leg (1-0, 1-1, 2-1, 2-2, etc.) and from top to bottom the results of the return leg. Then you can tally per combination (2-0 home/1-1 away, 3-1 home/0-1 away, etc., etc.). And at the edges of the table you can see per home AND away result how often the team went through. From that table you can derive the list above, but also a great deal more.
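
The proposed table could be tallied along these lines. This is a minimal sketch: the five ties are hypothetical sample data for illustration only, and each 2nd leg score is written from the point of view of the 2nd leg home team.

```python
from collections import Counter, defaultdict

# Hypothetical ties, for illustration only:
# (1st leg score, 2nd leg score, did the 1st leg home team advance?)
ties = [
    ("2-0", "3-1", True),
    ("2-0", "1-1", True),
    ("2-0", "3-0", False),
    ("3-1", "0-1", True),
    ("3-1", "2-0", False),
]

matrix = Counter()                        # (1st leg, 2nd leg) -> number of ties
by_first = defaultdict(lambda: [0, 0])    # 1st leg -> [home team advanced, ties]
by_second = defaultdict(lambda: [0, 0])   # 2nd leg -> [home team advanced, ties]

for first, second, first_home_advanced in ties:
    matrix[(first, second)] += 1
    by_first[first][0] += first_home_advanced
    by_first[first][1] += 1
    # The 2nd leg home team advances exactly when the 1st leg home team does not
    by_second[second][0] += not first_home_advanced
    by_second[second][1] += 1

for first, (advanced, total) in by_first.items():
    print(f"1st leg {first}: home team advanced {advanced} of {total}")
for second, (advanced, total) in by_second.items():
    print(f"2nd leg {second}: home team advanced {advanced} of {total}")
```

The `by_first` margin corresponds to the list in the first post; `matrix` and `by_second` hold the extra information asked for here.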

Re: Two-leg statistics based on first result 1999-2012
Author: nemesys
Date: 26-08-2012, 22:46
I was just about to say the same as AZ.

And I also agree with Gogu; what he says is true: whether in a few matches the "first match away" club was stronger on paper (since first in group) or not, it doesn't matter.

Because this stat is, IMO, not meant to show whether playing away first is an advantage, but just: how many times out of 100 will a club that wins the first leg at home 2-1 go through? The answer is 52%, good enough for me.

This work does exactly what it is meant for, so from me it is only: thanks for your awesome job Bert!

Then if someone else wants to elaborate and refine those stats, adding more criteria to get conclusions on other matters, and then publish the results, I believe that would be welcome.

As always, just my 2 cents!
(I could be wrong in my positions).

Cheers!

- nemesys

Re: Two-leg statistics based on first result 1999-2012
Author: Yuval
Date: 27-08-2012, 02:21
Edited by: Yuval
at: 27-08-2012, 03:02
Great info, bert! You've done it again. Thank you.

Can you add statistics about penalty shootouts? I mean, regardless of the first leg result of course, just the chance of the home/away team to win it when it comes to penalty shootout. I think it will help complete the picture of first/second leg statistics.

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 27-08-2012, 10:21
mavano,

Thanks for your explanation. I think the method you propose is now clear to me.

However, I still do not think that adding 2nd leg matches would improve the prediction of whether a club goes to next round based on first leg result.

The reason is that I think the character of the 2nd leg match is too different from the first leg. This is comparable to the reason why group stage results are excluded.

So, I keep the table as is, though in a new version for a 10-year and 30-year period at Two-leg statistics based on the first leg result 2012.

In the next weeks my plan is to upload club coefficients to the database with match results, so that I will be able to do some stats on the performance of clubs based on team ranking.

Re: Two-leg statistics based on first result 1999-2012
Author: bert.kassies
Date: 27-08-2012, 11:08
@Yuval

See new topic for penalty shootout stats.

Re: Two-leg statistics based on first result 1999-2012
Author: Overgame
Date: 27-08-2012, 15:16
@mavano: I can read Dutch (but I'm probably not able to write something in Dutch), so I'll answer your last post.

The point of Bert's stats is simple: they are just stats. You don't take everything into account, like the strength of both teams.

Example: a "weaker" team wins 2-1 against a "giant", but with a ton of luck, like 2 half-opportunities and 2 goals, a bad game for the "giant", etc. You won't give them more than a 20% chance (or even less!) in the second leg, even if the stat says "more than 50%".

You can read the stat in this way:
"If I don't have any information about the 2 teams, the probabilities of qualification are: {read the table} after that result in the first leg". Nothing more, nothing less. Having more/less information will change the probabilities.

Example : let's take AZ-Anzhi .
-If I don't know the result of the first leg, the table says "AZ has 53% chance to qualify (without counting the PSO)".
-If I know the result of the first leg, the table says "AZ has 32% chance to qualify (without counting the PSO)".
-If I have more information, I will come to another number, between 0 and 100. Perhaps Anzhi (or AZ) has bought the game.

And back to your message, the result of the second leg isn't really important. That result really depends on the teams involved: some are used to playing the second game at 100%, some are used to getting the minimal result.