# Forum statistics, which aircraft is most popular



## Marcel (Jan 7, 2017)

There was a question on the forum about bots, which reminded me of a little experiment I did some years ago. At that time I was preparing for a project at work in which I would mine data from scientific publications and analyse it. As a first experiment I decided to do the same on this forum, so I wrote a bot that mined the posts from the Aviation forum and did some simple analysis on them. You can find that thread here: The forum's most discussed single engined fighter of ww2 is....

So after reading the post about bots, I decided to dust off that old script and see if it would still work on the new forum layout. Of course it didn't. So I cleaned up the code, migrated it from Python 2 to Python 3 (the programming language, not the snake) and improved how the script works.

The old script was very simple. It collected all the posts, filtered out the quotes and stored everything as one long text file. The new version also stores the author, the posting date and the post number within the thread. The mined data then goes into a small relational database so I can analyse it more easily. I did not include the word-cloud script this time, but I can still add that later.
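For illustration, here is a minimal sketch of what such a relational store could look like with Python's built-in `sqlite3` module. The table name, the column names and the helper functions are assumptions made for this example; the schema in the actual repository may well differ.

```python
import sqlite3


def create_store(path=":memory:"):
    """Open a database and create a hypothetical 'posts' table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS posts (
            thread_id   INTEGER NOT NULL,  -- which thread the post belongs to
            post_number INTEGER NOT NULL,  -- position of the post in the thread
            author      TEXT    NOT NULL,
            posted_on   TEXT    NOT NULL,  -- ISO date, e.g. '2017-01-07'
            body        TEXT    NOT NULL,  -- post text with quotes filtered out
            PRIMARY KEY (thread_id, post_number)
        )
        """
    )
    return conn


def add_post(conn, thread_id, post_number, author, posted_on, body):
    """Store one mined post."""
    conn.execute(
        "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
        (thread_id, post_number, author, posted_on, body),
    )


conn = create_store()
add_post(conn, 1, 1, "Marcel", "2017-01-07", "A post about the Spitfire")
add_post(conn, 1, 2, "Gnomey", "2017-01-07", "Very cool Marcel")
```

Keeping author, date and post number next to the text is what makes the per-user and per-year counts possible later on.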

For the programmers among us, the code can be found on github: GitHub - mke21/ww2aircraftcrawler: little script to find some simple statistics on the ww2aircraft forum.

*If you didn't follow all that technical stuff, the summary is this: I made a script to count the occurrences of aircraft names on the forum. So I can count how many posts contain the word 'spitfire' or the name of another aircraft. I can also do that per user or per year. I thought it would be a nice experiment to think up some analyses that I can do. These are the same kinds of techniques that Google, Microsoft and Facebook unleash on you during your daily browsing. Of course their setup is much more sophisticated.
For now I have a list of 90 single-engine fighters, which I will put in a later post. Maybe you guys can come up with ideas that we could try.*

I will post the current list tomorrow. Hope you will join me.



----------



## Wurger (Jan 7, 2017)




----------



## Gnomey (Jan 7, 2017)

Very cool Marcel, will be interesting to see the results.


----------



## Wayne Little (Jan 8, 2017)

Bring it on...


----------



## Marcel (Jan 8, 2017)

No ideas? I'll post some results next time to give you an idea of what I can do here.

So here are the 90 aircraft that I already have. In the program I try to take into account all the ways a user could type the name. So a '-' can also be a '.' or a ' ' or a '_' etc. BF109 can also be ME109, Spitfire can be Spit etc. Making these search strings is the most difficult and time-consuming part. I use regex for that. If you want, you can look at the real search strings in the git repository I gave in the opening post. You'll have to look at the file 'aircraft.py', which is in the ww2crawler directory.


```
Avia B-534
Avia BH-33
Armstrong Whitworth Scimitar
Boeing P-12
Bristol Bulldog
Fairey Firefly II
Fiat CR.32
Fiat CR.42
Gloster Gamecock II
Gloster Gauntlet
Gloster Gladiator
Grumman F3F
Hawker Demon
Hawker Fury
Hawker Nimrod
Heinkel He 51
Kawasaki Ki-10
Koolhoven F.K.52
Polikarpov I-15
Arsenal VG-33
Avia B-135
Bell P-39 Aircobra
Bell P-63 Kingcobra
Bloch MB.150-157
Boeing P-26 Peashooter
Breda Ba.27
Brewster Buffalo
CAC Boomerang
Caudron C.714
Curtiss P-36
Curtiss P-40
Curtiss-Wright CW-21
Dewoitine D.500/D.510
Dewoitine D.520
Fiat G.50
Fiat G.55
Focke-Wulf Fw 190
Focke-Wulf Ta 152
Fokker D.XXI
Grumman F4F/FM Wildcat
Grumman F6F Hellcat
Grumman F8F Bearcat
Hawker Hurricane
Hawker Tempest
Hawker Typhoon
Heinkel He 112
IAR 80
Ikarus IK-2
Kawanishi N1K/N1K-J
Kawasaki Ki-61
Kawasaki Ki-100
Koolhoven F.K.58
Lavochkin LaGG-1
Lavochkin LaGG-3
Lavochkin La-5
Lavochkin La-7
Loire 46
Macchi C.200
Macchi C.202
Macchi C.205
Messerschmitt Bf 109
Mikoyan-Gurevich MiG-1
Mikoyan-Gurevich MiG-3
Mitsubishi A5M
Mitsubishi A6M Zero
Mitsubishi J2M
Morane-Saulnier M.S.406
Nakajima Ki-27
Nakajima Ki-43
Nakajima Ki-44
Nakajima Ki-84
North American P-51 Mustang
Polikarpov I-16
PZL P.7
PZL P.11
PZL P.24
Reggiane Re.2000
Reggiane Re.2001
Reggiane Re.2005
Republic P-43
Republic P-47 Thunderbolt
Seversky P-35
Supermarine Seafire
Supermarine Spitfire
Vought F4U/FG Corsair
Vultee P-66 Vanguard
Yakovlev Yak-1
Yakovlev Yak-3
Yakovlev Yak-7
Yakovlev Yak-9
```
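To give an idea of the name matching described above, here is a small sketch using Python's `re` module. The patterns and the `aircraft_in` helper are made up for this example and are much simpler than the real search strings in 'aircraft.py'.

```python
import re

# Optional separator between letters and digits: '-', '.', '_', a space, or nothing.
SEP = r"[-._ ]?"

# Hypothetical patterns for illustration only; not copied from the repository.
PATTERNS = {
    "Messerschmitt Bf 109": re.compile(rf"\b(?:bf|me){SEP}109", re.IGNORECASE),
    "Supermarine Spitfire": re.compile(r"\bspit(?:fire)?s?\b", re.IGNORECASE),
    "Fiat CR.42": re.compile(rf"\bcr{SEP}42\b", re.IGNORECASE),
}


def aircraft_in(text):
    """Return the set of aircraft whose pattern occurs somewhere in the text."""
    return {name for name, pattern in PATTERNS.items() if pattern.search(text)}


found = aircraft_in("The ME109 was no match for a Spit, nor was the CR 42.")
```

The word boundaries (`\b`) keep 'Spit' from matching inside words like 'spite', which is the kind of detail that makes writing these search strings time consuming.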


----------



## Marcel (Jan 8, 2017)

So what can you do? Mind you, this is a simple script I wrote. I am not using any 'Natural Language' or other fancy algorithms; this is simply a word count. Also, the list of aircraft is not extensive, as you saw.
Yesterday evening, I set the script to work downloading all of the 6000+ threads in the Aviation forum. For me this forum represents the heart of this site. The downloading is by far the longest step, as crawling through this forum takes about 3 hours when using a single CPU. Multithreading would make this faster, but it would also make the script more complex, and I don't want to spend money renting an Amazon cluster.

First of all, the simplest of questions: who is the champion at mentioning SE fighters in their posts? This is the top 10:

```
Shortround6|2937
tomo pauk|2630
GregP|2307
drgondog|2089
stona|1587
davebender|1302
FLYBOYJ|1210
parsifal|1160
wuzak|980
kool kitty89|961
```

The other obvious one: which aircraft on the list is mentioned in the most posts? I am now counting posts, so an aircraft mentioned twice in one post still only counts once, but I don't think much has changed since the last time I did this:

```
Supermarine Spitfire|16436
North American P-51 Mustang|12972
Messerschmitt Bf 109|8848
Republic P-47 Thunderbolt|7446
Focke-Wulf Fw 190|7280
Curtiss P-40|5909
Hawker Hurricane|5742
Mitsubishi A6M Zero|5575
Vought F4U/FG Corsair|5347
Grumman F6F Hellcat|3313
Grumman F4F/FM Wildcat|3219
Bell P-39 Aircobra|2948
Focke-Wulf Ta 152|2130
Hawker Typhoon|2056
Hawker Tempest|1772
Brewster Buffalo|1596
Grumman F8F Bearcat|1124
Hawker Demon|1065
Curtiss P-36|926
Bell P-63 Kingcobra|824
Supermarine Seafire|743
Mikoyan-Gurevich MiG-1|646
Hawker Fury|627
Nakajima Ki-43|627
Fokker D.XXI|615
Gloster Gladiator|555
Heinkel He 51|515
Polikarpov I-16|510
Nakajima Ki-84|498
Lavochkin La-5|467
Kawasaki Ki-61|454
Yakovlev Yak-3|438
Fiat G.50|420
Fiat G.55|408
Lavochkin La-7|386
Dewoitine D.500/D.510|360
Heinkel He 112|331
Mikoyan-Gurevich MiG-3|322
Fiat CR.42|316
Republic P-43|308
Dewoitine D.520|307
Yakovlev Yak-1|307
Seversky P-35|279
Reggiane Re.2005|258
Kawasaki Ki-100|251
Nakajima Ki-44|245
Lavochkin LaGG-3|242
Nakajima Ki-27|236
Mitsubishi J2M|213
Boeing P-26 Peashooter|200
Morane-Saulnier M.S.406|199
Macchi C.202|197
Polikarpov I-15|193
CAC Boomerang|189
Kawanishi N1K/N1K-J|184
Macchi C.200|145
Mitsubishi A5M|141
Macchi C.205|132
Vultee P-66 Vanguard|118
Reggiane Re.2000|113
Arsenal VG-33|96
Curtiss-Wright CW-21|94
Grumman F3F|81
Reggiane Re.2001|80
IAR 80|72
Yakovlev Yak-7|69
Yakovlev Yak-9|69
Gloster Gauntlet|64
Fiat CR.32|52
Avia B-534|46
Hawker Nimrod|40
Bloch MB.150-157|30
Lavochkin LaGG-1|26
Avia B-135|22
Boeing P-12|22
Ikarus IK-2|18
PZL P.11|18
Koolhoven F.K.58|17
Caudron C.714|14
Armstrong Whitworth Scimitar|13
Kawasaki Ki-10|13
Loire 46|13
PZL P.24|6
PZL P.7|4
Breda Ba.27|3
Koolhoven F.K.52|3
Fairey Firefly II|1
Gloster Gamecock II|1
```
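The "once per post" counting used for the list above can be sketched as follows. The two patterns and the sample posts are invented for this example; only the counting rule comes from the description.

```python
import re
from collections import Counter

# Hypothetical, simplified patterns; not the search strings from the repository.
PATTERNS = {
    "Vought F4U/FG Corsair": re.compile(r"\b(?:f4u|corsair)", re.IGNORECASE),
    "Supermarine Spitfire": re.compile(r"\bspit(?:fire)?s?\b", re.IGNORECASE),
}


def count_posts(posts, patterns=PATTERNS):
    """Count, per aircraft, the number of posts that mention it at least once."""
    counts = Counter()
    for body in posts:
        for name, pattern in patterns.items():
            # search() stops at the first hit, so repeats within one post add nothing
            if pattern.search(body):
                counts[name] += 1
    return counts


posts = [
    "Corsair Corsair Corsair Corsair",
    "I prefer the Spitfire over the F4U.",
    "No aircraft in this one.",
]
totals = count_posts(posts)
```

Counting posts instead of words means that repeating a name many times in a single post does not inflate the score.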

I was right, the Spitfire is on top again. So who is so fond of the Spitfire? Again the top 10:

```
tomo pauk|836
Shortround6|796
stona|784
wuzak|460
GregP|438
parsifal|424
drgondog|410
nuuumannn|367
Glider|347
Soren|310
```
Soren is in the top 10, and the oldies know for sure that Soren was not a real fan of the Spitfire. This illustrates the limitation of this method: it doesn't tell you whether the post was positive or negative. For that I would have to put way more time into this.

So what is my most mentioned aircraft in the Aviation forum?

```
Marcel|Fokker D.XXI|75
Marcel|Messerschmitt Bf 109|43
Marcel|Supermarine Spitfire|33
Marcel|Brewster Buffalo|26
Marcel|Hawker Hurricane|18
Marcel|Heinkel He 112|17
Marcel|North American P-51 Mustang|10
Marcel|Focke-Wulf Fw 190|7
Marcel|Mitsubishi A6M Zero|6
Marcel|Vought F4U/FG Corsair|5
Marcel|Grumman F4F/FM Wildcat|4
Marcel|Koolhoven F.K.58|4
Marcel|Curtiss P-36|3
Marcel|Curtiss P-40|3
Marcel|Grumman F6F Hellcat|3
Marcel|IAR 80|3
Marcel|Bell P-39 Aircobra|2
Marcel|Curtiss-Wright CW-21|2
Marcel|Fiat G.50|2
Marcel|Hawker Tempest|2
Marcel|Breda Ba.27|1
Marcel|Caudron C.714|1
Marcel|Dewoitine D.520|1
Marcel|Gloster Gamecock II|1
Marcel|Gloster Gauntlet|1
Marcel|Gloster Gladiator|1
Marcel|Hawker Fury|1
Marcel|PZL P.11|1
Marcel|Polikarpov I-15|1
Marcel|Polikarpov I-16|1
```

Again no surprises there, with the Fokker on top. It is probably because I did the D.XXI in Dutch service thread some years ago.
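A per-user list like this maps naturally onto a grouped query in a relational database. The sketch below assumes a hypothetical `mentions` table with one row per (post, aircraft) pair; the table, its columns and the sample rows are invented for the example and are not taken from the repository.

```python
import sqlite3


def top_aircraft_for(conn, author, limit=10):
    """Return (aircraft, number of posts) for one author, most mentioned first."""
    rows = conn.execute(
        """
        SELECT aircraft, COUNT(*) AS n
        FROM mentions
        WHERE author = ?
        GROUP BY aircraft
        ORDER BY n DESC, aircraft
        LIMIT ?
        """,
        (author, limit),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mentions "
    "(author TEXT, aircraft TEXT, thread_id INTEGER, post_number INTEGER)"
)
# Made-up sample rows, one per post in which the aircraft was found.
conn.executemany(
    "INSERT INTO mentions VALUES (?, ?, ?, ?)",
    [
        ("Marcel", "Fokker D.XXI", 1, 1),
        ("Marcel", "Fokker D.XXI", 1, 5),
        ("Marcel", "Brewster Buffalo", 2, 3),
        ("herman1rg", "Hawker Hurricane", 2, 4),
    ],
)
ranking = top_aircraft_for(conn, "Marcel")
```

Sorting on the aircraft name as a second key gives ties a stable, alphabetical order.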



----------



## Wurger (Jan 8, 2017)




----------



## herman1rg (Jan 8, 2017)

I think I might know what my own most mentioned aircraft is...................


----------



## Marcel (Jan 8, 2017)

When I start the computer again, I will see what it is. 

Btw are there any requests for aircraft? You see my current list in an earlier post but which aircraft do you want me to add?



----------



## GrauGeist (Jan 8, 2017)

How about some obscure fighter types that saw service in WWII?

Avia B-135 (Czechoslovakia)
MAVAG Heja/MÁVAG Héja (Hungary)
Rogozarski/Rogožarski IK-3 (Yugoslavia)
VL Myrsky (Finland)


----------



## rochie (Jan 8, 2017)

Great stuff Marcel.

So you have proved the Spitfire was the greatest fighter of all time and the Bf 109 was only third !



----------



## parsifal (Jan 8, 2017)

Very impressive.....


----------



## GrauGeist (Jan 9, 2017)

rochie said:


> Great stuff Marcel.
> 
> So you have proved the Spitfire was the greatest fighter of all time and the Bf 109 was only third !


Only these were the best, Karl, because they had those pesky roundels removed!



----------



## Thorlifter (Jan 9, 2017)

Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair

There, that ought to help move the best plane of WWII to the top of the list!!!


----------



## Marcel (Jan 9, 2017)

Thorlifter said:


> Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair Corsair
> 
> There, that ought to help move the best plane of WWII to the top of the list!!!


Said the man with a P40 in his siggy 

2 problems though:
1. I only mined the Aviation forum
2. All these corsairs are in 1 post, so count as 1


----------



## rochie (Jan 9, 2017)

GrauGeist said:


> Only these were the best, Karl, because they had those pesky roundels removed!


----------



## Thorlifter (Jan 9, 2017)

Marcel said:


> Said the man with a P40 in his siggy


----------



## Wurger (Jan 9, 2017)




----------



## Marcel (Jan 9, 2017)

herman1rg said:


> I think I might know what my own most mentioned aircraft is...................


Okay Herman, you don't talk much about SE fighters. Here is your score:

```
herman1rg|Hawker Hurricane|5
herman1rg|Supermarine Spitfire|4
herman1rg|Grumman F6F Hellcat|2
herman1rg|Hawker Demon|2
herman1rg|Messerschmitt Bf 109|2
herman1rg|Republic P-47 Thunderbolt|2
herman1rg|Grumman F4F/FM Wildcat|1
herman1rg|Hawker Fury|1
herman1rg|North American P-51 Mustang|1
```


----------



## Marcel (Jan 9, 2017)

BTW guys, I found this wiki: List of aircraft of World War II - Wikipedia. I think I used it last time to create the current list. I will spend some time turning these into a search list.


----------



## Marcel (Jan 9, 2017)

Okay, added some heavy fighters and heavy bombers. Didn't start on the medium bombers and attack aircraft as it is a huge list. Also still have to do the transport a/c, but that is also a huge list.

Spitfire and Mustang are still on top, way ahead of the others. As I stated in the previous thread, anyone mentioning one of these fighters should be banned for lack of originality.
We see that of the heavy fighters, the P-38 is quite popular. I was surprised to see that the Bf 110 lags behind so much.

Anyway, here is the updated list:

```
Supermarine Spitfire|16436
North American P-51 Mustang|12972
Messerschmitt Bf 109|8848
Lockheed P-38|7677
Republic P-47 Thunderbolt|7446
Focke-Wulf Fw 190|7280
Avro Lancaster|5910
Curtiss P-40|5909
Hawker Hurricane|5742
Mitsubishi A6M Zero|5575
Vought F4U/FG Corsair|5347
Boeing B-17 Flying Fortress/Fortress|4512
Messerschmitt Me 262|3990
Grumman F6F Hellcat|3313
Grumman F4F/FM Wildcat|3219
Boeing B-29 Superfortress|3178
Consolidated B-24 Liberator|2954
Bell P-39 Aircobra|2948
Messerschmitt Bf 110|2539
Focke-Wulf Ta 152|2130
Hawker Typhoon|2056
Hawker Tempest|1772
Brewster Buffalo|1596
Bristol Beaufighter|1294
Heinkel He 177|1264
Westland Whirlwind|1147
Grumman F8F Bearcat|1124
Gloster Meteor|1089
Hawker Demon|1065
Lockheed P-80 Shooting Star|966
Messerschmitt Me 410|961
Curtiss P-36|926
Boulton Paul Defiant|910
Handley Page Halifax|885
Northrop P-61 Black Widow|872
Fairey Fulmar|825
Bell P-63 Kingcobra|824
Fokker G.I|821
Messerschmitt Me 210|773
Heinkel He 162|771
Supermarine Seafire|743
Focke-Wulf Fw 200|678
Heinkel He 219|659
Mikoyan-Gurevich MiG-1|646
Hawker Fury|627
Nakajima Ki-43|627
Fokker D.XXI|615
Gloster Gladiator|555
Grumman F7F Tigercat|554
Short Stirling|553
Heinkel He 51|515
Polikarpov I-16|510
Nakajima Ki-84|498
Avro Manchester|495
Lavochkin La-5|467
Kawasaki Ki-61|454
Yakovlev Yak-3|438
Fiat G.50|420
Fiat G.55|408
Lavochkin La-7|386
Dewoitine D.500/D.510|360
Fairey Firefly|349
Blackburn Skua|341
Heinkel He 112|331
Mikoyan-Gurevich MiG-3|322
Blackburn Roc|321
Fiat CR.42|316
Republic P-43|308
Dewoitine D.520|307
Yakovlev Yak-1|307
Bell P-59 Airacomet|292
Seversky P-35|279
Consolidated B-32 Dominator|275
Focke-Wulf Ta 154|265
Reggiane Re.2005|258
Kawasaki Ki-100|251
Nakajima Ki-44|245
Lavochkin LaGG-3|242
Nakajima Ki-27|236
Mitsubishi J2M|213
Boeing P-26 Peashooter|200
Morane-Saulnier M.S.406|199
Macchi C.202|197
Polikarpov I-15|193
CAC Boomerang|189
Kawanishi N1K/N1K-J|184
Piaggio P.108|168
Petlyakov Pe-3|158
Macchi C.200|145
Mitsubishi A5M|141
Macchi C.205|132
Petlyakov Pe-8|128
Vultee P-66 Vanguard|118
Grumman Goblin|117
Reggiane Re.2000|113
Kawasaki Ki-45|100
Farman F.221-223|99
Westland Welkin|98
Arsenal VG-33|96
Curtiss-Wright CW-21|94
Grumman F3F|81
Reggiane Re.2001|80
Vickers Warwick|79
IAR 80|72
Consolidated PB4Y-2 Privateer|71
Yakovlev Yak-7|69
Yakovlev Yak-9|69
Gloster Gauntlet|64
Ryan FR Fireball|62
Fiat CR.32|52
Tupolev TB-3|51
Avia B-534|46
Hawker Nimrod|40
Bloch MB.150-157|30
Nakajima J1N|30
Lavochkin LaGG-1|26
Avia B-135|22
Boeing P-12|22
Ikarus IK-2|18
PZL P.11|18
Bell YFM-1 Airacuda|17
Curtiss Hawk II|17
Koolhoven F.K.58|17
Caudron C.714|14
Armstrong Whitworth Scimitar|13
Kawasaki Ki-10|13
Loire 46|13
Kawasaki Ki-102|12
Curtiss Hawk III|10
Kochyerigin DI-6|10
Potez 630|10
Blériot-SPAD S.510|6
IMAM Ro.57|6
Messerschmitt Me 163|6
PZL P.24|6
PZL P.7|4
Avia BH-33|3
Blohm & Voss BV 142|3
Breda Ba.27|3
Koolhoven F.K.52|3
Mitsubishi Ki-109|2
Fairey Firefly II|1
Gloster Gamecock II|1
```


----------



## Old Wizard (Jan 9, 2017)




----------



## herman1rg (Jan 10, 2017)

Cheers Marcel


----------



## Gnomey (Jan 10, 2017)

Interesting stuff!


----------

