Continuing our WRTC 2006 post-contest analysis, we are pleased to share with
our WRTC 2006 friends an analysis by our friend Fabian, DJ1YFK, of DX spots
and the WRTC 2006 logs.
WRTC 2006 DX Spot / Log Analysis
Thanks to the WRTC 2006 policy of making all logs
public, which offers a data pool of about 85000
QSOs, it is possible to generate a lot of
interesting statistics. One topic in contesting that
has been discussed a lot lately is cheating by using
the DX cluster while claiming an unassisted category.
Using the data from the 46 WRTC logs and the
DX cluster spots collected for each of the WRTC stations from
the OH2AQ database, it is easy to draw conclusions about
who was using the DX cluster to find WRTC stations
and who wasn't.
Correlating Cluster Spots with Log Entries
This analysis is based on the assumption that there is a
correlation between the time at which a DX spot is
made and the time at which a station that saw the spot
appears in the contest log. For every spot in the
database for a certain station, say PT5V, the contents
of the following x minutes of PT5V's log are evaluated,
where the length x of the time frame can be varied.
Example
Right at the start of the contest, PT5V was spotted by NQ4I:
NQ4I 21012.7 PT5V 1212 08 Jul 2006
Let's say the time frame is set to 5 minutes. The
following QSOs are then extracted from PT5V's log:
QSO: 21012 CW 20060708 1212 PT5V 599 15 NQ4I 599 08
QSO: 21012 CW 20060708 1212 PT5V 599 15 KA9FOX 599 07
QSO: 21012 CW 20060708 1212 PT5V 599 15 NO2R 599 08
QSO: 21012 CW 20060708 1213 PT5V 599 15 W5GN 599 07
QSO: 21012 CW 20060708 1213 PT5V 599 15 K3ND 599 08
QSO: 21012 CW 20060708 1213 PT5V 599 15 VA2WDQ 599 04
QSO: 21012 CW 20060708 1213 PT5V 599 15 W9IU 599 08
QSO: 21012 CW 20060708 1214 PT5V 599 15 W1ZT 599 08
QSO: 21012 CW 20060708 1214 PT5V 599 15 WA5POK 599 07
QSO: 21012 CW 20060708 1215 PT5V 599 15 K8MFO 599 08
QSO: 21012 CW 20060708 1215 PT5V 599 15 W9OL 599 08
QSO: 21012 CW 20060708 1215 PT5V 599 15 W3YY 599 08
QSO: 21012 CW 20060708 1215 PT5V 599 15 K2AAW 599 07
QSO: 21012 CW 20060708 1216 PT5V 599 15 WB2AA 599 08
QSO: 21012 CW 20060708 1216 PT5V 599 15 PX5A 599 15
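The extraction step in this example can be sketched in a few lines of Python. This is only an illustration, not the code that produced the published results. The function names are made up, the field positions are read off the spot and QSO lines shown above, and the window is assumed to run from the minute of the spot up to, but not including, x minutes later, which matches the 1212 to 1216 excerpt.

```python
from datetime import datetime, timedelta

def parse_spot(line):
    """Parse a spot line such as 'NQ4I 21012.7 PT5V 1212 08 Jul 2006'."""
    spotter, freq, dx, hhmm, day, month, year = line.split()
    # %b expects English month abbreviations (default C locale).
    when = datetime.strptime(f"{year} {month} {day} {hhmm}", "%Y %b %d %H%M")
    return {"spotter": spotter, "freq": float(freq), "dx": dx, "time": when}

def parse_qso(line):
    """Parse a log line: QSO: freq mode date time mycall rst zone call rst zone."""
    f = line.split()
    when = datetime.strptime(f[3] + f[4], "%Y%m%d%H%M")
    return {"time": when, "station": f[5], "worked": f[8]}

def qsos_in_window(spot, qsos, minutes):
    """Return the QSOs logged in [spot time, spot time + minutes).

    The half-open interval is an assumption of this sketch.
    """
    end = spot["time"] + timedelta(minutes=minutes)
    return [q for q in qsos if spot["time"] <= q["time"] < end]

spot = parse_spot("NQ4I 21012.7 PT5V 1212 08 Jul 2006")
log = [parse_qso(line) for line in (
    "QSO: 21012 CW 20060708 1212 PT5V 599 15 NQ4I 599 08",
    "QSO: 21012 CW 20060708 1216 PT5V 599 15 PX5A 599 15",
)]
window = qsos_in_window(spot, log, 5)
```

With a 5-minute frame, both sample QSOs fall inside the window; a QSO logged at 1217 would not.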
For every callsign that appears in this time frame,
except the callsign of the spotting station itself,
a counter is increased. These stations have possibly
used the DX cluster spot sent by NQ4I to find and
work PT5V, but it might just as well have been coincidence.
This is repeated for every single DX spot
that was sent during the WRTC. About 2650
spots were made for the PT5/PW5 stations, which
results in about 2650 x-minute time
frames. In each of these time frames, the number of
appearances of every callsign in the
corresponding log was counted.
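A minimal sketch of this counting loop follows, again in Python and again with hypothetical names and data structures. One point the description leaves open is how overlapping time frames are handled when the same station is spotted several times in a row. This sketch assumes that each logged QSO is counted at most once, so that the later percentages cannot exceed 100.

```python
from collections import Counter
from datetime import datetime, timedelta

def count_hits(spots, logs, minutes):
    """Count, per callsign, the QSOs that followed a spot within `minutes`.

    spots: iterable of (spotter, wrtc_call, spot_time)
    logs:  dict mapping wrtc_call -> list of (qso_time, worked_call)
    """
    window = timedelta(minutes=minutes)
    hit_qsos = set()
    for spotter, wrtc_call, spot_time in spots:
        for qso_time, worked in logs.get(wrtc_call, []):
            # The spotting station itself is skipped for its own spot.
            if worked == spotter:
                continue
            if spot_time <= qso_time < spot_time + window:
                # Assumption: a QSO covered by several overlapping
                # windows is still counted only once.
                hit_qsos.add((wrtc_call, qso_time, worked))
    return Counter(worked for _, _, worked in hit_qsos)
```

Calling `count_hits(spots, logs, 5)` returns a `Counter` keyed by callsign, from which the second column of the result tables could be filled.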
Evaluation
Just counting the absolute number of appearances after a
DX spot would not make much sense. A station that
makes 100 QSOs with WRTC stations, 20 of them by
coincidence within x minutes after a DX spot, would
look more suspicious than someone who works a total
of 10 WRTC stations, 9 of them within a time frame
after a spot.
What is interesting is the percentage of a specific
station's QSOs that were made within x
minutes after a DX spot.
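For the two hypothetical stations from the paragraph above, the arithmetic looks like this (a trivial Python sketch; the function name is made up):

```python
def spot_ratio(hits, total):
    """Percentage of a station's WRTC QSOs that followed a spot."""
    return 100.0 * hits / total if total else 0.0

busy_station = spot_ratio(20, 100)   # 20 of 100 QSOs after a spot
small_station = spot_ratio(9, 10)    # 9 of 10 QSOs after a spot
print(busy_station, small_station)   # prints: 20.0 90.0
```

By absolute count the first station leads 20 to 9, but by percentage the second one stands out clearly.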
Since there are quite a number of uniques in the WRTC
logs and, by the rules of coincidence, many of
them fall within a time frame after a DX spot, it
makes sense to set a threshold: the
minimum number of QSOs across all WRTC logs that a
callsign needs in order to be included in the results.
In the given results (see below), thresholds of 0, 10
and 20 were chosen.
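Applying the threshold and building the result rows could look like the following Python sketch. The names are hypothetical. Treating the threshold as inclusive (a callsign with exactly 10 QSOs passes a threshold of 10) and sorting by descending percentage are assumptions of this sketch, not something the text states.

```python
def result_rows(hits, totals, threshold):
    """Build (percentage, hits, total, callsign) rows for one result table.

    hits:   dict callsign -> QSOs that followed a spot
    totals: dict callsign -> all QSOs with WRTC stations
    """
    rows = []
    for call, total in totals.items():
        # Skip callsigns below the minimum QSO count (and avoid dividing by 0).
        if total == 0 or total < threshold:
            continue
        n = hits.get(call, 0)
        rows.append((round(100.0 * n / total, 1), n, total, call))
    # Assumption: highest percentage first.
    rows.sort(reverse=True)
    return rows

rows = result_rows({"A1AA": 9, "B1BB": 20, "C1CC": 1},
                   {"A1AA": 10, "B1BB": 100, "C1CC": 1, "D1DD": 15},
                   threshold=10)
```

In this made-up example the unique C1CC, whose single QSO happened to follow a spot, would show up with 100% at a threshold of 0 but is filtered out at a threshold of 10.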
Results
Here are the links to the results. The results with a
threshold of 0 are included only for completeness and are
probably not very helpful.
Time frame:         2 minutes        5 minutes        10 minutes
Threshold (QSOs):   (0) / 10 / 20    (0) / 10 / 20    (0) / 10 / 20


The first column of the tables contains the percentage
of the QSOs a station made with any WRTC station
that fell within an x-minute time frame after a
DX spot.
The second column is the total number of QSOs that
followed a DX spot, the third column is the total
number of QSOs with WRTC stations, and the fourth column is
the callsign, followed by the claimed category and
score of the station's log submission (if available).
Examples
The results speak for themselves; have a look and draw
your own conclusions. It seems that quite a number of
stations that submitted as Single OP might have used
the DX cluster to help find WRTC stations...
To illustrate the effect of changing the time frame,
let's have a look at two different callsigns: N3RS (Single
OP) and NO2R (M/S).
NO2R claimed Multi-One and also submitted a lot of packet spots
during the contest, so he was clearly using the
DX cluster. He made a total of 163 QSOs with WRTC
stations during the contest. The following table
shows the distribution of his QSOs after spots,
within a 2-, 5- and 10-minute time frame.
Time frame    %       QSOs    increase %
2             55.2    90      0
5             71.2    116     16
10            80.4    131     9.2


More than half of the WRTC stations were worked within 2
minutes after a spot for them. Another strong
increase of 16 percentage points, to 71.2%, comes in the 3
minutes until the 5-minute window is over. In the
following 5 minutes, until the 10-minute window is
over, only a few more stations are worked.
The analysis of N3RS, who claimed
Single OP (unassisted), shows a totally different
picture. He made a total of 166 QSOs with WRTC
stations. The following table shows the same data as
the table for NO2R above.
Time frame    %       QSOs    increase %
2             19.9    33      0
5             38.6    64      18.7
10            60.8    101     22.2


Only one fifth of the WRTC stations were worked
within 2 minutes after a spot for them appeared. The
increase with growing time since the spot is
remarkably higher than for NO2R, i.e. the longer the time
since the spot, the higher the
probability of a QSO. This result leaves no doubt
that N3RS was actually not using the DX cluster (no one
would have suspected it anyway ;): the correlation
between his QSOs and the spots is significantly
lower than for NO2R.
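The two tables can be reproduced from the QSO counts alone. Judging from the numbers, the "increase %" column appears to be the difference, in percentage points, between the rounded percentages of consecutive time frames. That reading is an inference from the published figures, and the function name in this Python sketch is made up.

```python
def window_table(total, hits_by_window):
    """Rows of (minutes, percentage, QSOs, increase in percentage points)."""
    rows, prev = [], None
    for minutes in sorted(hits_by_window):
        hits = hits_by_window[minutes]
        pct = round(100.0 * hits / total, 1)
        # Assumption: the increase is taken between the rounded percentages.
        inc = 0.0 if prev is None else round(pct - prev, 1)
        rows.append((minutes, pct, hits, inc))
        prev = pct
    return rows

no2r = window_table(163, {2: 90, 5: 116, 10: 131})
n3rs = window_table(166, {2: 33, 5: 64, 10: 101})
for row in no2r + n3rs:
    print(row)
```

For NO2R the increments shrink from window to window (16, then 9.2), while for N3RS they grow (18.7, then 22.2), which is the difference in pattern described above.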
Both N3RS and NO2R show a typical pattern for their
respective categories. If you have a close look at
the full results, you will find a lot of stations
that do not fit these patterns at all...
Finally...
There are several problems that the methods used to
generate these simple statistics do not cover:
- Clock accuracy. A QSO might appear in a log at an
  earlier time than the time stamp of the spot, due to
  incorrect clock settings.
- PY was not a rare multiplier in this contest, and all
  WRTC stations count as just one multiplier. Many notorious
  cheaters might only work real multipliers from
  DX spots, so they will not appear high in this
  analysis.
- Likewise, stations that have been S&Ping especially
  for WRTC stations might, although not using the
  DX cluster, show a very high correlation in this
  analysis.
All raw data, the source code used to generate the
statistics, etc. are available here.
Any comments, criticism, etc. are appreciated via e-mail: mail@fkurz.net
Fabian, DJ1YFK
