[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [amibroker] 47 Missing QuotesPlus Index Symbols and Auditing QP Data



PureBytes Links

Trading Reference Links

Tomasz,

 

Hi.   I have MASSIVE mixed feelings on this issue…

 

First, the theoretical answer is…

 

Based on 20+ years experience building data warehouses, data marts, decision support systems and all manner of other analytical systems, my knee-jerk answer is “HECK NO!   Data MUST be cleaned up at the SOURCE!!!”

 

Dirty source data is a major problem with any analytical system.   Data warehouse jocks (like me) deal with this issue every day.   There are large, commercial software applications whose entire purpose is to audit and correct source data.  

 

The problem with trying to detect and correct dirty source data is that corrupt data comes in many unknown forms.  Dirty data is NOT the opposite of clean data.   For example…

 

It is relative easy to add “A” to “B” if we know the definition of “A” and “B”.   E.g. if A = 1, B = 2, and the algorithm is A+B, then the result is “3”. 

 

But what happens if A+B suddenly equals “14”?!?   What is the algorithm to determine WHY it equals “14”?  Is “A” no longer “1”?  Is “B” no longer “2”?  Are the values still correct, but has the relationship algorithm changed?   Do we correct “A”, correct “B” or write a new algorithm?!?

 

I’ve built many data cleansing systems and they are essentially a hack.  You try to repair someone else’s failure.  You define and code a workaround algorithm that invariably is a hassle, works for a while, and then suddenly breaks.   (Source data is corrected, new data problem, etc.)

 

Unfortunately, the realistic answer is…  

 

We need a data cleansing “system” to clean up the data ourselves.    Although I do NOT want the hassle either for you or for us, I think that we need a three pronged solution:

 

1.       Detect (audit) bad data in the source database.

2.       Cleanse or work around the bad data.    

3.       Raise awareness and provide feedback about the bad data – to other users and the data vendor.

 

My auditing process (below) is an example of step 1.  It is long and cumbersome, and only looks at one type of problem.  We need something more flexible and easier to implement.  Perhaps some simple utility or scan that can hit the source database directly looking for data problems.

 

Your suggestion (below) is an example of step 2 and has much merit.  I think it will work, but…   We still need to better detect the problems first (step 1).   And, we need to ensure that we can still retrieve all other data for these “override” symbols – including company name, Sector/Industry mapping and “Extra” data.

 

Thoughts?

 

Regards,

 

Dan.

 

P.s., as a database developer, my biggest “soapbox” issue is the complexity of data and its ability to “break”.  Most people don’t understand this and view a pretty UI as the application.  Unfortunately, what we see here is an example of the issue.   L

 


From: amibroker@xxxxxxxxxxxxxxx [mailto:amibroker@xxxxxxxxxxxxxxx] On Behalf Of Tomasz Janeczko
Sent: Thursday, September 22, 2005 11:46 PM
To: amibroker@xxxxxxxxxxxxxxx
Subject: Re: [amibroker] 47 Missing QuotesPlus Index Symbols and Auditing QP Data

 

Dan,

 

As you know I offered already some changes:

 

"The other option is of course allowing other IssueStatus values like 'N", "C", "P"

during retrieval process, however this would lead to retrieving also other symbols

(not listed below) that are not actively trading and this may not be what most people want/need.

"

 

From your response I know that it is not what you are after.

 

Another solution that may help is to have a special user-definable text file

that will hold tickers to retrieve always regardless of IssueStatus value.

There you would enter all those 47 symbols with not-actively-trading status

once and they would be included in all subsequent retrievals.

 

What do you think?


Best regards,
Tomasz Janeczko
amibroker.com

----- Original Message -----

From: Dan Clark

Sent: Friday, September 23, 2005 4:51 AM

Subject: [amibroker] 47 Missing QuotesPlus Index Symbols and Auditing QP Data

 

Hi,

For Quotes Plus users (like me) - there are data issues.   I learned this when I tried to plot the “!NYA” (NYSE Composite) symbol.   It was missing in AB!   After discussing this with Tomasz, it turns out that the “!NYA” IssueStatus is not valid and is therefore being excluded from AB!  

Below is a list of 47 QP3 Index symbols that appear to be corrupt and cannot be loaded into AmiBroker.   These include major indices like the NYSE Composite (“!NYA”), many iShares indices and several HOLDRS indices. 

I strongly urge QuotesPlus data users to audit AmiBroker versus QuotesPlus data to ensure that your data is correct.  There are multiple methods that can be used, but here’s the method that I used:

1.       Using Quotes Plus SPP, retrieve and save all symbols listed as “Indexes”.  

2.       Open an Excel workbook and load the symbols.  They should be in column “A”, with the first symbol in Column A2.

3.       Using AB, scan for all symbols all symbols that were in the group “Index”.   Here’s the Exploration:


if(GroupID(1)== "Index")
   Buy =
1;
else
   Buy =
0;
  
Filter=Buy;
SetOption("NoDefaultColumns", True );
AddTextColumn(Name(), "Symbol");
AddTextColumn(FullName(), "Company");

 

4.       Select and copy the results of the scan.

5.       Paste into symbols into the open Excel workbook in Column “C” so that the first symbol is in Column “C2”.

6.       Paste this formula in Column “B2” (between the two lists of symbols:

=IF(NOT(ISERROR(VLOOKUP(A2,$C$2:$C$228,1,FALSE))), "", A2)”

Ensure that the lookup range (“$C$2:$C$228” in the example) matches the entire range of AB symbols.  For example, it you have 240 symbols, the range would be something like “$C$2:$C$242”.

7.       Copy the formula from B2 to the bottom of range of symbols in column A.   I.e., just ensure that all of the QuotesPlus symbols are referenced.  

8.       What you should end up with is a list of symbols in the “B” column which are in SPP, but NOT in AB.  

9.       Copy the “B” column and pasted the VALUES only (Edit à Paste Special à Values) to a convenient blank worksheet.

10.   Sort this list of symbols.  This gives you a nice ordered list of symbols.

11.   Select the ordered list of symbols and paste transpose (Edit à Paste Special à Transpose) the symbols so that they are in a single row. 

12.   Assuming the row of symbols starts in cell “A1”, type the formula

=A1 & " ,"

13.   The result is a comma-delimited row of symbols with suitable for adding to AB.

14.   Copy the cells containing the comma-delimited symbols.

15.   In AB, select Symbol à New.

16.   In the text box in the “Enter a new symbol name” paste your comma-delimited list of symbols.  Then click OK.

17.   The symbols have now been added to AB.   However, they will NOT have a company name. I.e., FullName() will return blank.  (This is useful.)

18.   Run the following Exploration:

if(FullName()== "")
   Buy =
1;
else
   Buy =
0;

IssueType = GetExtraData("IssueType");

IssueStatus = GetExtraData("IssueStatus");


Filter=Buy;
SetOption("NoDefaultColumns", True );
AddTextColumn(Name(), "Symbol");
AddTextColumn(FullName(), "Company");

AddTextColumn(IssueType(), "IssueType");
AddTextColumn(FullName(), "IssueStatus");

 

I found the data issues in Index symbols.  There are probably other bad symbols that we are not aware of.   Send your list of bad symbols to QuotesPlus and copy AmiBroker support.    This should help get the data cleaned up.

 

Regards,

 

Dan.

 

 

Symbol

Company

IssueType

IssueStatus

!AM-T

 

A

 

!AM-UV

 

A

 

!BUX

 

A

N

!EMX

 

A

N

!EUX

 

A

P

!FFX

 

A

C

!GLI

 

A

P

!IPH

 

A

P

!IXBT

 

A

C

!IZN

 

A

N

!KGI

 

A

N

!NAG

 

A

N

!NBF

 

A

N

!NHG

 

A

N

!NIJ

 

A

D

!NIR

 

A

N

!NJB

 

A

N

!NJG

 

A

N

!NJH

 

A

N

!NJM

 

A

N

!NJU

 

A

N

!NJW

 

A

N

!NJZ

 

A

N

!NLA

 

A

N

!NLB

 

A

N

!NLE

 

A

N

!NLF

 

A

N

!NLL

 

A

N

!NLR

 

A

N

!NLU

 

A

N

!NME

 

A

N

!NMJ

 

A

N

!NMV

 

A

N

!NNV

 

A

N

!NY-AV

 

A

C

!NY-DV

 

A

C

!NY-T

 

A

 

!NY-UV

 

A

 

!NYA

 

A

C

!TOP

 

A

N

!VIO

 

A

N

!XAH

 

A

N

!XEU

 

A

P

!XRH

 

A

P

!XSH

 

A

P

!XUH

 

A

P

!YIH

 

A

N

 



Please note that this group is for discussion between users only.

To get support from AmiBroker please send an e-mail directly to
SUPPORT {at} amibroker.com

For other support material please check also:
http://www.amibroker.com/support.html





YAHOO! GROUPS LINKS