How do you determine how "clean" your data is? Besides the Database Purify command, it seems to me an essential tool would be the ability to compare one AmiBroker database to another and see what the differences are.
For example, it would be nice to know what the difference is between, say, PremiumData.com data and Yahoo's free data. How differently do they handle symbol changes and splits? What happens if you get PremiumData historical data but update it with Yahoo data?
And what's the difference between a database filled every day for a month with the AmiQuote Yahoo Current source (or the Historical source going back only a day) and a database that is filled all at once with the last 30 days of data? The second way would presumably pick up any fixes made to the Yahoo / Commodity Systems data in the meantime, whereas the first wouldn't.
How significant is the difference, and is it worth worrying about? Should Yahoo free data users periodically just re-download their entire databases? Instead of relying on the quick Yahoo Current source, maybe they should use the Historical source and get a week's worth of data every day to pick up any recent fixes.
But how can you tell what is worth doing without the ability to compare databases?
I know from working on my Yahoo Stock Ticker downloader (see http://finance.groups.yahoo.com/group/amibroker/message/140652 for more details on this work in progress) that an Industry Browser download as of July 24, 2009 had this breakdown on Sunday:
Yahoo Stock List Exchange Summary (8):
   538  AMEX
   453  NASDAQ CM
  1026  NASDAQ GM
  1289  NASDAQ NM
  2312  NYSE
  1898  OTC BB
     7  Other OTC
     4  PCX
  ----
  7527

Yahoo Stock List Sector Summary (9):
   815  Basic Materials
    29  Conglomerates
   524  Consumer Goods
  2166  Financial
   802  Healthcare
   443  Industrial Goods
  1309  Services
  1292  Technology
   147  Utilities
  ----
  7527
But now (July 28, 2009) it says this:
Yahoo Stock List Exchange Summary (8):
   538  AMEX
   368  NASDAQ CM
   971  NASDAQ GM
  1427  NASDAQ NM
  2312  NYSE
  1894  OTC BB
     7  Other OTC
     4  PCX
  ----
  7521

Yahoo Stock List Sector Summary (9):
   815  Basic Materials
    29  Conglomerates
   524  Consumer Goods
  2165  Financial
   801  Healthcare
   441  Industrial Goods
  1308  Services
  1291  Technology
   147  Utilities
  ----
  7521
And the following symbols have died:
Bad symbols (7):
  Borland Software Corp. (N/A: BORL)
  China Networks International H (: CNWHF)
  Endocare Inc. (N/A: ENDO)
  Maven Media Holdings, Inc. (N/A: MVMH.OB)
  Neah Power Systems, Inc. (N/A: NPWS.OB)
  TransTech Services Partners In (N/A: TTSP.OB)
  Wataire International, Inc. (N/A: WTARE.OB)
So even a few days result in quite a few differences just in which stocks are listed. (Borland Software has disappeared??! Oh, it merged with Micro Focus the other day.)
The massive changes in the NASDAQ markets seem suspicious, and maybe my Python script isn't quite working yet, but the exchange lookup is pretty hard to mess up.
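To make the shifts between the two exchange summaries easier to see, here is a minimal Python sketch that tallies the per-exchange changes. The counts are copied from the two listings above; the function name count_deltas is just an illustrative choice and is not part of the downloader script.

```python
# Exchange counts copied from the two summaries in this post.
july_24 = {
    "AMEX": 538, "NASDAQ CM": 453, "NASDAQ GM": 1026, "NASDAQ NM": 1289,
    "NYSE": 2312, "OTC BB": 1898, "Other OTC": 7, "PCX": 4,
}
july_28 = {
    "AMEX": 538, "NASDAQ CM": 368, "NASDAQ GM": 971, "NASDAQ NM": 1427,
    "NYSE": 2312, "OTC BB": 1894, "Other OTC": 7, "PCX": 4,
}


def count_deltas(before, after):
    """Return {key: after - before} for every key whose count changed."""
    deltas = {}
    for key in sorted(set(before) | set(after)):
        change = after.get(key, 0) - before.get(key, 0)
        if change:
            deltas[key] = change
    return deltas


deltas = count_deltas(july_24, july_28)
for exchange, change in deltas.items():
    print(f"{exchange:<10} {change:+d}")
print(f"{'Net':<10} {sum(deltas.values()):+d}")
```

The three NASDAQ tiers change by -85, -55 and +138, which nets to only -2. That would be consistent with symbols being reclassified between tiers rather than appearing or disappearing, though the counts alone cannot confirm it.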
Back to the original question: I could always implement something in COM to extract the symbol info back out to Python and do the comparisons there, but I was wondering if there is another solution.
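For what it's worth, the comparison step itself would be small once the data is in Python. The sketch below assumes each database has already been dumped into a plain dict of the form {symbol: {date: close}}; how the data gets out of AmiBroker (COM or a file export) is not shown. The function name compare_databases, the symbols AAA/BBB/CCC and all prices are made up for illustration.

```python
def compare_databases(db_a, db_b, tolerance=1e-6):
    """Compare two {symbol: {date: close}} mappings.

    Returns (only_in_a, only_in_b, mismatches). Each mismatch is a tuple
    (symbol, date, close_a, close_b); a bar missing on one side is None.
    """
    only_in_a = sorted(set(db_a) - set(db_b))
    only_in_b = sorted(set(db_b) - set(db_a))
    mismatches = []
    for symbol in sorted(set(db_a) & set(db_b)):
        bars_a, bars_b = db_a[symbol], db_b[symbol]
        for date in sorted(set(bars_a) | set(bars_b)):
            close_a = bars_a.get(date)
            close_b = bars_b.get(date)
            if close_a is None or close_b is None:
                # The bar exists in only one of the two databases.
                mismatches.append((symbol, date, close_a, close_b))
            elif abs(close_a - close_b) > tolerance:
                # Both have the bar but the closes disagree.
                mismatches.append((symbol, date, close_a, close_b))
    return only_in_a, only_in_b, mismatches


# Made-up sample data: AAA has one bar that differs between the two sources.
db_a = {
    "AAA": {"2009-07-24": 10.0, "2009-07-27": 10.5},
    "BBB": {"2009-07-24": 20.0},
}
db_b = {
    "AAA": {"2009-07-24": 10.0, "2009-07-27": 5.25},
    "CCC": {"2009-07-24": 30.0},
}

only_in_a, only_in_b, mismatches = compare_databases(db_a, db_b)
print("Only in A:", only_in_a)
print("Only in B:", only_in_b)
for row in mismatches:
    print("Mismatch:", row)
```

Comparing only the close keeps the sketch short; the same loop could compare open, high, low and volume if those were included in the dump.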