I was wondering whether the script referred to here is available anywhere.
Thanks,
H
--- In amibroker@xxxxxxxxxxxxxxx, "tpowers2010" <wingusr@xxx> wrote:
>
> I'm currently working on a Python 2.5 script to download all the stocks
> listed in the Yahoo Industry Browser <http://biz.yahoo.com/p/> by
> sector then industry.
>
> I basically do the same thing that is done by the Excel workbook found
> at http://icc-az.com/amibroker_files%5CStocks_XLS.zip . However, that
> page says "Since this using plain VBA for all extraction, it is very
> slow. Expect 12 hours to do an extract...".
>
> For comparison, my Python script currently takes about 8 minutes or so.
> The main reason is that I can get ticker, company name, sector, and
> industry without having to download the individual company profile
> pages. And, unlike the Excel solution which downloads entire webpages
> (including images), I only have to grab the basic html page.
>
> Using the third-party Python module BeautifulSoup
> <http://www.crummy.com/software/BeautifulSoup/>, it turns out it's
> pretty easy to extract the required information from the raw html
> (rather than making Excel convert webpages to spreadsheets).
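
To give an idea of what this parsing step can look like, here is a minimal sketch. It is not the original script. It assumes Python 3 and uses the standard-library html.parser as a stand-in for BeautifulSoup, and the sample table (SAMPLE_HTML) is invented for illustration, since the real Industry Browser markup is not shown in this thread.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; NOT Yahoo's actual markup.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Ticker</th></tr>
  <tr><td>Acme Footwear Inc.</td><td>ACMF</td></tr>
  <tr><td>Example Apparel Corp.</td><td>EXAP</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the cell being read, or None outside <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append(tuple(self._row))
            self._row = None


def extract_rows(html):
    """Return a list of tuples, one per table row that contains <td> cells."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    return parser.rows
```

Calling extract_rows(SAMPLE_HTML) gives one (company, ticker) tuple per data row. With BeautifulSoup the same idea would be much shorter, but the sketch above avoids any third-party install.
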
>
> Finally, to get the exchange information, instead of having to read each
> company's profile page I use the
> http://finance.yahoo.com/d/quotes.csv?s=TICKERS&f=x URL with TICKERS
> replaced with a + separated list of ticker symbols to get the exchanges
> for 200 companies at once.
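
The batching described above can be sketched as follows. This only builds the request URLs from the pattern quoted in the post and does no network access. The function names (chunked, exchange_urls) are made up for the example, and Python 3 is assumed.

```python
BASE_URL = "http://finance.yahoo.com/d/quotes.csv"  # URL pattern quoted in the post


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def exchange_urls(tickers, batch_size=200):
    """Build one f=x request URL per batch of tickers, joined with '+'."""
    return [
        "%s?s=%s&f=x" % (BASE_URL, "+".join(batch))
        for batch in chunked(list(tickers), batch_size)
    ]
```

For 7500 symbols this yields 38 URLs instead of 7500 separate profile requests. The sketch does no escaping, so symbols containing unusual characters might need extra handling.
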
>
> A caveat is that getting info from the Industry Browser pages alone
> surprisingly yields some ticker symbols that are already incorrect.
> (This seems to happen for any stock whose exchange is listed as "n/a".)
> My impression is that the newer Yahoo Industry Center
> <http://biz.yahoo.com/ic/ind_index.html> page is more accurate but
> slightly harder to parse.
>
> Therefore to be absolutely sure that the tickers are valid, you end up
> having to make sure you can download each company's profile or quotes
> page. The only time I've tried doing that took about 3 hours. As a side
> benefit of this process you can scrape additional information on each
> company (such as number of employees). Only about 10 or so of the 7500+
> symbols were listed incorrectly on the main Industry Browser pages (all
> of them being OTC BB traded stocks).
>
> I'm thinking about using multiple threads to download say 10 pages at
> once to speed up this last process. Unfortunately, I didn't design the
> original code to be thread-safe so this will take some work.
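
One possible shape for the 10-at-a-time idea is sketched below. It assumes Python 3 (concurrent.futures is not available in Python 2.5), and `fetch` is a hypothetical callable that downloads one company's page and raises an exception on failure. Each worker only touches its own arguments and return value, which sidesteps most thread-safety concerns.

```python
from concurrent.futures import ThreadPoolExecutor


def check_tickers(tickers, fetch, workers=10):
    """Call fetch(ticker) for every ticker using a pool of worker threads.

    Returns a dict mapping ticker -> True if fetch succeeded, False if it raised.
    """
    def attempt(ticker):
        try:
            fetch(ticker)
            return ticker, True
        except Exception:
            return ticker, False

    # The pool runs at most `workers` downloads at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(attempt, tickers))
```

Tickers that map to False are the candidates for the "listed incorrectly" group and can be reviewed by hand.
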
>
> Once I have the basic stock information I spit out a .csv list (readable
> by Excel), broker.sectors, and broker.industries files. I also use a
> separate small Python script to initialize a new AmiBroker database. You
> have to manually update the Markets since there is apparently no way to
> do this from COM (but there are only 8 of them).
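
For the .csv part only, a minimal sketch with Python 3's csv module might look like this. The column names and the sample row are invented for the example. The broker.sectors and broker.industries formats are not described in this thread, so they are left out.

```python
import csv
import io


def write_stock_csv(stocks, out):
    """Write (ticker, name, sector, industry, exchange) rows with a header line."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Ticker", "Name", "Sector", "Industry", "Exchange"])
    writer.writerows(stocks)


# Invented sample row; an in-memory buffer stands in for a real file.
buf = io.StringIO()
write_stock_csv(
    [("ACMF", "Acme Footwear, Inc.", "Consumer Goods",
      "Textile - Apparel Footwear & Accessories", "NYSE")],
    buf,
)
csv_text = buf.getvalue()
```

The csv module quotes the company name automatically because it contains a comma, so Excel reads it back as a single cell.
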
>
> One thing I noticed is that the broker.industries file used to
> initialize new databases seems to have an undocumented limit of about 38
> or 39 characters for Industry Name. The "Textile - Apparel Footwear &
> Accessories" industry gets truncated and a bogus industry gets added
> unless I first limit the industry name length.
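
Limiting the name length could be as simple as the sketch below. The limit of 38 is an assumption taken from the "about 38 or 39 characters" observation above, not a documented AmiBroker value, and the function name is made up.

```python
MAX_INDUSTRY_NAME = 38  # assumed limit, based on the observation in the post


def clip_industry_name(name, limit=MAX_INDUSTRY_NAME):
    """Return `name` cut to at most `limit` characters, trailing spaces removed."""
    return name[:limit].rstrip()


full_name = "Textile - Apparel Footwear & Accessories"  # 40 characters
clipped = clip_industry_name(full_name)
```

Note that clipping could in principle make two long industry names identical, so it may be worth checking for duplicates afterwards.
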
>
> Also, Industries don't appear to be sorted correctly under their Sectors
> (I saw another post here that mentions the same thing).
>
> Anyway, this is all somewhat of a work in progress. It also is a
> command-line only script. There is no GUI associated with it. You'll
> have to be comfortable with installing ActiveState's free Python 2.5 for
> Windows distribution, installing the BeautifulSoup and mechanize
> modules, and running scripts from a Command Prompt.
>