Re: [amibroker] Re: Freakishly fast backtest using 64 cores



Yes, if your data fit in RAM.

Best regards,
Tomasz Janeczko
amibroker.com
----- Original Message ----- 
From: "cstrader" <cstrader232@xxxxxxxxxxxx>
To: <amibroker@xxxxxxxxxxxxxxx>
Sent: Tuesday, August 12, 2008 9:58 PM
Subject: Re: [amibroker] Re: Freakishly fast backtest using 64 cores


> Wow... is this why the second and subsequent runs of many optimization 
> problems seem so incredibly fast?  (first run loads data into memory, 
> subsequent runs do not?)
> 
> Thanks
> 
> 
> ----- Original Message ----- 
> From: "Tomasz Janeczko" <groups@xxxxxxxxxxxxx>
> To: <amibroker@xxxxxxxxxxxxxxx>
> Sent: Tuesday, August 12, 2008 3:40 PM
> Subject: Re: [amibroker] Re: Freakishly fast backtest using 64 cores
> 
> 
>> Hello,
>>
>> What is true for a GPU is not necessarily true for a CPU. A GPU has a
>> dedicated wide memory bus and faster RAM than system memory.
>>
>> AmiBroker does a lot to utilise memory to the maximum extent where
>> possible/feasible.
>>
>> Actually, AFL speed is limited by system memory speed once you run out of
>> on-chip cache.
>> http://www.amibroker.com/kb/2008/08/12/afl-execution-speed/
>>
>> So using more memory does not always mean faster execution.
>>
>> Sure, you can pre-compute everything and use pre-computed values, but
>> you need to understand that people are doing VERY different things with
>> AmiBroker, and their problems are not the same as the problems you are
>> trying to solve. For example, some customers are backtesting the entire US
>> stock universe (8000+ symbols) over 10 or 20 years. That's about 1.3GB for
>> DATA alone. Now, if you are running a portfolio backtest you need to keep
>> trading signals, and that can be as much as 1GB in such a case. You quickly
>> reach the 3GB RAM limit of a 32-bit OS. There is no place to store
>> "pre-computed" values.
>> AmiBroker by nature needs to provide the best blend of speed and moderate
>> memory / CPU requirements.
>> User-specific single-task solutions may go into specialisation and tricks
>> that are not feasible for a commercial general-purpose product that is
>> intended to keep a large user base happy.
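
The memory budget described above can be checked with a rough back-of-envelope sketch. The bytes-per-bar figure and the number of trading days per year are assumptions made for illustration (they are not stated in the thread), and the function names are hypothetical; only the 8000 symbols, the 10 or 20 years, the 1.3GB, 1GB and 3GB figures come from the message.

```python
# Rough estimate of the RAM needed to hold daily bars for a large universe.
# ASSUMPTIONS (not from the thread): 40 bytes per bar, 252 trading days/year.
BYTES_PER_BAR = 40
TRADING_DAYS_PER_YEAR = 252
GIB = 1024 ** 3


def data_bytes(symbols, years, bytes_per_bar=BYTES_PER_BAR):
    """Bytes needed if every symbol has a full daily history."""
    return symbols * years * TRADING_DAYS_PER_YEAR * bytes_per_bar


def headroom(limit_gb, *used_gb):
    """Address space left (in GB) after subtracting each listed consumer."""
    return limit_gb - sum(used_gb)


if __name__ == "__main__":
    for years in (10, 20):
        print(f"{years} years: {data_bytes(8000, years) / GIB:.2f} GiB of bars")
    # Figures quoted in the message: 1.3GB data, 1GB signals, 3GB limit.
    print(f"left for everything else: {headroom(3.0, 1.3, 1.0):.1f} GB")
```

Under these assumptions the 10 and 20 year cases bracket the quoted 1.3GB, and the quoted figures leave well under 1GB for the program itself and any pre-computed arrays.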
>>
>> Best regards,
>> Tomasz Janeczko
>> amibroker.com
>> ----- Original Message ----- 
>> From: "dloyer123" <dloyer123@xxxxxxxxx>
>> To: <amibroker@xxxxxxxxxxxxxxx>
>> Sent: Tuesday, August 12, 2008 4:09 PM
>> Subject: [amibroker] Re: Freakishly fast backtest using 64 cores
>>
>>
>>> The programming guide lists the 8600M and 8700M as having 32 computing
>>> cores.  Not sure what they are clocked at.  Power is an issue.  The
>>> desktop versions need dedicated power connectors.  The big cards need
>>> two.
>>>
>>> Actually, when I am doing development on my laptop, I just use the
>>> emulator.  It is about 100x slower than my desktop system, but still
>>> about 20x to 50x faster than Ami alone.  The speed difference in
>>> emulation mode is mostly due to the precomputed and cached price
>>> arrays.
>>>
>>> Tomasz:  I suspect that there is an opportunity to trade memory for
>>> speed, even with 1 core.  Memory is cheap and would be a simpler way
>>> to get a performance boost than porting to multi core, GPU or CPU.
>>>
>>>
>>>
>>> --- In amibroker@xxxxxxxxxxxxxxx, "Tomasz Janeczko" <groups@xxx>
>>> wrote:
>>>>
>>>> Dell has 3 off the shelf
>>>> > laptops in their entertainment/performance range that use GeForce
>>>> > 8600M and 8700M with 256MB & 2*2456MB (min 256 required for CUDA?)
>>>>
>>>> Mobile ones are very poor cousins. Believe me. I own a brand new
>>>> notebook (ASUS) with a GeForce 8600M
>>>> and it is SLOW in 3D. I mean SLOW. Did I mention that it is SLOW?
>>>>
>>>> In 3D Mark it gets the same results as my 3 year old desktop 6600GT.
>>>>
>>>> Best regards,
>>>> Tomasz Janeczko
>>>> amibroker.com
>>>> ----- Original Message ----- 
>>>> From: "brian_z111" <brian_z111@xxx>
>>>> To: <amibroker@xxxxxxxxxxxxxxx>
>>>> Sent: Tuesday, August 12, 2008 12:40 AM
>>>> Subject: [amibroker] Re: Freakishly fast backtest using 64 cores
>>>>
>>>>
>>>> > DL
>>>> >
>>>> >
>>>> > I am following at the top level and understand what you are doing OK
>>>> > (you make me wish I had learnt programming/IT).
>>>> >
>>>> > I like your CPU.
>>>> >
>>>> > Allowing niche trading is what AB is all about?
>>>> >
>>>> > I'll put my money on MS/"general purpose computing on GPU" - I don't
>>>> > think the masses are in love with MS, but for the 80% of people who can
>>>> > do 80% of what they want with MS the price to move elsewhere is too
>>>> > high - they are just in love with max output for min input.
>>>> >
>>>> > If you go to the trouble of writing a plug-in, do you think it will be
>>>> > around long/require much ongoing support from you?
>>>> >
>>>> > I can see the benefits of the speed - for a group of traders it is a
>>>> > definite edge they would have for a year or two (I don't think any
>>>> > other trading software will be seeing this for a while?) - especially
>>>> > in the AT area where more crunching could be done fast enough to keep
>>>> > up with live data.
>>>> >
>>>> > I don't blame Tomasz for not sitting his backside on the cutting
>>>> > edge - too dangerous for developers with long term clientele.
>>>> >
>>>> > Not having a go at Tomasz - to clarify - Tomasz said the GeForce 8800
>>>> > can't be put in a notebook?
>>>> >
>>>> > To my understanding there seems to be a reasonable number of laptops
>>>> > around that could use your method e.g. Dell has 3 off the shelf
>>>> > laptops in their entertainment/performance range that use GeForce
>>>> > 8600M and 8700M with 256MB & 2*2456MB (min 256 required for CUDA?)
>>>> >
>>>> > I looked at the GeF links in Paul's post but they didn't have much
>>>> > specific info there that I could see - I assume the above cards will
>>>> > run your system.
>>>> >
>>>> > I am not a buyer for now but good luck with it and what you have done
>>>> > already is a good contribution to AB - once someone on the block has
>>>> > a new super-dooper gadget pretty soon the neighbours want one too and
>>>> > demand grows.
>>>> >
>>>> > brian_z
>>>> >
>>>> >
>>>> >
>>>> > --- In amibroker@xxxxxxxxxxxxxxx, "dloyer123" <dloyer123@> wrote:
>>>> >>
>>>> >> This uses the mid range video card that happened to come with my
>>>> >> system, a 9800GT.  The newer 260 and 280 cards are 3 to 4 times
>>>> >> faster.  The 260 can be found at Best Buy for $300.  Some laptops
>>>> >> have compatible cards as well.
>>>> >>
>>>> >> The video card has its own memory, mine has 512MB, some have as much
>>>> >> as 1GB.  This memory is very fast, once it is loaded from the main
>>>> >> system.  Nvidia has a professional line of products that have much
>>>> >> more memory.
>>>> >>
>>>> >> To get the best performance, my AFL code makes one pass over the
>>>> >> data, calling a Dll.  The Dll takes all of the data needed by the
>>>> >> calculation and loads a copy to the video card.  This upload is slow;
>>>> >> the entire upload takes about 45 seconds for all 1000 symbols.
>>>> >>
>>>> >> Once all of the data is uploaded, the Dll loads a "kernel" into the
>>>> >> graphics cores that performs the actual computation and generates the
>>>> >> trade list.  This part is very fast and performs all of the same
>>>> >> functions that my AFL version does.  The resulting trade list is the
>>>> >> same.
>>>> >>
>>>> >> Because the data is loaded into video memory, it can be reused for many
>>>> >> passes over the data with different optimization values.  So,
>>>> >> hundreds of combinations of optimization values can be tried per
>>>> >> second.
>>>> >>
>>>> >> For non optimization runs, the Dll just loads one symbol into video
>>>> >> memory and processes it.  Counting the overhead of moving data to the
>>>> >> video card and extracting the trade list for a single symbol, the
>>>> >> speed is similar to that of AFL code alone.  This lets me test the code
>>>> >> and make sure it is correct.
>>>> >>
>>>> >> This approach works best when the data only needs to be loaded once,
>>>> >> then reused many times.  It also works best when there is a lot of
>>>> >> data to work with.
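
The "load once, reuse many times" trade-off can be put into numbers with a simple cost model. This is only a sketch: it uses the 45 second upload, the 4.4ms GPU pass and the 25 second AFL pass quoted in this thread, assumes nothing else contributes to run time, and the function names are hypothetical.

```python
import math

# Figures quoted in the thread; the cost model itself is an assumption.
UPLOAD_S = 45.0      # one-time upload of all symbols to video memory
GPU_PASS_S = 0.0044  # one backtest pass on the GPU
AFL_PASS_S = 25.0    # one backtest pass in AFL alone


def gpu_total(passes):
    """Total seconds on the GPU path: upload once, then run every pass."""
    return UPLOAD_S + passes * GPU_PASS_S


def afl_total(passes):
    """Total seconds when every pass runs in AFL alone."""
    return passes * AFL_PASS_S


def break_even_passes():
    """Smallest number of passes at which the GPU path is faster overall."""
    return math.ceil(UPLOAD_S / (AFL_PASS_S - GPU_PASS_S))


if __name__ == "__main__":
    print("break-even passes:", break_even_passes())
    for n in (1, 10, 1000):
        print(n, f"GPU {gpu_total(n):.1f}s", f"AFL {afl_total(n):.1f}s")
```

With these numbers a single pass does not pay for the upload, while an optimization with hundreds of passes is dominated by the upload rather than by the computation.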
>>>> >>
>>>> >> What is more interesting to me and what would be more useful for
>>>> >> others would be a general driver that requires no Dll changes to
>>>> >> modify the system.  The performance would not be as good as hand
>>>> >> optimized code, but would still be much better than AFL code alone.
>>>> >> It would take trading system design to a whole new level.  It would
>>>> >> provide enough performance to make working with intraday data as
>>>> >> easy as daily data is today.
>>>> >>
>>>> >> Writing such a driver would be hard, but I have already done some
>>>> >> prototypes and design work.  I am tempted to do it for my own use.
>>>> >> If I made it available to others, supporting it would be a PITA.
>>>> >>
>>>> >>
>>>> >>
>>>> >>
>>>> >> --- In amibroker@xxxxxxxxxxxxxxx, "Paul Ho" <paul.tsho@> wrote:
>>>> >> >
>>>> >> > I'm very interested.
>>>> >> > Could you elaborate a bit more?
>>>> >> > What model of Nvidia chipset are you using, and with how much memory?
>>>> >> > Not sure exactly what you mean when you say:
>>>> >> > "It uses AmiBroker to load the symbol data and perform calculations
>>>> >> > that do not depend on the optimization parameters. Once loaded into
>>>> >> > video memory, repeated passes can be made with different parameters,
>>>> >> > avoiding any overhead."
>>>> >> > Can you give me some examples? I presume that when your dll is called,
>>>> >> > AB passes one or more arrays of data belonging to 1 symbol; is that true?
>>>> >> > Not sure exactly what the rest means either. How many functions are you
>>>> >> > running in your dll, and what does each of them do?
>>>> >> > Great of you to share your insight.
>>>> >> > Cheers
>>>> >> > Paul.
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> >   _____
>>>> >> >
>>>> >> > From: amibroker@xxxxxxxxxxxxxxx [mailto:amibroker@xxxxxxxxxxxxxxx]
>>>> >> > On Behalf Of dloyer123
>>>> >> > Sent: Tuesday, 5 August 2008 9:19 AM
>>>> >> > To: amibroker@xxxxxxxxxxxxxxx
>>>> >> > Subject: [amibroker] Freakishly fast backtest using 64 cores
>>>> >> >
>>>> >> >
>>>> >> >
>>>> >> > Greetings,
>>>> >> >
>>>> >> > I ported part of my AFL backtest code to a plugin that takes
>>>> >> > advantage of the graphics math cores on the video card that are
>>>> >> > normally used for 3D graphics.
>>>> >> >
>>>> >> > I was able to get a several-thousand-fold performance improvement
>>>> >> > over AFL code alone.
>>>> >> >
>>>> >> > My goal was to reduce the 25 seconds AFL code alone uses for a single
>>>> >> > portfolio level back test to less than 1 second, allowing multi day
>>>> >> > optimization and walkforward runs to complete in a more reasonable
>>>> >> > time, and also just to see how fast I could get it to run.
>>>> >> >
>>>> >> > The backtest runs over 1 year of 5 minute bars for about 1000
>>>> >> > symbols. 1 year of data normally takes 25 seconds for AmiBroker
>>>> >> > alone, or 18 seconds for 6 months of data. A typical optimization
>>>> >> > run takes hundreds of these passes per walk forward step, taking
>>>> >> > hours.
>>>> >> >
>>>> >> > Using the Nvidia CUDA API, running on my mid range video card, it
>>>> >> > was much faster. Much, much, much faster. How fast?
>>>> >> >
>>>> >> > It reduced the run time from 25s to... 4.4ms. That is more than
>>>> >> > 200 runs/s!
>>>> >> >
>>>> >> > I didn't believe the timing when I saw it at first. So, I put 1,000
>>>> >> > runs in a loop and sure enough, it ran 1,000 iterations in about 4
>>>> >> > 1/2 seconds. This far exceeded my goal or expectations.
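
The timing figures quoted above are consistent with each other, which a few lines of arithmetic can confirm. This is only a sanity check of the quoted numbers (25s, 4.4ms, 1,000 iterations); the function names are hypothetical.

```python
# Sanity check of the quoted timings; no new measurements are implied.
AFL_PASS_S = 25.0    # one pass in AFL alone, as quoted
GPU_PASS_S = 0.0044  # one pass on the GPU, as quoted


def speedup(slow_s, fast_s):
    """How many times faster the fast path is than the slow path."""
    return slow_s / fast_s


def passes_per_second(pass_s):
    """Number of passes that fit into one second."""
    return 1.0 / pass_s


def loop_time(passes, pass_s):
    """Seconds needed to run the given number of passes back to back."""
    return passes * pass_s


if __name__ == "__main__":
    print(f"speedup: {speedup(AFL_PASS_S, GPU_PASS_S):.0f}x")
    print(f"passes per second: {passes_per_second(GPU_PASS_S):.0f}")
    print(f"1,000 passes: {loop_time(1000, GPU_PASS_S):.1f}s")
```

The ratio lands in the "several thousand fold" range, the rate is a little over 200 per second, and 1,000 iterations take roughly 4 1/2 seconds.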
>>>> >> >
>>>> >> > The resulting trade list matches that obtained by the AFL version of
>>>> >> > this code.
>>>> >> >
>>>> >> > I estimate that it is processing 32GB of bar data/sec.
>>>> >> >
>>>> >> > Getting this to work at peak performance was tricky. Most of what I
>>>> >> > have learned about code optimization does not apply.
>>>> >> >
>>>> >> > It uses AmiBroker to load the symbol data and perform calculations
>>>> >> > that do not depend on the optimization parameters. Once loaded into
>>>> >> > video memory, repeated passes can be made with different parameters,
>>>> >> > avoiding any overhead.
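
The split between parameter-independent work done once and cheap repeated passes can be sketched on the CPU in plain Python. This is not the plugin's code: the toy trading rule, the function names and the sample prices are all made up for illustration, and an ordinary list stands in for data resident in video memory.

```python
def precompute_changes(closes):
    """Parameter-independent step, done once: bar-to-bar price changes."""
    return [b - a for a, b in zip(closes, closes[1:])]


def run_pass(changes, threshold):
    """One optimization pass of a toy rule (hypothetical, for illustration):
    after any change larger than `threshold`, hold for the next bar and
    collect that bar's change as profit or loss."""
    pnl = 0.0
    for prev, nxt in zip(changes, changes[1:]):
        if prev > threshold:
            pnl += nxt
    return pnl


def optimize(closes, thresholds):
    """Precompute once, then reuse the same array for every parameter value."""
    changes = precompute_changes(closes)  # the expensive, one-time part
    return {t: run_pass(changes, t) for t in thresholds}


if __name__ == "__main__":
    closes = [10, 11, 13, 12, 15, 14]  # made-up prices
    print(optimize(closes, [0, 1.5, 5]))
```

The point of the structure is that `precompute_changes` never runs inside the parameter loop, so the per-pass cost depends only on the work that actually varies with the parameter.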
>>>> >> >
>>>> >> > For non backtest/optimization runs, the code just evaluates one
>>>> >> > symbol and passes the data back to AmiBroker buy/sell/short/cover
>>>> >> > arrays, making it easy to test, validate and visualize the trades.
>>>> >> > There is very little performance gain in this case.
>>>> >> >
>>>> >> > There are problems, however. To run optimizations at peak speed, I
>>>> >> > cannot use AmiBroker to calculate the optimization goal function.
>>>> >> > So, I am in the process of writing code to match signals and
>>>> >> > calculate the portfolio fitness function. Once I do this, I will be
>>>> >> > able to perform full optimizations and walk forwards 3 orders of
>>>> >> > magnitude faster than is possible with AmiBroker alone.
>>>> >> >
>>>> >> > Also, this is not general purpose code. Changing the system code
>>>> >> > means changing a dll written in C. However, there is no reason that
>>>> >> > this could not be made more general.
>>>> >> >
>>>> >> > I have made some prototypes of "Cuda" versions of basic AFL
>>>> >> > functions. The idea is to queue the function calls into a definition
>>>> >> > executed by a micro kernel running on the graphics cores. The result
>>>> >> > would be the ability to use the full power of the graphics cores by
>>>> >> > modifying AFL code to use Cuda aware versions with no changes to C
>>>> >> > code. It would be an interesting, but big project.
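
One way to picture "queue the function calls into a definition executed by a micro kernel" is a small deferred-execution sketch. It is a CPU-only illustration in Python, not the prototype described above: the `Program` class, the opcode names and the `run` interpreter are hypothetical, and a real version would hand the queued definition to code running on the graphics cores.

```python
class Program:
    """Records array operations instead of executing them immediately."""

    def __init__(self):
        self.ops = []  # queued (opcode, args) tuples; nothing runs yet

    def _emit(self, opcode, *args):
        self.ops.append((opcode, args))
        return len(self.ops) - 1  # slot index that will hold the result

    def load(self, field):
        return self._emit("load", field)

    def ma(self, src, period):
        return self._emit("ma", src, period)

    def gt(self, a, b):
        return self._emit("gt", a, b)


def moving_average(values, period):
    """Simple moving average; None until enough bars are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - period:i + 1]) / period)
    return out


def run(program, bars):
    """Stand-in for the micro kernel: interpret the queued ops in order."""
    slots = []
    for opcode, args in program.ops:
        if opcode == "load":
            slots.append(list(bars[args[0]]))
        elif opcode == "ma":
            src, period = args
            slots.append(moving_average(slots[src], period))
        elif opcode == "gt":
            a, b = args
            slots.append([x is not None and y is not None and x > y
                          for x, y in zip(slots[a], slots[b])])
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return slots[-1]


if __name__ == "__main__":
    p = Program()
    close = p.load("close")
    signal = p.gt(close, p.ma(close, 2))  # close above its 2-bar average
    print(run(p, {"close": [1, 2, 3, 2, 1]}))
```

Because the system is described as data (the queued ops) rather than compiled code, changing the trading rule means building a different `Program`, with no change to the interpreter.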
>>>> >> >
>>>> >>
>>>> >
>>>> >
>>>> >
>>>> > ------------------------------------
>>>> >
>>>> > Please note that this group is for discussion between users only.
>>>> >
>>>> > To get support from AmiBroker please send an e-mail directly to
>>>> > SUPPORT {at} amibroker.com
>>>> >
>>>> > For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
>>>> > http://www.amibroker.com/devlog/
>>>> >
>>>> > For other support material please check also:
>>>> > http://www.amibroker.com/support.html
>>>> > Yahoo! Groups Links
>>>> >
>>>> >
>>>> >
>>>>
>>>
>>>
>>>
>>
> 
> 
