Overhead is not a constant ... It is a function of a variety of things, not all of which I am even aware of, and some of which can be fairly significant ...
--- In amibroker@xxxxxxxxxps.com, "Ton Sieverding" <ton.sieverding@...> wrote:
>
> Of course not. You'll always keep the overhead as a constant. But as a rule of thumb it works fine for me in situations where time is the bottleneck ...
>
> Regards, Ton.
>
> ----- Original Message -----
> From: Fred Tonetti
> To: amibroker@xxxxxxxxxps.com
> Sent: Wednesday, June 18, 2008 2:25 PM
> Subject: RE: [amibroker] Multi Core Optimization, L2 Cache & Optimization Run Times
>
>
>
> The relationship isn't quite that clear.
>
>
>
> I'm still playing with this feature for IO, but if you are using AB's exhaustive search for a variety of things and have a multiple CPU / core machine, try MCO on some of your optimization problems.
>
>
>
>
>
> --------------------------------------------------------------------
>
> From: amibroker@xxxxxxxxxps.com [mailto:amibroker@xxxxxxxxxps.com] On Behalf Of Ton Sieverding
> Sent: Wednesday, June 18, 2008 4:29 AM
> To: amibroker@xxxxxxxxxps.com
> Subject: Re: [amibroker] Multi Core Optimization, L2 Cache & Optimization Run Times
>
>
>
> Fred, does this show me that 'doubling the cores equals halving the time'? :-)
>
>
>
> Regards, Ton.
>
>
>
>
> ----- Original Message -----
> From: Fred Tonetti
> To: amibroker@xxxxxxxxxps.com
> Sent: Wednesday, June 18, 2008 1:10 AM
> Subject: RE: [amibroker] Multi Core Optimization, L2 Cache & Optimization Run Times
>
>
>
> Here are some results I got with my new toy.
>
> This is using a reasonably complex system on ~500 symbols over 10 years, i.e. ~2500 bars ...
>
> Cores   Time   Percent
> 1       218
> 2       114    52.29%
> 3        79    36.24%
> 4        62    28.44%
> 5        52    23.85%
> 6        46    21.10%
> 7        41    18.81%
> 8        37    16.97%
>
> As expected, the higher you go the more overhead there is, but improvements like this are still well worth the effort, especially on a single box.
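One way to put a number on the "overhead" in the table above is to turn each timing into a speedup, a per-core efficiency, and an implied fixed (non-parallel) share of the run. The sketch below is only an illustration: the timings are copied from the table, but the Amdahl's-law model and the names `timings` and `scaling_report` are assumptions added here, not something Fred's post or MCO uses.

```python
# Timings copied from the table above: cores -> run time (units as reported).
timings = {1: 218, 2: 114, 3: 79, 4: 62, 5: 52, 6: 46, 7: 41, 8: 37}


def scaling_report(timings):
    """Return a list of (cores, speedup, efficiency, serial_fraction) tuples.

    serial_fraction is an Amdahl's-law estimate (an assumed model):
        1 / speedup = s + (1 - s) / cores
    It is None for the single-core row, where it is undefined.
    """
    base = timings[1]
    rows = []
    for cores in sorted(timings):
        speedup = base / timings[cores]
        efficiency = speedup / cores
        if cores == 1:
            serial = None
        else:
            # Solve the Amdahl equation above for s.
            serial = (1 / speedup - 1 / cores) / (1 - 1 / cores)
        rows.append((cores, speedup, efficiency, serial))
    return rows


if __name__ == "__main__":
    print("cores  speedup  efficiency  implied serial share")
    for cores, speedup, eff, serial in scaling_report(timings):
        serial_txt = "   n/a" if serial is None else f"{serial:6.1%}"
        print(f"{cores:5d}  {speedup:7.2f}  {eff:10.1%}  {serial_txt}")
```

Under that assumed model the implied fixed share stays in the neighborhood of 5% across all rows, while per-core efficiency drops as cores are added. That fits "more overhead the higher you go", but it is only one possible reading of the numbers.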
>
>
>
> ------------------------------------------------------------------
>
> From: amibroker@xxxxxxxxxps.com [mailto:amibroker@xxxxxxxxxps.com] On Behalf Of Steve Dugas
> Sent: Saturday, June 14, 2008 7:00 PM
> To: amibroker@xxxxxxxxxps.com
> Subject: Re: [amibroker] Multi Core Optimization, L2 Cache & Optimization Run Times
>
> Very interesting Fred, thanks! This looks encouraging, at least for us EOD guys.
>
> One thing I notice - at 32 tickers, it looks like the curve has "recovered" to what you might expect to see even if there was no dent at 16. And also, after 32 the curve seems to get a second wind, i.e. it "inverts" and the time per symbol decreases *more* rapidly as more tickers are added. What do you think might account for that? Is it just due to the log nature of the chart? Thanks!
>
> Steve
>
> ----- Original Message -----
> From: Fred Tonetti
> To: amibroker@xxxxxxxxxps.com
> Sent: Saturday, June 14, 2008 5:49 PM
> Subject: [amibroker] Multi Core Optimization, L2 Cache & Optimization Run Times
>
> Given TJ's comments about:
>
> - The amount of memory utilized in processing symbols of data
> - Whether or not this would fit in the L2 cache
> - The effect it would have on optimizations when it didn't
>
> I finally got around to running a little benchmark for Multi Core Optimization using the program I wrote and posted (MCO), a new version of which I'll be posting shortly.
>
> These tests were run under the following conditions:
>
> - A less than state of the art laptop with
>     o Core 2 Duo 1.86 GHz processor
>     o 2 MB of L2 cache
> - Watch lists of symbols, each of which
>     o Contains twice as many symbols as the previous one, i.e. 1, 2, 4, 8, 16, 32, 64, 128, 256
>     o Contains symbols with ~5000 bars of data each
>
> Given the above:
>
> - Each symbol should require 160,000 bytes, i.e. ~5,000 bars * 32 bytes per bar
> - Loading more than 13 symbols should cause L2 cache misses to occur
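The arithmetic behind the 13-symbol estimate can be checked with a few lines. This is a minimal sketch, assuming "2 MB" means 2 * 1024 * 1024 bytes and that the whole L2 cache is available for price data (in practice code and other data share it). The constant and function names are made up for illustration.

```python
# Figures taken from the post; the 2 MB = 2 * 1024 * 1024 reading is an assumption.
L2_CACHE_BYTES = 2 * 1024 * 1024
BARS_PER_SYMBOL = 5000
BYTES_PER_BAR = 32

bytes_per_symbol = BARS_PER_SYMBOL * BYTES_PER_BAR      # 160,000 bytes
symbols_in_cache = L2_CACHE_BYTES // bytes_per_symbol   # whole symbols that fit


def fits_in_l2(symbol_count):
    """True if the data for symbol_count symbols fits in the assumed L2 size."""
    return symbol_count * bytes_per_symbol <= L2_CACHE_BYTES


if __name__ == "__main__":
    print(f"bytes per symbol: {bytes_per_symbol:,}")
    print(f"symbols that fit in L2: {symbols_in_cache}")
    # Watch list sizes used in the benchmark.
    for n in (1, 2, 4, 8, 16, 32, 64, 128, 256):
        status = "fits" if fits_in_l2(n) else "exceeds L2"
        print(f"{n:4d} symbols: {n * bytes_per_symbol:>12,} bytes  {status}")
```

With these figures the 8-symbol watch list is the last one that fits and the 16-symbol list is the first that does not.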
>
>
> Results:
>
> - See the attached data & chart
>
> There are several interesting things I find regarding the results.
>
> - The "dent" in the curve, looking left to right, occurs right where you'd think it would, between 8 symbols and 16 symbols, i.e. from the point at which all data can be loaded to and accessed from the L2 cache to the point where it no longer can.
> - The "dent" occurs in the same place running either one or two instances of AB.
> - The "dent", while clearly visible, is hardly traumatic in terms of run times.
> - The relationship of run times between running one and two instances of AB is consistent at 40% savings in run time, regardless of the number of symbols.
> - This is also in line with how much CPU is utilized when running one instance of AB, which on the test machine is typically in the 54 - 60% range.
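The link between the last two points can be made explicit with a rough calculation. The sketch below rests on a simplifying assumption that is not stated in the post: a single instance keeps only the observed share of the machine's total CPU busy, two instances keep all of it busy, and run time scales inversely with the CPU share in use. `expected_saving` is a hypothetical helper name.

```python
def expected_saving(single_instance_cpu_share):
    """Estimated run-time saving from a second instance (assumed model).

    If one instance uses `single_instance_cpu_share` of total CPU and two
    instances saturate the machine, the same work should finish in that
    fraction of the single-instance time, so the saving is one minus it.
    """
    return 1.0 - single_instance_cpu_share


if __name__ == "__main__":
    # CPU utilization range reported for one instance on the test machine.
    for share in (0.54, 0.60):
        print(f"one instance at {share:.0%} CPU: "
              f"expected saving {expected_saving(share):.0%}")
```

Under that assumption the reported 54 - 60% utilization corresponds to an expected saving of roughly 40 to 46%, which brackets the 40% figure above.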
>
> I have a new toy that I'll be trying these benchmarks on again shortly, i.e. a dual core 2 duo quad 3.0 GHz.
>
>
>
>
>
> ------------------------------------------------------------------
>
> I am using the free version of SPAMfighter for private users.
> It has removed 480 spam emails to date.
> Paying users do not have this message in their emails.
> Try SPAMfighter for free now!