Hi Paul - That is very informative, thank you
so much for taking the time to write it. It sounds a bit complicated for me (at
least, to be *sure* I am doing it right so that I do not
destroy 3 or 4 PCs during the learning process 8-) ).
So I think I am better off putting it on the back burner for now,
but I will definitely save your answer and may try it down the road somewhere.
Thanks again!
Steve
----- Original Message -----
Sent: Monday, May 26, 2008 7:44 AM
Subject: RE: [amibroker] Re: Dual-core vs. quad-core
Steve,
My knowledge of OC is quite limited; I have only OCed
3 PCs in the last 4 to 5 years. Fortunately, they are all still working, so
you can say that my experiences have been favourable. OC reminds me a lot of
car hot-rodding in my younger days. They are quite similar: both are attempts
to push a basically mass-manufactured product to performance higher
than its specifications. Both are based on home-grown wisdom rather
than institutionalised research. OCing is basically raising both the CPU
clock speed and that of the memory bus. My methods, all taken from what I
have read on the internet, are always based on first overclocking the memory
controller hub (MCH) and, once that is done, overclocking the CPU by increasing
the multiplier (CPU speed is always set as a multiple of the bus speed because
they need to be synchronised).
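To make the multiplier relationship concrete, here is a small Python sketch. It assumes the commonly quoted stock settings for the Q6600 that Steve asks about further down (a 266.67 MHz bus clock and a 9x multiplier, giving roughly 2.4 GHz). The function name and the raised bus value are made up for illustration and are not a recommended setting.

```python
def core_clock_mhz(bus_mhz: float, multiplier: float) -> float:
    """Core clock is the bus clock times the CPU multiplier."""
    return bus_mhz * multiplier


# Assumed stock values for a Q6600: 266.67 MHz bus, 9x multiplier.
stock = core_clock_mhz(266.67, 9)

# Hypothetical overclock: raise the bus to 333 MHz, keep the multiplier at 9x.
overclocked = core_clock_mhz(333.0, 9)

gain = overclocked / stock - 1
print(f"stock:       {stock:.0f} MHz")
print(f"overclocked: {overclocked:.0f} MHz ({gain:.0%} faster)")
```

Since the bus clock feeds both the memory side and the CPU, a sketch like this only covers the CPU half of the picture; the memory has to be stable at the raised bus speed as well.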
The danger with OC is always that of
overheating, firstly of the CPU and secondly of the MCH. So choosing a MB that
has decent cooling features, particularly for the MCH, is of the utmost
importance. (CPUs are always well looked after by MB manufacturers, and
increasingly the MCH is going the same way.) In addition, the faster the memory,
the more successful the exercise is. So my advice is always to get the fastest
memory you can afford. (This is even more so given what Tomasz has said
recently in his AB performance tests.)
Before you overclock, you need to download a few
tools:
1. CPU-Z
2. Core Temp
3. a memory testing tool
You also need to find a set of instructions for your
MB. I found this set of instructions for my MB pretty good: http://www.hardforum.com/showthread.php?t=1169366 In
these instructions, you will find where to download the tools I mentioned, as
well as a good methodology to follow. You may be able to find instructions
specific to your MB on this site.
Also, I found that disabling all devices that I don't
need helps. These include the parallel and serial ports, basically all the
things I don't use and can disable through the BIOS
settings.
Testing: I only use AB for stress and performance
testing (I use the recommended software from the site for diagnostic
tests), because that is what I'm OCing for. What I would suggest is
to take a few of your AFLs and insert a few GetPerformanceCounter
statements (an AB function) in them. These should include a relatively
simple AFL, a long and complicated AFL testing lots of symbols (to
test memory access performance) and one that is somewhere
in between. At the same time, monitor the temperatures. These are far
more meaningful tests than the stress tests that most OC sites
recommend.
I think dingo is
quite an expert in this area. Maybe he will say a few
words.
Cheers
Paul.
Hi Paul - I found your comment about
overclocking interesting. I have googled around a bit but find that most of
the discussion is over my head. For example, the Overclockers
Forum
discusses overclocking the Intel Q6600 chip in
my new computer, and people are claiming to get as much as 3.8 GHz out of this
2.4 GHz chip. If you can find the time, would you mind saying a few words
about overclocking, how it is done, and what the dangers/limits are, etc.? Do
you need special software to monitor the core temps? Thanks!
Steve
----- Original Message -----
Sent: Wednesday, May 14, 2008 2:12 AM
Subject: RE: [amibroker] Re: Dual-core vs. quad-core
I haven't noticed any slowdown when I run 2 instances
of AB optimizing on an almost continuous basis on my Core 2 Duo.
I have 4 MB of L2 cache. In fact, with overclocking, I'm able to increase
the core speed significantly, and AB optimization is noticeably faster,
without pushing the temps above 50 deg C.
Hello,
I just ran the same code on my relatively new notebook
(Core 2 Duo 2GHz (T7250)) and the loop takes less than 2ns per
iteration (3x speedup). So it looks like the data sits entirely inside
the cache. This Core 2 has 2MB of cache and that's 4 times more than
on the Athlon x2 I have.
> If what you say is true, and one core alone fills the memory
> bandwidth, then there should be a net loss of performance while
> running two copies of ami.
It depends on the complexity of the formula and the amount of data per
symbol you are using. As each array element has 4 bytes, to fill 4 MB
of cache you would need 1 million array elements, or 100 arrays each
having 10000 elements, or 10 arrays each having 100K elements.
Generally speaking, people testing on EOD data, where 10 years is just
2600 bars, should see a speed-up. People using very, very long intraday
data sets may see degradation, though a barely noticeable one.
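The cache arithmetic above can be re-checked with a short Python sketch. The 4 bytes per element and the 4 MB cache size come from the message; the helper names and the array counts chosen for the EOD and intraday cases are assumptions made purely for illustration.

```python
BYTES_PER_ELEMENT = 4  # per the message above: each array element has 4 bytes


def elements_that_fit(cache_bytes: int) -> int:
    """Number of array elements that fit in a cache of the given size."""
    return cache_bytes // BYTES_PER_ELEMENT


def working_set_bytes(num_arrays: int, bars: int) -> int:
    """Total bytes used by num_arrays arrays of `bars` elements each."""
    return num_arrays * bars * BYTES_PER_ELEMENT


def fits_in_cache(num_arrays: int, bars: int, cache_bytes: int) -> bool:
    """True if the whole working set is no larger than the cache."""
    return working_set_bytes(num_arrays, bars) <= cache_bytes


CACHE_BYTES = 4 * 1024 * 1024  # 4 MB cache

# (label, number of arrays, bars per array); array counts are hypothetical.
cases = [
    ("EOD, 10 years, 10 arrays", 10, 2_600),
    ("100 arrays x 10000 bars", 100, 10_000),
    ("10 arrays x 100K bars", 10, 100_000),
    ("long intraday, 20 arrays x 500K bars", 20, 500_000),
]

print(f"elements that fit in cache: {elements_that_fit(CACHE_BYTES)}")
for label, num_arrays, bars in cases:
    size_kb = working_set_bytes(num_arrays, bars) / 1024
    verdict = "fits" if fits_in_cache(num_arrays, bars, CACHE_BYTES) else "does not fit"
    print(f"{label}: {size_kb:.0f} KB, {verdict}")
```

This ignores everything else competing for the cache (code, other processes, AmiBroker's own data), so it is an upper bound on what fits rather than a prediction.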
Best regards,
Tomasz Janeczko
amibroker.com
----- Original Message -----
From: "dloyer123" <dloyer123@xxxxxxcom>
To: <amibroker@xxxxxxxxxps.com>
Sent: Tuesday, May 13, 2008 8:12 PM
Subject: [amibroker] Re: Dual-core vs. quad-core
> Nice, tight loop. It is good to see someone that has made the effort
> to make the most out of every cycle and the result shows.
>
> My new E8400 (45nm 3GHz, dual core) system should arrive tomorrow.
> The first thing I will do will be to benchmark it running ami. I run
> portfolio backtests over a few years of 5 minute data over a thousand
> or so symbols. Plenty of data to overflow the cache, but still fit
> in memory. No trig.
>
> I'll post what I find.
>
> If what you say is true, and one core alone fills the memory
> bandwidth, then there should be a net loss of performance while
> running two copies of ami.
>
>
>
> --- In amibroker@xxxxxxxxxps.com, "Tomasz Janeczko" <groups@xxx> wrote:
>>
>> Hello,
>>
>> FYI: a SINGLE processor core running an AFL formula is able to saturate
>> memory bandwidth in the majority of the most common operations/functions
>> if the total array sizes used in a given formula exceed the DATA cache size.
>>
>> You need to understand that AFL runs with native assembly speed
>> when using array operations.
>> A simple array multiplication like this
>>
>> X = Close * H; // array multiplication
>>
>> gets compiled to just 8 assembly instructions:
>>
>> loop: 8B 54 24 58 mov edx,dword ptr [esp+58h]
>> 00465068 46 inc esi ; increase counters
>> 00465069 83 C0 04 add eax,4
>> 0046506C 3B F7 cmp esi,edi
>> 0046506E D9 44 B2 FC fld dword ptr [edx+esi*4-4] ; get element of close array
>> 00465072 D8 4C 08 FC fmul dword ptr [eax+ecx-4] ; multiply by element of high array
>> 00465076 D9 58 FC fstp dword ptr [eax-4] ; store result
>> 00465079 7C E9 jl loop ; continue until all elements are processed
>>
>> As you can see there are three 4 byte memory accesses per loop iteration
>> (2 reads each 4 bytes long and 1 write 4 bytes long)
>>
>> On my (2 year old) 2GHz Athlon x2 64 a single iteration of this loop
>> takes 6 nanoseconds (see benchmark code below).
>> So, during 6 nanoseconds we have an 8 byte read and a 4 byte store.
>> That's (8/(6e-9)) bytes per second = 1333 MB per second read
>> and 667 MB per second write simultaneously, i.e. 2GB/sec combined!
>>
>> Now if you look at memory benchmarks:
>> http://community.compuserve.com/n/docs/docDownload.aspx?webtag=ws-pchardware&guid=6827f836-8c33-4063-aaf5-c93605dd1dc6
>> you will see that 2GB/s is THE LIMIT of system memory speed on
>> Athlon x64 (DDR2 dual channel)
>> And that's considering the fact that Athlon has a superior-to-Intel
>> on-die integrated memory controller (HyperTransport)
>>
>> // benchmark code - for accurate results run it on LARGE arrays
>> // (intraday database, 1-minute interval, 50K bars or more)
>> GetPerformanceCounter(1);
>> for(k = 0; k < 1000; k++ ) X = C * H;
>> "Time per single iteration [s]="+1e-3*GetPerformanceCounter()/(1000*BarCount);
>>
>> Only really complex operations that use *lots* of FPU (floating point) cycles
>> such as trigonometric (sin/cos/tan) functions are slow enough for the memory
>> to keep up.
>>
>> Of course one may say that I am using an "old" processor, and new
>> computers have faster RAM, and that's true,
>> but processor speeds increase FASTER than bus speeds and the gap
>> between processor and RAM
>> becomes larger and larger, so with newer CPUs the situation will be
>> worse, not better.
>>
>>
>> Best regards,
>> Tomasz Janeczko
>> amibroker.com
>> ----- Original Message -----
>> From: "dloyer123" <dloyer123@x..>
>> To: <amibroker@xxxxxxxxxps.com>
>> Sent: Tuesday, May 13, 2008 5:02 PM
>> Subject: [amibroker] Re: Dual-core vs. quad-core
>>
>>
>> > All of the cores have to share the same front bus and northbridge.
>> > The northbridge connects the cpu to memory and has limited bandwidth.
>> >
>> > If several cores are running memory hungry applications, the front
>> > bus will saturate.
>> >
>> > The L2 cache helps for most applications, but not if you are burning
>> > through a few G of quote data. The L2 cache is just 4-8MB.
>> >
>> > The newer multi core systems have much faster front buses and that
>> > trend is likely to continue.
>> >
>> > So, it would be nice if AMI could support running multi cores, even
>> > if it was just running different optimization passes on different
>> > cores. That would saturate the front bus, but take advantage of all
>> > of the memory bandwidth you have. It would really help those multi
>> > day walkforward runs.
>> >
>> >
>> >
>> > --- In amibroker@xxxxxxxxxps.com, "markhoff" <markhoff@> wrote:
>> >>
>> >>
>> >> If you have a runtime penalty when running 2 independent AB jobs on a
>> >> Core Duo CPU it might be caused by too little memory (swapping to disk)
>> >> or other tasks which are also running (e.g. a web browser, audio
>> >> streamer or whatever). You can check this with a process explorer
>> >> which shows each task's CPU utilisation. Similarly, 4 AB jobs on a Core
>> >> Quad should have nearly no penalty in runtime.
>> >>
>> >> Tomasz stated that multi-thread optimization does not scale well with
>> >> the CPU number, but it is not clear to me why this is the case. In my
>> >> understanding, AA optimization is a sequential process of running the
>> >> same AFL script with different parameters. If I have an AFL with
>> >> significantly long runtime per optimization step (e.g. 1 minute) the
>> >> overhead for the multi-threading should become quite small and
>> >> independent tasks should scale nearly with the number of CPUs (as long
>> >> as there is sufficient memory, n threads might need n-times more
>> >> memory than a single thread). For sure the situation is different if
>> >> my single optimization run takes only a few millisecs or seconds, then
>> >> the overhead for multi-thread-management goes up ...
>> >>
>> >> Maybe Tomasz can give some detailed comments on that issue?
>> >>
>> >> Best regards,
>> >> Markus
>> >>
>> >
>> >
>> > ------------------------------------
>> >
>> > Please note that this group is for discussion between users only.
>> >
>> > To get support from AmiBroker please send an e-mail directly to
>> > SUPPORT {at} amibroker.com
>> >
>> > For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
>> > http://www.amibroker.com/devlog/
>> >
>> > For other support material please check also:
>> > http://www.amibroker.com/support.html
>> >
>> > Yahoo! Groups Links
>> >
>> >
>> >
>
>
>
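The bandwidth estimate in Tomasz's quoted message (two 4-byte reads and one 4-byte write per loop iteration, 6 ns per iteration) can be re-derived with a short Python sketch. The per-iteration formula mirrors the quoted benchmark line; the function names and the 300 ms timer reading are hypothetical, chosen only so that the numbers land on the 6 ns quoted in the thread.

```python
def seconds_per_iteration(elapsed_ms: float, loops: int, bar_count: int) -> float:
    """Same arithmetic as the quoted benchmark: 1e-3 * ms / (loops * BarCount)."""
    return 1e-3 * elapsed_ms / (loops * bar_count)


def bandwidth_mb_per_s(bytes_per_iteration: int, iteration_s: float) -> float:
    """Bytes moved per iteration divided by iteration time, in MB/s (1 MB = 1e6 bytes)."""
    return bytes_per_iteration / iteration_s / 1e6


# Hypothetical timer reading: 300 ms for 1000 loops over 50,000 bars.
iteration_s = seconds_per_iteration(300.0, 1000, 50_000)

read_mb = bandwidth_mb_per_s(8, iteration_s)   # 2 reads x 4 bytes
write_mb = bandwidth_mb_per_s(4, iteration_s)  # 1 write x 4 bytes
total_mb = read_mb + write_mb

print(f"time per iteration: {iteration_s * 1e9:.1f} ns")
print(f"read:  {read_mb:.0f} MB/s")
print(f"write: {write_mb:.0f} MB/s")
print(f"total: {total_mb:.0f} MB/s")

# The notebook result reported elsewhere in the thread (under 2 ns per
# iteration) would imply at least this much traffic if it all went to RAM:
notebook_mb = bandwidth_mb_per_s(12, 2e-9)
print(f"implied at 2 ns: {notebook_mb:.0f} MB/s")
```

The 2 ns case implies about three times the traffic of the 6 ns case, which is consistent with the explanation given in the thread that the data was being served from cache rather than from main memory.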
Please note that this group is for discussion between users only.
To get support from AmiBroker please send an e-mail directly to
SUPPORT {at} amibroker.com
For NEW RELEASE ANNOUNCEMENTS and other news always check DEVLOG:
http://www.amibroker.com/devlog/
For other support material please check also:
http://www.amibroker.com/support.html