Thursday, 12 September 2013

Why is the running time of two instances on two cores larger than one instance only?

I am facing a somewhat strange problem. Say I have an executable. When
I run it on a computer with two cores, the process finishes in some time
t1. But if I run two instances of the same executable (in different
directories, launched either manually or with GNU parallel), the running
time of each process is not close to t1 but noticeably larger, sometimes
close to 1.9 t1. I should note that the two cores are physical (MacBook
Pro mid-2009, Mountain Lion). I have also tested this behaviour on a
Linux machine with 8 cores: with 1, 2, 3, or 4 instances, the running
time per instance stays around t1, but with 5, 6, 7, or 8 instances it
grows increasingly larger than t1.
I detected this behaviour while running a simulation and was able to
reduce it to the simple test case presented below. I wanted to compare
std::vector, std::array, and static and dynamic arrays at several
optimization levels. The test code is the following:
#include <iostream>
#include <vector>
#include <array>
#include <cstdlib>
struct Particle {
private:
  int nc;
public:
  void reset(void) { nc = 0; }
  void set(const int & val) { nc = val; }
};
#define N 10000 // number of particles
#define M 200000 // number of steps
#define STDVECTOR 0
#define STDARRAY 0
#define ARRAY 1
#define DYNARRAY 0
int main (void)
{
#if STDVECTOR
  std::vector<Particle> particles(N);
#elif STDARRAY
  std::array<Particle, N> particles;
#elif ARRAY
  Particle particles[N];
#elif DYNARRAY
  Particle *particles = new Particle [N];
#endif
  // The original listing is truncated here; a minimal loop body that
  // exercises the particles is restored below.
  for (int step = 0; step < M; ++step) {
    for (int i = 0; i < N; ++i) {
      particles[i].reset();
      particles[i].set(step);
    }
  }
#if DYNARRAY
  delete [] particles;
#endif
  return 0;
}
