Today we will practice concurrent programming by rewriting our serial code into a version parallelized with MPI. The problem to be solved is simple linear advection with periodic boundary conditions. We will limit our attempts to the explicit version of the solver, but keep in mind that the implicit version can be parallelized as well.
The usual approach to parallelizing problems that “solve something” over a certain domain is to split this domain among the processors. The assumption is that the computational work is proportional to the number of unknowns distributed over this domain. Those are usually associated with the computational mesh. Consequently, the first step involves partitioning the mesh. Such a partitioning should fulfill two main criteria:
* distribute the workload equally,
* minimize the resulting communication.
Our mesh is rather simple, so partitioning will not be a problem.
Start by adding MPI_Init(&argc, &argv); at the beginning and MPI_Finalize(); at the end of your code (in C, MPI_Init takes pointers to the arguments of main).
It is good to check at this point that all components compile and work.
Knowing the size of the problem, the number of processors, and the rank of the current process, determine the local problem size and allocate memory accordingly. Then initialize the problem; each processor works with its own chunk of data.
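One possible way to compute the local problem size is a block decomposition in which the first few ranks take one extra point when the division is not exact. The sketch below is only an illustration; the helper names local_size and local_offset, and their arguments, are assumptions and not part of the original serial code.

```c
#include <assert.h>

/* Hypothetical helper: number of mesh points owned by `rank` when
 * n_global points are split as evenly as possible among n_procs ranks. */
int local_size(int n_global, int n_procs, int rank)
{
    assert(n_procs > 0 && rank >= 0 && rank < n_procs);
    int base = n_global / n_procs;   /* points every rank gets          */
    int rest = n_global % n_procs;   /* leftover points to hand out     */
    return base + (rank < rest ? 1 : 0);
}

/* Hypothetical helper: global index of the first point owned by `rank`. */
int local_offset(int n_global, int n_procs, int rank)
{
    assert(n_procs > 0 && rank >= 0 && rank < n_procs);
    int base = n_global / n_procs;
    int rest = n_global % n_procs;
    /* Ranks below `rest` each own one extra point. */
    return rank * base + (rank < rest ? rank : rest);
}
```

With 10 points on 3 processes this gives chunks of 4, 3 and 3 points starting at global indices 0, 4 and 7. The offset is what each process needs in order to evaluate the initial condition at the correct physical coordinates.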
Redesign the dump_solution function so that it can be used in parallel. You can use MPI_Send and MPI_Recv to synchronize the output.
Print the current solution u together with the rank, so that we can verify that all is OK using our favorite plotting program.
We need to modify one of the functions performing the explicit solution step. Since our problem is periodic and our partitioning is “continuous”, each process will need to communicate with the next one, and the last process with the first. This brings about a problem. We could organize the communication such that the first process sends to the second, the second receives the message and then sends one to the next, and so on. This does not seem like a good idea, since the processes would have to synchronize and then wait for other processes to finish. To counter that, we will apply non-blocking communication. In non-blocking communication the send/receive operations do not block, but merely indicate that the buffer should be sent from or received into, with control immediately returning to the caller. The price is that before overwriting the send buffer or reading the received data we need to check whether the communication operation has completed. This is done with MPI_Wait, MPI_Status and MPI_Request.
We will discuss positioning the calls while coding this out.
So the code is ready and compiles without errors. But is it working as expected? In general, debugging parallel code adds a new dimension of problems, as there are a number of new things that could go wrong.