Thomas Preud'homme
619fb7aeba
[commtech] Also compile gomp_stream_64_comm
...
Add gomp_stream_64_comm to the list of communication techniques to
compile.
2012-07-07 23:29:11 +02:00
Thomas Preud'homme
467d0b4122
[commtech] Fixes in gomp_stream
...
* Stick to the sizes used in gomp_stream
* Release data when they are *all* received
2012-07-07 23:26:24 +02:00
Thomas Preud'homme
d8c16a4aa3
Merge branch 'bqv2_buf_end'
2012-07-07 23:14:15 +02:00
Thomas Preud'homme
df09d89933
[commtech] Use only 1 thread per core
...
Creating 2 threads per core for the purpose of receiving while sending is
plain stupid. First, it needs 2 threads synchronizing with each other,
which has a cost. Second, since only one thread can run at a time, the
threads slow each other down (using BatchQueue where the sender is on the
same core as the receiver yields bad performance). This patch removes all
this complexity to have one thread receive, compute and then resend
data, which improves performance dramatically.
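
As an illustration only, the single-thread stage described above could look like the sketch below. Everything in it is hypothetical (the `stage_queue` type, `stage_recv`, `stage_send`, `run_stage`, and the in-memory queue itself); it only shows one thread receiving, computing and resending in sequence, and is not the code touched by this commit.

```c
#include <assert.h>
#include <stddef.h>

#define STAGE_QUEUE_CAP 16 /* hypothetical capacity, for illustration only */

/* Hypothetical stand-in for a communication channel: a bounded in-memory queue. */
struct stage_queue {
    long items[STAGE_QUEUE_CAP];
    size_t head; /* index of the next item to receive */
    size_t tail; /* index of the next free slot */
};

/* Receive one value; returns 0 when the queue is empty. */
static int stage_recv(struct stage_queue *q, long *out)
{
    if (q->head == q->tail)
        return 0;
    *out = q->items[q->head++];
    return 1;
}

/* Send one value to the next stage. */
static void stage_send(struct stage_queue *q, long value)
{
    assert(q->tail < STAGE_QUEUE_CAP);
    q->items[q->tail++] = value;
}

/* One thread per core: receive, compute, then resend, with no helper thread. */
static size_t run_stage(struct stage_queue *in, struct stage_queue *out,
                        long (*compute)(long))
{
    size_t processed = 0;
    long value;

    while (stage_recv(in, &value)) {
        stage_send(out, compute(value));
        processed++;
    }
    return processed;
}

/* Example computation used by the stage. */
static long add_one(long value)
{
    return value + 1;
}
```

Because the same thread does all three steps, no synchronization between a receiving thread and a sending thread is needed inside a stage.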
2012-07-07 23:14:08 +02:00
Thomas Preud'homme
4914b0dcdd
Add CSQ (2/1) and CSQ (2/32), Del CSQ (2/2)
2012-03-27 00:31:16 +02:00
Thomas Preud'homme
a80decaef4
[commtech] Provide 64 cache lines version of algos
...
* Provide for BatchQueue, CSQ, FastForward, MCRingBuffer and GOMP stream
a version using 64 cache lines in total for all buffers.
* Rename common versions from _common_comm.h to _common.h to avoid
considering them as communication techniques in their own right
2012-03-26 16:44:30 +02:00
Thomas Preud'homme
c37c100355
[commtech] Initialize vector in calc_mat.c
2012-03-26 16:14:23 +02:00
Thomas Preud'homme
09afc1ed2b
parsing.sh: Remove assumption about calc args
...
Calc can have several args for useless_loop and line prods and for comm
and barriere bench. Hence:
* Change use_histo to reflect that
* Set list of args per bench/prod instead of globally
* No need for the argument (since there are several) in create_complex_dat_body
2012-03-26 16:13:23 +02:00
Thomas Preud'homme
74f5176116
parsing.sh: Remove a few assumptions
...
Remove assumptions around barriere bench:
* 2 memory hierarchies are not always tested -> numCacheConfigs
* barriereList -> ${bench}List
* Size of the calc argument -> *
2012-03-26 16:04:31 +02:00
Thomas Preud'homme
40dfd58c86
parsing.sh: support batch_queue_* for barriere
...
Count batch_queue_* in barriere bench
2012-03-26 13:20:18 +02:00
Thomas Preud'homme
758198c2b0
[commtech] Add missing .c for new CSQ configs
2012-03-20 12:16:10 +01:00
Thomas Preud'homme
7087998fc6
[commtech] Add the new configs for compilation
2012-03-20 12:05:12 +01:00
Thomas Preud'homme
5840b57937
[commtech] Provide more CSQ configs
...
* Rename CSQ configs to csq_<nbr_buffers>_<size_buffer>_comm.h
* Add several configs
* Default config is csq_comm.h
2012-03-20 11:07:05 +01:00
Thomas Preud'homme
b47a17c6da
Revert junk from "Fix including perf stat in logs"
...
This partially reverts commit 65a2ed9357.
It removes all the changes to the configuration variables at the top of
the file, which were not supposed to be committed.
2012-03-20 10:38:00 +01:00
Thomas Preud'homme
4cd4df3d1c
Remove useless .main.d file
2012-03-19 20:40:24 +01:00
Thomas Preud'homme
f9aa3b227a
CSQ's article suggests SUB_SLOTS should be 64.
2012-03-19 20:40:13 +01:00
Thomas Preud'homme
65a2ed9357
Fix including perf stat in logs
...
This commit fixes commit b0441d7a1c
2012-03-14 12:46:47 +01:00
Thomas Preud'homme
30e8b2a2c6
Automate test of pipeline_template
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
4fa9811144
Support NB_CORES between 1 and 12 out of the box
...
Prepare an "omp parallel" pragma for NB_CORES between 2 and 12. This
avoids needing any change in the file for NB_CORES between 1 and 12.
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
dc0931cde0
Remove debugging printf
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
75bd067571
Check the result of the computation
...
Make sure the result of the computation is always the same
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
8010f34abe
Stage time can be made smaller
...
Allow stage time to be smaller by adjusting after the computation is done
instead of before.
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
cacde80b30
Allow automatic test run for lattice
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
a5f52a6c58
Add the never run lattice.cpp
...
Add the never-run lattice.cpp on which lattice.c is based.
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
51cbe32eda
Update .gitignore
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
50778ca358
Remove fmr_omp-str_base
...
Stop worrying about keeping bit identical fmr_omp-str_base
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
502ec92654
Update Makefile for fmr_omp-str_base generation
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
e07d4d39ab
Add template of pipeline parallelism friendly code
...
pipeline_template.c is an example of a pipeline parallelism friendly code in the
sense that it can't be parallelized by any other known parallelization technique.
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
a9793430f9
Add pipeline computation of lattice
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
c7eef474b5
Remove addition of $HOME/local/bin to the PATH
...
Remove addition of $HOME/local/bin to the PATH since it's already in the PATH now
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
23670f3d72
Revert "Add an implementation to compute n'th digit of pi"
...
This reverts commit f480a5e3c2dd2bc23422c6a1c0acea9b3df428c2.
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
da08852ecc
Add an implementation to compute n'th digit of pi
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
cf816f0685
Add a less naïve script to compare BatchQueue to GOMP native
...
communication library *and* to sequential code by performing a
more useful computation.
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
03b32a950a
Add a simple test to try automatic usage of BatchQueue through OpenMP
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
9e1b9aa1b1
Make the script work with GOMP_stream* and GOMP_batchQ* functions
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
57820691d2
Use CFLAGS in Makefile
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
31f7d7760f
Makefile to compile 'n patch FMradio w/ BatchQueue
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
b2fe873992
Add display_streams script
...
display_streams is able to:
+ display the structure of streams
+ display stats about commits and updates
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
b12ce54d29
Commit the script to setup the environment
...
Commit setup_environment which sets up the PATH and LD_LIBRARY_PATH to
find the toolchain (modified gcc) and libraries (openmp).
2012-02-21 18:56:02 +01:00
Thomas Preud'homme
09ed8e7bc3
Initial release of FMradio
...
* Source file for FMradio with (i) openmp stream extension and (ii)
openmp stream and data parallelism extensions.
* Input files (small and larger one) to test FMradio.
* Compiled version of FMradio just in case of any later problem in the
toolchain (although the toolchain itself is saved in git).
2012-02-21 18:56:01 +01:00
Thomas Preud'homme
360870c557
lancement.sh: Unset verbose mode
2012-02-21 18:09:44 +01:00
Thomas Preud'homme
82d3c453e6
lancement.sh: Send data in group
2012-02-21 18:09:34 +01:00
Thomas Preud'homme
b0441d7a1c
lancement.sh: Include perf stats in log files
2012-02-21 18:09:28 +01:00
Thomas Preud'homme
58d9801938
parsing.sh: Make metric pattern work again
2012-02-21 18:09:23 +01:00
Thomas Preud'homme
3c1dbe202c
parsing.sh: Don't create patternPlotFile.gnuplot
2012-02-21 18:09:10 +01:00
Thomas Preud'homme
585166eb58
parsing.sh: Pass all params to create_complex_dat_body
2012-02-21 18:08:01 +01:00
Thomas Preud'homme
7c4a484989
[commtech] Simplify if's in send()
...
Test sender_ptr against the end of the current buffer via
channel->sender_ptr_end
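
A minimal sketch of what a single end-of-buffer test could look like, assuming two semi-buffers stored back to back. Only `channel`, `sender_ptr` and `sender_ptr_end` come from the commit message; the field layout, `SLOTS_PER_HALF`, `channel_init`, `sketch_send` and the `flushes` counter are hypothetical, and the synchronization with the receiver is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS_PER_HALF 4 /* hypothetical semi-buffer size, for illustration only */

struct channel {
    void *buf[2 * SLOTS_PER_HALF]; /* two semi-buffers, back to back */
    void **sender_ptr;             /* next slot to write */
    void **sender_ptr_end;         /* one past the end of the current semi-buffer */
    int flushes;                   /* semi-buffers completed so far (illustrative) */
};

static void channel_init(struct channel *c)
{
    assert(c != NULL);
    c->sender_ptr = c->buf;
    c->sender_ptr_end = c->buf + SLOTS_PER_HALF;
    c->flushes = 0;
}

/* Hypothetical send(): a single comparison against sender_ptr_end detects
 * the end of the current semi-buffer, whichever of the two is in use. */
static void sketch_send(struct channel *c, void *data)
{
    *c->sender_ptr++ = data;
    if (c->sender_ptr == c->sender_ptr_end) {
        c->flushes++; /* a real implementation would hand the buffer over here */
        if (c->sender_ptr == c->buf + 2 * SLOTS_PER_HALF)
            c->sender_ptr = c->buf; /* wrap around to the first semi-buffer */
        c->sender_ptr_end = c->sender_ptr + SLOTS_PER_HALF;
    }
}
```

Keeping the end pointer in the channel means the hot path does one pointer comparison instead of working out which semi-buffer is current on every call.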
2012-01-30 20:14:22 +01:00
Thomas Preud'homme
a20c9a8a21
[commtech] BatchQueue v2
...
Uses 2 mappings to the same structure to avoid prefetching of the
producer semi-buffer by the consumer. The idea is to access everything
through mapping 1 except semi-buffer 2 which is accessed through mapping
2.
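
One way to obtain two mappings of the same memory on a POSIX system is to `mmap` the same file descriptor twice with `MAP_SHARED`. The sketch below does that with a temporary file; the commit message does not say how the mappings are actually created, so the temporary file, the `two_views` type and `map_twice` are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Two virtual views of the same physical pages (hypothetical type). */
struct two_views {
    char *map1; /* used for everything except semi-buffer 2 */
    char *map2; /* used only to reach semi-buffer 2 */
    size_t len;
};

/* Map the same backing object twice; returns 0 on success, -1 on error. */
static int map_twice(struct two_views *v, size_t len)
{
    char path[] = "/tmp/bq_sketch_XXXXXX";
    int fd = mkstemp(path);

    assert(v != NULL);
    if (fd < 0)
        return -1;
    unlink(path); /* the file lives on until the mappings are released */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }
    v->map1 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    v->map2 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (v->map1 == MAP_FAILED || v->map2 == MAP_FAILED)
        return -1;
    v->len = len;
    return 0;
}
```

With such a pair, semi-buffer 1 would be addressed as `v.map1 + offset` and semi-buffer 2 as `v.map2 + offset`, so the two semi-buffers are no longer adjacent in virtual address space even though they share the same underlying structure.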
2012-01-30 20:09:52 +01:00
Thomas Preud'homme
fba09b60b8
Remove debug information
2012-01-30 20:08:54 +01:00
Thomas Preud'homme
c6786815cd
Add native algo from OpenMP stream extension
...
Add native algorithm from OpenMP stream extension. This requires adding
one function in commtech.h: end_producer(). This function does nothing
for all communication algorithms but gomp_stream (the algorithm added by
this commit).
2012-01-30 20:07:11 +01:00