rt_benchs

Commit Graph

Author	SHA1	Message	Date
Thomas Preud'homme	6cdff7f5a0	Add copyright/license information	2013-04-22 18:34:41 +02:00
Thomas Preud'homme	467d0b4122	[commtech] Fixes in gomp_stream * Stick to the sizes used in gomp_stream * Release data when they are all received	2012-07-07 23:26:24 +02:00
Thomas Preud'homme	d8c16a4aa3	Merge branch 'bqv2_buf_end'	2012-07-07 23:14:15 +02:00
Thomas Preud'homme	df09d89933	[commtech] Use only 1 thread per core. Creating 2 threads per core for the purpose of receiving while sending is plain stupid. First, it needs 2 threads synchronizing with each other, which has a cost. Second, since only one thread can run at a time, the threads slow each other down (using BatchQueue where the sender is on the same core as the receiver yields bad performance). This patch removes all this complexity to have one thread receive, compute and then resend data, which improves performance dramatically.	2012-07-07 23:14:08 +02:00
Thomas Preud'homme	4914b0dcdd	Add CSQ (2/1) and CSQ (2/32), Del CSQ (2/2)	2012-03-27 00:31:16 +02:00
Thomas Preud'homme	a80decaef4	[commtech] Provide 64 cache lines version of algos * Provide for BatchQueue, CSQ, FastForward, MCRingBuffer and GOMP stream a version using 64 cache lines in total for all buffers. * Rename common version from _common_comm.h to _common.h to avoid considering them as communication technique on their own	2012-03-26 16:44:30 +02:00
Thomas Preud'homme	c37c100355	[commtech] Initialize vector in calc_mat.c	2012-03-26 16:14:23 +02:00
Thomas Preud'homme	758198c2b0	[commtech] Add missing .c for new CSQ configs	2012-03-20 12:16:10 +01:00
Thomas Preud'homme	7c4a484989	[commtech] Simplify if's in send() Test sender_ptr against the end of the current buffer via channel->sender_ptr_end	2012-01-30 20:14:22 +01:00
Thomas Preud'homme	a20c9a8a21	[commtech] BatchQueue v2. Uses 2 mappings of the same structure to avoid prefetching of the producer semi-buffer by the consumer. The idea is to access everything through mapping 1 except semi-buffer 2, which is accessed through mapping 2.	2012-01-30 20:09:52 +01:00
Thomas Preud'homme	c6786815cd	Add native algo from OpenMP stream extension. Add native algorithm from OpenMP stream extension. This requires adding one function in commtech.h: end_producer(). This function does nothing for all communication algorithms but gomp_stream (the algorithm added by this commit).	2012-01-30 20:07:11 +01:00
Thomas Preud'homme	a30a5bfe06	Make sure all threads are joined in join_threads. nb_thread is the id of the last thread, not the number of threads to join. Hence the for loop must include this id.	2011-06-01 15:35:08 +02:00
Thomas Preud'homme	f0c75c7570	SINK thread (not INTERM) notify its termination. Use !!node_param->type & SINK in likely macro to test whether we are a SINK node or an INTERM node.	2011-06-01 15:25:08 +02:00
Thomas Preud'homme	bd7379e73a	Propose 2048 and 4096 buffer size for BatchQueue.	2011-05-27 15:42:40 +02:00
Thomas Preud'homme	f05cfdcd92	Improve pipeline (cons and prod in //)	2011-05-25 14:33:42 +02:00
Thomas Preud'homme	f01db158c2	Use multiples of BUF_SIZE when needed. The number of cache lines sent and the size of the reception buffer must be multiples of BUF_SIZE.	2011-05-10 11:14:28 +02:00
Thomas Preud'homme	6fcfd60d2d	Fix buffer loop in BatchQueue single data mode. The buffer in single data mode in BatchQueue was not circular because a variable was not renamed.	2011-05-10 11:02:00 +02:00
Thomas Preud'homme	70f8f95647	Fix option to choose the number of nodes. Option is now in the getopt string and accessible with the -l switch.	2011-05-10 11:00:59 +02:00
Thomas Preud'homme	f430cc17a7	Fix bugs coming from refactoring	2011-05-05 19:54:44 +02:00
Thomas Preud'homme	372c36155a	Fix incorrect usage string: --check -> -k	2011-05-05 14:52:41 +02:00
Thomas Preud'homme	756a701466	[commtech] Refactor to chain more than 2 nodes * Refactor the source to be able to chain more than 2 nodes together * Compile all binaries by default (binList must be set manually in lancement.sh to run only a subset of the binaries)	2011-05-05 14:34:09 +02:00
Thomas Preud'homme	5d71bc53f1	[commtech] Varying size of buffer for BatchQueue. Create several variations of BatchQueue, each with a different buffer size: batch_queue_1024, batch_queue_512, ..., batch_queue_2.	2011-05-05 11:30:00 +02:00
Thomas Preud'homme	9c835d4c46	Add a "sent words == received words" check	2011-05-04 19:35:10 +02:00
Thomas Preud'homme	22c97ab418	[commtech] Make BUF_SIZE definition be per tech Don't define BUF_SIZE globally anymore, but per communication technique	2011-03-02 13:19:22 +01:00
Thomas Preud'homme	7c515200e7	[commtech] Remove asm_cache from the comm techs	2011-03-02 13:19:22 +01:00
Thomas Preud'homme	90b7a8007b	[commtech] Rename c_cache to batch_queue	2011-03-02 13:19:22 +01:00
Thomas Preud'homme	7d7ad0c46a	[commtech] Make WORDS_PER_BUF indep of BUF_SIZE. The number of data sent must be independent of the buffer size chosen by each algorithm.	2011-01-28 04:56:44 +01:00
Thomas Preud'homme	c3aad28ad5	[commtech] Add calculation method. Add a calculation method which adds the value of the first integer of n consecutive cache lines and writes the result in one of the integers of these cache lines. The next calculation uses the next n consecutive cache lines and writes the result in the next integer.	2011-01-25 17:24:53 +01:00
Thomas Preud'homme	975411a824	Split CSQ into 2 communication techniques. * Divide CSQ into 2 communication techniques: one with 2 slots (as in BatchQueue aka c_cache) and one with 64 slots (as in the article) * Rename the fake communication technique to the none communication technique and disable any activity (send no longer does anything)	2011-01-25 17:24:53 +01:00
Thomas Preud'homme	5eb7fb50c7	[commtech] CSQ use memcpy in dequeue for fairness Paper about CSQ uses memcpy in enqueue and dequeue. Although it is not possible to use memcpy in enqueue because of current API, it is possible to use memcpy in dequeue, hence this commit.	2011-01-19 12:37:14 +01:00
Thomas Preud'homme	35a81bb736	[commtech] Place volatile on the right qualifier.	2011-01-13 14:58:13 +01:00
Thomas Preud'homme	2d879dc3fc	[commtech] Fix idx test in c_cache technique. c_cache was watching the status value when idx % BUF_SIZE != 0 instead of when it equals zero.	2011-01-03 11:35:42 +01:00
Thomas Preud'homme	6c2868e20c	[commtech bench] Take the mean over 10 run.	2010-10-13 23:57:58 +02:00
Thomas Preud'homme	006b1b1d94	Simplify and rewrite comm API.	2010-10-01 18:57:46 +02:00
Thomas Preud'homme	5e3a7f6ce0	commtechs: BUGFIX wait threads to be initialized	2009-07-07 16:08:00 +02:00
Thomas Preud'homme	c99d8be100	commtechs: BUGFIX deadlock in thread init	2009-07-07 15:56:20 +02:00
Thomas Preud'homme	698341b99e	commtech: Update usage help	2009-07-01 03:10:38 +02:00
Thomas Preud'homme	415004fb4b	commtech: Update usage help	2009-07-01 02:45:28 +02:00
Thomas Preud'homme	3341546c75	commtech: calc lib takes 1 argument on command line	2009-07-01 02:36:11 +02:00
Thomas Preud'homme	7a1610961c	commtech: BUGFIX unwanted optimization. Replace prod += 42 by prod += fourty_two where fourty_two is a volatile variable to avoid replacement of the loop by a prod += 42 * nb_loop	2009-07-01 02:34:50 +02:00
Thomas Preud'homme	243d8810f1	commtech: avoid a double free corruption. Remove srand and rand function calls as they generate double free corruption (???)	2009-07-01 01:49:16 +02:00
Thomas Preud'homme	e90348b54c	commtech: BUGFIX in freeing pages. Don't try to free the middle of an allocation	2009-07-01 01:48:13 +02:00
Thomas Preud'homme	ba13c18af7	commtech: Free pages when jikes barrier ends	2009-07-01 00:45:19 +02:00
Thomas Preud'homme	e04818645a	commtech: Add a new calculation method. This calculation performs only a loop and avoids cache pollution	2009-06-30 22:37:55 +02:00
Thomas Preud'homme	c98db4b4ba	commtech: do_calc() returns a void **. This respects what we claim to send to the send() function and allows reducing FAKE_NURSERY_START. Thus we are sure gcc won't optimize the second part of the if in include/jikes_barrier_comm.h	2009-06-30 22:35:11 +02:00
Thomas Preud'homme	7bfc46db78	commtech: Delete pages free. Pages cannot be freed as fast as they are allocated, so this whole mechanism can only delay the kernel panic. It's wiser to exit badly if too much memory is consumed	2009-06-30 22:32:59 +02:00
Thomas Preud'homme	6b9777cb9b	Align shared_mem and initial jikes buffer	2009-06-25 14:01:18 +02:00
Thomas Preud'homme	c9323cd901	BUGFIX: Fix a possible deadlock if an error occurs	2009-06-25 13:47:50 +02:00
Thomas Preud'homme	177e548efe	commtechs benchs: fake_comm performs the writes	2009-06-24 23:35:58 +02:00
Thomas Preud'homme	2fe89da8a2	Free memory after 100 MB allocated	2009-06-24 23:26:51 +02:00
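
Commit 7a1610961c describes reading the increment from a volatile variable so that the compiler cannot collapse the benchmark loop into a single multiplication. A minimal sketch of that idea follows; only the names prod, fourty_two and nb_loop come from the commit message, while the function name do_calc_loop and the types are assumptions made for illustration, not the repository's actual code.

```c
/* Assumed stand-in for the benchmark's increment. Because the variable is
 * volatile, every read is an observable access that must really happen. */
static volatile unsigned long fourty_two = 42;

/* Hypothetical helper: performs nb_loop additions. The compiler may not
 * replace the loop with prod += 42 * nb_loop, since that would drop the
 * volatile reads. */
static unsigned long do_calc_loop(unsigned long nb_loop)
{
    unsigned long prod = 0;

    for (unsigned long i = 0; i < nb_loop; i++)
        prod += fourty_two;

    return prod;
}
```

The returned value is the same as with a literal 42; the only difference is that the loop keeps a cost proportional to nb_loop, which is what a calculation phase in a benchmark needs.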

75 Commits
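
Commit a20c9a8a21 (BatchQueue v2) relies on two mappings of the same structure. The listing does not show how the repository creates them, so the following is only a sketch of one way to obtain two virtual addresses backed by the same memory on a POSIX system. The helper name map_twice and the use of tmpfile() as backing store are assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps the same len bytes at two different virtual
 * addresses. Returns 0 on success and -1 on failure. */
static int map_twice(size_t len, void **map1, void **map2)
{
    FILE *f = tmpfile();              /* anonymous backing file (assumption) */
    if (f == NULL)
        return -1;

    int fd = fileno(f);
    if (ftruncate(fd, (off_t)len) != 0) {
        fclose(f);
        return -1;
    }

    /* MAP_SHARED makes both mappings views of the same pages. */
    void *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *b = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    fclose(f);                        /* mappings remain valid after close */

    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED)
            munmap(a, len);
        if (b != MAP_FAILED)
            munmap(b, len);
        return -1;
    }

    *map1 = a;
    *map2 = b;
    return 0;
}
```

With such a pair, code in the spirit of the commit message would access everything through map1 except the second semi-buffer, which it would reach at the same offset through map2. A write through one mapping is visible through the other because both refer to the same physical pages.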