===============================================================================
=                                                                             =
=                        SPRINT BETA 0.1.0 RELEASE NOTES                      = 
=                                                                             =
===============================================================================

Latest release SPRINT Beta 0.1.0 - 15.01.2010
Previous release SPRINT Prototype 0.0.3 - 12.11.2008

-------
Content
-------

	1. Scope
	2. What's new in SPRINT Beta 0.1.0
	3. Known Issues
	
	
--------
1. Scope 
-------- 

The changes in SPRINT Beta 0.1.0 are targeted at improving the scalability
of SPRINT. SPRINT can now process larger data sets. The restriction on the size
of the output data has been removed thanks to the use of R ff objects and
binary files, so SPRINT can return output data that is larger than the memory
of the computer. Further improvements to the HPC harness using MPI/IO mean
that SPRINT now scales almost perfectly up to 512 cores, providing a major
improvement in performance and execution time. No new functions have been
added to the library of parallelized statistical R functions.


----------------------------------
2. What's new in SPRINT Beta 0.1.0
----------------------------------

1. 	The interface to pcor has changed from: pcor(data) to:  

		pcor(data, caching_ = "mmeachflush", filename_ = NULL)
		
	where:
		- 'data' is the input data matrix,
		- 'caching_' is the caching scheme for the backend, currently either
		  "mmnoflush" or "mmeachflush" (flush mmpages at each swap). If no
		  value is specified, the default is "mmeachflush",
		- 'filename_' is an optional string. It specifies the name of a file
		  where the results will be saved. By default, the results are saved
		  to a temporary file that is deleted after exiting from SPRINT.
		
	Examples of valid calls to pcor to replace cor_result <- cor(t(inData)):
		- ff_obj <- pcor(t(inData))
		- ff_obj <- pcor(t(inData), filename_="output.dat")
		- ff_obj <- pcor(data, caching_="mmeachflush", filename_="output.dat")
		
		
2.	The output file is a binary file instead of a text file.

	The binary file format keeps the full precision of double-precision values,
	unlike the text file format, which truncates them at ~13 decimals. Binary
	files are smaller than plain text files by a factor of ~2.5, allowing for
	a more efficient use of memory and disk space. Binary files can also be
	read and written faster than plain text files. They are easily manipulated
	within R and with R tools, as well as with other programming languages or
	tools. However, they may have to be converted before being used with some
	tools such as plain text editors.
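
	The precision point above can be illustrated with a minimal base R sketch.
	It does not read a real SPRINT result file: the values, the file names and
	the choice of 13 decimals for the text version are assumptions made for
	the illustration only.

```r
# Illustrative values only; they stand in for a small block of results.
values <- matrix(c(1/3, 2/3, pi, exp(1)), nrow = 2)

bin_path <- tempfile(fileext = ".dat")
txt_path <- tempfile(fileext = ".txt")

# Binary: each double is stored as its full 8-byte representation.
writeBin(as.vector(values), bin_path, size = 8)

# Text: each double is formatted with 13 decimals (assumed for this sketch).
writeLines(formatC(as.vector(values), digits = 13, format = "f"), txt_path)

# Read both files back into 2 x 2 matrices.
from_bin <- matrix(readBin(bin_path, what = "double",
                           n = length(values), size = 8),
                   nrow = nrow(values))
from_txt <- matrix(as.numeric(readLines(txt_path)), nrow = nrow(values))

bin_exact <- identical(from_bin, values)  # binary round trip is bit-exact
txt_exact <- identical(from_txt, values)  # text round trip drops digits
```

	In this sketch the binary round trip reproduces the values exactly, while
	the text round trip does not, because digits beyond the 13th decimal are
	lost when the values are formatted.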

3. 	Almost perfect scaling through the use of MPI/IO for up to 512 cores.

	The use of MPI/IO enables each slave to write its output to the result
	file in parallel and removes the bottleneck created by the Master process
	handling all I/O.
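
	One way to picture parallel writes to a shared result file is that each
	slave owns a disjoint byte range. The sketch below only computes such
	ranges in plain R. The row-wise split, the function name block_offsets
	and its parameters are hypothetical and are not taken from the SPRINT
	sources; the actual layout used by SPRINT may differ.

```r
# Hypothetical layout: n_rows rows of n_cols doubles, split as evenly as
# possible across n_slaves; each slave writes its rows at its own offset.
block_offsets <- function(n_rows, n_cols, n_slaves, bytes_per_value = 8) {
  base  <- n_rows %/% n_slaves
  extra <- n_rows %% n_slaves
  # The first `extra` slaves take one additional row each.
  rows <- base + as.integer(seq_len(n_slaves) <= extra)
  # Index of the first row owned by each slave (0-based).
  first_row <- cumsum(c(0, rows))[seq_len(n_slaves)]
  data.frame(slave = seq_len(n_slaves),
             rows = rows,
             offset_bytes = first_row * n_cols * bytes_per_value)
}

# Assumed example: a 10 x 10 result split across 4 slaves.
layout <- block_offsets(n_rows = 10, n_cols = 10, n_slaves = 4)
```

	Because the byte ranges do not overlap, no process has to collect the
	results of the others before they are written.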
	
4. 	The size of the result output can now be bigger than the memory of the
	Master process.
	
	Each slave now writes its output to the result file in parallel using
	MPI/IO instead of sending it back to the Master process for I/O. The size
	of the final output is therefore no longer limited by the memory of the
	Master process. The output is now distributed over all slave processes,
	and the new limit is the sum of the memory capacity of all slave
	processes.
	

---------------
3. Known Issues
---------------	

1.	The size of the input data is limited. The limit depends on the number of
	cores available for the task farm and the size of their memory:
	
		Max(input_data) <= Slave RAM - (output_data / # of slaves) - small_temp

	Should the dynamic memory allocation fail, an error message is returned:
	
	a.	The input data is too large:
		
		**ERROR** : Input data array memory allocation failed on slave process
					<..>. Aborting.
					
	b.	There isn't enough memory for the result data or temporary data:
	
		**ERROR** : Memory allocation failed on slave process <..>. Aborting.
		
	c.	Additional memory was needed for the result data, and SPRINT's attempt
		to re-allocate the job to get more memory failed:
		
		**ERROR** : Memory re-allocation failed on slave process <..>. 
					Aborting.
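
	The memory bound in item 1 can be turned into a rough estimate before a
	job is submitted. The helper below is a sketch, not part of SPRINT: the
	function name, the figures used, and the assumption that the result of a
	correlation over n rows is an n x n matrix of 8-byte doubles are all
	illustrative.

```r
# Hypothetical helper implementing
#   Max(input_data) <= Slave RAM - (output_data / # of slaves) - small_temp
# All sizes are in bytes.
max_input_bytes <- function(slave_ram_bytes, output_bytes, n_slaves,
                            temp_bytes = 0) {
  slave_ram_bytes - output_bytes / n_slaves - temp_bytes
}

# Assumed example: 20000 rows give a 20000 x 20000 matrix of doubles.
n_rows       <- 20000
output_bytes <- n_rows^2 * 8

# Assumed hardware: 16 slaves with 2 GiB each, 64 MiB of temporary data.
limit <- max_input_bytes(slave_ram_bytes = 2 * 1024^3,
                         output_bytes    = output_bytes,
                         n_slaves        = 16,
                         temp_bytes      = 64 * 1024^2)

# TRUE if an input of the given size stays within the estimated bound.
fits <- function(input_bytes) input_bytes <= limit
```

	A negative or very small limit suggests that the job needs more slaves or
	more memory per slave before it is likely to run without the errors
	listed above.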


===============================================================================
SPRINT Team

email: sprint@ed.ac.uk
http://www.r-sprint.org

Copyright 2010 The University of Edinburgh.
===============================================================================


