JLargeArrays is a Java library of one-dimensional arrays that can store up to 2^63 elements.
All current Java Virtual Machine implementations limit one-dimensional arrays to fewer than 2^31 elements. In addition, since Java lacks true multidimensional arrays, most numerical libraries use one-dimensional arrays to store multidimensional data. With this limitation, it is not possible to store volumes larger than 1290^3 elements. On the other hand, data from scientific simulations and medical scanners keep growing in size, and it is not uncommon to exceed that limit. JLargeArrays addresses the problem of the maximal size of one-dimensional Java arrays by providing arrays that can store up to 2^63 elements. A performance comparison with native Java arrays and the Fastutil library shows that JLargeArrays is the fastest solution overall. Possible applications in Java collections as well as in numerical and visualization frameworks are also discussed.
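The volume limit quoted above (a cube with an edge of 1290 elements) follows directly from the largest value an int index can hold. The following is a minimal sketch of that arithmetic, not code from JLargeArrays; the class name `VolumeIndexing` and its methods are hypothetical, and row-major flattening is assumed as the storage layout.

```java
final class VolumeIndexing {

    /** Largest cube edge n such that n^3 elements still fit under Integer.MAX_VALUE. */
    static long maxCubeEdge() {
        long n = (long) Math.cbrt(Integer.MAX_VALUE);
        // Correct for possible floating-point rounding in cbrt.
        while ((n + 1) * (n + 1) * (n + 1) <= Integer.MAX_VALUE) {
            n++;
        }
        while (n * n * n > Integer.MAX_VALUE) {
            n--;
        }
        return n;
    }

    /** Flat index of voxel (x, y, z) in a volume with nx * ny elements per slice. */
    static long flatIndex(long x, long y, long z, long nx, long ny) {
        // long arithmetic throughout, so the result does not overflow for large volumes.
        return x + nx * (y + ny * z);
    }

    public static void main(String[] args) {
        System.out.println("max cube edge: " + maxCubeEdge());

        // Last voxel of a 2048^3 volume: the index no longer fits in an int.
        long last = flatIndex(2047, 2047, 2047, 2048, 2048);
        System.out.println("last index of 2048^3 volume: " + last);
        System.out.println("fits in int: " + (last <= Integer.MAX_VALUE));
    }
}
```

A 2048^3 volume, which is a plausible size for a high-resolution scan, already needs indices about four times larger than an int can represent, so it cannot be held in a single native Java array.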
In 1999, David Flanagan in his book "Java in a Nutshell" wrote:
Array index values are integers [..]. Although long is an integer data type, long values cannot be used as array indexes. This may seem surprising at first, but consider that an int index supports arrays with over two billion elements. An int[] with this many elements would require eight gigabytes of memory. When you think of it this way, it is not surprising that long values are not allowed as array indexes.
Even though that statement may sound rather amusing today, it is still valid for all current JVM implementations. One may argue that datasets larger than 2^31 elements should be processed in an out-of-core or distributed fashion and not loaded into memory all at once. However, that approach usually requires reimplementing existing algorithms to fit a particular framework, such as MapReduce. On the other hand, servers with more than 500 GB of RAM are not uncommon, and the array length limitation prevents Java programs from fully utilizing their capabilities. This motivated the development of JLargeArrays.
JLargeArrays uses low-level memory operations available in sun.misc.Unsafe. Even though that class is marked as an internal proprietary API that may be removed in a future release, it is widely used by the JDK itself in several packages such as java.nio and java.util.concurrent. In addition, contrary to common belief, it is available in all major JDK distributions, including Oracle, IBM, and OpenJDK. Therefore, we claim that JLargeArrays is a portable library that can be used in many different applications.
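To illustrate the kind of low-level memory operations meant here, the sketch below builds a tiny long-indexed byte array on top of sun.misc.Unsafe. It is an illustration only and not the JLargeArrays implementation: the class `OffHeapByteArray` and all of its methods are hypothetical, and it assumes a JDK on which sun.misc.Unsafe is still accessible through reflection (recent JDKs may print deprecation warnings for these calls).

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Hypothetical long-indexed byte array backed by off-heap memory. */
final class OffHeapByteArray implements AutoCloseable {

    private static final Unsafe UNSAFE = loadUnsafe();

    private final long address; // base address of the allocated block
    private final long length;  // number of bytes; may exceed Integer.MAX_VALUE
    private boolean closed;

    OffHeapByteArray(long length) {
        if (length <= 0) {
            throw new IllegalArgumentException("length must be positive: " + length);
        }
        this.length = length;
        this.address = UNSAFE.allocateMemory(length);
        // Freshly allocated memory is uninitialized, so zero it explicitly.
        UNSAFE.setMemory(address, length, (byte) 0);
    }

    private static Unsafe loadUnsafe() {
        try {
            // The singleton is kept in a private static field named "theUnsafe".
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    long length() {
        return length;
    }

    byte get(long index) {
        checkIndex(index);
        return UNSAFE.getByte(address + index);
    }

    void set(long index, byte value) {
        checkIndex(index);
        UNSAFE.putByte(address + index, value);
    }

    private void checkIndex(long index) {
        if (closed) {
            throw new IllegalStateException("array already released");
        }
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    /** Off-heap memory is not garbage collected, so it must be released explicitly. */
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            UNSAFE.freeMemory(address);
        }
    }

    public static void main(String[] args) {
        try (OffHeapByteArray a = new OffHeapByteArray(1024L)) {
            a.set(1023L, (byte) 42);
            System.out.println(a.get(1023L));
        }
    }
}
```

Because indices and the length are of type long, nothing in this design is tied to the int limit of native arrays; the practical bound becomes the amount of memory the operating system can provide. The bounds check in `checkIndex` matters here, since Unsafe itself performs no checks and an out-of-range address can crash the JVM.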
See the following paper for more details:
P. Wendykier, B. Borucki and K. S. Nowinski, “Large Java arrays and their applications,” 2015 International Conference on High Performance Computing & Simulation (HPCS), Amsterdam, 2015, pp. 460-467, doi: 10.1109/HPCSim.2015.7237077.