30 December 2009

Faster Matlab Calculation: Use Preallocation


Welcome to Stone Studio!

Preallocation

I am a believer in preallocation.  For a particular application, I read about 13GB of data from a file into a 4-D matrix (this was running on a machine with 32GB of memory).  Without preallocation, I let the process run for about 10 hours, and it still hadn’t finished.  With preallocation, the process finished in about 20 minutes.  That’s a speedup of at least 30 times!
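
To make the idea concrete, here is a minimal sketch comparing a loop that grows its output with one that fills a preallocated array. The size n and the variable names are assumptions chosen for illustration, not the original 13GB workload, and the size of the gap depends on the MATLAB version and the data.

```matlab
% Hypothetical size, chosen only for illustration.
n = 20000;

% Growing the array: MATLAB may have to reallocate and copy
% the vector repeatedly as it gets longer.
grown = [];
for k = 1:n
    grown(end+1) = k^2; %#ok<SAGROW>
end

% Preallocating: reserve the full array once, then fill it in place.
prealloc = zeros(1, n);
for k = 1:n
    prealloc(k) = k^2;
end

% Both loops produce the same values; only the memory behavior differs.
same = isequal(grown, prealloc);
```

The same pattern extends to more dimensions, for example zeros(d1, d2, d3, d4) for a 4-D matrix whose final size is known before the read loop starts. Wrapping each loop in tic and toc is an easy way to measure the difference on your own machine.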

Permute

The permute function can be quite handy – it can shuffle around the dimensions in a matrix with a single function call.  Consider the following matrix:

m_one = rand([2 5 4000]);

The size of this matrix is reported as 2×5×4000 in MATLAB. Now, we can shuffle the dimensions. Let’s make it a 4000×2×5 array:

m_one = permute(m_one, [3 1 2]);

The permute() function takes the array to shuffle around, and the new order of dimensions. In this case, the 3rd dimension moved to become first, 1st dimension second, and 2nd dimension last so that a 2×5×4000 matrix becomes a 4000×2×5 matrix.
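
A quick way to convince yourself of the new ordering is to check the size and compare a single element before and after. This is only a sketch; the names a, b, and restored are made up for the example.

```matlab
a = rand([2 5 4000]);

% Move dimension 3 to the front, then dimensions 1 and 2.
b = permute(a, [3 1 2]);
sz = size(b);                      % [4000 2 5]

% Element (i,j,k) of a is now found at (k,i,j) in b.
same_element = (a(2, 3, 17) == b(17, 2, 3));

% ipermute undoes the shuffle when given the same order vector.
restored = ipermute(b, [3 1 2]);
round_trip = isequal(restored, a);
```

Note that b and restored are full copies of the data, which matters once the array is large.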

Be careful, though: as handy as permute can be, it’s easy to use it inefficiently.  Remember that 13GB 4-D matrix?  I ran permute on that, and memory usage immediately doubled.  In general, I recommend creating the data the right way first! It will save a lot of headache (and RAM) down the road.

Vector notation

If you desperately need only a subset of dimensions, an alternative solution is to use MATLAB’s built-in, efficient vector notation.  For example, to keep everything along the first and third dimensions at a single index of the second dimension, just use

m_two = m_one(:,1,:);

The one downside here is that you’ll end up with an annoying singleton dimension that can frustrate other built-in functions like plot. The squeeze function will rescue us.

Squeeze

Squeeze is cool.  After running the previous code, size(m_two) shows us that m_two is a 4000×1×5 matrix.  We could use indexing to access all these elements, but squeeze will make life much easier – it will remove the singleton dimension in the middle.

m_two = squeeze(m_two);

Now, size(m_two) tells us we’ve got a 4000×5 matrix and using the matrix just got that much simpler.
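
Putting the slicing and squeezing steps together, a small sketch looks like the following. The variable names are hypothetical, and the reshape call is shown only as an alternative that spells out the target size explicitly.

```matlab
m = rand([4000 2 5]);

% Colon indexing keeps a singleton second dimension.
slice = m(:, 1, :);
sz_slice = size(slice);            % [4000 1 5]

% squeeze drops the singleton dimension.
flat = squeeze(slice);
sz_flat = size(flat);              % [4000 5]

% Equivalent result with an explicit target shape.
flat2 = reshape(slice, size(slice, 1), size(slice, 3));
same = isequal(flat, flat2);

% A leading singleton is removed too: a 1x2x5 slice becomes 2x5.
lead = squeeze(m(1, :, :));
sz_lead = size(lead);              % [2 5]
```

One thing to keep in mind is that squeeze removes every singleton dimension, so if more than one dimension could have length 1, reshape with explicit sizes is the more predictable choice.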

Removing elements from vectors and matrices

There are times when you want to discard elements from a vector or matrix.  I used to do this by creating a new variable to hold just the elements I wanted to keep.  Obviously, there’s a better way.  Let’s remove all the elements of a matrix that are less than 0.5.  It’s insanely easy:

m_three = rand(1,1000);
m_three( m_three < 0.5 ) = [];

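Now, size(m_three) gives something close to 1×500.  The exact length changes from run to run, because about half of the uniformly distributed random values fall below 0.5.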
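
As a sketch of what is going on under the hood, the comparison builds a logical mask, and that mask can be used either to delete elements or to select the ones to keep. The names below are made up for the example. The second half uses magic(3) so the result is deterministic, and it shows a side effect worth knowing: deleting from a matrix this way flattens it into a row vector.

```matlab
v = rand(1, 1000);

% Delete in place using a logical mask.
removed = v;
removed(removed < 0.5) = [];

% Equivalent: select the elements to keep.
kept = v(v >= 0.5);
same = isequal(removed, kept);

% On a matrix, deletion by mask returns a row vector
% with the survivors in column-major order.
A = magic(3);                      % [8 1 6; 3 5 7; 4 9 2]
A(A > 5) = [];
% A is now [3 4 1 5 2]
```

If the matrix shape has to survive, delete whole rows or columns instead, for example A(:, 2) = [] to drop the second column.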

Conclusions

If you haven’t noticed yet, MATLAB is all about the matrices. Understanding how to efficiently operate on subsets of matrices will give you huge returns in performance. Learn how and when to use permute, squeeze, and vector notation and you’ll be well on your way. Anything else you think should be on this page? Let me know in the comments!

[Tips via: S. Kellis]
