Sunday, July 17, 2005

Java: Extracting All Possible Combinations from 2D Arrays of Varying Size

Recently I was having a hell of a time trying to get a snippet of code to work. Generally speaking, it really shouldn't have been that difficult to do, but I just couldn't figure out how to straighten everything out. When faced with this sort of situation, a quick web search will often bring you to the answer, but that was not the case this time. I did eventually get the whole thing working, and offer the solution here in the hopes that no one else will have to waste as much time on this as I did!

I needed to take a 2D array of integers and create another 2D array containing every possible combination of elements from each row in the original array. Sounds simple enough, but I had to allow for varying numbers of both rows and columns in the starting array. That is, I had no idea beforehand how many rows I would get, or more importantly, how many elements would be in each row. Each row could have a different number of elements. So I could have something as simple as:

0 1
0 2

to something as complex as:

1 2 3
1 2
0
0 1 2 3
0 3
0 1 2 3
0 2 3
1 3
1

I would have to take any sort of 2D array like those and construct another 2D array with every possible combination taking one element from each row in the first array. So the first array above would yield:

0 0
1 0
0 2
1 2

and the second would start like this:

1 1 0 0 0 0 0 1 1
2 1 0 0 0 0 0 1 1
3 1 0 0 0 0 0 1 1
1 2 0 0 0 0 0 1 1
2 2 0 0 0 0 0 1 1
3 2 0 0 0 0 0 1 1
1 1 0 1 0 0 0 1 1
2 1 0 1 0 0 0 1 1
3 1 0 1 0 0 0 1 1
and so on...

I could have gone the easy route and assumed each row in the original array was equal in length to the longest row, but that really wasn't correct for my application, plus it would result in many, many wasted steps. To make things more efficient, I needed to process only the possible combinations. So this is how I eventually got it all to work:

First thing I needed to do was calculate how many possible combinations would result.

int validCombos = 1;

for (int i=0; i < my2DArray.length; i++)
    validCombos *= my2DArray[i].length;

By multiplying the all of the lengths of each row together, we get the number of possible combination we can create. You need to start the validCombos counter at 1 instead of zero because we are multiplying, not adding (and don't want to remain at zero forever!).

Next we will construct our new 2D array to contain the results.

combosArray = new int[validCombos][my2DArray.length];

This gives us a final array with as many rows as there are possible combinations, with each row taking one element from each row in the original array.

Next we'll need a counter to keep track of which combination we're working on, and an offset value to track how we're changing the element in each combination.

int combo;
int offset = 1;

Now for the main part of the algorithm. Obviously, we will need to loop through each row in the first array, and make sure we start from the first combination for each iteration.

for (int i=0; i < my2DArray.length; i++) {
    combo = 0;

We need to make sure the 'pattern' we construct stays within the confines of the total number of possible combinations, so we add this line:

while (combo < validCombos) {

We also need to loop though each element in the first array.

for (int j=0; j < my2DArray[i].length; j++) {

And we will use the offset value to determine the rate of repetition of the 'pattern' for each element.

for (int k=0; k < offset; k++) {

We'll need to add this line to make doubly sure we don't grow beyond the maximum size of our resulting array:

if (combo < validCombos) {

And here we can finally start building our combination, one element at a time. Each row in combosArray represents one combination consisting of one value from each row in the original array.

combosArray[combo][i] = my2DArray[i][j];

We also need to increment our combo counter to ensure we fill out the entire array and not just each row over and over!

combo++;

Now we can add close brackets for all except the first for loop.

}
}
}
}

Finally, we need to adjust the offset value for the next element in the combination to ensure that every possible combination is created, and then we can close out the final for loop.

offset *= my2DArray[i].length;
}

And that's it. combosArray now contains rows representing every possible combination from the original 2D array. Print the contents and have a look if you like.

for (int i=0; i < combosArray.length; i++)
{
    System.out.print("[ ");
    for (int j=0; j < combosArray[i].length; j++)
        System.out.print(combosArray[i][j] + " ");
    System.out.println("]");
}

After all that I finally have something working exactly the way I need it to. Given a 2D array of varying row length, we have created another 2D array containing every possible combination taking one value from each original row, without wasting any memory space or unnecessary steps.

Now this might all be a little confusing, and not very useful unless you are working on something very similar to this, but hopefully if someone out there finds themselves in the jam I was in, they will be able make some use of it. The entire block of code is below:

int validCombos = 1;

for (int i=0; i < my2DArray.length; i++)
    validCombos *= my2DArray[i].length;

combosArray = new int[validCombos][my2DArray.length];

int combo;
int offset = 1;

for (int i=0; i < my2DArray.length; i++)
{
    combo = 0;
    while (combo < validCombos)
    {
        for (int j=0; j < my2DArray[i].length; j++)
        {
            for (int k=0; k < offset; k++)
            {
                if (combo < validCombos)
                {
                    combosArray[combo][i] = my2DArray[i][j];
                    combo++;
                }
            }
        }
    }
    offset *= my2DArray[i].length;
}

for (int i=0; i < combosArray.length; i++)
{
    System.out.print("[ ");
    for (int j=0; j < combosArray[i].length; j++)
        System.out.print(combosArray[i][j] + " ");
    System.out.println("]");
}

And just in case you are wondering, this code was used to increase the efficiency of the part-of-speech tagging tool I'm working on.

No comments: