Problem Statement
There are two sorted arrays A and B of sizes m and n respectively.
Find the median of the two sorted arrays( The median of the array formed by merging both the arrays).
Median: The middle element is found by ordering all elements in sorted order and picking out the one in the middle (or if there are two middle numbers, taking the mean of those two numbers).
Examples:
Input: A[] = {1, 4, 5}, B[] = {2, 3}
Output: 3
Explanation:
Merging both the arrays and arranging in ascending:
[1, 2, 3, 4, 5]
Hence, the median is 3
Input: A[] = {1, 2, 3, 4}, B[] = {5, 6}
Output: 3.5
Explanation:
Union of both arrays:
{1, 2, 3, 4, 5, 6}
Median = (3 + 4) / 2 = 3.5
Simple approach: Using Extra Space
The most basic approach is to merge both the sorted arrays using an auxiliary array. The median would be the middle element in the case of an odd-length array or the mean of both middle elements in the case of even length array.
The merging of two sorted arrays is similar to the algorithm which we follow in merge sort.
Algorithm
- The first element of both lists is compared.
- If sorted in ascending order, the smaller element among two becomes a new element of the sorted list.
- This procedure is repeated until both the smaller sublists are empty and the newly combined sublist covers all the elements of both the sublists.
- Maintain a variable count for the output array and if count equals (N + M) / 2, then it is the median of the odd length array and if it is even, store the mean of (N + M) / 2th and (N + M) / 2 – 1 th element.
C++ Implementation
int getMedian(int A[], int B[], int n, int m) { int i = 0; int j = 0; int count; int m1 = -1, m2 = -1; if((m + n) % 2 == 1) { for (count = 0; count <= (n + m)/2; count++) { if(i != n && j != m) { m1 = (A[i] > B[j]) ? B[j++] : A[i++]; } else if(i < n) { m1 = A[i++]; } else { m1 = B[j++]; } } return m1; } else { for (count = 0; count <= (n + m)/2; count++) { m2 = m1; if(i != n && j != m) { m1 = (A[i] > B[j]) ? B[j++] : A[i++]; } else if(i < n) { m1 = A[i++]; } else { m1 = B[j++]; } } return (m1 + m2)/2; } }
Practice on our C++ Compiler
Java Implementation
static int getMedian(int A[], int B[], int n, int m) { int i = 0; int j = 0; int count; int m1 = -1, m2 = -1; if ((m + n) % 2 == 1) { for(count = 0; count <= (n + m) / 2; count++) { if (i != n && j != m) { m1 = (A[i] > B[j]) ? B[j++] : A[i++]; } else if (i < n) { m1 = A[i++]; } else { m1 = B[j++]; } } return m1; } else { for(count = 0; count <= (n + m) / 2; count++) { m2 = m1; if (i != n && j != m) { m1 = (A[i] > B[j]) ? B[j++] : A[i++]; } else if (i < n) { m1 = A[i++]; } else { m1 = B[j++]; } } return (m1 + m2) / 2; } }
Practice on our Java Compiler
Python Implementation
def getMedian(A, B, n, m) : i = 0 j = 0 m1, m2 = -1, -1 if((m + n) % 2 == 1) : for count in range(((n + m) // 2) + 1) : if(i != n and j != m) : if A[i] > B[j] : m1 = B[j] j += 1 else : m1 = A[i] i += 1 elif(i < n) : m1 = A[i] i += 1 # for case when j<m, else : m1 = B[j] j += 1 return m1 else : for count in range(((n + m) // 2) + 1) : m2 = m1 if(i != n and j != m) : if A[i] > B[j] : m1 = B[j] j += 1 else : m1 = A[i] i += 1 elif(i < n) : m1 = A[i] i += 1 # for case when j<m, else : m1 = B[j] j += 1 return (m1 + m2)//2
Practice on our Python Compiler
Time Complexity: O(N + M) where N and M is the size of the array A[] and B[]Space Complexity: O(1)
Efficient Approach (Using binary search)
The key idea to note here is that both the arrays are sorted. Therefore, this leads us to think of binary search. Let us try to understand the algorithm using an example:
A[] = {1, 4, 7}
B[] = {2, 3, 5, 6}
From the above diagram, it can be easily deduced that only the first half of the array is needed and the right half can be discarded.
Similarly, for an even length merged array, ignore the right half and only the left half contributes to our final answer.
Therefore, the motive of our approach is to find which of the elements from both the array helps in contributing to the final answer. Therefore, the binary search comes to the rescue, as it can discard a part of the array every time, the elements don’t contribute to the median.
Algorithm
- The first array is of size n, hence it can be split into n + 1 parts.
- The second array is of size m, hence it can be split into m + 1 parts
- As discussed earlier, we just need to find the elements contributing to the left half of the array.
- Since, the arrays are already sorted, it can be deduced that A[i – 1] < A[i] and B[i – 1] < B[i].
- Therefore, we just need to find the index i, such that A[i – 1] <= B[j] and B[j – 1] <= A[i].
- Consider mid = (n + m – 1) / 2 and check if this satisfies the above condition.
- If A[i-1] <= B[j] and B[j-1]<=A[i] satisfies the condition, return the index i.
- If A[i] < B[j – 1], increase the range towards the right. Hence update i = mid + 1.
- Similarly, if A[i – 1] > B[j], decrease the range towards left. Hence update i = mid – 1.
i is the mid of array A as shown below:
- Few corner cases to take care of :
- If the size of any of the arrays is 0, return the median of the non-zero sized array.
- If the size of smaller array is 1,:
- If the size of the larger array is also one, simply return the median as the mean of both the elements.
- Else, if size of larger array is odd, adding the element from first array will result in size even, hence median will be affected if and only if, the element of the first array lies between , M / 2th and M/2 + 1th element of B[].
- Similarly, if the size of the larger array is even, check for the element of the smaller array, M/2th element and M / 2 + 1 th element.
- If the size of smaller array is 2,
- If a larger array has an odd number of elements, the median can be either the middle element or the median of elements of smaller array and M/2 -1th element or minimum of the second element of A[] and M/2+ 1 th array.
C++ Code for Binary Search Approach
float MO2(int a, int b) { return ( a + b ) / 2.0; } float MO3(int a, int b, int c) { return a + b + c - max(a, max(b, c)) - min(a, min(b, c)); } float MO4(int a, int b, int c, int d) { int Max = max( a, max( b, max( c, d ) ) ); int Min = min( a, min( b, min( c, d ) ) ); return ( a + b + c + d - Max - Min ) / 2.0; } float medianSingle(int arr[], int n) { if (n == 0) return -1; if (n%2 == 0) return (double)(arr[n/2] + arr[n/2-1])/2; return arr[n/2]; } float findMedianUtil( int A[], int N, int B[], int M ) { if (N == 0) return medianSingle(B, M); if (N == 1) { if (M == 1) return MO2(A[0], B[0]); if (M & 1) return MO2( B[M/2], MO3(A[0], B[M/2 - 1], B[M/2 + 1]) ); return MO3( B[M/2], B[M/2 - 1], A[0] ); } else if (N == 2) { if (M == 2) return MO4(A[0], A[1], B[0], B[1]); if (M & 1) return MO3 ( B[M/2], max(A[0], B[M/2 - 1]), min(A[1], B[M/2 + 1]) ); return MO4 ( B[M/2], B[M/2 - 1], max( A[0], B[M/2 - 2] ), min( A[1], B[M/2 + 1] ) ); } int idxA = ( N - 1 ) / 2; int idxB = ( M - 1 ) / 2; if (A[idxA] <= B[idxB] ) return findMedianUtil(A + idxA, N/2 + 1, B, M - idxA ); return findMedianUtil(A, N/2 + 1, B + idxA, M - idxA ); } float findMedian( int A[], int N, int B[], int M ) { if (N > M) return findMedianUtil( B, M, A, N ); return findMedianUtil( A, N, B, M ); }
Java Code for Binary Search Approach
static float MO2(int a, int b) { return (float) ((a + b) / 2.0); } static float MO3(int a, int b, int c) { return a + b + c - Math.max(a, Math.max(b, c)) - Math.min(a, Math.min(b, c)); } static float MO4(int a, int b, int c, int d) { int Max = Math.max(a, Math.max(b, Math.max(c, d))); int Min = Math.min(a, Math.min(b, Math.min(c, d))); return (float) ((a + b + c + d - Max - Min) / 2.0); } static float medianSingle(int arr[], int n) { if (n == 0) return -1; if (n % 2 == 0) return (float) ((double) (arr[n / 2] + arr[n / 2 - 1]) / 2); return arr[n / 2]; } static float findMedianUtil(int A[], int N, int B[], int M) { if (N == 0) return medianSingle(B, M); if (N == 1) { if (M == 1) return MO2(A[0], B[0]); if (M % 2 == 1) return MO2(B[M / 2], (int) MO3(A[0], B[M / 2 - 1], B[M / 2 + 1])); return MO3(B[M / 2], B[M / 2 - 1], A[0]); } else if (N == 2) { if (M == 2) return MO4(A[0], A[1], B[0], B[1]); if (M % 2 == 1) return MO3(B[M / 2], Math.max(A[0], B[M / 2 - 1]), Math.min(A[1], B[M / 2 + 1])); return MO4(B[M / 2], B[M / 2 - 1], Math.max(A[0], B[M / 2 - 2]), Math.min(A[1], B[M / 2 + 1])); } int idxA = (N - 1) / 2; int idxB = (M - 1) / 2; if (A[idxA] <= B[idxB]) return findMedianUtil(Arrays.copyOfRange(A, idxA, A.length), N / 2 + 1, B, M - idxA); return findMedianUtil(A, N / 2 + 1, Arrays.copyOfRange(B, idxB, B.length), M - idxA); } static float findMedian(int A[], int N, int B[], int M) { if (N > M) return findMedianUtil(B, M, A, N); return findMedianUtil(A, N, B, M); }
Python Code for Binary Search Approach
def MO2(a, b) : return ( a + b ) / 2 def MO3(a, b, c) : return a + b + c - max(a, max(b, c)) - min(a, min(b, c)) def MO4(a, b, c, d) : Max = max( a, max( b, max( c, d ) ) ) Min = min( a, min( b, min( c, d ) ) ) return ( a + b + c + d - Max - Min ) / 2 def medianSingle(arr, n) : if (n == 0) : return -1 if (n % 2 == 0) : return (arr[n / 2] + arr[n / 2 - 1]) / 2 return arr[n / 2] def findMedianUtil(A, N, B, M) : if (N == 0) : return medianSingle(B, M) if (N == 1) : if (M == 1) : return MO2(A[0], B[0]) if (M & 1 != 0) : return MO2( B[M / 2], MO3(A[0], B[M / 2 - 1], B[M / 2 + 1]) ) return MO3(B[M // 2], B[M // 2 - 1], A[0]) elif (N == 2) : if (M == 2) : return MO4(A[0], A[1], B[0], B[1]) if (M & 1 != 0) : return MO3 (B[M / 2], max(A[0], B[M / 2 - 1]), min(A[1], B[M / 2 + 1])) return MO4 (B[M / 2], B[M / 2 - 1], max( A[0], B[M / 2 - 2] ), min( A[1], B[M / 2 + 1] )) idxA = ( N - 1 ) / 2 idxB = ( M - 1 ) / 2 if (A[idxA] <= B[idxB] ) : return findMedianUtil(A + idxA, N / 2 + 1, B, M - idxA ) return findMedianUtil(A, N / 2 + 1, B + idxA, M - idxA ) def findMedian(A, N, B, M) : if (N > M) : return findMedianUtil( B, M, A, N ); return findMedianUtil( A, N, B, M )
Time Complexity: O(log(min(N,M)) where N and M is the size of the array A[] and B[].Space Complexity: O(1), as no extra space is used.
Practice Question
Frequently Asked Questions
Do the arrays need to be sorted?
Yes, both the arrays need, else you cannot apply the binary search technique to find the median.
What is the median of an array?
The middle element is found by ordering all elements in sorted order and picking out the one in the middle (or if there are two middle numbers, taking the mean of those two numbers).