Translate

Friday, March 21, 2014

Rearrange a string so that all same characters become d distance away

Given a string and a positive integer d. Some characters may be repeated in the given string.Rearrange characters of the given string such that the same characters become d distance away from each other. Note that there can be many possible rearrangements, the output should be one of the possible rearrangements. If no such arrangement is possible, that should also be reported.
Expected time complexity is O(n) where n is length of input string.
Examples:
Input:  "abb", d = 2
Output: "bab"

Input:  "aacbbc", d = 3
Output: "abcabc"

Input: "geeksforgeeks", d = 3
Output: egkegkesfesor

Input:  "aaa",  d = 2
Output: Cannot be rearranged
We strongly recommend to minimize the browser and try this yourself first.
Hint: Alphabet size may be assumed as constant (256) and extra space may be used.
Solution: The idea is to count frequencies of all characters and consider the most frequent character first and place all occurrences of it as close as possible. After the most frequent character is placed, repeat the same process for remaining characters.
1) Let the given string be str and size of string be n
2) Traverse str, store all characters and their frequencies in a Max Heap MH. The value of frequency decides the order in MH, i.e., the most frequent character is at the root of MH.
3) Make all characters of str as ‘\0′.
4) Do following while MH is not empty.
a) Extract the Most frequent character. Let the extracted character be x and its frequency be f.
b) Find the first available position in str, i.e., find the first ‘\0′ in str.
c) Let the first position be p. Fill x at p, p+d,.. p+(f-1)d
Following is C++ implementation of above algorithm.
// Rearrange a string so that all same characters become at least d
// distance away
#include <iostream>
#include <cstring>
#include <cstdlib>
#define MAX 256
using namespace std;
 
// A structure to store a character 'c' and its frequency 'f'
// in input string
struct charFreq {
    char c;
    int f;
};
 
// A utility function to swap two charFreq items.
void swap(charFreq *x, charFreq *y) {
    charFreq z = *x;
    *x = *y;
    *y = z;
}
 
// A utility function to maxheapify the node freq[i] of a heap
// stored in freq[]
void maxHeapify(charFreq freq[], int i, int heap_size)
{
    int l = i*2 + 1;
    int r = i*2 + 2;
    int largest = i;
    if (l < heap_size && freq[l].f > freq[i].f)
        largest = l;
    if (r < heap_size && freq[r].f > freq[largest].f)
        largest = r;
    if (largest != i)
    {
        swap(&freq[i], &freq[largest]);
        maxHeapify(freq, largest, heap_size);
    }
}
 
// A utility function to convert the array freq[] to a max heap
void buildHeap(charFreq freq[], int n)
{
    int i = (n - 1)/2;
    while (i >= 0)
    {
        maxHeapify(freq, i, n);
        i--;
    }
}
 
// A utility function to remove the max item or root from max heap
charFreq extractMax(charFreq freq[], int heap_size)
{
    charFreq root = freq[0];
    if (heap_size > 1)
    {
        freq[0] = freq[heap_size-1];
        maxHeapify(freq, 0, heap_size-1);
    }
    return root;
}
 
// The main function that rearranges input string 'str' such that
// two same characters become d distance away
void rearrange(char str[], int d)
{
    // Find length of input string
    int n = strlen(str);
 
    // Create an array to store all characters and their
    // frequencies in str[]
    charFreq freq[MAX] = {{0, 0}};
 
    int m = 0; // To store count of distinct characters in str[]
 
    // Traverse the input string and store frequencies of all
    // characters in freq[] array.
    for (int i = 0; i < n; i++)
    {       
        char x = str[i];
 
        // If this character has occurred first time, increment m
        if (freq[x].c == 0)
            freq[x].c = x, m++;
 
        (freq[x].f)++;
        str[i] = '\0'// This change is used later
    }
 
    // Build a max heap of all characters
    buildHeap(freq, MAX);
 
    // Now one by one extract all distinct characters from max heap
    // and put them back in str[] with the d distance constraint
    for (int i = 0; i < m; i++)
    {
        charFreq x = extractMax(freq, MAX-i);
 
        // Find the first available position in str[]
        int p = i;
        while (str[p] != '\0')
            p++;
 
        // Fill x.c at p, p+d, p+2d, .. p+(f-1)d
        for (int k = 0; k < x.f; k++)
        {
            // If the index goes beyond size, then string cannot
            // be rearranged.
            if (p + d*k >= n)
            {
                cout << "Cannot be rearranged";
                exit(0);
            }
            str[p + d*k] = x.c;
        }
    }
}
 
// Driver program to test above functions
int main()
{
    char str[] = "aabbcc";
    rearrange(str, 3);
    cout << str;
}
Output:
abcabc
Algorithmic Paradigm: Greedy Algorithm
Time Complexity: Time complexity of above implementation is O(n + mLog(MAX)). Here n is the length of str, m is count of distinct characters in str[] and MAX is maximum possible different characters. MAX is typically 256 (a constant) and m is smaller than MAX. So the time complexity can be considered as O(n).
More Analysis:
The above code can be optimized to store only m characters in heap, we have kept it this way to keep the code simple. So the time complexity can be improved to O(n + mLogm). It doesn’t much matter through as MAX is a constant.
Also, the above algorithm can be implemented using a O(mLogm) sorting algorithm. The first steps of above algorithm remain same. Instead of building a heap, we can sort the freq[] array in non-increasing order of frequencies and then consider all characters one by one from sorted array.

No comments:

Post a Comment