Generate all longest common prefix between a list and all its own suffixes

let lcp be a function of generating the number of longest common prefix between two lists.

for example lcp [1;2;3;1;2] [1;2] = 2.


For a given list, it has n non-empty suffixes, including itself.

For example,

all suffixes of [1;2;3] are:

[1;2;3] [2;3] [3]


The problem here is generating all the lcp between a list and all its own suffixes.

Here we are talking about list A and A's all suffixes. There are no list B existing.

for example, for list [1;2;3], I want a list of values and each value is

llcp [1;2;3] [1;2;3]

llcp [1;2;3] [2;3]

llcp [1;2;3] [3]

respectively. i.e., the result should be [3;0;0]


It is straightforward to use brute-force, but it take O(n^2).

How to do it in O(n)?

Don't use suffix tree or suffix array.


edit

I think it is definitely possible to do O(n). I have not figured out fully yet, but I find something interesting.

For example, we have [x1;x2;x3;x4;x5;x6], then its all suffixes are

row1: [x1;x2;x3;x4;x5;x6]

row2: [x2;x3;x4;x5;x6]

row3: [x3;x4;x5;x6]

row4: [x4;x5;x6]

row5: [x5;x6]

row6: [x6]

Our goal is to calculate

lcp row1 row1 (obviously it is n)

lcp row1 row2

lcp row1 row3

lcp row1 row4

lcp row1 row5

lcp row1 row6


So let's try lcp row1 row2 first.

If let's say lcp row1 row2 = 3, this means x1=x2; x2=x3; x3=x4; x4<>x5, right?

Then what we can get from the above comparisons?

we can know

  • x1 = x3; x2 = x4; x3 <> x5, right? then immediately we get lcp row1 row3 = 2, right?
  • x1 = x4; x2 <> x5, right? then immediately lcp row1 row4 = 1
  • x1 <> x5, then immediately lcp row1 row5 = 0

And only lcp row1 row6 is left to do.

The key here I guess is equality is transferable and also inequality can be inducted via partial equality information

Answers


Compute z-function for the list(treating is like a string and each element as one character). Then the answer for the suffix starting at the i-th position is z(i). It should be easy to find z-function algorithm implementation in O(n). Here is my O(n) implementation.

   public static List<Integer> getLcp(List<Integer> input) {
            Integer[] lcp = new Integer[input.size()];
            Arrays.fill(lcp, input.size());
            int left = 1;
            int right = -1;
            for (int pos = 1; pos < input.size(); pos++) {
                    if (pos <= right)
                            lcp[pos] = Math.min(lcp[pos - left], right - pos + 1);
                    else
                            lcp[pos] = 0;
                    while (pos + lcp[pos] < input.size() &&
                                    input.get(pos + lcp[pos]) == input.get(lcp[pos]))
                            lcp[pos]++;
                    if (pos + lcp[pos] - 1 > right) {
                            left = pos;
                            right = pos + lcp[pos] - 1;
                    }
            }
            return Arrays.asList(lcp);
    }

I dont think that this is possible to solve in O(n). But you can make a good optimization.

I would propose something linke this:

  • First get the maximal suffix of element 1 and 2
  • Then get the maximal suffix between the previous result and element 3
  • go on until you reach the end

In most cases the suffix should become a really short string after you have processed the first few elements, and then you don't have to compaare many chars. The idea is that when element 1 and 2 have a n-long suffix the global suffix can maximal be n-long (but it can be shorter).


Edit:

Okay i think i understood your problem now :D But i can't think of a O(n)-way. Here is my best try (unfortunately in C#, because currently i have no java-IDE on this computer...)

static void Main(string[] args)
{
    Console.WriteLine(string.Join("  ", llcp(new List<char> { 'A', 'B', 'C', 'A', 'A', 'B' }).Select(p => p.ToString()).ToArray()));

    Console.ReadLine();
}

static List<int> llcp(List<char> elements)
{
    List<int> result = new List<int>();
    for (int i = 0; i < elements.Count; i++)
    {
        int j;
        for (j = 0; j < elements.Count; j++)
            if (i+j >= elements.Count || elements[i+j] != elements[j])
                break;
        result.Add(j);
    }
    return result;
}
//Output: 6  0  0  1  2  0

It should be pretty optimized and im not sure if you can make ist significant faster (idea-wise, probably you can optimize a few thinks of the implementation)

MfG Mike


Need Your Help

Obtain inserted value to update a column of another table inside a for loop

c# postgresql

I am basically trying to achieve what would be the equivalent of this c# in postgres pretending that tables are some lists

Java clearing a jFrame in netbeans

java swing netbeans jframe clear

I am creating a game currently and ran into a large question how do you clear jFrames then add new content i have seen all the other questions and answers but they don't work heres my code. i want my