'(a b c d e f g h i j)
We'll add some elements:
'(a b c d e f g h i j k l m n o)
Delete some:
'(a b c d e i j k l m n o)
Add some more:
'(x y z a b c d e i j k l m n o)
Maybe delete one:
'(x y z b c d e i j k l m n o)
We want to compare the original sequence with the end sequence and
summarize the difference with a set of "indels", which specify the
indexes at where elements were inserted or deleted:
(diff '(a b c d e f g h i j) '(x y z b c d e i j k l m n o)) (#S(INDEL :LEFT-SUBSEQ-START 0 ;; 'a' :LEFT-SUBSEQ-LIMIT 1 :RIGHT-SUBSEQ-START 0 ;; 'x y z' :RIGHT-SUBSEQ-LIMIT 3) #S(INDEL :LEFT-SUBSEQ-START 5 ;; 'f g h' :LEFT-SUBSEQ-LIMIT 8 :RIGHT-SUBSEQ-START 7 ;; nothing :RIGHT-SUBSEQ-LIMIT 7) #S(INDEL :LEFT-SUBSEQ-START 10 ;; nothing :LEFT-SUBSEQ-LIMIT 10 :RIGHT-SUBSEQ-START 9 ;; 'k l m n o' :RIGHT-SUBSEQ-LIMIT 14))Exercise 6 (hard): Implement this.
For an extreme challenge, implement this in a way that is not hopelessly inefficient.
I suppose it would be hard if you tried to come up with it yourself:
ReplyDeletehttps://en.wikipedia.org/wiki/Longest_common_subsequence_problem
@Jason: isn't Edit distance a better starting point for this task?
ReplyDeleteWhat about a solution in O(length(a)+length(b))?
ReplyDelete(defstruct indel left-subseq-start left-subseq-limit right-subseq-start right-subseq-limit)
(defun diff (a b)
(list
(make-indel :left-subseq-start 0 :left-subseq-limit (length a)
:right-subseq-start 0 :right-subseq-limit (length b))))
The problem is ill-specified.