Proceeding from the earlier post on the same topic, we have the following code for a Dynamic Programming bottom-up approach to the Levenshtein “Edit Distance” algorithm where all the weights are simply “1” for insert, delete, substitute operations. Note that this would be also referred to the “tabular” approach as well since we fill up a matrix as we go along.
In a later post, we would attempt to reduce the space requirement from O(m.n) to a constant. In another post, we would provide a way to configure the weights and thereby providing weighted edges to the DAG.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
/** * A little different: we convert s2 into s1. * * @param s1 * @param s2 * @return */ public int cost1(char[] s1, char[] s2) { int lenS1 = s1.length; int lenS2 = s2.length; // we convert s2 into s1. int[][] cost = new int[lenS1][lenS2]; for (int i = 0; i < lenS1; i++) { cost[i][0] = i; } for (int j = 0; j < lenS2; j++) { cost[0][j] = j; } // now we begin for (int i = 1; i < lenS1; i++) { for (int j = 1; j < lenS2; j++) { if (s1[i] == s2[j]) { // whatever the cost was for the previous indices cost[i][j] = cost[i-1][j-1]; } else { int costIns = cost[i-1][j] + 1; int costDel = cost[i][j-1] + 1; int costSub = cost[i-1][j-1] + 1; cost[i][j] = Math.min(costIns, Math.min(costDel, costSub)); } } } return cost[lenS1 - 1][lenS2 -1]; } |
And the test code:
1 2 3 4 5 6 7 8 9 10 |
@Test public void testCost1_A() { String str1 = "adcklllkl"; String str2 = "abcjfkerjfgggg"; int cost1 = new DPTabulateLevenshtein().cost1(str1.toCharArray(), str2.toCharArray()); int costWiki = new PlainLevenshtein(str1, str2).LevenshteinDistance(str1.length(), str2.length()); System.out.println("cost 1: " +cost1); System.out.println("cost Wiki: " +costWiki); assertEquals(cost1, 11); } |