This PDF details my attempt to understand this interesting topic.
Please see this issue that I created for LogBack and was accepted to be taken care of in v1.3.0:
https://jira.qos.ch/browse/LOGBACK-1288GK
The fix should allow for a stack trace or some warning message if the level value is incorrect as it was when I had inadvertently used a space character as a suffix as in “WARN “ instead of “WARN” and it led to the library defaulting to “DEBUG”.
InetAddress: This address maps to a single IP address for a host name even though the host name might resolve to multiple IP addresses.
One way of creating an InetAddress:
1 |
InetAddress add = InetAddress.getByName(String host) |
Here the “host” argument can be either a string representation of the IP Address (as in “x.x.x.x.” for IPv4) or a host name. If it is an IP Address then an InetAddress
instance is returned with that IP. In such cases, the InetAddress
instance would be bound to the IP specified in the host. If that host has multiple IP addresses then this InetAddress
instance would remain bound to the IP address passed in. That returned instance could be an instance of the two subclasses of InetAddress
for IPv4 and IPv6.
1 |
InetSocketAddress(String host, int port) |
=> This constructor, if passed an IP address would invoke the InetAddress.getByName(String host)
with the IP address and there would not be any reverse lookups.
Difference between map()
and flatMap()
:
The map()
operates on the individual elements of the stream and returns a value as a Stream with the results. The flatMap()
operates on the individual elements of the stream and returns a value as a Stream
with the results. In this they are similar.
However the map()
takes in a function that returns any R
(could be a Stream
as well) whereas the flatMap()
operation takes in a function that returns a Stream
.
Streams employ the CRTP (Curiously Recurring Template Pattern).
For instance:
1 2 3 4 5 6 |
public interface BaseStream<T, S extends BaseStream<T, S>> where: the type of the stream elements <s> the type of of the stream implementing {@code BaseStream} public interface Stream extends BaseStream<T, Stream> public interface IntStream extends BaseStream<Integer, IntStream> </s> |
If we miss out on the “public” scope modifier for the test case method annotated with @Test then that is not identified as a test case by the framework and is not run.
Ensure “public” for any test case (annotated with @Test)
While attempting to do a substitution of word characters “[a-zA-Z0-9_]” in a string, I noticed using a “\w*” pattern lead to 2 matches:
For instance: see: https://regex101.com/r/ptz8Cm/1
The explanation is at: http://www.regular-expressions.info/zerolength.html
In short, we should use “\w+” instead of “\w*” so that zero-length matches are avoided.
Note: if you make the quantifier lazy as in https://regex101.com/r/ptz8Cm/2 then you have 3 matches.
If we ever use “max-age” for content that is partially dynamic in the sense that it may or may not change in “X” minutes and we use a CDN to cache that and we also have a “max-age” header then we need to remember to do this:
The reasoning is that the clients (browsers etc.) would use these two values to decide on refreshing the cache.
Consider an OCSP response which is served through a CDN. We also know that the OCSP response has a “nextUpdate” temporal value and the RFC 5019 clearly states that the “max-age” should be less (or equal to since it seems that the clients allow for equality as a positive case as well) than the “nextUpdate“.
The issue is when the “max-age” is not counted down by the CDN and the value in the “Date:” header is – we get an issue where the OCSP response might be stale but it is not timed out of the cache either for validation or a fetch.
Proceeding from the earlier post on the same topic, we have the following code for a Dynamic Programming bottom-up approach to the Levenshtein “Edit Distance” algorithm where all the weights are simply “1” for insert, delete, substitute operations. Note that this would be also referred to the “tabular” approach as well since we fill up a matrix as we go along.
In a later post, we would attempt to reduce the space requirement from O(m.n) to a constant. In another post, we would provide a way to configure the weights and thereby providing weighted edges to the DAG.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
/** * A little different: we convert s2 into s1. * * @param s1 * @param s2 * @return */ public int cost1(char[] s1, char[] s2) { int lenS1 = s1.length; int lenS2 = s2.length; // we convert s2 into s1. int[][] cost = new int[lenS1][lenS2]; for (int i = 0; i < lenS1; i++) { cost[i][0] = i; } for (int j = 0; j < lenS2; j++) { cost[0][j] = j; } // now we begin for (int i = 1; i < lenS1; i++) { for (int j = 1; j < lenS2; j++) { if (s1[i] == s2[j]) { // whatever the cost was for the previous indices cost[i][j] = cost[i-1][j-1]; } else { int costIns = cost[i-1][j] + 1; int costDel = cost[i][j-1] + 1; int costSub = cost[i-1][j-1] + 1; cost[i][j] = Math.min(costIns, Math.min(costDel, costSub)); } } } return cost[lenS1 - 1][lenS2 -1]; } |
And the test code:
1 2 3 4 5 6 7 8 9 10 |
@Test public void testCost1_A() { String str1 = "adcklllkl"; String str2 = "abcjfkerjfgggg"; int cost1 = new DPTabulateLevenshtein().cost1(str1.toCharArray(), str2.toCharArray()); int costWiki = new PlainLevenshtein(str1, str2).LevenshteinDistance(str1.length(), str2.length()); System.out.println("cost 1: " +cost1); System.out.println("cost Wiki: " +costWiki); assertEquals(cost1, 11); } |
This is my first post on Levenshtein Algorithm: a way to calculate the edit distance between two strings. A way to calculate the cost to transmute a string1 to a string2.
The comments in the codebase are useful for comprehension on their own. I start with the recursing and building a matrix where the answer (the cost to transmute or convert str1 into str2) lies at index [m][n] for the first implementation. For the second case, the cost would be at index [0][0].
This is the gateway to elucidating DP through the implementation of this algorithm. We would implement in ways that classify as DP (both memoization and tabulation) as well as plain and simple recursion or iteration with no DP components. We start with vanilla recursion in the following codebase. Please review the comments in the codebase to understand why this is not DP. A future post would elucidate DP.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 |
public class PlainLevenshtein { private final char[] str1; private final char[] str2; public PlainLevenshtein(String s1, String s2) { str1 = s1.toCharArray(); str2 = s2.toCharArray(); } /** * my code 1: A top-down approach that can benefit from memoization. It * recurses and creates a matrix where the answer lies at [m][n] * <p> * The goal is to transmute "str1" into "str2". * <p> * The initial call is {@code cost1(m, n)} where m = size of "str1" <br> * and n = size of "str2". * * @param i * the index (the 0th based index from 0 to m - 1) * @param j * the index (the 0th based index from 0 n - 1) * @return: the final cost */ public int cost1(int i, int j) { if (i == 0) { return j; } if (j == 0) { return i; } assert i >= 0; assert j >= 0; int c = 1; if (str1[i - 1] == str2[j - 1]) { c = 0; } int ins = cost1(i, j - 1) + 1; int del = cost1(i - 1, j) + 1; int sub = cost1(i - 1, j - 1) + c; return Math.min(ins, Math.min(del, sub)); } /** * my code 2: The difference is that this recurses and builds a matrix and * the answer lies at [0][0]. This could also benefit from memoization as * well. * * <p> * However in order for it to qualify as DP, they key is to solve the * smaller sub-problems first and then move on to solver bigger sub-problems * with the results of the smaller sub-problems that are already computed * and available at that point. Therefore this is not DP either. * <p> * The initial call is cost2(0,0) * * @param i * the index (the 0th based index from 0 to m - 1) * @param j * the index (the 0th based index from 0 n - 1) * @return: the final cost */ public int cost2(int i, int j) { if (i == str1.length) { return str2.length - j; } if (j == str2.length) { return str1.length - i; } assert i >= 0; assert j >= 0; int c = 1; if (str1[i] == str2[j]) { c = 0; } int ins = cost2(i, j + 1) + 1; int del = cost2(i + 1, j) + 1; int sub = cost2(i + 1, j + 1) + c; return Math.min(ins, Math.min(del, sub)); } } |
A snippet of the test code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
@Test public void testCost1_A() { String str1 = "adcklllkl"; String str2 = "abcjfkerjfgggg"; int cost1 = new PlainLevenshtein(str1, str2).cost1(str1.length(), str2.length()); System.out.println("cost 1: " +cost1); assertEquals(cost1, 11); } @Test public void testCost1_B() { String str1 = "abc"; String str2 = "xya"; int cost1 = new PlainLevenshtein(str1, str2).cost1(str1.length(), str2.length()); System.out.println("cost 1: " +cost1); assertEquals(cost1, 3); } |