Omar Khayyam’s quatrains

This is something that I internalized after reading it a long time ago. Can’t remember the book or the translator.

Come forth in the first of spring

The winter garment of repentance fling

The bird of time has but a little way to fly

And the bird is on the wing

============================

Dreaming when Dawn’s Left Hand was in the Sky

I heard a Voice within the Tavern cry

"Awake, my Little ones, and fill the Cup

Before Life’s Liquor in its Cup be dry."

===============================

Binary Trees

Binary Tree:

A binary tree (T) is defined as a set of elements called nodes such that:

  1. T is empty (called the empty tree or null tree), or
  2. T contains a distinguished node R, called the root of T, and the remaining nodes of T form an ordered pair of disjoint binary trees T1 and T2.

This implies that a binary tree could consist of just a root node, or a root node with an empty T1 and a non-empty T2, and so on.

The height of the empty binary tree is -1. The level of a node (which starts from 0 at the root) is also referred to as the depth of that node. The height of a nonempty binary tree is the depth of its deepest leaf.

2-Tree, Full and Complete Binary Tree:

A binary tree is a 2-tree if each node N has either 0 or 2 children. 

A binary tree is full if

  • T is empty, or
  • there is a distinguished node R with either 0 children, or 2 children that are themselves full binary trees of the same height (so every level is completely filled).

A complete binary tree is completely filled up to the next-to-last level, and the leaves on the lowest level are as far to the left as possible. All full binary trees are complete, but the converse is not true.

For a complete binary tree, we have the following formulas:

  • The maximum number of nodes is 2^(i+1) - 1, where i is the height of the tree.
  • The maximum number of nodes at a particular level is 2^k, where k is the level.
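
For example, a tree of height i = 2 has at most 2^3 - 1 = 7 nodes, and at most 2^2 = 4 of them sit at level k = 2.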

Memory Representation: 

A binary tree can be represented in the memory in the following ways:

  1. As a linked structure (each node holds references to its left and right children).
  2. As an indexed array.

I would prefer the indexed array for a complete or nearly complete binary tree, since looking up the children of any node takes constant time. If the tree is far from complete, then space is wasted and this approach may not meet an algorithm's space requirements.

For instance, in the case of the array representation:

  • The root node is at Tree[0]
  • For a node at Tree[k], the left child (if any) is at Tree[2*k+1] and the right child (if any) is at Tree[2*k+2]
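
A small sketch of this index arithmetic (the class and method names are just illustrative):

class ArrayTree {
    // Index of the left and right child of the node stored at Tree[k],
    // following the layout in the bullets above.
    static int left(int k)  { return 2 * k + 1; }
    static int right(int k) { return 2 * k + 2; }
    // Parent index (not listed above, but implied by the same layout); the root is k = 0.
    static int parent(int k) { return (k - 1) / 2; }
}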

External Path Length of a binary tree:

It is the sum of all root-to-leaf path lengths. Formally:

If t is a non empty binary tree, then the external path length E(t), is the sum of the depths of all leaves in that tree. This result is important for sorting algorithms.

TODO: proof that the external path length E(t) >= (k/2) * floor(log k).

Traversals:

InOrder

Characterized by LNR (Left, Node, Right). It gets its name from the fact that for a binary search tree, this traversal visits the elements in order (ascending).

PreOrder (a form of depth-first search, DFS)

Characterized by NLR. For an expression tree it produces prefix notation. To implement it iteratively, use a stack: push the right subtree and then the left subtree, so that the left subtree is popped and visited first. Note that the order of pushing is the reverse of the visiting order (see the sketch after the BFS paragraph below).

PostOrder

Characterized by LRN. For an expression tree it produces postfix notation.

BFS

Use a queue to implement this one.
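
A sketch of the two iterative strategies described above (stack-based preorder and queue-based BFS), assuming a minimal Node class; the names are illustrative:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
}

class IterativeTraversals {
    // PreOrder (NLR) with an explicit stack: push right before left so that
    // the left subtree is popped, and therefore visited, first.
    static void preOrder(Node root) {
        Deque<Node> stack = new ArrayDeque<>();
        if (root != null) stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            System.out.println(n.value);               // visit the node
            if (n.right != null) stack.push(n.right);  // pushed first, popped later
            if (n.left != null) stack.push(n.left);    // pushed last, popped next
        }
    }

    // BFS (level order): the same idea, but a FIFO queue replaces the stack.
    static void bfs(Node root) {
        Queue<Node> queue = new ArrayDeque<>();
        if (root != null) queue.add(root);
        while (!queue.isEmpty()) {
            Node n = queue.remove();
            System.out.println(n.value);
            if (n.left != null) queue.add(n.left);
            if (n.right != null) queue.add(n.right);
        }
    }
}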

SQL – Subqueries and correlated queries

A subquery involves nesting; however, when a column from a relation (table) in the outer query is also used in the nested (inner) query, the queries become correlated.

What a correlated query achieves sometimes cannot be achieved by simply using joins or uncorrelated nested queries.

For instance: in the Supplier-Parts database from CJ Date’s Intro to DBMS, the following is an example of a correlated query:

Find the names of all suppliers that supply part "P2":

select SNAME from S
where 'P2' IN
    (select P# from SP
     where S.S# = SP.S#)

Note: conceptually, the inner (correlated) query is evaluated once for every row processed by the outer query.

The Clone method

The clone method is a protected method in class Object. Therefore Object.clone() is visible only to subclasses of Object, which means every class in Java.

Also note that there is a marker interface, Cloneable, which indicates that instances of the implementing class may be cloned, i.e. that the clone method can be invoked on them; if a class does not implement it, Object.clone() throws CloneNotSupportedException.

What is a shallow copy/clone of an object:

A shallow copy comes into play when the object to be cloned references other objects: the copy itself is a new object, but it references the same objects as the original, not copies of them. If the object to be cloned references only primitives or immutable objects, then the behavior provided by invoking Object.clone() in a derived class is effectively equivalent to a deep copy.

How does it work:

The class needs to override Object.clone() with a public clone() method and implement the marker Cloneable interface. The first call in the overridden clone method should be super.clone(), which ultimately travels up to Object.clone().

Object.clone() creates a new instance and copies every field of the original into it (a field-by-field copy across the whole class hierarchy). This leads to shallow cloning for reference-typed fields, since only the references are copied into the newly cloned instance. The overriding clone method in the derived class should therefore perform any deeper cloning that is necessary.
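
A minimal sketch of these mechanics, assuming a hypothetical Person class with one immutable and one mutable field:

import java.util.ArrayList;
import java.util.List;

class Person implements Cloneable {                      // marker interface: cloning is allowed
    private String name;                                 // immutable, the shallow copy is fine
    private List<String> nicknames = new ArrayList<>();  // mutable, needs deeper cloning

    Person(String name) { this.name = name; }

    @Override
    public Person clone() {
        try {
            Person copy = (Person) super.clone();             // field-by-field copy via Object.clone()
            copy.nicknames = new ArrayList<>(this.nicknames); // replace the shared reference with a copy
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: this class implements Cloneable
        }
    }
}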

Why use a copy constructor if clone is present:

A final field cannot be reassigned inside the overridden clone() after super.clone() has run (for example, to replace a mutable final field with a deep copy), so a copy constructor might be needed instead.
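
A sketch of the copy-constructor alternative for the final-field case (the Account class is hypothetical):

class Account {
    private final String id;   // final: cannot be reassigned inside an overridden clone()
    private int balance;

    Account(String id, int balance) { this.id = id; this.balance = balance; }

    // Copy constructor: the final field is simply set again, and no Cloneable is needed.
    Account(Account other) { this(other.id, other.balance); }
}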

Binary Search Tree

A binary search tree is useful since it requires only O(log n) time on average for inserting, removing, or searching for an element. The worst case is of the order of n.

However, if the binary search tree is kept balanced, then the worst case is of the order of log n. Among comparison-based structures, no data structure provides asymptotically better search performance than that.
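
A minimal (unbalanced) sketch of insert and search, assuming integer keys; the names are illustrative:

class BstNode {
    int key;
    BstNode left, right;
    BstNode(int key) { this.key = key; }
}

class BinarySearchTree {
    private BstNode root;

    void insert(int key) { root = insert(root, key); }

    private BstNode insert(BstNode node, int key) {
        if (node == null) return new BstNode(key);
        if (key < node.key) node.left = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        return node;                                   // duplicates are ignored
    }

    boolean contains(int key) {
        BstNode n = root;
        while (n != null) {                            // one node per level: O(height) comparisons
            if (key == n.key) return true;
            n = (key < n.key) ? n.left : n.right;
        }
        return false;
    }
}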

Threads – tips and tricks

Difference between notify and notifyAll

notifyAll wakes up all the threads that are waiting on the specified object's lock. Note that notifyAll may be necessary instead of notify when only a specific thread, or a specific subset of the waiting threads, can actually make progress. For instance, in a producer-consumer scenario the producer should use notifyAll to wake up all the consumers, of which perhaps only one will be able to use the product the producer has made available. The producer might have 2 units of product available while some consumers require more than 2; a plain notify might wake up one of those consumers, and there would be no progress.

Also note that notify does not relinquish the lock the way wait does. The lock is released only when the notifying thread exits the synchronized section. The thread invoking notify may continue to run even though the waiting thread has been notified and is now in a RUNNABLE state, ready to acquire the lock.
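
A sketch of the producer-consumer situation above, using a hypothetical pool that tracks how many units are available:

class ProductPool {
    private int available = 0;

    synchronized void produce(int quantity) {
        available += quantity;
        notifyAll();                      // wake every waiting consumer; only those whose demand fits will proceed
    }

    synchronized void consume(int quantity) throws InterruptedException {
        while (available < quantity) {    // re-check the condition every time we wake up
            wait();                       // releases the lock while waiting
        }
        available -= quantity;
    }
}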

 

Thread Scheduling

Note that round-robin scheduling might appear to be fair, but it is not in the following case (on a single-CPU machine):

  • Intermediate results from a calculation are not required – all that is required is the final result. For instance, if 5 threads are started, each doing a calculation that takes 5 seconds, then no matter what kind of time slicing is done, the average time for producing the final results is equal to or more than the average time in a non-timesliced version.

On the other hand, if intermediate results are required, then round-robin (timesliced) scheduling is the fairest. If the timeslice is 1 second, then the average time for each thread to produce its first answer would be (1+2+3+4+5)/5 = 3 seconds, whereas for a non-timesliced version the average would be (1+6+11+16+21)/5 = 11 seconds.

For a multiple CPU machine, it gets complicated.

If intermediate results are not required (on a 4 CPU machine), we have:

  • Average for a non-timesliced version = (5+5+5+5+10)/5 = 6 secs.
  • Average for a timesliced version = (5+5+5+7+8)/5 = 6 secs (for a timeslice of 2 secs)

If intermediate results are required:

  • Average for a non-timesliced version would be (1+1+1+1+6)/5 = 2 secs
  • Average for a timesliced version would be (1+1+1+1+2)/5 = 1.2 secs (for a timeslice of 1 sec)

Note that when intermediate results are required, the timesliced version of scheduling is the fairest.

Dependency Inversion Principle

DIP:

Depend upon abstractions. Code to interfaces, not to concrete implementations.

This is similar to the OO principle: "program to an interface and not to an implementation". However, the DIP makes an even stronger statement about abstraction. It means any component (high or low level) within an application should depend upon abstractions. A high level component is one whose behavior is defined in terms of other lower level components.

The following guidelines help in avoiding OO designs that violate the DIP:

  1. No variable should hold a reference to a concrete class. Do not use new directly; use factories.
  2. Do not derive from concrete classes; derive from abstract classes or interfaces. Deriving from a concrete class creates a dependency on that concrete class.
  3. Do not override an implemented (concrete) method of a base class. Such concrete methods are meant to be shared by all the derived classes.

Inversion of Control:

This generic term is often used interchangeably with Dependency Injection, which is a specific form of it.

The assembler injects the dependency and reduces tight coupling between classes.
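
A minimal sketch of depending on an abstraction, with the dependency injected through the constructor (all names are illustrative):

interface MessageSender {                        // the abstraction both components depend on
    void send(String message);
}

class EmailSender implements MessageSender {     // low-level detail
    public void send(String message) {
        System.out.println("email: " + message);
    }
}

class Notifier {                                 // high-level component
    private final MessageSender sender;

    Notifier(MessageSender sender) {             // injected by the assembler; no 'new EmailSender()' in here
        this.sender = sender;
    }

    void notifyUser(String message) {
        sender.send(message);
    }
}

// Assembler / wiring code: new Notifier(new EmailSender()).notifyUser("hello");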

Java Collections and Insertion Sort

Insertion Sort:

It is used for smaller arrays. In the Java Collections sort, insertion sort is applied to arrays whose size falls below a small cutoff (7 in the legacy Arrays.sort). This cutoff is based on empirical studies; the best choice of cutoff is actually machine-dependent.

Method implementation:

public void insertionSort(int x[])
{
   int off = 0, len = x.length;
   // Take each element in turn and sink it left until the prefix x[off..i] is sorted.
   for (int i = off; i < len; i++)
      for (int j = i; j > off && x[j-1] > x[j]; j--)
          swap(x, j, j-1);
}

public void swap(int x[], int a, int b)
{
    int t = x[a]; x[a] = x[b]; x[b] = t;
}

Analysis:

The outer for loop runs n times (where n is the length of the array).

The inner for loop runs 0, 1, 2, 3, …, (n-1) times in the worst-case scenario, which is an array in descending order.

Therefore the worst case is 0 + 1 + … + (n-1) = n(n-1)/2 comparisons (and swaps). For example, for n = 6 in descending order the inner loop does 0+1+2+3+4+5 = 15 = 6*5/2 steps.

At every inner loop step a decision is made whether to swap. If we assume that at each step there is a 50% chance of swapping, then the average case is approximately (n(n-1)/2) / 2 = n(n-1)/4.

The best case behavior would be visible when the array is already sorted => insertion sort would be linear in n.

Bubble Sort:

Unlike insertion sort, wherein the smaller elements flow towards the beginning, here the larger elements flow towards the end. Although the name is catchy, it is not used in practice, since other sorting algorithms provide better performance.
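
For comparison, a sketch of bubble sort in the same style as the insertion sort above (reusing the swap helper defined earlier):

public void bubbleSort(int x[])
{
    // After pass i, the largest remaining element has "bubbled" to position i.
    for (int i = x.length - 1; i > 0; i--)
        for (int j = 0; j < i; j++)
            if (x[j] > x[j+1])
                swap(x, j, j+1);
}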

Visitor Family of Design Patterns

Driver: the visitor family enables separation of concerns and allows functionality (basically one or more methods) to be added to a hierarchy of classes without modifying anything in the hierarchy.

For instance: in the visitor pattern, the visitable classes would, in the simplest case, define just a single method (accept), and within that a polymorphic call would be made to the passed-in Visitor instance’s visit() method.

OCP (the Open-Closed Principle): the visitor family allows new behavior to be added to existing classes (in the case of the Visitor pattern, these classes are termed Visitable) without modifying the existing codebase.

The main patterns of interest in this family are:

  • Visitor
  • Decorator

The Visitor Pattern:

There is a double dispatch happening. The first dispatch invokes the correct accept() method, resolved by the runtime type of the Visitable element (there is one accept() per element class).

The second dispatch invokes the correct visit() method on the Visitor instance, resolved by the visitor's runtime type.

One of the benefits of the visitor pattern is that it is fast, as it involves only two polymorphic dispatches (no instanceof checks or casts are needed).
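
A minimal sketch of the double dispatch, using a hypothetical Shape hierarchy:

interface ShapeVisitor {
    void visit(Circle circle);
    void visit(Square square);
}

interface Visitable {
    void accept(ShapeVisitor visitor);      // first dispatch: resolved by the element's runtime type
}

class Circle implements Visitable {
    public void accept(ShapeVisitor visitor) {
        visitor.visit(this);                // second dispatch: resolved by the visitor's runtime type;
    }                                       // the visit(Circle) overload is chosen because 'this' is a Circle
}

class Square implements Visitable {
    public void accept(ShapeVisitor visitor) {
        visitor.visit(this);                // picks the visit(Square) overload
    }
}

class NamePrinter implements ShapeVisitor { // new behavior added without touching Circle or Square
    public void visit(Circle circle) { System.out.println("circle"); }
    public void visit(Square square) { System.out.println("square"); }
}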

The Decorator Pattern:

This one is different from Visitor, although they are members of the same family of patterns. It differs primarily because it does not require the interface of the decorated class to be extended with an accept() method, and there is no double dispatch.

This pattern is also called the toolkit pattern, since new tools (new behavior) can be acquired at run time; the application can therefore be said to be using new tools.

The SRP (Single Responsibility Principle) underlines the separation of concerns in a good design. In a decorator context this implies that a class (such as one representing a CAR) should not burden itself with a characteristic (such as alloy wheels) that is not intrinsic to its behavior. In the CAR example, the decorator implements the CAR interface and delegates all calls to the wrapped CAR instance, but "decorates" getPrice() by delegating to the CAR instance for the base price and then adding the cost of the alloy wheels.
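
A sketch of the CAR / alloy-wheels example above (the interface and prices are illustrative):

interface Car {
    double getPrice();
}

class BasicCar implements Car {
    public double getPrice() { return 20000.0; }     // hypothetical base price
}

class AlloyWheels implements Car {                   // decorator: implements the same interface
    private final Car decorated;

    AlloyWheels(Car decorated) { this.decorated = decorated; }

    public double getPrice() {
        return decorated.getPrice() + 1500.0;        // delegate for the base price, then decorate it
    }
}

// Usage: Car car = new AlloyWheels(new BasicCar()); car.getPrice() yields 21500.0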

 

Generics

I have been studying Generics in Java and have evaluated the following:

  • Thinking in Java has a comprehensive chapter on Generics, although towards the end I found some inaccuracies.
  • The Generics tutorial linked from the Java docs is thorough, but for someone encountering Generics (akin to templates in C++) for the first time, I would recommend starting with "Thinking in Java".
  • Any doubts that you might have can be quickly cast aside by checking out the immensely useful site: http://angelikalanger.com/GenericsFAQ/FAQSections/ParameterizedTypes.html

Hopefully some of you might find this information useful.