## Jiye to jiye

How can I go on living without you?

My heart finds no rest anywhere without you.

How can I say what this life would be without you?

Like some punishment, like some curse.

## Why did I let you go?

## How to extract and use constraints from source code?

`public class MethodVisitor extends ASTVisitor { List methods = new ArrayList(); @Override public boolean visit(MethodDeclaration node) { ...your code here... } }`

Consider a predicate **IdentifierLength(id, l)**, where *id* is the identifier (consider an identifier as a fancy term for the name of a variable) and *l* is the number of characters in the identifier. With such predicates and the standard logical operators, it is trivial to define FOL constraints.

`(assert (> IdentifierLength 10))`

Here **IdentifierLength** must be greater than 10. Let another formula be

`(assert (< IdentifierCaseChanges 2))`

Here **IdentifierCaseChanges** should be less than 2. Of course, **IdentifierLength** and **IdentifierCaseChanges** are what we define as functions with information extracted from source code. I talk about source code here. You may apply the same idea over text as well. Once you have the predicates, just apply

`(check-sat)`
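To make the two assertions concrete, here is a minimal Java sketch that evaluates them for individual identifiers. It is not an SMT solver and does not replace `(check-sat)`; it only shows what the two functions might compute once the identifiers have been extracted. The class and method names are hypothetical, and the definition of a "case change" (a letter whose case differs from the letter just before it) is my assumption.

```java
class IdentifierConstraints {

    // IdentifierLength: the number of characters in the identifier.
    static int identifierLength(String id) {
        return id.length();
    }

    // IdentifierCaseChanges (assumed definition): the number of adjacent
    // letter pairs in which one letter is upper case and the other is not.
    static int identifierCaseChanges(String id) {
        int changes = 0;
        for (int i = 1; i < id.length(); i++) {
            char prev = id.charAt(i - 1);
            char curr = id.charAt(i);
            if (Character.isLetter(prev) && Character.isLetter(curr)
                    && Character.isUpperCase(prev) != Character.isUpperCase(curr)) {
                changes++;
            }
        }
        return changes;
    }

    // Conjunction of the two constraints: length > 10 and case changes < 2.
    static boolean satisfies(String id) {
        return identifierLength(id) > 10 && identifierCaseChanges(id) < 2;
    }

    public static void main(String[] args) {
        String[] ids = {"totalweightsum", "i", "getLineNumber"};
        for (String id : ids) {
            System.out.println(id + " -> " + satisfies(id));
        }
    }
}
```

In the `MethodVisitor` above, the identifier strings would come from the visited AST nodes; here they are hard-coded to keep the sketch self-contained.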

## Keeping pace with programming languages!

Keep a watch on programming languages, folks! They are evolving very fast.

In Java, the following code:

`Collections.sort(student, new Comparator<Student>() { public int compare(Student laurel, Student hardy) { return laurel.getWeight().compareTo(hardy.getWeight()); } });`

can now, since Java 8, be written in just one line (with a static import of `java.util.Comparator.comparing`):

`student.sort(comparing(Student::getWeight));`

Wonderful!
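For anyone who wants to try both forms side by side, here is a small self-contained sketch. The `Student` class, its fields, and the sample weights are made up for illustration; only `Collections.sort`, `List.sort`, and `Comparator.comparing` come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

import static java.util.Comparator.comparing;

class SortByWeight {

    // Hypothetical Student type. getWeight() returns a boxed Integer so that
    // compareTo can be called on it, as in the anonymous-class version.
    static final class Student {
        private final String name;
        private final Integer weight;

        Student(String name, Integer weight) {
            this.name = name;
            this.weight = weight;
        }

        String getName() { return name; }
        Integer getWeight() { return weight; }
    }

    // Older style: an anonymous Comparator passed to Collections.sort.
    static List<Student> sortOldStyle(List<Student> input) {
        List<Student> students = new ArrayList<>(input);
        Collections.sort(students, new Comparator<Student>() {
            @Override
            public int compare(Student laurel, Student hardy) {
                return laurel.getWeight().compareTo(hardy.getWeight());
            }
        });
        return students;
    }

    // Java 8 style: List.sort with a key-extracting comparator.
    static List<Student> sortNewStyle(List<Student> input) {
        List<Student> students = new ArrayList<>(input);
        students.sort(comparing(Student::getWeight));
        return students;
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Hardy", 127), new Student("Laurel", 70));
        for (Student s : sortNewStyle(students)) {
            System.out.println(s.getName() + " " + s.getWeight());
        }
    }
}
```

Both methods sort a copy of the list in ascending order of weight, so they should return the students in the same order.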

## Counting in Probability

Question: In the card game bridge, the 52 cards are dealt out equally to 4 players, called East, West, North and South. If North and South have a total of 8 spades among them, what is the probability that East has 3 of the remaining 5 spades?

Discussion: East gets 3 spades out of the remaining 5 in $\binom{5}{3}$ ways. The sample space should have all possible ways East can get spades. All possible outcomes are $\binom{5}{0} + \binom{5}{1} + \cdots + \binom{5}{5}$, i.e., East gets no spade, East gets exactly one spade, and so on until East gets all 5 remaining spades.

Therefore the desired probability is $\binom{5}{3} \big/ \sum_{k=0}^{5} \binom{5}{k} = 10/32$. But this does not agree with the answer Sheldon Ross gets! So, where is the problem?

The problem is in counting. The number of ways three spades can be selected from five is, in this case, not exactly $\binom{5}{3}$. It is in fact $\binom{5}{3}\binom{21}{10}$, because there are many ways to choose the rest of East's cards. I cannot ignore this factor since it is different proportionately when compared with the elements in the denominator. In the denominator, the number of ways East can get no spades is not $\binom{5}{0}$. Instead, it is $\binom{21}{13}$, because now he can choose 13 cards from the remaining 21 non-spade cards. Similarly, when he has one spade, he can choose the rest of his cards in $\binom{21}{12}$ ways. Of course, East has five choices to pick that one particular spade. Therefore, East has $\binom{5}{1}\binom{21}{12}$ ways to have one spade!

So, this works:

$$\frac{\binom{5}{3}\binom{21}{10}}{\sum_{k=0}^{5}\binom{5}{k}\binom{21}{13-k}} = \frac{\binom{5}{3}\binom{21}{10}}{\binom{26}{13}} \approx 0.339.$$
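The corrected count is easy to check by machine. Below is a minimal Java sketch, assuming East's 13 cards are drawn from the 26 cards that North and South do not hold (5 spades and 21 non-spades). The class and method names are mine. It counts East's hands with exactly k spades as C(5, k) · C(21, 13 − k) and divides the k = 3 count by the total over all k.

```java
import java.math.BigInteger;

class BridgeSpades {

    // Binomial coefficient C(n, k), computed exactly with BigInteger.
    static BigInteger choose(int n, int k) {
        if (k < 0 || k > n) {
            return BigInteger.ZERO;
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i, result equals C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    // Number of 13-card hands for East with exactly `spades` of the 5 remaining
    // spades; the other 13 - spades cards come from the 21 non-spades.
    static BigInteger handsWithSpades(int spades) {
        return choose(5, spades).multiply(choose(21, 13 - spades));
    }

    // Size of the sample space: hands with 0, 1, ..., 5 spades.
    static BigInteger allHands() {
        BigInteger total = BigInteger.ZERO;
        for (int k = 0; k <= 5; k++) {
            total = total.add(handsWithSpades(k));
        }
        return total;
    }

    static double probabilityOfThreeSpades() {
        return handsWithSpades(3).doubleValue() / allHands().doubleValue();
    }

    public static void main(String[] args) {
        System.out.println("favourable = " + handsWithSpades(3));
        System.out.println("total      = " + allHands());
        System.out.println("P(3)       = " + probabilityOfThreeSpades());
    }
}
```

The total over all k should equal C(26, 13), which is simply the number of ways to pick East's 13 cards from the 26 available, so the sum in the denominator can be replaced by a single binomial coefficient.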

Two morals from this story: 1) In most introductory probability questions, the mistake we make is counting incorrectly. 2) There are often multiple ways of solving the same problem. Try the problem yourself in your own way and double check with the answer to ensure your approach is correct. Do not read a probability book like a novel.

## How to parse code-like elements from free-form text?

Partial programs, such as incomplete code snippets that do not compile, appear in discussion forums, emails, and other informal communication media. A wealth of information is available in such places, and we want to parse such partial programs out of informal documentation. Lightweight regular expressions can be used, based on our knowledge of the naming conventions of API elements or other programming constructs. Miler is a technique based on the regex idea, but Miler’s precision is only 33% and varies with the programming language.

Another tool used for parsing parts of source code is the island parser. The idea is to treat certain parts of the code as islands and parse them out, ignoring the text and the rest of the content (the water). To parse a snippet, you do not need to know the whole grammar. Unimportant parts can be defined in very relaxed terms, such as just a collection of characters. Parsers based on such grammars are known as island parsers. The ACE tool uses heuristics-based island parsers, implemented as a set of ordered regular expressions. But instead of depending on a collection of source code elements, as normal regex-based parsers do, ACE takes large collections of documents as input. In ACE, the parts of the language that specify control flow (such as if, for, while) are ignored. ACE uses the island parser to capture code-like elements such as fully qualified API names. In Java, API names are of the form SomeType.someMethod(), for example SAXParseException.getLineNumber(). Knowledge of such heuristics can help identify code-like elements in text.
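The SomeType.someMethod() heuristic can be sketched with a single regular expression. This is an illustration of the general idea only, not ACE's actual implementation; the class name, the pattern, and the sample sentence are all my own assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CodeLikeTermExtractor {

    // The "island": an upper-camel-case type name, a dot, a lower-camel-case
    // method name, and an empty pair of parentheses. Everything else in the
    // text is "water" and is skipped.
    private static final Pattern TYPE_DOT_METHOD =
            Pattern.compile("\\b([A-Z][A-Za-z0-9]*)\\.([a-z][A-Za-z0-9]*)\\(\\)");

    // Returns every substring of the text that looks like SomeType.someMethod().
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = TYPE_DOT_METHOD.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String post = "The parser failed, so I call SAXParseException.getLineNumber() "
                + "to find out where. The flag variable was already set.";
        System.out.println(extract(post));
    }
}
```

Note that the word "flag" in the sample sentence is not picked up, because nothing in its presentation sets it apart from ordinary prose.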

Once the items are extracted, ACE attempts to map them to language elements such as packages, classes, methods, types and variables. It uses a specification document to match parsed items against known items. If a match cannot be found, the parsed item is dropped.
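The matching step can be pictured as a lookup against an index of known elements. The sketch below is a simplification under my own assumptions: the index is a plain map from element name to element kind, whereas the real specification document and ACE's matching logic are certainly richer. All names here are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ElementResolver {

    // Maps each parsed item to its kind ("class", "method", ...) using an
    // index of known elements. Items without a match are dropped.
    static Map<String, String> resolve(List<String> parsedItems,
                                       Map<String, String> knownElements) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (String item : parsedItems) {
            String kind = knownElements.get(item);
            if (kind != null) {
                resolved.put(item, kind);
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, String> known = new LinkedHashMap<>();
        known.put("SAXParseException", "class");
        known.put("SAXParseException.getLineNumber()", "method");

        List<String> parsed = java.util.Arrays.asList(
                "SAXParseException.getLineNumber()", "Foo.bar()");
        System.out.println(resolve(parsed, known));
    }
}
```

Here the made-up `Foo.bar()` has no entry in the index, so it does not appear in the result.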

Island parsers as implemented in ACE can only find code-like elements whose presentation is markedly different from normal text. For instance, there is no way to tell a variable named “flag” apart from the ordinary word “flag” in free-form text. The ACE website, as of today, claims that it works only on the Postgres form of Stack Overflow. While the idea should apply to any free-form text, if you wish to play around with this state of the art, you must be ready to get your hands dirty setting up their source code.

Hope the programming language design community takes note of this problem and makes it easier to write high quality island parsers.