Smart Tech: 2013

Friday, August 9, 2013

Artificial Intelligence to Software Testing

Machine Learning (ML), as a sub domain of AI, is widely used in the various stages of the software development life-cycle, especially for automating software testing processes. Machine Learning algorithms have proven to be of great practical value in a variety of application domains. They are particularly useful for (a) poorly understood problem domains where little knowledge exists for humans to develop effective algorithms; (b) domains where there are large databases containing valuable implicit regularities to be discovered; or (c) domains where programs must adapt to changing conditions. Not surprisingly, the field of software engineering turns out to be a fertile ground wherein many software development tasks could be formulated as learning problems and approached in terms of learning algorithms.

Machine Learning techniques vary a great deal in terms of their underlying theory, assumptions, and model representations. Techniques also differ in terms of the research community they originate from, ranging from traditional statistics to heuristic learning algorithms and neural networks. The question is whether appropriate data can be collected and used by machine learning algorithms in order to help decision making in software engineering.

Various types of data can be collected while testing software. Execution traces and coverage information for test cases can be captured at different levels. Failure data that captures where and why a failure occurs is also of interest..

Questions that arise when working on ML implementation in software testing:

What type of machine learning methods can be effective in different aspects of software testing?
What are the strengths and weaknesses of each learning method in the context of testing?
How can we determine the proper type of learning method for the stages of a testing process? Where are the critical points in a software testing process in which ML can positively contribute?

In the Machine Learning area, various types of learning methods have been introduced - Decision Trees (DT), Artificial Neural Networks (ANN), Genetic Algorithms (GA) and more, and can be a combination of several other learning methods. The two aspects of machine learning are: Training data and learning technique. Learning technique is classified as supervised and unsupervised learning.

Applying this to software testing, training data can be collected in different stages of the testing process or development life-cycle. The learning elements could be software metrics, software specification, test case, execution data, failure reports and/or coverage data.

Let us identify the different phases of testing and the tasks that can be implemented using ML!

In the test planning, cost estimation can help test managers predict testing process costs and time, and can provide a good testing plan to manage the testing process efficiently. Test case management includes several tasks such as test case design, which intends to generate high quality test cases; regression test suite to reuse the available test cases software system to the existing test cases in order to reuse the available test cases; and test case evaluation which intends to measure the quality of the generated test cases.

In the test execution sub-dimension, fault localization can help find the exact location of the program that is defected. In addition, bug prioritization intends to prioritize the revealed faults based on their severities

Consider one of the phase test case management here: To answer which learning method fits into testing we need to prepare the data, choose an algorithm, fit a model, examine the fitting results and then improve the model or choose another algorithm. This gives a best performance one.

The methodology that researches worked on for this phase of testing that uses ML algorithms is MELBA. MELBA (MachinE Learning based refinement of BlAck-box test specification) methodology is an iterative process and it consists of five main activities.

In Activity 1, test cases are transformed into the abstract level using the Category-Partition (CP) strategy, so that they would be ready to be used by the Machine Learning method in Activity 2. The CP method helps to model the input domain. This technique requires as input, both a test suite and a test specification in the form of Category-Partition (CP) categories and choices. The CP method seeks to generate test cases that cover the various execution conditions of a function. To apply the CP method, one first identifies the input and environment parameters of each function, the characteristics (categories) of each parameter and the choices of each category. Categories are properties of parameters that can have an influence on the behavior of the software under test (e.g., size of an array in a sorting algorithm). Choices (e.g., whether an array is empty) are the potential values of a category. Test frames and test data are generated according to the categories and choices defined. As the input domain is modeled using CP categories and choices, the test suite is then transformed into an abstract test suite (Activity 1) in order to enable effective learning. An abstract test case shows an output equivalence class and pairs category and choice, which characterize its inputs and environment parameters instead of raw inputs. That means we are using test specification to move test cases to a higher level of abstraction. Once abstract test suite is available, the machine algorithm to be used is decision trees for classifying the abstracted test cases. This comes under Activity 2. In Activity 3, using a number of heuristics determine potential problems that may indicate redundancy among test cases and the need for additional test cases. In Activity 4, learnt rules may indicate the need to update CP. In Activity 5, iteration continues till no problems are identified in the trees.

Let us look at another task of testing where ML can be applied in.

Regression testing: Regression testing is to check whether modified software fails test cases where its earlier versions would have passed.

These test cases constitute a database that can be taken into account over a software life cycle. Regression testing generally accounts for two thirds of the total verification cost. Hence, a careful selection of test suite is crucial for an efficient regression testing.

If we can identify the parts of the SUT affected by changes to a program, then we can focus on testing those parts and it may save testing costs by avoiding irrelevant test cases. To address this, program slicing technique can be combined with ML. A program slicing problem could be anything of this kind: Let us assume a function x and the corresponding test case that has x need to be considered and the rest is irrelevant.

Machine Learning techniques can be used to predict which test cases may reveal bugs in the modification. Assuming the availability of the training data with the test input, and the outputs as V, we also need the dependency of the test cases for a certain function and assume that is available from the previous database. This gives us a training set which can be given to an artificial neural network.

With further work on the different phases of software testing and with training data, we can use ML algorithms in testing. In addition, there are several AI techniques which can be used to manage several different tasks.

Saturday, April 27, 2013

Nash Equilibrium

Nash equilibrium is a fundamental concept in the theory of games and the most widely used method of predicting the outcome of a strategic interaction in the social sciences.

Any game (in strategic or normal form) consists of the
Following key ingredients:
1. a set of players
2. a set of actions (or pure-strategies) available to each player,
3. and a payoff(also said as utility function) for each player.

A normal form is generally represented in matrix form

The payoff functions represent each player’s preferences over action profiles,where an action profile is simply a list of actions, one for each player. A pure-strategy Nash equilibrium is an action profile with the property that no single player can obtain a higher payoff by deviating unilaterally from this profile.

For example :

Consider first a game involving two players,each of whom has two available actions, which we call A and B. If the players choose different actions, they each get a payoff of 0. If they both choose A, they each get 2,and if they both choose B, they each get 1. This “coordination” game may be represented as follows,(Why this is called a coordination is also explained below) :

If player 1 chooses a row, player 2 chooses a column, and the resulting payoffs are listed in parentheses, with the first component corresponding to player 1’s payoff:

Player1/Player 2	A	B
A	(2,2)	(0,0)
B	(0,0)	(1,1)

Figure1: Coordination Game

The action profile (B,B) is an equilibrium, since a unilateral deviation to A by any one player would result in a lower payoff for the deviating player. Similarly, the action profile (A,A) is also an equilibrium. In simple, it is if Player 1 chooses A that gives him payoff ‘2’ then Player 2 gets payoff of 2 ,if he go with A. If he deviates from the action of player 1 (here it is A) then they payoff here is 0 for both in this case both won’t want this.

So ,this is a coordination game where each player coordinates with the action of others and gets a payoff.

As another example, consider the game “matching pennies,” which again involves two players, each with two actions. Each player can choose either heads (H) or tails (T); player 1 wins a dollar from player 2 if their choices are the same, and loses a dollar to player 2 if they are not.

Player1/Player 2	H	T
H	(1,-1)	(-1,1)
T	(-1,1)	(1,-1)

Figure2: Matching Pennies Game

This game has no pure-strategy Nash equilibria.In some cases, instead of simply choosing an action, players may be able to choose probability distributions over the set of actions available to them. Such randomizations over the set of actions are referred to as mixed strategies. Any profile of mixed strategies induces a probability distribution over action profiles in the game.

Under certain assumptions, a player’s preferences over all such lotteries can be represented by a function (called a von Neumann-Morgenstern utility function) that assigns a real number to each action profile. One lottery is preferred to another if and only if it results in a higher expected value of this utility function, or expected utility. A mixed strategy Nash-equilibrium is then a mixed strategy profile with the property that no single player can obtain a higher value of expected utility by deviating unilaterally from this profile.

The American mathematician John Nash (1950) showed that every game in which the set of actions available to each player is finite has at least one mixed-strategy equilibrium. In the matching pennies game, there is a mixed-strategy equilibrium in which each player chooses heads with probability 1/2.Similarly, in the coordination game of the above example, there is a third equilibrium in which each player chooses action A with probability 1/3 and B with probability 2/3.

Saturday, March 30, 2013

Images Slide show - Image array in Java script

#Code to make images as slide show by using images in an array



<script language="JavaScript" type="text/javascript">

</script>

MAGIC SQUARE

Ramanujan Magic square:

An easy way of creating birthday magic square:

Eg :

16	3	19	90
91	18	0	19
1	18	92	17
20	89	17	2

All rows:

16+3+19+90 = 128

91+18+0+19 = 128

1+18+92+17 = 128

20+89+17+2 = 128

All Columns:

16+91+1+20 = 128

3+18+18+89 = 128

19+0+92+17 = 128

90+19+17+2 = 128

Middle Numbers: 18+18+92+0 = 128

All corners: 16+90+20+2 = 128

Each block:

16+3+91+18= 128

19+0+90+19 = 128

1+18+20+89 = 128

92+17+17+2 = 128

Sides :

91+1+19+17 = 128

3+19+89+17 = 128

Middle blocks:

91+18+1+18 = 128

0+19+92+17 = 128

Side diagonals:

91+3+92+19 = 128

19+19+1+89 = 128

Diagonals :

16+18+92+2 = 128

20+18+0+90 = 128

Let us take four columns with variables a1,b1,c1,d1

and follwoing the matrices the next rows are a2,b2,c2,d2.....till d4 as the last element.

First we will fill the first now with the DOB of the person.

Now a1 = 16 , b1 = 3 , c1 = 19 ,d1 = 90

Next rows are calculated as follows:

a2 = d1 +1; b2 = c1 - 1; c2 = b1 - 3; d2 = a1 + 3;

a3 = b1 - 2; b3 = a1 + 2; c3 = d1 + 2; d3 = c1 - 2;

a4 = c1 + 1; b4 = d1 - 1; c4 = a1 + 1; d4 = b1 - 1;

Check : Sum of a2 + a3 + a4 = d1 + 1 + b1 -2 + c1 +1 = d1 + b1 + c1 .

Adding this to a1 gives the sum a1 + b1 + c1 + d1 which is sum of our original numbers taken.

Same is applicable for other variables.

Logic : The value of the element position is sum of any other variable plus a constant of type integer.

At any point the sum of any set of four elements as mentioned in the above cases should give a constant sum of zero and no variable should be repeated.Then our square is a magic square.

a1	b1	c1	d1
a2	b2	c2	d2
a3	b3	c3	d3
a4	b4	d4	d4

a1	b1	c1	d1
d1 + 1	c1 – 1	b1 - 3	a1 + 3
b1 – 2	a1 + 2	d1 + 2	c1 – 2
c1 + 1	d1 – 1	a1 + 1	b1 - 1

By the above procedure we can generate 4 * 4 magic square.