COMP316A: Artificial Intelligence Techniques and Applications

- Assignment 4: Working with a joint probability distribution -

This assignment is marked out of 40, with a maximum of 20 marks for the first part and a maximum of 20 marks for the second part.

  1. Write a program that performs probabilistic inference for Boolean variables based on a given full joint distribution. The program must read the joint distribution from a file. The file name must be the first command-line argument of your program. The format of the file must be as follows. The first line in the file contains the number of variables. Each subsequent line specifies one probability in the joint distribution. To define the order of the probabilities we consider each assignment of values to the random variables as a bitstring where 0 represents the value false and 1 the value true. The value of the first variable is represented by the first bit, and so on. There are 2^N bitstrings for N variables. The entry on line i must be the probability of the values corresponding to the binary representation of the integer i-2. For three variables the file could look like this:

    3
    0.15
    0.05
    0.03
    0.27
    0.1
    0.1
    0.1
    0.2

    In this case, 0.15 is the probability that all values are false, 0.05 is the probability that the value of the last variable is true and all other values are false, and 0.2 is the probability that all values are true.

    The second command-line argument of your program must be a string specifying the query to perform. With N variables, the string must contain N characters. The first character must correspond to the first variable, and so on. The characters must be either "Q", "H", "T", or "F". A "Q" is used to indicate the query variable and an "H" is used for each hidden variable. "T" and "F" are used to specify the truth values of the evidence variables. Assume the above joint distribution is stored in a file called "jd.prob". Running your program with the command-line

    java ProbInf jd.prob TQH

    means that the evidence consists of the value true for the first variable, the query is the second variable, and the state of the third variable is hidden. You can assume that there is only one query variable.

    Your program must output the conditional probability for the value true of the query variable given the values of the evidence variables. For the above example it must output 0.6 because the conditional probability is given by (0.1 + 0.2) / (0.1 + 0.1 + 0.1 + 0.2).

     

  2. Write a program that can check for complete independence or conditional independence of two variables based on a given joint distribution. The first command-line argument must be the name of a file containing the full joint distribution in the format specified above. The second argument must be the index of the first variable being tested for independence (starting with index 1 for the first variable in the specification of the joint distribution), and the third argument must be the index of the second variable being tested for independence. All remaining (optional) arguments must be indices of attributes that we want to condition on (in case we are testing for conditional independence). For example, running your program with the command-line

    java CheckIndependence jd.prob 1 3 2

    must check whether the first and the last variable in the above problem are conditionally independent given the second one. The command-line

    java CheckIndependence jd.prob 1 2

    must check whether the first two variables are completely independent.

    If the variables are (conditionally) independent, your program must output "true" to standard-out, and "false" otherwise.

Make sure ALL the code you submit is your own code. Also, make sure your code is well-documented with lots of comments. Assignment submission is online, as usual.

Other Information

Value: 11% of the final grade
Due date: Wednesday, 7 May, midday

No extensions will be granted except for sound, documented, medical reasons.