A common way to estimate password strength is to compute the entropy of each string that is a candidate password (see the Wikipedia article). Passwords need to be complex enough that they are difficult for anyone to guess or to discover by brute force. Entropy provides a measure of strength because it expresses the information content of a string S and our uncertainty about it.
Entropy is defined on x, the probability of appearance of each symbol s in S, so that H(X) = -Σ x log(x); I used the base-2 logarithm. The probability x is defined over the set of symbols that are available to the user for forming the string. This leaves an open question: does the user actually exploit the full range of the character set? If the user forms the password from only a subset of the available character set, the entropy computed on x is not representative.
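As a minimal sketch of the formula above, the following Python function computes H(X) in bits, assuming x is taken as the relative frequency of each symbol within the string S itself. The function name `symbol_entropy` is my own choice for illustration, and the frequency-based reading of x is an assumption.

```python
import math
from collections import Counter


def symbol_entropy(s: str) -> float:
    """Shannon entropy, in bits per symbol, of the symbols in s.

    Assumption: x is the relative frequency of each symbol in s.
    """
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    # -sum(x * log2(x)) written as sum(x * log2(1 / x)), with x = c / n
    return sum((c / n) * math.log2(n / c) for c in counts.values())


if __name__ == "__main__":
    for candidate in ("aaaa", "aabb", "abcd"):
        print(candidate, symbol_entropy(candidate))
```

Under this reading, a string of one repeated symbol such as "aaaa" scores 0.0, while four distinct symbols such as "abcd" score 2.0 bits per symbol.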
Especially on handheld devices, the user has to change the keyboard configuration in order to access different parts of the character set. The user has to choose whether the keyboard will input upper or lower case characters, numbers or special symbols, as well as the language of the characters. Hence the estimate of password strength has to take this into account.
To address this, I thought of using entropy again. For any symbol s, I define y as the probability that s is an upper case letter, a lower case letter, a number or a special symbol in S, and H(Y) = -Σ y log(y). I treat the variables X and Y as independent, on the grounds that the probability of appearance of a symbol in S is not related to whether that symbol is upper case, lower case, a number or a special symbol. To simplify things, let us suppose that all the characters belong to the same locale character set. For independent variables, the joint entropy is then H(X,Y) = H(X) + H(Y).
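The combined measure can be sketched in Python as follows. This is only an illustration under stated assumptions: both x and y are taken as relative frequencies within S, the four classes are detected with Python's built-in `str` predicates, and the two entropies are simply added, which relies on the independence assumption made above. The names `char_class`, `entropy` and `combined_entropy` are hypothetical.

```python
import math
from collections import Counter


def char_class(ch: str) -> str:
    """Map a character to one of four classes (assumed classification)."""
    if ch.isupper():
        return "upper"
    if ch.islower():
        return "lower"
    if ch.isdigit():
        return "digit"
    return "special"


def entropy(items) -> float:
    """Shannon entropy in bits of the empirical distribution of items."""
    counts = Counter(items)
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def combined_entropy(s: str) -> float:
    """H(X) + H(Y), treating X and Y as independent (an assumption)."""
    h_x = entropy(s)                            # symbols
    h_y = entropy(char_class(ch) for ch in s)   # character classes
    return h_x + h_y


if __name__ == "__main__":
    for candidate in ("abcd", "aB1!"):
        print(candidate, combined_entropy(candidate))
```

With this sketch, "abcd" scores 2.0 because all four symbols fall in one class and H(Y) is 0, whereas "aB1!" scores 4.0 because its four symbols are spread over four classes.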