
Friday 26 June 2015

System security as a computational problem

In this post I talk about system security, examining it as a computational problem. The motivation for thinking about this came from a very interesting post by Dick Lipton and Ken Regan published on their blog a few days ago.
The problem of developing a secure system is to build a system that maintains a set of information F which is accessible only by authorised users; any unauthorised user must be excluded from accessing F. A good way to study such a system is to model it with a Turing Machine, relying on the Church-Turing thesis.
Any user has to provide some input, like a username and password, in order to get clearance, and usually F is represented by some text or binary information. So let's define F as a set of strings and w as a string which is the user input. Also let A be a Turing Machine that simulates the system we have developed and consider secure.
Some systems demand extensive interaction in order to identify authorised users, with users having to fill in questionnaires and forms. Without loss of generality, we may enclose all the user input in w, and its acceptance or rejection by the system is modeled by the operation of A. On this assumption, A will accept a valid w and output F, or reject input from unauthorised users and output "NO".
Let S be a set of strings such that if A decides that input w belongs to S, then the user is considered authorised and F is output. There are two cases where this model does not work as expected.
Case 1. There is a string e that does not belong to S, yet on input e, A outputs F. So an intruder may use e in order to gain access to F without knowing w.
Case 2. String w may be discovered algorithmically by unauthorised users.
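Before looking at these cases, here is a minimal sketch of the model in Python, with A modeled as an ordinary function; all the names and credentials are mine and purely illustrative.

    # Illustrative model: system A outputs F for inputs in the authorised
    # set S and "NO" for everything else.
    F = ["record 1", "record 2"]                        # the protected information
    S = {"alice:correct-horse", "bob:battery-staple"}   # inputs A should accept

    def A(w):
        # Intended behavior: output F only if w belongs to S.
        return F if w in S else "NO"

    print(A("alice:correct-horse"))  # authorised: outputs F
    print(A("eve:guess"))            # unauthorised: outputs "NO"
    # Case 1: some e outside S nevertheless makes A output F.
    # Case 2: a member of S can be discovered algorithmically.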

Some explanation of these cases.
Case 1 models the real-world situations where A is hacked. The intruder does not input the identification of some authorised user in order to access F, so the input does not belong to S. Instead, the intruder inputs a string that causes A to behave in an undesired way, resulting in the output of F. In the real world, intruders sometimes exploit security holes and send data to systems (like macros, telnet commands or Trojans) that cause such undesired behavior.

Case 2 refers to the complexity of w and how easily it can be guessed.

I intentionally leave out the cases where w has been stolen from its owner with techniques like phishing; this seems to be more of a social than a computational problem.

The computability of security and why hacking is easier than defending


The following language describes Case 1 as a computational problem.
SECURITY1 = {<A, S, F, e> | e is a string that does not belong to set S, and TM A outputs F and halts on input e}

Theorem. Problem SECURITY1 is recognizable and undecidable.

Proof. SECURITY1 is recognizable: we input to A a string e that does not belong to S, and if A halts and outputs F we accept. If SECURITY1 were also decidable, then for any input s that does not belong to S we could decide whether machine A halts and outputs F or "NO". Since A can be any TM and s any string, this would hold only if the halting problem were decidable, which is a contradiction.
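The recognizable side can be pictured in Python, with machines modeled as plain functions; this is an illustration of the argument, not a proof. If A loops on e, the call below never returns, which is exactly the gap between recognizable and decidable.

    def recognizes_security1(A, S, F, e):
        # Semi-decision procedure for SECURITY1: run A on e and accept if
        # it halts with output F. If A never halts on e, neither does this
        # procedure -- SECURITY1 is recognizable but not decidable.
        if e in S:
            return False  # e must lie outside the authorised set S
        return A(e) == F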

Both the developer of A and an intruder face the same problem, as each has to solve one instance of SECURITY1 in order to build or hack A. Yet the job of the intruder is easier than the job of the developer: the intruder has to deal with the recognizable side of the problem, while the defender has to approach the undecidable side of it.
In other words, if you try to break into A, you build a process described by a string e, and once you input it to A you find out whether it works. If you are a defender, you can never be sure that the system you build will be strong enough to resist attacks. Building A is a far more complex task than discovering e, and that is why systems developed by experienced teams are sometimes hacked by a technology-enthusiast teenager.

 

About Case 2


Case 2 refers to the complexity of w, or rather to the complexity of the procedure of entering user identification data. There is no point in examining the computability of such procedures, as most of them are in theory computable. For instance, when a user has to enter a 4-digit PIN, the number of possible PINs is bounded and the search is in theory computable. If we add the restriction that the user has only three attempts to enter the correct PIN, then it becomes difficult for an intruder to guess it.
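Here is the bounded search as a short Python sketch; the PIN is an illustrative value. Without an attempt limit, the space of 10,000 PINs is trivially enumerable; with a three-attempt limit, a blind guesser succeeds with probability only 3/10,000.

    secret = "4821"  # illustrative PIN

    def brute_force_pin():
        # Exhaustive search over all 10**4 possible 4-digit PINs.
        for n in range(10**4):
            guess = f"{n:04d}"
            if guess == secret:
                return guess  # always succeeds: the search space is bounded

    print(brute_force_pin())  # found within at most 10,000 attempts
    print(3 / 10**4)          # success probability under a 3-attempt limit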
The approach proposed by Lipton and Regan actually increases the complexity of the identification procedure. It becomes harder for an intruder to guess w, which may be subject to frequent changes if the questions asked by the system change frequently.



The image in this post is freely available under a Creative Commons license. Check the source for the original image and many more.

Friday 3 April 2015

On the computability of causal inference

Causation (also called causality) refers to the cause-effect relation that may exist between two events or phenomena. The causal inference problem concerns the determination of causation between events.
Let's consider the case where there is an interaction between two events which is not affected by any other factor or event. The two events are observable and measurable, and for each one we may record a set of data that describes it. Then the causal inference problem is equivalent to deciding the existence of a computable relationship between the two data sets that describe the events.
As I show next, the general case of the problem of causal inference, even with the above assumptions, is undecidable, and this has important implications in science. The undecidability of a problem means that some instances of it may not be proven true or false, while other instances may be decidable.

Let’s formulate the general case of the causal inference problem as a decision problem. Let A and B be two sets of data.

CAUSAL = {<A, B> | there is a computable function f such that f(A) = B}
This formulation of CAUSAL is enough to describe the causal inference problem: if there is a cause-effect relationship between A and B, then there is a function f such that f(A) = B, and if CAUSAL were proven decidable, then f would be computable in every case.
Let me restrict this discussion to data sets and phenomena whose measurements are as accurate as required. This assumption means that the data of the problem is not affected by factors that would make its computation inaccurate, like the uncertainty principle; it also implies that the elements of both sets are computable. The earlier assumption that the two events interact and are not affected by any other factor simplifies the definition of the problem, as it implies that if there is a computable function f between A and B, it describes a cause-effect relationship and is not coincidental.
In the general case the problem is not trivial, as A and B may be the products of complex or even chaotic systems and natural processes. Here the term "trivial" is used in the sense of mathematical logic.
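As a toy illustration of this non-triviality (the rule and the numbers are mine), here B is produced from A by a short computable rule, the logistic map, yet nothing in the raw data advertises the rule; the hard, undecidable part is deciding from the data alone whether such an f exists.

    def f(a):
        return 3.9 * a * (1 - a)  # the logistic map: simple but chaotic

    A = [0.2]
    for _ in range(9):
        A.append(f(A[-1]))        # A: a chaotic trajectory
    B = [f(a) for a in A]         # by construction, f(A) = B
    print(B[:3])  # given only the numbers in A and B, recovering f is the hard part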

Theorem. The causal inference problem is logically and computationally undecidable.

Proof of logical undecidability. If the problem is decidable, then we may construct a system in which we may prove every instance of the problem. Using Gödel numbering and arithmetic, we may encode each element a and b of A and B in a formal system of logic P, so that for every sentence S of the form a -> b, either S or -S is provable, and eventually either A -> B or -(A -> B) is provable. The problem is complex, so P must be sufficiently strong.
This leads to a contradiction: as proven by Kurt Gödel (Gödel, 1931), in such a system there are true sentences that are not provable. The problem is logically undecidable.

Proof of computational undecidability. If CAUSAL is decidable, then there is a Turing Machine M which, on input of a string e that contains the descriptions of A and B, always halts and outputs "YES" or "NO".
As the problem is assumed decidable and M is computable, we may construct a Turing Machine N that, on input of a string w containing the description of M and e, verifies that M decides e; N accepts w in every case. There are no restrictions on the nature of A and B or on their encoding and representation in e, hence w can be any string. Also there are no restrictions on the descriptions of M and N; they can be any Turing Machines.
As a result, the language A_TM = {<N, w> | N is a Turing Machine and N accepts w} would be decidable. This is a contradiction, as it is known from the literature that A_TM is undecidable; see (Sipser, 2006, p. 179). Hence CAUSAL is also computationally undecidable.
Another way to reason about the computational undecidability of CAUSAL is to consider a Turing Machine N that simulates the phenomenon described by A. We input to N a string w that describes set A; N then produces a string that describes set B and halts. If CAUSAL were decidable, then N would be computable. The string that describes A may be any string, as there is no restriction on the event that A describes, so N halts on any given input. Also, N may be any Turing Machine.
Still, these assumptions lead to the same contradiction: if CAUSAL were computable, so would be the language A_TM.

Laplace’s demon

The undecidability of causal inference has an effect on the famous idea of the demon of Laplace.

We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.

Pierre Simon Laplace, A Philosophical Essay on Probabilities 1820
source: Wikipedia, also in (Laplace, 1951, p.4)

The intellect described in the text was later named "Laplace's demon". There has been criticism against this idea and some published proofs against it. The proofs presented above also stand against it: such an intellect or demon would have to decide CAUSAL, which is undecidable.

The extent of undecidability

In every undecidable problem there are some instances that are decidable and some that are not. To put it another way, each undecidable problem is undecidable to some extent. Logic and the theory of computation have not been able to determine this extent for every problem. For some problems like the halting problem this extent seems large; only very simple instances of the halting problem seem computable.
In causal inference, things must be different. Science, and especially Artificial Intelligence, has managed to build successful methods for computing quite a lot of instances of the problem. The importance of these methods is that they may be applied to the analysis of real-world phenomena. In my opinion, some of the reasons for successful predictions of real-world phenomena are bounded inputs, relaxation, approximation and strong hypotheses.

Bounded inputs. Real-world phenomena usually don't take arbitrarily large or small values. The possible values of the parameters of a phenomenon are bounded. This reduces the range of the problem, making computations easier.
Relaxation. Relaxing some parameters in the definition of a problem might lead to the computation of an easier problem. In some cases the conclusions from the easier problem are as valuable as the solution of the initial problem.
Approximation. Quite often the solutions of hard problems may be approximated, which is valuable when precision is not very important.
Strong hypotheses. Sometimes scientists have strong evidence and strong intuition about a certain universal truth that cannot be proven, or is at least very difficult to prove. In these cases researchers state it as a hypothesis, accept it as an unproven fact, and work from it, bypassing the obstacle of finding a solid proof. The biggest part of this post is based on such a hypothesis: the Church-Turing thesis (Turing, 1936).

References

Gödel, Kurt, 1931. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I. Translated as On formally undecidable propositions of Principia Mathematica and related systems I in Solomon Feferman, ed., 1986. Kurt Gödel: Collected Works. Vol. I. Oxford University Press, pp. 144-195.

Laplace, Pierre Simon, 1951. A Philosophical Essay on Probabilities, translated into English from the original French 6th ed. by Truscott, F.W. and Emory, F.L., Dover Publications, New York.

Sipser, Michael, 2006. Introduction to the Theory of Computation, Second Edition. Thomson Course Technology, USA.

Turing, Alan M., 1936. On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2nd series, vol. 42, pp. 230-265.

Thursday 19 March 2015

Why we cannot build a general method for algorithmic error correction

This post is part of this paper.

A few years ago, I was working as a GIS programmer and administrator in a company that mainly produced digital maps for navigators. At that time, we had to map the road network in big areas and cities from scratch and also update older maps in other areas. One of the issues you have to deal with in projects of this kind is the errors that occur during the mapping procedures. When 10 or more people map a road network of 20,000 road segments, it is unavoidable that some (few or quite a lot) of the roads will be mapped incorrectly for various reasons. For instance, a bridge may be accidentally mapped as a normal road, or a road may not be characterized as unpaved while it is.
The only way to deal with this situation is to define some specifications for the mapped road network and then build software that detects the cases where your map is not compliant with the specifications. You may find the description of some methods of building such software in my paper (link). Then some experienced person has to look into the detected errors and correct them.
At that time I had a big disagreement with my employer, as he believed that we could build software that would correct our maps without human intervention. He thought that since we know the specifications that the network should have, and we also know the cases where the specifications are violated, we could correct those cases algorithmically. Beyond his opinion, I have read some conference papers whose authors support more or less the same idea, while papers in journals tend to hold that this problem may be solved only partially and with some caution.
As I will show next, such a problem is in general undecidable by computers and may be solved only in simple instances. I always believed that this should be common knowledge.
In fact the algorithmic correction of maps is a special case of a more general problem, the algorithmic correction of software bugs. In both cases you have some specification that may be written in some formal way, say logic. Then you may detect some errors or bugs, and you need to manipulate the parameters and the structure of your program or dataset so as to meet the specification.
Moreover, we may model both problems as graphs. As is known from software engineering, a piece of software may be modeled as a graph: the nodes of the graph are the functions and the other units that constitute the software, and the edges model the ways that the units interact. The nodes and edges may represent really complex software components or simpler components, like databases with spatial information as in road networks. By settling the more general problem, we settle the simpler problem as well. So in order to study this issue you have to answer one question.
Since we can build software that detects errors, can we build software that algorithmically corrects these errors?
An empirical answer to this question would be "in general, no"; it is very difficult to build an algorithmic process that would take into account all the necessary parameters and lead to the desired result by correcting all the instances of errors in a piece of software or in a map. Let's define the problem and prove its undecidability. In the proof, I use the Turing Machine (TM) as a model of a general-purpose computer and define the problem of algorithmic correction of software errors as the equivalent computational problem of algorithmic correction of the input string of a Turing Machine.
Let M be a Turing Machine that accepts input w. If w is built under certain specifications, M prints a message of acceptance and then halts. In case w does not comply with the specifications, M might not halt or might behave unpredictably. Now let N be another Turing Machine that examines any given input w. If w is incorrect according to the specifications and the description of M, then N corrects w so that M can run on w properly and halt.

Formally:

INPUT-CORRECTION_TM = {<M, N, w> | M and N are TMs, and N modifies w according to the description of M so that M halts on input of the modified w}

In order to prove the undecidability of this problem, it is necessary to define another undecidable problem, the HALT problem. Undecidability implies that the problem cannot be decided by a Turing Machine, and that an algorithm solving it for any given input cannot be built.

HALT_TM = {<M, w> | M is a TM and M halts on input w}

Theorem. The INPUT-CORRECTION problem is undecidable.

Proof: Turing Machine N modifies any input w, according to the description of M, so that M can run on w and halt. If w is correct, N prints "YES", and M runs on w, produces an output and halts. If w is incorrect and M would not halt while running on w, N corrects w and prints "YES"; then M runs on the corrected w, produces an output and halts.
Let H be a machine that simulates the operation of both N and M. Then H halts on any given input, and the HALT problem is decidable for H and any given input w. This is a contradiction, since the HALT problem is undecidable.
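The structure of the argument can be made concrete in Python, with machines modeled as ordinary functions; this is an illustration, not a proof. If a corrector N existed for every M, the composition H below would halt on every input, deciding the HALT problem.

    def make_H(M, N):
        # Hypothetical composition of the corrector N with the machine M.
        # N is assumed to always halt and to repair w so that M halts on it.
        def H(w):
            fixed_w = N(w)     # "correct" w according to M's specification
            return M(fixed_w)  # by assumption, M now halts on fixed_w
        return H
    # If such an N existed for every M, then H would halt on every w --
    # and HALT_TM would be decidable, the contradiction used in the proof.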

Another approach


The approach of the proof might look difficult to understand, especially its connection with the correction of errors in spatial networks. Let's look into it from a more practical perspective.
The specifications of every network define the valid spatial relationships of the network elements as well as the valid attributes of every element. Attributes and spatial relationships cannot be examined independently, since the validation of both of them ensures network consistency. Let A be the set of all the possible valid combinations of the valid spatial relationships and the valid attributes. Let B be the set of all the possible combinations of spatial relationships and attributes that are considered invalid according to the specifications.
Let f be an algorithmic function that corrects the errors in a given network. Then f accepts as input one element of B and outputs one element of A. This means that each element of B must correspond to exactly one element of A; f must be a one-to-one (1-1) or many-to-one (n-1) function. But this can happen only in very special and simple cases; it cannot be achieved in general. If an element of B corresponds to many elements of A, then f is not well defined.
In the same way we may define sets A and B for any other software structure. A 1-1 or n-1 function between software with bugs and error-free software may be found only in very simple instances.

   

A more intuitive example that concerns spatial networks.

  

Let N be a non-planar network of linear features. Let a and b be two elements of N where the edge of a crosses the middle of the edge of b. Suppose that the person who drew a and b neglected to apply to them the necessary parameters that denote their structure and state. We may consider the following possible cases.
  • The edge of a is on ground level and the edge of b overpasses a; edge b represents a bridge.
  • The edge of a is on ground level and the edge of b underpasses a; edge b represents a tunnel.
  • Neither edge is on ground level and edge b overpasses edge a; a and b form a multilevel traffic junction.
  • Both edges are on ground level; a and b represent a crossroad. Both edges must split at the intersection point and form a proper crossroad.
There is no way to build an algorithm that decides which of the four cases corresponds to the given instance, so as to correct the error.
If we consider a special case of a network instance in which we may find a 1-1 function between sets A and B, we could perform algorithmic correction of a specific error case. As it is easy to see, this error case will be very simple. For instance, consider a network where each element that participates in a specific computable spatial relationship must be characterized by an attribute with a single valid value. Then an algorithmic process that assigns this value to the attributes of the appropriate elements may be built. Nevertheless, this is of minor importance in the complex task of error correction.
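A sketch of such a simple, correctable case in Python; the network, the attribute names and the rule are all illustrative. The invalid-to-valid correspondence here is many-to-one, which is precisely why an algorithmic fix exists.

    # Toy rule: every segment that crosses the river must carry bridge=True.
    segments = [
        {"id": 1, "crosses_river": True,  "bridge": False},  # violates the rule
        {"id": 2, "crosses_river": False, "bridge": False},  # valid
    ]
    for s in segments:
        if s["crosses_river"] and not s["bridge"]:
            s["bridge"] = True  # the single valid value for this relationship
    print(segments)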
In conclusion, important corrections of detected errors in maps, and in computer software in general, have to be made manually and, of course, by experienced personnel.

http://wmsviewer-rodis.rhcloud.com/

Saturday 31 January 2015

Information entropy as an anthropomorphic concept

After this post I wrote a paper on this subject; find it on arxiv.org

One of the most interesting opinions that I have heard about entropy was formed by E.T. Jaynes and E.P. Wigner (link). According to them, entropy is an anthropomorphic concept in the sense that in one physical system many thermodynamic systems can be found. The physical system can be examined from many points of view, examining different variables and calculating entropy differently each time. In this post, and in the paper on arxiv.org that followed it (link), I discuss how we may apply this concept to information entropy, and how Shannon's definition of entropy fits Jaynes's and Wigner's view. I also present a case in which I have used this combination of ideas in the past, in an Android application.
Information entropy was defined by Claude Shannon (link). It quantifies the amount of information that is transmitted by a channel, or more generally the information content of a message M. Message M consists of a set of n symbols; let P(i) be the probability of appearance of symbol i in M. Then over all n symbols the entropy is H = -ΣP(i)logP(i). Probability P(i) equals the frequency of appearance of i in M.
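Shannon's formula takes only a few lines of Python, with P(i) estimated as the frequency of symbol i in M:

    from collections import Counter
    from math import log2

    def entropy(M):
        # H = -sum over i of P(i) * log2(P(i)), P(i) = frequency of i in M
        counts, n = Counter(M), len(M)
        return -sum((c / n) * log2(c / n) for c in counts.values())

    print(entropy("hello world"))  # information content in bits per symbol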
Let's generalize this idea. Let S be a set of objects whose properties we may describe using v variables for each object. Let each variable take values from a range [a, b] of discrete values. Now we have a lot of information about S, and we can use entropy to analyze it. Exactly as Jaynes describes the examination of physical systems, we may choose any subset v' of v in order to examine the information content of S. The choice of v' depends on how we care to examine S, and this gives our analysis its anthropomorphic character. Using information entropy we can examine the distribution of the properties described by v' over the objects of S.
For simplicity, let us examine S considering two properties of its objects, described by variables X and Y. For variable X and probability P(x) of the appearance of any value x, the entropy is H(X) = -ΣP(x)logP(x). Similarly for Y, H(Y) = -ΣP(y)logP(y). In order to combine these two entropies we have to consider whether the variables are dependent.
As it is known for independent X and Y stands that H(X,Y) = H(X) + H(Y).
For joint probability P(x, y) stands H(X,Y) = -ΣΣ P(x, y)log P(x, y).
The conditional entropy H(X | Y), which applies when X depends on Y, is H(X | Y) = -ΣΣP(x, y)log(P(x, y)/P(y)).
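These quantities are straightforward to estimate for a concrete S. In the sketch below each object carries two properties (x, y), and the entropies are computed from observed frequencies; the objects themselves are illustrative.

    from collections import Counter
    from math import log2

    def H(values):
        counts, n = Counter(values), len(values)
        return -sum((c / n) * log2(c / n) for c in counts.values())

    objects = [("red", "small"), ("red", "large"),
               ("blue", "small"), ("blue", "large")]  # (x, y) property pairs
    X = [x for x, _ in objects]
    Y = [y for _, y in objects]
    print(H(X), H(Y), H(objects))  # here H(X,Y) = H(X) + H(Y): independence
    print(H(objects) - H(Y))       # conditional entropy H(X | Y)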
These considerations about the entropy variables differ from Shannon's classical definitions. Shannon defines the conditional probability P(x, y) as the probability of appearance of symbol x depending on the appearance of symbol y, while the probabilities above concern properties x and y of one discrete object.
Without realizing it, I had used all of the above in the past for the development of an Android application; you may find it here or here. The goal of the application was to rate the strength of passwords for handheld devices. A strong password M must be complex, but as we focus on handheld devices, the frequency of appearance of each character in M is not the only property we may consider. The user of such a device has to change the keyboard layout in order to type upper case characters, lower case characters, symbols or numbers. This makes the input of M more complex than on desktop computers, and we may take this property into account when rating the complexity of M.
The rating of password strength still concerns the characters of a simple string M, but let us look at them as discrete objects with more than one property. Variable X is defined on the set of characters available to the user, and P(x) is the frequency of appearance of each character x in M. Variable Y is the property of each character of being an upper case character, a lower case character, a symbol or a number, and P(y) is the probability of y being each of these. The two variables are independent; X does not depend on Y. As a result, H(X,Y) = H(X) + H(Y). Calculating entropy using this equation provides a more accurate analysis of the information content of M.
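A sketch of this rating in Python; the character-class split below mirrors the keyboard-layout argument, though the exact scoring in the application may differ.

    from collections import Counter
    from math import log2

    def H(values):
        counts, n = Counter(values), len(values)
        return -sum((c / n) * log2(c / n) for c in counts.values())

    def char_class(c):
        # Y: upper case, lower case, digit or symbol -- each needs its own
        # keyboard view on a handheld device.
        if c.isupper(): return "upper"
        if c.islower(): return "lower"
        if c.isdigit(): return "digit"
        return "symbol"

    def strength(M):
        # X and Y are independent, so H(X,Y) = H(X) + H(Y)
        return H(list(M)) + H([char_class(c) for c in M])

    print(strength("aaaa1111"))  # low: two symbols, two classes
    print(strength("aB3#xY7!"))  # higher: varied symbols and classes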
For a more detailed presentation see my paper currently published in arxiv.org.

Friday 1 August 2014

Alan Turing’s bibliography with links

The importance of the work of Alan Turing is well known to everybody. The most complete resources on Turing's work are The Turing Digital Archive, the book "The Essential Turing" by Jack Copeland and the volumes of his "Collected Works". The problem with the Digital Archive is that it contains scans of manuscripts which are not so easy to read, yet I think it is very interesting to browse.
Below, I list most of Turing's papers and some links for them. The links are freely accessible and lead to easily readable PDF files that are scans or reprints of the original publications, or at least very close to them.
The links are listed as found on the web; any possible copyright or broken-link issues concern the owner of the site that hosts the linked files. In case you have any suggestions for the list, please add a comment.
At the end you may find something that should be an anecdote but is a real problem in academic publishing.


Turing, Alan M. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, 2 s. vol. 42 (1936–1937), pp. 230–265.

Turing, Alan M. "On Computable Numbers, with an Application to the Entscheidungsproblem. A Correction." Proceedings of the London Mathematical Society 2.1 (1938): 544-546.


Turing, Alan M. "Equivalence of left and right almost periodicity." Journal of the London Mathematical Society 1.4 (1935): 284-285.

Turing, Alan M. "Computability and λ-definability." The Journal of Symbolic Logic 2.04 (1937): 153-163.


Turing, Alan M. "The þ-function in λ-K-conversion." The Journal of Symbolic Logic 2.04 (1937): 164-164.
link 

Turing, Alan M. "Systems of logic based on ordinals." Proceedings of the London Mathematical Society 2.1 (1939): 161-228.

Turing, Alan M. "The use of dots as brackets in Church's system." The Journal of Symbolic Logic 7.04 (1942): 146-156.

Newman, M. H. A., and Turing, A. M. "A formal theorem in Church's theory of types." The Journal of Symbolic Logic 7.01 (1942): 28-33.

Turing, Alan M. "Turing’s treatise on Enigma." Unpublished Manuscript (1939).
link 

Turing, Alan M. "Lecture to the london mathematical society on 20 february 1947." MD COMPUTING 12 (1995): 390-390
link 

Turing, Alan M. "Some calculations of the Riemann zeta-function." Proceedings of the London Mathematical Society 3.1 (1953): 99-117.

Turing, Alan M., and Bayley, D. "Report on Speech Secrecy System DELILAH, a Technical Description Compiled by AM Turing and Lieutenant D. Bayley REME, 1945–1946." Cryptologia 36.4 (2012): 295-340.

Turing, Alan M. "Proposed electronic calculator, report for national physical laboratory, teddington." AM Turing’s ACE Report of 2 (1946).

Turing, Alan M. "Lecture on the Automatic Computing Engine (1947)." The Essential Turing by  B. Jack Copeland (2004): 362.

Turing, Alan M. "Rounding-off errors in matrix processes." The Quarterly Journal of Mechanics and Applied Mathematics 1.1 (1948): 287-308.

Turing, Alan M. "Intelligent Machinery, A Heretical Theory." The Turing Test: Verbal Behavior as the Hallmark of Intelligence (1948): 105.

Turing, Alan M. "Intelligent machinery-national physical laboratory report. b. meltzer b., d. michie, d.(eds) 1969, machine intelligence 5." Edinburgh: Edinburgh University 2 (1948).

Turing, Alan M. "Practical forms of type theory." The Journal of Symbolic Logic 13.02 (1948): 80-94.

Turing, Alan M. "Computing machinery and intelligence." Mind (1950): 433-460.

Turing, Alan M. "The word problem in semi-groups with cancellation." Annals of Mathematics (1950): 491-505.

Turing, Alan M. "THE CHEMICAL BASIS OF MORPHOGENESIS." Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 237.641 (1952): 37-72.

Turing, Alan M. "Chess." Computer chess compendium. Springer New York, 1988. 14-17.

This article is a reprint of pages 286-295 from:

B.V. Bowden (ed.), Faster Than Thought (A Symposium on Digital Computing Machines), Sir Isaac Pitman & Sons Ltd., 1953.

The original book in text and pdf form may be found here

Turing, Alan M. "Solvable and unsolvable problems." Science News-ens. fr 39 (1954).

A paper by Verónica Becher, Santiago Figueira and Rafael Picchi presenting the unpublished work of Turing on normal numbers:

Becher, Verónica, Santiago Figueira, and Rafael Picchi. "Turing’s unpublished algorithm for normal numbers." Theoretical Computer Science 377.1 (2007): 126-138.

Turing, Alan M. AM Turing's Original Proposal for the Development of an Electronic Computer: Reprinted with a Foreword by DW Davies. National Physical Laboratory, Division of Computer Science, 1972.


Turing, Alan M. "Proposal for development in the mathematics division of an automatic computing engine (ace)." Carpenter, BE, Doran, RW (eds) 46 (1986).

Turing, Alan M. "Checking a large routine." The early British computer conferences. MIT Press, 1989.
 
The next item should be an anecdote; it is not. In the link below, in the article written by Simone Santini, you may read some negative reviews that some incredible reviewers gave to some of the most important papers ever written. One of these papers is Turing's most famous paper.
link