by Long Zhang, KTH, on 25 Jun 2020 at 16:00 in zoom
Software systems contain resilience code to handle failures and unexpected events in production. It is essential for developers to understand and assess the resilience of their systems. Chaos engineering is a technique for assessing resilience and uncovering weaknesses by actively injecting perturbations in production. During this talk, I will introduce our research on application-level chaos engineering. More specifically, I will talk about resilience evaluation at the level of try-catch blocks, methods, and system call invocations.
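As a toy illustration of perturbation injection at a method boundary (a minimal Python sketch of the general idea, not the JVM tooling used in this research; all names here are hypothetical):

```python
import random

def perturb(func, exc=IOError, rate=1.0, rng=None):
    """Wrap `func` so that calls raise `exc` with probability `rate`,
    emulating a perturbation injected at a method boundary."""
    rng = rng or random.Random(0)
    def wrapper(*args, **kwargs):
        if rng.random() < rate:
            raise exc("injected failure")
        return func(*args, **kwargs)
    return wrapper

def fetch_config():
    return {"retries": 3}

# The resilience code under assessment: a try-catch with a fallback.
def load_config(fetch):
    try:
        return fetch()
    except IOError:
        return {"retries": 1}  # degraded-mode default

print(load_config(perturb(fetch_config)))  # fallback path exercised
print(load_config(fetch_config))           # normal path
```

Chaos engineering then observes whether the system stays correct when the perturbed path is taken, exactly as when the fallback value above is returned.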
by Jonas Haglund, KTH, on 29 May 2020 at 16:00 in zoom
I/O devices having direct memory access (DMA) are common in computer systems, e.g. storage devices, USBs and network interfaces. Since memory is shared between critical and non-critical software, it is important that the DMA driver of the operating system does not configure the DMA controller (DMAC) such that the DMAC reads confidential data or overwrites sensitive data or code. In order to configure the DMAC such that it does not perform any unwanted memory accesses, it must be known what an allowed DMAC state is.
I will present an approach for formally verifying which DMAC states are allowed from the perspective of memory isolation, based on the common features of DMACs. This is work in progress and will be presented together with previous and related work.
by Suprovat Ghoshal on 23 Mar 2020 at 13:15 in room 4523
The problem of learning t-term DNF formulas (for t=O(1)) has been studied extensively in the PAC model since its introduction by Valiant (STOC 1984). A t-term DNF can be efficiently learnt using a t-term DNF only if t=1, i.e., when it is an AND, while even weakly learning a 2-term DNF using a constant-term DNF was shown to be NP-hard by Khot and Saket (FOCS 2008). On the other hand, Feldman et al. (FOCS 2009) showed the hardness of weakly learning a noisy AND using a halfspace -- the latter being a generalization of an AND, while Khot and Saket (STOC 2008) showed that an intersection of two halfspaces is hard to weakly learn using any function of constantly many halfspaces. The question of whether a 2-term DNF is efficiently learnable using 2 or constantly many halfspaces remained open.
In this work we answer this question in the negative by showing the hardness of weakly learning a 2-term DNF as well as a noisy AND using any function of a constant number of halfspaces. In particular we prove the following: For any constants \nu,\zeta > 0 and \ell \in \mathbb{N}, given a distribution over point-value pairs {0,1}^n \times {0,1}, it is NP-hard to decide whether,
YES Case: There is a 2-term DNF that correctly classifies all the points of the distribution, and an AND that classifies at least a 1−\zeta fraction of the points correctly.
NO Case: Any boolean function depending on at most \ell halfspaces classifies at most 1/2+\nu fraction of the points of the distribution correctly.
Our result generalizes and strengthens the previous best results mentioned above on the hardness of learning a 2-term DNF, learning an intersection of two halfspaces, and learning a noisy AND.
This is joint work with Rishi Saket (IBM Research).
by Tony Gorschek, Blekinge Institute of Technology, on 03 Mar 2020 at 11:15 in room 1537
Software Engineering ReThought is a new research project with the aim of taking on the next-generation challenges facing companies developing software-intensive systems and products. As an engineering lab, we are working on introducing third-generation empirical software engineering, denoting co-production of pragmatic problem-solving in close collaboration with our industrial partners as we perform engineering research. The focus is on how to effect change, today.
by Kilian Risse, KTH, on 21 Feb 2020 at 13:15 in room 4523
Polynomial calculus is a proof system that can be used to determine whether a given system of polynomials is consistent. In every step of the derivation we may either take a linear combination of two previously derived polynomials or multiply a derived polynomial by a variable. The goal is to derive 1, which shows that the given system of polynomials is inconsistent. An important measure of complexity is the minimum degree required to refute an inconsistent system.
In this seminar we show a degree lower bound for the pigeonhole principle formula claiming that m pigeons can be mapped one-to-one to n holes for m > n. The lower bound follows from a more general statement that precisely characterizes polynomials that are derivable in low degree.
This is based on joint work with Jakob Nordström, Anastasia Sofronova, Dmitry Sokolov and Marc Vinyals.
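As a toy illustration of the two derivation rules of polynomial calculus (the polynomial representation below is an ad-hoc sketch, not from the talk):

```python
from collections import defaultdict

# A polynomial is a dict mapping a monomial (sorted tuple of variable
# names) to its coefficient; the empty tuple is the constant monomial.

def lin_comb(a, p, b, q):
    """Rule 1: the linear combination a*p + b*q of derived polynomials."""
    r = defaultdict(int)
    for m, c in p.items():
        r[m] += a * c
    for m, c in q.items():
        r[m] += b * c
    return {m: c for m, c in r.items() if c != 0}

def mul_var(p, x):
    """Rule 2: multiply a derived polynomial by a variable x."""
    return {tuple(sorted(m + (x,))): c for m, c in p.items()}

# Axioms: x*y - 1 = 0 and x = 0 -- an inconsistent system.
xy_minus_1 = {("x", "y"): 1, (): -1}
x_poly     = {("x",): 1}

xy  = mul_var(x_poly, "y")             # from x, derive x*y (Rule 2)
one = lin_comb(-1, xy_minus_1, 1, xy)  # -(x*y - 1) + x*y = 1 (Rule 1)
print(one)  # {(): 1} -- the constant 1, so the system is refuted
```

This tiny refutation has degree 2; the degree lower bound in the talk shows that for the pigeonhole principle no refutation of low degree exists.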
17 Dec 2019 at 12:00 in 1537
Towards Optimal Algorithms For Single-Source Shortest Paths in Dynamic Graphs
(Maximilian Probst Gutenberg, University of Copenhagen)
In this talk, I will review the state of the art for Single-Source Shortest Paths (SSSP) and discuss, at a high level, very recent advances that make promising progress on different variations of the problem. In particular, I will give a brief introduction to the framework that was used to obtain the currently best deterministic algorithm for SSSP in undirected graphs. We then outline progress in the directed setting, which still seems to be rather poorly understood. Finally, I will talk about future directions towards solving the problem (near-)optimally.
16 Dec 2019 at 13:15 in 1537
Circular proofs with names for the hybrid mu-calculus
(Sebastian Enqvist, Stockholm University)
I present ongoing work on developing a circular proof system for the hybrid mu-calculus, in the style of Stirling's proof system for the modal mu-calculus which uses names for fixpoint unfoldings. Stirling's system has recently been used by Afshari and Leigh to provide a cut-free complete sequent system for the mu-calculus, and I am hoping to extend their result to the hybrid mu-calculus in future work. The hybrid mu-calculus extends the modal mu-calculus with nominals referring to points in a model and corresponding satisfaction operators that describe which formulas are true at the value of a nominal. While this is a relatively gentle extension of the mu-calculus, it presents some new challenges, and could provide a stepping stone towards proof theory for even richer languages like hybrid mu-calculus with converse modalities or guarded fixpoint logic.
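For readers unfamiliar with the notation, the two ingredients combined here are the standard least-fixpoint unfolding of the mu-calculus and the hybrid satisfaction operator (standard definitions, not specific to the talk):

```latex
% Fixpoint unfolding: a least fixpoint formula is equivalent
% to one unfolding of itself.
\mu x.\,\varphi \;\equiv\; \varphi[\mu x.\,\varphi / x]

% Satisfaction operator: @_i \varphi holds at any point iff
% \varphi holds at the unique point named by the nominal i.
\mathcal{M}, w \models @_i\,\varphi
  \quad\iff\quad
\mathcal{M}, w_i \models \varphi
  \qquad\text{where } w_i \text{ is the denotation of } i
```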
28 Nov 2019 at 11:15 in 1537
Engineering a Safer World: introduction to Prover Technology
(Iago Abal, Prover Technology)
Prover Technology provides software products and services for development and V&V of railway signaling systems. Our customer base includes railway infrastructure managers and railway signaling system suppliers around the world, as well as tool developers that license our model-checking technology.
In this talk we will take a walk through our product offering, the technical challenges we are facing, and some of our current R&D projects.
22 Nov 2019 at 10:00 in 1537
DRaGen: a tool for generating random values for abstract data types
(Alejandro Russo, Chalmers)
In random testing, it is challenging to control the distributions of random data generators, especially when it comes to complex data like user-defined algebraic data types (ADTs). In this talk, we show how to adapt results from an area of mathematics known as branching processes, and show how they help to analytically predict (at compile time) the expected number of generated constructors, even in the presence of mutually recursive or composite ADTs. Using our probabilistic formulas, we design heuristics capable of automatically adjusting probabilities in order to synthesize generators whose distributions are aligned with users' demands. We provide a Haskell implementation of our mechanism in a tool called DRaGen and perform case studies with real-world applications. Furthermore, we will also show that it sometimes becomes necessary to generate structurally richer ADT values in order to test deeper software layers. For that, we leverage static information found in the codebase to improve the generation process. Namely, our generators are capable of considering how programs branch on input data as well as how ADT values are built via interfaces, while still providing compile-time guarantees and heuristics that try to adjust their distributions to users' demands. When generating random values, DRaGen's synthesized random generators show improvements in code coverage when compared with those automatically derived by state-of-the-art tools.
This talk is based on the following articles: http://www.cse.chalmers.se/~russo/publications_files/haskell2018.pdf (HASKELL 2018)
http://www.cse.chalmers.se/~mista/assets/pdf/ast19.pdf (AST@ICSE 2019)
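To illustrate the branching-process prediction, here is a Python simulation of a hypothetical two-constructor tree type (`data Tree = Leaf | Node Tree Tree`); DRaGen itself is a Haskell tool, and this sketch is only an analogy:

```python
import random

# Generator model: emit Node with probability p (spawning two recursive
# calls), else Leaf. The offspring mean of the branching process is
# m = 2p, so for p < 1/2 the expected total number of constructors is
# the expected total progeny, 1 / (1 - 2p).

def gen_size(p, rng, budget=10**6):
    """Number of constructors in one randomly generated tree."""
    size, pending = 0, 1          # pending = subtrees left to generate
    while pending and size < budget:
        pending -= 1
        size += 1                 # one constructor emitted (Node or Leaf)
        if rng.random() < p:      # Node: two more subtrees to generate
            pending += 2
    return size

p = 0.3
predicted = 1 / (1 - 2 * p)       # 2.5 constructors on average
rng = random.Random(42)
observed = sum(gen_size(p, rng) for _ in range(10000)) / 10000
print(predicted, round(observed, 2))
```

The analytic value is available before generating anything, which is what lets DRaGen tune constructor probabilities at compile time rather than by trial and error.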
20 Nov 2019 at 10:15 in 1537
Language support for consistency in distributed systems
(Xin Zhao)
Large-scale distributed systems replicate data in order to tolerate faults such as network partitions. However, frequent replica updates (e.g., due to write-heavy workloads) as well as geo-replication (i.e., geographically-distributed replicas) make replica synchronization very expensive. This challenge suggests two research directions: (1) lowering the cost of synchronization while satisfying given consistency requirements; and (2) building systems that weaken consistency requirements when possible.
In this talk, I will present our results which make progress in both of these directions. (1) A consistency model and a corresponding protocol which integrate conflict-free replicated data types with a strongly consistent system to reduce synchronization cost while providing observable atomic consistency guarantees. (2) A type-and-effect system for a higher-order distributed language which enables the safe use of different consistency levels within the same program.
11 Nov 2019 at 13:15 in 4523
30% Seminar: Secure and Trustworthy Configurations of DMA Devices
(Jonas Haglund)
I will give a presentation about the research I have done so far and my future research plans.
I will describe my contributions to a formally verified secure embedded Linux system with Internet connectivity. This system is secure in the sense that only Linux code (kernel and applications) with certain fingerprints (hash values) can be executed. Linux is executed as a guest on top of a formally verified hypervisor that isolates its guests, and alongside another formally verified guest that checks that executable Linux code has valid hash values. To make the system more usable, Internet connectivity is added by enabling Linux to use a network interface controller (NIC). However, since the NIC can read and write arbitrary memory regions, if Linux is in direct control of the NIC, Linux could use the NIC to inject untrusted code, modify the database of valid hash values, or modify the code of the hypervisor or the page tables to get access to all system resources. Therefore the NIC must be protected by the hypervisor so that the NIC can only access certain memory regions.
My contribution concerns the protection of the NIC. The NIC is protected by a NIC monitor of about 1000 lines of C code located in the hypervisor. The NIC monitor is invoked each time Linux attempts to reconfigure the NIC. It checks that the reconfiguration satisfies a security invariant of the NIC, and applies the reconfiguration only if it does. The NIC monitor code has not been verified. However, the security invariant of the NIC has been verified in the interactive theorem prover HOL4. This verification involves three components: a detailed formal model of the NIC that describes all memory accesses the NIC can perform; the definition of the security invariant, stating which NIC states are secure; and the proof that each NIC operation preserves the invariant and that the invariant implies the NIC can access only certain memory regions.
Making a model of the NIC, identifying the security invariant, and proving that the invariant is correct are time-consuming tasks. Therefore, I plan to investigate how this work can be generalized and automated to cover any I/O device that can perform direct memory accesses (DMA). This would involve identifying and formally describing the common operations of DMA devices, and automatically proving that these operations preserve a "universal security invariant". Furthermore, it would be helpful to be able to automatically generate models of hardware from its formal design (written in, e.g., VHDL or Verilog). This would speed up the verification process and significantly enhance the trustworthiness of the verification.
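The monitor pattern described above can be sketched as follows (a toy Python analogy with hypothetical addresses and descriptor format, not the actual C monitor or NIC interface):

```python
# Memory regions the NIC is allowed to access -- hypothetical addresses.
ALLOWED = [(0x8000_0000, 0x8010_0000)]

class RejectedConfig(Exception):
    pass

def nic_monitor(desc, allowed=ALLOWED):
    """Check a guest's buffer descriptor against the security invariant:
    every byte the NIC may read or write lies inside an allowed region.
    Only then is the reconfiguration applied."""
    start, end = desc["addr"], desc["addr"] + desc["length"]
    if not any(lo <= start and end <= hi for lo, hi in allowed):
        raise RejectedConfig(hex(start))
    return desc  # invariant holds: apply the reconfiguration

nic_monitor({"addr": 0x8000_1000, "length": 0x100})      # accepted
try:
    nic_monitor({"addr": 0x0000_1000, "length": 0x100})  # outside regions
except RejectedConfig as e:
    print("rejected:", e)
```

The verified invariant plays the role of the `allowed` check: it is what guarantees that accepted configurations can never lead the NIC to touch hypervisor memory or the hash database.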
11 Nov 2019 at 11:15 in 4523
On Benchmarks of Bugs for Studies in Automatic Program Repair
(Fernanda Madeiral)
Benchmarks play an important role in any research field for evaluating competing methods and tools. In the context of automatic program repair, benchmarks of bugs are the means of investigating the performance of repair tools in fixing bugs. Despite their great importance in empirical studies, benchmarks of bugs have received little attention in the field. In this talk, I will first discuss the importance of adopting publicly available benchmarks of bugs for evaluating repair tools, followed by a presentation of the state of the art of benchmarks of bugs for the field. Second, I will present the recently defined benchmark overfitting problem, and its investigation in the context of 11 repair tools and the five existing benchmarks of bugs for the Java language. Finally, I will discuss several unexplored topics on benchmarks, which are ongoing works or opportunities for future research.
06 Nov 2019 at 11:15 in 4523
Supporting Continuous Engineering with Automated Log Clustering and Diagnosis
(Leon Moonen, Simula Research Laboratory)
Continuous engineering (CE) practices, such as continuous integration and continuous deployment, have become key to modern software development. They are characterized by short automated build and test cycles that give developers early feedback on potential issues. CE practices help to release software more frequently, and reduce risk by increasing incrementality. However, effective use of CE practices in industrial projects requires making sense of the vast amounts of data that result from the repeated build and test cycles.
In this talk, we will explore to what extent these data can be treated more effectively by automatically grouping logs of runs that failed for the same underlying reasons, and what effort reduction can be achieved. Our proposed approach builds on earlier work for system log clustering, and we empirically investigate how choices in the clustering pipeline affect the grouping of continuous deployment logs provided by our industrial collaborator. We conclude by discussing a log diagnosis technique that builds on event spectra and can be used to highlight which events in a failure log are most likely to indicate the causes of the failure.
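A minimal sketch of the grouping idea, as a greedy clustering over number-masked tokens (my own toy illustration; the clustering pipeline studied in the talk differs):

```python
import re

def tokens(log):
    """Tokenize a log line into a set of words, masking digits so that
    timestamps and build ids do not separate identical failures."""
    return frozenset(re.sub(r"\d+", "#", w) for w in log.split())

def cluster(logs, threshold=0.6):
    """Greedily group logs by Jaccard similarity of their token sets."""
    clusters = []  # list of (representative_tokens, member_logs)
    for log in logs:
        t = tokens(log)
        for rep, members in clusters:
            if len(t & rep) / len(t | rep) >= threshold:
                members.append(log)
                break
        else:
            clusters.append((t, [log]))
    return [members for _, members in clusters]

logs = [
    "ERROR 12:01 build 4711 failed: connection to db timed out",
    "ERROR 13:44 build 4712 failed: connection to db timed out",
    "ERROR 13:59 build 4713 failed: assertion in test_parser",
]
for group in cluster(logs):
    print(len(group), group[0])
```

The first two logs land in one cluster despite differing timestamps and build numbers, so a developer triages one representative failure instead of two.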
04 Nov 2019 at 10:15 in 4523
Deniability, Anonymity and Verifiability: Towards Democracy-Enhancing Technologies
(Daniel Bosk, KTH)
Democracy is a decentralized process in which each individual participates in a common-knowledge formation process. There are several requirements for this to work, one of which is free speech. In this talk, we will discuss some technologies to support this process, "democracy-enhancing technologies".
Totalitarian regimes usually perform surveillance to, among other things, limit free speech through induced self-censorship, thereby disrupting this common-knowledge formation. We introduce a messaging protocol that provides deniability against a stronger-than-usual adversary. We also investigate how to distribute messages to their recipients while hiding who communicates with whom, and how to do robust anonymous communication without a reliable infrastructure.
However, providing channels for free speech with strong privacy properties is not enough. They can also be abused and turned against democracy: the adversary can "poison" the public space through disinformation campaigns, e.g. spreading "fake news" using these channels. Some cases of this problem can be mitigated by introducing mechanisms that provide verifiable statements. We introduce a protocol to verifiably ascertain the participation counts of protests, but in a privacy-preserving manner to counteract surveillance.
From a scientific point of view, these problems pose an interesting interplay between privacy (deniability, anonymity) on the one hand and verifiability on the other. We can achieve the desired properties by developing our own protocols and, whenever possible, modifying existing cryptographic protocols to work with the weaker assumptions (e.g. on the limits of the adversary) required in the context of state surveillance.
28 Oct 2019 at 13:15 in 4523
Automatic Unit Test Amplification for DevOps
(Benjamin Danglot, PhD candidate at Inria Lille)
In recent years, unit testing has become an essential part of any serious software project, to verify its proper functioning. However, writing tests is tedious and expensive for the industry: it is difficult to evaluate the return on investment. To overcome this problem, researchers are investigating the automation of test creation. Although these automatically generated test suites are of high quality, there are still barriers to industry adoption of such techniques, because of the difficulty of understanding, integrating, and managing the generated test suite. The objective of this thesis is to overcome these difficulties by proposing a new technique that produces new tests based on existing tests. This technique is called "test amplification". By construction, these new tests are close to tests written by humans, and are therefore easy to understand, integrate, and manage. The approach is implemented in a tool called DSpot.
21 Oct 2019 at 11:15 in 1537
(30% Seminar) Security and privacy of smart home-based IoT systems
(Md Sakib Nizam Khan, KTH)
In this talk, I will discuss two of our research works concerning the security and privacy of smart home-based IoT systems.
The first work that I will present is titled ‘chownIoT: Enhancing IoT Privacy by Automated Handling of Ownership Change’. With the increasing deployment of smart home IoT devices, their ownership is likely to change during their life-cycle. IoT devices, especially those used in smart home environments, contain privacy-sensitive user data, and any ownership change of such devices can result in privacy leaks. In this work, we present a system called chownIoT, capable of securely handling ownership change of IoT devices. This work was published at the IFIP International Summer School 2018 on Privacy and Identity Management. For more information, please follow the link to the full paper: https://link.springer.com/chapter/10.1007/978-3-030-16744-8_14
The second work that I will discuss is our upcoming survey paper on smart home-based Ambient Assisted Living (AAL) or elderly health monitoring systems and their security and privacy considerations and implications. In this work, we performed a systematic literature review on smart home/IoT based AAL systems and their privacy and security aspects. Besides literature, we also performed a survey on commercially available AAL systems to understand the gap in terms of privacy and security between the state of the art research and industry. I will discuss the method of our systematic literature review and also our findings.
Lastly, I will also shed some light on our future research directions concerning the privacy and security of IoT systems.
11 Oct 2019 at 11:15 in 4523
Failure-Driven Program Repair
(Davide Ginelli, PhD student at the University of Milano-Bicocca)
Program repair techniques can dramatically reduce the cost of program debugging by automatically generating program fixes. Although program repair has already been successful with several classes of faults, it has also turned out to be quite limited in the complexity of the fixes that can be generated. This Ph.D. thesis addresses the problem of cost-effectively generating fixes of higher complexity by investigating how to exploit failure information to directly shape the repair process. In particular, this thesis proposes Failure-Driven Program Repair, a novel approach to program repair that exploits its knowledge about both the possible failures and the corresponding repair strategies to produce highly specialized repair tasks that can effectively generate non-trivial fixes.
10 Oct 2019 at 13:15 in 4523
(50% seminar) Trustworthy Binary Analysis
(Andreas Lindner, KTH)
The safety and security of a computer system ultimately hinge on its hardware and the interactions with it. Binary program code is the software that describes these interactions; typical examples of critical software in this context are implementations of device drivers and isolation abstractions like virtual memory. More advanced challenges emerge when software implementations are supposed to cover security flaws in the hardware design by limiting their hardware interactions in very particular ways. In this talk, I will present some of our research work related to theorem-prover-powered binary analysis. The main topics will be a summary of our previous work on our binary verification framework HOLBA, a generalization of Hoare triples for binary code, and worst-case execution time analysis using symbolic execution.
07 Oct 2019 at 13:15 in 4523
TripleAgent: Monitoring, Perturbation and Failure-obliviousness for Automated Resilience Improvement in Java Applications
(Long Zhang, KTH)
In this talk, I would like to present TripleAgent, a novel resilience improvement system for Java applications. The unique feature of this system is that it combines automated monitoring, automated perturbation injection, and automated resilience improvement. The latter is achieved thanks to failure-oblivious computing, a concept introduced in 2004 by Rinard and colleagues. We design and implement the system as agents for the Java virtual machine. We evaluate the system on two real-world applications: a file transfer client and an email server. Our results show that it is possible to automatically improve the resilience of Java applications with respect to uncaught or mishandled exceptions. Paper, if you are interested: https://arxiv.org/pdf/1812.10706.pdf
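A minimal sketch of the failure-oblivious idea (a toy Python analogy of the Rinard et al. concept; TripleAgent itself operates as JVM agents, not decorators):

```python
def failure_oblivious(default):
    """Decorator emulating failure-oblivious computing: if the wrapped
    code raises, discard the error and continue with a manufactured
    value instead of crashing."""
    def deco(func):
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                return default  # manufactured value; execution continues
        return wrapper
    return deco

@failure_oblivious(default=0)
def read_field(record, index):
    return record[index]

record = [10, 20, 30]
print(read_field(record, 1))    # 20: normal execution
print(read_field(record, 99))   # 0: out-of-bounds read survives
```

The research question is when such manufactured values keep the application usefully alive (e.g. a file transfer completing) rather than silently corrupting it, which is what the monitoring component assesses.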
17 Sep 2019 at 10:15 in 1537
(50% seminar) d-Galvin Families and Separating Arithmetic Circuit Classes
(Joseph Swernofsky)
I will present results from my paper "d-Galvin Families" (with Johan Håstad and Guillaume Lagarde). The Galvin problem asks for the minimum size of a family F of subsets of [n] such that for any set A of size n/2 we can find some S in F that is balanced on A, i.e., exactly half of the elements of S lie in A. We generalize this, finding a polynomial-size F such that for any A we can partition [n] into d pieces, each balanced on A.
This generalization is motivated by an approach to separating arithmetic circuit classes. Specifically, it rules out an approach to a super-polynomial separation between the size of syntactically multilinear arithmetic branching programs and arbitrary arithmetic circuits. In this talk I will also present questions about identity testing for arithmetic circuits that I have considered in my research.
One approach to checking if a circuit evaluates to 0 is to check its value at certain (often random) inputs. I show a lower bound on the size of matrix inputs that are required to find a nonzero point of a polynomial. I also discuss questions related to derandomizing deterministic identity testing.
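A brute-force check of the Galvin property on a tiny instance (my own illustration, assuming "balanced on A" means exactly half of S lies in A):

```python
from itertools import combinations

def balanced(S, A):
    """S is balanced on A iff exactly half of S's elements lie in A."""
    return 2 * len(S & A) == len(S)

def is_galvin_family(F, n):
    """Check: for every A of size n/2, some S in F is balanced on A."""
    universe = range(1, n + 1)
    return all(
        any(balanced(S, set(A)) for S in F)
        for A in combinations(universe, n // 2)
    )

# A tiny instance (n = 4): two sets already suffice.
F = [{1, 2}, {1, 3}]
print(is_galvin_family(F, 4))         # True
print(is_galvin_family([{1, 2}], 4))  # False: e.g. A = {3, 4} defeats it
```

The research question is how small such a family can be as n grows, and in the d-generalization, whether a polynomial-size family can yield a full d-part balanced partition.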
09 Sep 2019 at 13:15 in 4523
mCoq: Mutation Proving for Analysis of Verification Projects
(Karl Palmskog, The University of Texas at Austin)
Mutation analysis, which introduces artificial defects into software systems, is the basis of mutation testing, a technique widely applied to evaluate and enhance the quality of test suites. However, despite the deep analogy between tests and formal proofs, mutation analysis has seldom been considered in the context of deductive verification. We propose mutation proving, a technique for analyzing verification projects that use proof assistants. We implemented our technique for the Coq proof assistant in a tool dubbed mCoq. mCoq applies a set of mutation operators to Coq definitions of functions and datatypes, inspired by operators previously proposed for functional programming languages. mCoq then checks proofs of lemmas affected by operator application. To make our technique feasible in practice, we implemented several optimizations in mCoq such as parallel proof checking. We applied mCoq to several medium and large scale Coq projects, and recorded whether proofs passed or failed when applying different mutation operators. We then qualitatively analyzed the mutants, and found several examples of weak and incomplete specifications. For our evaluation, we made many improvements to serialization of Coq code and even discovered a notable bug in Coq itself, all acknowledged by developers. We believe mCoq can be useful both to proof engineers for improving the quality of their verification projects and to researchers for evaluating proof engineering techniques.
Joint work with Ahmet Celik, Marinela Parovic, Emilio Jésus Gallego Arias, and Milos Gligoric.
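To illustrate the analogy with mutation testing that mCoq transfers to proofs (a toy Python example; mCoq's operators act on Coq definitions and its "test suite" is proof checking):

```python
# Mutation analysis: apply small syntactic mutations to a definition,
# then check whether the verification step (here: a test suite; in
# mCoq: re-checking the affected lemmas) catches each mutant.

IMPLS = {
    "orig":       lambda xs: sorted(xs),
    "empty_list": lambda xs: [],                        # mutant 1
    "reverse":    lambda xs: sorted(xs, reverse=True),  # mutant 2
}

def suite(sort):
    """A weak spec: checks length and membership, but never order."""
    out = sort([3, 1, 2])
    return len(out) == 3 and set(out) == {1, 2, 3}

results = {name: suite(f) for name, f in IMPLS.items()}
print(results)
# 'empty_list' is killed, but 'reverse' survives: the spec never pins
# down the output order, revealing an incomplete specification.
```

Surviving mutants in mCoq play the same role: a mutated function whose lemmas all still check points at a weak or incomplete specification.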
05 Sep 2019 at 11:15 in 1537
Neural machine translation for automatic software repair: the Transformer architecture
(Orestis Floros)
In neural machine translation (NMT), state-of-the-art developments are driven by the wide adoption of the Transformer architecture, which outperforms older LSTM-based models. This architecture has more recently also been used to achieve significant results in other natural language tasks by utilizing transfer learning. In automatic software repair, the process of automatically fixing software bugs, LSTM-based sequence-to-sequence learning models have been applied to correct buggy lines in a manner similar to the machine translation task. This presentation introduces the recent advances in the area of NMT, discusses their application to automatic software repair, presents some preliminary results, and proposes ideas for future steps that can improve the effectiveness of NMT methods for software repair.
04 Sep 2019 at 15:15 in 4523
Bisimulations for logics of strategic reasoning
(Catalin Dima, Université Paris-Est Créteil)
The notion of bisimulation is of central importance in modal logics, as it frequently allows characterizing the expressive power of a logic within a class of Kripke models. We present some recent work towards characterizing the expressive power of some logics of strategic reasoning in multi-agent systems with perfect or imperfect information, with the aid of appropriate notions of bisimulation. The talk will also provide a variety of counterexamples of systems related by some type of bisimulation but not satisfying the same formulas in one of the strategy logics under consideration. If time allows, we will also present some applications: the verification of coercion-freeness properties in a simple voting system, and the statement of the elimination of dominated strategies in multi-player games as a bisimulation property.
21 Aug 2019 at 11:01 in 4523, Lindstedtsvägen 5 - floor 5
Dependencies Gone Wild: Tales from automating 10,000+ projects' dependency updates
(Rhys Arkins)
Renovate Bot is a tool used to automate dependency updates across dozens of package managers, in use by tens of thousands of active projects. This presentation will focus on lessons learned working with dependencies in the wild: vulnerabilities, different approaches across languages, and the use and misuse of semver and other versioning approaches.
20 Aug 2019 at 14:30 in Room 4523
(50% seminar) Algebraic Data-Structures for Distances in Graphs
(Jan van den Brand)
I will present results from my two upcoming FOCS papers "Dynamic Approximate Shortest Paths and Beyond: Subquadratic and Worst-Case Update Time" (with Danupon Nanongkai) and "Sensitive Distance and Reachability Oracles for Large Batch Updates" (with Thatchaphol Saranurak). In both papers we use algebraic techniques to create new data structures which can efficiently maintain distances in a graph that undergoes changes to its edges. We answer several open questions:
* When we want to maintain distances from a single source node, a fundamental question is to beat trivially calling Dijkstra's algorithm whenever the graph changes, which takes O(n^2) time on dense graphs. We answer this question by giving an O(n^1.823) upper bound.
* We also obtain the first data structure with subquadratic update time for nearly 1.5-approximating the diameter.
* When we want to maintain distances between all pairs of nodes explicitly, the O(n^2) amortized update time by Demetrescu and Italiano [STOC'03] already matches the trivial Omega(n^2) lower bound. A fundamental question is whether it can be made worst-case. We answer this question by giving an O(n^2) worst-case upper bound for undirected unweighted graphs.
* We give the first failure-sensitive distance oracle that can maintain distances in a graph for any number of edge failures in subquadratic time.
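For context, the trivial baseline mentioned above (recomputing Dijkstra from scratch after every edge change) can be sketched as follows; the data structures in the talk beat this without rerunning anything from scratch:

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest paths from src; adj maps a node to a
    list of (neighbor, weight) pairs."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

# Trivial "dynamic" SSSP: recompute after every update.
adj = {0: [(1, 5), (2, 1)], 2: [(1, 1)]}
print(dijkstra(adj, 0)[1])   # 2, via node 2
adj[2] = []                  # delete edge (2, 1)
print(dijkstra(adj, 0)[1])   # 5, only the direct edge remains
```

On a dense graph each recomputation costs on the order of n^2, which is exactly the per-update bound the O(n^1.823) result improves upon.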
19 Aug 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Testing Small Vertex Connectivity in Near-Linear Queries.
(Sorrachai Yingchareonthawornchai, Aalto University, Finland)
Vertex connectivity is a classic, extensively studied problem. Given an integer k, the goal is to decide if an n-node m-edge graph G can be disconnected by removing k-1 vertices (otherwise, G is k-connected). In the study of testing graph properties, we want to distinguish between the case when G is k-connected and the case when it is "far" from having this property. It has been open for many years whether the bounds from [Goldreich, Ron'02], [Orenstein, Ron'11], [Yoshida, Ito'11 and '12], which are exponential in k, can be made polynomial (this was asked in, e.g., [Orenstein, Ron'11]). In this talk, we give a testing algorithm whose number of edge queries is near-linear in k. The key technique is to adapt the recent local vertex connectivity algorithm [Nanongkai, Saranurak, Yingchareonthawornchai'19] into an appropriate form for property testing. This talk is based on joint work with Sebastian Forster, Danupon Nanongkai, Thatchaphol Saranurak, and Liu Yang.
Note: Sorrachai will be visiting Danupon between August 19-30
14 Aug 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Fast Approximate Shortest Paths in the Congested Clique
(Michal Dory, Technion, Israel)
I will discuss our recent shortest paths algorithms in the distributed congested clique model. Our first contribution is a (2+epsilon)-approximation algorithm for all-pairs shortest paths (APSP) requiring only a polylogarithmic number of rounds. This is the first sub-polynomial constant approximation for APSP in this model. We also significantly improve upon the state-of-the-art for approximate multi-source shortest paths and diameter, as well as exact single-source shortest paths (SSSP). Our main techniques are new distance tools that are obtained via improved algorithms for sparse matrix multiplication, which we leverage to construct efficient hopsets and shortest paths. Joint work with Keren Censor-Hillel, Janne H. Korhonen and Dean Leitersdorf.
Note: Michal will be visiting Danupon between Aug 12-30
20 Jun 2019 at 09:00 in 4423, Lindstedtsvägen 5 - floor 4
Aspects of Cyber Situational Awareness - Ensuring cyber mission success with relevant insights
(Stefan Varga)
In order to ensure success in cyber related activities, such as upholding a desired level of availability and quality of IT-services, organizations in general and IT management in particular, need to have an adequate level of knowledge and insights pertaining to threats and other circumstances that may affect the ability to reach the goals.
Cyber situational awareness is about being sufficiently well informed about factors that will influence one's cyber mission. IT staff on different levels have varying but overlapping information requirements that will enable them to solve their respective tasks successfully. One example of a cyber mission would be to uphold the security stance of an IT system. On the strategic or national level it may prove useful to have knowledge of the modus operandi and activity patterns of adversaries in general, whereas more specific information, such as signatures and other indicators of compromise, is of greater use on the tactical level. We hypothesize that the level of correct awareness to some extent corresponds to decision-makers' ability to make rational choices that ultimately affect security. Hence, to gain cyber situational awareness, it is important to carefully select the underlying needed information, process it, and present it effectively to decision-makers.
During the seminar Stefan will present the context of cyber situational awareness as well as the results of research done so far. He will also outline planned forthcoming research activities.
27 May 2019 at 13:15 in 4523, Lindstedtsvägen 5
Automated Approaches for Formal Verification of Simulink Models
(Predrag Filipovikj, Mälardalen University)
Simulink is the de facto standard model-based development environment in many domains, such as automotive, avionics, and railways. For quality assurance of Simulink models, the industrial state of practice resorts to simulation, which gives insight into the system's behavior, yet does not provide a high degree of assurance that the model behaves correctly. With the ever-increasing size, complexity, and safety relevance of the software functions developed in Simulink, the industry has an imperative need for automated and mathematically rigorous approaches to the verification of Simulink models, which provide a higher degree of confidence in the verification result.
20 May 2019 at 14:15 in 4523, Lindstedtsvägen 5 - floor 5
Recent Progress in Distributed Edge Connectivity
(Mohit Daga)
In this talk, we present an overview of our recent result in distributed edge connectivity. In particular, we present the first sublinear-time algorithm for finding the edge connectivity k in the CONGEST model, as long as there are no parallel edges. Our algorithm takes O(n^{1-1/353}D^{1/353}+n^{1-1/706}) time to compute k and a cut of cardinality k with high probability, where n and D are the number of nodes and the diameter of the network, respectively. This running time is sublinear in n (i.e. O(n^{1-\epsilon})) whenever D is.
This is joint work with Monika Henzinger, Danupon Nanongkai and Thatchaphol Saranurak. This talk will count as Mohit's 30% seminar.
08 May 2019 at 14:15 in 4523
Breaking Quadratic Time for Small Vertex Connectivity
(Thatchaphol Saranurak)
06 May 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Coming: a Tool for Mining Change Pattern Instances from Git Commits
(Matias Sebastian Martinez, Université Polytechnique Hauts-de-France (France))
Software repositories such as Git have become a relevant source of information for software engineering researchers. For instance, the detection of commits that fulfill a given criterion (e.g., bug-fixing commits) is one of the most frequent tasks done to understand software evolution. However, to our knowledge, there is no open-source tool that, given a Git repository, returns all the instances of a given code change pattern. In this talk, I will present Coming, a tool that takes as input a Git repository and mines instances of code change patterns in each commit. To do so, Coming computes fine-grained code changes between two consecutive revisions, analyzes those changes to determine whether they correspond to an instance of a change pattern (specified by the user in XML), and finally, after analyzing all the commits, presents a) the frequency of code changes and b) the instances found in each commit.
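Coming itself is a Java tool that works on fine-grained AST diffs and XML pattern specifications, but the overall loop it implements — diff two consecutive revisions, then match the resulting hunks against a change pattern — can be sketched in a few lines of Python. The line-based diff and the toy "wrap a statement in an if-check" pattern below are simplifications for illustration, not Coming's actual matcher:

```python
import difflib

def diff_hunks(before: str, after: str):
    """Yield (removed, added) line groups between two file revisions."""
    sm = difflib.SequenceMatcher(a=before.splitlines(), b=after.splitlines())
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            yield sm.a[i1:i2], sm.b[j1:j2]

def matches_wrap_with_if(removed, added):
    """Toy change pattern: a single statement gets wrapped in a new if-check."""
    return (len(removed) == 1
            and any(line.strip().startswith("if ") for line in added)
            and any(removed[0].strip() in line for line in added))

# Two consecutive revisions of the same (one-line) file.
before = "x.close()\n"
after = "if x is not None:\n    x.close()\n"
instances = [h for h in diff_hunks(before, after) if matches_wrap_with_if(*h)]
print(len(instances))  # 1 pattern instance found in this "commit"
```

A real mining run would repeat this over every pair of consecutive revisions touched by each commit and aggregate the counts per pattern.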
29 Apr 2019 at 12:00 in 4523, Lindstedtsvägen 5 - floor 5
Extended Formulation Lower Bounds for Refuting Random CSPs
(Jonah Brown-Cohen)
PRACTICALITIES
Lunch is served at 12:00 noon (register at http://www.csc.kth.se/tcs/seminars/lunch/2019-03-29 at the latest the Saturday before, but preferably earlier). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for 1½-2 hours of more technical discussions.
There will be a more detailed technical discussion after the seminar.
ABSTRACT
Random constraint satisfaction problems (CSPs) such as random 3-SAT are conjectured to be computationally intractable. The average case hardness of random 3-SAT and other CSPs has broad and far-reaching implications for problems in approximation, learning theory and cryptography. In many cases, the best known algorithms for CSPs come from linear or semi-definite programs, so a natural question is to prove lower bounds on the size of such programs for random CSPs.
In this talk, we will give sub-exponential lower bounds on the size of linear programming extended formulations for refuting random instances of constraint satisfaction problems. In particular, our results imply that any LP extended formulation that refutes a constant fraction of random k-SAT instances (with constant clause density) must have size exponential in n^{1 - 2/k}.
We use the technique of pseudo-calibration to directly obtain extended formulation lower bounds from the distribution on instances with a planted satisfying assignment. This approach bypasses the need to construct Sherali-Adams integrality gaps in proving general LP extended formulation lower bounds. As a corollary, we obtain a self-contained proof of sub-exponential Sherali-Adams LP lower bounds for these problems.
25 Apr 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
High-dimensional estimation via sum-of-squares proofs
(David Steurer)
Estimation is the computational task of approximately recovering a hidden parameter x associated with a distribution D_x given a draw y from the distribution D_x. Numerous interesting questions in statistics, machine learning, and signal processing are captured in this way, for example, sparse linear regression, Gaussian mixture models, topic models, and stochastic block models.
In many cases, there is currently a large gap between the statistical guarantees of computationally efficient algorithms and the guarantees of computationally inefficient methods; it is an open question if this gap is inherent in these cases or if better computationally efficient estimation algorithms exist.
In this talk, I will present a meta-algorithm for estimation problems based on the sum-of-squares method of Shor, Parrilo, and Lasserre. For some problems, e.g., learning mixtures of spherical Gaussians, this meta-algorithm is able to close previous long-standing gaps and achieve nearly optimal statistical guarantees. Furthermore, it is plausible that, for a wide range of estimation problems, the statistical guarantees that this meta-algorithm achieves are best possible among all efficient algorithms.
This talk is based on an ICM proceedings article with Prasad Raghavendra and Tselil Schramm.
11 Apr 2019 at 13:15 in Room 1625, Lindstedtsvägen 3 - floor 6
Distributed system development with ScalaLoci
(Guido Salvaneschi, Assistant Professor, TU Darmstadt, Germany)
Distributed applications are traditionally developed as separate modules, often in different languages, which react to events, like user input, and in turn produce new events for the other modules. Separation into components requires time-consuming integration. Manual implementation of communication forces programmers to deal with low-level details. The combination of the two results in obscure distributed data flows scattered among multiple modules, hindering reasoning about the system as a whole.
The ScalaLoci distributed programming language addresses these issues with a coherent model based on placement types that enables reasoning about distributed data flows, supporting multiple software architectures via dedicated language features and abstracting over low-level communication details and data conversions. As we show, ScalaLoci simplifies developing distributed systems, reduces error-prone communication code and favors early detection of bugs.
28 Mar 2019 at 13:15 in 1625, Lindstedtsvägen 3
Industrial Continuous Integration (CI) environment for 5G Radio at Ericsson
(Han Fu)
A world-class Continuous Integration (CI) machinery using state-of-the-art technology is one of the key enablers for Ericsson to deliver faster and better 5G radio products to our customers. To achieve that, we have built a fully automated CI loop for full 5G Radio functional testing. Deliveries of 5G Radio software can happen on an hourly basis. In this session, the elaborate setup of the Ericsson Radio Software CI loop will be introduced, along with the intricate setup of the radio test tower used for functional testing. Furthermore, looking at future work, we see an inspiring opportunity for Ericsson to leverage and extend the most advanced ongoing research in AI-driven program repair to drastically reduce system development cost.
26 Mar 2019 at 11:15 in 1537, Lindstedtsvägen 3 - floor 5
Machine Learning for Large Scale Code Analysis
(Hugo Mougard)
"Software is eating the world". Yet, applying Machine Learning techniques to code is challenging: to fully take advantage of its structured nature requires clever models, and to fully take advantage of the enormous quantities of code available requires precise engineering. In this talk, we'll see how the Machine Learning team at source{d} leverages a stack that treats code as a first-class citizen in Machine Learning to deliver insight on code. We'll then focus on one of its latest experiments: a model to fix formatting mistakes in code.
11 Mar 2019 at 12:00 in 4523, Lindstedtsvägen 5 - floor 5
Some new applications of Razborov's pseudo-width method
(Kilian Risse)
PRACTICALITIES
Lunch is served at 12:00 noon (register at http://www.csc.kth.se/tcs/seminars/lunch/2019-03-11 at the latest the Saturday before, but preferably earlier). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for 1½-2 hours of more technical discussions.
There will be a more detailed technical discussion after the seminar. This is Kilian's 30% seminar.
ABSTRACT
Resolution is arguably the most well-studied proof system for certifying unsatisfiability of Boolean formulas. The size-width bound in the celebrated paper by Ben-Sasson and Wigderson shows that for resolution proofs linear lower bounds on the width, measured in the number of variables, imply exponential size lower bounds. This method does not extend to formulas where such strong width lower bounds are simply not true --- e.g., for the so-called weak pigeonhole principle formulas claiming that m pigeons can be mapped one-to-one to n holes for m >> n. In a breakthrough result in 2001, Raz proved exponential size lower bounds for such formulas, which Razborov subsequently strengthened and generalized with his so-called pseudo-width method, developed in a series of papers. In this seminar we discuss this framework and show how to generalize it further in order to obtain size lower bounds for other formulas. This seminar is based on joint work with Susanna F. de Rezende, Jakob Nordström and Dmitry Sokolov.
06 Mar 2019 at 09:00 in 4523, Lindstedtsvägen 5, floor 5
Experiences and use of assessment in introductory computer science
(Emma Riese)
As teachers, we can use assessment as a way to guide the students to the most important parts of the course material and to provide feedback on the students’ progress. How the assessment is planned and carried out can, therefore, be said to influence the students’ learning. Understanding students’, TAs’ and teachers’ experiences and use of assessment could give us insights into how to further improve our courses in order to enhance learning among our students. During this seminar, Emma will give a brief background to her research and present two of her conducted studies. Both studies had a focus on experiences, but from different aspects and in different contexts. The first study used a phenomenographic approach to categorize students' experiences of assessment throughout an online course. The second study identified themes in how TAs experience their role and responsibilities in the context of lab sessions in campus-based introductory computer science courses. Emma will also present her ongoing research project about TAs and assessment, as well as her ideas for future studies that build on the presented results towards a thesis.
Emma Riese is a PhD student at KTH, now giving her 50% seminar. Prof. Arnold Pears from the Department of Learning in Engineering Sciences, ITM school, will act as the opponent during the seminar.
04 Mar 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Extending finite-memory determinacy by Boolean combination of winning conditions
(Stephane Le Roux, LSV Cachan)
We study finite-memory (FM) determinacy in games on finite graphs, a central question for applications in controller synthesis, as FM strategies correspond to implementable controllers. We establish general conditions under which FM strategies suffice to play optimally, even in a broad multi-objective setting. We show that our framework encompasses important classes of games from the literature and lets us go further, using a unified approach. While such an approach cannot match ad-hoc proofs with regard to tightness of memory bounds, it has two advantages: first, it gives a widely applicable criterion for FM determinacy; second, it helps to understand the cornerstones of FM determinacy, which are often hidden but common in proofs for specific (combinations of) winning conditions. This is joint work with Arno Pauly and Mickael Randour.
27 Feb 2019 at 13:15 in 4523
Cutting Planes in Pseudo-Boolean SAT Solving
(Stephan Gocht)
Modern conflict-driven clause learning (CDCL) SAT solvers are able to solve problems with millions of variables. To achieve this, the solver must be capable of efficient reasoning. In particular, it has to be able to find a relatively short proof of unsatisfiability if there are no solutions, as checking the exponential number of possible assignments to the variables is not feasible. Surprisingly, modern solvers are still based on resolution, which is a rather weak form of reasoning. An extension to stronger proof systems such as cutting planes, as is done in some pseudo-Boolean (PB) solvers, could in theory result in exponential improvements. However, such gains are rarely realized in practice. A possible explanation for this is that the PB solvers only use restricted versions of cutting planes that actually limit the reasoning power.
In this presentation I will give an introduction to the cutting planes proof system and show on an example how it can be used for efficient reasoning. Then we will discuss how different restrictions made in practice affect the reasoning power of cutting planes. In particular, I will present the latest results on the use of the division rule vs. the saturation rule, which is joint work with Jakob Nordström and Amir Yehudayoff.
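The two rules under comparison can be stated concretely. For a normalized pseudo-Boolean constraint sum(a_i * l_i) >= A over 0/1 literals with nonnegative coefficients, division by c rounds every coefficient and the degree up, while saturation caps each coefficient at the degree. A minimal sketch (the function names and the example constraint are illustrative, not taken from any particular PB solver):

```python
from math import ceil

def divide(coeffs, rhs, c):
    """Division rule: for sum(a_i * l_i) >= A over 0/1 literals,
    dividing everything by c > 0 and rounding up is sound."""
    return {lit: ceil(a / c) for lit, a in coeffs.items()}, ceil(rhs / c)

def saturate(coeffs, rhs):
    """Saturation rule: no coefficient ever needs to exceed the degree A."""
    return {lit: min(a, rhs) for lit, a in coeffs.items()}, rhs

coeffs, rhs = {"x": 3, "y": 4, "z": 2}, 3  # the constraint 3x + 4y + 2z >= 3
print(saturate(coeffs, rhs))   # ({'x': 3, 'y': 3, 'z': 2}, 3)
print(divide(coeffs, rhs, 3))  # ({'x': 1, 'y': 2, 'z': 1}, 1)
```

The seminar's question is which of these rules a solver should build on: both are sound, but restricting a solver to only one of them can change what the underlying proof system can derive efficiently.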
There will be a more detailed technical discussion on how these results can be obtained after the seminar. This is Stephan's 30% seminar.
18 Feb 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Optimal stateless model checking under the release-acquire semantics
(Tuan Phong Ngo)
We present a framework for the efficient application of stateless model checking to concurrent programs running under the Release-Acquire fragment of the C/C++11 memory model. Our approach is based on exploring the possible program orders, which define the order in which instructions of a thread are executed, and read-from relations, which specify how reads obtain their values from writes. This is in contrast to previous approaches, which also explore the possible coherence orders, i.e., orderings between conflicting writes. Since unexpected test results such as program crashes or assertion violations depend only on the read-from relation, we avoid a potentially significant source of redundancy.
Our framework is based on a novel technique for determining whether a particular read-from relation is feasible under the Release-Acquire semantics. We define a stateless model checking algorithm which is provably optimal in the sense that it explores each program order and read-from relation exactly once. This optimality result is strictly stronger than previous analogous optimality results, which also take coherence order into account. We have implemented our framework in the tool Tracer. Experiments show that Tracer can be significantly faster than state-of-the-art tools that can handle the Release-Acquire semantics.
11 Feb 2019 at 12:00 in 4523, Lindstedtsvägen 5 - floor 5
Restricted arithmetic progressions and correlated spaces in the product model.
(Jan Hązła, MIT)
PRACTICALITIES
Lunch is served at 12:00 noon (register at https://tinyurl.com/yacnb46v at the latest the Saturday before, but preferably earlier). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for 1½-2 hours of more technical discussions.
ABSTRACT
An arithmetic progression in F_3^n is a triple of elements (x-d, x, x+d) for some d not equal to 0^n. In a recent breakthrough, the polynomial method has been used to establish that a set in F_3^n that does not contain any progressions must have at most alpha^n elements for some absolute alpha < 3. It turns out that this result is equivalent to a probabilistic version: A set with mu * 3^n elements must contain at least mu^C * 9^n progressions for some absolute finite C.
I will talk about a variant of this problem where we look at restricted progressions (x-d, x, x+d), where d is constrained to lie in {0, 1}^n. In this case it is known that a set that does not contain restricted progressions must have at most o(1) * 3^n elements for a very slowly decaying o(1). The best known lower bound is of the form alpha^n, similar to the unrestricted case. The probabilistic version is even more open: It is not known if there exists any positive c(mu) such that a set with mu * 3^n elements must contain at least c(mu) * 6^n restricted progressions, and the best known lower bound is mu^C * 6^n.
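For tiny n the objects in question can be enumerated directly, which makes the definitions concrete. The sketch below (the set S and dimension n are purely illustrative) counts restricted progressions contained in a set; on the full space F_3^2 every choice of x and nonzero d in {0,1}^2 works, giving 9 * (2^2 - 1) = 27:

```python
from itertools import product

def restricted_progressions(S, n):
    """Count triples (x-d, x, x+d) lying in S, with d in {0,1}^n, d != 0^n,
    arithmetic done coordinate-wise mod 3."""
    Sset = set(S)
    count = 0
    for x in Sset:
        for d in product((0, 1), repeat=n):
            if not any(d):  # skip d = 0^n
                continue
            minus = tuple((xi - di) % 3 for xi, di in zip(x, d))
            plus = tuple((xi + di) % 3 for xi, di in zip(x, d))
            if minus in Sset and plus in Sset:
                count += 1
    return count

n = 2
full = list(product(range(3), repeat=n))  # all of F_3^2
print(restricted_progressions(full, n))  # 27
```

The open probabilistic question above asks how small this count can be for sets of density mu, relative to the 6^n total.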
I will also put this problem in a broader context, where one considers "correlated spaces": more general structures over finite alphabet Omega that are then "tensorized" in the product space Omega^n. I will talk about some open problems and partial results, including joint work with Thomas Holenstein and Elchanan Mossel.
06 Feb 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Rational Synthesis
(Rodica Condurache, Department of Computer Science, “A.I.Cuza” University of Iasi, Romania.)
The synthesis problem aims at automatically designing a program from a given specification. Several applications of this formal problem can be found in the design of interactive systems, i.e., systems interacting with an environment. From a formal point of view, the synthesis problem is traditionally modeled as a zero-sum turn-based game, where the system and the environment are modeled by two players with opposite interests. Hence, a strategy that allows the system to achieve its goal against any behavior of the environment is a winning strategy and is exactly the program to synthesize.
However, in real-life cases, the system may be composed of several components (agents) that do not necessarily have defeating the system to be synthesized as their objective; they play rationally, meaning that they first want to satisfy their own objectives and only then may consider harming the others. Therefore the system to be synthesized only has to fight against rational behaviors (modeled as equilibria). This problem was introduced by Kupferman, Vardi and co-authors for the perfect-information case and LTL objectives, under two settings: cooperative and non-cooperative.
We first study the computational complexity of the rational synthesis problems (in turn-based and concurrent setting) and provide tight results for most of the classical omega-regular objectives, and show how to solve those problems optimally. The idea is to either study the existence of equilibria in the provided game, or, given an instance of the rational synthesis problem, to construct a zero-sum turn-based game that can be adapted to each one of the class of objectives and then solved efficiently.
04 Feb 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Cyber Situational Awareness - Aspects of gaining insights for the cyber mission
(Stefan Varga, KTH)
In today’s world, organizations are heavily dependent on IT systems. Such systems may contain valuable information, but also control physical processes. System owners want to maintain a desired level of functionality, and to preserve the confidentiality and integrity of critical information. Taking the current cyber threat landscape into account, there is a need to secure networks to an adequate level.
Cyber situational awareness, CSA, is about being sufficiently well informed about factors that will influence one’s cyber mission. One example would be to uphold the security stance of an IT system. We hypothesize that the level of correct awareness to some extent corresponds to decision makers' ability to make rational choices that ultimately affect the security stance. Hence, to gain CSA, it is of importance to carefully select the underlying needed information, process it, and present it in an effective way for decision makers.
During the seminar Stefan will present the context of cyber situational awareness as well as the results of two studies. The first study used an exploratory research approach, in the form of a survey among the participants of a large national information assurance exercise. The second study was a case study that sought to examine the usefulness of a text mining tool for solving a realistic intelligence assessment task. This study was conducted in the form of an experiment with an operational unit within the Swedish Armed Forces. Stefan Varga is a PhD student at KTH, now giving his 30% seminar.
29 Jan 2019 at 10:15 in 1537, Lindstedtsvägen 3 - floor 5
Program State Coverage: A Test Coverage Metric Based on Executed Program States (SANER' 19 paper)
(Khashayar Etemadi Someoliayi)
In software testing, different metrics are proposed to predict and compare test suite effectiveness. In this regard, Mutation Score (MS) is one of the most accurate metrics. However, calculating MS requires executing test suites many times, so it is not commonly used in industry. On the other hand, Line Coverage (LC) is a widely used metric which is calculated by executing test suites only once, although it is not as accurate as MS in terms of predicting and comparing test suite effectiveness. In this paper, we propose a novel test coverage metric, called Program State Coverage (PSC), which improves the accuracy of LC. PSC works almost the same as LC and can be calculated by executing test suites only once. However, it further considers the number of distinct program states in which each line is executed. Our experiments on 120 test suites from four packages of Apache Commons Math and Apache Commons Lang show that, compared to LC, PSC is more strongly correlated with normalized MS. As a result, we conclude that PSC is a promising test coverage metric.
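The metric as described — line coverage refined by how many distinct program states each line executes under — can be approximated in Python with a trace function. This is a simplified reconstruction from the abstract, not the authors' implementation; here a "program state" is crudely taken to be a snapshot of the local variables:

```python
import sys

def trace_states(func, *args):
    """Run func under a tracer, recording for each executed line the set of
    distinct local-variable snapshots (a stand-in for 'program state')."""
    states = {}
    def tracer(frame, event, arg):
        if event == "line" and frame.f_code is func.__code__:
            snap = tuple(sorted(frame.f_locals.items()))
            states.setdefault(frame.f_lineno, set()).add(snap)
        return tracer
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(None)
    return states

def clamp(x):
    if x < 0:
        x = 0
    return x

merged = {}
for arg in (5, -3):  # a toy two-test suite
    for line, snaps in trace_states(clamp, arg).items():
        merged.setdefault(line, set()).update(snaps)

lc = len(merged)                            # LC: number of lines hit
psc = sum(len(s) for s in merged.values())  # PSC: distinct (line, state) pairs
print(lc < psc)  # True: same lines, but extra states on the branchy lines
```

Two suites with identical LC can thus get different PSC values, which is what lets PSC track mutation score more closely while still needing only one execution.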
18 Dec 2018 at 13:15 in 4523
KTH:s CD pipeline based on Docker
(Patric Jansson, KTH IT Department)
KTH's IT department development teams handle about 50 REST APIs and web services used by the university and the public. In connection with the IT department's switch to Cloud First 2-3 years ago, we decided to simultaneously create a completely new automated Continuous Delivery pipeline based on Open Source. The main components are Ansible, Python, Docker, Azure and Git (and a bunch of other stuff of course).
The basic principle of the pipeline is that developers should not need to know anything about their target environment. They only run their apps using Docker and push to Git. The CD pipeline should handle the rest. How Jenkins runs integration tests, how metrics and logs are collected for services, all of that should just work. Everything is built according to Infrastructure as Code and developed the way ordinary system development occurs.
11 Dec 2018 at 13:15 in 4523
Mutation Testing Advances
(Mike Papadakis)
Mutation testing realises the idea of using artificial defects to support testing activities. Mutation is typically used as a way to evaluate the adequacy of test suites, to guide the generation of test cases and to support experimentation. Mutation has reached a maturity phase and gradually gains popularity both in academia and in industry. This talk will survey the advances related to the fundamental problems of mutation testing and will set out the challenges and open problems for the future development of the method.
10 Dec 2018 at 13:15 in 4523
AMD Epyc Microprocessor Architecture
(Steve Kommrusch)
Datacenters have rapidly evolved, with software innovation challenging the ability of hardware technology to efficiently support it. The Internet of Things, the Cloud, and an ever increasing mobile workforce have transformed datacenters of all sizes to become service providers, providing intelligible data on demand in a secure manner. Epyc is AMD’s x86 server processor line based on the company’s Zen microarchitecture. Epyc provides new features related to Virtual machines, power and frequency management, and IPC (instructions per clock) improvements. Epyc uses MCM (multi-chip modules) to reduce the cost of a single-package 32-core server solution. This talk will give a high-level overview of the core, cache, and memory subsystem architecture of this 14nm processor.
19 Nov 2018 at 13:15 in Room 4523
ML on Code: Techniques for Learning from Syntactic and Semantic Representations of Programs
(Jordan Henkel)
In this talk I will discuss two techniques for integrating machine learning with program analysis and repair. The first technique, Code Vectors, explores the use of rich semantic information (abstracted symbolic traces) and the challenges that arise from learning over structured artifacts. The second technique, Learning to Rank Code Repairs, focuses on situations where access to semantic information is limited and, instead, considers how to utilize mostly syntactic representations. This is joint work with Shuvendu Lahiri (Microsoft Research), Ben Liblit, and Thomas Reps.
10 Oct 2018 at 11:15 in Room 1537
Concurrent Determinism Using Lattices and the Object Capability Model
(Ellen Arvidsson)
Parallelization is an important part of modern data systems. However, the non-determinism of thread scheduling introduces the difficult problem of considering all different execution orders when constructing an algorithm. Deterministic-by-design concurrent systems are therefore attractive. A recent approach, LVars, uses data that is part of a lattice with a predefined join operation. Updates to shared data are carried out using the join operation, and thus the updates commute. Together with limiting the reads of shared data, this guarantees a deterministic result. The Reactive Async framework follows a similar approach but has several aspects that can cause a non-deterministic result. The goal of this thesis is to explore how we can amend Reactive Async in order to guarantee a deterministic result. First, we explore the subtleties of lattice-based data combined with concurrency. Then we construct a formal model based on a simple object-oriented language. The constructed small-step operational semantics and type system are shown to guarantee a form of determinism. This shows that an LVars-like system can be implemented in an object-oriented setting. Furthermore, the work can act as a basis for future revisions of Reactive Async and similar frameworks.
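The LVars idea described above — shared state that only grows along a lattice via a commutative, associative, idempotent join — can be sketched with set union as the join operation. This is a Python toy mirroring the concept, not the LVars or Reactive Async API; the class name is invented:

```python
import threading

class JoinVar:
    """A shared variable that only grows along a lattice: every update goes
    through a commutative, associative, idempotent join (here: set union)."""
    def __init__(self):
        self._state = frozenset()
        self._lock = threading.Lock()

    def put(self, value):
        with self._lock:
            self._state = self._state | {value}  # join with a singleton

    def read(self):
        return self._state

shared = JoinVar()
threads = [threading.Thread(target=shared.put, args=(i % 4,)) for i in range(32)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(shared.read()))  # [0, 1, 2, 3] under every thread schedule
```

Because joins commute, the final state is independent of the interleaving; the determinism of LVars additionally depends on restricting reads (e.g., threshold reads), which this sketch does not model.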
09 Oct 2018 at 13:15 in Room 4523
Hoare Logic Contracts in Denotational Semantics
(Dilian Gurov, KTH)
We sketch a simple theory of Hoare logic contracts for programs with procedures, presented in denotational semantics. In particular, we give a simple semantic justification of the usual procedure-modular treatment of such programs. The justification is given by means of a proof of soundness of a contract-relative denotational semantics against the standard denotational semantics of procedures in the context of procedure declarations. The suggested formal development can be used as an inspiration for more ambitious contract theories.
02 Oct 2018 at 13:15 in 4523
Behavioral Changes Detection with Test Amplification
(Benjamin Danglot)
With the rise of DevOps and Continuous Integration, pull-request-based development is becoming a popular way for developers to avoid breaking an existing part of the application while it evolves. To do this, developers use several tools and approaches: test selection to detect regressions, the evolution of code coverage, and so on. However, the existing test suite might fail to encode the behavioral changes introduced by the proposed patch. In this paper, we explore an automatic way to obtain test methods that characterize the behavioral changes of a patch, i.e. the behavioral difference between two versions of the same program can be encoded by test methods that pass on one version but fail on the other. This approach is a toolchain based on test suite amplification and is meant to be exploited by Continuous Integration. We study real applications from GitHub to ensure the applicability and the scalability of the approach.
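The central notion — a test method that passes on one version of the program and fails on the other, thereby encoding the behavioral difference — can be made concrete in a toy sketch. The two versions and the characterizing test below are invented for illustration; the actual toolchain amplifies an existing suite to find such tests automatically:

```python
def version_a(x):  # behavior before the patch
    return abs(x)

def version_b(x):  # behavior after the patch: changed for negative inputs
    return x

def characterizing_test(f):
    """A test method encoding the behavioral change:
    it passes on one version and fails on the other."""
    return f(-3) == 3

print(characterizing_test(version_a), characterizing_test(version_b))  # True False
```

In a CI setting, surfacing such a test alongside a pull request lets the reviewer see exactly which behavior the patch changes, rather than only whether the existing suite still passes.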
01 Oct 2018 at 13:15 in 4523
The Essence of Similarity in Redundancy-based Program Repair
(Zimin Chen)
Recently, there have been original attempts to use the concept of similarity in program repair. Given that those works report impressive results in terms of repair effectiveness, it is clear that similarity has an important role in the repair process. However, there is no dedicated work that characterizes and quantifies the role of similarity in redundancy-based program repair. This is where our paper makes a major contribution: we perform a deep and systematic analysis of the role of similarity during the exploration of the repair search space. We show that with similarity analysis, only 0.65% of the search space (on average over 2766 bugs) must be explored before finding the correct patch. It is also capable of ranking the correct ingredient first most of the time (65%).
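A minimal version of the similarity-guided exploration discussed above is to rank candidate repair ingredients (code fragments taken from elsewhere in the program, per the redundancy assumption) by their similarity to the buggy statement. The token-level measure, buggy line, and ingredient pool here are invented for illustration, not the paper's actual setup:

```python
from difflib import SequenceMatcher

def rank_ingredients(buggy_line, ingredients):
    """Rank candidate repair ingredients by token-level similarity
    to the buggy statement, most similar first."""
    sim = lambda s: SequenceMatcher(None, buggy_line.split(), s.split()).ratio()
    return sorted(ingredients, key=sim, reverse=True)

buggy = "if (index < size)"
pool = ["return null;", "if (index <= size)", "for (int i = 0; i < n; i++)"]
print(rank_ingredients(buggy, pool)[0])  # if (index <= size)
```

Ranking the most similar ingredient first is what shrinks the explored portion of the search space: a repair loop tries candidates in this order and stops at the first patch that passes the tests.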
18 Sep 2018 at 11:00 in Room 4523
How can the Eclipse Foundation help you promote your research in open source
(Gaël Blondelle, Eclipse Europe)
Eclipse has been famous for the last 17 years for the Eclipse IDE. But nowadays, the Eclipse Foundation hosts more than 350 projects in very different domains, some related to the Eclipse platform, like the Eclipse Modeling stack, others focused on other domains, like the IoT Working Group. With Java EE migrating to the Eclipse Foundation under the name Jakarta EE, a whole new ecosystem for enterprise Java is joining as well. Since 2013, the Eclipse Foundation Europe has been a partner of several European research projects, with the main mission to help the partners create an open source community to foster dissemination and make the results sustainable. This presentation will quickly introduce the principles of the Eclipse ecosystem, the technology landscape of the Eclipse Foundation, and how the Eclipse Foundation helps universities in Europe to publish their research results in open source.
17 Sep 2018 at 13:45 in 4523
Comparing Risk Identification in Hazard Analysis and Threat Analysis
(Hideaki Nishihara, AIST)
In the context of cyber-physical systems, safety and security have been discussed and dealt with separately in the past, since security was not a critical issue for safety and vice versa. They are similar in some points, and it is natural to try dealing with them in parallel or in a unified manner. This talk considers symmetrical treatment of safety and security, especially in identifying possible harms. We compare the results of a hazard analysis and a threat analysis for a single model of a small IoT system. The comparison shows that the identified harms overlap substantially, which indicates that the two analyses can be unified.
17 Sep 2018 at 13:15 in 4523
Optimal Test Suite Generation for Modified Condition Decision Coverage using SAT solving
(Takashi Kitamura, AIST)
Boolean expressions occur frequently in descriptions of computer systems, but they tend to be complex and error-prone in complex systems. The modified condition decision coverage (MCDC) criterion in system testing is an important testing technique for Boolean expressions, as its usage is mandated by safety standards such as DO-178 [1] (avionics) and ISO 26262 [2] (automotive). In this paper, we develop an algorithm to generate optimal MCDC test suites for Boolean expressions. Our algorithm is based on SAT solving and generates minimal MCDC test suites. Experiments on a real-world avionics system confirm that the technique can construct minimal MCDC test suites within reasonable time, and improves significantly upon prior techniques.
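The MCDC requirement itself is easy to state executably: each condition in a decision needs an "independence pair" of test vectors that differ only in that condition and flip the decision's outcome. The brute-force sketch below finds all such pairs for a small decision; the paper's contribution — selecting a provably minimal suite covering one pair per condition via SAT solving — is not shown here:

```python
from itertools import product

def independence_pairs(decision, n):
    """For each condition i, collect MCDC independence pairs: assignments
    differing only in condition i whose decision outcomes differ."""
    pairs = {i: [] for i in range(n)}
    for v in product((False, True), repeat=n):
        for i in range(n):
            w = v[:i] + (not v[i],) + v[i + 1:]
            # Count each pair once, from its False side.
            if not v[i] and decision(*v) != decision(*w):
                pairs[i].append((v, w))
    return pairs

decision = lambda a, b, c: (a and b) or c  # the decision under test
pairs = independence_pairs(decision, 3)
print([len(pairs[i]) for i in range(3)])  # [1, 1, 3]
```

Picking one pair per condition and de-duplicating shared vectors is a set-cover-style minimization, which is where an exact SAT encoding pays off over greedy heuristics.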
13 Sep 2018 at 13:15 in Room 4523
Security and Quality Assurance for Deep Learning System
(Lei Ma)
Over the past decades, deep learning (DL) systems have achieved tremendous success and gained great popularity in various applications, e.g., robotics, image processing, speech processing, and medical diagnostics. The deep neural network (DNN), a type of deep learning system, is the key driving force behind this recent success. However, security and quality assurance techniques for DL are still at an early stage, and a plethora of studies have shown that state-of-the-art DL systems suffer from various vulnerabilities which can lead to severe consequences when applied to real-world applications. In this talk, we will survey the current state of the art and discuss our ongoing work towards general-purpose security and quality assurance techniques for DL systems.
12 Sep 2018 at 10:00 in 4523, Lindstedtsvägen 5
From Runtime Failures to Patches: Study of Patch Generation in Production
(Thomas Durieux, Inria, http://durieux.me/)
This presentation introduces two new patch generation techniques that aim to remove human intervention from patch generation. The core idea of both techniques is to move patch generation as close as possible to the production environment. The production environment contains the data and executions that can be analyzed to understand the behavior of the application and identify its bugs. The presentation shows how our approaches exploit this information to create patches for client- and server-side applications.
10 Jul 2018 at 13:15 in 4523
Posimodular Function Optimization
(Kaz Makino, Professor at Research Institute for Mathematical Sciences, Kyoto University)
A function f: 2^V -> R on a finite set V is posimodular if f(X)+f(Y) >= f(X-Y)+f(Y-X) for all X,Y \subseteq V. Posimodular functions often arise in combinatorial optimization; undirected cut functions are a typical example. We consider the problem of finding a nonempty subset X minimizing f(X), when the posimodular function f is given by oracle access. We show that posimodular function minimization requires exponential time, contrasting with the polynomial solvability of submodular function minimization, which forms another generalization of cut functions. On the other hand, the problem is fixed-parameter tractable in terms of the size of the image (or range) of f.
In more detail, we show that Omega(2^{0.3219n} T_f) time is necessary and O(2^{0.92n} T_f) sufficient, where T_f denotes the time for one function evaluation. When the image of f is D={0,1,...,d}, O(2^{1.271d} n T_f) time is sufficient and Omega(2^{0.1609d} T_f) necessary. We can also generate all sets minimizing f in time 2^{O(d)} n^2 T_f.
Finally, we also consider the problem of maximizing a given posimodular function, showing that it requires at least 2^{n-1} T_f time in general, while it has time complexity Theta(n^{d-1} T_f) when D={0,1,...,d} is the image of f, for integer d.
This is a joint work with Magnus Halldorsson, Toshimasa Ishii, and Kenjiro Takazawa.
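To make the definition concrete, a small Python check (illustrative only, not part of the talk) verifies posimodularity for the cut function of a 5-vertex graph, the example class mentioned in the abstract:

```python
from itertools import combinations

# Small undirected graph; cut(X) = number of edges with exactly one endpoint in X.
V = frozenset(range(5))
E = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]

def cut(X):
    return sum(1 for (u, v) in E if (u in X) != (v in X))

def subsets(S):
    S = list(S)
    return [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# Verify posimodularity: f(X) + f(Y) >= f(X - Y) + f(Y - X) for all X, Y.
posimodular = all(cut(X) + cut(Y) >= cut(X - Y) + cut(Y - X)
                  for X in subsets(V) for Y in subsets(V))
print(posimodular)  # True
```

This always holds for cut functions since they are symmetric and submodular, which together imply posimodularity.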
09 Jul 2018 at 13:15 in 4523
Counterfactual Mean Embedding: A Kernel Method for Nonparametric Causal Inference
(Krikamol Muandet, Research group leader at the Empirical Inference Department, Max Planck Institute for Intelligent Systems, Germany)
In this talk, I will introduce a novel Hilbert space representation of a counterfactual distribution---called the counterfactual mean embedding (CME)---with applications in nonparametric causal inference. Counterfactual prediction has become a ubiquitous tool in machine learning applications, such as online advertising, recommender systems, and medical diagnosis, whose performance relies on certain interventions. To infer the outcomes of such interventions, we propose to embed the associated counterfactual distribution into a reproducing kernel Hilbert space (RKHS) endowed with a positive definite kernel. Under appropriate assumptions, the CME allows us to perform causal inference over the entire landscape of the counterfactual distribution. The CME can be estimated consistently from observational data without requiring any parametric assumption about the underlying distributions. We also derive a rate of convergence which depends on the smoothness of the conditional mean and the Radon-Nikodym derivative of the underlying marginal distributions. Our framework can deal with not only real-valued outcomes, but potentially also more complex and structured outcomes such as images, sequences, and graphs. Lastly, I will demonstrate how we can apply the proposed estimator to off-policy evaluation tasks in online advertising and recommender systems.
This is a joint work with Motonobu Kanagawa (MPI-IS), Sorawit Saengkyongam (Agoda), and Sanparith Marukatat (NECTEC). The preprint of this work can be found at https://arxiv.org/abs/1805.08845.
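For background, the construction builds on the standard kernel mean embedding of a distribution P into the RKHS H_k (a textbook definition, not specific to this work), together with its empirical plug-in estimator from samples x_1, ..., x_n:

```latex
\mu_P \;=\; \mathbb{E}_{X \sim P}\bigl[k(X,\cdot)\bigr] \in \mathcal{H}_k,
\qquad
\hat{\mu}_P \;=\; \frac{1}{n}\sum_{i=1}^{n} k(x_i,\cdot).
```

The CME applies this embedding device to the counterfactual distribution of interest, estimated from observational rather than interventional samples.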
02 Jul 2018 at 13:15 in 4523
Heat kernels in graphs: A journey from random walks to geometry, and back
(He Sun, senior lecturer, University of Edinburgh)
Heat kernels are one of the most fundamental concepts in physics and mathematics. In physics, the heat kernel is a fundamental solution of the heat equation and connects the Laplacian operator to the rate of heat dissipation. In spectral geometry, many fundamental techniques are based on heat kernels. In finite Markov chain theory, heat kernels correspond to continuous-time random walks and constitute one of the most powerful techniques in estimating the mixing time. In this talk, we will briefly discuss this line of research and its relation to heat kernels in graphs. In particular, we will see how heat kernels are used to design the first nearly-linear time algorithm for finding clusters in real-world graphs. Some interesting open questions will be addressed in the end.
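For reference (standard definitions, under one common normalization): if L is a graph Laplacian with eigenvalues lambda_i and orthonormal eigenvectors f_i, the heat kernel at time t is

```latex
H_t \;=\; e^{-tL} \;=\; \sum_{i=1}^{n} e^{-t\lambda_i}\, f_i f_i^{\mathsf{T}},
\qquad
\frac{\partial H_t}{\partial t} \;=\; -L\, H_t,
```

so H_t solves the heat equation, and p_t = H_t p_0 describes the distribution of the corresponding continuous-time random walk started from p_0.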
21 Jun 2018 at 09:00 in 4523
Models and systems in motion: Verification and visual analytics for system executions
(Cyrille Artho, TCS/EECS/KTH)
Networked systems are difficult to verify. A single component cannot be fully tested in isolation; however, testing the whole system requires orchestrating different types of components and checking their behavior. Because each component is independent, knowledge of the system state is limited to local observations, which cannot be perfectly coordinated.
We first present a scalable approach to model-based testing, which separates the semantics of a distributed system from its concurrent execution. Our work gives an easy-to-understand trace at a high level, and has found a very subtle defect in Apache ZooKeeper, a widely used cloud computing component. However, there is still a gap when trying to go from the high-level test model to the full complexity of the underlying system.
To this end, we want to adapt visual analytics to execution traces that are generated by such systems. Unlike normal visualization, visual analytics combines visualization with filtering, and uses different types of graphs to show different aspects of a system. We show a prototype that embodies work in progress for analyzing traces generated by the Java Pathfinder tool. Early results show that we can indeed get a selective overview of events, and that what you hide is more important than what you show.
19 Jun 2018 at 13:15 in room 1537
Understanding Configuration Constraints
(Thorsten Berger)
Software systems often need to be customized -- commonly through configuration options, set at build or startup time, to tailor systems to specific needs (e.g., hardware, functionality, performance). Such highly configurable systems can easily have thousands of options, together with intricate configuration constraints among them. Declaring options and constraints in higher-level representations, so-called variability models, tackles this complexity. Such models facilitate development, configuration, and validation and verification of large, highly configurable systems. Unfortunately, creating and maintaining variability models is laborious and error-prone, given the complexity of constraints. Automated techniques would be helpful.
In the talk, I will present our work on extracting configuration constraints from highly configurable systems using static analysis. I will explain the semantics of variability models, how their constraints (the main ingredients of variability models) are represented in source code, and how static analyses to extract them can be conceived. I will show the implementation of the static analysis tooling we built, as well as the results of empirical studies trying to understand how many and what constraints we can extract from code. Time permitting, I will also show some recent tooling of other researchers building upon our foundations.
18 Jun 2018 at 13:15 in 4523
Online Bipartite Matching with Amortized O(log^2 N) Replacements
(Aaron Bernstein, TU Berlin)
In the online bipartite matching problem with replacements, all the vertices on one side of the bipartition are given, and the vertices on the other side arrive one by one with all their incident edges. The goal is to maintain a maximum matching while minimizing the number of changes (replacements) to the matching. We show that the greedy algorithm that always takes a shortest augmenting path from the newly inserted vertex (denoted SAP) uses at most amortized O(log^2 n) replacements per insertion, where n is the total number of vertices inserted. This is the first analysis to achieve a polylogarithmic number of replacements for any strategy, almost matching the Ω(log n) lower bound. The previous best strategy achieved amortized O(√n) replacements [Bosek, Leniowski, Sankowski, Zych, FOCS 2014]. For SAP in particular, nothing better than the trivial O(n) bound was known except in special cases.
Joint work with Eva Rotenberg and Jacob Holm at the University of Copenhagen.
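The SAP strategy itself is simple to state; a minimal Python sketch (hypothetical helper, not the paper's code) maintains the matching, finds a shortest augmenting path by BFS on each arrival, and counts how many existing matching edges get replaced:

```python
def online_sap_matching(arrivals):
    """Offline vertices are fixed; each arrival is (online_vertex, offline_neighbors).
    On arrival, augment along a shortest augmenting path and count replacements."""
    adj, match_on, match_off = {}, {}, {}
    replacements = 0
    for v, nbrs in arrivals:
        adj[v] = list(nbrs)
        parent_off = {}           # offline u -> online vertex we reached it from
        seen_on = {v}
        frontier, free_u = [v], None
        while frontier and free_u is None:   # layered BFS => shortest path
            nxt = []
            for w in frontier:
                for u in adj[w]:
                    if u in parent_off:
                        continue
                    parent_off[u] = w
                    if u not in match_off:
                        free_u = u           # free offline vertex reached
                        break
                    w2 = match_off[u]        # follow the matched edge
                    if w2 not in seen_on:
                        seen_on.add(w2)
                        nxt.append(w2)
                if free_u is not None:
                    break
            frontier = nxt
        if free_u is None:
            continue                         # no augmenting path: v stays unmatched
        u = free_u                           # flip the path back towards v
        while True:
            w = parent_off[u]
            displaced = match_on.get(w)      # w's previous partner, if any
            match_on[w], match_off[u] = u, w
            if displaced is None:
                break                        # reached the newly arrived vertex v
            replacements += 1                # one existing edge was replaced
            u = displaced
    return match_on, replacements

matching, repl = online_sap_matching([("x", ["a", "b"]), ("y", ["a"])])
print(matching, repl)
```

In this tiny run, y's arrival displaces x from offline vertex a; x is rerouted to b, so exactly one replacement is counted. The paper's result bounds the total number of such flips by O(log^2 n) per insertion, amortized.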
12 Jun 2018 at 12:00 in 4523
Regression Proving in Large-Scale Verification Projects: Theory and Practice
(Karl Palmskog, The University of Texas at Austin)
Order food via http://www.csc.kth.se/tcs/seminars/lunch/2018-06-12
Many large-scale verification projects rely on proof assistants, such as Coq, to construct and check formal proofs. However, such proofs must be checked after every change to a project to ensure that properties still hold. This process of regression proving can require substantial machine time, which is detrimental to productivity and trust in evolving projects.
In this talk, I will present some novel techniques to speed up regression proving in large-scale verification projects. Our techniques are based on tracking dependencies between files, definitions, and proofs, and re-proving, in parallel, only those files or proofs affected by a change. Our work is inspired by previous work on analogous techniques for Java-like programming languages; in particular, we build on an analogy between _proofs_ and _tests_.
As I will describe, we have implemented our techniques in a tool, piCoq, which supports Coq projects. Using piCoq, we measured the speedups compared to proof checking from scratch and when using file-based selection (e.g., using Makefiles) for the revision histories of several large-scale open source Coq projects. The speedups range from 2x all the way up to 28x, depending on the project and degree of parallelism. Our results indicate that our techniques and tool can substantially increase the productivity of proof engineers, especially when they perform regression proving using continuous integration services such as Travis CI on GitHub.
Based on work published in ASE '17, and work to appear in ICSE-Demo '18, and ISSTA '18.
04 Jun 2018 at 14:30 in 4523
Model Checking: the Interval Way
(Angelo Montanari, University of Udine, Italy)
Model checking with interval temporal logics is emerging as a viable alternative to model checking with standard point-based temporal logics, such as LTL, CTL, CTL*, and the like. The behavior of the system is modelled by means of (finite) Kripke structures, as usual. However, while temporal logics that are interpreted "point-wise" describe how the system evolves state-by-state, and predicate properties of system states, those that are interpreted "interval-wise" express properties of computation stretches, spanning a sequence of states. A proposition letter is assumed to hold over a computation stretch (interval) if and only if it holds over each component state (homogeneity assumption).
The most well-known interval temporal logic is Halpern and Shoham's modal logic of time intervals HS, which features one modality for each possible ordering relation between a pair of intervals, apart from equality. In the seminar, we provide an overview of the main results on model checking with HS and its fragments under the homogeneity assumption. In particular, we show that the problem turns out to be non-elementarily decidable and EXPSPACE-hard for full HS, but is often computationally much better behaved for its fragments. We conclude by briefly comparing the expressiveness of HS in model checking with that of LTL, CTL, and CTL*, and by outlining a generalization of the proposed model checking framework that allows one to use regular expressions to define the behavior of proposition letters over intervals in terms of the component states.
28 May 2018 at 13:15 in 4523
Correctness Attraction: A Study of Stability of Software Behavior Under Runtime Perturbation
(Martin Monperrus, KTH)
Can the execution of software be perturbed without breaking the correctness of the output? In this paper, we devise a novel protocol to answer this rarely investigated question. In an experimental study, we observe that many perturbations do not break correctness in ten subject programs. We call this phenomenon "correctness attraction". The uniqueness of this protocol is that it considers a systematic exploration of the perturbation space as well as perfect oracles to determine the correctness of the output. To this extent, our findings on the stability of software under execution perturbations have a level of validity that has never been reported before in the scarce related work. A qualitative manual analysis enables us to set up the first taxonomy of the reasons behind correctness attraction.
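The protocol can be sketched in a few lines of Python (an illustrative toy, not the paper's tooling): a single perturbation point, here the left operand of bubble sort's comparison, is perturbed by +1 at exactly one dynamic occurrence per run, and a perfect oracle checks each output:

```python
class Perturber:
    """Perturbs exactly one dynamic occurrence of a perturbation point."""
    def __init__(self, target=None):
        self.count = 0
        self.target = target

    def p(self, value):
        self.count += 1
        if self.count - 1 == self.target:
            return value + 1           # the +1 perturbation
        return value

def bubble_sort(xs, pt):
    xs = list(xs)
    n = len(xs)
    for _ in range(n - 1):
        for j in range(n - 1):
            # perturbation point: the left operand of the comparison
            if pt.p(xs[j]) > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def explore(program, data, oracle):
    ref = Perturber()
    program(data, ref)                 # reference run counts the occurrences
    outcomes = {"correct": 0, "incorrect": 0, "crash": 0}
    for k in range(ref.count):         # one run per perturbed occurrence
        try:
            out = program(data, Perturber(target=k))
            outcomes["correct" if oracle(out) else "incorrect"] += 1
        except Exception:
            outcomes["crash"] += 1
    return outcomes

data = [3, 1, 2, 2]
print(explore(bubble_sort, data, lambda out: out == sorted(data)))
# every perturbed run still yields a sorted list: correctness attraction
```

Here the +1 perturbation only ever turns a strict comparison into a non-strict one, so every run still sorts correctly; perturbing other points (e.g., loop bounds) would also produce incorrect or crashing outcomes.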
15 May 2018 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Spoon - Source Code Analysis and Transformation for Java
(Martin Monperrus, KTH)
Spoon is an open-source library to transform and analyze Java source code. Spoon provides a complete and fine-grained metamodel where any program constructs (classes, methods, fields, statements, expressions...) can be accessed both for reading and modification. Spoon can be used for checking coding and architectural rules, instrumentation before dynamic analysis, code generation, software metrics, transpilation and more. This talk will present the key concepts and usages of the Spoon library.
14 May 2018 at 13:15 in 4523
Excitement, frustration and hope in today’s machine learning area
(Saikat Chatterjee)
All of us are probably aware of the term 'deep learning'. Deep learning, mainly in the form of deep neural network architectures, provides unprecedented performance in classification and prediction. The excitement is evident in all conferences and workshops. However, it comes with serious challenges in understanding the material, and then eventual frustration. We find little use of mathematics and of knowledge from the vast area of signals and systems. In this seminar, I will discuss some issues in neural networks: for example, why a neural network has a standard architecture, regularization and overfitting issues, why the convolutional neural net became so famous, certain things that are very difficult to understand (for example, dropout), and a very simple network with very good performance that almost anybody can implement within 10 minutes (called the extreme learning machine), yet still frustration. Then we will look for some hope. I will show some examples where we can use knowledge from signals and systems to develop new architectures and systems.
09 May 2018 at 14:15 in 4523, Lindstedtsvägen 5 - floor 5
Language support for geo-replicated systems
(Xin Zhao, KTH)
Geo-replicated systems play a crucial role in modern software infrastructure, but they are notoriously difficult to implement correctly and efficiently. Highly available distributed systems must weaken the consistency of distributed data due to fundamental impossibility results. However, programming with weakly consistent data is difficult, even for experts. What's more, many distributed applications are developed by composing existing systems that provide various consistency models, making it even more difficult to reason about the consistency of distributed data. This talk aims to address the difficulty of providing consistency guarantees for distributed systems and the gap between application development and system architecture. First, I will review existing approaches for distributed consistency protocols and languages. Then, I will introduce my on-going work on a novel protocol for observable atomic consistency, including its definition, a system implementing the protocol, as well as initial experimental results. Finally, I will give an overview of my plans for future work.
This is Xin's 30% seminar.
09 May 2018 at 11:00 in OV6, floor 4, Osquldas väg 6
[KTH Security Seminar] Browser fingerprinting: past, present and possible future
(Pierre Laperdrix, Stony Brook University)
This is a KTH Security Seminar (https://www.kth.se/securityseminar)
Browser fingerprinting has grown a lot since its debut in 2010. By collecting specific information in the browser, one can learn a lot about a device and its configuration. It has been shown in previous studies that it can even be used to track users online, bypassing current tracking methods like cookies.
In this presentation, we will look at how this technique works and present an overview of the research performed in the domain. We will then see how this technique is currently used online before looking at its possible future.
08 May 2018 at 13:15 in 1440 (library), Lindstedtsvägen 5 - floor 4
Concurrent games and semi-random determinacy
(Stéphane Le Roux)
Consider concurrent, infinite duration, two-player win/lose games played on graphs. If the winning condition satisfies some simple requirement, the existence of Player 1 winning (finite-memory) strategies is equivalent to the existence of winning (finite-memory) strategies in finitely many derived one-player games. Several classical winning conditions satisfy this simple requirement.
Under an additional requirement on the winning condition, the non-existence of Player 1 winning strategies from all vertices is equivalent to the existence of Player 2 stochastic strategies winning almost surely from all vertices. Only few classical winning conditions satisfy this additional requirement, but a fairness variant of omega-regular languages does.
07 May 2018 at 14:30 in 4523
Reaching the Unreached and their Empowerment through ICT and other Scalable Methods
(Kannan Moudgalya, IIT Bombay)
The speaker will present a summary of three successful projects carried out from IIT Bombay in the general area of ICT training.
Spoken Tutorial is a method to provide ICT training through 10-minute videos created by screencasting. These tutorials are created for self-learning; the spoken part is dubbed into 22 Indian languages, and the tutorials can be used offline, which makes them accessible to students who do not have access to the Internet. We work only with open source software. Using this method, 4 million students have been trained in the past four years, in about 5,000 colleges in India, affiliated with about 100 universities, through 36,000 lab courses. It has received several awards, including a Google MOOCs Focused Research Award. URL for this work: http://spoken-tutorial.org.
FOSSEE stands for Free and Open Source Software for Education. While Spoken Tutorial looks at training the bottom of the pyramid, FOSSEE looks at training at a higher level, and also at content generation through crowdsourcing. This project works on niche software tools in various engineering domains, some examples being Scilab, Python, eSim (electronic circuit simulation), DWSIM (chemical process flowsheeting), OpenFOAM (computational fluid dynamics), OpenModelica (general purpose modelling), and Osdag (steel structure design). Through Textbook Companions, students from across the country have generated one of the largest sources of solved examples using Scilab and Python. URL for this work: https://fossee.in.
The talk will also introduce an affordable laptop that the speaker has been promoting. At Rs. 10,000, it is an excellent laptop for most engineering students. A demo of this laptop will be given during the talk. More details of the laptop can be seen at this link: https://www.linkedin.com/pulse/affordable-laptop-kannan-moudgalya/.
07 May 2018 at 13:15 in Room 4523, Lindstedtsvägen 5
The Verificatum Project 10-year Anniversary
(Douglas Wikström, KTH)
Register for cake before May 2: https://bit.ly/2qDqoUp
This is a KTH Security Seminar (https://www.kth.se/securityseminar)
A mix-net is a way for a group of servers to decrypt a set of ciphertexts in such a way that it is infeasible to find any correspondence between the individual inputs and output plaintexts. This idea originates in the seminal work of Chaum in 1982, and the main applications are electronic elections and related notions.
The Verificatum Mix-Net (VMN) was the first fully distributed implementation of a provably secure mix-net. The first version was completed in 2008. Today it is licensed under the AGPL.
Some key cryptographic protocols were developed at KTH, but the main contribution is what can be learned from the project as a whole and its application in real electronic voting systems. VMN is completely faithful to cryptographic theory, yet adopts best practices from real-world software engineering.
This was one of the first research projects with this focus (2007) and paved the way to what today is often referred to as real-world crypto. It is the only implementation of a mix-net from the research community that has been considered fit for use by national election authorities.
We first give an overview of a class of electronic voting systems and explain the role of mix-nets in the greater whole. Then we describe the state of the project today. We also give a historical account with a focus on mistakes and what we can learn from them.
Then we take a pause. The historical account leads us in a natural way into difficult tradeoffs and the uncharted territory where theory and practice are supposed to meet. This may be a Q&A session or a more mutual discussion, depending on what the audience prefers.
You are all very welcome!
Project: https://www.verificatum.org Talk at RWC 2018: https://bit.ly/2HswjFH
04 May 2018 at 11:00 in 4523, Lindstedtsvägen 5 - floor 5
On symbolic heaps modulo permission theories
(Stéphane Demri)
In program verification, proving properties related to pointer manipulation and memory management is usually quite cumbersome due to possible aliasing. Separation logic is a framework for proving heap-manipulating programs that aims at simplifying proofs of such programs. It is also a program logic for concurrent heap-manipulating programs in which the ``ownership'' of a memory region is formalized by separation. Separation logic with permissions is an extension of separation logic that can express that the ownership of a given heap region is shared with other threads. In this work, we study the entailment problem for separation logic with symbolic heaps admitting list predicates and permissions. We design new decision procedures that are parameterised by the permission theories. This permits the use of solvers dealing with the permission theory at hand, independently of the shape analysis. We prove that the entailment problem is coNP-complete when the entailment problem for permission formulae is in coNP, assuming the write permission can be split into as many read permissions as desired. This is joint work with Etienne Lozes (I3S, Nice) and Denis Lugiez (LIS, Marseille).
25 Apr 2018 at 13:15 in room 4523, Lindstedtsvägen 5 - floor 5
ASTOR: A Program Repair Library for Java
(Matias Martinez, University of Valenciennes, http://www.martinezmatias.com/)
In recent years, the software engineering research community has proposed approaches for automatically repairing software bugs. Unfortunately, many software artifacts born from this research are not available for repairing Java programs, and reimplementing those repair approaches from scratch is costly. In this talk we will present Astor, a publicly available program repair library that includes six Java implementations of repair approaches, including one of the pioneers, GenProg. Astor provides 12 extension points that form the explicit decision space of program repair. Researchers have already defined new approaches on top of Astor, proposed improvements to the built-in approaches via the extension points, and executed approach implementations from Astor in their evaluations. The repair approaches implemented on top of Astor are capable of repairing, in total, 98 real bugs from 5 large Java programs.
24 Apr 2018 at 13:15 in 4523, Lindstedtsvägen 5
Tensor rank is hard to approximate
(Joseph Swernofsky, KTH)
My group researches approximation algorithms and hardness of approximation. Tensor rank is an NP-hard generalization of matrix rank. In this talk I will show that it is also NP-hard to approximate the rank of a tensor to within some constant factor C. I do this by a reduction from bounded-occurrence MAX-2SAT.
This is original research from my first year and a half at KTH and counts as my 30% seminar.
16 Apr 2018 at 10:55 in Lindstedtsvägen 3, room 1440 (library)
Test suite minimization and tree-structured test models in combinatorial testing
(Takashi Kitamura, Senior Researcher at AIST, Japan)
In this talk, we present two features of our combinatorial testing tool, Calot. The first feature concerns the minimization of combinatorial t-way test suites. The size of a test suite is an important practical concern, as the number of test cases directly or indirectly affects the cost of testing. We propose a technique to minimize combinatorial t-way test suites using SAT solving, in particular incremental SAT solving, and evaluate its performance experimentally against existing techniques. The second feature is Calot's tree-structured modeling technique. Tree-structured modeling languages, such as classification trees, are widely used in industry to describe test models in combinatorial testing. However, in our case studies the current tools that support such tree-structured modeling do not perform well in test generation, both in computation time and in the size of the generated test suites. Moreover, the actual algorithms for such tree-structured models are not clarified in existing studies. We explain our algorithm, called the flattening algorithm, and apply a tool based on it to an actual ticket gate system used by several large transportation companies in Japan. Experimental results show that our technique outperforms existing techniques.
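As a toy illustration of t-way coverage for t = 2 (hypothetical parameters, not from Calot), the following Python snippet checks that a 6-test suite covers every value pair of a 3 x 2 x 2 parameter model, where exhaustive testing would need 12 tests:

```python
from itertools import combinations, product

params = {"os": ["linux", "mac", "win"],
          "browser": ["ff", "chrome"],
          "net": ["wifi", "eth"]}

def covers_pairwise(suite, params):
    """True iff every value pair of every two parameters appears in some test."""
    for p, q in combinations(params, 2):
        for vp, vq in product(params[p], params[q]):
            if not any(t[p] == vp and t[q] == vq for t in suite):
                return False
    return True

# A hand-made candidate pairwise (2-way) suite with 6 tests.
suite = [
    {"os": "linux", "browser": "ff",     "net": "wifi"},
    {"os": "linux", "browser": "chrome", "net": "eth"},
    {"os": "mac",   "browser": "ff",     "net": "eth"},
    {"os": "mac",   "browser": "chrome", "net": "wifi"},
    {"os": "win",   "browser": "ff",     "net": "eth"},
    {"os": "win",   "browser": "chrome", "net": "wifi"},
]
print(covers_pairwise(suite, params))  # True
```

A minimizer such as the one in the talk would instead search for the smallest covering suite (here 6 is optimal, since the two largest parameter domains already force 3 * 2 = 6 tests).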
16 Apr 2018 at 10:00 in Lindstedtsvägen 3, room 1440 (library)
Using a satisfiability solver to analyze cascading failures in electric power systems
(Tatsuhiro Tsuchiya, Professor at Osaka University, Japan)
A Boolean satisfiability solver is a versatile tool for combinatorial problems. Its applications include model checking, test case generation, digital tomography, etc. In this talk, we propose a new application: an analysis of cascading failures in electric power systems. We consider a power system consisting of two interdependent networks, namely, a power line network and a SCADA network. We assume a failure model that represents failure propagation as a result of the interaction between these two networks. Given a condition on the initial attack, we first construct a Boolean formula that symbolically represents all possible failure propagation scenarios; then we identify the worst-case scenario by solving the satisfiability of the formula.
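Before any SAT encoding, the interdependent failure model described above can be sketched as a simple fixed-point computation (a toy model with hypothetical node names; the talk's symbolic analysis instead encodes all such propagation scenarios in one Boolean formula):

```python
def cascade(power_deps, scada_deps, initial_failures):
    """Toy fixed-point simulation of failure propagation between two
    interdependent networks: a node fails once any node it depends on
    (in the other network) has failed."""
    failed = set(initial_failures)
    changed = True
    while changed:
        changed = False
        for node, deps in list(power_deps.items()) + list(scada_deps.items()):
            if node not in failed and deps & failed:
                failed.add(node)
                changed = True
    return failed

# power node -> SCADA nodes it needs; SCADA node -> power nodes it needs
power = {"p1": {"s1"}, "p2": {"s1"}, "p3": {"s2"}}
scada = {"s1": {"p3"}, "s2": set()}
print(sorted(cascade(power, scada, {"p3"})))  # ['p1', 'p2', 'p3', 's1']
```

A single initial failure at p3 cascades through the SCADA node s1 to p1 and p2; the SAT-based analysis would search over all initial attacks satisfying a given condition for the worst such cascade.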
13 Apr 2018 at 11:00 in 4523, Lindstedtsvägen 5
Code Defenders: Crowdsourcing Effective Tests and Subtle Mutants with a Mutation Testing Game
(José Miguel Rojas, University of Leicester)
Writing good software tests is difficult and not every developer’s favorite occupation. Mutation testing aims to help by seeding artificial faults (mutants) that good tests should identify, and test generation tools help by providing automatically generated tests. However, mutation tools tend to produce huge numbers of mutants, many of which are trivial, redundant, or semantically equivalent to the original program; automated test generation tools tend to produce tests that achieve good code coverage, but are otherwise weak and have no clear purpose. In this paper, we present an approach based on gamification and crowdsourcing to produce better software tests and mutants: The Code Defenders web-based game lets teams of players compete over a program, where attackers try to create subtle mutants, which the defenders try to counter by writing strong tests. Experiments in controlled and crowdsourced scenarios reveal that writing tests as part of the game is more enjoyable, and that playing Code Defenders results in stronger test suites and mutants than those produced by automated tools.
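A tiny Python example (illustrative, not from Code Defenders) shows the attacker/defender dynamic: a subtle boundary mutant survives a weak test, but a defender's test probing the boundary kills it:

```python
def price_with_discount(total):
    """Original: 10% discount for orders of 100 or more."""
    if total >= 100:
        return total * 0.9
    return total

def price_mutant(total):
    """Attacker's subtle mutant: boundary operator changed from >= to >."""
    if total > 100:
        return total * 0.9
    return total

# A weak test: both versions agree, so the mutant survives it.
assert price_with_discount(200) == price_mutant(200) == 180.0

# A strong test exercises the boundary (total == 100) and kills the mutant.
assert price_with_discount(100) == 90.0   # original applies the discount
assert price_mutant(100) == 100           # mutant does not: outputs differ
print("mutant killed at the boundary")
```

This is exactly the kind of mutant automated tools often miss as "trivial or redundant" detection targets, and the kind of test that coverage-driven generators fail to produce.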
26 Mar 2018 at 13:15 in Room 4523, Lindstedtsvägen 5 - floor 5
An ownership-based order reduction theorem for concurrent systems.
(Jonas Oberhauser, Saarland University)
The vast number of preemptive interleavings in concurrent code makes reasoning about such code hard, if not impossible. Instead, one would like to have cooperative schedulers, where a thread is only preempted when it reaches designated interleaving points, thereby considerably reducing the number of interleavings to be considered. Sadly, multi-core processors do not behave this way and can preempt the execution of a thread at the instruction level; these additional interleavings often introduce new behaviours that may break the program.
We show a set of software conditions under which a program that has been analyzed under the assumption of a cooperative scheduler will not show new behaviours when executed by a preemptive scheduler. The conditions only have to be satisfied when the program is run by a cooperative scheduler, and so preemptive scheduling becomes completely invisible.
Our conditions also work in the presence of interrupt handler threads for inter-processor interrupts, which in a preemptive scheduler can preempt the interrupted thread after any instruction but on a cooperative scheduler only at the designated interleaving points. Such interrupt handler threads are particularly exciting because unlike other threads, they access and modify local state of the preempted thread, and some of these modifications remain visible when the preempted thread is resumed. One of our key results is a new ownership formalism, which is both more permissive than existing ownership policies for order reduction, and allows us to deal effectively with interrupt handler threads.
27 Feb 2018 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
System-level non-interference of constant-time cryptography
(Juan Diego Campo)
Cache-based attacks are a class of side-channel attacks that are particularly effective in virtualized or cloud-based environments, where they have been used to recover secret keys from cryptographic implementations. One common approach to thwarting cache-based attacks is to use constant-time implementations, which do not branch on secrets and do not perform memory accesses that depend on secrets. However, there was no rigorous proof that constant-time implementations are protected against concurrent cache attacks on virtualization platforms; moreover, many prominent implementations are not constant-time.
We will show a new information-flow analysis that checks if an x86 application executes in constant time, and prove that constant-time programs do not leak confidential information through the cache to other operating systems executing concurrently on virtualization platforms. The soundness proofs are based on new theorems of independent interest, including isolation theorems for virtualization platforms, and proofs that constant-time implementations are non-interfering with respect to a strict information flow policy which forbids control flow and memory accesses from depending on secrets. We formalize our results using the Coq proof assistant and we demonstrate the effectiveness of our analyses on cryptographic implementations, including PolarSSL AES, DES and RC4, SHA256 and Salsa20.
22 Feb 2018 at 15:00 in 4523, Lindstedtsvägen 5 - floor 5
Optimal Quasi-Gray Codes: Efficient Generation using Simplicity of Even Permutations
(Diptarka Chakraborty)
A quasi-Gray code of dimension $n$ and length $\ell$ over an alphabet $\Sigma$ is a sequence of distinct words $w_1,w_2,\dots,w_\ell$ from $\Sigma^n$ such that any two consecutive words differ in at most $c$ coordinates, for some fixed constant $c>0$. In this paper we are interested in the read and write complexity of quasi-Gray codes in the cell-probe model, where we measure the number of symbols read and written in order to transform any word $w_i$ into its successor $w_{i+1}$.
We present a construction of quasi-Gray codes of dimension $n$ and length $3^n$ over the ternary alphabet $\{0,1,2\}$ with worst-case read complexity $O(\log n)$ and write complexity $2$. This generalizes to arbitrary odd-size alphabets. For the binary alphabet, we present quasi-Gray codes of dimension $n$ and length at least $2^n - 20n$ with worst-case read complexity $6+\log n$ and write complexity $2$.
Our results significantly improve on previously known constructions and for the odd-size alphabets we break the $\Omega(n)$ worst-case barrier for space-optimal (non-redundant) quasi-Gray codes with constant number of writes. We obtain our results via a novel application of algebraic tools together with the principles of catalytic computation.
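For orientation, the simplest quasi-Gray code is the classic binary reflected Gray code: it is space-optimal (length $2^n$) with write complexity 1, but computing the successor requires knowing the parity of the whole word, so it reads all $n$ bits in the worst case — exactly the barrier the constructions above circumvent. A minimal sketch of this classic construction (standard material, not the paper's codes):

```python
def gray_code(n):
    """Binary reflected Gray code on {0,1}^n: a sequence of all 2^n
    words in which consecutive words differ in exactly one coordinate
    (i.e. a quasi-Gray code with c = 1 and write complexity 1)."""
    if n == 0:
        return [[]]
    prev = gray_code(n - 1)
    # Prefix the previous code with 0, then its reversal with 1.
    return [[0] + w for w in prev] + [[1] + w for w in reversed(prev)]

# Example: dimension 3 lists all 8 words, each step flipping one bit.
for word in gray_code(3):
    print("".join(map(str, word)))
```

The reflection trick is what guarantees the single-bit steps: the last word of the 0-prefixed half and the first word of the 1-prefixed half differ only in the leading symbol.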
21 Feb 2018 at 09:15 in Room 4523, Lindstedtsvägen 5
Students’ Experiences of Assessment and Pair Programming
(Emma Riese, KTH)
As teachers, we can use assessment and make certain methods mandatory in order to guide the students to the most important parts of the course material and to give them tools for how to work with the material. How we use assessment and which methods we introduce could, therefore, be said to have an impact on the students' learning. Understanding the students' experiences of the assessment and methods introduced could give us insights into how to further improve our courses in order to enhance learning among our students. During this seminar, Emma will present the results from two of her studies. Both studies had a focus on students' experiences, but from different aspects and in different contexts. The first study uses a phenomenographic approach to categorize students' experiences of assessment throughout an online course. The second study maps out differences in students' experiences of a voluntary and a mandatory introduction, respectively, to the method of pair programming in two campus courses.
14 Feb 2018 at 14:00 in Room 1537, Lindstedtsvägen 3
Trustworthy execution and temporal isolation
(Andreas Lindner)
Software is becoming a pervasive part of modern cyber-physical systems. Examples are control systems in mobile phones, cars, avionics, space equipment and even nuclear reactors, which all require high assurance. The cost-effective implementation of such systems requires extensive sharing of hardware platforms, leading to mixed-criticality systems. These allow for reusing off-the-shelf software and mixing software components of different provenance. Our approach to providing high assurance for such systems is the application of formal methods to obtain machine-checkable proofs of sound separation of criticality domains on real hardware. This task is generally laborious, and we therefore develop automation strategies for reasoning about the complex execution models of existing hardware. As a first step towards this goal, we designed and implemented a memory isolation platform based on existing memory protection hardware for microcontrollers. The platform's minimal design and support for real-time execution make formal verification feasible. As a second step, we designed and implemented a proof-producing transpiler that assists the binary analysis of machine code in a theorem prover. This tool eases the manual steps involved in reasoning about code running on hardware. Unlike other approaches such as verifying compilers, we can use standard build tools and reason about code that changes hardware configurations; the latter is required for system software that changes the memory configuration. Currently, we are extending the transpiler with an analysis algorithm and applying it to an AES encryption binary to establish functional correctness. In this talk, I will first discuss our results so far and then introduce my plans for periodically scheduled systems and discuss them in the context of temporal isolation. The expected outcome is a temporal isolation platform enabling mixed-criticality applications.
05 Feb 2018 at 13:15 in room 4523
Securing Applications by Information Flow Tracking
(Musard Balliu, KTH)
Society increasingly relies on applications that handle sensitive information: private individuals and businesses all use software applications that manipulate confidential or untrusted data. My research makes it easier to build applications that handle sensitive data securely and uncover vulnerabilities in existing applications. To achieve this, I use programming language concepts and methods to develop tools and techniques that allow programmers to express application-specific security policies and enforce those policies efficiently, ultimately providing security guarantees that rely on solid foundations.
In this talk I will give an overview of my research interests with a special focus on web application security. I will present some recent work on how to achieve end-to-end web application security by tracking information flows through the client, the server, and the database. Afterwards, I will discuss a combination of formal and empirical methods to find vulnerabilities in existing web applications. Finally, I will conclude with a few highlights of my future research directions in the areas of security and privacy. The talk is self-contained and, for the most part, no prior knowledge is required.
26 Jan 2018 at 11:30 in 4423
Residual Investigation: Predictive and Precise Bug Detection
(Christoph Reichenbach)
There are many static program analyses that can identify potential program faults. However, such analyses are notorious for producing false positives, which substantially reduces their utility to end users. To address this problem, we introduce `residual investigation', a technique for refining static analysis results through predictive dynamic analysis.
The purpose of residual investigation is to find evidence that indicates whether the statically predicted program fault is likely to be realisable and relevant. The key feature of a residual investigation is that it has to be much more precise (i.e., with fewer false warnings) than the static analysis alone, yet significantly more general (i.e., reporting more errors) than the dynamic tests in the program's test suite that are pertinent to the statically reported error. That is, a good residual investigation encodes dynamic conditions that, when considered in conjunction with the static error report, increase confidence in the existence or severity of an error without needing to directly observe a fault resulting from the error.
We have enhanced the static analysers FindBugs and JChord with residual investigations, appropriately tuned to the analysers' static error patterns. Examining a number of open source systems, we find that residual investigation is effective at reducing the false positive rate.
25 Jan 2018 at 13:15 in 4423
Sum of squares lower bounds from symmetry and a good story
(Aaron Potechin)
The sum of squares hierarchy is a hierarchy of semidefinite programs which has the three advantages of being broadly applicable (it can be applied whenever the problem can be phrased in terms of polynomial equations over R), powerful (it captures the best known algorithms for several problems including max cut, sparsest cut, and unique games), and in some sense, simple (all it is really using is the fact that squares are non-negative over R). The sum of squares hierarchy can also be viewed as the Positivstellensatz proof system.
However, the sum of squares does have a few known weaknesses. One such weakness, as shown by Grigoriev's sum of squares lower bound on the knapsack problem, is that the sum of squares hierarchy is poor at capturing integrality arguments. In this talk, we further explore this weakness for symmetric problems. In particular, we discuss the following Turán-type problem, whose true solution was proved by Razborov: minimize the number of triangles a graph of a given density contains. We then describe a pretend solution (called pseudo-expectation values) for this problem and describe features of our pretend solution which allow us to show that it fools the sum of squares hierarchy.
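For readers new to the topic, the kind of certificate the hierarchy searches for can be illustrated by a toy example of ours (not from the talk): to prove that $p(x) = x^2 - 4x + 5 \ge 1$ for all real $x$, it suffices to exhibit the identity

```latex
p(x) - 1 \;=\; x^2 - 4x + 4 \;=\; (x - 2)^2 \;\ge\; 0,
```

since squares are non-negative over $\mathbb{R}$. The sum of squares hierarchy automates the search for such certificates of bounded degree via semidefinite programming.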
Note: Prior knowledge of the sum of squares hierarchy is helpful but will not be assumed.
19 Jan 2018 at 14:15 in 1440 (library), Lindstedtsvägen 5 - floor 4
Verifying Concurrent Programs with VerCors
(Wytse Oortwijn, University of Twente)
In this talk, I will give an overview of the VerCors tool set and of research that is related to the VerCors project. VerCors is a tool set for practical mechanised verification of parallel and concurrent software under different concurrency models, notably heterogeneous and homogeneous threading. VerCors has verification support for multiple high-level languages with concurrency features, including Java, OpenCL (i.e. GPU kernels), and OpenMP for C (i.e. compiler directives for automated parallelisation). During the talk I will explain the verification capabilities of VerCors and demonstrate this on several example verification problems.
19 Jan 2018 at 11:30 in 4523, Lindstedtsvägen 5
Scalable Fine-Grained Dynamic Binary Analysis -- Methods and Applications
(Ulf Kargén)
Being able to analyze binary (i.e. machine code) programs is important, for example when auditing closed-source applications for security flaws, or for defending against malicious code. Static analysis is challenging to apply on the binary level due to e.g. the lack of type information and structured control-flow. Dynamic analysis, on the other hand, can provide fine-grained insight into the behavior of a binary. However, scalability often becomes an issue when performing fine-grained dynamic analysis due to the large data volumes produced, even for modestly sized programs and inputs. In this talk, I will briefly describe some of our work on scalable dynamic analysis and its applications, with a focus on security. I will devote some extra time to our work on aligning instruction traces, which was recently presented at ASE’17. I will also discuss potential approaches to generalize our trace alignment method to align e.g. Java bytecode traces.
18 Jan 2018 at 13:15 in Room 4523
Expected time complexity of QuickSelect (Hoare’s selection)
(Bengt Aspvall, Blekinge Tekniska högskola)
Analyzing the expected time complexity of QuickSelect (Hoare’s selection), used in order statistics, is quite involved. Many researchers have analysed the algorithm over the years. Averaged over all element ranks, the expected number of comparisons is 3n + O(log n). We show how the result can be derived in a simple and intuitive way when the algorithm is introduced. Our approach is "self-contained", quite short, and easy enough to be included in a lecture or a textbook.
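For readers who want to experiment with the comparison count, a minimal QuickSelect sketch (standard Hoare-style selection with a random pivot; the code and variable names are ours, not the speaker's):

```python
import random

def quickselect(a, k):
    """Return the k-th smallest element of a (0-indexed) in expected
    linear time; averaged over all ranks k, the expected number of
    element comparisons is 3n + O(log n)."""
    a = list(a)                          # work on a copy
    lo, hi = 0, len(a) - 1
    while lo < hi:
        p = a[random.randint(lo, hi)]    # random pivot value
        i, j = lo, hi
        while i <= j:                    # Hoare-style partition
            while a[i] < p:
                i += 1
            while a[j] > p:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Now a[lo..j] <= p <= a[i..hi] and j < i.
        if k <= j:
            hi = j                       # answer lies in the left part
        elif k >= i:
            lo = i                       # answer lies in the right part
        else:
            return a[k]                  # strictly between: a[k] == p
    return a[lo]
```

Unlike QuickSort, only the side containing rank k is recursed into, which is what brings the expected comparison count down to linear.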
18 Dec 2017 at 13:15 in 4523, Lindstedtsvägen 5
Verification of data-centric dynamic systems (and some related aspects)
(Riccardo De Masellis)
Data-centric dynamic systems are theoretical frameworks composed of agents and a data component. Each agent is characterized by a (possibly different) lifecycle model specifying its behavior, namely, when it can perform its actions, and the data component is a relational or knowledge base. At each time instant, the system is described by the current state of the agents as well as the state of the data: thus, when actions modify the data, the system evolves to a new state. Given their generality, data-centric dynamic systems can be used for modeling multi-agent systems, robots, (company) business processes or software components. The goal is to verify that a data-centric dynamic system meets temporal specifications of interest (expressed by temporal logic formulas): this can be done offline (à la model checking) or at runtime. The challenges are both theoretical and technical: for example, there is the need to abstract from the infinitely many values that can populate the system. Also, in many practical cases agent lifecycles eventually terminate, making the traditional infinite-trace semantics for temporal logics inadequate. Finally, there is the need for actual tools to perform the verification tasks. The seminar will provide a high-level overview of data-centric dynamic system frameworks, their challenges, solutions and some application scenarios.
05 Dec 2017 at 15:15 in room 4523
Model Checking Strategic Ability: Why, What, and Especially: How?
(Wojtek Jamroga)
Automated verification of discrete-state systems has been a hot topic in computer science for over 35 years. Model checking of temporal and strategic properties is one of the most prominent and most successful approaches here. I will present a brief introduction to the topic, and mention some relevant properties that one might like to verify this way. Then, I will talk about some recent results on approximate model checking and model reductions, which can be applied to facilitate verification of notoriously hard cases.
24 Nov 2017 at 11:15 in Seminar room 4523, Lindstedtsvägen 5, level 5
Machines that read and reason: Multi-hop reading comprehension across documents
(Pontus Stenetorp, Senior Research Associate at University College London (UCL) and former student at KTH)
Reading Comprehension (RC) is one of the most active areas of research in Natural Language Processing, yet existing RC datasets — SQuAD, TriviaQA, etc. — are dominated by queries that can be answered based on the content of a single paragraph or document. However, enabling models to combine pieces of textual information from different sources would drastically extend the scope of RC – effectively allowing models to even answer queries whose answers are never explicitly stated. In this talk, we will introduce a novel Multi-hop RC task, [1] where a model has to learn how to find and combine disjoint pieces of textual evidence, effectively performing multi-step (a.k.a. multi-hop) inference. We present two datasets, WikiHop and MedHop, from different domains — both constructed using a unified methodology. We will then discuss the behaviour of several baseline models, including two established end-to-end RC models, BiDAF and FastQA. For example, one model is capable of integrating information across documents, but both models struggle to select relevant information. Overall the end-to-end models outperform multiple baselines, but their best accuracy is still far behind human performance, leaving ample room for improvements. It is our hope that these new datasets will drive future RC model development, leading to new and improved applications in areas such as Search, Question Answering, and scientific text mining.
[1]: https://arxiv.org/abs/1710.06481
10 Oct 2017 at 13:15 in room 4618
Formal goal-oriented development of adaptable and resilient systems
(Elena Troubitsyna, Åbo Akademi, Finland)
Goal-oriented development facilitates structuring complex requirements. To ensure resilience, the designers should guarantee that the system achieves its goals despite changes, e.g., those caused by failures of system components. In my talk, I will present a formalisation of the basic concepts of goal-oriented multi-agent systems and their essential properties. I will focus on reasoning about resilience and discuss different mechanisms for enhancing system adaptability and resilience. The proposed formalisation allows us to establish connections between goals at different levels of abstraction, the system architecture and agent responsibilities. The proposed systematic treatment of the involved concepts provides designers with guidelines for the systematic, structured development of adaptive and resilient systems. In my talk, I will also discuss how, by augmenting the proposed models with stochastic information, we can quantitatively evaluate system resilience.
26 Sep 2017 at 13:15 in room 1440 (library)
Free-form and code-to-code search - Leveraging Q&A Data towards Accurate Matching of Diverse Solutions
(Tegawendé Bissyandé, Research Associate @SnT / Univ. Luxembourg)
Code search is an unavoidable activity in software development. Although various approaches have been explored in the literature to support code search tasks, two major issues challenge the accuracy and scalability of state-of-the-art techniques.
First, source code terms such as method names and variable types are often different from conceptual words mentioned in a free-form search query. This vocabulary mismatch problem can make code search inefficient. We will introduce COde voCABUlary (CoCaBu), an approach to resolving the vocabulary mismatch problem when dealing with free-form code search queries. Our approach leverages common developer questions and the associated expert answers to augment user queries with the relevant, but missing, structural code entities in order to improve the performance of matching relevant code examples within large code repositories.
Second, a large body of research on searching for code clones has focused on identifying (nearly) similar pieces of code either (1) statically by recognizing syntactic and structural similarities or (2) dynamically by observing when code fragments produce the same outputs for the same inputs or present similar execution traces. Unfortunately, the former approaches cannot detect code fragments that are functionally similar although syntactically different, while the latter approaches cannot scale to the search of (partially) equivalent code fragments in large code bases. We present FaCoY (Find Another Code other than Yours), a novel approach for statically finding code fragments which may be semantically similar to user input code. FaCoY implements a query alternation strategy: instead of directly matching code query tokens with code in the search space, FaCoY first attempts to identify other tokens which may be also relevant in implementing the functional behavior of the input code.
11 Sep 2017 at 13:15 in room 4523, Lindstedtsvägen 5
Pseudo-Boolean Constraints: Reasoning and Compilation
(Romain Wallon, Université d'Artois)
Pseudo-Boolean constraints, and their associated cutting-planes proof system, are theoretically more powerful than the usual resolution approach. However, even though pseudo-Boolean solvers benefit from some of the latest improvements made in SAT solvers, their practical results are disappointing compared to the promises of the theory.
To try to get a better understanding of this phenomenon, an interesting point of view is to consider pseudo-Boolean constraints as a compilation language.
In propositional logic, the Knowledge Compilation Map proposed by Adnan Darwiche and Pierre Marquis in 2002 makes it possible to determine which compilation language is suitable for representing knowledge according to what is intended to be done with it.
In this talk, we will extend this map by adding the pseudo-Boolean constraints to it, and use these results to try to find ways to improve pseudo-Boolean reasoning.
27 Jun 2017 at 13:15 in room 4523, Lindstedtsvägen 5
Rethinking Large-Scale Consensus through Blockchains
(Rafal Pass, Cornell)
The literature on distributed computing (as well as the cryptographic literature) typically considers two types of players---honest ones and corrupted ones. Resilience properties are then analyzed assuming a lower bound on the fraction of honest players. Honest players, however, are not only assumed to follow the prescribed protocol, but also assumed to be *online* throughout the whole execution of the protocol. The advent of “large-scale” consensus protocols (such as the blockchain protocol), where we may have millions of players, makes this assumption unrealistic. In this work, we initiate a study of distributed protocols in a “sleepy” model of computation, where players can be either online (awake) or offline (asleep), and their online status may change at any point during the protocol. The main question we address is:
Can we design consensus protocols that remain resilient under “sporadic participation”, where at any given point, only a subset of the players are actually online?
As far as we know, all standard consensus protocols break down under such sporadic participation, even if we assume that 99% of the online players are honest. Our main result answers the above question in the affirmative, assuming only that a majority of the online players are honest. Perhaps surprisingly, our protocol significantly departs from the standard approaches to distributed consensus, and we instead rely on key ideas behind Nakamoto’s blockchain protocol (while dispensing with “proofs-of-work”).
Based on joint work with Elaine Shi.
09 Jun 2017 at 11:00 in room 4523, Lindstedtsvägen 5
Pointer chasing via triangular discrimination
(Amir Yehudayoff, Technion)
We shall see how to use information theory to prove lower bounds for the distributional communication complexity of the pointer chasing problem. A key part of the proof is using triangular discrimination instead of total variation distance; this idea may be useful elsewhere.
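For reference, a minimal sketch of the two distances involved, for distributions on a common finite support (standard definitions; the code is ours, not from the talk):

```python
def triangular_discrimination(p, q):
    """Triangular discrimination (Vincze-Le Cam distance) between two
    probability distributions given as sequences of point masses:
    sum over x of (p(x) - q(x))^2 / (p(x) + q(x))."""
    return sum((pi - qi) ** 2 / (pi + qi)
               for pi, qi in zip(p, q) if pi + qi > 0)

def total_variation(p, q):
    """Total variation distance: half the L1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# By Cauchy-Schwarz, 2*TV(P,Q)^2 <= Delta(P,Q) <= 2*TV(P,Q), so
# triangular discrimination can be a strictly sharper quantity to
# track in an information-theoretic argument.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
print(triangular_discrimination(p, q), total_variation(p, q))
```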
30 May 2017 at 13:15 in room 4523, Lindstedtsvägen 5
Revisiting Diffusion Process for Hypergraph Laplacian
(Hubert Chan, University of Hong Kong)
The hypergraph Laplacian has been defined via a diffusion process [Louis STOC 2015]. The spectral properties of the Laplacian are used to achieve a variant of Cheeger's inequality for hyperedge expansion and higher-order Cheeger-like inequalities for multi-way expansion.
In this talk, we take a closer look at this diffusion process, in which each hyperedge tries to transfer measure from vertices having maximum measure to those having minimum. In order for the diffusion process to be well-defined, we see that the hyperedges must be coordinated to essentially solve a densest subset problem. Consequently, we can recover the basic Cheeger's inequality, but higher-order spectral properties of the Laplacian do not hold in hypergraphs in general.
This is joint work with Anand Louis, Zhihao Gavin Tang and Chenzi Zhang.
29 May 2017 at 12:00 in room 4523, Lindstedtsvägen 5
Unified and optimal lower bounds for monotone computation
(Robert Robere, University of Toronto)
A classic counting argument due to Claude Shannon shows that almost all Boolean functions have high complexity — more formally, all but an exponentially small fraction of Boolean functions on n bits require strongly exponential (i.e. 2^Ω(n)) size circuits. On the other hand, the best lower bound on circuit size that we have proven for any explicit function is on the order of 5n - o(1), which is not even superlinear! The state of affairs is not much better for Boolean formulas, for which we merely have cubic lower bounds. Even for monotone circuits and formulas, which are much better understood, the best lower bounds for explicit functions are weakly exponential (i.e. of the form 2^{n^c} where c < 1).
In this talk, we discuss some recent work in which we have nearly matched Shannon's lower bound for computing an explicit function by monotone circuit models. In particular, we prove that a monotone variant of the SAT problem requires monotone formulas of size 2^{c*n} for some universal constant c > 0. Our lower bounds are tight (up to the constant c) for any monotone Boolean function, and thus represent the first example of any explicit monotone Boolean function with strongly exponential size lower bounds. Furthermore, our techniques are very general and apply to a wide variety of monotone circuits models.
This work is joint with Toniann Pitassi.
24 May 2017 at 12:00 in room 4523, Lindstedtsvägen 5
"Secure" execution platforms for application software — some definitions of "secure"
(Frank Piessens, Katholieke Universiteit Leuven)
Software applications run on top of execution platforms consisting of hardware (processors, devices, communication networks, and so forth) as well as software (operating systems, compilers, virtual machines, language runtimes, databases, and so forth). It is obviously desirable that these platforms are "secure", but it is less obvious what "secure" means in this context. It is an empirical fact that attacks against application software often rely at least to some extent on aspects of the execution platform, but often the possibility of an attack is a joint responsibility between application code vulnerabilities and execution platform characteristics. For instance, a memory safety vulnerability is only a security concern if the compiler or operating system does not implement adequate countermeasures against exploitation of such vulnerabilities. As another example, an attacker sniffing the network is less of a concern if either the application or the platform provides adequate cryptographic protection.
So what exactly should we require of a platform for it to be secure? This seminar will discuss the problem of platform security by proposing and analyzing a number of different formal definitions for platform security, building on classical concepts from programming language theory.
22 May 2017 at 10:00 in room 1537
Part I: Future-Proof Software and Part II: How to test the universe? - Efficient Testing of Software Product Lines
(Ina Schaefer, Technische Universität Braunschweig, Germany)
Part I: Modern software systems are extremely long-lived and evolve over time in order to adapt to changing user requirements, application contexts or technological changes. This leads to challenges for software design, implementation, quality assurance and re-engineering in order to make those software systems future-proof. In this talk, I will present an overview of ongoing research in my group at the Institute of Software Engineering and Automotive Informatics at TU Braunschweig addressing the challenges of long-living and evolving software systems. I will focus on two areas of our work: (1) applying machine learning techniques for efficient system-level regression testing and (2) similarity-based variability model mining in order to transform grown software systems into well-structured software product lines.
Part II: Software product line engineering has gained considerable momentum in recent years, both in industry and in academia. A software product line is a family of software products that share a common set of features. Software product lines challenge traditional analysis, test and verification techniques in their quest to ensure correctness and reliability of software. Simply creating and testing all products of a product line is usually not feasible, due to the potentially exponential number of valid feature combinations. In this talk, I present different strategies for software product line testing and focus on our recent work on incremental test case selection and test case prioritization for efficient product line testing.
16 May 2017 at 13:00 in Room 4423, Lindstedtsvägen 5
Decremental Shortest Paths and Beyond
(Danupon Na Nongkai, Theoretical Computer Science (TCS) Department, KTH)
This lecture focuses on a fundamental problem in dealing with graphs that evolve over time called decremental single-source shortest paths. This problem concerns maintaining the distances and shortest paths between a given source node and every other node in a graph undergoing edge deletions. In this talk, I will discuss the recent development on this problem which leads to a randomized (1+ε)-approximation algorithm with near-linear total update time on undirected graphs. I will also discuss the main technique called hop-set which has found many new applications beyond the decremental shortest paths problem. The talk will also touch on some lower bound developments and other variations of the problem.
16 May 2017 at 10:15 in room 4523, Lindstedtsvägen 5
An Interactive journey through combinatorial optimization: resilient solutions, multiple objectives, and logic-based approaches for timetabling
(Emir Demirović, Technische Universität Wien)
In this presentation, a range of combinatorial optimization topics will be covered, where the audience will dictate the flow according to their interest. Therefore, it is meant to be an interactive presentation. The topics are as follows:
1) Solving the general high school timetabling (XHSTT) problem using logic-based approaches: maxSAT, SMT-bitvector theory, and a hybrid algorithm which combines maxSAT with large neighborhood search (metaheuristic algorithm). These approaches are state-of-the-art for XHSTT, outperforming the competition on numerous benchmarks.
2) Computing the "representative" subset of all Pareto optimal solutions for multi-objective optimization problems is known to be $\Sigma^p_2$-hard. We will see a shift in complexity to NP-hardness when considering an approximation using the notion of egalitarian solutions, and discuss algorithms for its computation.
3) For practical applications, it is important to compute solutions that remain "stable" after unexpected changes to the problem happen (e.g. new constraints are added post-solving). We will formally define the problem for set covering and set partitioning, prove $\Sigma^p_3$- and $\Sigma^p_2$-hardness (respectively), and analyze algorithms for computing such "resilient" solutions.
15 May 2017 at 13:15 in room 4523, Lindstedtsvägen 5
Representations of Monotone Boolean Functions by Linear Programs
(Mateus de Oliveira Oliveira)
We introduce the notion of monotone linear-programming circuits (MLP circuits), a model of computation for partial Boolean functions. Using this model, we prove the following results.
1. MLP circuits are superpolynomially stronger than monotone Boolean circuits.
2. MLP circuits are exponentially stronger than monotone span programs.
3. MLP circuits can be used to provide monotone feasibility interpolation theorems for Lovász-Schrijver proof systems, and for mixed Lovász-Schrijver proof systems.
4. The Lovász-Schrijver proof system cannot be polynomially simulated by the cutting-planes proof system. This is the first result showing a separation between these two proof systems.
Finally, we discuss connections between the problem of proving lower bounds on the size of MLPs and the problem of proving lower bounds on extended formulations of polytopes.
08 May 2017 at 12:00 in room 4523, Lindstedtsvägen 5
Information-theoretic thresholds
(Amin Coja-Oghlan, Goethe University)
Vindicating a sophisticated but non-rigorous physics approach called the cavity method, we prove a formula for the mutual information in statistical inference problems induced by random graphs. This general result implies the conjecture on the information-theoretic threshold in the disassortative stochastic block model [Decelle et al.: Phys. Rev. E (2011)] and allows us to pinpoint the exact condensation phase transition in random constraint satisfaction problems such as random graph coloring, thereby proving a conjecture from [Krzakala et al.: PNAS (2007)]. As a further application we establish the formula for the mutual information in Low-Density Generator Matrix codes as conjectured in [Montanari: IEEE Transactions on Information Theory (2005)]. Joint work with Florent Krzakala, Will Perkins and Lenka Zdeborova.
02 May 2017 at 10:15 in room 4523, Lindstedtsvägen 5
A meta-algorithm for Bayesian estimation
(Sam Hopkins, Cornell University)
We propose a simple and efficient meta-algorithm for Bayesian estimation problems (i.e. hidden variable, latent variable, or planted problems). Our algorithm uses low-degree polynomials together with new and highly robust tensor decomposition methods. We focus on the question: for a given estimation problem, precisely how many samples (up to low-order additive terms) do polynomial-time algorithms require to obtain good estimates of hidden variables? Our meta-algorithm is broadly applicable, and achieves statistical or conjectured computational sample-complexity thresholds for many well-studied problems, including many for which previous algorithms were highly problem-specific.
We discuss our meta-algorithm in two concrete contexts. In the stochastic block model -- a widely studied family of random graphs used to model community structure in networks -- we recover and unify the best-known algorithms for the so-called "partial recovery" problem, and obtain (conjecturally) tight sample complexity results for more realistic blockmodel variants. For sparse PCA, a useful but computationally challenging primitive in high-dimensional statistics, we prove an optimality result for our meta-algorithm, showing that it is at least as powerful as a very strong family of convex programs called the Sum-of-Squares hierarchy.
Based on joint works with Pravesh Kothari, Aaron Potechin, Prasad Raghavendra, Tselil Schramm, and David Steurer.
24 Apr 2017 at 12:00 in room 4523, Lindstedtsvägen 5
Dynamic (Minimum) Spanning Forest with Worst-case Update Time
(Thatchaphol Saranurak, TCS group - KTH)
We will discuss recent algorithms for dynamically maintaining a (minimum) spanning forest of a graph undergoing edge insertions and deletions. This problem played a central role in the study of dynamic graph algorithms.
We provide the "first polynomial improvement" over the long-standing O(sqrt{n}) bound of [Frederickson STOC'83, Eppstein et al. FOCS'92] for the dynamic spanning forest problem. Independently, Wulff-Nilsen showed the first polynomial improvement for the dynamic minimum spanning forest problem. Both works will be presented in STOC'17.
The crucial new techniques concern expanders: 1) an algorithm for decomposing a graph into a collection of expanders in near-linear time, and 2) an algorithm for "repairing" the expansion property of an expander after some of its edges are deleted. These techniques may be of independent interest.
21 Apr 2017 at 12:00 in room 4523
Witnessing: From Optimization to Verification to Optimization
(Lenore Zuck, University of Illinois, Chicago)
An optimizing compiler is commonly structured as a sequence of passes, each transforming a source program into a target one. The proof that such a transformation pass is correct consists of a relation between source and target, often with the aid of invariants. Those invariants also enable further optimizations. Thus, the two tasks of proving the validity of a transformation and propagating invariants to enable further optimizations are closely related. Both can be addressed by augmenting each transformation with a witness, a stuttering simulation relation between the target and source programs that guarantees correctness for that pass. This stuttering simulation relation forms a sound and complete witness format. Witnesses can be generated either automatically or by (sound) external static analyzers.
The talk will describe both witness generation and witness propagation, and demonstrate the feasibility of the approach on the LLVM compiler. Each optimization pass is augmented with a witness generator. This makes it possible to construct an independent translation validation framework for the compiler, as well as to generate an independent verification for each of its runs. The automatically generated invariants, as well as some obtained by Frama-C, can then be propagated to enhance optimizations. This is demonstrated in experiments where propagated witnesses are used to substantially improve the effectiveness of several optimizations.
10 Apr 2017 at 12:00 in room 4523, Lindstedtsvägen 5
Complexity results in automated planning
(Meysam Aghighi, Linköping University)
Planning is the problem of finding a sequence of actions that achieves a specific goal from an initial state. This problem is computationally hard in the general case. Propositional planning is PSPACE-complete and first-order planning is undecidable.
Cost-optimal planning (COP) is a type of planning where each action has a cost and the goal is to find a satisfying plan with minimal cost. We show how under different restrictions, COP varies from being polynomial-time solvable to PSPACE-complete. Net-benefit planning (NBP) is a type of planning where the goal is to find a plan that maximizes the achieved utility minus the cost spent. We present a method for efficiently compiling NBP into the ordinary plan existence problem.
We also show the effect of numeric domains for action costs in COP using parameterized complexity. Planning is PSPACE-complete regardless of whether action costs are positive or non-negative, integer or rational. We distinguish between these cases and show that COP is W[2]-complete for positive integer costs, but it is para-NP-hard if the costs are non-negative integers or positive rationals.
27 Mar 2017 at 12:00 in room 4523, Lindstedtsvägen 5
Complexity theory beyond deterministic exponential time and applications
(Igor Carboni Oliveira, Charles University)
In the first part of the talk, I'll survey a variety of connections between non-uniform circuit lower bounds, non-trivial proof systems, non-trivial learning algorithms, and variants of natural properties. In particular, I plan to discuss how one can prove new unconditional lower bounds for standard complexity classes such as REXP, ZPEXP, BPEXP, and NEXP by designing algorithms with a non-trivial running time.
After the break, we will focus on a recent application of some of the complexity-theoretic techniques behind these results. I'll describe in more detail recent progress on the problem of efficiently generating large canonical prime numbers, via pseudodeterministic pseudorandomness. If there is enough time and interest, I'll also discuss some non-constructive aspects of the result, and connections to derandomization.
Based on joint work with Rahul Santhanam.
20 Mar 2017 at 13:15 in Biblioteket 1440, Lindstedtsvägen 3
Logical modelling, specification, verification and synthesis of multi-agent systems: an introduction
(Valentin Goranko, Stockholm University)
In this talk I will introduce concurrent game models for multi-agent systems and the Alternating-time temporal logic ATL as the currently most popular logical system for formalizing and reasoning about strategic abilities in such systems.
I will illustrate the semantics of ATL with some examples and will discuss briefly the logical decision problems of model checking and constructive satisfiability testing of ATL formulae, and how these can be used for specification, verification, and synthesis of concurrent game models.
16 Mar 2017 at 13:15 in room 4523, Lindstedtsvägen 5
Dag-like communication and its application
(Dmitry Sokolov, St. Petersburg Department of V. A. Steklov Institute of Mathematics)
In this talk we consider a generalization of communication protocols to the dag-like case (instead of the classical one, where protocols correspond to trees). We prove an analogue of the Karchmer-Wigderson theorem for this model and Boolean circuits. We also consider a relation between this model and the communication PLS games proposed by Razborov in 1995, and simplify the proof of Razborov's analogue of the Karchmer-Wigderson theorem for PLS games.
We also consider a dag-like analogue of real-valued communication protocols and adapt a lower bound technique for monotone real circuits to prove a lower bound for these protocols. We use real-valued dag-like communication protocols to generalize ideas of "feasible interpolation" technique, which helps us to prove a lower bound on the Cutting Planes proof system (CP) and adapt it to "random CP" by using recent result of Krajíček.
Our notion of dag-like communication games allows us to use the Raz-McKenzie transformation from the paper of Göös and Pitassi, which yields a lower bound on the monotone real circuit size for the CSPSAT problem.
15 Mar 2017 at 10:00 in room 4523
On polynomial approximations to AC0
(Prahladh Harsha, Tata Institute of Fundamental Research)
In this talk, we will discuss some questions related to polynomial approximations of AC0. A classic result due to Tarui (1991) and Beigel, Reingold, and Spielman (1991), states that any AC0 circuit of size s and depth d has an ε-error probabilistic polynomial over the reals of degree at most (log(s/ε))^{O(d)}. We will have a re-look at this construction and show how to improve the bound to (log s)^{O(d)}⋅log(1/ε), which is much better for small values of ε.
As an application of this result, we show that (log s)^{O(d)}⋅log(1/ε)-wise independence fools AC0, improving on Tal's strengthening of Braverman's theorem that (log(s/ε))^{O(d)}-wise independence fools AC0. Time permitting, we will also discuss some lower bounds on the best polynomial approximations to AC0.
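For reference, the standard notion of a probabilistic polynomial used in these results can be stated as follows (the usual textbook definition, not specific to this talk):

```latex
% A distribution \mathcal{P} over real polynomials of degree at most d is an
% \varepsilon-error probabilistic polynomial for f : \{0,1\}^n \to \{0,1\} if
\[
  \forall x \in \{0,1\}^n : \quad
  \Pr_{p \sim \mathcal{P}}\bigl[\, p(x) \neq f(x) \,\bigr] \le \varepsilon,
\]
% where the degree of the probabilistic polynomial is the maximum degree
% appearing in the support of \mathcal{P}.
```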
13 Mar 2017 at 12:00 in room 4523, Lindstedtsvägen 5
A nearly tight sum-of-squares lower bound for the planted clique problem
(Aaron Potechin, Institute for Advanced Study)
The sum of squares (SoS) hierarchy, a method for deciding the feasibility of polynomial equations over the reals, is an exciting frontier of complexity theory. SoS is exciting because it can be applied to a wide variety of combinatorial optimization problems and is in fact conjectured to be optimal for many important problems. However, its performance is only partially understood.
In this talk, I will describe recent work proving sum of squares lower bounds for the planted clique problem, a classic average case problem. In the first part of the talk, I will describe the sum of squares hierarchy, planted clique, and pseudo-calibration, a key idea for the lower bound. In the second part of the talk, I will describe the main technical ideas needed in the proof.
06 Mar 2017 at 13:15 in room 4523, Lindstedtsvägen 5
SAT-based learning to verify liveness of randomised parameterised systems
(Philipp Rümmer, Uppsala University)
We consider the problem of verifying liveness for systems with a finite, but unbounded, number of processes, commonly known as parameterised systems. Typical examples of such systems include distributed protocols (e.g. for the dining philosophers problem). Unlike the case of verifying safety, proving liveness is still considered extremely challenging, especially in the presence of randomness in the system. We introduce an automatic method of proving liveness for randomised parameterised systems under arbitrary schedulers. Viewing liveness as a two-player reachability game (between Scheduler and Process), our method is a CEGAR approach that synthesises a progress relation for Process that can be symbolically represented as a finite-state automaton. The method constructs a progress relation by means of a suitable Boolean encoding and incremental SAT solving. Our experiments show that our algorithm is able to prove liveness automatically for well-known randomised distributed protocols, including the Lehmann-Rabin randomised dining philosophers protocol and randomised self-stabilising protocols (such as the Israeli-Jalfon protocol). To the best of our knowledge, this is the first fully-automatic method that can prove liveness for randomised protocols.
25 Jan 2017 at 10:00 in room 4523, Lindstedtsvägen 5
The State of the Art of Automatic Program Repair
(Martin Monperrus, University of Lille & INRIA France)
Automatic software repair is the process of fixing software bugs automatically and is a very active research area. Before having a large impact on practice, current research tries to understand the fundamental strengths and weaknesses of the best-of-breed repair algorithms. This talk presents the results of a recent experiment on repairing 224 real Java bugs from open source projects.
Ref:
Automatic Repair of Real Bugs in Java: A Large-Scale Experiment on the Defects4J Dataset (Matias Martinez, Thomas Durieux, Romain Sommerard, Jifeng Xuan and Martin Monperrus), In Springer Empirical Software Engineering, 2016.
Nopol: Automatic Repair of Conditional Statement Bugs in Java Programs (Jifeng Xuan, Matias Martinez, Favio DeMarco, Maxime Clément, Sebastian Lamelas, Thomas Durieux, Daniel Le Berre and Martin Monperrus), In IEEE Transactions on Software Engineering, 2016.
23 Jan 2017 at 12:00 in room 4523, Lindstedtsvägen 5
Cumulative space in black-white pebbling and resolution
(Susanna F. de Rezende, Theory Group, KTH)
The space usage of a computation is usually defined as the maximum memory it requires, and this is essentially everything one would want to know if the computational model is static. But what if we are computing in an environment where we can allocate memory dynamically? Then it makes a big difference whether we only have one single spike of high memory consumption or whether we consistently need large amounts of memory throughout the whole computation.
In this talk we will explore a complexity measure, namely cumulative space, which allows us to measure this difference by accounting for the sum of memory usage over all time steps rather than just the maximum. It was introduced for pebble games by [Alwen and Serbinenko '15] and used to propose a more robust definition of cryptographic functions that are hard to invert (memory-hard functions).
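As a toy illustration (a minimal sketch, not tied to the specific pebbling results of the talk), the gap between maximum and cumulative space is visible already for two memory-usage traces with the same peak:

```python
# Toy illustration: two computations with the same maximum space
# but very different cumulative space.

def max_space(trace):
    """Maximum memory used at any time step (the classical measure)."""
    return max(trace)

def cumulative_space(trace):
    """Sum of memory usage over all time steps (cumulative space)."""
    return sum(trace)

# One short spike of high memory vs. consistently high memory.
spike = [1, 1, 100, 1, 1, 1, 1, 1, 1, 1]
steady = [100] * 10

# Identical under the maximum-space measure...
assert max_space(spike) == max_space(steady) == 100
# ...but an order of magnitude apart in cumulative space.
print(cumulative_space(spike))   # 109
print(cumulative_space(steady))  # 1000
```

In a setting with dynamic memory allocation, the second computation is far more expensive even though both have the same peak, which is exactly the distinction cumulative space captures.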
We will present some results regarding the cumulative space of another pebble game, namely the black-white pebble game, and relate this to the cumulative space complexity of refuting formulas in the resolution proof system. This is joint work with Joël Alwen, Jakob Nordström and Marc Vinyals.
This presentation will also serve as Susanna's 50% PhD ladder seminar.
18 Jan 2017 at 12:00 in room 1537, Lindstedtsvägen 3
The QRAT proof system
(Martina Seidl, Johannes Kepler University Linz)
We present QRAT, a novel proof system for quantified Boolean formulas. With QRAT it is now possible to produce certificates for all reasoning techniques implemented in state-of-the-art QBF preprocessors. Such certificates are not only useful to efficiently validate the correctness of a result, but they also allow the extraction of Skolem functions, i.e., winning strategies for satisfiable formula instances.
In this talk, we first introduce the rules of QRAT, then we show how they can be used to express some powerful reasoning techniques implemented in modern preprocessors, and finally we outline how to extract a Skolem function from a QRAT proof.
28 Nov 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Space in proof complexity
(Marc Vinyals, Theory Group, KTH)
Space in proof complexity measures the amount of memory that a verifier needs to check a proof, which imposes a lower bound on the memory a SAT solver needs to generate a proof. Space is very well understood for the most common proof system, resolution, but less so in other proof systems. In this talk we will survey some recent results about space in other settings: stronger proof systems such as polynomial calculus and cutting planes on the one hand, and a weaker "CDCL" proof system that is closer to the actual capabilities of SAT-solving algorithms on the other hand. We will even explore alternative definitions of space. The proof techniques will lead us to discussing adjacent areas of computational complexity such as pebble games and communication complexity.
21 Nov 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Strong size lower bounds in regular resolution via games
(Ilario Bonacina, Theory Group, KTH)
The Strong Exponential Time Hypothesis (SETH) says that solving the SATISFIABILITY problem on k-CNF formulas in n variables requires running time 2^(n(1 - c_k)), where c_k goes to 0 as k goes to infinity. Beck and Impagliazzo (2013) proved that regular resolution cannot disprove SETH: there are unsatisfiable k-CNF formulas in n variables such that every regular resolution refutation of them has size at least 2^(n(1 - c_k)), where c_k goes to 0 as k goes to infinity. We give a different and simpler proof of this lower bound, based on the known characterizations of width and size in resolution, and our technique in fact works for a proof system stronger than regular resolution. The problem of finding k-CNF formulas for which such strong size lower bounds can be proved in general resolution is still open. This is joint work with Navid Talebanfard.
14 Nov 2016 at 13:15 in room 1440 Biblioteket, Lindstedtsvägen 3
Architecting the next generation of vehicles
(Patrizio Pelliccione, Chalmers University of Technology)
The automotive domain is going through an extremely challenging period, shaken by many emerging business and technological needs. Electrification, autonomous driving, and connected cars are some of the driving forces in this changing world. Increasingly, vehicles are becoming software-intensive complex systems, and most of the innovation within the automotive industry is based on electronics and software. Modern vehicles can have over 100 Electronic Control Units (ECUs), which are small computers, together executing gigabytes of software. ECUs are connected to each other through several networks within the car, and the car is increasingly connected with the outside world. These novelties call for a change in how the software is engineered and produced, and for a disruptive renovation of the electrical and software architecture of the car.
In this talk I will present the current investigation within Volvo Cars to create an architecture framework able to cope with the complexity and needs of present and future vehicles. Specifically, I will present scenarios that describe demands for the architectural framework and introduce three new viewpoints that need to be taken into account for future architectural decisions: Continuous Integration and Deployment, Ecosystem and Transparency, and car as a constituent of a System of Systems.
07 Nov 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Graph-based pseudo-industrial random SAT instance generators
(Jesús Giráldez Crú, Theory Group, KTH)
The Boolean Satisfiability problem (SAT) is the canonical NP-complete problem. However, Conflict-Driven Clause Learning (CDCL) SAT solvers are nowadays able to efficiently solve many industrial, or real-world, SAT instances (such as hardware and software verification, cryptography, planning, or scheduling, among others). On the other hand, relatively small random SAT instances are still too hard. The common intuition explaining the success of CDCL on industrial instances is the existence of some hidden structure in them (whereas random formulas show no such structure). Some works study this structure by representing SAT instances as graphs and analyzing certain graph features, showing that these features are shared by the majority of existing industrial SAT instances. Some examples are the scale-free structure and the community structure.
Unfortunately, the development and testing of new SAT solving techniques (specialized in industrial problems) is constrained by the small number of existing industrial benchmarks. Therefore, the generation of random SAT instances that realistically capture some computational properties of such industrial instances is an interesting open problem.
In this talk, we review some models for generating pseudo-industrial random SAT instances: the scale-free model and the community attachment model, which are related to well-known concepts in real-world complex networks, namely preferential attachment and high modularity, respectively. In the scale-free model, the number of variable occurrences k follows a power-law distribution P(k) \propto k^{-\alpha}. With the community attachment model, it is possible to generate SAT instances with clear community structure (i.e., high modularity). These models will be reviewed from the perspectives of both graphs and SAT instances. Finally, we discuss some models for generating complex networks (not yet adapted to SAT instances) that may overcome the limitations of the previous models.
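As a hedged sketch (hypothetical parameters, not the authors' actual generator), the scale-free idea of drawing variables from a power-law-like distribution can be mimicked in a few lines: variable i is sampled with probability proportional to i^(-beta), so a few variables occur very often while most occur rarely.

```python
import random

def scale_free_clauses(n_vars, n_clauses, k=3, beta=0.8, seed=0):
    """Sketch of a scale-free random k-CNF generator: variable i is
    chosen with probability proportional to i**(-beta), producing
    power-law-like variable occurrence counts. The parameters here
    are illustrative, not those of the models discussed in the talk."""
    rng = random.Random(seed)
    weights = [(i + 1) ** (-beta) for i in range(n_vars)]
    clauses = []
    for _ in range(n_clauses):
        chosen = set()
        while len(chosen) < k:  # k distinct variables per clause
            chosen.add(rng.choices(range(1, n_vars + 1), weights=weights)[0])
        # negate each literal with probability 1/2
        clauses.append([v if rng.random() < 0.5 else -v for v in chosen])
    return clauses

cnf = scale_free_clauses(100, 400)
print(len(cnf))  # 400
```

A uniform random generator corresponds to beta = 0; larger beta skews occurrences toward low-index variables, imitating the heavy-tailed occurrence distributions observed in industrial instances.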
24 Oct 2016 at 13:15 in room 1440 Biblioteket, Lindstedtsvägen 3
Rigorous simulation
(Walid Taha, Halmstad University)
The falling price of computational and communication components means that they will increasingly be embedded into physical products. Verifying the designs of the resulting “cyber-physical” products is challenging for several reasons. First, closed-form solutions for the behavior of physical systems rarely exist. Second, the most natural mathematical tool for modeling cyber-physical combinations, namely, hybrid (discrete/continuous) systems, exhibit pathologies that arise in neither purely continuous nor purely discrete systems. Third, the expressivity of existing continuous dynamics formalisms is generally lower than those used by domain experts.
To address these problems, we are developing a technology called “rigorous simulation”. The back-end for rigorous simulation uses validated numerics algorithms, which compute guaranteed bounds for the precision of all solutions. We show that these algorithms can be extended to compute trajectories for some hybrid systems exhibiting Zeno behavior. Ongoing work suggests that chattering behavior can be similarly addressed. We make validated numerics more accessible to non-specialists through the use of a domain-specific language, based on hybrid ordinary differential equations, which we also extend to support partial derivatives and certain types of equational modeling. An implementation called “Acumen” has been built and used for several case studies. These include virtual testing of advanced driver assistance functions, bipedal robotics, and a range of model problems for teaching at both graduate and undergraduate levels.
17 Oct 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Simulation theorem and fork-lift
(Sagnik Mukhopadhyay, TIFR Mumbai)
Recently, proving theorems of the form that the communication complexity of a composed function f \circ g is essentially of the order of the decision tree complexity of f times the communication complexity of g has received a lot of attention. In particular, Goos-Pitassi-Watson (2015) simplified the proof of such a theorem for deterministic complexity due to Raz-McKenzie (1997) that worked only when g is the Indexing function. They used this theorem to settle a longstanding open problem in communication complexity. The Raz-McKenzie theorem needs the size of the Indexing gadget to be at least p^20, where p is the number of instances of Index.
We identify a simple sufficient condition on g for proving such deterministic simulation theorems. Using this, we show that CC(f \circ IP) = Omega(DT(f) \cdot n), provided n = Omega(log p), where IP is the inner-product function. This gives an exponential improvement over the gadget size of Raz-McKenzie.
We also prove a tight lower bound for randomized communication complexity of OrdSearch \circ IP in terms of randomized decision tree complexity of the function OrdSearch, which is Omega(log p). Proving a randomized simulation theorem remains elusive and we will discuss the hurdles that are needed to be faced and overcome.
This is a joint work with Arkadev Chattopadhyay (TIFR Mumbai), Michal Koucky and Bruno Loff (Charles University, Prague).
03 Oct 2016 at 14:00 in Room 4423, Lindstedtsvägen 5
Robust and Efficient Computation in Dynamic Networks with Heavy Churn
(John Augustine, IIT Madras)
Peer-to-Peer (P2P) networks — typically overlays on top of the Internet — pose some unique challenges to algorithm designers. The difficulty comes primarily from constant churn owing to the short life span of most nodes in the network. In order to maintain a well-connected overlay network despite churn at this level, the overlay must be dynamically maintained. This makes the overlay network graph an evolving graph that exhibits both edge dynamism and node churn. In this talk, we will discuss the somewhat surprising line of work in which we show how to design fast algorithms that are robust against heavy churn.
We will begin with a discussion on how to create an overlay network that has good expansion despite adversarial churn. Subsequently, assuming such an overlay network with good expansion, we will present a few basic techniques like fast information dissemination, random walks, and support estimation. Finally, we show how to use these techniques to design algorithms to solve fundamental problems like agreement, leader election, storing and retrieving data items.
30 Sep 2016 at 14:00 in Room 4523, Lindstedtsvägen 5
Distributed (Delta+1)-Coloring in Sublogarithmic Rounds
(Hsin-Hao Su, MIT)
The (Delta+1)-coloring problem and the MIS (maximal independent set) problem are fundamental distributed symmetry-breaking problems. Although many faster algorithms are known when the maximum degree Delta is small, the best running time for both problems has remained O(log n) rounds since Luby's algorithm. In this talk, I will review some randomized approaches for distributed coloring. Then I will present recent joint work with David Harris and Johannes Schneider, which shows that a (Delta+1)-coloring can be computed with high probability in O(\sqrt{log Delta}) + 2^{O(\sqrt{log log n})} rounds. This also implies that the (Delta+1)-coloring problem is easier than the MIS problem, due to the Omega(min(log Delta / log log Delta, \sqrt{log n / log log n})) lower bound for MIS by Kuhn, Moscibroda, and Wattenhofer. Finally, I will address some open problems.
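For background on why Delta+1 colors always suffice (a sequential sketch, not the distributed algorithm of the talk): color vertices greedily in any order; each vertex has at most Delta neighbors, so some color in {0, ..., Delta} is always free.

```python
def greedy_coloring(adj):
    """Sequential greedy coloring: each vertex gets the smallest color
    not used by an already-colored neighbor, so at most Delta+1 colors
    are ever needed."""
    color = {}
    for v in adj:
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# 5-cycle: Delta = 2, so the greedy bound guarantees at most 3 colors.
cycle5 = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(cycle5)
assert all(coloring[u] != coloring[v] for u in cycle5 for v in cycle5[u])
assert max(coloring.values()) <= 2  # colors 0..2, i.e. at most Delta+1
```

The distributed challenge addressed in the talk is reproducing this guarantee when all vertices pick colors concurrently with only local information.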
26 Sep 2016 at 13:15 in Room 4523, Lindstedtsvägen 5
AnyDSL: Building Domain-Specific Languages for Productivity and Performance
(Sebastian Hack, Saarland University)
To achieve good performance, programmers have to carefully tune their application for the target architecture. Optimizing compilers fail to produce the "optimal" code because their hardware models are too coarse-grained. Even more, many important compiler optimizations are computationally hard even for simple cost models. It is unlikely that compilers will ever be able to produce high-performance code automatically for today's and future machines.
Therefore, programmers often optimize their code manually. While manual optimization is often successful in achieving good performance, it is cumbersome, error-prone, and unportable. Creating and debugging dozens of variants of the same original code for different target platforms is just an engineering nightmare.
An appealing solution to this problem is offered by domain-specific languages (DSLs). A DSL offers language constructs that can express the abstractions used in the particular application domain. This way, programmers can write their code productively, at a high level of abstraction. Very often, DSL programs look similar to textbook algorithms. Domain and machine experts then provide efficient implementations of these abstractions. Thus, DSLs enable the programmer to productively write portable and maintainable code that can be compiled to efficient implementations. However, writing a compiler for a DSL is a huge effort that people are often not willing to make. Therefore, DSLs are often embedded into existing languages to save some of the effort of writing a compiler.
In this talk, I will present the AnyDSL framework we have developed over the last three years. AnyDSL provides the core language Impala, which can serve as a starting point for almost "any" DSL. New DSL constructs can be embedded into Impala in a shallow way, that is, just by implementing the functionality as a (potentially higher-order) function. AnyDSL uses online partial evaluation to remove the overhead of the embedding.
To demonstrate the effectiveness of our approach, we generated code from generic, high-level, textbook image-processing algorithms that has, on every hardware platform tested (Nvidia/AMD/Intel GPUs, SIMD CPUs), beaten the industry-standard benchmark (OpenCV) by 10-35%, a standard that has been carefully hand-optimized for each architecture over many years. Furthermore, the implementation in Impala has one order of magnitude fewer lines of code than corresponding hand-tuned expert code. We have also obtained similar first results in other domains.
This is joint work with Roland Leissa, Klaas Boesche, Richard Membarth, and Philipp Slusallek.
12 Sep 2016 at 13:15 in room 4523, Lindstedtsvägen 5
On Complexity Measures in Polynomial Calculus
(Mladen Miksa, KTH Royal Institute of Technology)
If we want to show that a formula in propositional logic is unsatisfiable, we can present a proof of this fact in some formal proof system. Propositional proof systems give us rules on how to derive new lines of the proof, starting from the original propositional formula, until we finally deduce a line that is an obvious contradiction. The most well-studied proof system is resolution, in which lines are clauses and there is only one inference rule. One stronger proof system is polynomial calculus, in which clauses are replaced with polynomial equations and which can therefore reason more efficiently about polynomial constraints than resolution can. In both of these proof systems we can naturally define the size and space of a proof. These two measures can be tied to the running time and memory usage of SAT solvers that use these proof systems. In this talk I will survey some recent results related to the size and space measures in polynomial calculus, and compare these results to similar results for resolution.
16 Jun 2016 at 13:15 in room 4523, Lindstedtsvägen 5
Addressing dynamic behavior in cyber-physical systems from a communication middleware perspective: Challenges and some selected contributions.
(Marisol García Valls, Universidad Carlos III de Madrid)
The current vision of some future systems brings in a degree of complexity that is not manageable with the available design and development methods and technologies. As an example, the scale, heterogeneity, and dynamic behavior of cyber-physical systems (CPS) pose a number of challenges to the fulfillment of their inherent fundamental properties, such as timeliness. In this context, there are numerous sources of unpredictability, such as network transmissions and system reconfiguration needs, that challenge the construction of system models that can be fully designed and verified statically.
Communication middleware is a fundamental software technology with a major role in supporting the dynamic behavior of distributed systems, especially at large scale. It allows the application level to be abstracted from the underlying details of the network protocols and the execution platform at the participant nodes or subsystems (i.e., hardware and mainstream software layers such as the operating system and even the eventual virtualization software).
This talk will present the challenges faced by the development of communication middleware for cyber-physical systems. The focus will be on the distributed nature of CPS and on their changing architecture needs, differentiating clearly between pure resource managers and communication middleware. Also, some contributions and lines of work towards a real-time middleware that supports the dynamic properties of these systems will be presented.
13 Jun 2016 at 13:15 in room 1537
Building Dependable Concurrent Systems through Probabilistic Inference, Predictive Monitoring and Self-Adaptation
(Gul Agha, University of Illinois at Urbana-Champaign)
The infeasibility of statically verifying complex software is well established; in concurrent systems, the difficulty is compounded by nondeterminism and the possibility of 'Heisenbugs'. Using runtime verification, one can not only monitor a concurrent system to check whether it has violated a specification, but also potentially predict future violations. However, a key challenge for runtime verification is that specifications are often incomplete. I will argue that the safety of concurrent systems could be improved by observing patterns of interaction and using probabilistic inference to capture intended coordination behavior. Actors reflecting on their choreography in this way would enable deployed systems to continually improve their specifications. Mechanisms to dynamically add monitors and enforce coordination constraints during execution would then facilitate self-adaptation in concurrent systems. I will conclude by suggesting a program of research to extend runtime verification so that systems can evolve robustness through such self-adaptation.
02 May 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Dynamic primal-dual algorithms for vertex cover and matching
(Sayan Bhattacharya, Institute of Mathematical Sciences Chennai)
Consider a scenario where we are given an input graph G = (V, E) with n nodes and m edges. The node-set of the graph remains unchanged over time, but the edge-set is dynamic. Specifically, at each time-step an adversary either inserts an edge into the graph, or deletes an already existing edge from the graph. The problem is to maintain an approximately maximum matching (resp. minimum vertex cover) in this dynamic setting.
We present a clean primal-dual algorithm for this problem that has O(log n/\epsilon^2) amortized update time. The approximation ratio of the algorithm is (2+\epsilon) for minimum vertex cover and (3+\epsilon) for maximum matching. We also show how to extend this result to the dynamic b-matching and set-cover problems. This is the first application of the primal-dual framework in a dynamic setting.
Joint work with M. Henzinger and G. F. Italiano (based on papers that appeared in SODA 2015 and ICALP 2015).
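To put these bounds in context, here is a minimal sketch (class and method names are my own, not from the paper) of the trivial baseline the primal-dual algorithm improves on: maintaining a *maximal* matching under edge updates. A maximal matching is a 2-approximate maximum matching, and its matched endpoints form a 2-approximate vertex cover, but the naive rematching step below costs O(deg) per deletion rather than O(log n/\epsilon^2) amortized.

```python
class DynamicMaximalMatching:
    """Naive baseline: keep a maximal matching under edge updates."""

    def __init__(self):
        self.adj = {}    # vertex -> set of neighbours
        self.mate = {}   # vertex -> matched partner, if any

    def _try_match(self, u):
        """Greedily rematch a freed vertex by scanning its neighbours."""
        if u in self.mate:
            return
        for w in self.adj.get(u, ()):
            if w not in self.mate:
                self.mate[u] = w
                self.mate[w] = u
                return

    def insert(self, u, v):
        self.adj.setdefault(u, set()).add(v)
        self.adj.setdefault(v, set()).add(u)
        if u not in self.mate and v not in self.mate:
            self.mate[u] = v
            self.mate[v] = u

    def delete(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        if self.mate.get(u) == v:   # the deleted edge was matched
            del self.mate[u], self.mate[v]
            self._try_match(u)      # O(deg) repair step
            self._try_match(v)

    def matching(self):
        return {frozenset((u, v)) for u, v in self.mate.items()}


g = DynamicMaximalMatching()
for u, v in [(1, 2), (2, 3), (3, 1)]:
    g.insert(u, v)
print(len(g.matching()))   # 1: a maximal matching in a triangle
g.delete(1, 2)             # the matched edge disappears...
print(len(g.matching()))   # ...and an endpoint is greedily rematched
```

The interesting part of the talk's result is precisely avoiding this neighbour scan: the primal-dual framework charges repairs against a fractional dual solution to get polylogarithmic amortized updates.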
25 Apr 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Memory-hard functions and parallel graph pebbling
(Joël Alwen, Institute of Science and Technology Austria)
There is growing interest in the security community in Memory-Hard Functions (MHFs). These require moderate amounts of memory to compute on a general-purpose single-processor (i.e. sequential) machine, but cannot be *repeatedly* computed with significantly less memory amortized per instance on dedicated custom hardware (i.e. a *parallel* generalized circuit). For example, such functions are used in password-hashing and key-derivation to prevent brute-force attacks from being cost-effectively implemented on custom circuits, and in proofs-of-work for more egalitarian decentralized cryptocurrencies.
In (STOC 2015) Alwen and Serbinenko showed that, in order to construct a secure MHF, it suffices to find a constant in-degree directed acyclic graph (DAG) with high cumulative pebbling complexity in a simple game of parallel pebbling. Conversely, a wide class of candidate MHF constructions from the literature is obtained by fixing some particular (constant in-degree) DAG, and exhibiting an efficient way to pebble such a DAG immediately leads to a break of the corresponding MHF construction (i.e. a method of computing the MHF with low parallel memory complexity).
The first part of this talk will be aimed at providing an overview of this area of research. In particular, we will cover the following: - The motivation for and definition of MHFs. - The close connection to a certain parallel pebbling game over DAGs and a new pebbling complexity measure called the cumulative complexity (CC) of the DAG. - What won't work: line graphs, bit-reversal graphs, superconcentrators and stacks of superconcentrators. - A powerful parallel pebbling algorithm with low CC; in particular, we show how this gives us a non-trivial general upper bound on the CC of any DAG of a fixed size. - A method for lower-bounding the CC of a DAG using a well-studied combinatorial property of the DAG called its depth-robustness. - Finally, we conclude with two strong positive results, namely a pair of constant in-degree DAGs enjoying very high CC. Indeed, the second has maximal CC among constant in-degree DAGs of equal size; moreover, it can be sequentially pebbled with this same CC. Thus we obtain a provably secure MHF.
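As a toy illustration of the pebbling view (my own sketch, with the CC measure simplified to black pebbles: the sum, over time steps, of the number of pebbles on the graph), the code below pebbles a line DAG while discarding pebbles eagerly. At most two pebbles are ever in play, so CC is only about 2n: this is exactly why line graphs "won't work", whereas depth-robust DAGs force CC nearly quadratic in n.

```python
def cc(strategy):
    """Cumulative complexity of a pebbling strategy: the sum over
    recorded configurations of the number of pebbles on the graph."""
    return sum(len(config) for config in strategy)

def pebble_line(n):
    """Pebble the line DAG 0 -> 1 -> ... -> n-1, recording the
    configuration right after each placement and discarding a pebble
    as soon as its successor holds one."""
    strategy, pebbles = [], set()
    for v in range(n):
        pebbles.add(v)            # legal: predecessor v-1 is pebbled
        strategy.append(frozenset(pebbles))
        pebbles.discard(v - 1)    # predecessor no longer needed
    return strategy

# Compare against the naive strategy that never discards a pebble:
naive = [frozenset(range(v + 1)) for v in range(100)]
print(cc(pebble_line(100)), cc(naive))   # 199 vs 5050
```

A secure MHF corresponds to a DAG where no strategy, however clever or parallel, can get away with the cheap kind of bookkeeping shown by `pebble_line`.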
To demonstrate the power of these tools we will also briefly describe their implications for several of the most important MHF candidate constructions from the literature including the winner and several of the finalists of the recent multi-year international Password Hashing Competition. For each candidate we will see an attack strongly invalidating the conjectured security of the construction. We will also see a (weak) security proof for the construction showing that the attack is (essentially) optimal.
The second part of this talk will focus on some of the most important proof techniques underlying these results. In particular we will cover the following: - The "meta-node" technique for analysing the CC of random DAGs. - A method for in-degree reduction in graphs. - Lower-bounding CC using depth-robustness. - Using the "dispersal" property of a graph to lower-bound its CC. - Stacks of depth-robust graphs with maximal parallel CC which can nevertheless be sequentially pebbled with the same CC.
19 Apr 2016 at 13:00 in room 4523
Combinatorial Optimization and k-Set Packing
(Adam Schill Collberg, KTH-TCS)
For many interesting combinatorial optimization problems there are no known exact algorithms running faster than exponential time (in the input size), and trying to improve the running times of such algorithms is a vast research area. Another big research direction is to study algorithms which approximately solve these hard problems while achieving better running times (e.g. polynomial time).
In this talk I will begin by introducing the concepts of exact and approximation algorithms. Then I will review, at a high level, the progress within each of these research areas in the context of a classic and well-studied problem: k-Set Packing (k-uniform hypergraph matching). Lastly we shall look at some techniques used in approximation algorithms, in particular as applied to the k-Set Packing problem.
18 Apr 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Experimenting with CDCL SAT solvers and glue clauses
(Laurent Simon, Université de Bordeaux)
Tackling NP-complete problems in practice was believed hopeless 20 years ago. However, with the introduction of Conflict-Driven Clause Learning algorithms (CDCL for short) in the late 90's, we observed one of the most fascinating victories against hard problems. Despite these impressive results, however, the underlying reasons for these successes are only partially known. We will begin this talk by presenting a set of experiments showing why SAT solvers are so hard to study in practice.
In the second part of the talk, we will focus on one of the many ingredients of SAT solvers: the concept of glue clauses and Literal Block Distance (LBD). This measure of the quality of learnt clauses was introduced in 2009 and is now used in most CDCL solvers. However, despite its interest, the measure is not fully understood. We will present the concept of glue clauses as it was stated five years ago, and develop new insights into what may explain its performance, for instance by trying to find correlations between the blocks underlying the LBD measure and community structure.
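For concreteness, here is a minimal sketch of how LBD is computed (the clause and level map below are hypothetical; a real solver reads decision levels off its assignment trail):

```python
def lbd(clause, level):
    """Literal Block Distance: the number of distinct decision levels
    among the variables of a learnt clause.  `level` maps a variable
    to the decision level at which it was assigned (assumed solver
    state); clauses with LBD 2 are the 'glue clauses'."""
    return len({level[abs(lit)] for lit in clause})

# Hypothetical learnt clause whose four variables were assigned at
# decision levels 3, 3, 7 and 9: two literals share one block.
level = {1: 3, 2: 3, 4: 7, 5: 9}
print(lbd([-1, 2, -4, 5], level))   # -> 3
```

The intuition probed in the talk is that each "block" of literals assigned at the same level tends to come from a tightly connected part of the formula, which is what suggests comparing blocks with communities.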
11 Apr 2016 at 13:15 in room 4523
Modeling and Solving Code Generation for Real
(Christian Schulte, School of ICT, KTH Royal Institute of Technology, Sweden, and SICS)
This talk shows how to improve code generation in compilers by using constraint programming (CP) as a method for solving combinatorial optimization problems. It presents how instruction selection (selecting processor instructions for programs), register allocation (assigning program variables to processor registers), and instruction scheduling (reordering processor instructions to increase throughput) can be modeled and solved using CP. The talk covers instruction selection, the integration of register allocation and instruction scheduling, and future plans. The talk presents a combinatorial model that integrates global register allocation with instruction scheduling. The model covers advanced aspects such as ultimate coalescing, spill code optimization, register packing, and multiple register banks. The talk will sketch a graph-based universal program representation that unifies data and control flow for both programs and processor instructions. The representation is the essential prerequisite for a CP model for instruction selection. The model is demonstrated to be expressive in that it supports many processor features that are out of reach of state-of-the-art approaches, such as advanced branching instructions, multiple register banks, and SIMD instructions. The models are significant as they address the same aspects as traditional code generation algorithms, yet are based on simple models and can robustly generate optimal code. Joint work with Mats Carlsson, Roberto Castañeda Lozano, Frej Drejhammar, and Gabriel Hjort Blindell.
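To make the combinatorial flavour of these problems concrete, here is a toy sketch (my own, not one of the talk's CP models): brute-force optimal single-issue instruction scheduling over a small hypothetical dependence DAG. A constraint programming solver searches the same space of orders, but prunes it with constraint propagation instead of enumeration.

```python
from itertools import permutations

def makespan(order, deps, latency):
    """Issue one instruction per cycle in the given order; an
    instruction may only issue once all its predecessors' results
    are available."""
    finish, cycle = {}, 0
    for i in order:
        ready = max((finish[p] for p in deps[i]), default=0)
        cycle = max(cycle + 1, ready + 1)      # issue cycle of i
        finish[i] = cycle + latency[i] - 1
    return max(finish.values())

def best_schedule(deps, latency):
    """Enumerate every order compatible with the dependence DAG and
    keep the shortest schedule."""
    best = None
    for order in permutations(deps):
        done, ok = set(), True
        for i in order:
            if not set(deps[i]) <= done:       # dependence violated
                ok = False
                break
            done.add(i)
        if ok:
            m = makespan(order, deps, latency)
            if best is None or m < best[0]:
                best = (m, order)
    return best

# Hypothetical 4-instruction DAG: d depends on b and c, b depends on a;
# a is a 3-cycle operation, the rest are single-cycle.
deps = {'a': [], 'b': ['a'], 'c': [], 'd': ['b', 'c']}
latency = {'a': 3, 'b': 1, 'c': 1, 'd': 1}
print(best_schedule(deps, latency))   # -> (5, ('a', 'c', 'b', 'd'))
```

Hiding `c` in the latency shadow of `a` is exactly the kind of decision an optimal scheduler makes and a greedy one can miss; the CP models in the talk add register allocation and instruction selection constraints on top of this core scheduling structure.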
04 Apr 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Unconditional lower bounds for data structures
(Kasper Green Larsen, Aarhus University)
In the first part of this talk, we will introduce and motivate the cell probe model for proving data structure lower bounds. We will then survey some of the recent techniques for proving lower bounds in this model, with an emphasis on the results obtained by the speaker and co-authors. The talk will highlight the current limitations of our techniques and we will also briefly discuss work by the speaker on lower bounds in more restricted models of computation.
The second part of the talk is more technical and will be based on a FOCS'15 paper joint with Raphael Clifford (Bristol) and Allan Grønlund (Aarhus). The main focus here is a new type of threshold lower bound proved for the well-studied Multiphase Problem. The Multiphase Problem, presented by Patrascu at STOC'10, was one of the problems that really sparked the recently very popular discipline of proving conditional lower bounds. Our focus is on proving unconditional lower bounds for the Multiphase Problem in the regime of parameters where we can actually make useful statements. More specifically, we show that any data structure for the Multiphase Problem which insists on having a very fast query time of o(lg n/lg lg n) must have n^{1-o(1)} update time. This is a whole new type of lower bound, as previous techniques could only prove n^{eps} update time lower bounds, even when insisting on O(1) query time. We will also briefly touch on new lower bounds we prove for matrix-vector multiplication.
21 Mar 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Deterministic communication vs. partition number
(Marc Vinyals, Theory Group, KTH)
Alice is given a clique in a graph and Bob an independent set in the same graph. How much communication do they need to decide if these two sets of vertices intersect? This seemingly innocent question is connected to deep topics in communication complexity and analysis of Boolean functions.
In a breakthrough paper in FOCS 2015, Göös, Pitassi and Watson solved this and other problems by proving lower bounds in query complexity, and then giving an explicit way of amplifying query complexity lower bounds to communication complexity lower bounds. This solved a problem that had been open since 1979, and the paper has already generated a long (and growing) list of follow-up works that have made progress on other long-standing open problems in different areas of communication complexity and query complexity.
In this seminar, we will go over the GPW paper. During the first hour we will review the new results, and after the break we will present a detailed proof of their main theorem.
16 Mar 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Structural restrictions of CNF formulas: applications and limitations
(Florent Capelli, Université Paris Diderot)
It is well known that clausal restrictions of CNF formulas such as 2-SAT or Horn-SAT yield easy instances of the problem SAT. This is, however, not the case for harder problems such as #SAT, the problem of counting the satisfying assignments of a CNF formula: #2-SAT is already as hard as the general case. Fortunately, restricting the way the variables interact with the clauses has been a successful approach to finding large classes of formulas for which #SAT is solvable in polynomial time.
In the first part of this lunch seminar, I will give a broad picture of what can be done with these structural restrictions of CNF formulas. I will first present how such restrictions are defined and give an overview of the tractability results they enable for #SAT. I will then extend these results to the broader problem of knowledge compilation, that is, the problem of finding a good data structure representing a formula F that supports queries such as model counting or enumeration in polynomial time. This naturally raises the question of finding hard instances for such algorithmic techniques. We reformulate this question as proving lower bounds in knowledge compilation and answer it by giving new exponential lower bounds on the compilation of some families of CNF formulas.
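As a baseline for what such structural restrictions improve on, here is a generic exponential-time #SAT procedure by DPLL-style splitting (my own sketch, not one of the talk's algorithms):

```python
def count_models(clauses, n):
    """Count satisfying assignments of a CNF over variables 1..n by
    splitting on a variable and recursing; exponential in the worst
    case.  Clauses are lists of nonzero ints, -v meaning 'not v'."""
    def simplify(cls, lit):
        """Assign `lit` true: drop satisfied clauses, shrink others."""
        out = []
        for c in cls:
            if lit in c:
                continue                      # clause satisfied
            d = [l for l in c if l != -lit]
            if not d:
                return None                   # empty clause: conflict
            out.append(d)
        return out

    def rec(cls, free):
        if cls is None:
            return 0
        if not cls:
            return 2 ** free                  # remaining vars are free
        v = abs(cls[0][0])                    # split on a variable
        return rec(simplify(cls, v), free - 1) + \
               rec(simplify(cls, -v), free - 1)

    return rec(clauses, n)

# (x1 or x2) and (not x1 or x3) over 3 variables has 4 models.
print(count_models([[1, 2], [-1, 3]], 3))   # -> 4
```

The structural restrictions in the talk (on how variables and clauses interact) are precisely what lets one replace this blind splitting with dynamic programming that runs in polynomial time.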
In the second and more technical part of the talk, I will either present the algorithmic techniques in more detail or give a complete proof of one of the lower bounds mentioned above, depending on what the audience prefers. Most of the results presented in this talk were obtained in collaboration with Simone Bova, Stefan Mengel and Friedrich Slivovsky.
14 Mar 2016 at 12:00 in room 4523, Lindstedtsvägen 5
Verification of Bit-Vector Arithmetic using Finite Integer Algebras
(Priyank Kalla, University of Utah)
Finite-precision integer arithmetic plays an important role in many hardware and software systems, minimizing resource usage while maintaining the necessary precision. However, operating on these bit-vector (BV) data types can introduce subtle, unforeseen errors, causing incorrect functionality or even security vulnerabilities. With the widespread use of finite-precision arithmetic, from multimedia DSP to embedded automotive control, it is imperative to devise new techniques to efficiently model and verify such systems at higher levels of abstraction --- raising the abstraction from bits to words, yet maintaining precision.
A bit-vector of size "m" represents integer values reduced "(mod 2^m)". Therefore, BV arithmetic can be modeled as a system of polynomial functions over Z_{2^m}, and number-theoretic and algebraic techniques can be devised for verification. In this talk, I will describe decision procedures for verification of bit-vector arithmetic that lie at the crossroads of number theory and commutative algebra --- such as canonical simplification of polynomial functions, Newton's p-adic iterations, etc. We will look at the challenge of making such techniques practical, and also discuss their potential for integration with SMT solvers.
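A small illustration of the modeled semantics (an exhaustive check for tiny m, not the algebraic decision procedures of the talk): m-bit arithmetic really is arithmetic over Z_{2^m}, and some polynomials vanish identically there.

```python
def equiv_mod_2m(f, g, m):
    """Exhaustively check that two bit-vector expressions agree on
    every m-bit value.  Feasible only for small m, unlike algebraic
    decision procedures, but it pins down the semantics."""
    M = 1 << m
    return all(f(x) % M == g(x) % M for x in range(M))

m = 8
# Two's-complement negation: -x == ~x + 1 (mod 2^m).
print(equiv_mod_2m(lambda x: -x, lambda x: (x ^ (2**m - 1)) + 1, m))

# A vanishing polynomial over Z_{2^m}: 2^(m-1) * x * (x+1) == 0,
# because x*(x+1) is always even -- the kind of identity that
# canonical simplification of polynomial functions exploits.
print(equiv_mod_2m(lambda x: 2**(m - 1) * x * (x + 1), lambda x: 0, m))
```

Both checks print `True`; canonical forms over Z_{2^m} let a verifier conclude such equivalences symbolically, for any m, without the 2^m enumeration.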
11 Mar 2016 at 13:15 in room 4523, Lindstedtsvägen 5
Risk management meets model checking: compositional fault tree analysis via Markov automata
(Marielle Stoelinga, University of Twente)
How do we ensure that our railroads, nuclear power plants and medical devices are safe and reliable? That is the topic of risk analysis, and fault tree analysis is a very popular technique here, deployed by many institutions such as NASA, ESA, Honeywell, Airbus, the FDA, Toyota, and Shell.
In this presentation, I will elaborate on how the deployment of stochastic model checking can improve the capabilities of fault tree analysis, making it more powerful, flexible and efficient: I will present a compositional framework in which we can analyze a richer variety of questions via stochastic model checking of Markov automata; I will show how we obtain more compact models via bisimulation and graph rewriting techniques, and gain more flexibility in modeling power.
Finally, I will show how one can incorporate maintenance strategies, a crucial aspect in reliability engineering, reporting on our experience with the application and validation of these techniques in industrial practice; in particular in the railroad and nuclear domain.
08 Mar 2016 at 12:00 in room 4523, Lindstedtsvägen 5
SAT-Enabled Verification of State Transition Systems
(Karem Sakallah, University of Michigan and Qatar Computing Research Institute)
Modern conflict-driven clause-learning (CDCL) SAT solvers, introduced twenty years ago, have had a profound impact in many domains by enabling the solution of large-scale problems that were thought to be out of reach. It is now routine for modern SAT and SAT modulo Theories (SMT) solvers to tackle decision problems containing millions of variables and constraints. Verification of complex hardware and software systems is now increasingly facilitated by the automated reasoning capabilities of modern SAT technology.
In this seminar I argue that the CDCL paradigm is now sufficiently mature and attempts to improve it further can only yield incremental gains in performance and capacity. Instead, I propose to combine it with two equally powerful concepts to allow for scalable reasoning about exponentially-large state transition systems. The first concept, pioneered by the IC3 and later PDR incremental induction reachability solvers, culminates a decades-old quest for solving the so-called state explosion problem in model checking. The second concept, CounterExample-Guided Abstraction Refinement (CEGAR for short), provides an adaptive computational framework for managing complexity by a) judicious elimination of irrelevant details (abstraction/over-approximation) and by b) automatically filtering any false positives/spurious counterexamples (refinement).
After briefly describing the salient features of these two concepts I will illustrate their use, along with an underlying SAT/SMT engine, on two example applications of state transition systems: sequential hardware verification and detection of cross-site scripting (XSS) in PHP web servers. In both cases the goal is to show that all states reachable from a good initial state satisfy a given safety property or to produce a counterexample trace demonstrating violation of the property.
02 Feb 2016 at 13:30 in D2, Lindstedtsvägen 5, KTH
Scalable Software Testing and Verification Through Heuristic Search and Optimization: Experiences and Lessons Learned
(Professor Lionel C. Briand, University of Luxembourg, Luxembourg)
Testing and verification problems in the software industry come in many different forms, due to significant differences across domains and contexts. But one common challenge is scalability, the capacity to test and verify increasingly large, complex systems. Another concern relates to practicality. Can the inputs required by a given technique be realistically provided by engineers? This talk reports on 10 years of research tackling verification and testing as a search and optimization problem, often but not always relying on abstractions and models of the system under test. Our observation is that most of the problems we faced could be re-expressed so as to make use of appropriate search and optimization techniques to automate a specific testing or verification strategy. One significant advantage of such an approach is that it often leads to solutions that scale in large problem spaces and that are less demanding in terms of the level of detail and precision required in models and abstractions. Their drawback, as heuristics, is that they are not amenable to proof and need to be thoroughly evaluated by empirical means. However, in the real world of software development, proof is usually not an option, even for smaller and critical systems. In practice, testing and verification is a means to reduce risk as much as possible given available resources and time. Concrete examples of problems we have addressed and that I will cover in my talk include schedulability analysis, stress/load testing, CPU usage analysis, robustness testing, testing closed-loop dynamic controllers, and SQL Injection testing. Most of these projects have been performed in industrial contexts and solutions were validated on industrial software. There are, however, many other examples in the literature, a growing research trend that has given rise to a new field of study named search-based software testing.
29 Jan 2016 at 13:15 in room 4523
Foundations of Model-Based System Design
(Stavros Tripakis, Associate Professor at Aalto University and Adjunct Associate Professor at University of California, Berkeley)
Model-based design (MBD) is a design methodology that relies on three key elements: modeling (how to capture the system that we want), analysis (how to be sure that this is the system that we want before actually building it), and synthesis (how to build the system). In this talk I will present some recent work on two aspects of MBD: synthesis and compositionality. I will first present synthesis of distributed control protocols from scenarios and requirements. Automated synthesis of such protocols is a hard, generally undecidable, problem. Nevertheless, by allowing the designer to specify, in addition to a set of formal requirements, a set of example scenarios that illustrate how the protocol should behave in certain situations, we are able to fully automatically synthesize several simple distributed protocols.
I will then discuss compositional methods, which allow to build systems from smaller components. Such methods are not simply a desirable feature in system design: they are a must for building large and complex systems. A key ingredient for compositionality is that of an "interface". An interface abstracts a component, exposing relevant information while hiding internal details. I will give an overview of the many uses of interfaces in MBD, from modular code generation from hierarchical models, to incremental design with interface theories, to co-simulation and multiview modeling.
18 Jan 2016 at 13:15 in room 4523
Agile Quality Assurance (“in vivo” research in Software Engineering)
(Serge Demeyer, the University of Antwerp and the spokesperson for the ANSYMO research group)
Software is vital to our society and consequently the software engineering community is investigating means to produce reliable software. For a long, long time, reliable software was seen as software “without bugs”. Today however, reliable software has come to mean “easy to adapt” because of the constant pressure to change. As a consequence, organisations producing software must seek a delicate balance between two opposing forces: striving for reliability and striving for agility. In the former, organisations optimise for perfection; in the latter, they optimise for development speed.
This talk will investigate ways to reduce the tension between reliability and agility by exploiting so-called software configuration management systems. Today these systems are used rather passively: they monitor past and present activities, but are seldom used to predict what is likely to happen in the future. Yet with all the data in these systems it becomes feasible to answer questions like: which modules are likely to be affected by a given change request? Where should we focus our regression test activities? How severe is a given problem report? Who is the best person to handle a given bug report?
15 Dec 2015 at 13:15 in room 4523, Lindstedtsvägen 5
Engineering Next-Generation Data Systems for Secure, Smart, Connected Health Analytics
(Eric Rozier, University of Cincinnati)
Enabled by the broad availability of small-scale, inexpensive, and often commercial off-the-shelf sensors, a new field is developing around Smart and Connected Health. Fusing research from embedded systems, data science and engineering, and cybersecurity, new opportunities are developing for human augmentation. In this talk we will discuss recent work from the Trustworthy Data Engineering Lab towards building flexible, intelligent software and hardware infrastructure to improve the ability of individuals, physicians, and scientists to understand individualized health data. We will discuss the new Data Lineage Formalism being developed for automated data workflow and bias detection and correction, to allow robust and formal feature engineering and derivation. We will discuss the tradeoffs involved between data availability and privacy, introduce a new formal logic for privacy-preserving data operations, and demonstrate their performability and correctness, along with metrics for their improved privacy and suitability for high-assurance areas of data science. Lastly we will present real examples from health analytics and public health work conducted with the University of Cincinnati and the University of Chicago, and discuss areas for future work.
14 Dec 2015 at 12:00 in room 4523, Lindstedtsvägen 5
Linear temporal logic satisfiability checking
(Kristin Yvonne Rozier, University of Cincinnati)
Formal verification techniques are growing increasingly vital for the development of safety-critical software and hardware. Techniques such as requirements-based design and model checking have been successfully used to verify systems for air traffic control, airplane separation assurance, autopilots, logic designs, medical devices, and other functions that ensure human safety. Formal behavioral specifications written early in the system-design process and communicated across all design phases increase the efficiency, consistency, and quality of the system under development. We argue that to prevent introducing design or verification errors, it is crucial to test specifications for satisfiability.
In 2007, we established LTL satisfiability checking as a sanity check: each system requirement, its negation, and the set of all requirements should be checked for satisfiability before being utilized for other tasks, such as property-based system design or system verification via model checking. We demonstrated that LTL satisfiability checking reduces to model checking; an extensive experimental evaluation proved that for LTL satisfiability checking, the symbolic approach is superior to the explicit approach. However, the performance of the symbolic approach critically depends on the encoding of the formula. Since 1994, there had been essentially no new progress in encoding LTL formulas as symbolic automata for BDD-based analysis. We introduced a set of 30 symbolic automata encodings, demonstrating that a portfolio approach utilizing these encodings translates to significant, sometimes exponential, improvement over the standard encoding for symbolic LTL satisfiability checking. In recent years, LTL satisfiability checking has taken off, with others inventing exciting new methods to scale with increasingly complex systems. We revisit the benchmarks for LTL satisfiability checking that have become the de facto industry standard and examine the encoding methods that have led to leaps in performance. We highlight the past and present, and look to the future of LTL satisfiability checking, a sanity check that now has an established place in the development cycles of safety-critical systems.
10 Dec 2015 at 10:00 in CSC Library
Privacy to the People
(Daniel Bosk, Theory Group-KTH)
One of the pillars of democracy is the ability to voice an opinion. However, it is not hard to find numerous examples where someone with power represses others and people cannot express their opinions without severe consequences. In this talk we will cover two mechanisms that can be useful in this context: one for deniable text-messaging and another for privacy-preserving access control.
For the first, imagine Alice and Bob having an online (encrypted) conversation where they discuss the ruling regime in a negative way. The national intelligence agency records everything sent by anyone and when. An agent, Eve, suspects the topic of Alice and Bob's conversation is political and asks Alice for the key so Eve can decrypt and check. Alice wants to produce a key which decrypts the conversation to a benign topic. We developed a scheme which allows Alice and Bob to do this. It is based on the One-Time Pad, so Alice and Bob need a lot of key-material. They can exchange this using the NFC-capability of their smartphones. We formally prove that this scheme provides deniable yet authenticated encryption, that it is secure against replay and out-of-order attacks, and that Eve cannot distinguish whether Alice is lying or not with more than a negligible advantage.
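The core deniability trick behind a One-Time-Pad-based scheme fits in a few lines (a sketch of the idea only; the actual scheme adds authentication and replay/out-of-order protection, and the messages and key here are illustrative):

```python
import os

def xor(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

key        = os.urandom(32)                       # one-time pad
real_msg   = b"the regime must go tonight!!!!!!"  # 32 bytes
ciphertext = xor(real_msg, key)

# Under coercion, Alice computes a fake key that decrypts the very
# same ciphertext to a benign message of the same length:
benign_msg = b"shall we have dinner at eight???"  # 32 bytes
fake_key   = xor(ciphertext, benign_msg)

print(xor(ciphertext, key) == real_msg)         # True
print(xor(ciphertext, fake_key) == benign_msg)  # True
```

Because every key is equally likely under the OTP, the fake key is statistically indistinguishable from the real one, which is the intuition behind the formal deniability claim; the price is the large amount of key material, hence the NFC key exchange.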
For privacy-preserving access control, imagine a distributed storage system which Alice and Bob use to store their social media data. Eve can read all ciphertexts stored in this system. Now Alice and Bob want to hide the metadata: for whom certain data is encrypted, whether they access it, etc., so that Eve cannot target specific individuals. Our approach is to use Anonymous Broadcast Encryption. There are several possibilities for the construction; we explore the limits and trade-offs of these approaches.
After summarizing the work done for the two mechanisms above, I will outline possible future research directions and prioritize them to outline my thesis work.
03 Dec 2015 at 11:00 in room 1537, Lindstedtsvägen 3
Enhanced User Control for Online Social Networks
(Guillermo Rodríguez Cano, Theory Group, KTH)
Online social networks have become a practical and popular means for communication on the Internet, to some extent replacing traditional ones such as e-mail or bulletin boards. The free flow of large amounts of information, some of it of a personal and sensitive kind, has increased the need to know what happens with this data and how to achieve more control for the users.
We are interested in the relation between free flow of information and the desire to have control over data, for reasons such as privacy preservation or prevention of manipulation. We approach this relation from both sides, trying to understand how information flows and communication networks are formed as well as developing concrete mechanisms to control the information flow.
By understanding the characteristics of these flows in online social networks, we learn that content is pushing the growth of online social networks and that time is a driving factor for their evolution as well. For example, tight-knit short paths between users act as cores of trust, as the length of the path between users is a function of trust; another example is the distinctive ability of social structures to facilitate the rapid spread of information.
These characteristics reveal a multi-dimensional network whose abstract model is not simply a social graph. We find that relationships are of higher order, comprising individuals and content, for which graphs need additional representation artefacts. We investigate the suitability of hypergraphs as a representation tool to mitigate semantic ambiguities and the potential loss of information in cases where pairwise relationships are superseded by the group.
On the side of controlling data, we have worked on authenticity, accountability, and deniability aspects. Concretely, we developed protocols for usable password authentication in a decentralized manner as one building block for enabling increased user control over data by means of decentralized social networks. In this context, we also developed cooperation mechanisms for organizing events without the need for a trusted third party. Another mechanism was an anonymous document submission system with both unlinkability and provability for the respective parties.
Our work in progress aims at combining these two approaches, understanding information flow and devising concrete mechanisms for data control in the case of disproportionate influence of users on a news web service based on community participation.
30 Nov 2015 at 12:00 in room 4523, Lindstedtsvägen 5
Model checking, SAT and bit-vectors + Evaluating CDCL restart schemes
(Armin Biere, Johannes Kepler University Linz)
The lunch seminar part is titled "Model checking, SAT and bit-vectors" with abstract as follows.
Both SAT solving and Model Checking are considered to have saved the reputation of formal methods. We will highlight how their recent history is actually closely related. We further discuss important developments in both domains, mostly from the historical and practical point of view, but then will delve into the complexity of deciding bit-vector logic. This logic is the most important theory for bit-precise reasoning with SMT solvers and has many practical applications in testing and verification both of Hardware and Software. Related to solving certain bit-vector problems is the challenge to make bit-level arithmetic reasoning work.
After the break, there will be a more technical presentation on evaluating restart schemes for CDCL SAT solvers, which is based on joint work with Andreas Fröhlich.
Modern CDCL (conflict-driven clause learning) SAT solvers are used for many practical applications. One of the key ingredients of state-of-the-art CDCL solvers is an efficient restart scheme. The main contribution of this work is an extensive empirical evaluation of various restart strategies. We show that optimal static restart intervals are correlated not only with the satisfiability status of a certain instance, but also with the more specific problem class of the given benchmark. We further compare uniform restart intervals with the performance of non-uniform restart schemes, such as Luby restarts. Finally, we revisit the dynamic restart strategy used in Glucose and propose a new variant thereof, based on the concept of exponential moving averages. The resulting implementation in Lingeling improves state-of-the-art performance in SAT solving.
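The Luby scheme mentioned above has a compact textbook definition; here is a sketch of the sequence of restart-interval multipliers, using the standard recursive formulation (an illustration, not Lingeling's actual code):

```python
def luby(i):
    """i-th term (1-indexed) of the Luby sequence: 1,1,2,1,1,2,4,1,1,2,...

    If i is of the form 2^k - 1, the term is 2^(k-1); otherwise the
    sequence restarts from the beginning, shifted past the last full block.
    """
    k = 1
    while (1 << k) - 1 < i:          # smallest k with 2^k - 1 >= i
        k += 1
    if (1 << k) - 1 == i:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)  # recurse into the repeated prefix

print([luby(i) for i in range(1, 16)])
# → [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8]
```

A solver following this scheme restarts after base * luby(i) conflicts in its i-th run: short intervals are replayed often, while exponentially longer intervals are still tried infinitely often.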
23 Nov 2015 at 12:00 in room 4523, Lindstedtsvägen 5
Lower bounds: from circuits to QBF proof systems
(Ilario Bonacina, Theory Group, KTH)
A general and long-standing belief in the proof complexity community asserts that there is a close connection between progress in lower bounds for Boolean circuits and progress in proof size lower bounds for strong propositional proof systems. Although there are famous examples where a transfer of ideas and techniques from circuit complexity to proof complexity has been effective, a formal connection between the two areas has never been established. Here we provide such a formal relation between lower bounds for circuit classes and lower bounds for Frege systems for quantified Boolean formulas (QBF). Using the full spectrum of state-of-the-art circuit complexity lower bounds, we will prove lower bounds for very strong QBF proof systems (e.g. for what we call AC0[p]-FREGE + \forall red). Such lower bounds correspond, in the propositional case, to major open problems in proof complexity.
This talk is based on joint work with Olaf Beyersdorff and Leroy Chew (ECCC TR15-133 and ITCS'16, to appear).
16 Nov 2015 at 12:00 in room 4523, Lindstedtsvägen 5
Unconditional lower bounds for data structures
(Kasper Green Larsen, Aarhus University)
In the first part of this talk, we will introduce and motivate the cell probe model for proving data structure lower bounds. We will then survey some of the recent techniques for proving lower bounds in this model, with an emphasis on the results obtained by the speaker and co-authors. The talk will highlight the current limitations of our techniques, and we will also briefly discuss work by the speaker on lower bounds in more restricted models of computation. The second part of the talk is more technical and is based on a FOCS'15 paper joint with Raphaël Clifford (Bristol) and Allan Grønlund (Aarhus). The main focus here is a new type of threshold lower bound proved for the well-studied Multiphase Problem. The Multiphase Problem, presented by Patrascu at STOC'10, was one of the problems that sparked the recently very popular discipline of proving conditional lower bounds. Our focus is on proving unconditional lower bounds for the Multiphase Problem in the regime of parameters where we can actually make useful statements. More specifically, we show that any data structure for the Multiphase Problem that insists on a very fast query time of o(lg n / lg lg n) must have update time n^{1-o(1)}. This is a whole new type of lower bound, as previous techniques could only prove n^{eps} update time lower bounds, even when insisting on O(1) query time. We will also briefly touch on new lower bounds we prove for matrix-vector multiplication.
09 Nov 2015 at 12:00 in room 4523, Lindstedtsvägen 5
Oblivious and online matching via continuous linear programming
(Fei Chen, Theory Group, KTH)
Variants of the maximum matching problem have wide applications in the real world. Motivated by a kidney exchange program, where kidney transfer is expected to be performed right after patients and donors pass the compatibility tests, the oblivious matching problem was proposed allowing greedy matching algorithms only. Motivated by online advertising, where for each keyword arriving online the advertising system should immediately decide which ad to display to maximize the profit, the online matching setting was proposed that requires the algorithm to maintain a matching in an online fashion.
We study the oblivious and online matching problems. For oblivious matching, we prove that the Ranking algorithm has a performance ratio of at least 0.523 on arbitrary graphs. For edge-weighted online bipartite b-matching, we give a procedure to construct an asymptotically optimal algorithm. The analysis of both problems relies on a linear programming framework: we use a continuous linear programming approach to analyze the limiting behavior of ordinary LPs. In particular, our results for online bipartite b-matching are based on a generalized secretary problem. The continuous LP gives a clear perspective on the secretary problem, from which we are able to make a connection between the secretary problem and online matching.
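As a rough illustration of the oblivious setting, here is one natural reading of a Ranking-style matcher (a sketch under our own tie-breaking assumptions, not the analyzed algorithm itself): rank the vertices by a uniformly random permutation, then greedily match each still-free vertex, in rank order, to its free neighbour of best rank.

```python
import random

def ranking_matching(n, edges, rng):
    """Rank vertices 0..n-1 by a uniformly random permutation, then
    greedily match each still-free vertex (in rank order) to its
    free neighbour of smallest rank. Returns the matching size."""
    rank = list(range(n))
    rng.shuffle(rank)                       # rank[v] = random rank of vertex v
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    matched = {}
    for u in sorted(range(n), key=lambda v: rank[v]):
        if u in matched:
            continue
        free = [w for w in adj[u] if w not in matched]
        if free:
            w = min(free, key=lambda x: rank[x])  # best-ranked free neighbour
            matched[u], matched[w] = w, u
    return len(matched) // 2
```

The output is always a maximal matching, so it is trivially at least half a maximum matching on any graph; the 0.523 bound of the talk says that randomizing over permutations provably beats this 1/2 barrier.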
06 Nov 2015 at 10:15 in room E2, Lindstedsvägen 3
At the intersection of proof complexity and SAT solving
(Jakob Nordström, Theory Group, KTH)
This lecture is about a deceptively simple question: given a formula in ordinary propositional logic, where the variables can take the values TRUE or FALSE, and where they are combined by the logical operators AND, OR and NOT specifying how the variables must relate to one another, can one quickly decide by computation whether there is a way to assign truth values to the variables so that the formula is satisfied, i.e., so that all of its constraints are fulfilled?
This seemingly simple question is in fact one of the major open problems in theoretical computer science. The satisfiability problem (SAT) is what is technically known as NP-complete, which indicates that there are probably no efficient computational methods, or algorithms, that can handle all formulas. Proving that this really is the case, however, appears very hard. And this is of great interest not only within computer science -- around the turn of the millennium, the Clay Mathematics Institute named the problem one of the seven so-called Millennium Prize Problems that constitute truly major challenges in modern mathematics.
In sharp contrast to this, dramatic advances in more applied research over the last 15-20 years have led to very efficient computer programs, so-called SAT solvers, which can handle formulas with millions of variables. However, a deeper theoretical understanding is lacking of why these SAT solvers are so efficient and for which kinds of formulas they work well.
Proof complexity studies formal methods for deriving logical formulas and is one of the few tools available for the theoretical analysis of SAT solvers. The talk will give a (somewhat selective) survey of proof complexity, focusing on proof systems of particular interest for applied SAT solving, and will also discuss whether and how the study of such proof systems can lead to a better understanding of the strengths and weaknesses of modern SAT solvers.
12 Oct 2015 at 12:00 in room 4523, Lindstedtsvägen 5
Fast and Powerful Hashing using Tabulation + Deterministic Edge Connectivity in Near-Linear Time
(Mikkel Thorup, University of Copenhagen)
The first part of the presentation, titled "Fast and Powerful Hashing using Tabulation," is intended for a general audience.
Randomized algorithms are often enjoyed for their simplicity, but the hash functions employed to yield the desired probabilistic guarantees are often too complicated to be practical. Here we survey recent results on how simple hashing schemes based on tabulation provide unexpectedly strong guarantees.
Simple tabulation hashing dates back to Zobrist [1970]. Keys are viewed as consisting of c characters, and we have precomputed character tables h_1,...,h_c mapping characters to random hash values. A key x=(x_1,...,x_c) is hashed to h_1[x_1] ⊕ h_2[x_2] ⊕ ... ⊕ h_c[x_c]. This scheme is very fast with character tables in cache. While simple tabulation is not even 4-independent, it does provide many of the guarantees that are normally obtained via higher independence, e.g., for linear probing and Cuckoo hashing.
Next we consider twisted tabulation, where one input character is "twisted" in a simple way. The resulting hash function has powerful distributional properties: Chernoff-Hoeffding type tail bounds and a very small bias for min-wise hashing. This also yields an extremely fast pseudo-random number generator that is provably good for many classic randomized algorithms and data structures.
Finally, we consider double tabulation where we compose two simple tabulation functions, applying one to the output of the other, and show that this yields very high independence in the classic framework of Carter and Wegman [1977]. In fact, with high probability, for a given set of size proportional to that of the space consumed, double tabulation gives fully-random hashing. We also mention some more elaborate tabulation schemes getting near-optimal independence for given time and space.
While these tabulation schemes are all easy to implement and use, their analysis is not.
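A minimal sketch of the simple tabulation scheme described above, with our own parameter choices (4 characters of 8 bits each, 32-bit hash values):

```python
import random

def make_simple_tabulation(c=4, char_bits=8, out_bits=32, seed=0):
    """Simple tabulation: view a key as c characters and XOR together
    one precomputed random table entry per character position."""
    rng = random.Random(seed)
    tables = [[rng.getrandbits(out_bits) for _ in range(1 << char_bits)]
              for _ in range(c)]
    mask = (1 << char_bits) - 1

    def h(key):
        out = 0
        for i in range(c):
            # look up the i-th character of the key in its table
            out ^= tables[i][(key >> (i * char_bits)) & mask]
        return out

    return h
```

With the character tables resident in cache, each hash costs only c lookups and XORs, which is what makes the scheme so fast in practice.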
After the break, there will be a more technical presentation titled "Deterministic Edge Connectivity in Near-Linear Time," based on joint work with Ken-ichi Kawarabayashi.
We present a deterministic algorithm that computes the edge-connectivity of a graph in near-linear time. This is for a simple undirected unweighted graph G with n vertices and m edges. This is the first o(mn) time deterministic algorithm for the problem. Our algorithm is easily extended to find a concrete minimum edge-cut. In fact, we can construct the classic cactus representation of all minimum cuts in near-linear time.
The previous fastest deterministic algorithm, by Gabow from STOC'91, took ~O(m + k^2 n) time, where k is the edge connectivity, but k could be Omega(n).
At STOC'96 Karger presented a randomized near linear time Monte Carlo algorithm for the minimum cut problem. As he points out, there is no better way of certifying the minimality of the returned cut than to use Gabow's slower deterministic algorithm and compare sizes.
Our main technical contribution is a near-linear time algorithm that contracts vertex sets of a simple input graph G with minimum degree d, producing a multigraph with ~O(m/d) edges which preserves all minimum cuts of G with at least two vertices on each side.
In our deterministic near-linear time algorithm, we decompose the problem via low-conductance cuts found using PageRank à la Brin and Page (1998), as analyzed by Andersen, Chung, and Lang at FOCS'06. Normally such algorithms for low-conductance cuts are randomized Monte Carlo algorithms, because they rely on guessing a good start vertex. However, in our case, we have so much structure that no guessing is needed.
08 Oct 2015 at 13:15 in room 1537
Winning Cores in Parity Games
(Steen Vester, DTU)
Whether parity games can be solved by a polynomial-time algorithm is a well-studied problem in theoretical computer science which has not yet been resolved. In this talk we propose a new direction for approaching this problem based on the novel notion of a winning core.
We give two different, but equivalent, definitions of a winning core and show a number of interesting properties about them. This includes showing that winning cores can be computed in polynomial time if and only if parity games can be solved in polynomial time, implying that the computation of winning cores is in the intersection of NP and coNP.
We also present a deterministic polynomial-time approximation algorithm for solving parity games based on computing winning cores. It runs in time O(d * n^2 * m), where d is the number of colors, n is the number of states and m is the number of transitions. The algorithm returns under-approximations of the winning regions in parity games. It works remarkably well in practice: in our experiments it completely solves all benchmark games from the PGSolver framework and outperforms existing algorithms in most cases. In addition, correctness of the output of the algorithm can be checked efficiently.
07 Oct 2015 at 13:15 in room 4523 Lindstedtsvägen 5
Size-space trade-offs in proof complexity
(Susanna F. de Rezende, Theory Group, KTH)
The study of proof size, proof space, and size-space trade-offs has recently been an active line of research in proof complexity. This study was originally motivated by concerns in applied SAT solving, but has also led to questions of intrinsic interest to proof complexity.
The resolution proof system underlying current state-of-the-art SAT solvers is now fairly well-understood in this regard, but for stronger proof systems many open problems remain. In this talk, we will give an overview of what is known and then present our current research aimed at obtaining size-space trade-offs for the cutting planes proof system.
05 Oct 2015 at 12:00 in room 4523, Lindstedtsvägen 5
An average-case depth hierarchy theorem for Boolean circuits + Circuit complexity of small distance connectivity
(Li-Yang Tan, Toyota Technological Institute at Chicago)
In the first hour I will speak about recent work with Ben Rossman and Rocco Servedio. We prove an average-case depth hierarchy theorem for Boolean circuits over the standard basis of AND, OR, and NOT gates. Our hierarchy theorem says that for every d >= 2, there is an explicit n-variable Boolean function f, computed by a linear-size depth-d formula, which is such that any depth-(d-1) circuit that agrees with f on (1/2 + o_n(1)) fraction of all inputs must have size \exp(n^{\Omega(1/d)}). This answers an open question posed by Håstad in his Ph.D. thesis, and has applications in structural complexity and the analysis of Boolean functions. A key ingredient in our proof is a notion of random projections which generalize random restrictions.
After the break, I'd be happy to present the technical details and/or speak about related subsequent work with Xi Chen, Igor Oliveira, and Rocco Servedio on the st-connectivity problem:
We show that any depth-d circuit for determining whether an n-node graph has an s-to-t path of length at most k must have size n^{\Omega(k^{1/d}/d)}. The previous best circuit size lower bounds for this problem were n^{k^{\exp(-O(d))}} [Beame, Impagliazzo, Pitassi 1998] and n^{\Omega((\log k)/d)} [Rossman 2014]. Our lower bound is quite close to optimal, since a simple construction gives depth-d circuits of size n^{O(k^{2/d})} for this problem. Our proof is by reduction to a new lower bound on the size of small-depth circuits computing a skewed variant of the "Sipser functions" that have played an important role in classical circuit lower bounds of Sipser, Yao, and Håstad. A key ingredient in our proof is the use of random projections, an extension of random restrictions which allow us to obtain sharper quantitative bounds while employing simpler arguments, both conceptually and technically, than in the previous works.
29 Sep 2015 at 13:15 in room 1440(biblioteket)
Evolving System Variants into Software Product Lines: An Overview and the Road Ahead
(Roberto Lopez Herrejon)
For economic, technological and marketing reasons, today's software systems need to be built as software families, where each product implements a different combination of features. Software families are commonly called Software Product Lines (SPLs) and over the past three decades they have been the subject of extensive research and application. This endeavor has produced a plethora of methods, technologies and tools that have been applied in multiple industrial domains. Despite the documented success stories, there are still several open challenges that must be addressed to reap the potential benefits of SPLs. Salient among these challenges is providing robust and comprehensive end-to-end support for evolving existing system variants into SPLs, which is the prevailing scenario of SPL development in industry. In this talk, I will present an overview of the state-of-the-art in SPL development, what are some of the open issues in evolving system variants into SPLs, and how approaches such as search-based techniques can help to address them.
28 Sep 2015 at 13:15 in room 1537
Verification and Synthesis of Parameterized Systems
(Swen Jacobs, Reactive Systems Group Saarland University)
In this talk, I will present an overview of our work on model checking and synthesis of reactive systems that are composed of a parametric number of (finite-state) components. The starting point is the parameterized synthesis approach, which is based on cutoff results that reduce reasoning about parametric systems to reasoning about systems of bounded size. I will show how we extended existing cutoff results for token-passing systems to more general systems and specifications, and how we applied the parameterized synthesis approach with these extensions to obtain correct-by-construction implementations of the AMBA AHB protocol for an arbitrary number of communicating components.
18 Sep 2015 at 13:15 in room 4523
Building and Testing Software: Construction and Deconstruction
(Cyrille Artho, National Institute of Advanced Industrial Science and Technology (AIST) Japan)
Domain-Specific Languages make machine-readable models and specifications human-readable as well. They can support models of configurations, designs, or tests, and thus many aspects of software development.
Modbat is a model-based tester that is based on a domain-specific language that makes it easier and more efficient to define complex test models. It has built-in support for non-deterministic system actions and exceptions, making it ideal to model the behavior of networked software.
We have applied Modbat to a model of the Java network library for Java PathFinder (published at ASE 2013) and to a state-of-the-art SAT solver (HVC 2013, ASE 2015). Work in progress aims to analyze the cloud computing middleware Apache ZooKeeper.
16 Sep 2015 at 13:15 in room 522/(Fantum 22) (THM-CB)
From models and specifications to programs and constraints: increasing automation in verification
(Damien Zufferey, MIT CSAIL)
Automated reasoning and verification tools have been developed to help programmers produce correct software. However, the limit of decidability tells us that there is no silver bullet. In this talk, I will show how we can harness the synergy between programming abstractions, formalisms, and verification methods to prove functional properties of complex systems. First, I will speak about the verification of heap-manipulating programs. Separation logic (SL) has gained widespread popularity for specifying such programs. SL succinctly expresses how data structures are laid out in memory, and its discipline of local reasoning mimics human intuition about proving heap programs correct. Unfortunately, SL is a nonclassical logic and requires specialized provers. We present a reduction of decidable SL fragments to a decidable first-order theory that fits into the satisfiability modulo theories (SMT) framework. Our approach provides a way of integrating SL into verification tools with an SMT backend and of combining SL fragments with other decidable first-order theories. Then, I will briefly describe ongoing work on programming and verification of fault-tolerant distributed algorithms. These algorithms are notoriously difficult to implement correctly, due to asynchronous communication and faults. I will introduce PSync, a domain specific language based on the Heard-Of model, which views asynchronous faulty systems as synchronous ones with an adversarial environment that simulates faults. This high-level abstraction simplifies the implementation of fault-tolerant distributed algorithms, enables automated formal verification, and can be executed efficiently.
14 Sep 2015 at 12:00 in room 4523
Conflict-driven clause learning and pseudo-Boolean SAT solving
(Jan Elffers, Theory Group, KTH)
Conflict-driven clause learning (CDCL) is the most popular method to solve the Boolean satisfiability (SAT) problem in practice. This approach is based on backtracking search and uses clause learning to avoid solving the same subproblem multiple times. I will present the core algorithm and a number of extensions and optimizations that are incorporated in modern SAT solvers. I will also present possible directions for future research aimed at improving the understanding of this method.
The pseudo-Boolean SAT problem is a generalization of SAT with linear constraints instead of disjunctive clauses. This area is much less well developed. One approach is to use an extension of CDCL with a modified implementation of clause learning to handle linear constraints. I will present this approach as well, and I will go through an example execution of the method on the Pigeonhole Principle. I will also outline some interesting research questions regarding pseudo-Boolean SAT solving.
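To make the search skeleton concrete, here is a minimal DPLL-style solver: unit propagation plus chronological backtracking, i.e., roughly CDCL stripped of clause learning and all modern optimizations (an illustrative sketch, not any solver discussed in the talk):

```python
def solve(clauses, assignment=None):
    """Minimal DPLL: unit propagation + chronological backtracking.

    Clauses are lists of non-zero ints (DIMACS style: v means variable v
    is true, -v means it is false). Returns a satisfying assignment as a
    dict, or None if the formula is unsatisfiable. CDCL extends this by
    learning a new clause from each conflict and backjumping.
    """
    assignment = dict(assignment or {})
    while True:                          # unit propagation to fixpoint
        unit = None
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                val = assignment.get(abs(lit))
                if val is None:
                    unassigned.append(lit)
                elif val == (lit > 0):
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None              # conflict: clause falsified
            if len(unassigned) == 1:
                unit = unassigned[0]     # forced literal
                break
        if unit is None:
            break
        assignment[abs(unit)] = unit > 0
    free = {abs(l) for c in clauses for l in c} - set(assignment)
    if not free:
        return assignment                # all variables set, no conflict
    var = min(free)                      # naive decision heuristic
    for value in (True, False):
        result = solve(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

On a DIMACS-style encoding of the pigeonhole principle with three pigeons and two holes, this solver correctly reports unsatisfiability; without clause learning, however, such refutations blow up exponentially as the instance grows.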
09 Jul 2015 at 13:15 in room 4523
Hardness of Hypergraph Coloring
(Sangxia Huang, KTH and EPFL)
In hypergraph coloring, we are given a hypergraph, and the goal is to find a vertex coloring such that all hyperedges contain at least two colors. The focus of this talk is the computational complexity of finding a hypergraph coloring that minimizes the number of colors used. I will first survey the algorithmic and hardness results of the problem, and then present my recent result showing quasi-NP-hardness of coloring 2-colorable-8-uniform hypergraphs with 2^{(log N)^{1/4-o(1)}} colors. Our result is based on techniques developed recently in [Dinur, Guruswami '13], [Guruswami, Håstad, Harsha, Srinivasan, Varma '14] and [Khot, Saket '14].
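As a quick concretization of the objective (a toy checker of our own, not part of the result): a hypergraph coloring is proper exactly when no hyperedge is monochromatic.

```python
def is_proper_coloring(hyperedges, coloring):
    """A hypergraph coloring is proper if every hyperedge
    contains at least two distinct colors."""
    return all(len({coloring[v] for v in edge}) >= 2 for edge in hyperedges)

# 3-uniform hypergraph on vertices 0..3 with two hyperedges
edges = [(0, 1, 2), (1, 2, 3)]
print(is_proper_coloring(edges, {0: 0, 1: 0, 2: 1, 3: 1}))  # → True
print(is_proper_coloring(edges, {0: 0, 1: 0, 2: 0, 3: 1}))  # → False
```

The hardness result then says: even when a 2-coloring passing this check exists, finding a proper coloring with 2^{(log N)^{1/4-o(1)}} colors is quasi-NP-hard.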
06 Jul 2015 at 13:15 in room 4523
Efficient low-redundancy codes for correcting multiple deletions
(Venkatesan Guruswami, Carnegie Mellon University)
We consider the problem of constructing codes to recover from k-bit deletions with efficient encoding/decoding, for a fixed k. The single deletion case is well understood, with the Varshamov-Tenengolts code from 1965 giving an asymptotically optimal construction with ~ 2^n/n codewords of length n, i.e., at most log n bits of redundancy. However, even for the case of two deletions, there was no known explicit construction with redundancy less than n^{Omega(1)}.
For any fixed k, we construct a binary code with O(k^2 log k log n) redundancy that is capable of efficiently recovering from k deletions, which comes close to the optimal, non-constructive Theta(k log n) bound.
Joint work with Joshua Brakensiek and Samuel Zbarsky (Carnegie Mellon).
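The classical single-deletion construction mentioned above is easy to state and to verify by brute force for small lengths (a sketch; the function names are our own): VT_a(n) consists of all binary words whose position-weighted sum is congruent to a modulo n+1.

```python
from itertools import combinations, product

def vt_syndrome(word):
    """Varshamov-Tenengolts checksum: sum of i * x_i over positions i = 1..n."""
    return sum(i * bit for i, bit in enumerate(word, start=1))

def vt_codewords(n, a=0):
    """The code VT_a(n): all length-n binary words with syndrome = a (mod n+1)."""
    return [w for w in product((0, 1), repeat=n)
            if vt_syndrome(w) % (n + 1) == a]

def deletion_ball(word):
    """All distinct words obtainable from `word` by deleting one symbol."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}
```

Pairwise disjoint deletion balls mean that every received length-(n-1) word identifies a unique codeword, i.e., a single deletion can always be corrected; the code size matches the ~2^n/(n+1) (log n bits of redundancy) stated above.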
24 Jun 2015 at 13:15 in room 4523
ISA Specification
(Anthony Fox, Computer Lab, University of Cambridge)
Instruction Set Architecture (ISA) specifications seek to formalise the operational semantics of machine-code programs. The objective is to accurately model the programmer's view of an architecture. This is done using reference documentation from processor vendors such as Intel, ARM and IBM. These vendor descriptions are extensive and, in places, loose, freely mixing prose with sections of pseudocode. Examples include Intel 64 (3511 pages) and ARMv8 (5886 pages). Partial models of ISAs have been developed and used in various theorem provers, including ACL2, Coq, HOL4 and Isabelle/HOL.
ISA specifications are useful for: architecture simulation (design exploration), documentation and formal verification. ISA models in theorem provers are being used to verify ever more complex systems, such as:
- microprocessor designs (micro-architectures); - compilers; and - trusted code, such as low-level libraries, hypervisors and operating systems (microkernels).
This talk will give an overview of ISA specification work undertaken at Cambridge, all stemming from an ARM6 verification project that dates back to October 2000.
23 Jun 2015 at 13:00 in room 4523
Verification of an ARM hypervisor, from a tiny hypervisor to a real world platform
(Hamed Nemati, KTH)
For embedded systems security the ability to isolate processes belonging to different privilege domains is critical. This is difficult to achieve in a trustworthy manner using only OS level threading. An alternative is to use virtualization. System virtualization can establish isolation with a very small code base. This opens up for the possibility of using formal verification to produce secure systems that are fully verified and certified down to binary level.
The Prosper project started in 2011 with the goal of producing such a fully verified virtualization platform for the ARM processor family. Since I joined the project in 2012 several generations of verified hypervisors have been produced.
In the talk I give an overview of these results, with particular focus on recent work on memory management. This is important since virtualizing the memory management subsystem makes it possible to delegate to a hosted OS the management of its own memory hierarchy and to let it enforce its own access restrictions. The security proof has also been extended to allow dynamic management of memory. This allows the hypervisor to host a fully fledged operating system, like Linux, isolated from security critical components, e.g. a trusted run-time monitor.
The presentation is completed by a brief discussion of planned and currently ongoing work.
22 Jun 2015 at 12:00 in room 4523
Hardness of dynamic problems and the geometry of binary search trees
(Thatchaphol Saranurak, Theory Group, KTH)
This talk consists of two separated parts; both about dynamic data structures.
The first part is about proving hardness of dynamic graph problems. In dynamic graph problems, one wants to maintain some properties (e.g. connectivity, distances between nodes, maximum matching) of a given graph whose edges can be inserted and deleted. While there are several techniques for proving polylogarithmic lower bounds on the time required for an update, these techniques currently do not give polynomial (n^ε) lower bounds. One way to show hardness of such problems is therefore to assume some conjecture (e.g. SETH, 3SUM) and prove that an algorithm with fast update time would contradict it. I will survey the active development of these techniques for proving hardness based on conjectures. Then, I will talk about our new conjecture, called Online Matrix-Vector Multiplication, which tightly captures the hardness of many dynamic problems. This is joint work with Monika Henzinger, Sebastian Krinninger, and Danupon Nanongkai.
The second part is about the geometry of binary search trees. First, I will talk about a paper by Demaine et al. which shows how the "execution log" of a binary search tree algorithm can be represented and characterized by a point set in the plane with a simple property. This characterization suggests a natural algorithm called Greedy. Next, I will talk about our work which shows that, using forbidden-pattern theory, Greedy almost solves the 30-year-old Traversal Conjecture, up to a factor depending on α(n) (the inverse Ackermann function). This is based on joint work with Parinya Chalermsook, Mayank Goswami, Laszlo Kozma, and Kurt Mehlhorn.
03 Jun 2015 at 15:00 in room 1440
Detecting Redundant CSS Rules in HTML5: A Tree-Rewriting Approach
(Anthony Widjaja Lin, Yale-NUS College)
HTML5 applications normally have a large set of CSS (Cascading Style Sheets) rules for data display. Each CSS rule consists of a node selector and a declaration block (assigning values to selected nodes' display attributes). As web applications evolve, maintaining CSS files can easily become problematic. Some CSS rules will be replaced by new ones, but these obsolete (hence redundant) CSS rules often remain in the applications. Not only does this ``bloat'' the applications, but it also increases web browsers' processing time and download time. Most works on detecting redundant CSS rules in HTML5 applications do not consider the dynamic behaviours of HTML5 (specified in JavaScript); in fact, the only available method that takes these into account is dynamic analysis (a.k.a. testing), which cannot soundly prove redundancy of CSS rules. In this talk, I will describe a static analysis technique based on a monotonic tree-rewriting abstraction of HTML5 applications. The algorithm relies on an efficient reduction to an analysis of symbolic pushdown systems (for which highly optimised solvers are available, e.g., Moped and Bebop), which yields a fast method for checking CSS redundancy in practice. We have implemented a proof-of-concept prototype, TreePed, and our preliminary experimental results have been promising. I will present these and compare TreePed with existing tools.
This is a work that has recently been provisionally accepted at OOPSLA, joint with Matthew Hague (Royal Holloway, University of London) and Luke Ong (Oxford University).
02 Jun 2015 at 12:00 in room 4423
From graphs to matrices, and back: bridging the combinatorial and the continuous
(Aleksander Madry, MIT)
Over the last several years, a new type of fast algorithms for various fundamental graph problems has emerged. A key primitive employed in these algorithms is electrical flow computation, which corresponds to solving a Laplacian linear system.
In this talk, I will discuss how this approach enabled improvement over longstanding bounds for the maximum flow problem. This progress will also illustrate a broader emerging theme of employing optimization methods as an algorithmic bridge between our combinatorial and continuous understanding of graphs.
Additionally, I will briefly outline how this line of work brings a new perspective on some of the core continuous optimization primitives --- most notably, interior-point methods.
01 Jun 2015 at 13:00 in room 4523
Domain-Specific Guidance for Craig Interpolation
(Philipp Rümmer, Department of Information Technology, Uppsala University)
Model checkers use abstractions to reduce the state space of software programs or hardware designs, either to speed up the verification process, or as a way of handling infinite state space. One of the most common methods to construct or refine abstractions is Craig interpolation, a logical tool to extract concise explanations for the (bounded) unreachability of error locations or states. To ensure rapid convergence, model checkers rely on theorem provers to find suitable interpolants, or interpolants containing the right predicates, in a generally infinite lattice of interpolants for any given interpolation problem.
We have recently presented a semantic framework for systematically exploring interpolant lattices, based on the notion of interpolation abstraction. Our approach is solver-independent and works by instrumenting the interpolation query, and therefore does not require any changes to the theorem prover. While simple to implement, interpolation abstractions are extremely flexible and can incorporate domain-specific knowledge about promising interpolants, for instance in the form of interpolant templates used by the theorem prover. The framework can be used for a variety of logics, including arithmetic domains or programs operating on arrays or heap, and is also applicable to quantified interpolants. In this presentation, we consider in particular the application to the analysis of software programs and to (unbounded) Petri net models.
The presentation is based on joint work with Jerome Leroux and Pavle Subotic, accepted for publication in Acta Informatica (to appear in 2015). Parts of the work were earlier presented at FMCAD 2013.
18 May 2015 at 12:00 in room 4523
Symbol elimination for program analysis
(Laura Kovács, Chalmers University of Technology)
Automatic understanding of the intended meaning of computer programs is a very hard problem, requiring intelligence and reasoning. In this talk we describe applications of our symbol elimination methods in automated program analysis. Symbol elimination uses first-order theorem proving techniques in conjunction with symbolic computation methods, and derives non-trivial program properties, such as loop invariants and loop bounds, in a fully automatic way. Moreover, symbol elimination can be used as an alternative to interpolation for software verification.
07 May 2015 at 12:00 in room 4523
The notion of structure in real-world SAT solving
(Jesús Giráldez Crú, Artificial Intelligence Research Institute IIIA-CSIC, Barcelona)
Modern SAT solvers have experienced remarkable progress on solving industrial (or real-world) SAT instances. Most of the techniques have been developed after an intensive experimental testing process. The common wisdom in the SAT community is that these techniques succeed because they exploit some kind of hidden structure that industrial formulas actually have. Recently, there have been some attempts to analyze this structure in terms of complex networks, with the long-term aim of explaining the success of these SAT solving techniques, and possibly improving them.
In this talk, we analyze some structural properties of SAT instances, viewed as graphs. Namely, the scale-free structure, the community structure and the self-similar structure. In addition, we explore how these features are affected during the SAT solving search by effects of variable instantiation or clause learning. Finally, we present some applications in SAT based on these notions of structure.
04 May 2015 at 12:00 in room 4523
The parity of set systems under random restrictions with applications to exponential time problems
(Thore Husfeldt, Lund University and IT University of Copenhagen)
We reduce the problem of detecting the existence of an object to the problem of computing the parity of the number of objects in question. In particular, when given any non-empty set system, we prove that randomly restricting elements of its ground set makes the size of the restricted set system an odd number with significant probability. When compared to previously known reductions of this type, ours excel in their simplicity: For graph problems, restricting elements of the ground set usually corresponds to simple deletion and contraction operations, which can be encoded efficiently in most problems. We find three applications of our reductions:
1. An exponential-time algorithm: We show how to decide Hamiltonicity in directed n-vertex graphs with running time 1.9999^n provided that the graph has at most 1.0385^n Hamiltonian cycles. We do so by reducing to the algorithm of Björklund and Husfeldt (FOCS 2013) that computes the parity of the number of Hamiltonian cycles in time 1.619^n.
2. A new result in the framework of Cygan et al. (CCC 2012) for analyzing the complexity of NP-hard problems under the Strong Exponential Time Hypothesis: If the parity of the number of Set Covers can be determined in time 1.9999^n, then Set Cover can be decided in the same time.
3. A structural result in parameterized complexity: We define the parameterized complexity class ⊕W[1] and prove that it is at least as hard as W[1] under randomized fpt-reductions with bounded one-sided error; this is analogous to the classical result NP ⊆ RP^⊕P by Toda (SICOMP 1991).
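The core reduction can be illustrated empirically (a toy experiment of mine, not code from the paper): take a small set system, delete each ground-set element independently, keep only the sets that survive the restriction, and observe how often the surviving sub-system has odd size:

```python
import random

# Toy experiment (mine, not from the paper): for a non-empty set system F over
# a ground set U, randomly restricting U and keeping only the sets that avoid
# every deleted element leaves an odd number of sets a significant fraction of
# the time.
U = range(6)
F = [{0, 1}, {1, 2, 3}, {0, 4}, {2, 5}, {3, 4, 5}]  # arbitrary example system

trials, odd = 20000, 0
rng = random.Random(0)
for _ in range(trials):
    kept = {u for u in U if rng.random() < 0.5}   # random restriction of the ground set
    restricted = [S for S in F if S <= kept]       # sets surviving the restriction
    odd += len(restricted) % 2
print(f"odd parity in {odd / trials:.2%} of restrictions")
```

For graph problems, as the abstract notes, such restrictions correspond to deletions and contractions, which most algorithms can encode cheaply.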
This is joint work with Andreas Björklund and Holger Dell.
08 Apr 2015 at 12:00 in room 4523
Space for random CNFs in proof complexity
(Ilario Bonacina, Sapienza, Università di Roma)
The aim of this talk is to give an overview of some recent results about space lower bounds in proof complexity. In particular, for random 3-CNFs we will present a (reasonably complete) proof of an optimal monomial space lower bound for Polynomial Calculus with Resolution (PCR) and an optimal total space lower bound in Resolution. We will see how to apply a fairly general combinatorial framework for proving space lower bounds in Resolution and PCR and, in more detail, the difficulties we had to overcome for the particular case of random 3-CNFs.
A crucial point is a result independent from proof complexity: a variation of Hall's Theorem for matchings. We show that in bipartite graphs G with bipartition (L,R) and left-degree at most 3, L can be covered by certain families of disjoint paths, called VW-matchings, provided that L expands in R by a factor of 2-ε for any ε < 1/23.
This talk is mainly based on a joint work with Patrick Bennett, Nicola Galesi, Tony Huynh, Mike Molloy and Paul Wollan.
23 Mar 2015 at 12:00 in room 4523
An ultimate trade-off in propositional proof complexity
(Alexander Razborov, University of Chicago)
We exhibit an unusually strong trade-off between resolution proof width and tree-like proof size. Namely, we show that for any parameter k = k(n) there are unsatisfiable k-CNFs that possess refutations of width O(k), but such that any tree-like refutation of width n^{1-ε}/k must necessarily have double exponential size exp(n^{Ω(k)}). Conceptually, this means that there exist contradictions that allow narrow refutations, but in order to keep the size of such a refutation even within a single exponent, it must necessarily use a high degree of parallelism. Viewed differently, every tree-like narrow refutation is exponentially worse not only than wide refutations of the same contradiction, but of any other contradiction with the same number of variables. This seems to significantly deviate from the established pattern of most, if not all, trade-off results in complexity theory.
Our construction and proof methods combine, in a non-trivial way, two previously known techniques: the hardness escalation method based on substitution formulas and expansion. This combination results in a hardness compression approach that strives to preserve hardness of a contradiction while significantly decreasing the number of its variables.
19 Mar 2015 at 10:00 in room 4523
Approximating maximum independent set in sparse graphs
(Nikhil Bansal, Department of Mathematics and Computer Science Eindhoven University of Technology)
We consider the maximum independent set problem on graphs with maximum degree d. The best known result for the problem is an SDP-based O(d log log d / log d) approximation. It is also known that no o(d/log^2 d) approximation exists assuming the Unique Games Conjecture.
We will describe several new results for this problem. We show that the natural LP formulation for the problem, strengthened by poly-logarithmic levels of the Sherali-Adams(+) hierarchy, has an integrality gap of about O(d/log^2 d). We also show how to make this result algorithmic using d levels of the hierarchy. Finally, we give improved bounds on the integrality gap of the standard SDP formulation. A key ingredient here is a new Ramsey-theoretic result about the existence of non-trivial independent sets in graphs without large cliques.
The talk will give a broad overview of the techniques and concepts involved, such as LP/SDP hierarchies, and Ramsey-theoretic techniques such as Shearer's entropy-based approach and the nibble method.
12 Mar 2015 at 12:00 in room 4523
Limitations of algebraic approaches to graph isomorphism testing
(Christoph Berkholz, RWTH Aachen and Theory Group, KTH)
We investigate the power of algebraic techniques to test whether two given graphs are isomorphic. In the algebraic setting, one encodes the graphs into a system of polynomial equations that is satisfiable if the graphs are isomorphic. Afterwards, one tries to test satisfiability of this system using, for example, the Gröbner basis algorithm. In some cases this can be done in polynomial time, in particular, if the equations admit a constant degree refutation in algebraic proof systems such as Nullstellensatz or Polynomial Calculus.
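A miniature version of this encoding (my own sketch; the variable names are hypothetical) uses 0/1 variables x[i][u] meaning "vertex i of G maps to vertex u of H", with row/column constraints and edge-compatibility equations. Here the 0/1 solutions are enumerated naively instead of via Gröbner bases:

```python
from itertools import product, permutations

# Toy polynomial-system encoding of graph isomorphism (my own sketch).
# The system over {0,1} is satisfiable exactly when G and H are isomorphic.
def iso_system_satisfiable(G, H):
    n = len(G)
    for perm in permutations(range(n)):          # brute-force the 0/1 solutions
        x = [[int(perm[i] == u) for u in range(n)] for i in range(n)]
        # row/column constraints: sum_u x[i][u] = 1 and sum_i x[i][u] = 1
        ok = all(sum(x[i]) == 1 for i in range(n))
        ok &= all(sum(x[i][u] for i in range(n)) == 1 for u in range(n))
        # edge compatibility: x[i][u] * x[j][v] = 0 whenever G[i][j] != H[u][v]
        ok &= all(x[i][u] * x[j][v] == 0
                  for i, j, u, v in product(range(n), repeat=4)
                  if G[i][j] != H[u][v])
        if ok:
            return True
    return False

path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]         # path on 3 vertices
relabeled = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]    # same path, vertices permuted
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
print(iso_system_satisfiable(path, relabeled), iso_system_satisfiable(path, triangle))
```

The algebraic approach studied in the talk instead asks when the *unsatisfiability* of such a system has a low-degree certificate in Nullstellensatz or Polynomial Calculus.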
We compare this approach to other known relaxations and show that the k-dimensional Weisfeiler-Lehman Algorithm can be characterized in terms of an algebraic proof system over the reals that lies between degree-k Nullstellensatz and degree-k Polynomial Calculus. Furthermore, we prove a linear lower bound on the Polynomial Calculus degree of Cai-Fürer-Immerman graphs, which holds over any field of characteristic different from two.
This is joint work with Martin Grohe.
26 Jan 2015 at 12:00 in 4523
Distributed graph algorithms and lower bounds
(Danupon Nanongkai, Theory Group, KTH)
This talk will focus on distributed approximation algorithms for solving basic graph problems such as shortest paths, minimum spanning trees, and minimum cut. I will cover:
- Survey of the recent progress.
- Open problems and some related conjectures.
- A basic technique for proving lower bounds by a reduction from two-party communication complexity based on [1].
If time permits, I will touch some algorithmic techniques as well (e.g. [2,3]). No background in distributed algorithms will be assumed in this talk.
[1] Atish Das Sarma, Stephan Holzer, Liah Kor, Amos Korman, Danupon Nanongkai, Gopal Pandurangan, David Peleg, Roger Wattenhofer: Distributed Verification and Hardness of Distributed Approximation. SIAM J. Comput. 41(5): 1235-1265 (2012)
[2] Danupon Nanongkai: Distributed Approximation Algorithms for Weighted Shortest Paths. STOC 2014: 565-573
[3] Danupon Nanongkai, Hsin-Hao Su: Almost-Tight Distributed Minimum Cut Algorithms. DISC 2014: 439-453
08 Dec 2014 at 12:00 in room 4523
(2+eps)-SAT is NP-hard
(Per Austrin, KTH)
We prove the following hardness result for a natural promise variant of the classical CNF-satisfiability problem: given a CNF-formula where each clause has width w and the guarantee that there exists an assignment satisfying at least g = w/2 - 1 literals in each clause, it is NP-hard to find a satisfying assignment to the formula (that sets at least one literal to true in each clause). On the other hand, when g = w/2, it is easy to find a satisfying assignment via simple generalizations of the algorithms for 2-SAT.
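To make the promise concrete, here is a small checker (my own sketch, using DIMACS-style signed-integer literals) that verifies an assignment satisfies at least g literals in every clause:

```python
# Illustrative checker for the promise in the talk (my own sketch).
# Literals are nonzero ints, DIMACS-style: v means variable v, -v its negation.
def literals_satisfied(clause, assignment):
    """Count the literals of the clause set to true under the assignment."""
    return sum(1 for lit in clause if assignment[abs(lit)] == (lit > 0))

formula = [[1, 2, -3, 4], [-1, 2, 3, -4]]        # clauses of width w = 4
assignment = {1: True, 2: True, 3: False, 4: True}
g = 4 // 2 - 1                                    # the hard regime: g = w/2 - 1
assert all(literals_satisfied(c, assignment) >= g for c in formula)
print("assignment satisfies at least", g, "literals in every clause")
```

The hardness result says that even given the promise that such an assignment exists, finding an ordinary satisfying assignment (one true literal per clause) is NP-hard.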
We also generalize this to prove strong NP-hardness for discrepancy problems with small size sets.
03 Nov 2014 at 12:00 in room 4523
Polynomial identity testing of read-once oblivious algebraic branching programs
(Michael Forbes, Simons Institute for the Theory of Computing at UC Berkeley)
Polynomial Identity Testing (PIT) is the problem of identifying whether a given algebraic circuit computes the identically zero polynomial. It is well-known that this problem can be solved with a small probability of error by testing whether the circuit evaluates to zero on random evaluation points. Recently, there has been much interest in solving this problem deterministically, because it has close connections with circuit lower bounds, and this problem is now one of the frontiers of the area of pseudorandomness.
In this talk we will discuss recent progress in this area, in particular focusing on a model of algebraic computation known as the "read-once oblivious algebraic branching program" which has been the focus of most PIT research for the past several years. Developing PIT algorithms for this class is a natural algebraic analogue of derandomizing small-space computation (the RL vs L question), and this class of computation naturally has a linear-algebraic flavor. I will discuss deterministic algorithms for determining if computations in this model compute the zero polynomial, and will expose the linear algebraic nature of this question. In particular, I will construct a natural pseudorandom object from linear algebra called a "rank extractor" and show how it yields the desired PIT algorithms.
27 Oct 2014 at 12:00 in room 1537
Easy generation and efficient validation of proofs for SAT and QBF
(Marijn Heule, University of Texas at Austin)
Several proof systems have been proposed to verify results produced by satisfiability (SAT) and quantified Boolean formula (QBF) solvers. However, existing proof systems are not very suitable for validation purposes: It is either hard to express the actions of solvers in those systems or the resulting proofs are expensive to validate. We present two new proof systems (one for SAT and one for QBF) which facilitate validation of results in a time similar to proof discovery time. Proofs for SAT solvers can be produced by making only minor changes to existing conflict-driven clause-learning solvers and their preprocessors. For QBF, we show that all preprocessing techniques can be easily expressed using the rules of our proof system and that the corresponding proofs can be validated efficiently.
24 Oct 2014 at 10:30 in 4523
A formal approach to autonomic systems programming: The SCEL Language
(Rocco De Nicola, IMT - Institute for Advanced Studies, Lucca)
The autonomic computing paradigm has been proposed to cope with the size, complexity and dynamism of contemporary software-intensive systems. The challenge for language designers is to devise appropriate abstractions and linguistic primitives to deal with the large dimension of systems, and with their need to adapt to changes in the working environment and to evolving requirements. We propose a set of programming abstractions that make it possible to represent behaviours, knowledge and aggregations according to specific policies, and to support programming context-awareness, self-awareness and adaptation. Based on these abstractions, we define SCEL (Software Component Ensemble Language), a kernel language whose solid semantic foundations also lay the basis for formal reasoning about the behaviour of autonomic systems. To show the expressiveness and effectiveness of SCEL's design, we present a Java implementation of the proposed abstractions and show how it can be exploited for programming a robotics scenario that is used as a running example for describing the features and potential of our approach.
20 Oct 2014 at 12:00 in room 4523
Lifting locally consistent partial solutions to a global solution
(Irit Dinur, Weizmann Institute of Science)
We are given a collection of (alleged) partial views of a function. We are promised "local consistency", i.e., that the partial views agree on their intersection with probability p. The question is whether the partial views can be *lifted* to a global function f, i.e. whether a p' fraction of the partial views agree with f (aka "global consistency").
This scenario captures "low degree tests" and "direct product tests", both studied for constructions of PCPs. We describe other possible settings where a lifting theorem may hold. We will relate this to questions on proving a "derandomized" parallel repetition theorem, and the sliding scale conjecture.
02 Oct 2014 at 15:00 in 522/Fantum
Information Flow Monitoring as Abstract Interpretation for Relational Logic
(David Naumann, Stevens Institute of Technology)
A number of systems have been developed for dynamic information flow control (IFC). In such systems, the security policy is expressed by labeling input and output channels; it is enforced by tracking and checking labels on data. Systems have been proven to enforce some form of noninterference (NI), formalized as a property of two runs of the program. In practice, NI is too strong and it is desirable to enforce some relaxation of NI that allows downgrading under constraints that have been classified as `what', `where', `who', or `when' policies. To encompass a broad range of policies, relational logic has been proposed as a means to specify and statically enforce policy. This paper shows how relational logic policies can be dynamically checked. To do so, we provide a new account of monitoring, in which the monitor state is viewed as an abstract interpretation of sets of pairs of program runs.
Joint work with Andrey Chudnov and George Kuan; appeared in CSF 2014
1 Oct 2014 at 12:00 in 4523
Parallel repetition from fortification
(Dana Moshkovitz, MIT)
The Parallel Repetition Theorem upper-bounds the value of a repeated (tensored) two prover game in terms of the value of the base game and the number of repetitions. In this work we give a simple transformation on games — "fortification" — and show that for fortified games, the value of the repeated game decreases perfectly exponentially with the number of repetitions, up to an arbitrarily small additive error. Our proof is combinatorial and short.
As corollaries, we obtain: 1. Starting from a PCP Theorem with soundness error bounded away from 1, we get a PCP with arbitrarily small constant soundness error. In particular, starting with the combinatorial PCP of Dinur, we get a combinatorial PCP with low error. The latter can be used for hardness of approximation as in the work of Håstad. 2. Starting from the work of the author and Raz, we get a projection PCP theorem with the smallest soundness error known today. The theorem yields nearly a quadratic improvement in the size compared to previous work.
19 Sep 2014 at 10:30 in 4523
Synthesis of concurrent programs using genetic programming
(Doron Peled, Bar Ilan University)
We present a method to automatically generate concurrent code using genetic programming, based on automatic verification. As the problem of constructing concurrent code is in general undecidable, the user needs to intervene by tuning various parameters and supplying a specification and hints that steer the search for correct code in the right direction. We demonstrate how various hard-to-program protocols are generated using our method and the tool we developed. We show how a commonly used protocol for coordinating concurrent interactions was found to be incorrect using our tool, and was subsequently fixed.
23 Jun 2014 at 12:00 in 4523
Indistinguishability obfuscation from semantically-secure multilinear encodings
(Rafael Pass, Cornell and KTH Royal Institute of Technology)
The goal of program obfuscation is to "scramble" a computer program, hiding its implementation details while preserving functionality. Unfortunately, the "dream" notion of security, guaranteeing that obfuscated code does not reveal any information beyond black-box access to the original program, has run into strong impossibility results, and is known to be unachievable for general programs (Barak et al, JACM 2012). Recently, the first plausible candidate for general-purpose obfuscation was presented (Garg et al, FOCS 2013) for a relaxed notion of security, referred to as indistinguishability obfuscation; this notion, which requires that obfuscations of functionally equivalent programs are indistinguishable, still suffices for many important applications of program obfuscation.
We present a new hardness assumption—the existence of semantically secure multilinear encodings—which generalizes a multilinear DDH assumption and demonstrate the existence of indistinguishability obfuscation for all polynomial-size circuits under this assumption (and the LWE assumption). We rely on the beautiful candidate obfuscation constructions of Garg et al (FOCS'13), Brakerski and Rothblum (TCC'14) and Barak et al (EuroCrypt'14) that were proven secure only in idealized generic multilinear encoding models, and develop new techniques for demonstrating security in the standard model, based on semantic security of multilinear encodings (which trivially holds in the generic multilinear encoding model).
Joint work with Karn Seth and Sidharth Telang.
16 Jun 2014 at 12:00 in 4523
Formulas vs. circuits for small distance connectivity
(Benjamin Rossman, National Institute of Informatics, Tokyo)
Are poly-size boolean circuits strictly more powerful than poly-size boolean formulas? This question (also known as NC^1 vs. P) is one of the central open problems in complexity theory. We can also consider versions of this question for restricted classes of circuits. In the monotone setting, a super-polynomial separation of formulas vs. circuits was shown by Karchmer and Wigderson (1988). In recent work, we give the first super-polynomial separation in the (non-monotone) bounded-depth setting.
Our main result is a lower bound showing that depth-d formulas solving the Distance-k st-Connectivity problem require size n^{Ω(log k)} for all k <= loglog n and d <= log n/(loglog n)^O(1). In contrast, this problem has circuits of size n^{O(1)} and depth O(log k) by the "recursive doubling" method of Savitch. As a corollary, it follows that depth-d circuits of size S cannot be simulated by depth-d formulas of size S^{o(d)} for all d <= logloglog S (previously this was not known for any unbounded d → ∞).
The first part of the talk will be a gentle introduction to the question of formulas vs. circuits and the Distance-k st-Connectivity problem. After the break, I will give an overview of the new proof technique.
04 Jun 2014 at 15:00 in 4523
Real Algebraic Geometry in Computational Game Theory
(Peter Bro Miltersen, Aarhus University, Denmark)
We discuss two recent applications of Real Algebraic Geometry in Computational Game Theory: 1) A tight worst case upper bound on the running time of the strategy iteration algorithm for concurrent reachability games. 2) Polynomial time equivalence between approximating a Nash equilibrium and approximating a trembling hand perfect equilibrium of a multi-player game in strategic form. The applications appear in joint works with Kousha Etessami, Rasmus Ibsen-Jensen, Kristoffer Arnsfelt Hansen, Michal Koucky, Niels Lauridsen, Troels Bjerre Soerensen and Elias Tsigaridas.
02 Jun 2014 at 13:15 in room 1537
On pollution attacks in network coding based systems
(Frederique Oggier, NTU Singapore)
Network coding based systems are known to enjoy a larger data throughput than routing based systems. However, they are also more vulnerable to pollution attacks. In this talk, we will present (1) homomorphic authentication codes as a way to mitigate pollution attacks in network coding based multicast networks, and (2) a study of how pollution attacks disrupt distributed storage systems relying on network coding mechanisms for repair.
26 May 2014 at 13:15 in 4523
Dynamic Distributed Virtual Systems
(Stefan Schmid, TU Berlin & T-Labs)
Virtualization is a powerful paradigm in computer science. It makes it possible to abstract computer systems and applications from the constraints and heterogeneity of the underlying hardware and infrastructure components. Virtualization is also the main paradigm behind the success of cloud computing. However, today, we are still far from exploiting the full potential of virtualization. First, current systems are only partially virtualized and do not provide true performance-isolation guarantees, leading to unpredictable performance. Second, the mapping and resource allocation of virtual systems is often static; this is problematic in the light of the increasingly dynamic nature of today's applications as well as the more frequent hardware failures occurring at larger scale. Third, virtual systems are constrained to a single site; the spatial distribution of resources is not exploited.
In this talk, I will give an overview of our project "CloudNet", in which we are developing the foundations of more "Dynamic and Distributed Virtual Systems". In particular, I will give a quick overview of the CloudNet architecture and its economic roles, and then present in more detail the resource allocation and embedding problems we have addressed so far. The talk will have an algorithmic touch: we will present different online algorithms, sketch their analysis, and also discuss security aspects.
The talk is, however, aimed at a broad audience, and does not assume detailed knowledge of theoretical computer science concepts.
14 May 2014 at 13:15 in 4523
Recommender Systems, Mathematics and Graphs
(Roelof Pieters, Vionlabs R&D)
Recommender Systems have become extremely common in recent years. A whole subfield of science, a yearly conference, and widespread use in industry have pushed their development forward. This short seminar will focus on the various uses of Recommender Systems (RecSys) and the different approaches to information selection, generally categorized as Collaborative Filtering, Content-based Filtering or Context Filtering.
The talk will give insight into some of the mainstream mathematical models used to solve various problems in information selection and representation, e.g. cold start, scalability, sparsity, and similarity versus diversity, as well as some more experimental ones developed at Vionlabs.
After a general introduction to the main principles and algorithms used in RecSys in its various shapes, the talk will, time permitting, also show how graph theory and graph databases have leveled the playing field, making it possible for low-budget machines to sift through and do calculations on terabytes of data, a capability previously reserved for large-scale enterprise-type machines. Here the talk will also point to some of the faster approximation algorithms made possible by modeling data as graphs.
As the talk is part of admission to the PhD program in Theoretical Computer Science, it will be heavy in math, while still trying to explain the more general dynamics and uses of the formulas and algorithms sketched. In some cases IPython Notebook will be used to show actual working code of typical RecSys algorithms (Clustering, Matrix Factorization, etc.) on a subset of the MovieLens dataset.
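As a taste of the kind of code such a demo might contain, here is a minimal matrix-factorization sketch trained by stochastic gradient descent (illustrative only, with made-up ratings; not Vionlabs code):

```python
import random

# Miniature matrix factorization via SGD (illustrative sketch, made-up data):
# learn user factors U and item factors V so that dot(U[u], V[i]) approximates
# the observed ratings r in (user, item, rating) triples.
random.seed(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
k, lr, reg = 2, 0.05, 0.02                       # rank, learning rate, L2 weight
U = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(3)]
V = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(3)]

for epoch in range(500):
    for u, i, r in ratings:
        err = r - sum(U[u][f] * V[i][f] for f in range(k))
        for f in range(k):                        # simultaneous gradient step
            U[u][f], V[i][f] = (U[u][f] + lr * (err * V[i][f] - reg * U[u][f]),
                                V[i][f] + lr * (err * U[u][f] - reg * V[i][f]))

rmse = (sum((r - sum(U[u][f] * V[i][f] for f in range(k))) ** 2
            for u, i, r in ratings) / len(ratings)) ** 0.5
print(f"training RMSE: {rmse:.3f}")
```

Real systems add user/item bias terms and validate on held-out ratings; this sketch only shows the core factorization loop.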
14 May 2014 at 12:00 in 1537
Separating path systems
(Victor Falgas-Ravry, Umeå University)
Let G be a graph on n vertices. A family F of paths in G constitutes a separating system of G if for every pair of distinct edges e,f in E(G) there exists a path p in F which contains exactly one of e and f. How small a separating path system can we find?
This question arises naturally in the context of network design. The graph G represents a communication network in which one link is defective; to identify this link, we can send messages between nodes along predetermined paths. If a message does not reach its destination, then we deduce that the defective link lies on the corresponding path. A minimal separating path system thus allows us to determine precisely which link is defective using a minimal number of messages.
We show that for asymptotically almost all n-vertex graphs, we can find a separating system containing at most 48n paths. In addition we prove some exact extremal results in the case where G is a tree.
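The definition lends itself to a direct brute-force check (my own toy code; edges are abstracted to labels and each path is given by its edge set):

```python
from itertools import combinations

# Brute-force check (illustrative) that a family of paths separates the edges
# of a graph: every pair of distinct edges must be split by some path.
def is_separating(edges, paths):
    """paths: each path represented as the set of edge labels it contains."""
    return all(any((e in p) != (f in p) for p in paths)
               for e, f in combinations(edges, 2))

# C4: cycle a-b-c-d-a with edges labelled 0..3 in cyclic order.
edges = [0, 1, 2, 3]
paths = [{0, 1}, {1, 2}, {2, 3}]   # three paths of two consecutive edges each
print(is_separating(edges, paths))
```

In the network-design reading of the abstract, each failed path narrows down the defective link to the intersection of the paths that failed.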
This is joint work with Teeradej Kittipassorn, Daniel Korandi, Shoham Letzter and Bhargav Narayanan. Similar results have recently and independently been obtained by Balogh, Csaba, Martin and Pluhar.
12 May 2014 at 12:00 in 4523
Models for wireless algorithms
(Magnús M. Halldórsson, Reykjavik University)
The design and analysis of algorithms requires appropriate models — models that capture reality, yet are algorithmically usable; general, yet analyzable. The wireless setting has proved most challenging in this regard.
We survey some of the recent progress on fundamental problems in the SINR (or physical) model, including link capacity and scheduling, aggregation, and the relative value of power control.
The basic SINR model, however, still makes unrealistic assumptions that hold only in idealistic situations. We outline how to allow for arbitrary static environments while maintaining comparable performance guarantees with what holds in the basic SINR model. We might therefore be approaching an algorithmic model that captures reality with high fidelity while maintaining generality and analytic feasibility.
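For concreteness, feasibility of a set of simultaneous links in the physical model can be checked directly; the code below uses the standard SINR formula with parameter values chosen by me for illustration:

```python
# Toy SINR feasibility check (standard physical-model formula; all numeric
# parameters are illustrative choices of mine).  Link i succeeds if
#   (P_i / d_ii^alpha) >= beta * (noise + sum_{j != i} P_j / d_ji^alpha).
def sinr_feasible(power, dist, alpha=3.0, noise=1e-9, beta=2.0):
    n = len(power)
    for i in range(n):
        signal = power[i] / dist[i][i] ** alpha
        interference = sum(power[j] / dist[j][i] ** alpha
                           for j in range(n) if j != i)
        if signal < beta * (noise + interference):
            return False
    return True

# Two well-separated links: short sender-receiver distances, long cross distances.
print(sinr_feasible(power=[1.0, 1.0],
                    dist=[[1.0, 50.0],
                          [50.0, 1.0]]))
```

Scheduling in this model asks how to partition a given link set into few feasible rounds, which is where the algorithmic difficulty lies.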
24 Apr 2014 at 12:00 in 4523
Lower bounds from the strong exponential time hypothesis
(Janne H. Korhonen, University of Helsinki)
Exact exponential algorithmics considers the development of faster, still exponential-time, algorithms for problems that are believed to lie outside polynomial time. There have been notable successes in recent years, such as Björklund's 1.657^n time Hamiltonian path algorithm; however, for the central problem of CNF-SAT, no such strictly exponential improvement has been obtained. That is, while o(2^n) algorithms are known, there is no known algorithm with running time c^n poly(n,m) for some c < 2.
Following the influential work of Impagliazzo, Paturi and Zane, connections between the exponential complexity of CNF-SAT and other problems have been investigated extensively. The basic idea is that if we assume CNF-SAT in fact cannot be solved in time c^n poly(n,m) for any c < 2 -- this assumption is known as the strong exponential time hypothesis -- we can then derive conditional lower bounds for other problems. These results can then be viewed as hardness results or new attacks on the complexity of CNF-SAT, depending on one's view on the strong exponential time hypothesis.
In this talk, we will survey these connections between the strong exponential time hypothesis and other problems. In particular, we will focus on the perhaps somewhat unexpected conditional lower bounds for polynomial-time problems, and the basic strategies used in the proofs of these results.
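For reference, the trivial 2^n · poly(n, m) baseline that the hypothesis concerns is plain exhaustive search over assignments (my own sketch, DIMACS-style signed-integer literals):

```python
from itertools import product

# The trivial 2^n poly(n, m) CNF-SAT baseline: try every assignment.
# Clauses use DIMACS-style literals: v means variable v, -v its negation.
def brute_force_sat(num_vars, clauses):
    for bits in product([False, True], repeat=num_vars):
        assignment = {v + 1: bits[v] for v in range(num_vars)}
        if all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses):
            return assignment
    return None

print(brute_force_sat(3, [[1, -2], [2, 3], [-1, -3]]) is not None)
```

The strong exponential time hypothesis asserts that no algorithm improves the base of the exponent below 2 for general CNF-SAT; the conditional lower bounds in the talk transfer this assumed hardness to other problems.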
22 Apr 2014 at 12:00 in 4523
Bi-Lipschitz bijection between the Boolean cube and the Hamming ball
(Igor Shinkar, Weizmann Institute of Science)
We construct a bi-Lipschitz bijection from the Boolean cube to the Hamming ball of equal volume. More precisely, we show that for all even n there exists an explicit bijection f from the n-dimensional Boolean cube to the Hamming ball of equal volume embedded in (n+1)-dimensional Boolean cube, such that for all x and y it holds that distance(x,y) / 5 <= distance(f(x),f(y)) <= 4 distance(x,y) , where distance(·,·) denotes the Hamming distance. In particular, this implies that the Hamming ball is bi-Lipschitz transitive.
This result gives a strong negative answer to an open problem of Lovett and Viola (CC 2012), who raised the question in the context of sampling distributions in low-level complexity classes. The conceptual implication is that the problem of proving lower bounds in the context of sampling distributions will require some new ideas beyond the sensitivity-based structural results of Boppana (IPL 97).
We also study the mapping f further and show that it (and its inverse) are computable in DLOGTIME-uniform TC0, but not in AC0. Moreover, we prove that f is "approximately local" in the sense that all but the last output bit of f are essentially determined by a single input bit.
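The distortion bounds above can be checked mechanically for any candidate map on a small cube. The sketch below (my own; the talk's explicit f is not reproduced here, so the identity map serves as a trivial stand-in to exercise the checker):

```python
from itertools import product

# Checker for the bi-Lipschitz condition d(x,y)/5 <= d(f(x),f(y)) <= 4*d(x,y)
# over a finite domain (illustrative harness; the talk's explicit map f from
# the cube to the Hamming ball is not reproduced here).
def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def is_bilipschitz(f, domain, lower=1 / 5, upper=4):
    return all(lower * hamming(x, y) <= hamming(f(x), f(y)) <= upper * hamming(x, y)
               for x in domain for y in domain if x != y)

n = 4
cube = list(product([0, 1], repeat=n))
identity = lambda x: x                  # trivial stand-in map: cube to itself
print(is_bilipschitz(identity, cube))
```

A constant map, by contrast, collapses all distances to zero and fails the lower bound, which is what the checker's left inequality rules out.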
Joint work with Itai Benjamini and Gil Cohen.
07 Apr 2014 at 12:00 in 3721
A relative Szemerédi theorem
(David Conlon, University of Oxford)
The celebrated Green-Tao theorem states that there are arbitrarily long arithmetic progressions in the primes. One of the main ingredients in their proof is a relative Szemerédi theorem which says that any subset of a pseudorandom set of integers of positive relative density contains long arithmetic progressions.
In this talk, we will discuss a simple proof of a strengthening of the relative Szemerédi theorem, showing that a much weaker pseudorandomness condition is sufficient. Our strengthened version can be applied to give the first relative Szemerédi theorem for k-term arithmetic progressions in pseudorandom subsets of Z_N of density N^{-c_k}.
The key component in our proof is an extension of the regularity method to sparse pseudorandom hypergraphs, which we believe to be interesting in its own right. From this we derive a relative extension of the hypergraph removal lemma. This is a strengthening of an earlier theorem used by Tao in his proof that the Gaussian primes contain arbitrarily shaped constellations and, by standard arguments, allows us to deduce the relative Szemerédi theorem.
This is joint work with Jacob Fox and Yufei Zhao.
04 Apr 2014 at 13:15 in Room 4523, Lindstedtsvägen 5, 5th floor
Certification in Coq of an Intrusion Resilient Key Exchange Protocol
(Mathilde Duclos, Verimag, Grenoble)
Security proofs for cryptographic systems can be carried out in different models which reflect different kinds of security assumptions. In the symbolic model, an attacker cannot guess a secret at all and can only apply a pre-defined set of operations, whereas in the computational model, he can hope to guess secrets and apply any polynomial-time operation. Security properties in the computational model are more difficult to establish and to check.
During this work, we designed a framework for certified proofs of computational indistinguishability, written using the Coq proof assistant, and based on CIL, a specialized logic for computational frames that can be applied to primitives and protocols. We demonstrate how CIL and its Coq-formalization allow proofs beyond the black-box security framework, where an attacker only uses the input/output relation of the system by executing on chosen inputs without having additional information on the state. More specifically, we use it to prove the security of a protocol against a particular kind of side-channel attack which aims at modeling leakage of information caused by an intrusion into Alice and Bob's computers.
02 Apr 2014 at 13:15 in Room 1537, Lindstedtsvägen 5, 5th floor
Pervasive Formal Verification of Multicore Operating System Kernels
(Christoph Baumann, Univ. des Saarlandes, Saarbrücken)
The highest level of confidence in the correct functionality of system software can be gained from a pervasive formal verification approach, where the high-level language application layer is connected to the gate-level hardware layer through a stack of semantic layers coupled by simulation theorems. In this spirit, in 2007 the German government started the Verisoft XT project which aimed at the pervasive formal verification of real-world applications like the Microsoft Hyper-V hypervisor and the PikeOS microkernel used in avionics.
A big challenge in these projects was that at the beginning most of the theoretical foundations for the pervasive verification of realistic and concurrent software were still missing. While project work ended in 2010, the development of the required theories never stopped. By today there exists a fairly complete stack of theories underpinning the code verification effort. In my talk I will give an overview of this stack and highlight selected theories to which I have been contributing. I will also report on my experiences in the Verisoft XT avionics subproject and the challenges one faces when trying to verify a microkernel that was not designed for verification.
02 Apr 2014 at 11:15 in 4523, Lindstedtsvägen 5
Induced matchings, arithmetic progressions and communication
(Benjamin Sudakov, UCLA)
Extremal Combinatorics is one of the central branches of discrete mathematics which deals with the problem of estimating the maximum possible size of a combinatorial structure which satisfies certain restrictions. Often, such problems have also applications to other areas including Theoretical Computer Science, Additive Number Theory and Information Theory. In this talk we will illustrate this fact by several closely related examples focusing on a recent work with Alon and Moitra.
31 Mar 2014 at 12:00 in 3721, Lindstedtsvägen 25
The graph regularity method
(Jacob Fox, MIT)
Szemerédi's regularity lemma is one of the most powerful tools in graph theory, with many applications in combinatorics, number theory, discrete geometry, and theoretical computer science. Roughly speaking, it says that every large graph can be partitioned into a small number of parts such that the bipartite subgraph between almost all pairs of parts is random-like. Several variants of the regularity lemma have since been established with many further applications. In this talk, I will survey recent progress in understanding the quantitative aspects of these lemmas and their applications.
28 Mar 2014 at 15:15 in Room 4523, Lindstedtsvägen 5, 5th floor
Rounding Sum-of-Squares Relaxations
(David Steurer, Cornell)
I will survey recent developments around the sum-of-squares method and its impact on computational complexity and algorithm design. In particular, we will discuss a general approach for showing guarantees of the sum-of-squares method based on a connection to the Positivstellensatz proof system. This approach leads to improved algorithms for several problems arising in machine learning and optimization, in particular finding sparse vectors in subspaces, tensor optimization, and learning sparse dictionaries. Some of these problems are closely related to the Unique Games Conjecture and further quantitative improvements could refute the conjecture. Based on joint works with Boaz Barak and Jonathan Kelner.
The speaker was invited by Johan Håstad.
26 Mar 2014 at 13:15 in Room 4523, Lindstedtsvägen 5, 5th floor
Program Extraction from Coinductive Proofs
(Kenji Miyamoto, LMU, Munich)
Program extraction is a method to figure out the computational content of a mathematical proof. In our theory TCF, program extraction is given as a symbolic transformation from a proof M of a formula A into a term in T^+, an extension of Gödel's T. The obtained term satisfies a correctness property, namely, the term is the computational content of M. From a computer-scientific viewpoint, program extraction can be regarded as the study of a formal method to obtain provably correct programs, taking A as a specification.
The recent developments of TCF include definitions by a certain combination of induction and coinduction. In addition to the theoretical results, we also describe the proof assistant Minlog, which is an implementation of TCF. We illustrate new features of Minlog, which allow program extraction from proofs involving coinduction. We formulate a case study by Ulrich Berger within TCF and Minlog. An executable corecursive program in exact real arithmetic is extracted and presented on a computer.
24 Mar 2014 at 12:00 in 4523
Improved Inapproximability results for Hypergraph Coloring
(Prahladh Harsha, Tata Institute of Fundamental Research, Mumbai)
Despite the tremendous progress in understanding the approximability of several problems, the status of approximate coloring of constant colorable (hyper)graphs is not resolved and in fact, there is an exponential (if not doubly exponential) gap between the best known approximation algorithms and inapproximability results. The best known approximation algorithms, which combine combinatorial and semi-definite programming methods, require at least n^delta colors to color a 2-colorable 4-uniform hypergraph for some constant delta in (0,1). In contrast, until recently, the best known hardness results could at best rule out coloring a 2-colorable hypergraph with polylog n colors (if not polyloglog n colors in some cases).
Recently, with the discovery of the low-degree polynomial long code (aka the short code of Barak et al. [FOCS 2012]), there has been a super-polynomial (and in some cases, exponential) improvement in the inapproximability results. In particular, we prove quasi-NP-hardness of the following problems on n-vertex hypergraphs:
- Coloring a 2-colorable 8-uniform hypergraph with 2^{2^{\sqrt{log log n}}} colors.
- Coloring a 4-colorable 4-uniform hypergraph with 2^{2^{\sqrt{log log n}}} colors.
- Coloring a 3-colorable 3-uniform hypergraph with (log n)^{1/log log log n} colors.
These results are obtained using the low-degree polynomial code and the techniques proposed by Dinur and Guruswami [FOCS 2013] to incorporate this code in inapproximability results.
In this talk, I'll explain the bottleneck in obtaining improved coloring inapproximability results using the long code and how derandomizations of the long code (the short code in our setting) can be used to improve the inapproximability factors.
Joint work with V. Guruswami, J. Håstad, S. Srinivasan, G. Varma.
03 Mar 2014 at 13:15 in 4523
On finite-time convergent gossip algorithms: existence, complexity, and some applications
(Guodong Shi, KTH)
Gossip algorithms have been widely applied in modern distributed systems, with examples ranging from sensor networks and peer-to-peer networks to mobile vehicle networks and social networks. Tremendous research has been devoted to analyzing, or even improving, the asymptotic rate of convergence of gossip algorithms. In this work we study finite-time convergence of deterministic gossiping. We show that a symmetric gossip algorithm that converges in finite time exists if and only if the number of network nodes is a power of two, while a globally finite-time convergent gossip algorithm always exists, regardless of the number of nodes, when asymmetric updates are allowed. For $n=2^m$ nodes, we prove that the fastest convergence can be reached in $mn$ node updates via symmetric gossiping. On the other hand, for $n=2^m+r$ nodes with $0\leq r<2^m$, at least $mn+2r$ node updates are required to achieve finite-time convergence when asymmetric gossiping is allowed. It is also shown that the existence of finite-time convergent gossiping often imposes strong structural requirements on the underlying interaction graph.
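The power-of-two case admits a particularly clean construction: pair the nodes along the dimensions of a hypercube and average. The following is a minimal sketch of that idea (the function name and list-based representation are illustrative, not taken from the talk); note it uses exactly m rounds of n node updates each, i.e. mn node updates, matching the stated bound.

```python
def hypercube_gossip(values):
    """Symmetric gossip on n = 2^m nodes: in round d, each node averages
    with the neighbor whose index differs in bit d. After m rounds
    (m*n node updates in total) every node holds the global average."""
    n = len(values)
    m = n.bit_length() - 1
    assert n == 1 << m, "requires a power-of-two number of nodes"
    v = list(values)
    for d in range(m):
        for i in range(n):
            j = i ^ (1 << d)
            if i < j:  # each unordered pair gossips once per round
                avg = (v[i] + v[j]) / 2
                v[i] = v[j] = avg
    return v

print(hypercube_gossip([0.0, 4.0, 8.0, 12.0]))  # every entry equals 6.0
```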
Two applications of the derived results will also be briefly discussed. First for a model of opinion dynamics over signed social networks defined by randomized attraction-repulsion gossiping, we show that finite-time convergence leads to fundamental robustness in view of the Borel-Cantelli Lemma. This provides a reflection of the obtained finite-time convergence results in randomized gossiping algorithms. Second, we apply our results to recent studies on gossip algorithms in quantum networks, where the goal is to control the state of a quantum system via pairwise interactions, and show that finite-time convergence is never possible in this case.
This is a joint work with Bo Li, Academy of Mathematics and Systems Science, Chinese Academy of Sciences; Alexandre Proutiere, Mikael Johansson, Karl H. Johansson, Automatic Control, KTH; and John S. Baras, University of Maryland College Park.
02 Dec 2013 at 12:00 in room 4523
Pebble Games and Complexity
(Siu Man Chan, Princeton University)
We will discuss space and parallel complexity, ranging from some classical results which motivated the study, to some recent results concerning combinatorial lower bounds in restricted settings. The recurring theme is some pebble games. We will highlight some of their connections to Boolean complexity and proof complexity.
27 Nov 2013 at 13:15 in room 4523
SAT and friends: a single engine for many purposes
(Daniel Le Berre, Université d'Artois)
Over the last decade, Boolean reasoning engines have received a lot of interest: SAT, MAXSAT, and Pseudo-Boolean solvers are now used routinely in hardware or software verification, bioinformatics, software engineering, etc. Using so-called "incremental SAT solvers", it is currently possible to solve a lot of different problems by encoding everything into SAT. The presentation will focus on the relationship between various problems based on Boolean reasoning, such as SAT, MAXSAT, and Pseudo-Boolean Optimization, and how they can be solved using a SAT solver. We will focus particularly on the approach taken in the Sat4j library. A real application, software dependency management, will be used throughout the presentation to illustrate a possible use of each Boolean reasoning task.
25 Nov 2013 at 12:00 in room 4523
Integrating cutting planes in a modern SAT solver
(Daniel Le Berre, Université d'Artois)
SAT solvers used on a daily basis in hardware or software verification are based on the so-called "conflict-driven clause learning" (CDCL) architecture. This architecture is based on a proof system equivalent to full resolution; resolution in that context is used to derive new clauses during conflict analysis. SAT solvers can easily be extended to deal with cardinality constraints and Pseudo-Boolean constraints while keeping a resolution-based proof system. A major focus has been to study the translation of those constraints into CNF, so as to reuse the current SAT solvers without modification. Another approach is to extend the CDCL architecture to use cutting planes instead of resolution on cardinality or Pseudo-Boolean constraints. The presentation will focus on the design of such a solver, from the specific properties of Pseudo-Boolean constraints to the design of a cutting-planes-based conflict analysis procedure. Some experimental results based on the implementation of this procedure available in Sat4j will conclude the talk.
22 Nov 2013 at 13:15 in room 4523
Introspective Static Analysis via Abductive Inference
(Thomas Dillig, College of William and Mary)
When software verification tools fail to prove the correctness of a program, there are two possibilities: either the program is indeed buggy, or the warning generated by the analysis is a false alarm. In this situation, the burden is on the programmer to manually inspect the warning and decide whether the program contains a real bug. However, this task is quite error-prone and time-consuming and remains an impediment to the adoption of static analysis in the real world. In this talk, we present a new "introspective" approach to static analysis that helps programmers understand and diagnose error reports generated by the analysis. When our analysis fails to verify the program, it can meaningfully interact with users by generating small and relevant queries that capture exactly the information the analysis is missing to validate or refute the existence of an error in the program. The inference of such missing facts is an instance of logical abduction, and we present a new algorithm for performing abductive inference in logics that admit quantifier elimination. Since our abduction algorithm computes logically weakest solutions with as few variables as possible, the queries our technique presents to the user capture the minimum information needed to diagnose the error report. A user study we performed involving 56 programmers hired through oDesk indicates that our new technique dramatically improves the usefulness of static analysis to programmers.
18 Nov 2013 at 13:15 in room 4523
Pomax games are PSPACE-complete
(Jonas Sjöstrand, Department of Mathematics, KTH)
A pomax game is played on a poset whose elements are colored black or white. The players Black and White take turns removing any maximal element of their own color. If there is no such element, the player loses.
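To make the rules concrete, here is a brute-force sketch of optimal play (the function and variable names are illustrative, not from the talk). It runs in exponential time, which is of course consistent with the PSPACE-completeness result: no fundamentally better general algorithm is expected.

```python
def wins(remaining, to_move, color, above, memo=None):
    """True iff the player to_move ('B' or 'W') wins the pomax game with
    optimal play. remaining: frozenset of poset elements still on the
    board; above[x]: set of elements strictly above x; color[x]: 'B'/'W'."""
    if memo is None:
        memo = {}
    key = (remaining, to_move)
    if key not in memo:
        other = 'W' if to_move == 'B' else 'B'
        # legal moves: maximal elements of the mover's own color
        moves = [x for x in remaining
                 if color[x] == to_move and not (above[x] & remaining)]
        # a player with no legal move loses; otherwise the player wins
        # iff some move leaves the opponent in a losing position
        memo[key] = any(not wins(remaining - {x}, other, color, above, memo)
                        for x in moves)
    return memo[key]

# Example: a two-element antichain, both elements black.
above = {'a': set(), 'b': set()}
color = {'a': 'B', 'b': 'B'}
board = frozenset({'a', 'b'})
print(wins(board, 'B', color, above))  # True: White is stuck after Black's move
print(wins(board, 'W', color, above))  # False: White has no move at all
```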
Pomax games have the following two properties, which make them unique among games in the literature.
* Being integer-valued, they play a simple role in the algebraic framework of combinatorial games.
* Being PSPACE-complete, they are computationally hard to analyze.
I will give a very short introduction to combinatorial game theory -- a beautiful piece of combinatorics -- but most of the talk will be focusing on the PSPACE-completeness of pomax games.
This is joint work with Erik Järleberg.
13 Nov 2013 at 12:00 in room 1537
Turan-problem for hypergraphs
(Klas Markström, Umeå University)
The classical Turan-problem asks the following question: given a graph H, what is the maximum number of edges in a graph on n vertices which does not contain a copy of H? For ordinary graphs, a result of Erdős, Stone and Simonovits gives an asymptotic solution to this problem. However, the asymptotics for bipartite graphs H, and for graphs H which do not have constant size, still present problems. The latter connects to the well-known triangle removal lemma.
A 3-graph, or 3-uniform hypergraph, consists of a vertex set and a set of edges, which are vertex sets of size 3. Unlike the Turan-problem for graphs, the Turan-problem for 3-graphs is still far from understood, for example we do not know the correct asymptotics for any complete 3-graph.
I will survey some methods and problems in this area, discussing how lower and upper bounds have been proven.
Lunch is served at 12:00 noon (register at http://doodle.com/i2gde6v9e4kbuqyt by Monday Nov 11 at 8 pm).
04 Nov 2013 at 13:15 in room 4523
Community detection in large random graphs: Fundamental limits and efficient algorithms
(Alexandre Proutiere, Automatic Control Lab, KTH EES)
We consider the problem of identifying communities in graphs. This problem has been extensively studied in statistics, physics, and computer science, and has many important applications in diverse contexts such as biology, social networks, and distributed computing. A central question for such a problem is to characterize conditions under which communities can be efficiently detected using low complexity algorithms. We address this question (i) when the graph is generated through the so-called stochastic block model (also known as the planted partition model), and (ii) when the graph can be only partially observed. We report recent results related to this model. We provide fundamental performance limits satisfied by any community detection algorithm (irrespective of its complexity), and design a spectral-based algorithm (an extension of a scheme recently proposed by A. Coja-Oghlan) whose performance approaches these limits in some relevant cases.
31 Oct 2013 at 15:15 in room 4523
Program Boosting or Crowd-Sourcing for Correctness
(Ben Livshits, Microsoft Research / University of Washington)
A great deal of effort has been spent both on trying to specify software requirements and on ensuring that software actually matches these requirements. A wide range of techniques, including theorem proving, model checking, type-based analysis, static analysis, runtime monitoring, and the like, has been proposed. However, in many areas adoption of these techniques remains spotty. In fact, obtaining a specification or a precise notion of correctness is in many cases quite elusive. For many programming tasks, even expert developers are unable to get them right because of numerous tricky corner cases.
In this paper we investigate an approach we call program boosting, which involves crowd-sourcing partially correct solutions to a tricky programming problem from developers and then blending these programs together in a way that improves correctness.
We show how interesting and highly non-trivial programming tasks such as writing regular expressions to match URLs and email addresses can be effectively crowd-sourced. We demonstrate that carefully blending the crowd-sourced results together frequently yields results that are better than any of the individual responses. Our experiments on 465 programs show consistent boosts in accuracy and demonstrate that program boosting can be performed at a relatively modest monetary cost.
21 Oct 2013 at 13:15 in room 4523
Formal Verification of Information Flow Security for a Simple ARM-Based Separation Kernel
(Roberto Guanciale, KTH CSC)
A separation kernel simulates a distributed environment using a single physical machine by executing partitions in isolation and appropriately controlling communication among them. We present a formal verification of information flow security for a simple separation kernel for ARMv7. We propose an approach where communication between partitions is made explicit and the information flow is analyzed in the presence of such a channel.
14 Oct 2013 at 12:00 in room 4523
Lower Bounds (Slightly) Beyond Resolution
(Marc Vinyals, KTH CSC)
One of the goals of proof complexity is to show lower bounds for stronger and stronger proof systems. For the well-known resolution proof system, exponential lower bounds are known for many formula families, but for stronger proof systems many of these formulas quickly become either provably easy or very hard to analyze.
In this talk, we will look at k-DNF resolution, an extension of resolution that works with k-DNF formulas instead of clauses, which is exponentially more powerful than resolution yet weak enough so that one can prove interesting lower bounds and see how the hardness of different formulas changes. We will show exponential lower bounds for the weak pigeonhole principle and for random k-CNF formulas as well as separations between k-DNF and (k+1)-DNF resolution for increasing k. The main technical tool is a version of the switching lemma that works for restrictions that set only a small fraction of the variables and is applicable to DNF formulas with small terms.
This presentation is based on the paper Segerlind, Buss, Impagliazzo: A Switching Lemma for Small Restrictions and Lower Bounds for k-DNF Resolution (http://dx.doi.org/10.1137/S0097539703428555).
30 Sep 2013 at 12:00 in room 4523
The Complexity of Proving that a Graph is Ramsey
(Massimo Lauria, KTH CSC)
We say that a graph with n vertices is c-Ramsey if it does not contain either a clique or an independent set of size c*log(n). We define a CNF formula which expresses this property for a graph G. We show a superpolynomial lower bound on the length of resolution proofs that G is c-Ramsey, for every graph G. Our proof makes use of the fact that every Ramsey graph must contain a large subgraph with some of the statistical properties of the random graph.
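As a concrete illustration of the definition (not code from the paper), the c-Ramsey property can be checked by brute force for tiny graphs; the exhaustive search over k-subsets is exponential in k, which is exactly why one encodes the property as a CNF and asks about proofs instead.

```python
from itertools import combinations
from math import ceil, log2

def is_c_ramsey(n, edges, c):
    """Brute-force check that a graph on vertices 0..n-1 contains neither
    a clique nor an independent set of size k = ceil(c * log2(n))."""
    k = ceil(c * log2(n))
    E = {frozenset(e) for e in edges}
    for S in combinations(range(n), k):
        pairs = [frozenset(p) for p in combinations(S, 2)]
        if all(p in E for p in pairs):      # S is a clique
            return False
        if not any(p in E for p in pairs):  # S is an independent set
            return False
    return True

cycle5 = [(i, (i + 1) % 5) for i in range(5)]  # C_5: clique/independence number 2
k5 = list(combinations(range(5), 2))           # K_5: every triple is a clique
print(is_c_ramsey(5, cycle5, 1.0))  # True  (k = 3; C_5 has neither)
print(is_c_ramsey(5, k5, 1.0))      # False
```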
Joint work with Pavel Pudlák, Vojtěch Rödl and Neil Thapen. The paper appeared at ICALP 2013.
Lunch is served at 12:00 noon (register at http://doodle.com/226vivp4trzzy8v5 by Thursday Sep 26 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for ca two hours of more technical discussions.
09 Sep 2013 at 12:00 in room 1537
Average-case Complexity of Lattice Problems
(Rajsekar Manokaran, KTH CSC)
The average-case complexity of a problem is the complexity of solving instances of it picked from a specific distribution. In a seminal work, Ajtai [STOC '96] showed a relation between the average-case complexity and the worst-case complexity of lattice problems. This result is central to cryptography implemented using lattices. Subsequently, Micciancio and Regev [FOCS '04] improved the parameters involved in the proof while also simplifying the presentation.
In this talk, I will describe the result along the lines of the work of Micciancio and Regev. Time permitting, I will describe the recent improvements achieved by Micciancio and Peikert [CRYPTO '13].
26 Aug 2013 at 12:00 in room 1537
Weak pigeonhole principles, circuits for approximate counting, and propositional proofs of bounded depth
(Albert Atserias, Universitat Politècnica de Catalunya)
The pigeonhole principle (PHP) states that m pigeons cannot sit injectively into n holes if m is bigger than n. An often quoted result of the research in propositional proof complexity is that the PHP with m = n+1 does not have small proofs in proof systems that "lack the ability to count". These include resolution [Haken] and, more generally, proof systems that manipulate formulas with a bounded number of alternations between disjunctions and conjunctions (a.k.a. bounded-depth formulas) [Ajtai]. In contrast, for proof systems that manipulate arbitrary propositional formulas, or even bounded-depth formulas with "threshold gates", the PHP with m = n+1 admits small proofs [Buss]. For weaker PHPs, where m is at least a constant factor larger than n, the situation is much less understood. On one hand it looks clear that the ability to count approximately should be enough to establish these weaker PHPs. On the other hand, while bounded-depth circuits exist for approximate counting [Stockmeyer, Ajtai], their mere existence is not enough to get bounded-depth small proofs since one would also need elementary (i.e. comparably complex) proofs of their correctness.
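For reference, the PHP is usually handed to propositional proof systems as the following standard CNF family (the DIMACS-style variable numbering below is the common convention, not something specific to the talk):

```python
from itertools import combinations

def php_clauses(m, n):
    """CNF encoding of 'm pigeons map injectively into n holes'
    (unsatisfiable exactly when m > n). Variable x(i,j), encoded as
    the DIMACS number i*n + j + 1, says pigeon i sits in hole j."""
    x = lambda i, j: i * n + j + 1
    cnf = [[x(i, j) for j in range(n)] for i in range(m)]  # pigeon i gets some hole
    for j in range(n):
        for i1, i2 in combinations(range(m), 2):           # hole j holds at most one pigeon
            cnf.append([-x(i1, j), -x(i2, j)])
    return cnf

# PHP with m = n+1 = 3: 3 pigeon clauses plus 2 * C(3,2) = 6 hole clauses
print(len(php_clauses(3, 2)))  # 9
```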
In this talk we will survey the status of this classical problem in propositional proof complexity. Along the way we will discuss some new recent related results showing that a close variant of the weak PHP admits and requires quasipolynomial-size depth-2 proofs. To our knowledge, this is the first natural example that requires superpolynomial but not exponential proofs in a natural proof system. It also shows that the method of proof is in some sense "the right method"; a remarkable and rather unexpected fact by itself.
Lunch is served at 12:00 noon (register at http://doodle.com/52n2v8ctrx69grwu by Thursday Aug 22 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for ca two hours of more technical discussions.
10 Jun 2013 at 13:15 in room 4523
Constrained multilinear detection and generalized graph motifs
(Petteri Kaski, HIIT & Department of Information and Computer Science, Aalto University)
Many hard combinatorial problems can be reduced to the framework of detecting whether a multivariate polynomial P(x)=P(x_1,x_2,...,x_n) has a monomial with specific properties of interest. In such a setup P(x) is not available in explicit symbolic form but is implicitly defined by the problem instance at hand, and our access to P(x) is restricted to having an efficient algorithm for computing values of P(x) at points of our choosing. This talk will focus on detecting multilinear monomials with auxiliary color constraints and applications to graph motif problems parameterized by motif size.
Joint work with Andreas Björklund (Lund) and Łukasz Kowalik (Warsaw).
http://arxiv.org/abs/1209.1082
23 May 2013 at 13:15 in room 4523
Protecting Distributed Applications Through Software Diversity
(Christian Collberg, University of Arizona)
Remote Man-at-the-end (R-MATE) attacks occur in distributed applications where an adversary has physical access to an untrusted client device and can obtain an advantage from inspecting, reverse engineering, or tampering with the hardware itself or the software it contains.
In this talk we give an overview of R-MATE scenarios and present a system for protecting against attacks on untrusted clients. In our system the trusted server overwhelms the client's analytical abilities by continuously and automatically generating diverse variants of the client code and pushing them to the client. The diversity subsystem employs a set of primitive code transformations that provide temporal, spatial, and semantic diversity in order to generate an ever-changing attack target for the adversary, making tampering difficult without this being detected by the server.
22 May 2013 at 13:15 in room 4523
Learning Bounded Tree-width Bayesian Networks
(Pekka Parviainen, KTH CB and Science for Life Laboratory)
Inference in Bayesian networks is known to be NP-hard but if the network has bounded tree-width, then inference becomes tractable. Thus, in some applications one may want to learn networks that closely match the given data and have bounded tree-width.
I will present some recent results on exact exponential dynamic programming algorithms for learning bounded tree-width Bayesian networks. I will also discuss ongoing work on developing more scalable algorithms using integer linear programming.
13 May 2013 at 12:00 in room 1537
On fitting too many pigeons into too few holes
(Mladen Miksa, KTH CSC)
If m pigeons want to nest in n separate pigeonholes, they will run into serious problems if m > n. Although this might seem trivial, it is one of the most extensively studied (and fascinating) combinatorial principles in all of proof complexity.
In a breakthrough result, Haken (1985) showed that it is exponentially hard for the resolution proof system to prove that n+1 pigeons don't fit into n holes. However, what happens when the number of pigeons increases? Since it becomes increasingly obvious that not every pigeon can get a different hole, perhaps one can find shorter proofs as the number of pigeons goes to infinity? This notorious open problem was finally resolved by Raz (2001), who established that even for infinitely many pigeons one still needs proofs of exponential length. Raz's result was subsequently simplified and strengthened by Razborov.
In the lunch seminar part of this talk, we will give an overview of Razborov's proof. The presentation will be self-contained and no prerequisites in proof complexity are required. After the break, we will go into technical details and prove the full result.
Lunch is served at 12:00 noon (register at http://doodle.com/vwcqud2nwrg5fwbm by Thursday the week before at 8 pm).
06 May 2013 at 12:00 in room 4523
Correct-by-Construction System Design: Forever a Dream or Approaching Reality?
(Ingo Sander)
Advances in process technology have led to extremely powerful many-core platforms, but in practice it is very difficult to exploit their inherent potential. Instead, in particular in the real-time domain, the verification costs often dominate the design costs due to a lack of design methodologies capable of handling the increasing complexity of large heterogeneous and inherently parallel embedded systems. Simulation is still the dominating verification technique in industry and can only indicate, but not prove, the correctness of an implementation. In recent years, research in different fields has significantly improved the possibility of a correct-by-construction design flow, especially in areas like predictable architectures and models of computation. The presentation discusses the current situation, analyzes prerequisites and provides suggestions towards a correct-by-construction design flow, which could drastically reduce the verification costs.
A light lunch will be available at 12:00 and the talk starts approximately 10 minutes later. If you want to have lunch, sign up at http://www.doodle.com/sy8xg9sxrfqzv4y5 no later than May 1st.
29 Apr 2013 at 15:00 in room 1625
A Library For Removing Cache-based Attacks in Concurrent Information Flow Systems
(Alejandro Russo, Chalmers)
Information-flow control (IFC) allows untrusted code to manipulate sensitive data while preserving confidentiality. Although promising, this technology suffers from the presence of covert channels. We demonstrate that LIO, a concurrent IFC system, is vulnerable to cache-based attacks exploiting the internal timing covert channel. To avoid such leaks without modifying the runtime system, we propose a library that leverages the notion of resumptions as a model of interleaved computations. Regardless of the state of the cache, resumptions allow fine-grained control over where interleavings are produced. By yielding control after each instruction completes execution, our library is able to remove cache-based leaks. We extend resumptions to deal with local state and exceptions, both features present in LIO. To compensate for performance degradation, our library supplies primitives to securely control the granularity of atomic actions. Additionally, our approach allows some degree of parallelism without jeopardizing security, a novel feature for IFC system implementations. The library still suffers from leaks due to the termination covert channel; however, this channel can only be exploited by brute-force attacks, as is the case in most state-of-the-art IFC tools.
This is joint work in progress with Pablo Buiras (Chalmers), Deian Stefan (Stanford), Amit Levy (Stanford), and David Mazieres (Stanford).
22 Apr 2013 at 12:00 in room 4523
Approximability of Some Constraint Satisfaction Problems
(Sangxia Huang, KTH CSC)
I will talk about approximability of Constraint Satisfaction Problems (CSPs). In particular, we focus on CSPs of "sparse" Boolean predicates. This is also related to other optimization problems, such as finding maximum independent set. For CSP instances that are almost satisfiable, Siu On Chan proved recently that there is a predicate on k variables with k+1 accepting assignments that is NP-hard to approximate better than (k+1)/2^k. The case of approximation on satisfiable instances is rather different. This is also an important question, and I will describe many recent developments, including my own work.
For the first part, I will give an overview of the state of the art and some techniques --- mainly reductions for showing hardness. In the second part, I will go over some proofs in Chan's paper and prove part of the main result.
The talk does not require prior experience in the PCP business.
Lunch is served at 12:00 noon (register at http://doodle.com/b38thtqsmbvaw4n3 by Thursday April 18 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who wish reconvene after a short break for ca two hours of more technical discussions.
17 Apr 2013 at 12:00 in room 4523
Rediscovering the Joys of Pebbling
(Jakob Nordström, KTH CSC)
In the early 1970s, combinatorial pebble games played on directed acyclic graphs were introduced as a way of studying programming languages and compiler construction. These games later found a broad range of applications in computational complexity theory and were extensively studied up to ca 1985. Then they were mercifully forgotten, more or less...
Until during the last decade, when pebbling quite surprisingly turned out to be very useful in proof complexity. In this talk, we will describe the connections between the two settings and how tight they are, present an improved reduction, and discuss the gap that remains.
In order to use this reduction to obtain interesting results in proof complexity, one needs pebbling results with quite specific (and rather strong) properties. We will also discuss a new such result, that broke the 25-year hiatus in the pebbling literature by appearing in CCC ’10.
This seminar is intended to be completely self-contained. In particular, no prerequisites in proof complexity or pebbling are required. After the break, in the technical sequel we intend to have some fun and actually prove some pebbling time-space trade-off results.
Lunch is served at 12:00 noon (register at http://doodle.com/8numsk6siw62e3z6 by Monday April 15 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who so wish reconvene after a short break for ca two hours of more technical discussions.
09 Apr 2013 at 12:00 in room 1537
On the Complexity of Finding Width-k Resolution Refutations
(Christoph Berkholz, RWTH Aachen University)
One approach to attack NP-hard satisfiability problems such as 3-SAT or the Constraint Satisfaction Problem (CSP) is to design algorithms that run in polynomial time but do not always succeed. In this talk I gently introduce the approach of searching for width-k resolution refutations for 3-SAT (also known as k-consistency test in the CSP-community). This technique can be implemented in time n^O(k), hence is in polynomial time for every fixed k.
One drawback of this approach is that the degree of the polynomial increases with the parameter k. My main result is a lower bound showing that this cannot be avoided: Deciding whether there is a width-k resolution refutation requires time n^{ck} for an absolute constant c>0. Furthermore, the problem is EXPTIME-complete (if k is part of the input).
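For concreteness, the width-k refutation search described above can be sketched by saturation in a few lines of Python. This is an illustrative brute-force version, not an optimized n^O(k) implementation; the clause representation is just one convenient choice.

```python
def width_k_refutable(clauses, k):
    """Search for a width-k resolution refutation by saturation:
    repeatedly add every resolvent with at most k literals until
    either the empty clause appears or nothing new can be derived.
    Clauses are frozensets of nonzero ints (v = variable, -v = negated).
    Since there are only n^O(k) clauses of width at most k over n
    variables, the procedure runs in polynomial time for fixed k."""
    derived = set(frozenset(c) for c in clauses)
    changed = True
    while changed:
        changed = False
        for a in list(derived):
            for b in list(derived):
                for lit in a:
                    if -lit in b:
                        res = (a - {lit}) | (b - {-lit})
                        # keep only non-tautological resolvents of width <= k
                        if len(res) <= k and not any(-l in res for l in res):
                            if res not in derived:
                                derived.add(res)
                                changed = True
        if frozenset() in derived:
            return True
    return frozenset() in derived
```

For example, the four clauses (x v y), (x v -y), (-x v y), (-x v -y) are unsatisfiable and admit a width-2 refutation, while a satisfiable formula never yields the empty clause.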
Lunch is served at 12:00 noon (register at http://doodle.com/npng4d48ghc5sd95 by Friday the week before at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who so wish reconvene after a short break for ca two hours of more technical discussions.
25 Mar 2013 at 12:00 in room 4523
Faster Hamiltonicity
(Per Austrin, KTH CSC)
I will talk about exact algorithms for Hamiltonicity and Travelling Salesperson. The classic Bellman/Held-Karp dynamic programming algorithm for these problems has a running time of O(n^2*2^n). This remained unbeaten for almost half a century until 2010 when Björklund gave an algorithm with running time O(2^{0.73n}) for undirected Hamiltonicity. Since then there has been a flurry of generalizations and simplifications, but many open questions remain.
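The classic Bellman/Held-Karp dynamic program mentioned above can be sketched in Python. This is a minimal illustration over a distance matrix, not the Björklund algorithm.

```python
from itertools import combinations

def held_karp(dist):
    """Bellman/Held-Karp dynamic programming for TSP, O(n^2 * 2^n) time.

    dist[i][j] is the cost of edge (i, j); returns the cost of the
    cheapest tour visiting every city exactly once, starting and
    ending at city 0."""
    n = len(dist)
    # best[(S, j)] = cheapest path from 0 through the cities in the
    # bitmask S, ending at city j
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = sum(1 << j for j in subset)
            for j in subset:
                prev = S ^ (1 << j)
                best[(S, j)] = min(best[(prev, k)] + dist[k][j]
                                   for k in subset if k != j)
    full = (1 << n) - 2  # every city except 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))
```

On the 4-city instance with distance matrix [[0,2,9,10],[1,0,6,4],[15,7,0,8],[6,3,12,0]], the optimal tour 0→2→3→1→0 has cost 21.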
My aim for the first part is to describe the state of the art and some general techniques that are used, and for the second part to go over one or two algorithms in detail.
The talk should be accessible without any specific background knowledge and in particular you don't need to know what the Hamiltonicity and Travelling Salesperson problems are beforehand.
Lunch is served at 12:00 noon (register at http://doodle.com/t9hsa8y383qye637 by Thursday Mar 21 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who so wish reconvene after a short break for ca two hours of more technical discussions.
18 Mar 2013 at 12:00 in room 4523
Subgraphs Satisfying MSO Properties on z-Topologically Orderable Digraphs
(Mateus de Oliveira Oliveira, KTH CSC)
We introduce the notion of z-topological orderings for digraphs. We prove that given a digraph G admitting a z-topological ordering, we may count the number of subgraphs of G that satisfy a monadic second order formula φ and that are the union of k directed paths in time f(φ, k, z)•n^O(k•z). Our result implies the polynomial time solvability of a vast number of natural counting problems on digraphs admitting z-topological orderings for constant z. For instance, we are able to answer in polynomial time questions of the form "How many planar subgraphs of G are the union of k directed paths?" Concerning the relationship between z-topological orderability and other digraph measures, we observe that any digraph of directed path-width d has a z-topological ordering for z ≤ 2d + 1. Since graphs of bounded directed path-width can have both arbitrarily large undirected tree width and arbitrarily large clique width, our result provides for the first time a suitable way of partially transposing metatheorems developed in the context of the monadic second order logic of graphs of bounded undirected tree width and bounded clique width to the realm of digraph width measures that are closed under taking subgraphs and whose constant levels incorporate families of graphs of arbitrarily large tree width and arbitrarily large clique width.
Lunch is served at 12:00 noon (register at http://doodle.com/66sqtfib9k9ir79x by Thursday Mar 14 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who so wish reconvene after a short break for ca two hours of more technical discussions.
18 Feb 2013 at 12:00 in room 4523
Exponential Lower Bounds for the PPSZ k-SAT Algorithm
(Dominik Scheder, Aarhus University)
In 1998, Paturi, Pudlak, Saks, and Zane presented PPSZ, an elegant randomized algorithm for k-SAT. Fourteen years on, this algorithm is still the fastest known worst-case algorithm. They proved that its expected running time on k-CNF formulas with n variables is at most 2^((1 - epsilon_k)n), where epsilon_k = Omega(1/k). So far, no exponential lower bounds at all have been known.
We construct hard instances for PPSZ. That is, we construct satisfiable k-CNF formulas over n variables on which the expected running time is at least 2^((1 - epsilon_k)n), for epsilon_k in O(log^2 (k) / k).
This is joint work with Shiteng Chen, Navid Talebanfard, and Bangsheng Tang.
Lunch is served at 12:00 noon (register at http://doodle.com/29v43i4yy38rpi52 by Thursday Feb 14 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm.
11 Feb 2013 at 12:00 in room 4523
Modular Verification of Temporal Safety Properties of Procedural Programs
(Dilian Gurov, KTH CSC)
Modularity as a software design principle aims at controlling the complexity of developing and maintaining large software. When applied to verification, modularity means that the individual modules are specified and verified independently of each other, while global, system-wide properties are verified relative to the specifications of the modules rather than to their implementations. Such a relativization is the key to verifying software in the presence of variability, that is, of modules whose implementation is expected to either evolve or be dynamically upgraded, or is not even available at verification time, or exists in multiple versions as resulting from a software product line.
In this tutorial I will present modular verification in the context of temporal safety properties and a program model of recursive programs that abstracts away from data. The proposed method is algorithmic and is based on the construction of maximal program models that replace the local specifications when verifying global properties.
Please register for lunch at http://doodle.com/mbngq5qt8mw6puhe by Thu Feb 7 at 8 pm. The presentation starts at 12:10 pm and ends at 1 pm. Those of us who so wish reconvene after a short break for roughly one more hour of a possibly slightly more technical presentation.
05 Feb 2013 at 13:15 in room 1537
Towards a Marketplace of Open Source Software Data
(Fernando Parreiras, FUMEC University, Brazil)
Development, distribution and use of open source software comprise a market of data (source code, bug reports, documentation, number of downloads, etc.) from projects, developers and users. This large amount of data makes it difficult for the people involved to make sense of implicit links between software projects, e.g., dependencies, patterns, licenses. This context raises the question of what techniques and mechanisms can be used to help users and developers to link related pieces of information across software projects. In this paper, we propose a framework for a marketplace enhanced using linked open data (LOD) technology for linking software artifacts within projects as well as across software projects. The marketplace provides the infrastructure for collecting and aggregating software engineering data as well as developing services for mining, statistics, analytics and visualization of software data. Based on cross-linking software artifacts and projects, the marketplace enables developers and users to understand the individual value of components and their relationship to bigger software systems. Improved understanding creates new business opportunities for software companies: users will be better able to analyze and compare projects, developers can increase the visibility of their products, and hosts may offer plug-ins and services over the data to paying customers.
04 Feb 2013 at 12:00 in room 4523
Boolean influence
(Erik Aas, Math department, KTH)
The influence of a variable of a boolean function is a measure of how likely that variable is to control the output of the function. I'll present some fundamental results concerning the influences of threshold functions (a special kind of boolean function). If time permits we will prove the Kahn-Kalai-Linial theorem, giving a lower bound for the largest influence of a variable of a "balanced" boolean function.
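As a small illustration of the definition, the influence of a variable can be computed exactly by enumerating all inputs. The functions maj3 and dictator below are just standard examples, not part of the talk.

```python
from itertools import product

def influence(f, n, i):
    """Exact influence of variable i on f: {0,1}^n -> {0,1}:
    the fraction of inputs x on which flipping bit i changes f(x)."""
    count = 0
    for x in product([0, 1], repeat=n):
        y = list(x)
        y[i] ^= 1  # flip the i-th bit
        if f(x) != f(tuple(y)):
            count += 1
    return count / 2 ** n

maj3 = lambda x: int(sum(x) >= 2)  # 3-bit majority, a threshold function
dictator = lambda x: x[0]          # depends on x_0 only
```

Each variable of 3-bit majority has influence 1/2 (flipping it matters exactly when the other two bits disagree), while the dictator's variable has influence 1 and the others 0.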
Lunch is served at 12:00 (register at http://doodle.com/yd8uez9idrvba4wp by Thu Jan 31 at 8 pm). The presentation starts at 12:10 pm and ends at 1 pm. Those of us who so wish reconvene after a short break for ca two hours of more technical discussions.
30 Jan 2013 at 13:15 in room 1537
Space Lower Bounds in Algebraic Proof Systems
(Nicola Galesi, Sapienza University, Rome)
The study of space measure in Proof Complexity has gained more and more importance in the last years: first, it is clearly of theoretical importance in the study of the complexity of proofs; second, it is connected with SAT solving, since it might provide theoretical explanations of the efficiency or inefficiency of specific Theorem Provers or SAT solvers; finally, in certain cases (like the calculus of Resolution) it is connected with important characterizations studied in Finite Model Theory, thus providing a solid link between the two research fields.
In the talk I will present a recent work, joint with Ilario Bonacina, where we devise a new general combinatorial framework for proving space lower bounds in algebraic proof systems like Polynomial Calculus (PC) and Polynomial Calculus with Resolution (PCR). A simple case of our method allows us to obtain all the currently known space lower bounds for PC/PCR (CT_n, PHP^m_n, BIT PHP^m_n, XOR-PHP^m_n).
Our method can be viewed as a Spoiler-Duplicator game which captures Boolean reasoning on polynomials. Hence, for the first time, we move the problem of studying the space complexity of algebraic proof systems into the range of 2-player games, as is the case for Resolution. This can be seen as a first step towards a precise characterization of space for algebraic systems in terms of combinatorial games, like Ehrenfeucht-Fraisse games, which are used in Finite Model Theory.
More importantly, using our approach to its full potential, we answer the open problem of proving space lower bounds in Polynomial Calculus and Polynomial Calculus with Resolution for the polynomial encoding of randomly chosen k-CNF formulas. Our result holds for k >= 4: refuting a random k-CNF over n variables in PC or PCR requires space of the order of Omega(n). Our method also applies to the Graph-PHP^m_n, which is a PHP^m_n defined over a constant (left) degree bipartite expander graph.
In the talk I will discuss a number of open problems which arise from our work and which might be solved by generalizing our approach.
22 Jan 2013 at 12:00 in room 1537
Verifying a fault-tolerant distributed aggregation protocol with the Coq proof assistant
(Karl Palmskog, KTH CSC)
Decentralized aggregation of data from network nodes is an important way of determining system-wide properties in distributed systems, e.g. in sensor networks. A scalable way of performing such aggregation is by maintaining a network overlay in the form of a tree, with data flowing from the leaves towards the root.
We describe a variant of the tree-based Generic Aggregation Protocol which is well suited to secure aggregation, and argue formally that the protocol has the expected behaviour for networks of arbitrary size, even in the presence of crash failures.
Practical verification techniques for communication protocols such as model checking usually require models with bounded state space. To reason about the protocol without such undue abstraction, we encode it as a transition system in the inductive logical framework of the Coq proof assistant and perform machine-assisted proofs.
Using Coq for verification requires knowledge of many techniques and theories unrelated to the problem domain, and we give an overview of the libraries, tools, and trade-offs involved.
Lunch is served at 12:00 noon (register at http://doodle.com/bkgrqqq7xq4i6vuk no later than Jan 20 at 8 pm).
07 Dec 2012 at 13:15 in room 4523
On the Power of Conflict-Driven Clause-Learning SAT Solvers
(Johan Frisk, Stockholm University)
I will present two theoretical results on modern SAT solvers regarding their power as proof systems compared to general resolution. We will see that a formalization of the algorithm used by conflict-driven clause learning (CDCL) SAT solvers polynomially simulates general resolution assuming that a specific decision strategy is used. Earlier proofs of this required an extra preprocessing step on the input formula or modifications to the solver that do not correspond to how CDCL solvers work in practical implementations.
In contrast to the first result, which works only under a non-deterministic perfect-choice decision strategy, our second result shows that with probability 50% the same solvers, now equipped with a totally random decision strategy, end up behaving like width-k resolution after O(n^(2k+1)) conflicts and restarts.
The results are from the two articles: - "On the Power of Clause-Learning SAT Solvers with Restarts" by Knot Pipatsrisawat and Adnan Darwiche (CP 2009). - "Clause-Learning Algorithms with Many Restarts and Bounded-Width Resolution" by Albert Atserias, Johannes Klaus Fichte and Marc Thurley (SAT 2009).
This is a presentation of the theoretical part of my Master's degree project.
The first part of the seminar will be a self-contained 45-minute overview of the two papers. After a break, those who so wish can return for a somewhat more detailed discussion of some of the proofs.
26 Nov 2012 at 12:00 in room 4523
Monotone submodular maximization over a matroid using local search
(Yuval Filmus, University of Toronto)
Maximum coverage is the relative of set cover in which instead of trying to cover the entire universe with as few sets as possible, we are trying to cover as many elements as possible using a fixed number of sets. The greedy algorithm gives a 1-1/e approximation. Surprisingly, Feige proved that this ratio is optimal unless P=NP.
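For concreteness, the greedy algorithm mentioned above can be sketched in Python; the input family of sets is made up for illustration.

```python
def greedy_max_coverage(sets, k):
    """Greedy algorithm for maximum coverage: k times, pick the set
    covering the most yet-uncovered elements. This achieves a (1 - 1/e)
    approximation, which Feige showed is optimal unless P = NP."""
    covered, chosen = set(), []
    for _ in range(k):
        # marginal gain = number of newly covered elements
        best = max(sets, key=lambda s: len(set(s) - covered))
        chosen.append(best)
        covered |= set(best)
    return chosen, covered
```

With sets {1,2,3}, {3,4}, {4,5}, {1,2} and k = 2, greedy first takes {1,2,3} and then {4,5}, covering all five elements.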
Things get more complicated when we replace the cardinality constraint on the sets with a matroid constraint (say, there are K types of sets, and we should choose at most one of each type). The greedy algorithm now gives only a 1/2 approximation. Calinescu et al. developed a sophisticated algorithm that gives a 1-1/e approximation. Their algorithm combines gradient ascent on a continuous relaxation with optimal rounding, and also works in the more general setting of monotone submodular maximization.
We show that non-oblivious local search also gives the optimal approximation ratio. The auxiliary objective function, with respect to which the local search proceeds, gives more weight to elements covered multiple times in the current solution. We also extend our algorithm to the setting of monotone submodular maximization.
Joint work with Justin Ward.
Register for a light lunch at http://doodle.com/v5wyzey6usa78yu4 no later than Thursday at 8 pm. Lunch is served at 12:00 noon and the talk starts at 12:10 pm sharp.
13 Nov 2012 at 10:15 in room 1537
Incentive Dynamics of Interdependent Network Security
(John Chuang, UC Berkeley)
Are we investing too little in information security? Are we investing too much? Since Anderson and Varian posed these questions in 2002, much progress has been made in understanding rational decision-making in information security. In this talk, I will explore the incentive dynamics of interdependent network security, the strategic tradeoffs between protection-based and insurance-based risk mitigation alternatives, the role of experts and intermediaries, and the scalability of game theoretic models for Internet-scale security threats.
07 Nov 2012 at 13:15 in room E53
Applying Boolean Groebner Basis to Cryptography and Formal Verification
(Alexander Dreyer, Fraunhofer Institute for Industrial Mathematics ITWM)
We will present and discuss some application examples for Boolean Groebner basis from cryptography and formal verification.
Computing a Groebner basis has double-exponential complexity in the worst case, but thanks to very sophisticated heuristics plenty of practical problems can be solved in reasonable time.
We will also talk about (still) open problems.
06 Nov 2012 at 12:00 in room E53
Basics on Boolean Groebner Basis and Algebraic SAT solving
(Alexander Dreyer, Fraunhofer Institute for Industrial Mathematics ITWM)
We will give a short introduction to the topic of Groebner bases; in particular, we will scratch the surface of Boolean polynomials, i.e. those polynomials with coefficients in {0,1} and a fixed degree bound (per variable) of one.
This is accompanied by a brief presentation of the data structure proposed for Boolean polynomials in our software framework PolyBoRi, for computing with POlynomials in BOolean RIngs. These polynomials are the algebraic representation of Boolean functions. This enables us to use techniques (like the Groebner basis formalism) from computational algebra for solving Boolean problems. Their special properties allow for representing them effectively as binary decision diagrams, but this calls for new heuristics and algorithms.
Finally, the talk will connect Groebner bases with the classical SAT solver approach to solving the satisfiability problem of propositional logic.
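As a toy illustration of the algebraic representation of Boolean functions, the sketch below translates each CNF clause into a polynomial over the Boolean ring and checks for a common root by brute-force evaluation. This is not an actual Groebner basis computation as performed by PolyBoRi, just the encoding step.

```python
from itertools import product

def clause_poly(clause):
    """Translate a clause (list of nonzero ints, -v = negated variable v)
    into a function over GF(2): the product of (1 + x_v) for positive
    literals and x_v for negative ones. It evaluates to 0 exactly on
    assignments satisfying the clause, so a CNF is satisfiable iff the
    polynomial system {p_C = 0} has a common root."""
    def p(assignment):
        val = 1
        for lit in clause:
            x = assignment[abs(lit)]
            val = (val * ((1 + x) if lit > 0 else x)) % 2
        return val
    return p

def has_common_root(cnf, nvars):
    """Brute-force search for a common root of the clause polynomials."""
    polys = [clause_poly(c) for c in cnf]
    for bits in product([0, 1], repeat=nvars):
        a = dict(enumerate(bits, start=1))
        if all(p(a) == 0 for p in polys):
            return True
    return False
```

For example, the contradictory clauses {x1} and {not x1} have no common root, while {x1 v not x2} together with {x2} does (x1 = x2 = 1).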
Please register for lunch before Sunday, 6pm.
05 Nov 2012 at 14:00 in room 4523
Privacy and Security for Location-based Applications
(Urs Hengartner, University of Waterloo)
Recently, location-based applications have become popular, with applications like Foursquare or Yelp having hundreds of thousands of users. This trend has also highlighted several security and privacy challenges of these applications. I will focus on two such challenges in my talk. First, a user's location is a crucial factor for enabling these applications. Many applications rely on users to correctly report their location. However, if there is an incentive, users might lie about their location. A location proof architecture enables users to collect proofs for being at a location and applications to validate these proofs. It is essential that proof collection and validation do not violate user privacy. I will introduce a location proof architecture with user privacy as a key design component. Second, matchmaking is a crucial part of location-based social networking applications. It notifies users of nearby people who fulfil some criteria, such as having shared interests or friends, and who are therefore good candidates for being added to a user's social network. A danger of matchmaking is that malicious users may be able to learn any nearby user's profile. I will introduce a privacy-preserving matchmaking protocol for location-based social networking applications. The protocol lets a potentially malicious user learn only the interests (or some other traits) that he has in common with a nearby user, but no other interests. Finally, I will present an implementation and evaluation of our work on Nexus One smartphones and demonstrate that the work is practical.
29 Oct 2012 at 12:00 in room 4523
Some parts of "Some optimal inapproximability results"
(Johan Håstad, KTH CSC)
On popular demand, Johan Håstad has promised to tell us about the paper that was awarded the Gödel Prize last year. In particular, he will focus on his result concerning how hard it is to find good approximate solutions to systems of linear equations.
Suppose that you have a system of equations x + y + z = ODD; x + z + w = EVEN; y + z + w = EVEN; et cetera, where you want to choose integer values (0 or 1 without loss of generality) so that as many equations as possible are satisfied. Perhaps the most naive solution you could think of is to simply assign random values to the variables, and it is not hard to see that this will satisfy 50% of the equations. Surprisingly, it turns out that this naive solution is also optimal --- there is nothing better you can do unless P is equal to NP!
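The 50% guarantee of the naive random assignment is easy to check empirically. The sketch below uses the three example equations above; the number of trials is chosen arbitrarily.

```python
import random

def satisfied_fraction(equations, assignment):
    """equations: list of (variables, parity) with parity 1 for ODD
    and 0 for EVEN. An equation holds if the sum of its variables
    has the given parity."""
    ok = sum(1 for vars_, parity in equations
             if sum(assignment[v] for v in vars_) % 2 == parity)
    return ok / len(equations)

# x + y + z = ODD; x + z + w = EVEN; y + z + w = EVEN
eqs = [(("x", "y", "z"), 1), (("x", "z", "w"), 0), (("y", "z", "w"), 0)]

random.seed(0)
trials = [satisfied_fraction(eqs, {v: random.randint(0, 1) for v in "xyzw"})
          for _ in range(10000)]
avg = sum(trials) / len(trials)  # concentrates around 0.5
```

Each equation is satisfied with probability exactly 1/2 under a uniformly random assignment, so the average satisfied fraction converges to 50%.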
This result is established using so-called probabilistically checkable proofs (PCPs), which have the amazing property that you can check the validity of such proofs just by making a few random spot checks. The famous PCP theorem says that you can code solutions to NP-complete problems in such a way that only a constant number of bits need to be read in order to verify them (with high probability). This theorem will be explained in more detail (but not proven) during the presentation.
This will be a fully self-contained lunch seminar, and no prior knowledge of the subject is assumed. We meet for a light lunch at 12:00 noon sharp. The presentation will start at 12:15 pm and last for 45-60 minutes. If there is interest, there will be a possibility to reconvene after a break and go into more details during 90-120 additional minutes of more technical discussions.
Please sign up for the seminar before 24th of October if you want to have lunch.
16 Oct 2012 at 12:00 in room 4523
Optimality of size-degree tradeoffs for polynomial calculus
(Massimo Lauria, KTH CSC)
Polynomial calculus is a way of refuting unsatisfiable CNF formulas by translating them to polynomials and proving that these polynomials have no common root. To show lower bounds on the size of such proofs, one usually proves strong lower bounds on degree, and then uses a general size-degree trade-off theorem saying that very high degree implies very high size.
There is an annoying gap in this theorem, however, in that fairly high degree, up to sqrt(n), cannot be proven to say anything about size. A natural question is whether this is inherent or whether the theorem could be strengthened so that somewhat strong degree lower bounds would yield somewhat strong size lower bounds.
We rule out this possibility by proving that the size-degree trade-off in its current form is essentially optimal. We do so by studying formulas encoding properties of linear orderings, which are known to have small proofs, and showing that these formulas require sqrt(n) degree.
Joint work with Nicola Galesi.
18 Sep 2012 at 13:15 in room 1537
Formalizing Physical Security Procedures
(Catherine Meadows, Naval Research Laboratory, Washington DC)
Although the problems of physical security emerged more than 10,000 years before the problems of computer security, no formal methods have been developed for them, and the solutions have been evolving slowly, mostly through social procedures. But as the traffic on physical and social networks is now increasingly expedited by computers, the problems of physical and social security are becoming technical problems. From various directions, many security researchers and practitioners have come to the realization that areas such as transportation security, public and private space protection, or critical infrastructure defense are in need of formalized engineering methodologies. Following this lead, we extended Protocol Derivation Logic (PDL) to Procedure Derivation Logic (still PDL). In contrast with a protocol, where some principals send and receive some messages, in a procedure they can also exchange and move some objects. For simplicity, in the present paper we actually focus on the security issues arising from the traffic of objects, and leave the data flows, and the phenomena emerging from the interaction of data and objects, for future work. We illustrate our approach by applying it to a flawed airport security procedure described by Schneier.
10 Sep 2012 at 13:15 in room 1537
Space Complexity in Polynomial Calculus
(Massimo Lauria, KTH CSC)
During the last decade, an active line of research in proof complexity has been to study space complexity and time-space trade-offs for proofs. Besides being a natural complexity measure of intrinsic interest, space is also an important issue in SAT solving, and so research has mostly focused on weak systems that are used by SAT solvers.
There has been a relatively long sequence of papers on space in resolution and resolution-based proof systems, and it is probably fair to say that resolution is reasonably well understood from this point of view. For other natural candidates to study, however, such as polynomial calculus or cutting planes, very little has been known. We are not aware of any nontrivial space lower bounds for cutting planes, and for polynomial calculus the only lower bound has been for CNF formulas of unbounded width in [Alekhnovich et al. '02], where the space lower bound is smaller than the initial width of the clauses in the formulas. Thus, in particular, it has been consistent with current knowledge that polynomial calculus could be able to refute any k-CNF formula in constant space.
In this paper, we prove several new results on space in polynomial calculus (PC), and in the extended proof system polynomial calculus resolution (PCR) studied in [Alekhnovich et al. '02]:
1. We prove an Omega(n) space lower bound in PC for the canonical 3-CNF version of the pigeonhole principle formulas PHP^m_n with m pigeons and n holes, and show that this is tight.
2. For PCR, we prove an Omega(n) space lower bound for a bitwise encoding of the functional pigeonhole principle with m pigeons and n holes. These formulas have width O(log(n)), and so this is an exponential improvement over [Alekhnovich et al. '02] measured in the width of the formulas.
3. We then present another encoding of a version of the pigeonhole principle that has constant width, and prove an Omega(n) space lower bound in PCR for these formulas as well.
4. Finally, we prove that any k-CNF formula can be refuted in PC in simultaneous exponential size and linear space (which holds for resolution and thus for PCR, but was not obviously the case for PC). We also characterize a natural class of CNF formulas for which the space complexity in resolution and PCR does not change when the formula is transformed into a 3-CNF in the canonical way, something that we believe can be useful when proving PCR space lower bounds for other well-studied formula families in proof complexity.
Joint work with Yuval Filmus, Jakob Nordström, Noga Ron-Zewi, and Neil Thapen.
27 Aug 2012 at 13:15 in room 1537
Computational Complexity of Quantum Satisfiability
(Martin Ziegler, TU Darmstadt)
Quantum logic was introduced in 1936 by Garrett Birkhoff and John von Neumann as a framework for capturing the logical peculiarities of quantum observables. It generalizes, and on 1-dimensional Hilbert space coincides with, Boolean propositional logic.
We introduce the weak and strong satisfiability problem for quantum logic terms. It turns out that in dimension two both are also $\mathcal{NP}$--complete.
For higher-dimensional spaces $\mathbb{R}^d$ and $\mathbb{C}^d$ with $d\geq3$ fixed, on the other hand, we show both problems to be complete for the nondeterministic Blum-Shub-Smale model of real computation. This provides a unified view on both Turing and real BSS complexity theory; and extends the still relatively scarce family of $\mathcal{NP}_{\mathbb{R}}$--complete problems with one perhaps closest in spirit to the classical Cook-Levin Theorem.
Our investigations of the dimensions in which a term is weakly/strongly satisfiable lead to satisfiability problems in indefinite finite dimension. Here, strong satisfiability turns out to be polynomial-time equivalent to the feasibility of noncommutative integer polynomial equations over matrix rings.
Joint work with Christian Herrmann.
20 Aug 2012 at 15:15 in room 1537
Security Strategies for Data Outsourcing and the Usability Trade-Off
(Lena Wiese)
When outsourcing data to a storage provider, several security concerns arise. We survey mechanisms for achieving confidentiality of outsourced data with a special focus on strategies for data distribution on several independent providers. A logical definition of confidentiality based on non-derivability of secrets is given. The data distribution strategies can be accompanied by cryptographic mechanisms. However, there is an intricate trade-off between confidentiality and usability (in particular, in terms of collaborating users) of outsourced data which we will briefly discuss.
29 Jun 2012 at 13:15 in room 1537
Modelling and Defence against Propagation of Worms
(Wanlei Zhou, Deakin University, Melbourne, Australia)
Each year, large amounts of money and labor are spent on patching the vulnerabilities in operating systems and various popular software to prevent exploitation by worms. Modeling the propagation process can help us to devise effective defence strategies against the spreading of those worms. In this talk we present a microcosmic analysis of worm propagation procedures.
Our proposed model is different from traditional methods and examines deep inside the propagation procedure among nodes in the network by concentrating on the propagation probability and time delay described by a complex matrix. Moreover, since the analysis gives a microcosmic insight into a worm's propagation, the proposed model can avoid errors that are usually concealed in the traditional macroscopic analytical models. The objectives of this talk are to address three practical aspects of preventing worm propagation: (i) where do we patch? (ii) how many nodes do we need to patch? (iii) when do we patch? We also carried out a series of experiments to evaluate the effects of each major component in our microcosmic model. This talk will be based on the following papers:
1. Yini Wang, Sheng Wen, Silvio Cesare, Wanlei Zhou and Yang Xiang, "The Microcosmic Model of Worm Propagation", The Computer Journal, Vol. 54 No. 10, pp. 1700-1720, 2011
2. Yini Wang, Sheng Wen, Silvio Cesare, Wanlei Zhou, and Yang Xiang, "Eliminating Errors in Worm Propagation Models", IEEE Communication Letters, VOL. 15, NO. 9, pp. 1022-1024, SEPTEMBER 2011.
3. Sheng Wen, Wei Zhou, Yini Wang, Wanlei Zhou, and Yang Xiang, "Locating Defense Positions for Thwarting the Propagation of Topological Worms", IEEE Communication Letters, VOL. 16, NO. 4, pp. 560-563, APRIL 2012.
04 Jun 2012 at 12:15 in room 1537
On the Combinatorics of Minimally Unsatisfiable Formulas
(Jakob Nordström, KTH CSC)
Suppose that we have a formula in conjunctive normal form (CNF) that is unsatisfiable but has a satisfying assignment as soon as any one clause is removed. How many variables can such a minimally unsatisfiable CNF formula constrain counted in the number of clauses m? A classic result says that the answer is m-1 variables.
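The notion of minimal unsatisfiability can be checked by brute force on small instances. The sketch below verifies a classic chain formula with m = 4 clauses over m - 1 = 3 variables, matching the m-1 bound stated above.

```python
from itertools import product

def is_minimally_unsat(cnf):
    """A CNF (list of clauses; a clause is a list of nonzero ints,
    -v meaning negated variable v) is minimally unsatisfiable if it is
    unsatisfiable but removing any single clause makes it satisfiable.
    Both conditions are checked by brute-force enumeration."""
    variables = sorted({abs(l) for c in cnf for l in c})

    def satisfiable(clauses):
        for bits in product([False, True], repeat=len(variables)):
            a = dict(zip(variables, bits))
            if all(any(a[l] if l > 0 else not a[-l] for l in c)
                   for c in clauses):
                return True
        return False

    return (not satisfiable(cnf) and
            all(satisfiable(cnf[:i] + cnf[i + 1:]) for i in range(len(cnf))))

# Chain formula: x1; (not x1 or x2); (not x2 or x3); (not x3)
chain = [[1], [-1, 2], [-2, 3], [-3]]
```

The chain formula is minimally unsatisfiable, whereas a formula containing a redundant clause (e.g. x1; not x1; x2) is unsatisfiable but not minimally so.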
Motivated by problems in proof complexity, [Ben-Sasson and Nordstrom 2009] extended the concept of minimal unsatisfiability to sets of k-DNF formulas and proved that a minimally unsatisfiable k-DNF set with m formulas can contain at most on the order of (mk)^(k+1) variables. This was far from tight, however, since they could only present explicit constructions of minimally unsatisfiable sets with mk^2 variables. In this talk, we present a significant improvement of the lower bound to (roughly) m^k, which asymptotically almost matches the upper bound above.
This talk will be completely elementary but will contain some (in our opinion) pretty cute combinatorics as well as an accessible open problem. It is based on the joint paper with Alexander Razborov "On Minimal Unsatisfiability and Time-Space Trade-offs for k-DNF Resolution" presented at ICALP '11.
24 May 2012 at 10:15 in room 1537
Towards a Framework for Conflict Analysis of Normative Texts Written in Controlled Natural Language
(Gerardo Schneider, Chalmers)
Our aim is to detect whether texts written in natural language contain normative conflicts (i.e., whether there are conflicting obligations, permissions and prohibitions). In this talk we present AnaCon, a framework where such texts are written in Controlled Natural Language (CNL) and automatically translated into the formal language CL using the Grammatical Framework (GF). In AnaCon such CL expressions are analyzed for normative conflicts by the tool CLAN which gives a counter-example in case a conflict is found. We will also show the usability of AnaCon on a case study, and discuss research challenges and future directions in the area.
22 May 2012 at 12:15 in room 1537
A Dichotomy for Real Weighted Holant Problems
(Sangxia Huang, KTH CSC)
Holant is a framework of counting characterized by local constraints. It is closely related to other well-studied frameworks such as #CSP and Graph Homomorphism. An effective dichotomy for such frameworks can immediately settle the complexity of all combinatorial problems expressible in that framework. Both #CSP and Graph Homomorphism can be viewed as sub-families of Holant with the additional assumption that the equality constraints are always available. Other sub-families of Holant such as Holant^* and Holant^c problems, in which we assume some specific sets of constraints to be freely available, were also studied. The Holant framework becomes more expressive and contains more interesting tractable cases with less or no freely available constraint functions, while, on the other hand, it also becomes more challenging to obtain a complete characterization of its time complexity. Recently, complexity dichotomy for a variety of sub-families of Holant such as #CSP, Graph Homomorphism, Holant^* and Holant^c were proved. The dichotomy for the general Holant framework, which is the most desirable, still remains open. In this paper, we prove a dichotomy for the general Holant framework where all the constraints are real symmetric functions. This setting already captures most of the interesting combinatorial problems defined by local constraints, such as (perfect) matching, independent set, vertex cover and so on. This is the first time a dichotomy is obtained for general Holant Problems without any auxiliary functions.
One benefit of working with Holant framework is some powerful new reduction techniques such as Holographic reduction. Along the proof of our dichotomy, we introduce a new reduction technique, namely realizing a constraint function by approximating it. This new technique is employed in our proof in a situation where it seems that all previous reduction techniques fail, thus this new idea of reduction might also be of independent interest. Besides proving dichotomy and developing new technique, we also obtained some interesting by-products. We prove a dichotomy for #CSP restricting to instances where each variable appears a multiple of d times for any d. We also prove that counting the number of Eulerian-Orientations on 2k-regular graphs is #P-hard for any k>=2.
07 May 2012 at 10:15 in room 4523
Algorithms for Semi-random instances of Unique Games and Graph Partitioning Problems
(Konstantin Makarychev, Microsoft Research)
Many combinatorial optimization problems are much simpler in practice than in the worst-case. One of the challenges in the area of approximation algorithms is to explain this phenomenon and to design algorithms that work well in real-life. In this lecture, we will discuss one of the models of real-life instances -- the semi-random model, which was originally introduced by Blum and Spencer for the k coloring problem. I will present a new semi-random model for graph partitioning problems and give a constant factor approximation algorithm for semi-random instances of Balanced Cut. I will also talk about semi-random Unique Games.
Based on joint works with Alexandra Kolla (UIUC), Yury Makarychev (TTIC), Aravindan Vijayaraghavan (Princeton)
04 May 2012 at 15:15 in room 1537
The Grothendieck constant is strictly smaller than Krivine's bound
(Yury Makarychev, Microsoft Research)
I will talk about Grothendieck's Inequality. The inequality was proved by Grothendieck in 1953, and since then it has found numerous applications in Analysis, Quantum Mechanics and Computer Science. From the point of view of combinatorial optimization, the inequality states that the integrality gap of a certain semi-definite program is less than some absolute constant. The optimal value of this constant is called the Grothendieck constant K_G. The Grothendieck constant lies between 1.67 and 1.79, however, its exact value is unknown. The last progress on this problem was in 1977, when Krivine proved that K_G \leq \pi / (2 log(1+\sqrt{2})) and conjectured that his bound is optimal. In this talk, we will disprove this conjecture and show that K_G is strictly less than Krivine's bound. We will show that for this problem a new binary rounding scheme, which projects vectors on a random 2 dimensional subspace, performs better than the ubiquitous random hyperplane technique.
Joint work with Mark Braverman (Princeton University), Konstantin Makarychev (Microsoft Research), Assaf Naor (Courant Institute).
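For readers unfamiliar with the rounding techniques mentioned above, here is a minimal Python sketch (ours, not the authors') of the classical random hyperplane scheme; two unit vectors at angle theta receive different signs with probability theta/pi:

```python
import math
import random

random.seed(1)

def hyperplane_round(vectors):
    """Random hyperplane rounding: draw a Gaussian vector g and label
    each unit vector v with the sign of <g, v>."""
    g = [random.gauss(0, 1) for _ in range(len(vectors[0]))]
    return [1 if sum(gi * vi for gi, vi in zip(g, v)) >= 0 else -1
            for v in vectors]

# Two unit vectors at angle theta disagree with probability theta/pi.
theta = math.pi / 3
vs = [[1.0, 0.0], [math.cos(theta), math.sin(theta)]]
trials = 20000
count = 0
for _ in range(trials):
    a, b = hyperplane_round(vs)
    count += a != b
print(count / trials)  # empirically close to theta/pi ≈ 0.333
```

The paper's improvement replaces this one-dimensional projection by a rounding scheme that projects onto a random two-dimensional subspace.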
18 Apr 2012 at 13:15 in room 1537
Protecting the Collective Trust in the Democratic Process
(Carsten Schürmann, DemTech)
The DemTech research project aims to test the hypothesis that is possible to modernize the Danish democratic process while preserving the collective trust. In my talk I will present preliminary research results from the recent parliamentary elections in Denmark and discuss them in the light of the recent Norwegian pilot project on internet elections.
13 Apr 2012 at 13:15 in room 1537
Towards proving the correctness of a pipelined multi core MIPS processor at the gate level
(Prof. Dr. Wolfgang Paul, Universität des Saarlandes, Saarbrücken, Germany)
We outline the correctness proof of a pipelined multi core MIPS processor. The major ingredients are
1) the well known theory of number representations and modulo computation for the ALU construction
2) a simple theory of bus control with tri-state drivers. Surprisingly, the binary hardware model is provably inadequate for this; one has to refer to propagation delays and set-up and hold times.
3) a tool box for constructing all kinds of random access memories with up to 3 ports
4) a theory of shifting bytes and half words in load and store operations when accessing wide memories
5) a concise description of the sequential instruction set architecture (ISA) exploiting the specification of units like the ALU. This results in an almost trivial construction of a sequential reference machine and the corresponding correctness proof.
6) a theory of pipelining sequential reference machines using forwarding and hardware interlocks; the proofs rely on scheduling functions that keep track of which instruction is in which stage in which hardware cycle.
7) the concept of i) abstract caches, which allows in many situations to abstract from tag RAMs, and ii) of memory system slices, which project memory systems to single addresses
8) the classical theory of cache coherence protocols, showing that desirable state invariants and memory abstractions are maintained if cache accesses are performed in an arbitrary order and sequentially (!) in an atomic fashion. Observe that in this model sequential consistency is trivial.
9) an implementation of the MOESI protocol in hardware
10) a proof that the buses for the exchange of protocol information and data between caches and main memory are free of contention. This requires to show that bus arbitration works correctly and that control automata of caches are synchronized during global transactions.
11) a classification of accesses that have the same address and can end in the same cycle
12) a proof that the hardware emulates the atomic protocol of the classical theory if hardware cache accesses are ordered by their end times (the proof is easier if flushes ending in cycle t are ordered behind read hits ending in cycle t)
13) the definition of multicore ISA referring to multiple cores which access a shared memory sequentially in an unknown order
14) the construction of a non pipelined reference implementation which steps the cores using an oracle
15) the construction of a pipelined multi core hardware by integration of the pipelined processor implementations into the shared memory implementation
16) an integration of the correctness proofs for the hardware memory system and the pipelined implementations, showing that the multi core hardware emulates the multi core reference implementation; the oracle which makes the proof work is constructed from the sequential order of the cache accesses.
17) a subtle liveness argument involving a nontrivial use of the hardware interlock together with the shared memory implementation
This is work in progress. The current version of the proof under construction can be looked up at http://www-wjp.cs.uni-saarland.de/lehre/vorlesung/rechnerarchitektur2/ws1112/layouts/multicorebook.pdf.
At the time of writing this abstract the construction site is at item 16)
04 Apr 2012 at 13:15 in room 1537
Linear time decoding of regular expander codes
(Michael Viderman, Technion)
Sipser and Spielman (IEEE IT, 1996) showed that any $(c,d)$-regular expander code with expansion parameter $> \frac{3}{4}$ is decodable in \emph{linear time} from a constant fraction of errors. Feldman et al. (IEEE IT, 2007) proved that expansion parameter $> \frac{2}{3} + \frac{1}{3c}$ is sufficient to correct a constant fraction of errors in \emph{polynomial time} using LP decoding.
In this work we give a simple combinatorial algorithm that achieves even better parameters. In particular, our algorithm runs in \emph{linear time} and works for any expansion parameter $> \frac{2}{3} - \frac{1}{6c}$. We also prove that our decoding algorithm can be executed in logarithmic time on a linear number of parallel processors.
03 Apr 2012 at 13:15 in room 1537
Better balance by being biased: a 0.8776-algorithm for Max Bisection
(Per Austrin, University of Toronto)
Recently Raghavendra and Tan (SODA 2012) gave a 0.85-approximation algorithm for the Max Bisection problem. We improve their algorithm to a 0.8776-approximation. As Max Bisection is hard to approximate within roughly 0.8786 (under the Unique Games Conjecture) our algorithm is very nearly optimal.
We also obtain an improved algorithm for the analogous variant of Max 2-Sat. Our approximation ratio for this problem exactly matches the optimal (assuming the UGC) ratio of roughly 0.9401 for Max 2-Sat, showing that the bisection constraint does not make Max 2-Sat harder.
(Joint work with Siavosh Benabbas and Konstantinos Georgiou.)
02 Apr 2012 at 13:15 in room 1537
An Empirical Approach to Understand BitTorrent
(Boxun Zhang, TU Delft)
BitTorrent has been one of the most popular Peer-to-Peer file sharing applications in the last decade, and it generates a significant amount of Internet traffic. Although the protocols and algorithms used by BitTorrent are simple, it is not trivial to understand how BitTorrent operates and how BitTorrent users behave in the real world, and such knowledge is key for researchers to improve BitTorrent and build new applications like video streaming, file backup, and social networks upon it.
To obtain such knowledge, we have performed measurements of many different BitTorrent communities, and we turn those efforts into the P2P Trace Archive, a place for researchers to exchange traces. We also look into the strengths and limits of various measurement techniques, and try to understand how those techniques affect the measurement results. Using our recent datasets from two of the largest BitTorrent trackers, we investigate the flash crowd phenomenon in BitTorrent, and reveal many interesting facts that were previously unknown to the community. Interestingly, we find that BitTorrent does not always perform well in flash crowds, which differs from the impressions of many of us.
15 Mar 2012 at 12:15 in room 1537
The Devil is in the Metadata – New Privacy Challenges in Decentralised Online Social Networks
(Benjamin Greschbach, KTH CSC)
Decentralised Online Social Networks (DOSN) are evolving as a promising approach to mitigate design-inherent privacy flaws of logically centralised services such as Facebook, Google+ or Twitter. A common approach to build a DOSN is to use a peer-to-peer architecture. While the absence of a single point of data aggregation strikes the most powerful attacker from the list of adversaries, the decentralisation also removes some privacy protection afforded by the central party's intermediation of all communication. As content storage, access right management, retrieval and other administrative tasks of the service become the obligation of the users, it is non-trivial to hide the metadata of objects and information flows, even when the content itself is encrypted. Such metadata is, deliberately or as a side effect, hidden by the provider in a centralised system.
In this work, we aim to identify the dangers arising or made more severe from decentralisation, and show how inferences from metadata might invade users' privacy. Furthermore, we discuss general techniques to mitigate or solve the identified issues.
The talk is a rehearsal for the paper presentation at SESOC'12 (Workshop for Security and Social Networking in Lugano on March 19th).
20 Feb 2012 at 12:15 in room 1537
Secure and insecure mixing
(Shahram Khazaei, KTH CSC)
A mix-net, first introduced by Chaum in 1981, is a tool to provide anonymity for a group of senders. The main application is electronic voting, in which each sender submits an encrypted vote and the mix-net then outputs the votes in sorted order. This talk is divided in two parts. First, I present our results on mix-nets with randomized partial checking (RPC), a heuristic protocol proposed by Jakobsson, Juels, and Rivest (2002). We identify serious issues in the original description of mix-nets with RPC and show how to exploit these to break both correctness and privacy. Our attacks are practical and applicable to real world mix-net implementations including Scantegrity, developed by a team of researchers including Chaum and Rivest. We can replace the complete output without detection. In the second part, I will describe a provably secure mix-net called TWT (trip-wire tracing). TWT is the first provably secure mix-net that can be based on any CCA2 secure cryptosystem. It is fairly efficient and uses no zero-knowledge proofs at all.
The first part is a joint work with Douglas Wikström and the second part is a joint work with Tal Moran and Douglas Wikström.
13 Feb 2012 at 13:15 in room 1537
The SAT Problem and Boolean Gröbner Bases
(Samuel Lundqvist, Stockholm University)
The Stone transformation interprets a Boolean formula as a set of polynomials with coefficients in F_2. I will explain how this set of polynomials can be analyzed in order to determine if the original Boolean formula is satisfiable. The talk is intended for computer scientists who are familiar with the SAT problem.
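As a rough illustration of the translation (a sketch of ours, not the speaker's construction): each clause maps to a product of terms (1 + literal) over F_2, which vanishes exactly on the assignments satisfying the clause, so the formula is satisfiable iff the clause polynomials have a common zero. Gröbner basis computations decide this algebraically; the sketch below simply enumerates assignments.

```python
from itertools import product

def clause_poly(clause):
    """Stone-style translation of a clause into an F_2 polynomial,
    returned as an evaluation function: literal v -> v, ¬v -> 1 + v;
    the clause becomes prod(1 + lit), which is 0 exactly on the
    assignments that satisfy the clause."""
    def p(assign):
        result = 1
        for lit in clause:
            lp = assign[abs(lit)] if lit > 0 else (1 + assign[abs(lit)]) % 2
            result = (result * (1 + lp)) % 2
        return result
    return p

def satisfiable(cnf, n):
    """Satisfiable iff the clause polynomials have a common zero over F_2."""
    polys = [clause_poly(c) for c in cnf]
    return any(all(p({i + 1: b[i] for i in range(n)}) == 0 for p in polys)
               for b in product([0, 1], repeat=n))

print(satisfiable([{1, 2}, {-1}, {-2}], 2))  # (x∨y)∧(¬x)∧(¬y): False
print(satisfiable([{1, 2}, {-1}], 2))        # (x∨y)∧(¬x): True (x=0, y=1)
```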
The seminar will be 2 x 45 minutes with a 15 minute break.
30 Jan 2012 at 13:15 in room 1537
Width-parameterized SAT: Time-Space Tradeoffs
(Bangsheng Tang, Tsinghua University)
Width parameterizations of SAT, such as tree-width and path-width, enable the study of computationally more tractable and practical SAT instances. We give two simple algorithms. One that runs simultaneously in time-space $(O^*(2^{2tw(\phi)}), O^*(2^{tw(\phi)}))$ and another that runs in time-space $(O^*(3^{tw(\phi)\log{|\phi|}}),|\phi|^{O(1)})$, where $tw(\phi)$ is the tree-width of a formula $\phi$ with $|\phi|$ many clauses and variables. This partially answers the question of Alekhnovich and Razborov, who also gave algorithms exponential both in time and space, and asked whether the space can be made smaller. We conjecture that every algorithm for this problem that runs in time $2^{tw(\phi)\mathbf{o(\log{|\phi|})}}$ necessarily blows up the space to exponential in $tw(\phi)$.
We introduce a novel way to combine the two simple algorithms that allows us to trade \emph{constant} factors in the exponents between running time and space. Our technique gives rise to a family of algorithms controlled by two parameters. By fixing one parameter we obtain an algorithm that runs in time-space $(O^*(3^{1.441(1-\epsilon)tw(\phi)\log{|\phi|}}), O^*(2^{2\epsilon tw(\phi)}))$, for every $0<\epsilon<1$. We systematically study the limitations of this technique, and show that these algorithmic results are the best achievable using this technique.
We also study further the computational complexity of width parameterizations of SAT. We prove non-sparsification lower bounds for formulas of path-width $\omega(\log|\phi|)$, and a separation between the complexity of path-width and tree-width parametrized SAT modulo plausible complexity assumptions.
This is joint work with Eric Allender, Shiteng Chen, Tiancheng Lou, Periklis Papakonstantinou.
23 Jan 2012 at 13:15 in room 1537
Time-Space Tradeoffs in Resolution: Superpolynomial Lower Bounds for Superlinear Space
(Christopher Beck, Princeton University)
For modern SAT solvers based on DPLL with clause learning, the two major bottlenecks are the time and memory used by the algorithm. This raises the question of whether the memory bottleneck is inherent to Resolution-based approaches, or an artifact of the particular search heuristics currently used in practice. There is a well-known correspondence between these algorithms and the Resolution proof system, in which these resources correspond to the length and space of proofs. While every tautology has a linear-space proof, this proof is in general of exponential size, raising the issue of size-space tradeoffs: perhaps, in high space, there is a short proof, but with constrained space only much longer proofs exist. Space complexity and time-space tradeoffs have been the subject of much recent work in proof complexity, but until this work, no such bound applied to super-linear amounts of space.
We obtain strong time-space tradeoff lower bounds for a simple and explicit collection of formulas - in particular for any k, we give a sequence of formulas of length n such that with n^k space there is a proof of length polynomial in n, but for which every proof is superpolynomial when the space is constrained to n^{k/2}. Thus, on these instances, if you want to run in polynomial time, you need a large polynomial amount of space.
Joint work with Paul Beame and Russell Impagliazzo.
The seminar will be about 45-60 minutes and does not require any prior knowledge of proof complexity.
03 Jan 2012 at 13:15 in room 1537
An Epistemic Approach to Mechanism Design
(Rafael Pass, Cornell University)
We introduce an epistemic framework for analyzing mechanisms. This framework enables mechanism designers to define desirability of outcomes not only based on players' actual payoff types and their beliefs about the payoff types of other players (as in the classic models), but also based on higher order beliefs of the players (i.e., beliefs about beliefs about ... the payoff types of the players). In this framework, we may also use epistemic solution concepts to analyze what outcomes are consistent with different levels of rationality: a player is k-level rational if he is rational and considers all other players (k-1)-level rational; following Aumann, we consider a very weak notion of rationality: player i is *rational* if he uses a strategy \sigma such that for every alternative strategy \sigma', i considers some world possible where \sigma performs at least as well as \sigma'.
We showcase this framework in the context of single-good auctions, presenting an interim individually-rational mechanism with the following revenue guarantee: for any k\geq 0, any outcome consistent with all players being (k+1)-level rational guarantees the seller a revenue of G^k - \epsilon (for any \epsilon > 0), where G^k is the second highest belief about belief about ... (k times) about the highest valuation of some player. We additionally show that no interim individually rational mechanism can guarantee a revenue of G^k - \epsilon for any constant \epsilon, if only assuming players are k-level rational (as opposed to (k+1)-level rational). Taken together, these results demonstrate the existence of a ``revenue-rationality hierarchy'': strictly higher revenue may be extracted by assuming players satisfy higher levels of rationality.
Towards analyzing our mechanism and proving our lower bounds, we introduce an iterative deletion procedure of dominated strategies that precisely characterizes strategies consistent with k-level rationality.
Prior knowledge of mechanism design or epistemic logic will not be assumed.
Joint work with Jing Chen and Silvio Micali.
15 Dec 2011 at 10:15 in room 1537
Modern SAT Solving: CDCL and Inprocessing
(Matti Järvisalo, University of Helsinki)
Boolean satisfiability (SAT) has become an attractive approach to solving hard decision and optimization problems arising from artificial intelligence, knowledge representation, and various industrially relevant domains. The success of the SAT-based approach relies heavily on the development of increasingly robust and efficient SAT solvers. This talk gives a two-part overview of the current state-of-the-art SAT solver technology based on the conflict-driven clause learning (CDCL) paradigm. In the first part, I will provide a basic overview of the most important components of CDCL SAT solvers today. The second part of the talk concentrates on the important aspect of practical preprocessing for SAT and the inprocessing SAT solving paradigm in which more extensive reasoning is interleaved with the core satisfiability search (not only before search). I will review some of the most successful SAT preprocessing techniques, and give an overview of our recent work (joint work with Armin Biere and Marijn Heule) on developing new reasoning techniques for pre- and inprocessing.
The seminar will be 2*45 minutes with a 15 minute break.
07 Dec 2011 at 13:15 in room 1537
An additive combinatorics approach to the log-rank conjecture in communication complexity
(Noga Zewi, Technion, Haifa)
For a {0,1}-valued matrix M let CC(M) denote the deterministic communication complexity of the boolean function associated with M. The log-rank conjecture of Lovasz and Saks [FOCS 1988] states that CC(M) <= log^c(rank(M)) for some absolute constant c, where rank(M) denotes the rank of M over the field of real numbers.
We show that CC(M) <= c rank(M)/log(rank(M)) for some absolute constant c, assuming a well-known conjecture from additive combinatorics, known as the Polynomial Freiman-Ruzsa (PFR) conjecture.
Our proof is based on the study of the "approximate duality conjecture" which was recently suggested by Ben-Sasson and Zewi [STOC 2011] and studied there in connection to the PFR conjecture. First we improve the bounds on approximate duality assuming the PFR conjecture. Then we use the approximate duality conjecture (with improved bounds) to get the aforementioned upper bound on the communication complexity of low-rank matrices, and this part uses the methodology suggested by Nisan and Wigderson [Combinatorica 1995].
Joint work with Eli Ben-Sasson and Shachar Lovett.
The talk will be 2*45 minutes with the first 45 minutes intended for a general audience.
05 Dec 2011 at 13:15 in room 1537
Robust set-valued prediction in games
(Jörgen Weibull, Handelshögskolan, Stockholm)
Game theory has transformed economics and greatly influenced other social and behavioral sciences. The central solution concept used in applications is that of Nash equilibrium. Yet Nash equilibria can be fragile and Nash equilibrium play does not generally follow from assumptions of rationality or of evolution. It is here argued that an exploration of methods for robust set-valued prediction in games is called for, and some such approaches and avenues for future research are discussed.
24 Nov 2011 at 13:15 in room 4523
Integrity Protection for Authorized Changes - Sanitizable Signatures with Transparency or Detectability
(Heinrich Pöhls, University of Passau, Germany)
Sanitizable Signature Schemes enhance Digital Signatures by allowing the signer to authorize a third party, called the sanitizer, to make changes to a signed document. I will introduce our newest results on the re-definition of the security properties Transparency and Detectability for Sanitizable Signatures. Transparency was defined already by G. Ateniese, D. H. Chou, B. de Medeiros, and G. Tsudik in their 2005 ESORICS paper titled "Sanitizable signatures". Transparency in a nutshell means that a verifier cannot distinguish with a probability better than 0.5 between a sanitized and an original version of a signed document.
We have introduced and formalized the notion of detectability, which captures that a verifier will always be able to detect a sanitization. This still allows an authorized change to be verified, but it becomes detectable. In this talk I will explain the security properties of sanitizable signatures and show some applications. Finally, I will introduce the legal implications of transparency, which we circumvent by using a detectable sanitizable signature scheme.
14 Nov 2011 at 13:15 in room 4523
Sampling Massive Online Graphs: Challenges, Techniques, and Applications to Facebook
(Maciej Kurant, University of California, Irvine)
Online Social Networks (OSNs) have recently emerged as a new killer application, and are of interest to a number of communities, ranging from computer science and engineering to social sciences. Because of their sheer size (Facebook alone has more than 800 million active users), OSNs are widely studied today based on *samples* collected through measurements of publicly available information. In this talk, I will give an overview of our recent work on sampling OSNs.
First, I will discuss how to efficiently crawl a friendship graph to collect a representative sample of its *nodes*. To this end, I will introduce two novel techniques: (i) "Multigraph Sampling" that exploits different relations between nodes, and (ii) "Stratified Weighted Random Walk" that preferentially crawls nodes more relevant to our measurement. I will evaluate these techniques on real-life OSNs, such as Facebook and LastFM. This will also allow me to study some basic characteristics of these OSNs, e.g., the privacy awareness in Facebook.
Second, I will focus on *topology* rather than on nodes alone. Breadth First Search (BFS) is a natural and widely used sampling technique in this context. Unfortunately, BFS is subject to nontrivial and often very significant biases, which I quantify analytically. As a viable alternative, I propose to study a "coarse-grained" version of the underlying topology, and I show how to estimate it based on a sample of nodes. Finally, I will apply this methodology to Facebook to obtain a global country-to-country friendship network (more examples available at geosocialmap.com).
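The degree bias of BFS sampling can be seen on a toy graph (a sketch of ours, unrelated to the Facebook measurements): truncated BFS almost immediately discovers the high-degree hub, inflating the sampled mean degree far above the true one.

```python
import random
from collections import deque

random.seed(0)

# Toy graph: a ring of n nodes, plus a hub (node 0) attached to everyone.
n = 1000
adj = {v: {(v - 1) % n, (v + 1) % n, 0} - {v} for v in range(n)}
adj[0] = set(range(1, n))

def bfs_sample(start, k):
    """Return the first k nodes discovered by BFS from `start`."""
    seen, queue, order = {start}, deque([start]), []
    while queue and len(order) < k:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

true_mean = sum(len(adj[v]) for v in adj) / n
sample = bfs_sample(random.randrange(1, n), 50)
bfs_mean = sum(len(adj[v]) for v in sample) / len(sample)
print(f"true mean degree {true_mean:.1f}, BFS-sampled mean degree {bfs_mean:.1f}")
```

Here the true mean degree is about 4, while the BFS sample's mean degree is an order of magnitude larger because the hub is found within the first few steps.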
This work is joint with Athina Markopoulou, Minas Gjoka, and Carter Butts at the University of California, Irvine and with Patrick Thiran at EPFL, Lausanne. Parts of this work appear in IEEE INFOCOM 2010, ITC 2010, ACM SIGMETRICS 2011 and IEEE JSAC 2011.
24 Oct 2011 at 13:15 in room 1537
The Landscape of Structural Graph Parameters
(Michail Lampis, KTH CSC)
In traditional computational complexity we measure algorithm running times as functions of one variable, the size of the input. Though in this setting our goal is usually to design polynomial-time algorithms, most interesting graph problems are unfortunately believed to require exponential time to solve exactly.
Parameterized complexity theory refines this by introducing a second variable, called the parameter, which is supposed to quantify the “hardness” of each specific instance. The goal now becomes to confine the combinatorial explosion to the parameter, by designing an algorithm that runs in time polynomial in the size of the input, though inevitably exponential in the parameter. This will allow us to tackle instances where the parameter value is much more modest than the input size, which will happen often if the parameter is chosen well.
Of course, this setting is rather general and there are countless ways in which one may attempt to measure the hardness of specific instances, depending on the problem. The parameterized graph algorithms literature has been dominated for a long time by a very successful notion called treewidth, which measures graph complexity by quantifying a graph's similarity to a tree. However, more recently the study of alternative graph parameters and widths has become more popular. In this (self-contained) talk we will attempt to explore the algorithmic properties of treewidth, its related graph width parameters, as well as other graph invariants which may serve as measurements of a graph's complexity such as Vertex Cover, or Max-Leaf number. We will focus especially on general results which prove tractability for whole classes of problems characterized by expressibility in certain logics, results often referred to as "algorithmic meta-theorems".
10 Oct 2011 at 13:15 in room 1537
An Anonymity Framework for SaaS
(Ricardo Puttini, University of Brasília)
Cloud computing technology has recently experienced rapid growth and continuous adoption. Enterprises are ever more taking advantage of its dynamicity and ubiquity. However, migrating strategic services to the cloud has put embracing companies in a fragile spot. Entrusting key business knowledge to cloud service providers and consequently relinquishing any claims to complete privacy is the tradeoff applied to adopters of the technology. This happens because the cloud computing service model called SaaS (Software as a Service) makes use of service contracts that inherently give the service provider access to unreasonable amounts of information regarding the consumer’s business. The service consumer has, then, no option but to trust the service provider. Because of that, there emerges a demand for a reliable, private and economically viable framework for cloud service consumption.
Our contribution unfolds as a privacy-aware environment for cloud service consumption with a payment mechanism for maintaining the commercial feasibility of the SaaS service model. To support privacy, anonymity techniques are employed at both the service contract and network levels. As for the payment methods, encryption algorithms in the area of blind signatures complete the proposed framework.
27 Sep 2011 at 10:15 in room 1537
Increasing availability in a p2p storage system through a truthful taxation mechanism
(Krzysztof Rzadca, University of Warsaw, Poland)
In peer-to-peer storage systems, peers replicate each others' data in order to increase availability. If the matching is done centrally, the algorithm can optimize data availability in an equitable manner for all participants. However, if matching is completely decentralized, peers' selfishness can greatly alter the results, leading to performance inequities that can render the system unreliable and thus ultimately unusable.
In this presentation, we show game-theoretic mechanisms that reduce the price of anarchy, i.e., the relative loss of efficiency in the decentralized matching scenario. The mechanism "taxes" the highly-available peers. A fraction of their replication slots is used by a centralized algorithm to replicate data of weakly-available peers. We prove the conditions under which the mechanism is incentive-compatible, i.e., no peer gains by artificially lowering its availability. We also experimentally show that the mechanism renders the system usable, as the data availability of weakly-available peers increases by approximately two orders of magnitude.
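A back-of-envelope sketch (independent uptimes, not the paper's actual model) shows why taxing highly-available peers helps so much:

```python
# Data is available if at least one of its replica holders is online;
# uptimes are assumed independent.

def availability(uptimes):
    """P(at least one replica online) under independence."""
    unavail = 1.0
    for p in uptimes:
        unavail *= (1.0 - p)
    return 1.0 - unavail

# A weakly-available peer alone (20% uptime):
solo = availability([0.2])

# After the mechanism places two extra replicas of its data on
# highly-available peers (95% uptime each):
taxed = availability([0.2, 0.95, 0.95])

print(f"solo unavailability:  {1 - solo:.4f}")   # 0.8000
print(f"taxed unavailability: {1 - taxed:.4f}")  # 0.0020
```

Unavailability drops from 0.8 to 0.002, i.e. by roughly the two orders of magnitude reported in the abstract (the uptime figures here are hypothetical).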
26 Sep 2011 at 16:15 in room 1537
Codes tailor-made for distributed networked storage
(Anwitaman Datta, NTU Singapore)
Redundancy is essential for fault-tolerance in distributed networked storage systems – which are ubiquitous and come in diverse flavors (e.g., p2p storage, data-centers). Erasure codes provide orders of magnitude better performance than replication in terms of fault-tolerance/storage overhead trade-offs; however, traditional erasure codes incur a high overhead for recreating lost redundancy in the system. This cardinal drawback has led to a recent flurry in designing codes which are tailor-made with the nuances of distributed storage systems in mind. In this talk, we will provide a brief overview of some such proposed codes, concluding with self-repairing codes.
More details can be found at: http://sands.sce.ntu.edu.sg/CodingForNetworkedStorage/
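The fault-tolerance/overhead trade-off can be quantified with standard MDS-code arithmetic (a generic illustration, not specific to self-repairing codes): with each node failing independently with probability f, compare 3x replication against a (9, 3) erasure code at the same 3x storage overhead.

```python
from math import comb

def loss_replication(r, f):
    """Data lost iff all r replicas fail."""
    return f ** r

def loss_erasure(n, k, f):
    """(n, k) MDS code: data lost iff fewer than k fragments survive."""
    return sum(comb(n, i) * (1 - f) ** i * f ** (n - i)
               for i in range(k))

f = 0.1
print(loss_replication(3, f))   # 3x replication: loss prob. 1e-3
print(loss_erasure(9, 3, f))    # (9,3) code, same overhead: ~3e-6
```

At identical overhead the code is hundreds of times more reliable, which is the "orders of magnitude" advantage the abstract refers to; the repair cost, which this toy calculation ignores, is exactly what the tailor-made codes address.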
19 Sep 2011 at 13:15 in room 4523
A little advice can be very helpful
(Arkadev Chattopadhyay, University of Toronto)
Proving superpolylogarithmic lower bounds for dynamic data structures has remained an open problem despite years of research. Recently, Patrascu proposed an exciting new approach for breaking this barrier via a two player communication model in which one player gets private advice at the beginning of the protocol. He gave reductions from the problem of solving an asymmetric version of set-disjointness in his model to a diverse collection of natural dynamic data structure problems in the cell probe model. He also conjectured that, for any hard problem in the standard two-party communication model, the asymmetric version of the problem is hard in his model, provided not too much advice is given.
We prove several surprising results about his model. We show that there exist Boolean functions requiring linear randomized communication complexity in the two-party model, for which the asymmetric versions in Patrascu's model have deterministic protocols with exponentially smaller complexity. For set-disjointness, which also requires linear randomized communication complexity in the two-party model, we give a deterministic protocol for the asymmetric version in his model with a quadratic improvement in complexity. These results demonstrate that Patrascu's conjecture, as stated, is false. In addition, we show that the randomized and deterministic communication complexities of problems in his model differ by no more than a logarithmic multiplicative factor.
We also prove lower bounds in some restricted versions of this model for natural functions such as set-disjointness and inner product. All of our upper bounds conform to these restrictions.
This is joint work with J. Edmonds, F. Ellen and T. Pitassi.
16 Sep 2011 at 10:15 in room 1537
How Unique and Traceable are Usernames?
(Daniele Perito, INRIA, France)
Usernames are ubiquitously used for identification and authentication purposes on web services and the Internet at large, ranging from the local-part of email addresses to identifiers in social networks. Usernames are generally alphanumerical strings chosen by the users and, by design, are unique within the scope of a single organization or web service. In this paper we investigate the feasibility of using usernames to trace or link multiple profiles across services that belong to the same individual. The intuition is that the probability that two usernames refer to the same physical person strongly depends on the “entropy” of the username string itself. Our experiments, based on usernames gathered from real web services, show that a significant portion of the users’ profiles can be linked using their usernames. In collecting the data needed for our study, we also show that users tend to choose a small number of related usernames and use them across many services. To the best of our knowledge, this is the first time that usernames are considered as a source of information when profiling users on the Internet.
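The "entropy" intuition can be sketched with a character-level unigram model (a toy sketch with made-up frequencies; the authors' actual estimator may differ): rare, long usernames carry more bits of self-information, so two sightings of one are more likely the same person.

```python
import math

# Hypothetical rough English letter frequencies (illustrative values).
FREQ = {'a': .082, 'b': .015, 'c': .028, 'd': .043, 'e': .127,
        'g': .020, 'i': .070, 'j': .002, 'n': .067, 'o': .075,
        'r': .060, 's': .063, 't': .091, 'x': .001, 'z': .001}

def entropy_bits(username, default=0.01):
    """Self-information of the string under the unigram model."""
    return sum(-math.log2(FREQ.get(c, default))
               for c in username.lower())

for name in ["bob", "zxjixaz42"]:
    print(f"{name:>10}: {entropy_bits(name):5.1f} bits")
```

A common short username like "bob" scores few bits (many people could plausibly pick it), while an idiosyncratic string scores far more, making cross-service linkage of the latter much more reliable.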
12 Sep 2011 at 13:15 in room 1537
Algorithmic analysis and complexity lower bounds
(Rahul Santhanam, University of Edinburgh)
I will discuss some connections between analysis of algorithms for satisfiability and complexity lower bound techniques. First I will present an algorithm for Formula-SAT which runs in time 2^{n-\Omega(n)} on formulae of linear size. The analysis of the algorithm relies on the method of random restrictions, originally used to prove formula size lower bounds. Then I will sketch a connection between the Neciporuk lower bound technique in concrete complexity and the algorithmic technique of memoization. If time permits, I will show that the published analyses of some well-known algorithms for satisfiability are tight, and pose some questions about further connections in this vein.
05 Sep 2011 at 13:15 in room 1537
Hardness amplification for polynomial threshold proof systems
(Trinh Huynh, University of Washington)
Using known results from multiparty number-on-forehead communication complexity, we exhibit a simple hardness amplification method that converts any t-CNF formula of large rank complexity in Resolution, for constant t, into a new related formula requiring large rank and tree-like size complexity in much stronger proof systems, namely polynomial threshold proof systems, which consist of all systems for proving propositional tautologies by manipulating degree-k polynomial inequalities (collectively denoted as Th(k) proof systems). This method works for any k<loglog n, where n is the size of the formula. These include Gomory-Chvatal cutting-planes (CP) (as a special case of Th(1)) and Lovasz-Schrijver systems (as special cases of Th(2)). This method thus yields new rank and tree-like size lower bounds for a large number of natural formulas in these systems, whereas previously, lower bounds for only a handful of problems were known.
For every even constant t>5, we also obtain strong integrality gap results for the MAX-t-SAT problem for Th(1) systems, which include CP. We also show that Th(k) systems are strictly stronger than Th(k-1) systems, for every k<loglog n, in terms of rank and tree-like size.
This is joint work with Paul Beame and Toniann Pitassi.
22 Aug 2011 at 15:30 in room 1537
Monotonicity testing and shortest-path routing on the cube
(Arie Matsliah, IBM Haifa Research Laboratory and Technion - Israel Institute of Technology)
How many edges does one need to remove from a directed n dimensional cube to disconnect t disjoint source-sink vertex pairs? We show a construction where removing t/\sqrt{n} edges suffices. This answers a decade-old question of Lehman and Ron related to monotonicity testing, and gives a stronger counterexample to Szymanski's conjecture on hypercube routing.
The presentation is self-contained and elementary.
Based on a joint work with Jop Briet, Sourav Chakraborty and David Garcia-Soriano.
20 June 2011 13:15 room 1537
Internet Privacy: Who gathers data and how, and what can be done about it
(Balachander Krishnamurthy, AT&T research)
For the last few years we have been examining the leakage of privacy on the Internet: how information related to individual users is aggregated as they browse seemingly unrelated Web sites. Thousands of Web sites across numerous categories, countries, and languages were studied to generate a privacy "footprint". Our results show increasing aggregation of user-related data by a steadily decreasing number of entities. I will present results from three studies on leakage of personally identifiable information (PII) via Online Social Networks (both traditional and mobile OSNs) and popular non-OSN sites. I will discuss various options on what can be done about this serious problem.
30 May 2011 13:15 room 1537
The complexity of conservative valued CSPs
(Stanislav Živný, University College, Oxford, UK)
We study the complexity of valued constraint satisfaction problems (VCSP). A problem from VCSP is characterised by a constraint language, a fixed set of cost functions over a finite domain. An instance of the problem is specified by a sum of cost functions from the language and the goal is to minimise the sum. Under the unique games conjecture, the approximability of finite-valued VCSPs is well-understood (Raghavendra [FOCS’08]). However, there is no characterisation of the finite-valued VCSPs, let alone the general-valued VCSPs, that can be solved exactly in polynomial time; such a characterisation would give insights from a combinatorial optimisation perspective.
In this talk, I'll show a Schaefer-like dichotomy theorem for general-valued constraint languages including all unary cost functions over arbitrary domains (such languages are called conservative). This work generalises several previous results, including an algorithm from Cohen et al. [TCS'08], and dichotomies for Boolean languages (Cohen et al. [AIJ'06]), Min-Cost Hom (Takhanov [STACS'10]), and Max-CSPs (Deineko et al. [JACM'08]). Compared to Deineko et al. [JACM'08], our results do not involve any computer-assisted search.
Joint work with V. Kolmogorov.
16 May 2011 10:15 room 1537
Short Propositional Refutations for Dense Random 3CNF Formulas
(Iddo Tzameret, Tsinghua University, Beijing, China)
This is a talk about propositional proofs and satisfiability. First, I am going to give a short high level introduction to proof complexity. Then, I will show that almost all 3CNF formulas with high enough clause-to-variable ratio have short propositional refutations, already in a fairly weak propositional proof system.
No prior knowledge in proof complexity is assumed.
Joint work with Sebastian Muller. (For more details see: http://iiis.tsinghua.edu.cn/~tzameret/3CNF-tcz.pdf)
The talk is intended to last for about 60 minutes.
02 May 2011 13:15 room 1537
Balanced Partitions of Trees
(Andreas Feldmann)
We study the k-BALANCED PARTITIONING problem in which the vertices of a graph are to be partitioned into k sets of size at most \ceiling{n/k} each while minimising the cut-size, which is the number of edges connecting vertices in different sets. The problem on general graphs is well studied, while no results are known for restricted graph classes.
We initiate the research on restricted graph classes by focusing on trees. We show that approximating the cut-size is APX-hard even if the maximum degree of the tree is bounded by a constant. If instead the diameter of the tree is bounded by a constant, we show that it is NP-hard to approximate the cut-size within n^c, for any constant c < 1. On the positive side we show that if the constraint on the balance of the sets is relaxed such that at most (1 + \epsilon) n/k vertices are allowed in any of the k sets, then (for constant \epsilon > 0) there is a polynomial time approximation scheme that computes a partition with a cut-size of at most that of an optimal balanced solution.
28 Apr 2011 13:15 room 1537
Approximating Graphic TSP by Matchings
(Tobias Moemke, KTH CSC)
We present a framework for approximating the metric TSP based on a novel use of matchings. Traditionally, matchings have been used to add edges in order to make a given graph Eulerian, whereas our approach also allows for the removal of certain edges leading to a decreased cost.
For the TSP on graphic metrics (graph-TSP), the approach yields a 1.461-approximation algorithm with respect to the Held-Karp lower bound. For graph-TSP restricted to a class of graphs that contains degree three bounded and claw-free graphs, we show that the integrality gap of the Held-Karp relaxation matches the conjectured ratio 4/3. The framework allows for generalizations in a natural way and also leads to a 1.586-approximation algorithm for the traveling salesman path problem on graphic metrics where the start and end vertices are prespecified.
The talk will be approximately 2*45 min and somewhat technical.
04 Apr 2011 15:15 room 1537
Neutrality Based Symmetric Cryptanalysis
(Shahram Khazaei, KTH CSC)
Cryptographic primitives are the basic components of any cryptographic tool. Block ciphers, stream ciphers and hash functions are the fundamental primitives of symmetric cryptography. In symmetric cryptography, the communicating parties perform essentially the same operation and use the same key, if any.
This presentation concerns cryptanalysis of stream ciphers and hash functions based on my findings during my Ph.D. studies at the Swiss Federal Institute of Technology, Lausanne, Switzerland (EPFL). The main contribution of this work is introducing the concept of probabilistic neutrality for the arguments of a function, a generalization of the definition of neutrality. This concept finds application in cryptanalysis. An input argument of a given function is called neutral if it does not affect the output of the function. This simple idea has already been implicitly used in key recovery cryptanalysis of block ciphers and stream ciphers. In 2004, Biham and Chen explicitly used the idea of neutrality to speed up collision finding algorithms for hash functions. We call an input argument of a function probabilistically neutral if it does not have a "significant" influence on the output of the function. Simply stated, it means that if the input argument is changed, the output of the function stays the same with a probability "close" to one. We will exploit the idea of probabilistic neutrality to assess the security of several stream ciphers and hash functions. To the best of our knowledge, this is the first time that probabilistic neutrality has found such diverse applications in cryptanalysis.
11 Mar 2011 13:15 room 4523
Behavioral specifications of object-oriented components: How do trace-based and model-based techniques compare?
(Arnd Poetzsch-Heffter, University of Kaiserslautern)
The literature distinguishes between trace-based and state-based specification techniques for object-oriented components. Trace-based techniques describe behavior in terms of the message histories of components. State-based specifications explain component behavior by defining how the state is changed by method calls and what the returned results are. The state space can either be abstract or concrete. Abstract states are used to model the behavior without referring to the implementation. Concrete states are expressed by the underlying implementation. State-based specifications are usually described in terms of pre- and postconditions of methods.
In this talk, we investigate the relationship between trace-based specifications and specifications based on abstract states for sequential, object-based components. We present a technique for specifying interaction patterns of components and show that the technique allows to formulate both trace-based and state-based specifications. In particular, we illustrate how callbacks can be handled. We define the semantics of specifications in terms of transition systems and discuss how different specifications can be compared.
07 Feb 2011 13:15 room 1537
Taming Code Explosion in Supercompilation
(Peter Jonsson, Luleå University of Technology)
Clear and concise programs written in functional programming languages often suffer from poor performance compared to their counterparts written in imperative languages such as Fortran or C. Supercompilation is a program transformation that can mitigate many of these problems by removing intermediate structures and performing program specialization.
Unfortunately, supercompilation sometimes also suffers from the problem of code explosion. This results in huge binaries which might hurt performance on a modern processor. We present a revised supercompilation algorithm that is fast enough to speculatively supercompile expressions and discard the result if it turns out badly. This allows us to supercompile large parts of the imaginary and spectral parts of nofib in a matter of seconds while keeping the binary size increase below 5%.
04 Feb 2011 10:15 room 1537
Resident Evil: After-Life Vulnerabilities in Firefox. A Study on Firefox Evolution, its Vulnerabilities, and its Fixes
(Fabio Massacci, University of Trento)
I will discuss the interplay between the evolution of Firefox source code and its vulnerabilities over six major versions (v1.0, v1.5, v2.0, v3.0, v3.5, and v3.6) spanning almost ten years of development, integrating a number of sources (NVD, CVE, MFSA, Firefox CVS).
Somewhat surprisingly, we found that a large fraction of today's vulnerabilities apply to code from older versions that is no longer maintained. We call these after-life vulnerabilities. This partly contradicts and partly confirms the Milk-or-Wine study of Ozment and Schechter: we did not find enough evidence that most vulnerabilities are foundational, though they are still more numerous than they should be.
The surprise is spelled out by digging into a new metric which we call the LOC market share (as opposed to the software or version market share), with which we are able to show that old code is still very much in use, both in terms of instances and in the global codebase: versions might be replaced in the span of 6 months, but we actually use the same code as 10 years ago.
This is empirical evidence that the software-evolution-as-security solution (patching software and automatic updates) might not work, and that vulnerabilities will have to be mitigated by other means.
Joint Work with S. Neuhaus and V. H. Nguyen.
31 Jan 2011 10:15 room D4448 (D-building, next to D42)
From Trusted Systems to the Smart Grid
(Klaus Kursawe, Radboud University Nijmegen)
In this presentation, I will cover several layers of trust implementation, from essential building blocks to architecture questions. We will start with mechanisms used to implement secure hardware building blocks, then move up to basic security services such as key establishment protocols for restricted devices, and end with distributed security architectures. The presentation will finish with the latest activities in smart-grid security, with a special emphasis on privacy protection in this context.
27 Jan 2011 10:15 room 4523
Securing Mobile Unattended WSNs against a Mobile Adversary
(Roberto di Pietro, Department of Mathematics, Roma Tre University)
Wireless Sensor Networks (WSNs) security is complicated by the lack of inexpensive tamper resistant hardware in commodity sensors. Indeed, once an adversary compromises a sensor, all memory and forms of storage become exposed, along with all secrets. Thereafter, any cryptographic remedy ceases to be effective. Regaining sensor security after compromise (i.e., intrusion-resilience) is a formidable challenge.
Prior approaches that rely on either (1) the presence of an on-line trusted third party (sink), or (2) the availability of a True Random Number Generator (TRNG) on each sensor, cannot be adopted in large-scale Unattended Wireless Sensor Networks (UWSNs), composed of low-cost commodity sensors and characterized by the intermittent presence of a sink.
In this talk, we explore intrusion resilience in Mobile UWSNs in the presence of a powerful mobile adversary. We show how the choice of the sensor mobility model influences intrusion resilience with respect to this adversary. We also explore self-healing protocols that require only local communication. Results indicate that sensor density and neighborhood variability are the two key parameters affecting intrusion resilience. Findings are supported by extensive analyses and simulations.
14 Jan 2011 10:15 room 4523
Security by degrees: the process of security risk and trust management
(Simon Foley, Department of Computer Science, University College Cork, Ireland)
A challenge to securing open systems is the process of assuring robustness to failure due to threats that exploit vulnerabilities in design, implementation and deployment. The provision of this assurance assumes that: requirements are understood; threats are properly identified, and the right security controls are available to mitigate the threats. Notwithstanding the challenge of verifying security of a complex system, what happens in practice is that requirements are misunderstood, threats are misidentified and security controls selection is limited. This talk discusses insights to these problems and offers some solutions based on my research in trust management and security configuration management.
12 Jan 2011 15:15 room 1537
Femtocells - A femtostep to the holy grail
(Prof. Jean-Pierre Seifert, Deutsche Telekom Laboratories, TU Berlin)
Mobile network operators are adopting femtocells in order to simplify their network architecture for increased coverage, performance, and greater revenue opportunities. While emerging as a new low-cost technology which assures the best connectivity, femtocells have also introduced a range of new potential security risks for the mobile network operators. We analyze these security issues and demonstrate the weaknesses of femtocell security. First, we describe and demonstrate weaknesses in the location verification techniques that create problems for various important services such as lawful interception services, emergency call services, and the operator's business. Next, we outline several security flaws that allow attackers to gain root access or install malicious applications on the femtocell, and to remain undetected even when it has connected to the operator's network. Furthermore, we experimentally evaluate and demonstrate a wide range of possible threats to femtocells, including compromise of femtocell credentials; physical, configuration, and protocol attacks; attacks on the core network; and user data and identity privacy attacks. Experimental results suggest that location security methods are insufficient to avoid femtocell misuse. In addition, the vulnerabilities we found suggest that commercially available femtocells fail to fulfill 3GPP security requirements and could expose operator network elements to the attacker. Our findings and successful attacks exhibit the need for further research to bridge the gap between theoretical and practical security of femtocell devices.
13 Dec 2010 13:15 room 4523
Network coding and guessing games
(Klas Markström, Umeå university)
In a network coding problem there are multiple senders who each want to send a message to some receiver in a network. In traditional information transfer protocols, bottlenecks may appear when several different messages which need to pass through a given router arrive there at the same time. In network coding this problem is solved by letting the routers transform the messages, in such a way that the receivers can compute their intended message from the set of arriving messages. Soren Riis found an equivalent formulation for the problem of finding optimal network coding protocols in terms of a guessing game on a graph, and this also turned out to be equivalent to a problem in circuit complexity. In this talk I will discuss some results by Demetres Christofides and myself which determine the optimal solutions to these games and, using linear programming and entropy, also present a conjecture on the optimal values for all undirected graphs.
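The simplest instance of the guessing game, on the complete graph K_n with s-sided dice, can be verified by brute force: each player sees everyone else's value, must guess its own, and all must be right simultaneously. The classic coordinated strategy has every player assume the grand total is 0 mod s.

```python
from itertools import product

def win_probability(n, s):
    """Exhaustively check the 'total is 0 mod s' strategy on K_n."""
    wins = 0
    for values in product(range(s), repeat=n):
        total = sum(values)
        # Player i sees total - values[i] and guesses whatever value
        # would make the grand total 0 mod s.
        guesses = [(-(total - v)) % s for v in values]
        wins += all(g == v for g, v in zip(guesses, values))
    return wins / s ** n

# Coordinated strategy wins with probability 1/s, versus s^{-n}
# for independent uniform guessing:
print(win_probability(3, 6))   # 0.1666... = 1/6
```

All players guess correctly exactly when the total really is 0 mod s, so the win probability is 1/s regardless of n; on general graphs, where players see only their neighbours, determining the optimal value is the hard problem the talk addresses.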
29 Nov 2010 13:15 room 1537
On the Semantics of Local Characterizations for Linear-Invariant Properties
(Jakob Nordström, KTH CSC)
A property of functions on a vector space is said to be linear-invariant if it is closed under linear transformations of the domain. Linear-invariant properties are some of the most well-studied properties in the field of property testing. Testable linear-invariant properties can always be characterized by so-called local constraints, and of late there has been a rapidly developing body of research investigating the testability of linear-invariant properties in terms of their descriptions using such local constraints. One problematic aspect that has been largely ignored in this line of research, however, is that syntactically distinct local characterizations need not at all correspond to semantically distinct properties. In fact, there are known fairly dramatic examples where seemingly infinite families of properties collapse into a small finite set that was already well-understood.
In this work, we therefore initiate a systematic study of the semantics of local characterizations of linear-invariant properties. For such properties the local characterizations have an especially nice structure in terms of forbidden patterns on linearly dependent sets of vectors, which can be encoded formally as matroid constraints. We develop techniques for determining, given two such matroid constraints, whether these constraints encode identical or distinct properties, and show for a fairly broad class of properties that these techniques provide necessary and sufficient conditions for deciding between the two cases. We use these tools to show that recent (syntactic) testability results indeed provide an infinite number of infinite strict hierarchies of (semantically) distinct testable locally characterized linear-invariant properties.
Joint work with Arnab Bhattacharyya, Elena Grigorescu, and Ning Xie
15 Nov 2010 13:15 room 4523
Structural Properties of Hard Problem Instances
(Tobias Mömke, KTH CSC)
Most of the hardness results for NP-hard problems are derived for worst-case scenarios and in many cases it is not clear whether the actual problem instances arising in practical applications exhibit this worst-case behavior. A recent branch of algorithmic research aims at a more fine-grained analysis of the hardness of optimization problems. The main idea behind this analysis is to find some parameter according to which one can classify the hardness of problem instances. In this spirit, we characterize instances for the metric TSP according to the solution computed by Hoogeveen's 5/3-approximation algorithm for the problem to find a Hamiltonian path with prespecified ends in the same metric graph. Our analysis reveals that the sets of the hardest instances of both problems for Christofides' and Hoogeveen's algorithm are disjoint in the sense that any instance is guaranteed to allow at least one of the two algorithms to achieve a significantly improved approximation ratio. In particular, any input instance that leads to a $5/3$-approximation with Hoogeveen's algorithm enables us to find an optimal solution for the traveling salesman problem.
Oct 20 2010 13:15 room 1537
A counter-example guided abstraction refinement scheme for parameterized verification.
(Ahmed Rezine, Uppsala University)
I will introduce ''monotonic abstraction'' as an approach to verify systems with an arbitrary number of concurrent and communicating processes, i.e., parameterized systems. Monotonic abstraction is particularly successful in automatic verification of safety properties for parameterized systems. The main drawback is that it sometimes generates spurious counter-examples.
I will describe a counterexample-guided abstraction refinement (CEGAR) framework for monotonic abstraction. The CEGAR algorithm automatically extracts from each spurious counterexample a set of configurations called a ''Safety Zone'' and uses it to refine the abstract transition system of the next iteration. This approach gave encouraging results and allowed the verification of several parameterized systems.
Oct 18 2010 13:15 room 1537
On the Relative Strength of Pebbling and Resolution
(Jakob Nordström, KTH CSC)
In the early 70s, combinatorial pebble games played on directed acyclic graphs were introduced as a way of studying programming languages and compiler construction. These games found a broad range of applications in computational complexity theory and were extensively studied in the 70s and 80s.
Somewhat surprisingly, the last decade has seen a revival of interest in pebble games in the context of proof complexity. In particular, pebbling has proven a very useful tool for understanding time-space trade-offs. Very roughly, what one can do is to encode instances of the pebble game as propositional logic formulas, and then argue that these formulas (almost) inherit the pebbling properties of the underlying graphs.
The crux of the matter here is what "almost" means. Graphs and formulas are very different objects, and the reductions we have between the two are far from tight. In this work, we introduce a new flavour of pebbling that gives better reductions than were previously known. We also construct graph families for which the gap in the current reductions does not matter. There are still a number of problems regarding space complexity and time-space trade-offs that, although simple to state, remain wide open, however, and time permitting we will discuss some of these problems.
This talk is intended to be roughly 45 minutes long and to be completely self-contained. In particular, no prerequisites in proof complexity or pebbling are required. The talk is based on a paper that appeared in the Conference on Computational Complexity 2010.
Sept 27 2010 13:15 room 1537
Virtualization and Security
(Christian Gehrmann, Swedish Institute of Computer Science)
Part I: Background to Virtualization and Security
Virtualization is becoming increasingly popular. In this talk we give a brief overview of the technologies behind virtualization. In particular we discuss the security challenges and opportunities in virtualized systems.
Part II: Hypervisors for security in embedded systems
Security threats on consumer devices such as mobile phones are increasing as the software platforms become more open and complex. Therefore, hypervisors, which bring potential new secure services to embedded systems, are becoming increasingly important. We look into how to design hypervisor-based security architecture for an advanced mobile phone and compare with alternative approaches such as the ARM TrustZone technology.
Sept 23 2010 13:15 room 1537
Non-deterministic Matrices and their Applications
(Anna Zamansky, Tel-Aviv University)
One of the main principles of classical and many-valued logic is truth-functionality: the truth-value assigned to a complex formula is uniquely determined by the truth-values of its subformulas. In commonsense reasoning, however, an agent often needs to deal with inherently non-deterministic phenomena: partially unknown information, faulty devices and ambiguity of natural language are only a few cases in point. It is clear that non-determinism, the very essence of which is contradictory to the principle of truth-functionality, cannot be captured by classical or many-valued logics. One possible solution is to borrow the idea of non-deterministic computations from automata and computability theory and to apply it to evaluations of formulas. This leads to the introduction of Non-deterministic Matrices (Nmatrices), which are a generalization of standard many-valued matrices, in which the truth-value of a complex formula is chosen non-deterministically out of a certain set of options. Although applicable to a much wider family of logics, finite Nmatrices have all the advantages of ordinary finite-valued semantics. In fact, there are many useful (propositional) logics that have no finite ordinary semantics, but do have finite non-deterministic semantics, and are thus decidable.
In this talk we survey the theory and a number of applications of Nmatrices in different areas. One such application is in the area of paraconsistent logics, which are logics that allow non-trivial inconsistent theories and are useful for reasoning with inconsistent information. Nmatrices can be used to provide simple and modular semantics for a large family of paraconsistent logics known as Logics of Formal Inconsistency. Another application of Nmatrices is in proof theory: there is a remarkable correspondence between two-valued Nmatrices and important syntactic properties, such as cut-elimination, invertibility of rules, etc. in a natural class of Gentzen-type systems called Canonical Calculi.
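A minimal sketch of a two-valued Nmatrix (my own toy connective, not one from the talk): the truth table maps each pair of inputs to a *set* of admissible outputs, so evaluation yields the set of values a formula can take.

```python
# A non-deterministic "or" over {0, 1}: deterministic everywhere
# except on input (1, 1), where either output is allowed.
ND_OR = {
    (0, 0): {0},
    (0, 1): {1},
    (1, 0): {1},
    (1, 1): {0, 1},   # the non-deterministic entry
}

def possible_values(left_vals, right_vals, table):
    """All values the connective can take, given the sets of
    possible values of its subformulas."""
    return {v for a in left_vals for b in right_vals
            for v in table[(a, b)]}

# Truth-functionality fails: with p = 1, the formula (p or p) can
# evaluate to either value, since the choice is made per occurrence.
print(possible_values({1}, {1}, ND_OR))   # {0, 1}
print(possible_values({0}, {1}, ND_OR))   # {1}
```

Because the table is still finite, decidability arguments that work for ordinary finite-valued matrices carry over, which is exactly the advantage of finite Nmatrices mentioned above.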
May 27 2010 13:15 room 1537
Hardness of 3Lin over the reals
(Subhash Khot, New York University)
TBA
May 19 2010 10:00 4423
Cryptographic Hash Functions: Theory and Practice
(Bart Preneel, Katholieke Universiteit Leuven)
The first designs of cryptographic hash functions date back to the late 1970s. In the early 1990s, MD5 and SHA-1 were deployed in an ever increasing number of applications; as a consequence, hash functions became the "Swiss army knives" of cryptography. In spite of the importance of hash functions, only limited effort was spent on studying their formal definitions and foundations. In 2004 Wang et al. perfected differential cryptanalysis to a point that finding collisions for MD5 became very easy; for SHA-1 a substantial reduction of the security margin was obtained. This breakthrough has resulted in a flurry of research, resulting in new constructions and a growing body of foundational research. NIST announced in November 2007 that it would organize the SHA-3 competition, with the goal of selecting a new hash function family by 2012. Of the 64 candidates submitted by October 2008, 14 have made it to the second round. This talk presents an overview of the state of hash functions 30 years after their introduction; it also discusses the progress of the SHA-3 competition.
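The Wang et al. attacks are differential and specific to MD5's internal structure, but the generic birthday bound already explains why digest length is critical. The following toy sketch (an illustration of the birthday bound, not any attack from the talk) finds a collision in an artificially truncated MD5:

```python
import hashlib
from itertools import count

def truncated_md5(data: bytes, nbytes: int = 3) -> bytes:
    """First nbytes of the MD5 digest -- a deliberately weak toy hash."""
    return hashlib.md5(data).digest()[:nbytes]

def birthday_collision(nbytes: int = 3):
    """Find two distinct inputs with the same truncated digest.
    Expected work is roughly 2**(4*nbytes) evaluations (birthday bound),
    far below the 2**(8*nbytes) of brute-force preimage search."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_md5(msg, nbytes)
        if h in seen:
            return seen[h], msg
        seen[h] = msg

m1, m2 = birthday_collision()
assert m1 != m2 and truncated_md5(m1) == truncated_md5(m2)
```

For a 24-bit digest the loop terminates after a few thousand hash evaluations; real attacks like Wang et al.'s beat this generic bound by exploiting the compression function itself.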
May 12 2010 11:00 room 4423
Understanding Space in Proof Complexity: Separations and Trade-offs via Substitutions
(Jakob Nordström, MIT CSAIL)
In recent years, deciding if a CNF formula is satisfiable has gone from a theoretical question to a practical approach for solving real-world problems. For current state-of-the-art satisfiability algorithms, typically based on resolution and clause learning, the two main bottlenecks are the amounts of time and memory used. Understanding time and memory consumption of SAT-solvers, and how these resources are related to one another, is therefore a question of considerable interest.
Roughly a decade ago, it was asked whether proof complexity had anything intelligent to say about this question, corresponding to the interplay between size and space of proofs. In this talk, I will explain how this question can be answered almost completely by combining two tools, namely good old pebble games on graphs, studied extensively in the 70s and 80s, and a new, somewhat surprising, theorem showing how the hardness of CNF formulas can be amplified simply by making variable substitutions.
This talk is based partly on my PhD thesis, presented with the Ackermann Award 2009, and partly on some subsequent papers. Most of it is joint work with Eli Ben-Sasson. The talk will be self-contained, so no background is needed.
April 26 2010 13:15 room 1537
CFlow: A security-preserving cryptography-implicit compiler for distributed programs
(Gurvan Le Guernic, KTH CSC)
In language-based security, confidentiality and integrity policies conveniently specify the permitted flows of information between different parts of a program with diverse levels of trust. These policies enable a simple treatment of security, and they can often be verified by typing. However, their enforcement in concrete systems involves delicate compilation issues. We consider cryptographic enforcement mechanisms for imperative programs with untrusted components. Such programs may represent, for instance, distributed systems connected by some untrusted network. In source programs, security depends on an abstract information-flow policy for accessing the shared memory. In their implementations, shared memory is unprotected and security depends instead on encryption and signing. We build a translation from well-typed source programs and policies to cryptographic implementations. To establish its correctness, we develop a cryptographic type system for a target probabilistic language. Our typing rules enforce the correct usage of cryptographic primitives against active adversaries; from an information-flow viewpoint, they capture controlled forms of robust declassification and endorsement. We show type soundness for a variant of the non-interference property, then show that our translation preserves typability. We rely on concrete primitives and hypotheses for cryptography, stated in terms of probabilistic polynomial-time algorithms and games. We model these primitives as commands in our target language. Thus, we develop a uniform language-based model of security, ranging from computational non-interference for probabilistic programs down to standard cryptographic hypotheses.
Jan 25 2010 13:15 4523
An introduction to OWASP -- The Open Web Application Security Project
(John Wilander, Omegapoint / Chapter leader OWASP Sweden)
An introduction to the worldwide, non-profit security community OWASP. Through our constantly growing wiki, open projects, free books, and conferences, OWASP works for better security in application development. We are best known for our top-ten list of the biggest security risks in web applications (the OWASP Top 10), but we have many other strings to our bow.
Dec 03 2009 10:15 1537
Introduction to Supercompilation
(Peter Jonsson, Luleå)
High levels of abstraction, the possibility to reason about software components in isolation, and the ability to compose different components together are crucial to improve productivity in software development. A pure functional language gives the ability to reason equationally about programs, along with features such as higher-order functions that aid programmer productivity. While these mechanisms increase productivity they also come with a problem known as the 'abstraction penalty': as the programmer increases the abstraction level to deal with system complexity, the performance of the resulting program decreases.
I will show, through examples, how the abstraction penalty can be automatically removed through supercompilation. Anyone who has taken a basic course in functional programming should be able to follow my examples. I will briefly survey previous work and identify some open problems.
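The abstraction penalty can be illustrated with a tiny example. A supercompiler derives transformations like the following automatically; here the before and after are written out by hand in Python, purely as a sketch of the effect:

```python
# A hand-written illustration of the kind of fusion a supercompiler
# performs automatically: composing two list traversals into one.

def square(x):
    return x * x

def inc(x):
    return x + 1

def composed(xs):
    """High-level, abstract style: two passes and an intermediate list."""
    return [inc(y) for y in [square(x) for x in xs]]

def fused(xs):
    """What supercompilation can derive: one pass, no intermediate list."""
    return [inc(square(x)) for x in xs]

assert composed(range(10)) == fused(range(10))
```

The two functions compute the same result, but the fused version allocates no intermediate list; removing such overhead mechanically, rather than by hand, is the point of supercompilation.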
Dec 01 2009 13:15 1537
A Refined State Monad, with applications to capability-based access control
(Johannes Borgström)
Behavioural type and effect systems regulate properties such as adherence to object and communication protocols, dynamic security policies, avoidance of race conditions, and many others. Typically, each system is based on some specific syntax of constraints, and is checked with an ad hoc solver.
Instead, we advocate types refined with first-order logic formulas as a basis for behavioural type systems, and general purpose automated theorem provers as an effective means of checking programs.
To illustrate this approach, we give type systems for two related notions of permission-based access control: stack inspection and history-based access control. These type systems are both instances of a refined state monad.
Our main technical result is a safety theorem stating that no assertions fail when running a well-typed program.
Oct 29 2009 09:00 MDI-Torget
Evolving Contracts
(Gerardo Schneider, Chalmers)
Any formalism to describe contracts must be able to capture evolvability over time, and also to correlate such evolutions to changes in the environment or in the behavior of the parties involved in contracts. Yet, few works have focused on the general problem of verifying evolvable contracts.
In this talk I will present ongoing work on the definition of an abstract theory of dynamic contracts, including some preliminary results concerning verification of static and dynamic contracts. Starting from a very general view of contracts as syntactic entities that characterize sets of traces, I show how to accommodate two essential ingredients of dynamic contracts: spillover, which characterizes the remains of a clause when it is withdrawn from a contract, and power, which characterizes when a principal is entitled to perform a change in a contract. Although the technical development is carried out in an abstract setting, I will illustrate our definitions and results using contract languages for rights and obligations; these languages, despite their simplicity, share many essential features with other formalisms for digital rights management and access control, and are therefore representative of the potential interest of our approach.
(Joint work with Gilles Barthe and Gordon Pace)
Tuesday August 25, 14.00, room 1537:
PeerSoN: Privacy-Preserving P2P Online Social Networks
(Sonja Buchegger, Deutsche Telekom Laboratories)
Online Social Networks like Facebook, MySpace, Xing, etc. have become extremely popular. Yet they have some limitations that we want to overcome for a next generation of social networks: privacy concerns and requirements of Internet connectivity, both of which are due to web-based applications on a central site whose owner has access to all data.
To overcome these limitations, we envision a paradigm shift from client-server to a peer-to-peer infrastructure coupled with encryption so that users keep control of their data and can use the social network also locally, without Internet access. This shift gives rise to many research questions intersecting networking, security, distributed systems and social network analysis, leading to a better understanding of how technology can support social interactions.
Our project consists of several parts. One part is to build a peer-to-peer infrastructure that supports the most important features of online social networks in a distributed way. We have written a first prototype to test our ideas. Another part is concerned with encryption, key management, and access control in such a distributed setting. Extending the distributed nature of the system, we investigate how to integrate such peer-to-peer social networking with ubiquitous computing and delay-tolerant networks, to enable direct exchange of information between devices and to take into account local information.
Monday May 25, 13.15, room 1537 (Lindstedtsvägen 5, floor 5):
Three holy grails of programming models
(Joachim Parrow, Uppsala Universitet)
I shall discuss three important paradigms for formulating models of programming languages, the technical problems involved in unifying them, and how this connects to recent joint work with Jesper Bengtson, Magnus Johansson and Björn Victor, to appear at LICS '09. This will also serve to put the pi-calculus in perspective by explaining its underlying motivations, real achievements, and limitations. The intended audience should have a reasonable grasp of programming but need not be familiar with any particular formal models.
Wednesday May 27, 10.15, room 4423 (Lindstedtsvägen 5, floor 4):
Game Theory with Costly Computation
(Rafael Pass, Cornell University)
We develop a general game-theoretic framework for reasoning about strategic agents performing possibly costly computation. In this framework, many traditional game-theoretic results (such as the existence of a Nash equilibrium) no longer hold. Nevertheless, we can use the framework to provide psychologically appealing explanations to observed behavior in well-studied games (such as finitely repeated prisoner's dilemma and rock-paper-scissors). Furthermore, we provide natural conditions on games sufficient to guarantee that equilibria exist.
As an application of this framework, we consider a notion of game-theoretic implementation of mediators in computational games. We show that a special case of this notion is equivalent to a variant of the traditional cryptographic definition of protocol security; this result shows that, when taking computation into account, the two approaches used for dealing with deviating players in two different communities---Nash equilibrium in game theory, and zero-knowledge simulation in cryptography---are intimately connected.
Joint work with Joe Halpern.
Wednesday May 27, 13.15, room 1537 (Lindstedtsvägen 5, floor 5):
Quantitative Social Choice Theory
(Elchanan Mossel, UC Berkeley / Weizmann Institute of Science)
I will survey recent results giving quantitative versions of theorems in economics regarding social choice (voting) functions. The focus of the talk will be a quantitative proof of Arrow's Impossibility Theorem. The proof is based on new combinatorial arguments coupled with use of an inverse hyper-contractive estimate by Borell and non-linear invariance principles.
Friday December 12, 13.15, room 1537:
Reducing Behavioural Properties to Structural Properties of Programs with Procedures
(Dilian Gurov, Theory Group, KTH CSC)
(Joint work with Marieke Huisman, University of Twente)
There is an intimate link between program structure and behaviour. Exploiting this link to phrase program correctness problems in terms of the structural properties of a program graph rather than in terms of its unfoldings is a useful strategy for making analyses more tractable. This talk presents a characterisation of behavioural program properties through sets of structural properties by means of a translation. The characterisation is given in the context of a program model based on control flow graphs of sequential programs with procedures, and properties expressed in a fragment of the modal mu-calculus with boxes and greatest fixed-points only. The property translation is based on a tableau construction that conceptually amounts to symbolic execution of the behavioural formula, collecting structural constraints along the way. By keeping track of the subformulae that have been examined, recursion in the structural constraints can be identified and captured by fixed-point formulae. The tableau construction terminates, and the characterisation is exact, i.e., the translation is sound and complete. A prototype implementation has been developed. We discuss several applications of the characterisation, in particular compositional verification for behavioural properties, based on maximal models.
Friday November 28, 10.00 (NB! no academic quarter!), room 4523 (Lindstedtsvägen 5, floor 5):
Nearly spherical cubes
(Ryan O'Donnell, School of Computer Science, Carnegie Mellon University)
What is the least surface area of a shape that tiles d-dimensional space when shifted by all vectors in the integer lattice? A unit cube is such a shape, and has surface area 2d. On the other hand, any such shape must have volume 1 and hence surface area at least that of the volume-1 ball, namely about sqrt(2 pi e) sqrt(d). We nearly close the gap, using a randomized construction to show that there exists a tiler with surface area at most 4 pi sqrt(d). The problem was originally motivated by questions in computational complexity theory; our construction generalizes a discretized solution given by Raz in the complexity-theory setting.
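The three quantities in the abstract can be compared numerically (a quick illustrative calculation, not part of the talk): the cube's surface area 2d grows linearly in d, while both the ball lower bound and the construction's upper bound grow only like sqrt(d).

```python
import math

def cube_surface(d):
    """Surface area of the unit cube in dimension d."""
    return 2 * d

def ball_lower_bound(d):
    """Surface area of the volume-1 ball, approximately sqrt(2*pi*e*d)."""
    return math.sqrt(2 * math.pi * math.e * d)

def tiler_upper_bound(d):
    """Upper bound achieved by the randomized construction: 4*pi*sqrt(d)."""
    return 4 * math.pi * math.sqrt(d)

for d in (10, 100, 1000):
    print(d, cube_surface(d),
          round(ball_lower_bound(d), 1),
          round(tiler_upper_bound(d), 1))
```

Already at d = 100 the constructed tiler (about 126) beats the cube (200), while the ball bound (about 41) shows the remaining constant-factor gap.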
Tuesday October 7, 15.15, room 1537:
Security Policy Enforcement through Transactional Memory Introspection
(Úlfar Erlingsson, School of Computer Science, Reykjavík University)
Correct enforcement of authorization policies is a difficult task, especially for multi-threaded software. Even in carefully-reviewed code, unauthorized access may be possible in subtle corner cases. This talk introduces Transactional Memory Introspection (TMI), a novel reference monitor architecture that builds on Software Transactional Memory, a new, attractive alternative for writing correct multi-threaded software. TMI may be seen as an early language-based security result in a promising new area that is both well-suited to formalization and can also hold large practical benefits.
TMI facilitates correct security enforcement by simplifying how the reference monitor integrates with software functionality. In particular, TMI can help ensure complete mediation of security-relevant operations, eliminate race conditions related to security checks, and simplify handling of authorization failures. The talk will present the design, implementation, and initial formalization of TMI-based reference monitors. The talk also describes the results of our initial experiments, which confirm the value of the TMI architecture and that it incurs only acceptable runtime overhead.
Friday September 12, 10.15, room 1439:
A clearer picture of approximation resistance
(Per Austrin, Theory Group, KTH CSC)
A constraint satisfaction problem (CSP) is said to be approximation resistant if it is hard to approximate within a factor better than what is obtained by a random assignment. We talk about recent progress on characterizing approximation resistant CSPs.
Based on joint works with Johan Håstad and Elchanan Mossel.
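The canonical example of approximation resistance is Max-E3-Sat: a random assignment satisfies each k-literal clause with probability 1 - 2^(-k), and Håstad showed that achieving 7/8 + eps for E3-Sat is NP-hard, so nothing beats the trivial algorithm. A one-line check of the random-assignment ratio (illustrative only):

```python
def random_assignment_ratio(k):
    """Expected fraction of k-literal OR clauses satisfied by a uniformly
    random assignment: a clause fails only when all k literals are false,
    which happens with probability 2**-k."""
    return 1 - 2 ** -k

# For Max-E3-Sat the trivial guarantee is 7/8, which Håstad proved optimal.
print(random_assignment_ratio(3))
```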
Monday March 31, 13.00, room 1537:
A Transformation from the Fibonacci to the Galois
Non-Linear Feedback Shift Registers
(Professor Elena Dubrova, ECS/ICT/KTH)
Conventional Non-Linear Feedback Shift Registers (NLFSRs) use the Fibonacci configuration, in which the feedback is applied to the first bit only. In this talk, we show how to transform a Fibonacci NLFSR into an equivalent NLFSR in the Galois configuration, in which the feedback can potentially be applied to every bit. Such a transformation reduces the depth of the circuits implementing the feedback functions, thus decreasing the propagation time and increasing the throughput. The practical significance of the presented technique is that it makes it possible to increase (in some cases double) the keystream generation speed of any Fibonacci NLFSR-based stream cipher with no area penalty.
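The Fibonacci configuration can be sketched as follows (a hypothetical 4-bit register with a made-up feedback function, purely for illustration): one deep non-linear feedback function updates a single end of the register, while every other bit simply shifts. In the Galois configuration, by contrast, shallower feedback logic may be attached to every bit.

```python
# Toy Fibonacci-configuration NLFSR. The feedback function below is
# hypothetical, chosen only to illustrate the shift-and-feed-back shape.

def fibonacci_nlfsr_step(state):
    """One clock of a 4-bit Fibonacci NLFSR with feedback
    f(x0, x1, x2, x3) = x0 XOR x3 XOR (x1 AND x2).
    All bits shift left; only the last position receives feedback."""
    x0, x1, x2, x3 = state
    feedback = x0 ^ x3 ^ (x1 & x2)
    return (x1, x2, x3, feedback)

def keystream(state, n):
    """Collect the bit shifted out at each of n clock ticks."""
    out = []
    for _ in range(n):
        out.append(state[0])
        state = fibonacci_nlfsr_step(state)
    return out

print(keystream((1, 0, 0, 1), 8))
```

The single deep feedback function here is the throughput bottleneck the Galois transformation attacks: distributing equivalent logic across all bit positions shortens the critical path per clock.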
Wednesday January 8, 10.15, room 4523:
Structural Operational Semantics for Computational Effects
(John Power, Reader, Dept. of Computer Science, University of Bath)
(joint with Gordon Plotkin)
In seeking a unified study of computational effects, a fundamental task is to give a unified structural operational semantics, together with an adequate denotational semantics for it, in such a way that, for the leading examples of computational effects, the general definitions are consistent with the usual operational semantics for the relevant effects. One can readily produce a unified operational semantics that works fine for examples that include various forms of nondeterminism and probabilistic nondeterminism. But that simple semantics fails to yield a sensible result in the vitally important case of state or variants of state. The problem is that one must take serious account of coalgebraic structure. I shall not formally enunciate a general operational semantics and adequacy theorem in this talk, but I shall explain the category theory that supports such a semantics and theorem. I shall investigate, describe, and characterise a kind of tensor of a model and a comodel of a countable Lawvere theory, calculating it in leading examples, primarily involving state. Ultimately, this research supports a distinction between what one might call coalgebraic effects, such as state, and algebraic effects, such as nondeterminism.
WENNER-GREN FOUNDATIONS DISTINGUISHED LECTURE
Wednesday November 21, 15:00, room D3
Cryptography in Financial Transactions: Current Practice and Future Directions
(Professor Jacques Stern, Ecole Normale Supérieure and CNRS)
In this talk I will briefly describe the history of cryptography and explain how it became an area of scientific research, served by a strong community both in academia and in the industry.
Next I will cover two case studies.
The first is related to banking cards and payment terminals and the second to Internet banking. In both cases, I will show how cryptographic tools crafted within the research community in the past thirty years entered the picture and how cryptographers were able to provide stronger and stronger levels of security.
While applications are now stable in the first area, there is more to come in the second.
Monday October 15, 15:15, room 4523:
Optimal Bounds for Predecessor Search and the First Separation between Linear and Polynomial Space
(Mikkel Thorup, AT&T Labs Research)
(joint work with Mihai Patrascu from STOC'06 and SODA'07)
We develop a new technique for proving cell-probe lower bounds for static data structures. Previous lower bounds used a reduction to communication games, which was known not to be tight by counting arguments. We give the first lower bound for an explicit problem which breaks this communication complexity barrier. In addition, our bounds give the first separation between polynomial and near linear space. Such a separation is inherently impossible by communication complexity.
Using our lower bound technique and new upper bound constructions, we obtain tight bounds for searching predecessors among a static set of integers. We determine the optimal query time for any combination of space and word size w. In particular, we show that the classic van Emde Boas search time of O(log w) cannot be improved, even if we allow randomization. This is a separation from polynomial space, since Beame and Fich [STOC'99] give a predecessor search time of O(log w / log log w) using quadratic space.
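One way to see where O(log w) query times come from is the x-fast-trie-style trick of binary searching over bit-prefix lengths rather than over the keys themselves (an illustrative sketch of that idea, not the data structure from the talk):

```python
W = 16  # word size in bits

def prefix_set(keys):
    """Every bit-prefix of every key -- the core index of an x-fast trie.
    Space is O(n * W), which is why better structures work hard to
    compress it."""
    return {(i, x >> (W - i)) for x in keys for i in range(W + 1)}

def longest_match(prefixes, q):
    """Length of the longest stored prefix of q. Because the set of
    matching prefix lengths is downward closed, we can binary search
    over lengths: O(log W) membership probes, independent of n."""
    lo, hi = 0, W
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if (mid, q >> (W - mid)) in prefixes:
            lo = mid
        else:
            hi = mid - 1
    return lo

keys = [3, 9, 1000, 40000]
ps = prefix_set(keys)
assert longest_match(ps, 9) == W       # 9 itself is a key
assert longest_match(ps, 8) == W - 1   # 8 and 9 agree on all but the last bit
```

Turning the longest matching prefix into an actual predecessor answer requires extra machinery (per-node descendant pointers and a linked list of keys), which is omitted here; the point is only the log w probe count that the talk's lower bound shows cannot be beaten.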
Monday October 8, 13:15, room 1537:
Beating Semidefinite Programming Means Beating The Unique Games Conjecture
(Per Austrin, Theory Group, KTH CSC)
During the last few years, there have been several results of the form "if the Unique Games Conjecture is true, then problem X can not be approximated better than what is achieved by algorithm Y, based on semidefinite programming", indicating a strong connection between the UGC and the limitations of SDP-based approximation algorithms.
For 2-CSP problems in particular this connection has been very evident, with the optimal parameters for the hardness reductions for Max Cut and Max 2-Sat coming directly from the analysis of the best SDP-based approximation algorithms for the problems.
We generalize these results, by considering an arbitrary boolean 2-CSP (or more generally, an arbitrary nonnegative objective function on two boolean variables), and show that a set of "obstructions" towards obtaining a good rounding procedure for the SDP relaxation can be translated into a matching UG-hardness result. We also show that, under a certain conjecture on the nature of worst-case angles for the SDP relaxation, this result is tight. This conjecture is supported by all previous results for specific 2-CSPs.
The talk will be 45 minutes long and will be held in English.
Tuesday October 2, 13:15, room 4523:
Model Checking Network Applications
(Cyrille Artho, AIST, Japan)
This tutorial addresses a new model checking technique for networked applications. Such applications could not be model checked by traditional techniques, as multiple processes cannot be checked in normal (single-process) software model checkers. Our approach is to convert processes into threads and to model network communication using a special library and model checker extensions. Other approaches include the usage of stubs or a special cache that can serialize the state space exploration tree.
Monday September 10, 13:15, room 1537:
Generating Propagators for Finite Set Constraints
(Christian Schulte, Department of Electronic, Computer, and Software Systems, KTH ICT)
Constraint programming is a successful and widely used method for solving combinatorial optimization problems. An essential ingredient of any constraint programming system is propagators: implementations of constraints that perform strong constraint propagation.
Ideally, programming propagators as implementations of constraints should be an entirely declarative specification process for a large class of constraints: a high-level declarative specification is automatically translated into an efficient propagator.
This talk introduces the use of existential monadic second-order logic as declarative specification language for finite set propagators. The approach taken is to automatically derive projection propagators (involving a single variable only) implementing constraints described by formulas. By this, we transfer the ideas of indexicals to finite set constraints while considerably increasing the level of abstraction available with indexicals. We show soundness and completeness of the derived propagators and present a runtime analysis, including techniques for efficiently executing projectors for n-ary constraints.
Joint work with:
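As a minimal illustration of propagation on finite set domains (a hand-written sketch, not the projectors derived in the talk), consider the constraint S1 ⊆ S2 over set-interval domains, where each set variable is represented by a pair (glb, lub) of required and possible elements:

```python
# Toy bounds propagator for the finite-set constraint S1 subseteq S2.
# Each variable's domain is the interval [glb, lub] of sets: glb is the
# set of elements known to be in the variable, lub the set of elements
# that may still be in it.

def propagate_subset(s1, s2):
    """One propagation step for S1 subseteq S2.
    Returns the pruned domain pair, or None if a domain becomes empty
    (i.e., the constraint is unsatisfiable)."""
    glb1, lub1 = s1
    glb2, lub2 = s2
    lub1 = lub1 & lub2   # S1 can only contain what S2 may contain
    glb2 = glb2 | glb1   # S2 must contain everything S1 must contain
    if not (glb1 <= lub1 and glb2 <= lub2):
        return None      # wipe-out: required elements exceed possible ones
    return (glb1, lub1), (glb2, lub2)
```

Each rule touches a single variable's bounds, which is the flavor of the projection propagators (one variable per propagator) that the talk derives automatically from logical specifications.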
Wednesday June 27, 13:15, room 1537:
Towards modular verification of concurrent object-oriented programs
(Marieke Huisman, INRIA Sophia Antipolis)
Modular static verification of concurrent object-oriented programs remains a challenge. This talk discusses the impact of concurrency on the use of standard program-logic-based verification techniques.
Atomicity of methods is often advocated as a solution to the problem of verification of multithreaded programs. However, we show that in a design-by-contract framework atomicity in itself is not sufficient, because it does not consider specifications. Instead, we propose to use the notion of stability of method contracts, to allow sound modular reasoning about method calls. A contract is stable if it cannot be broken by interferences from concurrent threads.
We explain why stability of contracts cannot always be shown directly, and we speculate about different approaches to prove stability. One approach that we will detail further is the use of an annotation system to describe object capacities and locking policies. The annotation system can be used to specify how many threads can simultaneously access an object. The annotation system distinguishes between read-write accesses and read-only accesses, thus offering fine-grained concurrency control. The locking policy of an object describes which locks must be held before accessing it. The annotation system can express how ownership may be transferred or split between different threads.
The information that is given by the annotations can be exploited to verify other properties of the application. In particular, if an object is known to be local to a thread, sequential verification techniques can be used to verify functional correctness of its methods. We finish by outlining how a proof obligation generator for sequential programs can be extended to one for concurrent programs by using stability information.
This talk does not present a full technical solution to the problem, but instead describes work in progress. It shows how the verification problem can be decomposed into several smaller subproblems. For each subproblem, a solution is sketched, but the technical details still need to be worked out.
(Joint work with Clement Hurlin)
Thursday June 14, 13:15, room 1537:
Logic-based Specification and Verification of Multi-Agent Systems
(Alessio Lomuscio, Department of Computing, Imperial College London)
Multi-agent systems are open, highly-autonomous systems whose components act rationally, independently or interacting with their peers, to achieve their design objectives. While several formalisms in Artificial Intelligence have been developed in the past to represent multi-agent systems, the issue of their automatic verification has acquired prominence only very recently. In this talk I will discuss some of my own contributions to this area. In particular, I will survey some recent work on a family of temporal, epistemic, correctness, and ATL logics, as well as symbolic model checking techniques (both OBDD- and SAT-based) for their verification. I will discuss current research directions and present brief demonstrations of MCMAS, a specialised model checker for temporal, epistemic, and ATL logics.
Wednesday April 25, 13:15, room 1537:
Towards the Engineering of Modular Software for Increased Predictability
(Michel Schellekens, Centre for Efficiency-Oriented Languages, University College Cork)
We focus in this talk on two main methods used in academia and industry to optimize/evaluate software: worst-case and average-case analysis. These methods can be used in a variety of contexts for optimization purposes. For instance in a Real-Time context, to efficiently budget resources, and in embedded systems, for optimizing power consumption.
A crucial property for the predictability of software is modularity, i.e. the capacity to predict the behaviour of software from the behaviour of its components. It is shown that the worst-case measure typically does not allow for exact modularity. Current Real-Time approaches to static worst-case analysis are discussed in this light. On the other hand, we show that the average-case measure does possess inherent modularity. We show how this modularity can be exploited, based on a redesign of standard data structuring operations, to track the distribution of the data states throughout a computation. This approach in turn has enabled the specification of the novel programming language MOQA, implemented in Java 5.0, and its associated timing tool DISTRI-TRACK. MOQA (MOdular Quantitative Analysis), essentially a suite of data structure operations for modular design, is guaranteed to be modular w.r.t. the average-case time measure. This is not the case for general purpose programming languages and in particular for current languages geared towards automated average-case analysis.
The approach links several, thus far largely separate, areas together, including Semantics, Complexity, Analysis of Algorithms and Real-Time Language design. The corresponding unified foundation for algorithmic analysis has led to the solution of bottle-neck problems in automated average-case timing (open problems on dynamic algorithms, first investigated by Knuth) and has given rise to novel algorithms.
The talk focuses on the intuitions underlying the approach and should be accessible to anyone with a standard undergraduate background in the Analysis of Algorithms. The talk touches on some core issues which will be discussed in the book "A Modular Calculus for the Average Cost of Data Structuring", to appear with Springer.
Monday March 5, 13:15, room 1537:
Computational Aspects of Random Boolean Networks
(Elena Dubrova, KTH ICT)
Research on networks is becoming essential to all branches of science as we struggle to interpret the data coming from neurobiology, genomics, economics, ecology, and the Internet. Random Boolean Networks (RBNs) were introduced by Kauffman in 1969 in the context of gene expression and fitness landscapes. They were applied to the problems of cell differentiation, immune response, evolution, and neural networks. They have also attracted the interest of physicists due to their analogy with the disordered systems studied in statistical mechanics, such as the mean-field spin glass. An RBN is a synchronous Boolean automaton. Each vertex has k predecessors, selected at random, and an associated Boolean function of k variables. Kauffman has shown that it is possible to tune the parameters of an RBN so that the network exhibits self-organized critical behavior ensuring both stability and evolutionary improvements. Statistical features of self-organized RBNs match the characteristics of living cells. This talk focuses on computational aspects of RBNs. First, we give an introduction to random Boolean networks and show how they can be used for modeling the gene regulatory networks of living cells. Then, we describe three basic steps in the analysis of the dynamical behavior of RBNs: redundancy removal, partitioning, and computation of attractors. Finally, we discuss open problems and outline the prospects of RBNs.
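The model is easy to simulate. The following sketch (an illustration assuming the standard synchronous-update semantics, not code from the talk) builds a small random network and finds the attractor reached from a given initial state by iterating until a state repeats:

```python
import random

def random_boolean_network(n, k, seed=0):
    """Each of the n vertices gets k random predecessors and a random
    Boolean function of k inputs, stored as a truth table of 2**k bits."""
    rng = random.Random(seed)
    preds = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return preds, tables

def step(state, preds, tables):
    """Synchronous update: every vertex reads all predecessors at once."""
    return tuple(
        tables[v][sum(state[p] << i for i, p in enumerate(preds[v]))]
        for v in range(len(state))
    )

def attractor(state, preds, tables):
    """Iterate until a state repeats; the trace from the first repeated
    state onward is the attractor cycle reached from the initial state."""
    seen, trace = {}, []
    while state not in seen:
        seen[state] = len(trace)
        trace.append(state)
        state = step(state, preds, tables)
    return trace[seen[state]:]

preds, tables = random_boolean_network(n=6, k=2)
cycle = attractor((0,) * 6, preds, tables)
print(len(cycle))  # length of the attractor reached from the all-zero state
```

Brute-force iteration like this is exponential in n in the worst case, which is exactly why the talk's three analysis steps (redundancy removal, partitioning, symbolic attractor computation) matter.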
Joint TCS/CIAM Seminar
Tuesday February 27, 15:15-17:00, room D3:
Security and Cryptography in Mobile Wireless Networks
(Mats Näslund, Ericsson Research)
In the first half of the talk I will give an overview of the security functions and cryptography used in today's GSM (2G) and UMTS (3G) networks. I will also describe some cryptographic attacks on 2G security, largely due to the 'security by obscurity' approaches used in the past. This has led to a rethinking of the security design principles, and in the second half I will present the new data authentication algorithm for UMTS, which in a formal setting can be proven to be (unconditionally) secure. The only prerequisite is a basic understanding of finite fields.
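As a sketch of how unconditional security is possible at all, consider the classic Carter-Wegman-style polynomial MAC over a finite field (an illustrative textbook construction, not the UMTS algorithm presented in the talk). With a one-time key pair (k, r), any forger's success probability is bounded information-theoretically, regardless of computing power:

```python
P = 2 ** 61 - 1  # a Mersenne prime; all arithmetic is in GF(P)

def poly_mac(msg_blocks, k, r):
    """tag = r + sum(m_i * k**i) mod P, for i = 1..len(msg_blocks),
    computed by Horner's rule. If the key pair (k, r) is used for only
    one message, a forger produces a second valid (message, tag) pair
    with probability at most len(msg_blocks)/P -- no matter how much
    computation they perform."""
    acc = 0
    for m in reversed(msg_blocks):
        acc = (acc + m) * k % P
    return (acc + r) % P

# Same one-time key, different messages: tags look unrelated.
print(poly_mac([5], 3, 7), poly_mac([1, 2], 10, 0))
```

The security argument is purely algebraic: two messages with the same tag under the same key would give a nonzero polynomial of degree at most n with k as a root, and a uniformly random k hits such a root with probability at most n/P.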
Tuesday January 23, 13:15, room 1537:
Learning of timed systems
(Olga Grinchtein, Uppsala University)
We present an algorithm for constructing a timed-automaton model of a system from information obtained by observing its external behavior. The construction of models from observations of system behavior can be seen as a learning problem. For finite-state reactive systems, it means constructing a (deterministic) finite automaton from the answers to a finite set of membership queries, each of which asks whether a certain word is accepted by the automaton or not. This problem is well understood, e.g., through the work of Angluin. We extend this approach to learning of timed systems, modeled by deterministic event-recording automata. Our construction deviates from previous work, including that of Angluin, in that it first constructs a so-called timed decision tree from observations of system behavior. When sufficiently many observations have been recorded, this decision tree is folded into an event-recording automaton.
joint work with Bengt Jonsson and Paul Pettersson
Monday December 11, 10:15, room 4523 (NB! Not the usual time and place):
Higher Level Fusion For Catastrophic Events
(Galina L. Rogova PhD, Encompass Consulting, USA)
The core purpose of higher level fusion (situation and threat assessment) is to infer and approximate the characteristics and critical events of the environment in relation to specific goals, capabilities and policies of the decision makers. The higher level fusion processes utilize fused data about objects of interest, dynamic databases, maps, and expert knowledge and opinion for context processing. The result of higher level fusion is a coherent composite picture of the current and predicted situation, which provides human experts with essential information to help them understand and control the situation, and act effectively to mitigate its impact. Situation and threat assessment processing has to be adaptive to resource and time constraints and new and uncertain environments, and reactive to uncertain and unreliable heterogeneous inputs.
The presentation will discuss major challenges, specific requirements, and approaches to designing higher level fusion processes as applied to the problem of crisis management.
The higher level fusion processing described in the presentation exploits the synergy between cognitive work analysis and ontological analysis of the specific domain, developed within the framework of a formal ontology. The combination of cognitive work analysis and ontology provides a formally structured and computationally tractable domain representation capturing the basic structures of relevant objective reality and users' domain knowledge and requirements. This domain representation further serves as a basis for generating domain-specific situational hypotheses and high-level reasoning about these hypotheses. The dynamic situational picture is built by analyzing spatial and temporal relations of the situational entities and entity aggregations at different levels of granularity, and their dynamics provided within the overall situational context. Special attention is paid to "inference to the best explanation", aimed at discovering the underlying causes of observed situational entities and their behavior. The reasoning framework considered, a Belief-Based Argumentation system, is a generalization of the Probabilistic Argumentation System. It allows for allocating rational belief in hypotheses about the environment by utilizing given knowledge to find and combine arguments in favor of and against them.
The presented methodology also includes multi-step inter-level and intra-level information exchange, comprising quality-control and belief-update steps.
Monday November 20, 13:15, room 1537:
Counting Set Covers
(Andreas Björklund, Lund Institute Of Technology)
In a Set Cover problem we are given a ground set U of size n and a family S of size m of subsets of U, and we want to know if U can be covered by k of the sets from S. We give two related algorithms that improve on straightforward dynamic programming over all subsets of U.
Based on two recent papers with Thore Husfeldt announced at ICALP 2006 and FOCS 2006. The seminar will be roughly 45 minutes long.
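The counting flavor of these results can be illustrated with the basic inclusion-exclusion identity for covers: the number of ordered k-tuples of sets covering U is a signed sum over subsets X of U of a(X)^k, where a(X) counts the sets contained in X. The brute-force loop over all 2^n subsets below is for exposition only and is not the authors' faster algorithm.

```python
def count_k_covers(n, sets, k):
    """Count ordered k-tuples (S_1, ..., S_k) from `sets` whose union is
    the ground set {0, ..., n-1}, by inclusion-exclusion:
    sum over X of (-1)^{n - |X|} * a(X)^k, with a(X) = |{S : S subset of X}|."""
    universe = (1 << n) - 1
    masks = [sum(1 << e for e in s) for s in sets]
    total = 0
    for x in range(1 << n):
        a = sum(1 for m in masks if m & ~x & universe == 0)  # sets inside X
        sign = -1 if bin(universe & ~x).count("1") % 2 else 1
        total += sign * a ** k
    return total

# U = {0, 1, 2}; can two of the given sets cover U?
sets = [{0, 1}, {1, 2}, {2}]
print(count_k_covers(3, sets, 2) > 0)  # True: {0,1} ∪ {1,2} = U
```

A positive count answers the decision question "can U be covered by k of the sets", which is how counting subsumes the covering problem.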
Monday October 2, 13:15, room 1537:
BitTorrent
(Stefan Nilsson, Theory Group, KTH CSC)
BitTorrent is a file-transfer protocol that makes it possible to distribute large files to many users in a short time using very modest server resources. In this talk I will describe how the protocol works, recount how Bram Cohen invented it, and discuss its scalability, security, and limitations.
The seminar will be roughly 45 minutes long.
Wednesday June 21, 15:15, room 1537:
Cybersecurity and its limitations
(Andrew Odlyzko, University of Minnesota)
Network security is terrible, and we are constantly threatened with the prospect of imminent doom. Yet such warnings have been common for the last two decades. In spite of that, the situation has not gotten any better. On the other hand, there have not been any great disasters either. To understand this paradox, we need to consider not just the technology, but also the economics, sociology, and psychology of security. Any technology that requires care from millions of people, most very unsophisticated in technical issues, will be limited in its effectiveness by what those people are willing and able to do. The interactions of human society and human nature suggest that security will continue being applied as an afterthought. We will have to put up with the equivalent of bailing wire and chewing gum, and to live on the edge of intolerable frustration. However, that is not likely to block development and deployment of information technology, because of the non-technological protection mechanisms in our society.
The talk will be roughly 60 minutes long.
Thursday June 15, 10:15, room 1537:
A Framework for Sequential Planning in Multi-Agent Settings
(Piotr Gmytrasiewicz, Department of Computer Science, University of Illinois at Chicago)
This work extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the notion of agent models into the state space. Agents maintain beliefs over physical states of the environment and over models of other agents, and they use Bayesian updates to maintain their beliefs over time. The solutions map belief states to actions. Models of other agents may include their belief states and are related to agent types considered in games of incomplete information. We express the agents' autonomy by postulating that their models are not directly manipulable or observable by other agents. We show that important properties of POMDPs, such as convergence of value iteration, the rate of convergence, and piece-wise linearity and convexity of the value functions carry over to our framework. Our approach complements a more traditional approach to interactive settings which uses Nash equilibria as a solution paradigm. We seek to avoid some of the drawbacks of equilibria which may be non-unique and do not capture off-equilibrium behaviors. We do so at the cost of having to represent, process and continuously revise models of other agents. Since the agent's beliefs may be arbitrarily nested, the optimal solutions to decision making problems are only asymptotically computable. However, approximate belief updates and approximately optimal plans are computable. We illustrate our framework using a simple application domain, and we show examples of belief updates and value functions.
Thursday June 8, 13:15, room 1537:
Narrow Proofs May Be Spacious: Separating Space and Width in Resolution
(Jakob Nordström, Theory Group, KTH CSC)
Resolution is a proof system for proving tautologies in propositional logic. It works by showing that the negation of a tautology, encoded as a CNF formula, is unsatisfiable. There is only one derivation rule, namely that from the clauses C ∨ x and D ∨ ¬ x we can resolve on the variable x to derive the resolvent clause C ∨ D. A resolution proof refutes an unsatisfiable formula F by deriving the empty clause 0, i.e., the clause with no literals, from F.
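As a concrete illustration, the resolution rule can be written in a few lines, representing clauses as frozensets of signed integers (a positive integer x for the literal x, -x for its negation); this encoding is our own, not from the talk.

```python
def resolve(c1, c2, var):
    """Resolve clauses c1 and c2 on variable var.  Requires var in c1 and
    -var in c2; returns the resolvent (c1 \\ {var}) ∪ (c2 \\ {-var})."""
    assert var in c1 and -var in c2
    return (c1 - {var}) | (c2 - {-var})

# From C ∨ x and D ∨ ¬x derive C ∨ D: here C = {1}, D = {2}, x = 3.
print(resolve(frozenset({1, 3}), frozenset({2, -3}), 3))  # frozenset({1, 2})

# Resolving the unit clauses x and ¬x yields the empty clause -- a refutation.
print(resolve(frozenset({3}), frozenset({-3}), 3))  # frozenset()
```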
Because of its simplicity, resolution is well adapted to proof search algorithms. Many real-world automated theorem provers are based on resolution. It is also perhaps the single most studied propositional proof system from a theoretical point of view in the area of proof complexity.
The width of a resolution proof is the maximal number of literals in any clause of the proof. The space of a proof is, intuitively, the maximal number of clauses one needs to keep in memory while verifying the proof. Both of these measures have previously been studied and related to the resolution refutation size of unsatisfiable CNF formulas. Also, the refutation space of a formula has been proven to be at least as large as the refutation width, but it has been open whether space can be separated from width or the two measures coincide asymptotically. We prove that there is a family of k-CNF formulas for which the refutation width in resolution is constant but the refutation space is non-constant, thus solving a problem mentioned in several previous papers.
Our result has been published as ECCC Report TR05-066, and an extended abstract appeared in STOC '06 (co-winner of Danny Lewin Best Student Paper Award).
The talk will be given in Swedish or English depending on the participants, and is intended to last for 2x45 minutes.
Wednesday May 31, 13:15, room 1537:
Confluent Markov Chains
(Parosh Abdulla, Department of Information Technology, Uppsala University)
We consider infinite-state discrete Markov chains which are confluent: each computation will almost certainly either reach a defined set F of final states, or reach a state from which F is not reachable. Confluent Markov chains include probabilistic extensions of several classical computation models such as Petri nets, Turing Machines, and communicating finite-state machines.
For confluent Markov chains, we consider three different variants of the reachability problem and the repeated reachability problem: The qualitative problem, i.e., deciding if the probability is one (or zero); the approximate quantitative problem, i.e., computing the probability up-to arbitrary precision; and the exact quantitative problem, i.e., computing probabilities exactly.
We also study the problem of computing the expected reward (or cost) of runs until reaching the final states, where rewards are assigned to individual runs by computable reward functions.
Monday May 8, 13:15, room 1537:
Cryptography in mobile networks
(Mats Näslund, Ericsson Research)
Cryptography in mobile networks has over the last 15 years gone from security-by-obscurity to a more sound approach of using publicly scrutinized algorithms. I will give a brief history of cryptography in GSM and UMTS, and then focus on some quite interesting new algorithms for UMTS that have recently been developed by ETSI SAGE. In particular, the new integrity algorithm is based on an (unconditionally) provably secure construction. It builds on quite old but nice (and easy) results by Carter and Wegman, which for some reason did not see wider deployment until recently.
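The Carter-Wegman idea can be sketched as a toy polynomial-evaluation MAC over a prime field; this is only an illustration of the unconditional-security construction, not the actual UMTS integrity algorithm, and all parameter names are made up.

```python
# Toy Carter–Wegman MAC over GF(P): hash the message blocks with a secret
# polynomial-evaluation universal hash, then mask with a one-time pad value.
P = 2 ** 61 - 1  # a Mersenne prime, so arithmetic mod P is a field

def cw_mac(blocks, r, otp):
    """Tag = otp + sum_i m_i * r^(i+1) mod P.
    The pad `otp` must be used for one message only; `r` can be reused.
    The forgery probability is bounded by (number of blocks) / P,
    independently of the adversary's computing power."""
    h = 0
    for m in reversed(blocks):
        h = (h + m) * r % P  # Horner evaluation of the message polynomial at r
    return (h + otp) % P

key_r, key_otp = 123456789, 987654321  # would be secret and random in practice
msg = [10, 20, 30]
tag = cw_mac(msg, key_r, key_otp)
assert tag == cw_mac(msg, key_r, key_otp)           # verifier recomputes the tag
assert tag != cw_mac([10, 20, 31], key_r, key_otp)  # a modified block changes it
```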
Tuesday May 2, 13:15, room 1537:
CoverTranslator - from Haskell to First Order Logic
(Patrik Jansson, CSE Department, Chalmers University of Technology)
The Cover project at Chalmers has been developing systems (theories, languages, libraries and tools) for software verification of Haskell programs. I will give a quick overview of the Cover project and present CoverTranslator in a little more detail. The translator takes as input Haskell programs with properties (defined in an embedded language), translates these into first order clauses and uses off-the-shelf FOL provers to prove the properties.
Thursday March 23, 13:15, room 1537:
Cryptographically Sound Theorem Proving
(Christoph Sprenger, Department of Computer Science, ETH Zurich)
Tools for security protocol verification are traditionally based on Dolev-Yao models, which give the adversary complete control over the network and assume cryptography to be perfect. Recently, much research has been devoted to underpinning such symbolic protocol models with sound cryptographic foundations (possibly relaxing the perfect cryptography assumption).
In this talk, I will describe a faithful embedding of the Dolev-Yao-style model of Backes, Pfitzmann, and Waidner (CCS 2003) in the theorem prover Isabelle/HOL. This model provides strong soundness guarantees in the sense of reactive simulatability: essentially arbitrary security properties proved in the symbolic model carry over to the cryptographic realization and this holds under active attacks and in arbitrary protocol environments. The main challenge in designing a practical formalization of this model is to cope with the complexity of providing such strong guarantees. We have reduced this complexity by abstracting the model into a sound, light-weight formalization that enables both concise property specifications and efficient application of our proof strategies and their supporting proof tools. This yields the first tool-supported framework for symbolically verifying security protocols that enjoys the strong cryptographic soundness guarantees provided by reactive simulatability.
Monday March 13, 13:15, room 1537:
Compositional Verification of Sequential Programs with Procedures
(Dilian Gurov, Theory Group, KTH CSC)
I present a method for algorithmic, compositional verification of control flow based safety properties of sequential programs with procedures. The application of the method involves three steps: (1) decomposing the desired global property into local properties of the components, (2) proving the correctness of the property decomposition by using a maximal model construction, and (3) verifying that the component implementations obey their local specifications. I shall consider safety properties of the control flow behaviour of programs, as well as of the control flow structure.
The compositional verification method builds on a technique proposed by Grumberg and Long who use maximal models to reduce compositional verification of finite-state parallel processes to standard model checking. The generalisation of the maximal model technique to programs with recursion requires a refinement, since maximal applets are only guaranteed to exist for structural but not for behavioural properties. I therefore present a mixed, two-level approach where local assumptions are structural, while the global guarantee is behavioural.
The proposed verification method is applicable to arbitrary sequential programs with procedures. It is evaluated on an industrial case study taken from the smart card area. By separating the tasks of verifying global and local properties the method supports secure post-issuance loading of applets onto a smart card.
Monday February 27, 15:15 (NB! Not the usual time!), room 1537:
Introduction to information flow analysis
(Mads Dam, Theory Group, KTH CSC)
In computer security, information flow analysis concerns the problem of determining whether, and, potentially, how much, information is flowing across security boundaries. Confidentiality and integrity, for instance, are easily modelled as information flow problems: For confidentiality the problem concerns flow of secret data to public domains, and for integrity the problem is dual, to prevent flow of insecure data to secure domains. In the talk I introduce the research area, state of the art and open issues, and cover some recent work in the area, including some by the speaker.
Monday February 13, 13:15, room 1537:
Cryptographic Protocol Logic. A Synthetic Approach.
(Simon Kramer, EPFL)
I present my current work on CPL, a logic for reasoning about cryptographic protocols. CPL establishes truth on the grounds of evidence-based knowledge and spans the modal dimensions of knowledge, norms, space, and time.
The contribution of my work is twofold:
Thursday January 26, 9:15, room 1537:
Ontology Based Higher-Level Fusion
(Mieczyslaw M. Kokar, Department of Electrical and Computer Engineering, Northeastern University, Boston)
Higher-level fusion involves estimation of abstract entities - sometimes called "situations" - that can be represented as relations among objects, both physical and conceptual. Unlike features of physical objects, features of relations are not directly measured by any sensors. Instead, the existence of a relation is derived from a domain theory relevant to a specific scenario. In this talk, both theoretical and practical aspects of situation awareness and higher-level information fusion will be discussed. First, a motivational example will be given to demonstrate the importance of relations and to introduce the concept of situation. Then a formal mathematical definition of situation will be formulated and mapped to an ontological framework. This will be followed by a presentation of some methodological techniques and some technologies that are needed for establishing an ontological approach to higher level fusion processing. In conclusion, directions for both applications and research in the areas of ontologies and higher-level fusion will be discussed.
December 20, 13:15, room 1537:
Spam fighting and The Complexity of Pebbling Graphs
(Moni Naor, Weizmann Institute of Science)
Consider the following simple technique for combatting spam:
If I don't know you, and you want your e-mail to appear in my inbox, then you must attach to your message an easily verified "proof of computational effort", just for me and just for this message. To apply this approach, one needs computational problems whose solutions require a significant expenditure of resources to find but can be verified easily. In this talk I will introduce this approach and concentrate on the choice of computational problems for which most of the work is in retrieving information from memory. In particular I will describe the connection to pebbling problems.
(Joint work with Cynthia Dwork and Hoeteck Wee.)
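The costly-to-solve, cheap-to-verify asymmetry behind such proofs of effort can be sketched with a partial hash-preimage puzzle. Note that this toy is CPU-bound, whereas the talk's point is to favor memory-bound problems (hence the connection to pebbling); the recipient and message strings and parameters are illustrative.

```python
import hashlib

def make_proof(recipient, message, difficulty_bits=16):
    """Find a nonce such that SHA-256(recipient|message|nonce) starts with
    `difficulty_bits` zero bits -- about 2^difficulty_bits hashes to find,
    a single hash to verify."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{recipient}|{message}|{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce
        nonce += 1

def verify_proof(recipient, message, nonce, difficulty_bits=16):
    digest = hashlib.sha256(f"{recipient}|{message}|{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

nonce = make_proof("alice@kth.se", "hello", difficulty_bits=12)
print(verify_proof("alice@kth.se", "hello", nonce, difficulty_bits=12))  # True
```

Binding the proof to recipient and message is what makes it "just for me and just for this message": a spammer cannot amortize one solution over many mails.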
September 19, 10:15, room 1537:
A compositional natural semantics and Hoare logic for low-level languages
(Tarmo Uustalu, Institute of Cybernetics, Tallinn)
The advent of proof-carrying code has generated significant interest in reasoning about low-level languages. It is widely believed that low-level languages with jumps are difficult to reason about because they are inherently non-modular. We argue that this is untrue. We take it seriously that, differently from statements of a high-level language, pieces of low-level code are multiple-entry and multiple-exit. And we define a piece of code to consist of either a single labelled instruction or a finite union of pieces of code. Thus we obtain a compositional natural semantics and a matching Hoare logic for a basic low-level language with jumps. By their simplicity and intuitiveness, these are comparable to the standard natural semantics and Hoare logic of While. The Hoare logic is sound and complete with respect to the semantics and allows proofs in the Hoare logic of While to be compiled.
(Joint work with Ando Saabas, based on a paper at SOS 2005.)
September 15, 13:15, room 1537:
Web services security
(Alan Abdulrahman)
As the surrounding world changes, IT systems grow and become more complex. By providing everything in terms of service modules to internal and external users of an organization, these service modules can easily be regrouped and exchanged to provide new forms of services adapted to the new situation. This is the idea of service-oriented architectures. A set of specifications that provides the means to communicate in a platform- and language-independent manner is grouped under the concept of Web services, and provides a realization of service-oriented architectures. Web services facilitate communication between partner organizations with entirely different underlying IT infrastructures by exchanging XML messages in a standardized manner.
To address security in Web services environments, another set of specifications is being developed under the name Web services security. These specifications include mechanisms for securing single messages, establishing and brokering trust relationships between organizations, putting security capabilities and constraints on Web services, establishing security contexts, federating identities across partner organizations, stating privacy policies, and subjecting Web services to fine-grained access control.
This seminar will present in more detail the specifications that comprise Web services security. No more than basic knowledge on computer security mechanisms is required.
A mix-net is a multi-party protocol that takes a list of cryptotexts and outputs the list of corresponding cleartexts in random order. No individual mix-server knows the secret key of the cryptosystem used or the resulting random permutation. The main application of mix-nets is to implement electronic elections.
We propose here the first efficient publicly verifiable hybrid mix-net. In order to achieve this goal, we have developed a new IND-ME-CCA-secure multiple-encryption scheme using hybrid encryption, and a perfect zero-knowledge argument for shuffle-and-decryption of ElGamal ciphertexts. Although the resulting mix-net does not provide full public verifiability of the hybrid decryption in the case when a user and a mixer collude, the best the adversary can do is to switch the input between a valid and an invalid one. The resulting scheme is efficient enough to handle large-scale electronic questionnaires with long messages, as well as voting with write-ins. The scheme is provably secure if we assume random oracles, semantic security of a one-time symmetric-key cryptosystem, and the intractability of the decisional Diffie-Hellman problem.
Let G=(V,E,w) be a weighted directed graph with integer edge weights of absolute value at most M. We show that G can be preprocessed in O*(Mn^w) time, where w<2.376 is the exponent of fast matrix multiplication, such that subsequently each distance d(u,v) in the graph can be computed exactly in O(n) time. As a very special case, we obtain an O*(Mn^w) time algorithm for the SINGLE SOURCE SHORTEST PATHS (SSSP) problem for directed graphs with integer weights of absolute value at most M. For sufficiently dense graphs, with edge weights that are not too large, this improves upon the O*(mn^{1/2}log M) time algorithms of Gabow and Tarjan, and Goldberg.
Joint work with Raphael Yuster.
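The structure underlying such algorithms can be sketched with the min-plus "distance product" and repeated squaring; the actual O*(Mn^w) result replaces the cubic inner loop below with fast matrix multiplication on suitably encoded entries, which this naive sketch does not attempt.

```python
INF = float("inf")

def dist_product(A, B):
    """Min-plus ('distance') product: C[i][j] = min_k A[i][k] + B[k][j]."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_shortest(W):
    """Repeated squaring: lg n distance products of the weight matrix give
    all shortest distances (assuming no negative cycles)."""
    n = len(W)
    D, steps = W, 1
    while steps < n:
        D = dist_product(D, D)
        steps *= 2
    return D

# Edges: 0 -> 1 (cost 3), 1 -> 2 (cost 1), 2 -> 0 (cost 2).
W = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
D = all_pairs_shortest(W)
print(D[0][2])  # 4: the path 0 -> 1 -> 2 costs 3 + 1
```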
The University of Skövde is embarking on a research program, funded by the Knowledge Foundation, Sweden, in the area of Information Fusion from Databases, Sensors and Simulations. Information fusion entails the combination of data from multiple sources to generate information that cannot be derived from the individual sources. There are several reasons for focusing on this. The area is of strategic importance for industry, defense, and public administration areas such as health care. A large number of industrial partners are supporting and participating in the development of the area. This work is related to two former programs, Mechatronic Systems and Learning Systems, previously funded by the Knowledge Foundation. Several research groups at the University of Skövde are active in different aspects of information fusion, adding to the strength of the program. The program covers the entire spectrum from sensors to decision support, i.e., from technical to human (cognitive and organizational) aspects, where decisions need to be based on different sources of information or where time is a critical factor and the decision support needs to be of the highest possible quality, given the available information. Examples of the latter are when a company needs to act before a competitor does, or when a fighter pilot needs to act before his adversary does, i.e., when it is necessary to get inside the competitor's decision loop.
The talk does not require prior knowledge of information fusion.
The controlled linear programming problem (CLPP) is a combinatorial optimization problem. An instance consists of a number of linear constraints of a certain form. A controller is allowed to select and discard constraints according to simple rules, with the goal of maximizing the optimal solution to the resulting linear program.
The CLPP captures and generalizes parity, mean payoff, discounted payoff, and simple stochastic games. For its most general version, the exact complexity is still unknown, but several rich subclasses can be shown to belong to NP intersection coNP. In this talk we use linear algebra to characterize the properties of such subclasses, and prove a number of new results. We also identify sufficient conditions for a class to be solvable in randomized subexponential time.
A credential system allows users to obtain credentials from organizations and demonstrate possession of these credentials. An anonymous credential system further ensures that different transactions by the same user cannot be linked to each other. Anonymous credential systems are therefore one of the cornerstones of protecting users' privacy in electronic transactions.
This talk discusses anonymous credential systems and generic yet efficient constructions for them.
BAN logic is an epistemic logic for the verification of security protocols, proposed by Burrows, Abadi and Needham in the late 1980s. From a practical point of view, BAN logic has turned out to be quite successful: reasoning about cryptographic protocols in terms of principals' knowledge is arguably very natural, and moreover, BAN logic produces short and informative derivations which can reveal quite subtle protocol errors. However, despite quite a number of attempts, the semantics of BAN logic has remained problematic. In the talk we pinpoint the rule of normality, essentially monotonicity of the epistemic modality, as the chief culprit: this rule is validated by all proposed semantics we know of, but adding it to BAN logic makes intuitively absurd statements derivable from intuitively correct ones without much trouble. To overcome this problem we propose to adopt an idea from counterpart semantics, namely to generalize the epistemic accessibility relation from a relation between local states to a relation between pairs of messages and local states, by systematically renaming content which is cryptographically inaccessible. We use this idea to build a new semantics for BAN logic which we claim avoids the problems of the previous semantics, and we show how the idea can be used to build semantics for richer logics up to and including first-order BAN logic. The latter is interesting, in particular, since it provides a framework in which existing semantics can be compared. Finally, if there is time, we will discuss soundness of the proposed cryptographic counterpart semantics.
We present a method for converting shares of a secret into shares of the same secret in a different secret-sharing scheme using only local computation and no communication between players.
We show how this can be combined with any pseudorandom function to create, from initially distributed randomness, any number of Shamir secret-sharings of (pseudo)random values without communication. We apply this technique to obtain efficient non-interactive protocols for secure computation of low-degree polynomials, which in turn give rise to other applications in secure computation and threshold cryptography. For instance, we can make the Cramer-Shoup threshold cryptosystem by Canetti and Goldwasser fully non-interactive, or assuming initially distributed randomness, we can compute any function securely in 2 rounds of communication.
The solutions are practical only for a relatively small number of players. However, in our main applications the number of players is typically small, and furthermore it can be argued that no solution that makes a black-box use of a pseudorandom function can be more efficient.
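The non-interactive generation of Shamir sharings from initially distributed keys can be sketched for three players with threshold one (any two shares determine the value); the PRF, field size, and key names are illustrative toys, not the paper's construction.

```python
import hashlib

P = 2 ** 31 - 1  # prime, so arithmetic mod P is a field

def prf(key, x):
    """Toy PRF: in practice this would be a keyed cryptographic PRF."""
    h = hashlib.sha256(f"{key}|{x}".encode()).digest()
    return int.from_bytes(h, "big") % P

def interpolate_at_zero(points):
    """Lagrange-interpolate f(0) from (x, f(x)) pairs over GF(P)."""
    secret = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

# Key r_a is held by every player EXCEPT player a.  A player's Shamir share
# of a fresh pseudorandom value is a local linear combination of PRF outputs:
# no communication is needed after the one-time key distribution.
keys = {1: "rA1", 2: "rA2", 3: "rA3"}

def f_A(a, x):
    # Degree-1 polynomial with f_A(0) = 1 and f_A(a) = 0.
    return (a - x) * pow(a, P - 2, P) % P

def share(i, session):
    return sum(prf(keys[a], session) * f_A(a, i) for a in keys if a != i) % P

shares = [(i, share(i, "round-42")) for i in (1, 2, 3)]
value = interpolate_at_zero(shares)
print(value == interpolate_at_zero(shares[:2]))  # True: any 2 shares suffice
```

The shares all lie on one degree-1 polynomial whose value at 0 is the sum of the PRF outputs, so every pair of players reconstructs the same pseudorandom secret, mirroring the communication-free sharing described above.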
I will give an introduction to Abadi and Cardelli's object calculus, a typed system similar to the simply typed lambda calculus but where objects, rather than functions, are the primitive syntactic entities. I give a denotational model for the first-order object calculus (without subtyping) in the category pCpo of cpos and partial maps. The key novelty of this model is its extensive use of recursively defined types, supporting self-application, to model objects. At a technical level, this entails using sophisticated techniques such as Freyd's algebraic compactness to guarantee the existence of the denotations of the object types. The key technical complexity is the mixed-variance functors needed to model object types.
I will show that a canonical recursion operator is inherent in this semantics. This operator can be useful in object-oriented programming: both in algebraic/coalgebraic formal methods and in capturing recurring abstractions in actual programs. The usefulness of the operator is witnessed by giving a straightforward translation of algebraic datatypes into so called wrapper classes. The talk concludes by comparing with Abadi and Cardelli's per semantics and by discussing current and future work.
The work reported here is joint work with Neil Ghani at University of Leicester.
Semidefinite programming has attracted interest in the computational complexity community as a tool for providing lower bounds on optimal values of certain combinatorial optimization problems. It has also become an important tool for solving many design problems in control and communications theory.
The aim of this talk is to give an introduction to semidefinite programming. We start with a discussion on linear programming and duality. These results are then generalized to semidefinite programming, with discussion on important differences. The ability to solve semidefinite programs efficiently has been enhanced quite significantly by the development of interior methods. We give a basic description of interior methods. Finally, a few application examples are discussed.
A black box secret sharing scheme (BBSSS) for a threshold access structure is a linear secret sharing scheme that works for any finite abelian group. Introduced by Desmedt and Frankel, the problem has been rigorously analyzed by Cramer and Fehr.
BBSSS can be based on number fields with certain properties. The approach by Desmedt and Frankel relies on number fields with a large Lenstra constant, i.e., number fields over which a "large" invertible Vandermonde matrix exists. The best known choices for these number fields lead to BBSSS with expansion factor O(n), where n is the number of players. The expansion factor corresponds to the length of each share, i.e., the number of group elements received from the dealer by each player. The solution of Cramer and Fehr achieves expansion factor O(log n), which is asymptotically optimal. It relies on low-degree number fields over which a pair of "large" Vandermonde matrices exists whose determinants are co-prime.
We propose a new approach which aims at achieving an absolutely optimal solution. Our approach is based on low-degree number fields containing a "large primitive set." We give some experimental as well as some theoretical results.
There are two closely related algorithmic problems which we want to address. Given a finite set X and a cost function W on its elements, one may be interested in computing the cost of X - the total cost of its elements. The other question is to find an element of X of largest cost.
For our purposes we must and will assume that the set X in question is endowed with a combinatorial structure that allows us to answer one of these questions efficiently, for any cost function W. Can this be of any use for the other question?
It is not hard to see that if we can count (i.e. compute the total cost of X), then optimization becomes easy. For instance, it is possible to solve the Assignment problem (finding a maximal-weight perfect matching in a bipartite graph) by computing appropriate determinants. What about the other direction?
It turns out that in a fairly general setting the two problems are 'equivalent'. This means that given an ability to optimize over a set one can estimate the total cost of the set. In fact, the estimates so obtained will be sufficiently precise to allow optimization.
The proofs use several tools from probability and statistics, such as the concentration of measure in product spaces, large deviations, asymptotics of order statistics etc.
The seminar presents a 1990 paper by Tom Leighton and C. Greg Plaxton.
The paper describes a parallel sorting method based on the so-called butterfly tournament. In it, 2^k participants play k rounds, in each of which they meet opponents with the same pattern of wins and losses so far. Players of similar strength thus compete against each other, which paves the way for a fast tournament.
In k rounds the butterfly tournament does not finish completely. The paper gives precise estimates of the fraction of input permutations that yield outputs with only moderate errors in the sorted order.
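A minimal sketch of the tournament structure, under the assumption that in round j entry i plays entry i XOR 2^(k-1-j) and the smaller value moves to the lower index (as noted, k rounds do not fully sort):

```python
def butterfly_rounds(a):
    """One pass of the butterfly tournament on 2^k entries in k rounds:
    in round j, entry i plays entry i XOR 2^(k-1-j), and the smaller
    value moves to the lower index.  k rounds do NOT fully sort, but
    the minimum ends up first and the maximum last."""
    n = len(a)
    k = n.bit_length() - 1
    assert n == 1 << k, "length must be a power of two"
    a = list(a)
    for j in range(k):
        bit = 1 << (k - 1 - j)
        for i in range(n):
            if i & bit == 0 and a[i] > a[i | bit]:
                a[i], a[i | bit] = a[i | bit], a[i]
    return a
```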
By splitting the output of the butterfly tournament into blocks and applying other sorting methods to these, the authors construct a sorting circuit that sorts correctly with very high probability.
The information-theoretic lower bound is 2 lg n - o(lg n) rounds. Leighton and Plaxton's algorithm improves on previous records by orders of magnitude, taking 7.44 lg n rounds. Can one do better?
The original setting in which secure two-party protocols were investigated allowed the execution of a single instance of the protocol at a time. A more realistic setting, however, is one which allows the concurrent execution of protocols. In the concurrent setting, many two-party protocols are executed at the same time, involving many parties which may be talking with the same (or many) other parties simultaneously. This setting presents the new risk of a coordinated attack in which an adversary controls many parties, interleaving the executions of the protocols and choosing messages based on other partial executions of the protocol.
In this work we consider the problem of constructing a general protocol for secure two-party computation in a way that preserves security under concurrent composition. In our treatment, we focus on the case where an a-priori bound on the number of concurrent sessions is specified before the protocol is constructed (a.k.a. bounded concurrency). We make no set-up assumptions.
Lindell (STOC 2003) has shown that any protocol for bounded-concurrent secure two-party computation, whose security is established via black-box simulation, must have round complexity that is strictly larger than the bound on the number of concurrent sessions. In this talk I will show how to construct a (non black-box) protocol for realizing bounded-concurrent secure two-party computation in a constant number of rounds. The only previously known protocol for realizing the above task required more rounds than the pre-specified bound on the number of sessions (despite usage of non black-box simulation techniques).
An extended abstract is available from the author's homepage http://www.nada.kth.se/~rafael/.
In the PCP model, a verifier is supposed to probabilistically decide if a given input belongs to some language by posing queries to a purported proof of this fact. The probability that the verifier accepts an input in the language given a correct proof is called the completeness c; the probability that the verifier rejects an input not in the language given any proof is called the soundness s. For a verifier posing q queries to the proof, the amortized query complexity is defined by q / log_2(c/s) if the proof is coded in binary. It is a measure of the average ``efficiency'' of the queries in the following sense: An ideal query should preserve the completeness and halve the soundness. If this were the case for all queries, the amortized query complexity would be exactly one.
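A quick numeric check of the definition (the parameter values are hypothetical): an ideal verifier keeps completeness 1 and halves the soundness with each of its q queries, giving s = 2^-q and amortized query complexity exactly 1.

```python
from math import log2

def amortized_query_complexity(q, c, s):
    """q queries, completeness c, soundness s, binary-coded proof."""
    return q / log2(c / s)

# An ideal 3-query verifier: c = 1, s = 2^-3 gives complexity exactly 1.
ideal = amortized_query_complexity(3, 1.0, 2 ** -3)
```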
Samorodnitsky and Trevisan (STOC 2000) gave a q-query PCP for NP with amortized query complexity (1 + 2/sqrt{q} + epsilon) for any constant epsilon > 0. In this paper, we examine to what extent their result can be sharpened. In particular, we consider the dependency between the probability that a proof of an incorrect statement is accepted and the number of queries posed to the proof oracle.
Our main result is a PCP for NP that queries q positions in the proof and has amortized query complexity (1 + sqrt{2/q} + epsilon). As an immediate corollary, we also obtain an improved hardness of approximation result for the Maximum q-CSP problem. As can be seen, our improvements are in the lower order term. It is, however, not possible to improve the amortized query complexity much more unless P=NP; a consequence of a result due to Trevisan (Algorithmica, 21(1):72--88, 1998) is that unless P=NP no PCP verifier for NP that queries q positions in the proof can have amortized query complexity 1 + 1/(q-1).
Our improved construction uses the layered label cover problem recently introduced by Dinur et al. (STOC 2003); based on such a label cover problem we devise a new ``outer verifier'' that allows us to construct an ``inner verifier'' that uses the query bits more efficiently than earlier verifiers.
We study the problem of performing efficient computations (on binary hardware) in GF(p^n), where p is a small prime different from 2. Certain primes of this kind (e.g. 3 and 5) are of special interest in some new cryptosystems based on elliptic curves. Beyond that, the problem is of course interesting from a purely theoretical perspective.
The main idea is to operate on several elements of GF(p) in parallel by packing them into the same machine word. The corresponding trick in GF(2^n) is both well known and rewarding, since addition there corresponds to XOR.
We give upper bounds (and possibly some hand-waving arguments about lower bounds) on how much space each GF(p) element needs in order for us to operate on them in parallel (i.e., essentially how many we can squeeze into one machine word), and present some performance comparisons from an actual implementation.
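A minimal sketch of the packing idea, assuming p = 3 and four bits per digit, and with a lanewise Python reduction standing in for the branch-free word operations a real implementation would use:

```python
# Pack several GF(3) digits into one machine word, 4 bits per digit.
# With digit values in 0..2 each lane sum stays below 2^4, so a single
# ordinary integer addition adds all lanes with no carry between them.
BITS = 4
MASK = (1 << BITS) - 1

def pack(digits):
    w = 0
    for i, d in enumerate(digits):
        w |= d << (BITS * i)
    return w

def unpack(w, n):
    return [(w >> (BITS * i)) & MASK for i in range(n)]

def add_gf3(u, v, n):
    s = u + v                       # all lanes added in one word operation
    # lanewise reduction mod 3; real code would do this branch-free
    return pack([d % 3 for d in unpack(s, n)])
```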
A problem in connection with group communication is to dynamically control membership in the group, i.e., to add new members and to expel old ones. For example, one wants to ensure that former members can no longer listen in on the communication, and this is handled with cryptographic keys. As long as the group is small this can be solved by distributing new keys to all remaining members in a fairly simple way, but as the group size grows this quickly becomes very complex and unmanageable.
We will take a closer look at some of the protocols that exist to solve this problem, both naive and more efficient ones. Among the more efficient, two receive extra attention: Subset Difference (SD) and Logical Key Hierarchy (LKH). In the context of a proposed real-world application, we also ask what requirements this places on the revocation protocol, and whether any practically usable alternative exists today.
A5/1 is a stream cipher used in the GSM standard. Several time-memory trade-off attacks against A5/1 have been proposed, most notably the attack by Biryukov, Shamir and Wagner, which can break A5/1 in seconds using huge precomputation time and memory. We present a completely different attack on A5/1, based on ideas from correlation attacks. Whereas time-memory trade-off attacks have a complexity which is exponential with the shift register length, the complexity of the proposed attack is almost independent of the shift register length. Our implementation of the suggested attack breaks A5/1 in a few minutes using 2-5 minutes of conversation.
The first part of the seminar is devoted to an overview of language technology research at Nada, from the spelling-correction projects of the 1990s to today's research on information retrieval with matrix methods and grammar checking for second-language Swedish. Part of the latter project is the construction of a supertagger that, with the help of five disagreeing taggers, can label the words of a Swedish text with part of speech and inflection better than any of the individual taggers. The second part of the seminar is about this, and about the creation of an "extra disagreeing" tagger.
The seminar gives an overview of this interdisciplinary medical research project, whose aim is to create knowledge about the disease schizophrenia. In particular: Which problem is to be solved, and which questions are important? How is the project organized? What information can be used, and what can a computer scientist contribute? No specialist knowledge is assumed.
An equation over a finite group G is an expression of the form w_1 w_2 ... w_k = 1_G, where each w_i is a variable, an inverted variable, or a constant from G. Such an equation is satisfiable if the variables can be assigned values from G in such a way that the equality holds.
This seminar treats the problem of simultaneously satisfying as many as possible of a family of equations over a finite group G. We present a proof that it is NP-hard to satisfy more than a fraction 1/|G| of the optimal number of equations, or in other words that the problem is NP-hard to approximate within |G| - epsilon for every epsilon > 0. The corresponding result was previously known only for Abelian groups (Håstad 2001).
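A brute-force sketch of the objective for the special case G = Z_m, written additively: an equation w_1 w_2 ... w_k = 1_G becomes a linear equation c_1 x_1 + ... + c_n x_n + b = 0 (mod m), with c_i = +1 for a variable and -1 for an inverted one. The representation is illustrative only, and the search is exponential in the number of variables:

```python
from itertools import product

def max_satisfied(eqs, n, m):
    """eqs: list of (coeffs, b) meaning sum(coeffs[i]*x[i]) + b = 0 mod m.
    Returns the maximum number of simultaneously satisfiable equations."""
    return max(
        sum(1 for coeffs, b in eqs
            if (sum(c * x for c, x in zip(coeffs, xs)) + b) % m == 0)
        for xs in product(range(m), repeat=n)
    )
```

Note that a uniformly random assignment satisfies each equation with probability 1/|G|, matching the hardness threshold above.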
In the seminar we sketch a connection between the problem of satisfying a maximal number of equations and so-called PCPs ("Probabilistically Checkable Proofs"), which can be viewed as a kind of game between two parties, a verifier and a prover, where the prover wants to convince the verifier of some statement. We construct such a game corresponding to the optimization problem above, and to analyze the game we then use representation theory for finite groups.
This is joint work with Lars Engebretsen and Alexander Russell.
Comparing the genomes of different species makes it possible to study evolutionary mechanisms and to transfer knowledge about one species to another, e.g. from model organisms to humans. The time since a pair of species diverged, or more generally the species tree (the edge-weighted tree describing the evolution of a set of species), is highly relevant in this kind of comparison. Gene duplications are a type of mutation whose role in the creation of new functions is not fully understood, but which is nevertheless central to the definition of "same function" (orthology).
We will look at a probabilistic model for gene duplications in which a gene tree grows inside a species tree. A reconciliation is a description of such a growth. Given a species tree and a gene tree, the likelihood of a reconciliation is the probability that it is the correct one. It turns out that we can compute the likelihood using dynamic programming. To estimate the posterior distribution over reconciliations, we use Markov chain Monte Carlo techniques.
The seminar will be about hard-core predicates. Given a function f and a predicate P, the predicate P is hard-core if no algorithm can guess the value P(x), given the value f(x), better than a random guess. We will go through a well-known construction of a hard-core predicate based on an arbitrary one-way function (Goldreich and Levin, STOC '89).
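The Goldreich-Levin predicate itself is simple to state: the one-way function f is extended to g(x, r) = (f(x), r), and the predicate is the inner product of x and r modulo 2. A sketch (computing the predicate is trivial; the content of the theorem is that it is hard to guess from f(x) and r alone):

```python
def gl_predicate(x_bits, r_bits):
    """Goldreich-Levin hard-core predicate: <x, r> mod 2."""
    return sum(a & b for a, b in zip(x_bits, r_bits)) & 1
```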
Hard-core predicates are used, among other things, to prove the security of pseudorandom generators (PRGs), an important cryptographic primitive. The output of a PRG should be indistinguishable from true randomness by adversaries limited to a certain running time. The proof strategy is to reduce the problem of predicting the value of a hard-core predicate to the problem of "breaking" the PRG. Here the efficiency of the reduction matters, since it relates directly to how powerful the adversaries covered by the proof can be. We will also present a more efficient reduction for specific, but commonly occurring, applications of PRGs (Hast, EUROCRYPT 2003).
In this seminar we present the Bird-Meertens formalism, a mathematical tool for the construction of generic (datatype-parametric) programs. We define the formalism starting from the category FUN of sets and total functions, and explain how recursive datatypes are represented by fixpoints of functors. We define catamorphisms and anamorphisms as fundamental building blocks in the formalism, and then turn to concrete examples of program derivation. We conclude by describing our current research in the area.
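As a concrete anchor for the terminology, a catamorphism over cons-lists is the familiar fold (it replaces the constructors nil/cons by a seed value and a binary operator), and an anamorphism is the dual unfold. A sketch in Python rather than a functional language:

```python
from functools import reduce

def cata(op, seed, xs):
    """Catamorphism on lists: fold xs right-to-left with op and seed."""
    return reduce(lambda acc, x: op(x, acc), reversed(xs), seed)

def ana(step, seed):
    """Anamorphism: unfold a list from a seed.
    step(seed) returns (element, next_seed), or None to stop."""
    out = []
    while (nxt := step(seed)) is not None:
        x, seed = nxt
        out.append(x)
    return out
```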
Functional brain imaging has in a relatively short time become a large research field, with the goal of understanding the last unexplored organ of the human body. Using a PET or fMRI scanner, one studies patterns in changes of, e.g., the oxygenation of the blood.
But drawing conclusions from the measured scanner signal has turned out to be difficult, for several reasons, and the degree of reproducibility between experiments is therefore often poor. In the final phase of the European database project NeuroGenerator (www.neurogenerator.org) we are therefore trying to find statistical methods for analyzing the degree of reproducibility of collected PET and fMRI experiments.
In the seminar I will present the problem and go through the methods used for analyzing single experiments, as well as how we in the project attack the problem of making a joint analysis of several experiments that should exhibit common functional components.
Information theory for communication with quantum objects, e.g. photons, has attracted great attention in recent years. The channel capacity of such a communication channel has been studied, and error-correcting codes have been presented. The seminar treats the principles of quantum communication, illustrated with a series of examples. The emphasis is on transmission of classical data (ordinary ones and zeros). The transfer of secret information (quantum cryptography) is also described.
Sphere Decoding, originally an algorithm to find vectors of short length in lattices, has recently been suggested by a number of authors as an efficient algorithm to solve various maximum likelihood (ML) detection problems in digital communication. The algorithm is often referred to as an algorithm of polynomial complexity, and some papers have previously been published in the communications literature in support of this claim. This is a somewhat surprising claim, especially since the ML detection problem in general is known to be NP-hard. However, as will be argued in this talk, under some assumptions on the detection problems these claims are probably not correct, and the complexity of the algorithm is instead exponential.
In this talk it will be argued that, although always exponential, the complexity depends strongly on some parameters of the communication system, such as the signal to noise ratio (SNR). This will be done by first briefly discussing the differences between the detection problem and the related lattice problem, to show what assumptions can be made about the detection problem. It will then be outlined how these assumptions lead to an exponential lower bound on the complexity of the algorithm. Numerical examples will also be given to show the effect of different parameters on the complexity. Special attention will be given to how the algorithm benefits from a high SNR.
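A minimal depth-first sphere-decoder sketch under simplifying assumptions not stated in the abstract: BPSK symbols in {-1, +1} and an upper-triangular channel matrix R (as obtained from a QR factorization in practice). Branches whose partial squared distance already exceeds the best complete solution found so far are pruned:

```python
def sphere_decode(R, y):
    """R: n x n upper-triangular matrix (list of lists), y: length-n list.
    Returns x in {-1, +1}^n minimizing ||y - R x||^2 by depth-first
    enumeration with sphere pruning."""
    n = len(y)
    best_d, best_x = float("inf"), None
    x = [0] * n

    def dfs(i, dist):
        nonlocal best_d, best_x
        if dist >= best_d:            # outside the current sphere: prune
            return
        if i < 0:                     # full symbol vector chosen
            best_d, best_x = dist, x[:]
            return
        for s in (-1, 1):
            x[i] = s
            e = y[i] - sum(R[i][j] * x[j] for j in range(i, n))
            dfs(i - 1, dist + e * e)

    dfs(n - 1, 0.0)
    return best_x
```

The worst-case number of visited nodes is exponential in n; a high SNR helps only because pruning then cuts most branches early.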
Several approximation algorithms will be presented. A common theme for these algorithms is that they were either designed by Magnús Halldórsson (perhaps with coauthors), or are based on ideas that appeared in work of Magnús. Another common theme is that the algorithms have clever but simple proofs of correctness. Among the algorithms presented will be a recent algorithm with the current best approximation ratio for finding a maximum clique in a graph.
Deciding whether a given number is prime is a fundamental problem that is also very important in, among other things, cryptographic applications. Recently, Agrawal, Kayal and Saxena showed that the problem can be solved in deterministic polynomial time; previously there were algorithms that required probabilistic polynomial time, as well as algorithms that could only be proven correct under the assumption that certain unproven mathematical conjectures hold.
Agrawal, Kayal and Saxena's result has received great attention all over the world and is of great theoretical significance. In this seminar we give the background to the problem, describe its solution, and finally discuss the significance of the result, from both a theoretical and a practical perspective.
A binary constraint function, i.e., a function from D×D to {0,1}, is r-regular if for every fixed assignment to one of the variables there are exactly r assignments to the other variable that satisfy the constraint. By choosing a random assignment to all variables in an instance of an r-regular constraint satisfaction problem, one satisfies an expected fraction r/d of all constraints, where d is the size of the domain D.
We use semidefinite programming to construct an algorithm that satisfies an expected fraction r/d + Omega(d^{-4}) of the optimal number of constraints. This is joint work with Venkatesan Guruswami at the University of California at Berkeley.
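The r/d baseline is easy to verify on a concrete example (the predicate below is hypothetical): over D = Z_5, the constraint P(x, y) = 1 iff (x - y) mod 5 is 0 or 1 is 2-regular, so a uniformly random assignment satisfies exactly a 2/5 fraction of the pairs in expectation.

```python
# An r-regular binary constraint: for every fixed value of one variable,
# exactly r values of the other satisfy it.  Example over D = Z_5 with
# r = 2: P(x, y) = 1 iff (x - y) mod 5 is in {0, 1}.
d = 5
ok = {0, 1}

def P(x, y):
    return (x - y) % d in ok

sat_pairs = sum(P(x, y) for x in range(d) for y in range(d))
fraction = sat_pairs / d ** 2   # equals r/d = 2/5
```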
The traveling salesman problem with a symmetric cost function is a classic problem in computer science. It is NP-hard to solve exactly, but when the cost function satisfies the triangle inequality, Christofides' algorithm (1976) gives a polynomial-time solution with weight at most a factor 3/2 from the optimum.
The version with an asymmetric cost function satisfying the triangle inequality is not as well studied. I will give an overview of previous results and discuss various possibilities for research on the traveling salesman problem with an asymmetric cost function. No prior knowledge is required.
In circuit complexity one tries to show that problems are hard by showing that large circuits are required to solve them. How powerful a circuit is depends, apart from its size, on the type of computational elements (gates) it is built from.
A majority gate with n inputs outputs 1 if more than half of the input bits are 1, and outputs 0 otherwise. The threshold gate is a generalization of the majority gate: it computes a weighted sum of the input bits and outputs 1 if the sum exceeds a certain threshold value. For example, a threshold gate can compare two binary-coded integers and decide which is larger.
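The comparison example can be sketched as a single gate, assuming MSB-first bit lists of equal length and weights +2^i for the bits of x and -2^i for the bits of y:

```python
def threshold_gate(bits, weights, theta):
    """Output 1 iff the weighted sum of the input bits exceeds theta."""
    return int(sum(w * b for w, b in zip(weights, bits)) > theta)

def greater_than(x_bits, y_bits):
    """Decide x > y for two n-bit integers with one threshold gate."""
    n = len(x_bits)
    weights = ([1 << (n - 1 - i) for i in range(n)]
               + [-(1 << (n - 1 - i)) for i in range(n)])
    return threshold_gate(x_bits + y_bits, weights, 0)
```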
Threshold circuits of constant depth and polynomial size have turned out to be quite powerful, in contrast to circuits of AND and OR gates under the same restrictions on size and depth. They are the simplest "natural" circuits for which no strong lower bounds are known, except in some very restricted cases (essentially for majority circuits of depth two).
The seminar will cover some of the known upper and lower bounds for threshold circuits. In connection with this I will also bring up another computational model, in which two or more players are to evaluate a function together. The catch is that no single player has access to the entire input, and the goal is to evaluate the function while sending as few and as small messages as possible to the other participants. Communication complexity is a useful tool in many parts of complexity theory, and in our case it turns out that problems with high communication complexity also require large majority circuits of depth two.
Grammar checkers are often useful tools when writing large amounts of text. Traditional proofreading programs (such as Granska) find spelling errors by dictionary lookup and grammatical errors by matching the writer's text against rules. Some common types of errors remain undetected, however, because they do not lend themselves to description with rules and dictionaries. One example of such an error type is spelling errors that result in existing words.
Even though grammar is by nature closely tied to rules, it turns out that statistics can help us with these error types. From large amounts of text one can build statistics over which constructions are grammatical in the language. This information is then used to determine which grammatical constructions are improbable in the text to be checked.
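A minimal sketch of the statistical idea, with toy part-of-speech tags and a bare count threshold standing in for the real models: collect tag-trigram counts from a reference corpus and flag trigrams in new text that were (almost) never seen.

```python
from collections import Counter

def trigram_counts(tag_seqs):
    """Collect tag-trigram counts from a corpus of tag sequences."""
    c = Counter()
    for seq in tag_seqs:
        for i in range(len(seq) - 2):
            c[tuple(seq[i:i + 3])] += 1
    return c

def suspicious(seq, counts, threshold=1):
    """Flag trigrams seen fewer than `threshold` times in the corpus."""
    return [tuple(seq[i:i + 3]) for i in range(len(seq) - 2)
            if counts[tuple(seq[i:i + 3])] < threshold]
```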
A statistical grammar checker has been developed together with Ola Knutsson at Nada. I will describe the underlying theories and how statistical grammar checking can complement a traditional grammar checker. A test version is integrated into Granska on the Granska project's homepage.
If films are offered on an internal channel at, say, a hotel, it is common to have a fixed broadcast schedule. As a guest, however, you want to choose for yourself when to watch a given film, and you do not want to wait two hours for the next broadcast just because you happened to miss the first fifteen minutes.
In this seminar I will discuss the above problem from a computer science perspective. The model I will treat assumes a central broadcasting source and thin clients in every TV receiver. The problem then becomes to provide flexibility for the viewers without consuming too much bandwidth in the network.
The results I present are the fruits of a collaboration with Madhu Sudan at MIT.
Suppose we have a set X and a family C of subsets of X, and we want to choose as small a subset S of X as possible, subject to the constraint that every set in C has some element in S.
This problem is called Minimum Hitting Set, and it is NP-hard to solve exactly. When a problem is hard to solve exactly, it is natural to try to find a solution that is reasonably good. We say that an algorithm approximates a problem within a factor c if it produces a solution that is at most a factor c larger than the optimal one. How easy it is to find approximate solutions varies enormously between different NP-hard problems. An important problem in theoretical computer science is to investigate the approximation properties of various problems.
It is known that general Minimum Hitting Set is very hard to approximate. In this seminar we concentrate on Minimum Hitting Set where the sets in the family C all have the same size. If the sets have size k, say, the problem can be approximated within a factor k by a simple algorithm. When k=2 the problem is equivalent to the well-known Minimum Vertex Cover problem. It seems natural that the problem becomes harder as k grows, but until very recently no stronger inapproximability results could be shown for k>2 than for k=2.
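The simple factor-k algorithm can be sketched as follows (the function name is ours): scan the family and, whenever a set is not yet hit, add all of its elements to S. The picked sets are pairwise disjoint, so any optimal solution needs a distinct element for each of them, giving |S| <= k * OPT.

```python
def hitting_set_k_approx(C):
    """Factor-k approximation for Minimum Hitting Set on size-k sets:
    whenever a set is not yet hit, add ALL of its elements."""
    S = set()
    for c in C:
        if not S & set(c):
            S |= set(c)
    return S
```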
We discuss the developments around this problem and show that when k is at most 4, the problem is NP-hard to approximate within a factor 2.
Reductions have played an important role in theoretical computer science since the development of the theory of NP-completeness. They have also been used in the domain of approximation algorithms. One such use is to transfer lower bounds, usually derived from probabilistic proof systems, from the original problem to other problems. In this context the cost of the reduction is important: a smaller cost leads to a stronger inapproximability result. In this talk we discuss the problem of finding the cheapest reductions and focus on the class of gadget reductions. It turns out that in many cases it is possible to obtain the best possible reductions by solving linear programs. Many strong inapproximability results have been derived using this technique.
This talk will describe a probabilistic method for verifying the equivalence of two multiple-valued functions. Each function is hashed to an integer code by transforming it to an integer-valued polynomial, and the equivalence of two polynomials is checked probabilistically. The hash codes for two equivalent functions are always the same. Thus, the equivalence of two functions can be verified with a known probability of error, arising from collisions between inequivalent functions. Such a probabilistic verification can be an attractive alternative for verifying functions that are too large to be handled by deterministic verification methods.
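A sketch of the underlying identity-testing idea, with univariate polynomials as toy stand-ins (an actual verifier would first transform the multiple-valued functions to polynomials as described): hash each polynomial by evaluating it at a random point modulo a prime. Equivalent functions always get equal codes; inequivalent ones collide with probability at most deg/P by the Schwartz-Zippel lemma.

```python
import random

P = (1 << 61) - 1  # a Mersenne prime used as the hash modulus

def hash_poly(coeffs, x):
    """Horner evaluation of sum(coeffs[i] * t^i) at t = x, mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

x = random.randrange(P)
# same function, differently represented: hash codes always agree
assert hash_poly([1, 0, 3], x) == hash_poly([1, 0, 3, 0], x)
```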
Using very efficient probabilistically checkable proofs (PCP) for NP we prove that unless NP=P, some simple approximation algorithms for basic NP-hard optimization problems are essentially optimal. In particular, given a SAT formula with exactly 3 variables in each clause it is not hard to find an assignment that satisfies a fraction 7/8 of the clauses. We prove that (up to an arbitrary epsilon > 0) this is the best possible for a polynomial time approximation algorithm.
In this talk we concentrate on the problem of, given a linear system of equations mod 2, satisfying the maximal number of equations. This problem is easy to approximate within a factor of 2 and we prove that this is essentially tight. This result is obtained by constructing a PCP that uses logarithmic randomness, reads 3 bits in the proof and accepts based on the exclusive-or of these bits. This proof system has completeness 1-epsilon and soundness 1/2+epsilon, for any constant epsilon > 0.
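The easy factor-2 baseline can be checked by brute force on a toy system: every equation with at least one variable is satisfied by exactly half of all assignments, so the average number satisfied is m/2, and the best assignment therefore satisfies at least m/2 >= OPT/2 equations.

```python
from itertools import product

def average_and_best(eqs, n):
    """eqs: list of (variable_indices, right_hand_side_bit) over n
    variables mod 2.  Returns (average score, best score) over all 2^n
    assignments (brute force, small n only)."""
    scores = [
        sum(1 for vars_, b in eqs if sum(x[v] for v in vars_) % 2 == b)
        for x in product(range(2), repeat=n)
    ]
    return sum(scores) / len(scores), max(scores)
```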
We describe how the reduction from PCPs to Maximum Clique can be extended to give hardness result for Coloring.
We discuss hardness results for Maximum Clique under different assumptions.
One of the main problems in chip design is the huge number of possible combinations of individual chip elements, leading to a combinatorial explosion as chips become more and more complex. New key results in theoretical computer science and in the design of data structures and efficient algorithms can be applied fruitfully here. The use and application of ordered binary decision diagrams (OBDDs) has led to dramatic performance improvements in many computer-aided design projects. The talk provides an introduction to this interdisciplinary research area with an emphasis on computer-aided circuit design and verification.
If you want a computer to find grammatical errors in Swedish texts, it is not enough to feed it the Swedish Academy's new grammar. In fact, we have obtained the best results by refraining from a complete grammatical analysis of each sentence and instead exploiting a couple of non-linguistic methods. First, every word in the text is tagged, that is, labeled with part of speech and inflection. We do this with statistical methods and a Markov model. 97% of the words get the correct tag, and as many as 91% of the words not in the dictionary are tagged correctly. Then a large set of checking rules is matched against the tagged text, and out comes a list of detected grammatical errors and other language errors. In the seminar we will describe how the tagging and the rule matching can be done efficiently.
In his work on the stationary heat distribution in a unit disc, Fourier introduced the method of expressing functions as series involving trigonometric functions. In Fourier's case, the functions were functions from the unit circle in R² to C, but the concept can be generalized. In this talk, we show that the approach can also be used for functions in the space L²(G), i.e., the space of functions from some finite Abelian group G to C.
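For G = Z_n the characters are chi_k(x) = exp(2*pi*i*k*x/n), and the expansion of f in L^2(G) is exactly the discrete Fourier transform. A direct O(n^2) sketch with its inversion:

```python
import cmath

def fourier_coeffs(f_vals):
    """Fourier coefficients of f: Z_n -> C against chi_k(x) = e^{2 pi i kx/n}."""
    n = len(f_vals)
    return [sum(f_vals[x] * cmath.exp(-2j * cmath.pi * k * x / n)
                for x in range(n)) / n
            for k in range(n)]

def reconstruct(coeffs):
    """Resum the Fourier series: f(x) = sum_k c_k * chi_k(x)."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * cmath.pi * k * x / n)
                for k in range(n))
            for x in range(n)]
```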
A fast algorithm for the "standard sorting problem" is presented. It sorts n word-sized integers on a unit-cost RAM in O(n log log n) worst-case time. The algorithm appeared in a rather inaccessible paper ("Sorting in linear time?" by A. Andersson, T. Hagerup, S. Nilsson, and R. Raman, STOC'95) but is actually quite simple.
Aristotle described the problem of induction, that of generalizing from observations, and credited Socrates with identifying it. The problem has been central, in various ways, throughout the history of philosophy. Bayes and Laplace quantified the uncertainty in induction with probabilities in the Bayesian interpretation. Today we see a growing number of "intelligent" computer-based systems that make observations and try to interpret them, and Bayes has gained competition from several alternative methods. The foundations of induction and inference therefore remain highly topical.
Cox (Am. J. Phys., 1946) tried to show that Bayesianism is inevitable if one wants to reason consistently with uncertain information. The idea is that any reasonable alternative measure of uncertainty can be rescaled so that the rescaled uncertainties combine multiplicatively, like probabilities of independent quantities. His work has been alternately praised and criticized ever since. In particular, his assumptions that uncertainty is graded on a continuum, and that uncertainties must be combined with a twice-differentiable function, have been criticized. These assumptions were dictated by Cox's proof method and have later been relaxed.
We have shown that there are good reasons to regard Bayesianism as inevitable even in models with a finite number of degrees of uncertainty, and with an infinite but not dense set of uncertainties. Some assumptions must still be made, however, that do not appear entirely inevitable. These are tied to the nature of the problem and not to the proof method used, as counterexamples show. We call the assumptions refinability, strict monotonicity and separability. To prove our theorems we use duality and a development of proof methods used by Janos Aczel for the associativity equation. In the cases where our assumptions do not hold, one can obtain various variants of possibilistic logic and, as limiting operations, non-monotonic logic.