Here is a list of my academic papers, classified by type of publication and in reverse chronological order:

You might also be interested in my author profiles on DBLP and Google Scholar.

international, peer-reviewed journal articles

  1. [.pdf] [.bib] Matthieu Caneill, Daniel M. Germán, Stefano Zacchiroli. The Debsources Dataset: Two Decades of Free and Open Source Software. In Empirical Software Engineering, Volume 22, pp. 1405-1437, June 2017. ISSN 1382-3256, Springer.

    Abstract: We present the Debsources Dataset: source code and related metadata spanning two decades of Free and Open Source Software (FOSS) history, seen through the lens of the Debian distribution. The dataset spans more than 3 billion lines of source code as well as metadata about them such as: size metrics (lines of code, disk usage), developer-defined symbols (ctags), file-level checksums (SHA1, SHA256, TLSH), file media types (MIME), release information (which version of which package containing which source code files has been released when), and license information (GPL, BSD, etc). The Debsources Dataset comes as a set of tarballs containing deduplicated unique source code files organized by their SHA1 checksums (the source code), plus a portable PostgreSQL database dump (the metadata). A case study is run to show how the Debsources Dataset can be used to easily and efficiently instrument very long-term analyses of the evolution of Debian from various angles (size, granularity, licensing, etc.), getting a grasp of major FOSS trends of the past two decades. The Debsources Dataset is Open Data, released under the terms of the CC BY-SA 4.0 license, and available for download from Zenodo with DOI reference 10.5281/zenodo.61089.

  2. [.pdf] [.bib] Roberto Di Cosmo, Jacopo Mauro, Stefano Zacchiroli, Gianluigi Zavattaro. Aeolus: a Component Model for the Cloud. In Information and Computation, Volume 239, pp. 100-121. ISSN 0890-5401, Elsevier, 2014.

    Abstract: We introduce the Aeolus component model, which is specifically designed to capture realistic scenarii arising when configuring and deploying distributed applications in the so-called cloud environments, where interconnected components can be deployed on clusters of heterogeneous virtual machines, which can be in turn created, destroyed, and connected on-the-fly. The full Aeolus model is able to describe several component characteristics such as dependencies, conflicts, non-functional requirements (replication requests and load limits), as well as the fact that component interfaces to the world might vary depending on the internal component state. When the number of components needed to build an application grows, it becomes important to be able to automate activities such as deployment and reconfiguration. This corresponds, at the level of the model, to the ability to decide whether a desired target system configuration is reachable, which we call the achievability problem, and to produce a path to reach it. In this work we show that the achievability problem is undecidable for the full Aeolus model, a strong limiting result for automated configuration in the cloud. We also show that the problem becomes decidable, but Ackermann-hard, as soon as one drops non-functional requirements. Finally, we provide a polynomial time algorithm for the further restriction of the model where support for inter-component conflicts is also removed.

  3. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. Learning from the Future of Component Repositories. In Science of Computer Programming, Volume 90, Part B, pp. 93-115. ISSN 0167-6423, Elsevier, 2014.

    Abstract: An important aspect of the quality assurance of large component repositories is to ensure the logical coherence of component metadata, and to this end one needs to identify incoherences as early as possible. Some relevant classes of problems can be formulated in terms of properties of the future repositories into which the current repository may evolve. However, checking such properties on all possible future repositories requires a way to construct a finite representation of the infinite set of all potential futures. A class of properties for which this can be done is presented in this work. We illustrate the practical usefulness of the approach with two quality assurance applications: (i) establishing the amount of "forced upgrades" induced by introducing new versions of existing components in a repository, and (ii) identifying outdated components that are currently not installable and need to be upgraded in order to become installable again. For both applications we provide experience reports obtained on the Debian free software distribution.

  4. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. A Modular Package Manager Architecture. In Information and Software Technology, Volume 55, Issue 2, pp. 459-474. ISSN 0950-5849, Elsevier, February 2013.

    Abstract: The success of modern software distributions in the Free and Open Source world can be explained, among other factors, by the availability of a large collection of software packages and the possibility to easily install and remove those components using state of the art package managers. However, package managers are often built using a monolithic architecture and hard-wired and ad-hoc dependency solvers implementing some customized heuristics. In this paper we propose a modular architecture relying on precise interface formalisms that allows the system administrator to choose from a variety of dependency solvers and backends. We argue that this is the path that leads to the next generation of package managers that will deliver better results, offer more expressive preference languages, and be easily adaptable to new platforms. We have built a working prototype, called MPM, following the design advocated in this paper, and we show how it largely outperforms a variety of state of the art package managers.

  5. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. Dependency Solving: a Separate Concern in Component Evolution Management. In Journal of Systems and Software, Volume 85, Issue 10, pp. 2228-2240. ISSN 0164-1212, Elsevier, October 2012.

    Abstract: Maintenance of component-based software platforms often has to face rapid evolution of software components. Component dependencies, conflicts, and package managers with dependency solving capabilities are the key ingredients of prevalent software maintenance technologies that have been proposed to keep software installations synchronized with evolving component repositories. We review state-of-the-art package managers and their ability to keep up with evolution at the current growth rate of popular component-based platforms, and conclude that their dependency solving abilities are not up to the task. We show that the underlying upgrade planning problem is NP-complete even for seemingly simple component models, and argue that the principal source of complexity lies in multiple available versions of components. We then discuss the need for expressive languages for user preferences, which makes the problem even more challenging. We propose to establish dependency solving as a separate concern from other upgrade aspects, and present CUDF as a formalism to describe upgrade scenarios. By analyzing the result of an international dependency solving competition, we provide evidence that the proposed approach is viable.

  6. [.pdf] [.bib] Angelo Di Iorio, Francesco Draicchio, Fabio Vitali, Stefano Zacchiroli. Constrained Wiki: The WikiWay to Validating Content. In Advances in Human-Computer Interaction, Volume 2012, Article ID 893575, pp. 1-19. Hindawi, 2012.

    Abstract: The "WikiWay" is the open editing philosophy of wikis meant to foster open collaboration and continuous improvement of their content. Just like other online communities, wikis often introduce and enforce conventions, constraints, and rules for their content, but do so in a considerably softer way, expecting authors to deliver content that satisfies the conventions and the constraints, or, failing that, having volunteers of the community, the WikiGnomes, fix others' content accordingly. Constrained wikis is our generic framework for wikis to implement validators of community-specific constraints and conventions that preserve the WikiWay and their open collaboration features. To this end, specific requirements need to be observed by validators and a specific software architecture can be used for their implementation, that is, as independent functions (implemented as internal modules or external services) used in a nonintrusive way. Two separate proof-of-concept validators have been implemented for MediaWiki and MoinMoin, respectively, providing an annotated view function, that is, presenting content authors with violation warnings, rather than preventing them from saving a noncompliant text.

  7. [.pdf] [.bib] Roberto Di Cosmo, Davide Di Ruscio, Patrizio Pelliccione, Alfonso Pierantonio, Stefano Zacchiroli. Supporting Software Evolution in Component-Based FOSS Systems. In Science of Computer Programming, Volume 76, Issue 12, pp. 1144-1160. ISSN 0167-6423, Elsevier, 2011.

    Abstract: FOSS (Free and Open Source Software) systems present interesting challenges in system evolution. On one hand, most FOSS systems are based on very fine-grained units of software deployment, called packages, which promote system evolution; on the other hand, FOSS systems are among the largest software systems known and require sophisticated static and dynamic conditions to be verified, in order to successfully deploy upgrades on user machines. The slightest error in one of these conditions can turn a routine upgrade into a system administrator nightmare. In this paper we introduce a model-based approach to support the upgrade of FOSS systems. The approach promotes the simulation of upgrades to predict failures before affecting the real system. Both fine-grained static aspects (e.g. configuration incoherences) and dynamic aspects (e.g. the execution of configuration scripts) are taken into account, improving over the state of the art of upgrade planners. The effectiveness of the approach is validated by instantiating the approach to widely-used FOSS distributions.

  8. [.pdf] [.bib] Paolo Marinelli, Fabio Vitali, Stefano Zacchiroli. Towards the unification of formats for overlapping markup. In New Review of Hypermedia and Multimedia, Volume 14, Issue 1, January 2008, pp. 57-94. Taylor and Francis, ISSN 1361-4568.

    Abstract: Overlapping markup refers to the issue of how to represent data structures more expressive than trees, for example directed acyclic graphs, using markup (meta-)languages which have been designed with trees in mind, for example XML. In this paper we observe that the state of the art in overlapping markup is far from being the widespread and consistent stack of standards and technologies readily available for XML, and we develop a roadmap for closing the gap. In particular, we present the design and implementation of what we believe to be the first needed step, namely a syntactic conversion framework among the plethora of overlapping markup serialization formats. The algorithms needed to perform the various conversions are presented in pseudo-code; they are meant to be used as blueprints for researchers and practitioners who need to write batch translation programs from one format to the other.

  9. [.pdf] [.bib] Claudio Sacerdoti Coen, Stefano Zacchiroli. Spurious Disambiguation Errors and How to Get Rid of Them. In Mathematics in Computer Science, Volume 2, Number 2, pp. 355-378, December 2008. Springer Birkhäuser, ISSN 1661-8270.

    Abstract: The disambiguation approach to the input of formulae enables users of mathematical assistants to type correct formulae in a terse syntax close to the usual ambiguous mathematical notation. When it comes to incorrect formulae however, far too many typing errors are generated; among them we want to present only errors related to the formula interpretation meant by the user, hiding errors related to other interpretations. We study disambiguation errors and how to classify them into the spurious and genuine error classes. To this end we give a general presentation of the classes of disambiguation algorithms and efficient disambiguation algorithms. We also quantitatively assess the quality of the presented error classification criteria benchmarking them in the setting of a formal development of constructive algebra.

  10. [.pdf] [.bib] Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi, Stefano Zacchiroli. User Interaction with the Matita Proof Assistant. In Journal of Automated Reasoning, Volume 39, Number 2. Springer Netherlands, ISSN 0168-7433, pp. 109-139, 2007.

    Abstract: Matita is a new, document-centric, tactic-based interactive theorem prover. This paper focuses on some of the distinctive features of the user interaction with Matita, mostly characterized by the organization of the library as a searchable knowledge base, the emphasis on a high-quality notational rendering, and the complex interplay between syntax, presentation, and semantics.

editorials

  1. [.pdf] [.bib] Federico Balaguer, Roberto Di Cosmo, Alejandra Garrido, Fabio Kon, Gregorio Robles, Stefano Zacchiroli. Open Source Systems: Towards Robust Practices. 13th IFIP WG 2.13 International Conference, OSS 2017, Buenos Aires, Argentina, May 22-23, 2017, Proceedings. IFIP Advances in Information and Communication Technology 496, Springer 2017, ISBN 978-3-319-57734-0.
  2. [.pdf] [.bib] Angelo Di Iorio, Davide Rossi, Stefano Zacchiroli. Editorial. In Journal of Web Engineering, Volume 14, Number 1-2, pp. 1-2. ISSN 1540-9589, Rinton Press, 2014.
  3. [.pdf] [.bib] Angelo Di Iorio, Davide Rossi, Stefano Zacchiroli. Web Technologies: Selected and extended papers from WT ACM SAC 2012. In Science of Computer Programming, Volume 94, Part 1, pp. 1-2. ISSN 0167-6423, Elsevier, 2014.
  4. [.pdf] [.bib] Angelo Di Iorio, Davide Rossi, Stefano Zacchiroli. Editorial. In Software: Practice and Experience, Volume 43, Issue 12, pp. 1393-1394. ISSN 1097-024X, Wiley, 2013.

book chapters

  1. [.pdf] [.bib] Angelo Di Iorio, Fabio Vitali, Stefano Zacchiroli. Web Semantics via Wiki Templating. Chapter 34 of Handbook of research on Web 2.0, 3.0 and x.0: technologies, business and social applications. San Murugesan Ed., Information Science Reference, November 2009, ISBN 978-1605663845.

    Abstract: A foreseeable incarnation of Web 3.0 could inherit machine understandability from the Semantic Web, and collaborative editing from Web 2.0 applications. We review the research and development trends which are bringing today's Web nearer to such an incarnation. We present semantic wikis, microformats, and the so-called "lowercase semantic web": they are the main approaches to closing the technological gap between content authors and Semantic Web technologies. We discuss a too often neglected aspect of the associated technologies, namely how much they adhere to the wiki philosophy of open editing: is there an intrinsic incompatibility between semantically rich content and unconstrained editing? We argue that the answer to this question can be "no", provided that a few, yet relevant, shortcomings of current Web technologies are fixed soon.

international, peer-reviewed conference proceedings

  1. [.pdf] [.bib] Roberto Di Cosmo, Stefano Zacchiroli. Software Heritage: Why and How to Preserve Software Source Code. To appear in Proceedings of iPRES 2017: 14th International Conference on Digital Preservation, Kyoto, Japan, September 2017, 10 pages.

    Abstract: Software is now a key component present in all aspects of our society. Its preservation has attracted growing attention over the past years within the digital preservation community. We claim that source code, the only representation of software that contains human readable knowledge, is a precious digital object that needs special handling: it must be a first class citizen in the preservation landscape and we need to take action immediately, given the increasingly frequent incidents that result in permanent losses of source code collections. In this paper we present Software Heritage, an ambitious initiative to collect, preserve, and share the entire corpus of publicly accessible software source code. We discuss the archival goals of the project, its use cases and role as a participant in the broader digital preservation ecosystem, and detail its key design decisions. We also report on the project road map and the current status of the Software Heritage archive that, as of early 2017, has collected more than 3 billion unique source code files and 700 million commits coming from more than 50 million software development projects.

  2. [.pdf] [.bib] Roberto Di Cosmo, Antoine Eiche, Jacopo Mauro, Stefano Zacchiroli, Gianluigi Zavattaro, Jakub Zwolakowski. Automatic Deployment of Services in the Cloud with Aeolus Blender. In proceedings of ICSOC 2015: 13th International Conference on Service Oriented Computing, November 16-19, 2015, Goa, India. ISBN 978-3-662-48615-3, pp. 397-411, Springer-Verlag 2015.

    Abstract: We present Aeolus Blender (Blender in the following), a software product for the automatic deployment and configuration of complex service-based, distributed software systems in the "cloud". By relying on a configuration optimiser and a deployment planner, Blender fully automates the deployment of real-life applications on OpenStack cloud deployments, by exploiting a knowledge base of software services provided by the Mandriva Armonic tool suite. The final deployment is guaranteed to satisfy not only user requirements and relevant software dependencies, but also to be optimal with respect to the number of used virtual machines.

  3. [.pdf] [.bib] Roberto Di Cosmo, Michael Lienhardt, Jacopo Mauro, Stefano Zacchiroli, Gianluigi Zavattaro, Jakub Zwolakowski. Automatic Application Deployment in the Cloud: from Practice to Theory and Back. In proceedings of CONCUR 2015: 26th International Conference on Concurrency Theory, September 1-4, 2015, Madrid, Spain. Leibniz International Proceedings in Informatics (LIPIcs) 42, pp. 1-16, ISBN 978-3-939897-91-0, Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik 2015.

    Abstract: The problem of deploying a complex software application has been formally investigated in previous work by means of the abstract component model named Aeolus. As the problem turned out to be undecidable, simplified versions of the model were investigated in which decidability was restored by introducing limitations on the ways components are described. In this paper, we take an opposite approach, and investigate the possibility to address a relaxed version of the deployment problem without limiting the expressiveness of the component model. We identify three problems to be solved in sequence: (i) the verification of the existence of a final configuration in which all the constraints imposed by the single components are satisfied, (ii) the generation of a concrete configuration satisfying such constraints, and (iii) the synthesis of a plan to reach such a configuration possibly going through intermediary configurations that violate the non-functional constraints.

  4. [.pdf] [.bib] Stefano Zacchiroli. The Debsources Dataset: Two Decades of Debian Source Code Metadata. In proceedings of MSR 2015: The 12th Working Conference on Mining Software Repositories, May 16-17, 2015, Florence, Italy. Co-located with ICSE 2015. ISBN 978-0-7695-5594-2, pp. 466-469, IEEE 2015.

    Abstract: We present the Debsources Dataset: distribution metadata and source code metrics spanning two decades of Free and Open Source Software (FOSS) history, seen through the lens of the Debian distribution. Debsources is a software platform used to gather, search, and publish on the Web the full source code of the Debian operating system, as well as measures about it. A notable public instance of Debsources is available at http://sources.debian.net; it includes both current and historical releases of Debian. Plugins to compute popular source code metrics (lines of code, defined symbols, disk usage) and other derived data (e.g., checksums) have been written, integrated, and run on all the source code available on sources.debian.net. The Debsources Dataset is a PostgreSQL database dump of sources.debian.net metadata, as of February 10th, 2015. The dataset contains both Debian-specific metadata (e.g., which software packages are available in which release, which source code files belong to which package, release dates, etc.) and source code information gathered by running Debsources plugins. The Debsources Dataset offers a very long-term historical view of the macro-level evolution and constitution of FOSS through the lens of popular, representative FOSS projects of their times.

  5. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Louis Gesbert, Fabrice Le Fessant, Ralf Treinen, Stefano Zacchiroli. Mining Component Repositories for Installability Issues. In proceedings of MSR 2015: The 12th Working Conference on Mining Software Repositories, May 16-17, 2015, Florence, Italy. Co-located with ICSE 2015. ISBN 978-0-7695-5594-2, pp. 24-33, IEEE 2015.

    Abstract: Component repositories play an increasingly relevant role in software life-cycle management, from software distribution to end users, to deployment and upgrade management. Software components shipped via such repositories are equipped with rich metadata that describe their relationships (e.g., dependencies and conflicts) with other components. In this practice paper we show how to use a tool, distcheck, that uses component metadata to identify all the components in a repository that cannot be installed (e.g., due to unsatisfiable dependencies), and provides detailed information to help developers understand the cause of the problem and fix it in the repository. We report on detailed analyses of several repositories: the Debian distribution, the OPAM package collection, and Drupal modules. In each case, distcheck is able to efficiently identify non-installable components and provide valuable explanations of the issues. Our experience provides solid ground for generalizing the use of distcheck to other component repositories.

  6. [.pdf] [.bib] Roberto Di Cosmo, Michael Lienhardt, Ralf Treinen, Stefano Zacchiroli, Jakub Zwolakowski, Antoine Eiche, Alexis Agahi. Automated Synthesis and Deployment of Cloud Applications. In proceedings of ASE 2014: 29th IEEE/ACM International Conference on Automated Software Engineering, September 15-19, 2014, Vasteras, Sweden. ISBN 978-1-4503-3013-8, pp. 211-222, ACM 2014.

    Abstract: Complex networked applications are assembled by connecting software components distributed across multiple machines. Building and deploying such systems is a challenging problem which requires a significant amount of expertise: the system architect must ensure that all component dependencies are satisfied, avoid conflicting components, and add the right amount of component replicas to account for quality of service and fault-tolerance. In a cloud environment, one also needs to minimize the virtual resources provisioned upfront, to reduce the cost of operation. Once the full architecture is designed, it is necessary to correctly orchestrate the deployment phase, to ensure all components are started and connected in the right order. We present a toolchain that automates the assembly and deployment of such complex distributed applications. Given as input a high-level specification of the desired system, the set of available components together with their requirements, and the maximal amount of virtual resources to be committed, it synthesizes the full architecture of the system, placing components in an optimal manner using the minimal number of available machines, and automatically deploys the complete system in a cloud environment.

  7. [.pdf] [.bib] Matthieu Caneill, Stefano Zacchiroli. Debsources: Live and Historical Views on Macro-Level Software Evolution. In proceedings of ESEM 2014: 8th International Symposium on Empirical Software Engineering and Measurement, September 18-19, 2014, Torino, Italy. ISBN 978-1-4503-2774-9, ACM 2014.

    Abstract: Context. Software evolution has been an active field of research in recent years, but studies on macro-level software evolution (i.e., on the evolution of large software collections over many years) are scarce, despite the increasing popularity of intermediate vendors as a way to deliver software to final users. Goal. We want to ease the study of both day-by-day and long-term Free and Open Source Software (FOSS) evolution trends at the macro-level, focusing on the Debian distribution as a proxy of relevant FOSS projects. Method. We have built Debsources, a software platform to gather, search, and publish on the Web all the source code of Debian and measures about it. We have set up a public Debsources instance at http://sources.debian.net, integrated it into the Debian infrastructure to receive live updates of new package releases, and written plugins to compute popular source code metrics. We have injected all current and historical Debian releases into it. Results. The obtained dataset and Web portal provide both long-term views over the past 20 years of FOSS evolution and live insights on what is happening at sub-day granularity. By writing simple plugins (~100 lines of Python each) and adding them to our Debsources instance we have been able to easily replicate and extend past empirical analyses on metrics as diverse as lines of code, number of packages, and rate of change, and make them perennial. We have obtained slightly different results than our reference study, but confirmed the general trends and updated them in light of 7 extra years of evolution history. Conclusions. Debsources is a flexible platform to monitor large FOSS collections over long periods of time. Its main instance and dataset are valuable resources for scholars interested in macro-level software evolution.

  8. [.pdf] [.bib] Michel Catan, Roberto Di Cosmo, Antoine Eiche, Tudor A. Lascu, Michael Lienhardt, Jacopo Mauro, Ralf Treinen, Stefano Zacchiroli, Gianluigi Zavattaro, Jakub Zwolakowski. Aeolus: Mastering the Complexity of Cloud Application Deployment. In proceedings of ESOCC 2013: Service-Oriented and Cloud Computing, 2nd European Conference, Málaga, Spain, September 11-13, 2013. LNCS 8135, pp. 1-3, Springer-Verlag, 2013.

    Abstract: Cloud computing offers the possibility to build sophisticated software systems on virtualized infrastructures at a fraction of the cost necessary just a few years ago, but deploying, maintaining, and reconfiguring such software systems is a serious challenge. The main objective of the Aeolus project, an initiative funded by ANR (the French "Agence Nationale de la Recherche"), is to tackle the scientific problems that need to be solved in order to ease the efficient and cost-effective deployment and administration of the complex distributed architectures which are at the heart of cloud applications.

  9. [.pdf] [.bib] Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. Formal Aspects of Free and Open Source Software Components. In proceedings of FMCO 2012: HATS International School on Formal Models for Components and Objects, Bertinoro, Italy, 24-28 September 2012. LNCS 7866, pp. 216-239, Springer-Verlag, 2013.

    Abstract: Free and Open Source Software (FOSS) distributions are popular solutions to deploy and maintain software on server, desktop, and mobile computing equipment. The typical deployment method in the FOSS setting relies on software distributions as vendors, packages as independently deployable components, and package managers as upgrade tools. We review research results from the past decade that apply formal methods to the study of inter-component relationships in the FOSS context. We discuss how those results are being used to attack both issues faced by users, such as dealing with upgrade failures on target machines, and issues important to distributions, such as quality assurance processes for repositories containing tens of thousands of rapidly evolving software packages.

  10. [.pdf] [.bib] Roberto Di Cosmo, Jacopo Mauro, Stefano Zacchiroli, Gianluigi Zavattaro. Component Reconfiguration in the Presence of Conflicts. In proceedings of ICALP 2013: 40th International Colloquium on Automata, Languages and Programming, Riga, Latvia, 8-12 July, 2013. LNCS 7966, pp. 187-198, Springer-Verlag, 2013.

    Abstract: Components are traditionally modeled as black-boxes equipped with interfaces that indicate provided/required ports and, often, also conflicts with other components that cannot coexist with them. In modern tools for automatic system management, components become grey-boxes that show relevant internal states and the possible actions that can be performed on the components to change such state during the deployment and reconfiguration phases. However, state-of-the-art tools in this field do not support a systematic management of conflicts. In this paper we investigate the impact of conflicts by precisely characterizing the increase in complexity of the reconfiguration problem.

  11. [.pdf] [.bib] Cyrille Valentin Artho, Kuniyasu Suzaki, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. Why do software packages conflict? In proceedings of MSR 2012: 9th IEEE Working Conference on Mining Software Repositories, co-located with ICSE 2012, IEEE, ISBN 978-1-4673-1760-3, pp. 141-150. June 2-3, Zurich, Switzerland.

    Abstract: Determining whether two or more packages cannot be installed together is an important issue in the quality assurance process of package-based distributions. Unfortunately, the sheer number of different configurations to test makes this task particularly challenging, and hundreds of such incompatibilities go undetected by the normal testing and distribution process until they are later reported by a user as bugs that we call "conflict defects". We performed an extensive case study of conflict defects extracted from the bug tracking systems of Debian and Red Hat. According to our results, conflict defects can be grouped into five main categories. We show that with more detailed package meta-data, about 30% of all conflict defects could be prevented relatively easily, while another 30% could be found by targeted testing of packages that share common resources or characteristics. These results allow us to make precise suggestions on how to prevent and detect conflict defects in the future.

  12. [.pdf] [.bib] Roberto Di Cosmo, Stefano Zacchiroli, Gianluigi Zavattaro. Towards a Formal Component Model for the Cloud. In proceedings of SEFM 2012: 10th International Conference on Software Engineering and Formal Methods, Thessaloniki, Greece, 1-5 October, 2012. LNCS 7504, pp. 156-171, Springer-Verlag, 2012.

    Abstract: We consider the problem of deploying and (re)configuring resources in a "cloud" setting, where interconnected software components and services can be deployed on clusters of heterogeneous (virtual) machines that can be created and connected on-the-fly. We introduce the Aeolus component model to capture similar scenarii from realistic cloud deployments, and instrument automated planning of day-to-day activities such as software upgrade planning, service deployment, elastic scaling, etc. We formalize the model and characterize the feasibility and complexity of configuration achievability in Aeolus.

  13. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. Learning from the Future of Component Repositories. In proceedings of CBSE 2012: 15th International ACM SIGSOFT Symposium on Component Based Software Engineering, Bertinoro, Italy, June 26-28, 2012. ISBN 978-1-4503-1345-2, pp. 51-60, ACM 2012. Award: Best Paper Award. Abstract...

    Abstract: An important aspect of the quality assurance of large component repositories is the logical coherence of component metadata. We argue that it is possible to identify certain classes of such problems by checking relevant properties of the possible future repositories into which the current repository may evolve. In order to make a complete analysis of all possible futures effective however, one needs a way to construct a finite set of representatives of this infinite set of potential futures. We define a class of properties for which this can be done. We illustrate the practical usefulness of the approach with two quality assurance applications: (i) establishing the amount of "forced upgrades" induced by introducing new versions of existing components in a repository, and (ii) identifying outdated components that need to be upgraded in order to ever be installable in the future. For both applications we provide experience reports obtained on the Debian distribution.

  14. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Ralf Treinen, Stefano Zacchiroli. MPM: a modular package manager. In proceedings of CBSE 2011: 14th International ACM SIGSOFT Symposium on Component Based Software Engineering, Boulder, Colorado, USA, 21-23 June, 2011. ISBN 978-1-4503-0723-9, pp. 179-188, ACM 2011. Award: ACM SIGSOFT Distinguished Paper Award. Abstract...

    Abstract: Software distributions in the FOSS world rely on so-called package managers for the installation and removal of packages on target machines. State-of-the-art package managers are monolithic in architecture, and each of them is hard-wired to an ad-hoc dependency solver implementing customized heuristics. In this paper we propose a modular architecture allowing for pluggable dependency solvers and backends. We argue that this is the path that leads to the next generation of package managers that will deliver better results, accept more expressive input languages, and be easily adaptable to new platforms. We present a working prototype, called MPM, which has been implemented following the design advocated in this paper.

  15. [.pdf] [.bib] Roberto Di Cosmo, Stefano Zacchiroli. Feature Diagrams as Package Dependencies. In proceedings of SPLC 2010: 14th International Software Product Line Conference, Jeju Island, South Korea, 13-17 September 2010. LNCS 6287, ISBN 978-3-642-15578-9, pp. 476-480, Springer-Verlag, 2010. Abstract...

    Abstract: FOSS (Free and Open Source Software) distributions use dependencies and package managers to maintain huge collections of packages and their installations; recent research has led to efficient and complete configuration tools and techniques, based on state of the art solvers, that are being adopted in industry. We show how to encode a significant subset of Free Feature Diagrams as interdependent packages, enabling the reuse of package tools and research results in software product lines.

  16. [.pdf] [.bib] Lucas Nussbaum, Stefano Zacchiroli. The Ultimate Debian Database: Consolidating Bazaar Metadata for Quality Assurance and Data Mining. In proceedings of MSR 2010: 7th IEEE Working Conference on Mining Software Repositories, co-located with ICSE 2010, IEEE, ISBN 978-1-4244-6802-7, pp. 52-61. 02-03/05/2010, Cape Town, South Africa. Abstract...

    Abstract: FLOSS distributions like RedHat and Ubuntu require a lot more complex infrastructures than most other FLOSS projects. In the case of community-driven distributions like Debian, the development of such an infrastructure is often not very organized, leading to new data sources being added in an impromptu manner while hackers set up new services that gain acceptance in the community. Mixing and matching data is then harder than it should be, albeit being badly needed for Quality Assurance and data mining. Massive refactoring and integration is not a viable solution either, due to the constraints imposed by the bazaar development model. This paper presents the Ultimate Debian Database (UDD), which is the countermeasure adopted by the Debian project to the above "data hell". UDD gathers data from various data sources into a single, central SQL database, turning Quality Assurance needs that could not be easily implemented before into simple SQL queries. The paper also discusses the customs that have contributed to the data hell, the lessons learnt while designing UDD, and its applications and potentialities for data mining on FLOSS distributions.

  17. [.pdf] [.bib] Gabriele D'Angelo, Fabio Vitali, Stefano Zacchiroli. Content Cloaking: Preserving Privacy with Google Docs and other Web Applications. In proceedings of ACM SAC 2010: 25th Annual ACM Symposium on Applied Computing, ISBN 978-1-60558-639-7, pp. 826-830. 22-26/03/2010 - Sierre, Switzerland. Abstract...

    Abstract: Web office suites such as Google Docs offer unparalleled collaboration experiences in terms of low software requirements, ease of use, data ubiquity, and availability. When the data holder (Google, Microsoft, etc.) is not perceived as trusted though, those benefits are considered at odds with important privacy requirements. Content cloaking is a lightweight, cryptographic, client-side solution to protect content from data holders while using web office suites and other "Web 2.0", AJAX-based, collaborative applications.

  18. [.pdf] [.bib] Pietro Abate, Jaap Boender, Roberto Di Cosmo, Stefano Zacchiroli. Strong Dependencies between Software Components. In proceedings of ESEM 2009: 3rd International Symposium on Empirical Software Engineering and Measurement, ISBN 978-1-4244-4842-5, pp. 89-99. October 15-16, 2009 - Lake Buena Vista, Florida, USA. Abstract...

    Abstract: Component-based systems often describe context requirements in terms of explicit inter-component dependencies. Studying large instances of such systems, such as free and open source software (FOSS) distributions, in terms of declared dependencies between packages is appealing. It is however also misleading when the language to express dependencies is as expressive as boolean formulae, which is often the case. In such settings, a more appropriate notion of component dependency exists: strong dependency. This paper introduces such a notion as a first step towards modeling semantic, rather than syntactic, inter-component relationships. Furthermore, a notion of component sensitivity is derived from strong dependencies, with applications to quality assurance and to the evaluation of upgrade risks. An empirical study of strong dependencies and sensitivity is presented, in the context of one of the largest, freely available, component-based systems.

  19. [.pdf] [.bib] Antonio Cicchetti, Davide Di Ruscio, Patrizio Pelliccione, Alfonso Pierantonio, Stefano Zacchiroli. A Model Driven Approach to Upgrade Package-Based Software Systems. In proceedings of ENASE 2009: 4th international conference on Evaluation of Novel Aspects to Software Engineering; held in conjunction with ICEIS 2009. 6-10 May 2009, Milan, Italy. CCIS Volume 69, pp. 262-276, Springer-Verlag, 2010. Abstract...

    Abstract: Complex software systems are more and more based on the abstraction of package, brought to popularity by Free and Open Source Software (FOSS) distributions. While helpful as an encapsulation layer, packages do not solve all problems of deployment, and more generally of management, of large software collections. In particular upgrades, which often affect several packages at once due to inter-package dependencies, often fail and do not hold good transactional properties. This paper shows how to apply model driven techniques to describe and manage software upgrades of FOSS distributions. It is discussed how to model static and dynamic aspects of package upgrades, the latter being the most challenging aspect to deal with, in order to be able to predict common causes of upgrade failures and undo residual effects of failed or undesired upgrades.

  20. [.pdf] [.bib] Angelo Di Iorio, Davide Rossi, Fabio Vitali, Stefano Zacchiroli. Where are your Manners? Sharing Best Community Practices in the Web 2.0. In proceedings of ACM SAC 2009: the 24th Annual ACM Symposium on Applied Computing. ISBN 978-1-60558-166-8, pp. 681-687, ACM. Abstract...

    Abstract: The Web 2.0 fosters the creation of communities by offering users a wide array of social software tools. But, while the success of these tools is based on their ability to support different interaction patterns among users by imposing as few limitations as possible, the communities they support are not free of rules (just think about the posting rules in a community forum or the editing rules in a thematic wiki). In this paper we propose a framework for the sharing of best community practices in the form of a (potentially rule-based) annotation layer that can be integrated with existing Web 2.0 community tools (with specific focus on wikis). This solution is characterized by minimal intrusiveness and plays nicely within the open spirit of the Web 2.0 by providing users with behavioral hints rather than by enforcing the strict adherence to a set of rules.

  21. [.pdf] [.bib] Angelo Di Iorio, Fabio Vitali, Stefano Zacchiroli. Wiki Content Templating. In Proceedings of WWW 2008: 17th International World Wide Web Conference. April 21-25, 2008 Beijing, China. ACM ISBN 978-1-60558-085-2/08/04, pp. 615-624. Abstract...

    Abstract: Wiki content templating enables reuse of content structures among wiki pages. In this paper we present a thorough study of this widespread feature, showing how its two state of the art models (functional and creational templating) are sub-optimal. We then propose a third, better, model called lightly constrained (LC) templating and show its implementation in the Moin wiki engine. We also show how LC templating implementations are the appropriate technologies to push forward semantically rich web pages on the lines of (lowercase) semantic web and microformats.

  22. [.pdf] [.bib] Claudio Sacerdoti Coen, Stefano Zacchiroli. Spurious Disambiguation Error Detection. In Proceedings of MKM 2007: The 6th International Conference on Mathematical Knowledge Management. Hagenberg, Austria -- 27-30 June 2007. LNAI 4573, Springer Berlin / Heidelberg, ISBN 978-3-540-73083-5, pp. 381-392, 2007. Abstract...

    Abstract: The disambiguation approach to the input of formulae enables the user to type correct formulae in a terse syntax close to the usual ambiguous mathematical notation. When it comes to incorrect formulae we want to present only errors related to the interpretation meant by the user, hiding errors related to other interpretations (spurious errors). We propose a heuristic to recognize spurious errors, which has been integrated with the disambiguation algorithm of [1].

  23. [.pdf] [.bib] Andrea Asperti, Claudio Sacerdoti Coen, Enrico Tassi, Stefano Zacchiroli. Crafting a Proof Assistant. In Proceedings of Types 2006: Types for Proofs and Programs. Nottingham, UK -- April 18-21, 2006. LNCS 4502, Springer Berlin / Heidelberg, ISBN 978-3-540-74463-4, pp. 18-32, 2007. Abstract...

    Abstract: Proof assistants are complex applications whose development has never been properly systematized or documented. This work is a contribution in this direction, based on our experience with the development of Matita: a new interactive theorem prover based, as Coq, on the Calculus of Inductive Constructions (CIC). In particular, we analyze its architecture focusing on the dependencies of its components, how they implement the main functionalities, and their degree of reusability. The work is a first attempt to provide a ground for a more direct comparison between different systems and to highlight the common functionalities, not only in view of reusability but also to encourage a more systematic comparison of different software and architectural solutions.

  24. [.pdf] [.bib] Luca Padovani, Stefano Zacchiroli. From Notation to Semantics: There and Back Again. In Proceedings of MKM 2006: The 5th International Conference on Mathematical Knowledge Management. Wokingham, UK -- August 11-12, 2006. LNAI 4108, Springer Berlin / Heidelberg, ISBN 978-3-540-37104-5, pp. 194-207, 2006. Abstract...

    Abstract: Mathematical notation is a structured, open, and ambiguous language. In order to support mathematical notation in MKM applications one must necessarily take into account presentational as well as semantic aspects. The former are required to create a familiar, comfortable, and usable interface to interact with. The latter are necessary in order to process the information meaningfully. In this paper we investigate a framework for dealing with mathematical notation in a meaningful, extensible way, and we show an effective instantiation of its architecture to the field of interactive theorem proving. The framework builds upon well-known concepts and widely-used technologies and it can be easily adopted by other MKM applications.

  25. [.pdf] [.bib] Andrea Asperti, Ferruccio Guidi, Claudio Sacerdoti Coen, Enrico Tassi, Stefano Zacchiroli. A Content Based Mathematical Search Engine: Whelp. In Proceedings of TYPES 2004: Types for Proofs and Programs. Paris, France -- December 15-18, 2004. LNCS 3839, Springer Berlin / Heidelberg, ISBN 3-540-31428-8, pp. 17-32, 2006. Abstract...

    Abstract: The prototype of a content based search engine for mathematical knowledge supporting a small set of queries requiring matching and/or typing operations is described. The prototype, called Whelp, exploits a metadata approach for indexing the information that looks far more flexible than traditional indexing techniques for structured expressions like substitution, discrimination, or context trees. The prototype has been instantiated to the standard library of the Coq proof assistant extended with many user contributions.

  26. [.pdf] [.bib] Luca Padovani, Claudio Sacerdoti Coen, Stefano Zacchiroli. A Generative Approach to the Implementation of Language Bindings for the Document Object Model. In Proceedings of GPCE'04: 3rd International Conference on Generative Programming and Component Engineering. Vancouver, Canada -- October 24-28, 2004. LNCS 3286, Springer Berlin / Heidelberg, ISBN 3-540-23580-9, pp. 469-487, 2004. Abstract...

    Abstract: The availability of a C implementation for the Document Object Model (DOM) gives the interesting opportunity of generating bindings for different programming languages automatically. Because of the DOM bias towards Java-like languages, a C implementation that fakes objects, inheritance, polymorphism, exceptions and uses reference-counting introduces a gap between the API specification and its actual implementation that the bindings should try to close. In this paper we overview the generative approach in this particular context and apply it for the generation of C++ and OCaml bindings.

  27. [.pdf] [.bib] Claudio Sacerdoti Coen, Stefano Zacchiroli. Efficient Ambiguous Parsing of Mathematical Formulae. In Proceedings of MKM 2004: 3rd International Conference on Mathematical Knowledge Management. September 19-21, 2004 Bialowieza - Poland. LNCS 3119, Springer Berlin / Heidelberg, ISBN 3-540-23029-7, pp. 347-362, 2004. Abstract...

    Abstract: Mathematical notation has the characteristic of being ambiguous: operators can be overloaded and information that can be deduced is often omitted. Mathematicians are used to this ambiguity and can easily disambiguate a formula making use of the context and of their ability to find the right interpretation. Software applications that have to deal with formulae usually avoid these issues by fixing an unambiguous input notation. This solution is annoying for mathematicians because of the resulting tricky syntaxes and becomes a show stopper to the simultaneous adoption of tools characterized by different input languages. In this paper we present an efficient algorithm suitable for ambiguous parsing of mathematical formulae. The only requirement of the algorithm is the existence of a validity predicate over abstract syntax trees of incomplete formulae with placeholders. This requirement can be easily fulfilled in the applicative area of interactive proof assistants, and in several other areas of Mathematical Knowledge Management.

international, peer-reviewed workshop proceedings

  1. [.pdf] [.bib] Pietro Abate, Roberto Di Cosmo, Louis Gesbert, Fabrice Le Fessant, Stefano Zacchiroli. Using Preferences to Tame your Package Manager. In proceedings of OCaml 2014: The OCaml Users and Developers Workshop, September 5, 2014, Gothenburg, Sweden. Co-located with ICFP 2014. Abstract...

    Abstract: Determining whether some components can be installed on a system is a complex problem: not only is it NP-complete in the worst case, but there can also be exponentially many solutions to it. Ordinary package managers use ad-hoc heuristics to solve this installation problem and choose a particular solution, making it extremely difficult to change or sidestep these heuristics when the result is not the one we expect. When software repositories become complex enough, one gets vastly superior results by delegating dependency handling to a specialised solver, and using optimisation functions (or preferences) to control the class of solutions that are found. The opam package manager relies on the CUDF pivot format, which allows OCaml users that have a CUDF-compliant solver on their machine to reap the benefits of preferences-based dependency resolution. Thanks to the solver farm provided by Irill, these benefits are now extended to the OCaml community at large. In this talk we will present the preferences language and explain how to use it.

  2. [.pdf] [.bib] Cyrille Valentin Artho, Roberto Di Cosmo, Kuniyasu Suzaki, Stefano Zacchiroli. Sources of Inter-package Conflicts in Debian. In proceedings of LoCoCo 2011: International Workshop on Logics for Component Configuration, affiliated with CP 2011. Abstract...

    Abstract: Inter-package conflicts require the presence of two or more packages in a particular configuration, and thus tend to be harder to detect and localize than conventional (intra-package) defects. Hundreds of such inter-package conflicts go undetected by the normal testing and distribution process until they are later reported by a user. The reason for this is that current meta-data is not fine-grained and accurate enough to cover all common types of conflicts. A case study of inter-package conflicts in Debian has shown that with more detailed package meta-data, at least one third of all package conflicts could be prevented relatively easily, while another one third could be found by targeted testing of packages that share common resources or characteristics. This paper reports the case study and proposes ideas to detect inter-package conflicts in the future.

  3. [.pdf] [.bib] Ralf Treinen, Stefano Zacchiroli. Expressing Advanced User preferences in Component Installation. In proceedings of IWOCE 2009: International Workshop on Open Component Ecosystem, affiliated with ESEC/FSE 2009. Foundations of Software Engineering, ISBN 978-1-60558-677-9, pp. 31-40, ACM 2009. Abstract...

    Abstract: State of the art component-based software collections, such as FOSS distributions, are made of up to tens of thousands of components, with complex inter-dependencies and conflicts. Given a particular installation of such a system, each request to alter the set of installed components has potentially (too) many satisfying answers. We present an architecture that allows expressing advanced user preferences about package selection in FOSS distributions. The architecture is composed of a distribution-independent format for describing available and installed packages called CUDF (Common Upgradeability Description Format), and a foundational language called MooML to specify optimization criteria. We present the syntax and semantics of CUDF and MooML, and discuss the partial evaluation mechanism of MooML, which allows efficiency gains in package dependency solvers.

  4. [.pdf] [.bib] Davide Di Ruscio, Patrizio Pelliccione, Alfonso Pierantonio, Stefano Zacchiroli. Towards maintainer script modernization in FOSS distributions. In proceedings of IWOCE 2009: International Workshop on Open Component Ecosystem, affiliated with ESEC/FSE 2009. Foundations of Software Engineering, ISBN 978-1-60558-677-9, pp. 11-20, ACM 2009. Abstract...

    Abstract: Free and Open Source Software (FOSS) distributions are complex software systems, made of thousands of packages that evolve rapidly, independently, and without centralized coordination. During package upgrades, corner case failures can be encountered and are hard to deal with, especially when they are due to misbehaving maintainer scripts: executable code snippets used to finalize package configuration. In this paper we report a software modernization experience, the process of representing existing legacy systems in terms of models, applied to FOSS distributions. We present a process to define meta-models that enable dealing with upgrade failures and help rolling back from them, taking into account maintainer scripts. The process has been applied to widely used FOSS distributions and we report on such experiences.

  5. [.pdf] [.bib] Roberto Di Cosmo, Paulo Trezentos, Stefano Zacchiroli. Package Upgrades in FOSS Distributions: Details and Challenges. In proceedings of HotSWUp'08: Hot Topics in Software Upgrades. October 20, 2008, Nashville, Tennessee, USA. ACM ISBN 978-1-60558-304-4. Abstract...

    Abstract: The upgrade problems faced by Free and Open Source Software distributions have characteristics not easily found elsewhere. We describe the structure of packages and their role in the upgrade process. We show that state of the art package managers have shortcomings inhibiting their ability to cope with frequent upgrade failures. We survey current countermeasures to such failures, argue that they are not satisfactory, and sketch alternative solutions.

  6. [.pdf] [.bib] Paolo Marinelli, Fabio Vitali, Stefano Zacchiroli. Streaming Validation of Schemata: the Lazy Typing Discipline. In Proceedings of Extreme Markup Languages 2007: The Markup Theory and Practice Conference. August 7-10, 2007 Montreal, Canada. Abstract...

    Abstract: Assertions, identity constraints, and conditional type assignments are (planned) features of XML Schema which rely on XPath evaluation to various ends. The allowed XPath subset exploitable in those features is trimmed down for streamability concerns that are partly understandable (the apparent wish to avoid buffering to determine the evaluation of an expression) and partly artificial. In this paper we dissect the XPath language in subsets with varying streamability characteristics. We also identify the larger subset which is compatible with the typing discipline we believe underlies some of the choices currently present in the XML Schema specifications. We describe such a discipline as imposing that the type of an element has to be decided when its start tag is encountered and its validity has to be decided when its end tag is encountered. We also propose an alternative lazy typing discipline where both type assignment and validity assessment are fired as soon as they are available in a best effort manner. We believe our discipline is more flexible and delegates to schema authors the choice of where to place themselves in the trade-off between using larger XPath subsets and increasing buffering requirements or expeditiousness of typing information availability.

  7. [.pdf] [.bib] Paolo Marinelli, Stefano Zacchiroli. Co-Constraint Validation in a Streaming Context. In Proceedings of XML 2006: The world's oldest and biggest XML conference. Award: Winner of the XML Scholarship 2006 as best student paper. Boston, MA -- December 5-7, 2006. Abstract...

    Abstract: In many use cases applications are bound to be run consuming only a limited amount of memory. When they need to validate large XML documents, they have to adopt streaming validation, which does not rely on an in-memory representation of the whole input document. In order to validate an XML document, different kinds of constraints need to be verified. Co-constraints, which relate the content of elements to the presence and values of other attributes or elements, are one such kind of constraints. In this paper we propose an approach to the problem of validating in a streaming fashion an XML document against a schema also specifying co-constraints. We describe how the streaming evaluation of co-constraints influences the output of the validation process. Our proposal makes use of the validation language SchemaPath, a light extension to XML Schema, adding conditional type assignment for the support of co-constraints. The paper is based on the description of our streaming SchemaPath validator.

  8. [.pdf] [.bib] Claudio Sacerdoti Coen, Enrico Tassi, Stefano Zacchiroli. Tinycals: Step by Step Tacticals. In Proceedings of UITP 2006: User Interfaces for Theorem Provers. Seattle, WA -- August 21, 2006. ENTCS (Elsevier, ISSN 1571-0661), Volume 174, Issue 2, pp. 125-142. May 2007. Abstract...

    Abstract: Most of the state-of-the-art proof assistants are based on procedural proof languages, scripts, and rely on LCF tacticals as the primary tool for tactics composition. In this paper we discuss how these ingredients do not interact well with user interfaces based on the same interaction paradigm of Proof General (the de facto standard in this field), identifying in the coarse-grainedness of tactical evaluation the key problem. We propose Tinycals as an alternative to a subset of LCF tacticals, showing that the user does not experience the same problem if tacticals are evaluated in a more fine-grained manner. We present the formal operational semantics of tinycals as well as their implementation in the Matita proof assistant.

  9. [.pdf] [.bib] Angelo Di Iorio, Stefano Zacchiroli. Constrained Wiki: an Oxymoron? In Proceedings of WikiSym 2006: the 2006 International Symposium on Wikis. Odense, Denmark -- August 21-23, 2006. ACM, 2006, ISBN 1-59593-417-0, pp. 89-98. Abstract...

    Abstract: In this paper we propose a new wiki concept -- light constraints -- designed to encode community best practices and domain-specific requirements, and to assist in their application. While the idea of constraining user editing of wiki content seems to inherently contradict "The Wiki Way", it is well-known that communities of users involved in wiki sites have the habit of establishing best authoring practices. For domain-specific wiki systems which process wiki content, it is often useful to enforce some well-formedness conditions on specific page contents. This paper describes a general framework to think about the interaction of wiki systems with constraints, and presents a generic architecture which can be easily incorporated into existing wiki systems to exploit the capabilities enabled by light constraints.

  10. [.pdf] [.bib] Andrea Asperti, Stefano Zacchiroli. Searching Mathematics on the Web: State of the Art and Future Developments. In Proceedings of New Developments in Electronic Publishing: AMS/SMM Special Session, Houston, May 2004; ECM4 Satellite Conference, Stockholm, June 2004. pp. 9-18. FIZ Karlsruhe, ISBN 3-88127-107-4. Abstract...

    Abstract: A huge amount of mathematical knowledge is nowadays available on the World Wide Web. Many different solutions and technologies for searching that knowledge have been developed as well. We present the state of the art of searching mathematics on the Web, giving some insight on future developments in this area.

  11. [.pdf] [.bib] Claudio Sacerdoti Coen, Stefano Zacchiroli. Brokers and Web-Services for Automatic Deduction: a Case Study. In Proceedings of Calculemus 2003: 11th Symposium on the Integration of Symbolic Computation and Mechanized Reasoning. Roma, Italy -- September 10-12, 2003, Aracne Editrice. ISBN 88-7999-545-6, pp. 43-57, 2003. Abstract...

    Abstract: We present a planning broker and several Web-Services for automatic deduction. Each Web-Service implements one of the tactics usually available in interactive proof-assistants. When a proof status (an incomplete proof tree and a focus on an open goal) is submitted to the broker, it dispatches the proof to the Web-Services, collects the successful results, and sends them back to the client as hints as soon as they are available. In our experience this architecture turns out to be helpful both for experienced users (who can benefit from distributing heavy computations) and beginners (who can learn from it).

national, peer-reviewed journal articles

  1. [.pdf] [.bib] Mehdi Dogguy, Stéphane Glondu, Sylvain Le Gall, Stefano Zacchiroli. Enforcing Type-Safe Linking using Inter-Package Relationships. In Studia Informatica Universalis, Volume 9, Issue 1, pp. 129-157. Hermann 2011. Abstract...

    Abstract: Strongly-typed languages rely on link-time checks to ensure that type safety is not violated at the borders of compilation units. Such checks entail very fine-grained dependencies among compilation units, which are at odds with the implicit assumption of backward compatibility that is relied upon by common library packaging techniques adopted by FOSS (Free and Open Source Software) package-based distributions. As a consequence, package managers are often unable to prevent users from installing a set of libraries which cannot be linked together. We discuss how to guarantee link-time compatibility using inter-package relationships; in doing so, we take into account real-life maintainability problems such as support for automatic package rebuild and manageability of ABI (Application Binary Interface) strings by humans. We present the dh_ocaml implementation of the proposed solution, which is currently in use in the Debian distribution to safely deploy more than 300 OCaml-related packages.

national, peer-reviewed conference and workshop proceedings

  1. [.pdf] [.bib] Mehdi Dogguy, Stéphane Glondu, Sylvain Le Gall, Stefano Zacchiroli. Enforcing Type-Safe Linking using Inter-Package Relationships. In proceedings of JFLA 2010: 21st Journées Francophones des Langages Applicatifs, pp. 29-54. 30/01-02/02/2010 - La Ciotat, France. Abstract...

    Abstract: Strongly-typed languages rely on link-time checks to ensure that type safety is not violated at the borders of compilation units. Such checks entail very fine-grained dependencies among compilation units, which are at odds with the implicit assumption of backward compatibility that is relied upon by common library packaging techniques adopted by FOSS (Free and Open Source Software) package-based distributions. As a consequence, package managers are often unable to prevent users from installing a set of libraries which cannot be linked together. We discuss how to guarantee link-time compatibility using inter-package relationships; in doing so, we take into account real-life maintainability problems such as support for automatic package rebuild and manageability of ABI (Application Binary Interface) strings by humans. We present the dh_ocaml implementation of the proposed solution, which is currently in use in the Debian distribution to safely deploy more than 300 OCaml-related packages.

technical reports

  1. [.pdf] [.bib] Roberto Di Cosmo, Antoine Eiche, Jacopo Mauro, Gianluigi Zavattaro, Stefano Zacchiroli, Jakub Zwolakowski. Automatic Deployment of Software Components in the Cloud with the Aeolus Blender. Inria technical report 2015. Abstract...

    Abstract: Cloud computing allows building sophisticated software systems on virtualized infrastructures at a fraction of the cost that was necessary just a few years ago. The deployment of such complex systems, though, is still a serious issue due to the need of deploying a large number of packages and services, their elaborate interdependencies, and the need to define the (ideally optimal) allocation of software components onto available computing resources. In this paper we present the Aeolus Blender (Blender in the following), a toolchain that automates the assembly and deployment of complex component-based software systems in the "cloud". By relying on a configuration optimizer and a deployment planner, Blender fully automates the deployment of real-life cloud applications on OpenStack infrastructures, by exploiting a knowledge base of software components defined in the Mandriva Armonic tool-suite. The final deployment is guaranteed to satisfy not only user requirements and software dependencies, but also to be optimal with respect to the number of used virtual machines.

  2. [.pdf] [.bib] Roberto Di Cosmo, Michael Lienhardt, Ralf Treinen, Stefano Zacchiroli, Jakub Zwolakowski. Optimal Provisioning in the Cloud. Aeolus project technical report, 7 June 2013. Abstract...

    Abstract: Complex distributed systems are classically assembled by deploying several existing software components to multiple servers. Building such systems is a challenging problem that requires a significant amount of problem solving as one must i) ensure that all inter-component dependencies are satisfied; ii) ensure that no conflicting components are deployed on the same machine; and iii) take into account replication and distribution to account for quality of service, or possible failure of some services. We propose a tool, Zephyrus, that automates to a great extent the assembly of complex distributed systems. Given i) a high level specification of the desired system architecture, ii) the set of available components and their requirements, and iii) the current state of the system, Zephyrus is able to generate a formal representation of the desired system, to place the components in an optimal manner on the available machines, and to interconnect them as needed.

  3. [.pdf] [.bib] Roberto Di Cosmo, Jacopo Mauro, Stefano Zacchiroli, Gianluigi Zavattaro. Component reconfiguration in the presence of conflicts. Aeolus project technical report, 22 April 2013. Abstract...

    Abstract: Components are traditionally modeled as black-boxes equipped with interfaces that indicate provided/required ports and, often, also conflicts with other components that cannot coexist with them. In modern tools for automatic system management, components become grey-boxes that show relevant internal states and the possible actions that can be performed on the components to change such state during the deployment and reconfiguration phases. However, state-of-the-art tools in this field do not support a systematic management of conflicts. In this paper we investigate the impact of conflicts by precisely characterizing the increase in complexity of the reconfiguration problem.

  4. [.pdf] [.bib] Ralf Treinen, Stefano Zacchiroli. Common Upgradeability Description Format (CUDF) 2.0. Mancoosi project technical report 3, 24 November 2009. Abstract...

    Abstract: The solver competition which will be organized by Mancoosi relies on the standardized format for describing package upgrade scenarios. This document describes the Common Upgradeability Description Format (CUDF), the document format used to encode upgrade scenarios, abstracting over distribution-specific details. Solvers taking part in the competition will be fed with input in CUDF format. The format is not specific to Mancoosi and is meant to be generally useful to describe upgrade scenarios when abstraction over distribution-specific details is desired.

  5. [.pdf] [.bib] Pietro Abate, Jaap Boender, Roberto Di Cosmo, Stefano Zacchiroli. Strong Dependencies between Software Components. Mancoosi project technical report 2, 22 May 2009. Abstract...

    Abstract: Component-based systems often describe context requirements in terms of explicit inter-component dependencies. Studying large instances of such systems, such as free and open source software (FOSS) distributions, in terms of declared dependencies between packages is appealing. It is however also misleading when the language to express dependencies is as expressive as boolean formulae, which is often the case. In such settings, a more appropriate notion of component dependency exists: strong dependency. This paper introduces such a notion as a first step towards modeling semantic, rather than syntactic, inter-component relationships. Furthermore, a notion of component sensitivity is derived from strong dependencies, with applications to quality assurance and to the evaluation of upgrade risks. An empirical study of strong dependencies and sensitivity is presented, in the context of one of the largest, freely available, component-based systems.

  6. [.pdf] [.bib] Davide Di Ruscio, Patrizio Pelliccione, Alfonso Pierantonio, Stefano Zacchiroli. Metamodel for Describing System Structure and State. Mancoosi project deliverable, D2.1, work package 2. January 2009. Abstract...

    Abstract: Today's software systems are very complex modular entities, made up of many interacting components that must be deployed and coexist in the same context. Modern operating systems provide the basic infrastructure for deploying and handling all the components that are used as the basic blocks for building more complex systems even though a generic and comprehensive support is far from being provided. In fact, in Free and Open Source Software (FOSS) systems, components evolve independently from each other and because of the huge amount of available components and their different project origins, it is not easy to manage the life cycle of a distribution. Users are in fact allowed to choose and install a wide variety of alternatives whose consistency cannot be checked a priori to their full extent. It is possible to easily make the system unusable by installing or removing some packages that "break" the consistency of what is installed in the system itself. This document proposes a model-driven approach to simulate system upgrades in advance and to detect predictable upgrade failures, possibly by notifying the user before the system is affected. The approach relies on an abstract representation of the systems and packages which are given in terms of models that are expressive enough to isolate inconsistent configurations (e.g., situations in which installed components rely on the presence of disappeared sub-components) that are currently not expressible as inter-package relationships.

  7. [.pdf] [.bib] Ralf Treinen, Stefano Zacchiroli. Description of the CUDF Format. Mancoosi project deliverable, D5.1, work package 5. November 2008. Abstract...

    Abstract: This document contains several related specifications; taken together, they describe the document formats related to the solver competition which will be organized by Mancoosi. In particular, this document describes: DUDF (Distribution Upgradeability Description Format), the document format to be used to submit upgrade problem instances from user machines to a (distribution-specific) database of upgrade problems; CUDF (Common Upgradeability Description Format), the document format used to encode upgrade problems, abstracting over distribution-specific details. Solvers taking part in the competition will be fed with input in CUDF format.

  8. [.pdf] [.bib] Luca Padovani, Stefano Zacchiroli. Stream Processing of XML Documents Made Easy with LALR(1) Parser Generators. Technical report UBLCS-2007-23, September 2007, Department of Computer Science, University of Bologna. Abstract...

    Abstract: Because of their fully annotated structure, XML documents are normally believed to require a straightforward parsing phase. However, the standard APIs for accessing their content (the Document Object Model and the Simple API for XML) provide a programming interface that is very low-level and is thus inadequate for the recognition of any structure that is not isomorphic to its XML encoding. Even when the document undergoes validation, its unmarshalling into application-specific data using these APIs requires poorly maintainable, tedious-to-write, and possibly inefficient code. We describe a technique for the simultaneous parsing, validation, and unmarshalling of XML documents that combines a stream-oriented XML parser with a LALR(1) parser in order to guarantee efficient stream processing, expressive validation capabilities, and the possibility to associate user-provided actions with specific patterns occurring in the source documents.

  9. [.pdf] [.bib] Angelo Di Iorio, Fabio Vitali, Stefano Zacchiroli. Templating Wiki Content for Fun and Profit. Technical report UBLCS-2007-21, August 2007, Department of Computer Science, University of Bologna. Abstract...

    Abstract: Content templating enables reuse of content structures between wiki pages. Such a feature is implemented in several mainstream wiki engines. Systematic study of its conceptual models and comparison of the available implementations are unfortunately missing in the wiki literature. In this paper we aim to fill this gap by first analyzing template-related user needs, and then reviewing existing approaches to content templating. Our investigation shows that two models emerge, functional and creational templating, and that both have weaknesses, failing to fit properly in "The Wiki Way". As a solution, we propose the adoption of creational templates enriched with light constraints, showing that such a solution has a low implementation footprint in state-of-the-art wiki engines, and that it has a synergy with semantic wikis.

dissertations

  1. [.pdf] [.bib] Stefano Zacchiroli. User Interaction Widgets for Interactive Theorem Proving. Ph.D. dissertation, Technical report UBLCS-2007-10, March 2007, Department of Computer Science, University of Bologna (advisor: Andrea Asperti; refereed by: Christoph Benzmueller, Marino Miculan). Abstract...

    Abstract: Matita (which means "pencil" in Italian) is a new interactive theorem prover under development at the University of Bologna. When compared with state-of-the-art proof assistants, Matita presents both traditional and innovative aspects. The underlying calculus of the system, namely the Calculus of (Co)Inductive Constructions (CIC for short), is well-known and is used as the basis of another mainstream proof assistant, Coq, with which Matita is to some extent compatible. In the same spirit as several other systems, proof authoring is conducted by the user as a goal directed proof search, using a script for storing textual commands for the system. In the tradition of LCF, the proof language of Matita is procedural and relies on tactics and tacticals to proceed toward proof completion. The interaction paradigm offered to the user is based on the script management technique at the basis of the popularity of the Proof General generic interface for interactive theorem provers: while editing a script the user can move forth the execution point to deliver commands to the system, or back to retract (or "undo") past commands. Matita has been developed from scratch in the past 8 years by several members of the Helm research group; the author of this thesis is one of them. Matita is now a full-fledged proof assistant with a library of about 1,000 concepts. Several innovative solutions have spun off from this development effort. This thesis is about the design and implementation of some of those solutions, in particular those relevant for the topic of user interaction with theorem provers, and to which the author of this thesis was a major contributor. Joint work with other members of the research group is pointed out where needed. The main topics discussed in this thesis are briefly summarized below.

    Disambiguation. Most activities connected with interactive proving require the user to input mathematical formulae. Since mathematical notation is ambiguous, parsing formulae typeset the way mathematicians like to write them on paper is a challenging task; a challenge neglected by several theorem provers, which usually prefer to fix an unambiguous input syntax. Exploiting features of the underlying calculus, Matita offers an efficient disambiguation engine which makes it possible to type formulae in the familiar mathematical notation.

    Step-by-step tacticals. Tacticals are higher-order constructs used in proof scripts to combine tactics together. With tacticals, scripts can be made shorter, more readable, and more resilient to changes. Unfortunately they are de facto incompatible with state-of-the-art user interfaces based on script management. Such interfaces indeed do not allow positioning the execution point inside complex tacticals, thus introducing a trade-off between the usefulness of structuring scripts and a tedious big step execution behavior during script replaying. In Matita we break this trade-off with tinycals: an alternative to a subset of LCF tacticals which can be evaluated in a more fine-grained manner.

    Extensible yet meaningful notation. Proof assistant users often need to create new mathematical notation in order to ease the use of new concepts. The framework used in Matita for dealing with extensible notation both accounts for high quality bidimensional rendering of formulae (with the expressivity of MathML-Presentation) and provides meaningful notation, where presentational fragments are kept synchronized with semantic representation of terms. Using our approach, interoperability with other systems can be achieved at the content level, and direct manipulation of formulae acting on their rendered forms is possible too.

    Publish/subscribe hints. Automation plays an important role in interactive proving as users like to delegate tedious proving sub-tasks to decision procedures or external reasoners. Exploiting the Web-friendliness of Matita we experimented with a broker and a network of web services (called tutors) which can try independently to complete open sub-goals of a proof currently being authored in Matita. The user receives hints from the tutors on how to complete sub-goals and can interactively or automatically apply them to the current proof.

    Another innovative aspect of Matita, only marginally touched by this thesis, is the embedded content-based search engine Whelp which is exploited to various ends, from automatic theorem proving to avoiding duplicate work for the user. We also discuss the (potential) reusability in other systems of the widgets presented in this thesis and how we envisage the evolution of user interfaces for interactive theorem provers in the Web 2.0 era.

  2. [.pdf] [.bib] Stefano Zacchiroli. Web services per il supporto alla dimostrazione interattiva (Web services for interactive theorem proving). Master's thesis (Italian only), March 2003, Department of Computer Science, University of Bologna (advisor: Andrea Asperti; refereed by: Nadia Busi).

miscellanea

  1. [.pdf] [.bib] Ralf Treinen, Stefano Zacchiroli. Solving package dependencies: from EDOS to Mancoosi. In proceedings of DebConf8: 9th annual conference of the Debian project developers. August 10-16, 2008, Mar del Plata, Argentina. Abstract...

    Abstract: Mancoosi (Managing the Complexity of the Open Source Infrastructure) is an ongoing research project funded by the European Union for addressing some of the challenges related to the "upgrade problem" of interdependent software components of which Debian packages are prototypical examples. Mancoosi is the natural continuation of the EDOS project which has already contributed tools for distribution-wide quality assurance in Debian and other GNU/Linux distributions. The consortium behind the project consists of several European public and private research institutions as well as some commercial GNU/Linux distributions from Europe and South America. Debian is represented by a small group of Debian Developers who are working in the ranks of the involved universities to drive and integrate back achievements into Debian. This paper presents relevant results from EDOS in dependency management and gives an overview of the Mancoosi project and its objectives, with a particular focus on the prospective benefits for Debian.