Research Problems Swipe File

This article is an ongoing collection of interesting research topics and how they have been addressed by the papers that I have inspected. Additional references and background can be found in the cited sources themselves.

Web of Things Services

Composing WoT Services with Uncertain and Correlated Data

A typical application in the WoT takes the form of a service composition as follows [2]: (1) sensors capture different properties of the physical world (e.g. temperature or humidity), (2) sensing services retrieve the data from the sensors, (3) processing services perform some data treatments, and (4) actuating services give orders to the actuators.

Causes of uncertainty in the data exchanged by WoT services (the example is a system that uses IoT sensors to take care of ornamental plants):

Some leaves may not be detected by any camera, or they appear blurred.
The proper care cannot always be determined with certainty because different causes have the same symptom.
The values returned by the different humidity sensors are not equal because the sensors are placed at different locations and have different measurement accuracies.

A reliable approach to composing WoT services with uncertain and correlated data must answer the following questions:
How to model the composition? A reliable service composition model must be able to model:
The links between sensors and sensing services, actuating services and actuators, and between the different types of WoT services (sensing, processing, and actuating services);
The certain and uncertain outputs of WoT services.
How to represent correlations between WoT data? The correlations and dependencies between the WoT data must be explicitly modeled. A reliable model must be able to model any type of correlation.
How to evaluate a composition of WoT services with uncertain and correlated data? Given a composition plan, an effective evaluation method must be able to return the composition result in a reasonable time while taking the uncertain and correlated data into consideration.
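
As a rough illustration of what evaluating a composition over uncertain data can involve, here is a minimal sketch that enumerates every combination of possible input values ("possible worlds") and sums the probabilities per result. It is not the method of the cited paper. The value names, the probabilities, and the functions processing_service and evaluate_composition are all made up for this example, and the inputs are assumed to be independent, so correlations are deliberately ignored.

```python
from itertools import product

# Hypothetical uncertain outputs of two sensing services: each maps a
# possible value to its probability. Independence is an assumption here.
humidity = {"dry": 0.7, "moist": 0.3}
leaf_state = {"yellow": 0.6, "green": 0.4}


def processing_service(humidity_value, leaf_value):
    """Toy processing step: pick a care action from two observations."""
    if humidity_value == "dry":
        return "water"
    if leaf_value == "yellow":
        return "fertilize"
    return "none"


def evaluate_composition(*uncertain_inputs, service):
    """Run the service on every combination of input values and
    accumulate the probability of each distinct result."""
    result = {}
    for world in product(*(d.items() for d in uncertain_inputs)):
        values = [value for value, _ in world]
        probability = 1.0
        for _, p in world:
            probability *= p  # valid only because inputs are independent
        outcome = service(*values)
        result[outcome] = result.get(outcome, 0.0) + probability
    return result


action_distribution = evaluate_composition(
    humidity, leaf_state, service=processing_service
)
print(action_distribution)
```

The number of combinations grows exponentially with the number of uncertain inputs, which is one reason the "reasonable time" requirement is non-trivial. Handling correlated inputs would require a joint distribution rather than a product of independent ones.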

"Evaluation method" here means the mechanism that infers the result of the composition.

Sources:

Awad, S., Malki, A., Malki, M., Barhamgi, M., Benslimane, D.: Composing WoT services with uncertain data. Future Generation Computer Systems 101, 940–950 (2019)

Transition Towards Continuous Delivery in Safety Domain

Model-driven Document Generation

It would be useful to be able to handle the documents that we need to create in an even more automated way. Perhaps model-driven approaches could be beneficial in order to support generating parts of documents in a more fine-grained manner.

Sources: Transition Towards Continuous Delivery in the Healthcare Domain, Proceedings - 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP 2019

Continuous monitoring of safety

When dealing with safety-critical systems, we must ensure the safety of the system continuously even when making many small releases. Approaches that help us reason about system-level attributes and the composability of artifacts are of value here.

Software Engineering for AI

Empirical study on SE process for AI

Problem:

Recent advances in machine learning have stimulated widespread interest within the Information Technology sector on integrating AI capabilities into software and services. This goal has forced organizations to evolve their development processes. We report on a study that we conducted on observing software teams at Microsoft as they develop AI-based applications.

Results:

We found that various Microsoft teams have united this workflow into preexisting, well-evolved, Agile-like software engineering processes, providing insights about several essential engineering challenges that organizations may face in creating large-scale AI solutions for the marketplace.

We collected some best practices from Microsoft teams to address these challenges.

Three aspects of the AI domain that make it fundamentally different from prior software application domains:
1) discovering, managing, and versioning the data needed for machine learning applications is much more complex and difficult than other types of software engineering,
2) model customization and model reuse require very different skills than are typically found in software teams, and
3) AI components are more difficult to handle as distinct modules than traditional software components — models may be “entangled” in complex ways and experience non-monotonic error behavior.

Method:

We collected data in two phases: an initial set of interviews to gather the major topics relevant to our research questions and a wide-scale survey about the identified topics. Our study design was approved by Microsoft’s Ethics Advisory Board.

Source: Software Engineering for Machine Learning: A Case Study, Proceedings - 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP 2019

AI and Big Data for Software Engineering

Data-driven requirement engineering

Nowadays, users can easily submit feedback about software products in app stores, social media, or user groups. Moreover, software vendors are collecting massive amounts of implicit feedback in the form of usage data, error logs, and sensor data. These trends suggest a shift toward data-driven user-centered identification, prioritization, and management of software requirements.

Major directions in practice:

Tools for feedback analytics will help deal with a large number of heterogeneous and unstructured user comments by classifying, filtering, and summarizing them

Automatically collected usage data, logs, and interaction traces could improve the feedback quality and assist developers with understanding the feedback and reacting to it. We call this automatically collected information about the software usage implicit feedback.

With all the explicit and implicit feedback now available in an (almost) continuous way, the following question arises: How can practitioners use this information and integrate it into their processes and tools to decide what should be done, e.g., when the next release should be offered or what requirements and features should be added or eliminated [7]?

Source: Data-Driven Requirements Engineering-An Update, Proceedings - 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice, ICSE-SEIP 2019

Program Repair Bot

The Repairnator bot is an autonomous agent that constantly monitors test failures, reproduces bugs, and runs program repair tools against each reproduced bug. If a patch is found, Repairnator bot reports it to the developers.

Motivation:

However, previous evaluations of program repair techniques generally only evaluate the capability of the repair algorithms themselves. For the use of program repair techniques in practice, several other phases such as failure detection, bug reproduction, and patch reporting are also needed before or after the run of the core repair algorithm itself.
To demonstrate the real potential of program repair in industry, it is desirable to study the design and implementation of an end-to-end repair toolchain that is amenable to the mainstream development practices.
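
The phases named above (failure detection, bug reproduction, repair, patch reporting) can be sketched as a small pipeline. This is only an illustration of the end-to-end idea, not Repairnator's actual code: FailedBuild, run_repair_pipeline, and the stub functions are hypothetical names, and a "patch" is just a string here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailedBuild:
    """Hypothetical record of a CI build and its failing tests."""
    build_id: str
    failing_tests: List[str]


def run_repair_pipeline(builds, reproduce, repair_tools, report):
    """Detect failures, reproduce them, try each repair tool in turn,
    and report the first patch found for each build."""
    reported = []
    for build in builds:
        if not build.failing_tests:
            continue  # failure detection: nothing to repair
        if not reproduce(build):
            continue  # bug reproduction failed, skip this build
        for tool in repair_tools:
            patch = tool(build)  # a tool returns a patch or None
            if patch is not None:
                report(build, patch)  # patch reporting
                reported.append(build.build_id)
                break
    return reported


# Stubs standing in for real reproduction, repair, and reporting steps.
def always_reproducible(build):
    return True


def toy_tool(build):
    return "patch for " + build.failing_tests[0]


builds = [FailedBuild("build-1", ["test_parse"]), FailedBuild("build-2", [])]
messages = []
patched = run_repair_pipeline(
    builds,
    always_reproducible,
    [toy_tool],
    lambda build, patch: messages.append((build.build_id, patch)),
)
print(patched, messages)
```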

Software Engineering for Blockchain

Detailed Performance monitoring for blockchain

Background:

We divide the performance of blockchain systems into overall performance and detailed performance.
Overall performance (e.g., throughput, latency) can be used to select the blockchain system that best fits the actual application scenario, which makes it valuable for users.
Detailed performance provides finer-grained information about the whole process, which helps blockchain developers locate performance bottlenecks.

Problem:

However, overall metrics cannot reflect the detailed performance in different process stages as shown in Figure 1. Detailed performance information of blockchain is urgently required and metrics are lacking. Moreover, the overhead of real-time monitoring as well as scalability of the monitoring framework need to be comprehensively investigated. Thus the challenges of performance monitoring for blockchain systems can be summed up as what to monitor and how to monitor.

Method:

The framework obtains the performance data in real time through log analysis and a daemon process.
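
To make the log-analysis idea concrete, here is a minimal sketch that derives per-stage latencies from timestamped log lines. The log format, the stage names (received, validated, committed), and the function names are assumptions invented for this example. They are not taken from the paper or from any particular blockchain system.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log lines: timestamp, transaction id, stage reached.
LOG_LINES = [
    "2019-05-01T12:00:00.000 tx=a1 stage=received",
    "2019-05-01T12:00:00.120 tx=a1 stage=validated",
    "2019-05-01T12:00:01.500 tx=a1 stage=committed",
]


def parse_line(line):
    """Split one log line into (timestamp, transaction id, stage)."""
    timestamp, tx_field, stage_field = line.split()
    return (
        datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%f"),
        tx_field.split("=", 1)[1],
        stage_field.split("=", 1)[1],
    )


def stage_latencies(lines):
    """Return, per transaction, the seconds spent between consecutive stages."""
    events = defaultdict(list)
    for line in lines:
        timestamp, tx_id, stage = parse_line(line)
        events[tx_id].append((timestamp, stage))
    latencies = {}
    for tx_id, tx_events in events.items():
        tx_events.sort()  # order events by timestamp
        latencies[tx_id] = [
            (prev_stage, next_stage, (next_time - prev_time).total_seconds())
            for (prev_time, prev_stage), (next_time, next_stage)
            in zip(tx_events, tx_events[1:])
        ]
    return latencies


latencies = stage_latencies(LOG_LINES)
print(latencies)
```

A breakdown like this shows where time is spent inside a transaction's lifecycle, which an overall latency figure alone cannot reveal.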

Source: A detailed and real-time performance monitoring framework for blockchain systems, Proceedings - International Conference on Software Engineering

Developing Software Engineering Practices for Blockchain

Background:

The feeling of many software engineers about such huge interest in Blockchain technologies and, in particular, in the many software projects rapidly born and quickly developed around the various Blockchain implementations or applications, is that of unruled and hurried software development. The scenario is that of a sort of competition on a first-come-first-served basis which assures neither software quality nor that the basic concepts of software engineering are taken into account.

Problem:

The goal of this paper is to propose and test a design and development process for Blockchain applications based on Smart Contracts.

Method:

The overall process is mainly based on the principles of the Agile Manifesto [4], complemented with some specific notation and practices.
...
Consequently, the proposed process divides the Blockchain software system specification into two parts: the specification and development of the smart contracts (SCs), and that of the software application(s) which interact with the external users and with the SCs.

Source: An Agile Software Engineering Method to Design Blockchain Applications, CEE-SECR

Developing modelling notation for Blockchain applications

Context:

There is no standard notation available to design or model blockchain-oriented software (BOS). A system using blockchain could need a specialized notation to represent it [19]. The lack of specialized notation can over-complicate the adoption of or migration to BOS, since the interaction between the blockchain and the application will not be properly specified.

Method:

In this paper, we present three complementary modeling approaches for BOS based on the following modeling standards: Entity Relationship Model, Unified Modeling Language, and Business Process Model and Notation.
...
We also use a simple application scenario to illustrate our modeling and describe the advantages and disadvantages of each modeling approach.
...
The ER model is used to capture the data-driven aspect of blockchain applications. The main disadvantage is that the ER model can only capture data, not functional structure and behaviours. However, smart contracts are not just data, but also functions and behaviours.
The UML class diagram is used to capture the structure-driven aspect of blockchain applications.
The BPMN model is used to capture the process-driven aspect of blockchain applications. Its advantage is the ease of specifying the process behaviours.

Understanding software engineering practices of blockchain software projects

Context:

To provide the robustness blockchain applications demand, first, we have to concretely understand the current software engineering practices of blockchain software (BCS) projects or lack thereof. The exact practices could be understood reliably from the developers themselves.

Problem:

To provide the currently missing insight on blockchain-centric projects, we have set an objective to carry out the first formal survey to explore the software engineering practices including requirement analysis, task assignment, testing, and verification.
RQ1: Which software development practices do BCS developers follow?
RQ2: How do BCS developers identify and select the requirements for their projects?
RQ3: How are development tasks assigned to the BCS project members?
RQ4: How is the correctness of BCS projects code verified?
RQ5: How are BCS projects tested for security and scalability?
RQ6: What are the communication channels for the BCS developers?

Method:

To conduct the study, we sent an online survey to 1,604 BCS developers gathered via mining the GitHub repositories of 145 BCS projects. The survey is an ideal instrument for this study as current BCS developers have first-hand experience of their challenges and needs. The survey received 156 responses from BCS developers that met our criteria for analysis.
We adopted a systematic qualitative analysis approach to building a coding scheme for the open-ended responses.

We identified 145 BCS projects based on the following four criteria:
Tagged under at least one of the following six ‘topics’: blockchain, cryptocurrency, altcoin, ethereum, bitcoin, and smart-contracts.
‘Starred’ by at least ten users.
Have at least five distinct contributors.
A manual verification of the repository confirmed it as a BCS project.
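
The four selection criteria above amount to a simple filter over repository metadata. The sketch below applies them to plain dictionaries. The field names (topics, stars, contributors, manually_verified) and the sample repositories are hypothetical, and no real GitHub API is called.

```python
# The six topics listed in the selection criteria.
BCS_TOPICS = {
    "blockchain", "cryptocurrency", "altcoin",
    "ethereum", "bitcoin", "smart-contracts",
}


def meets_criteria(repo):
    """Check one repository record against the four selection criteria."""
    return (
        bool(BCS_TOPICS & set(repo["topics"]))  # at least one matching topic
        and repo["stars"] >= 10                 # starred by at least ten users
        and repo["contributors"] >= 5           # at least five contributors
        and repo["manually_verified"]           # confirmed by manual check
    )


# Made-up candidate repositories for illustration only.
candidates = [
    {"name": "chain-a", "topics": ["ethereum", "wallet"], "stars": 250,
     "contributors": 12, "manually_verified": True},
    {"name": "toy-coin", "topics": ["bitcoin"], "stars": 3,
     "contributors": 1, "manually_verified": True},
    {"name": "web-app", "topics": ["react"], "stars": 900,
     "contributors": 40, "manually_verified": False},
]

selected = [repo["name"] for repo in candidates if meets_criteria(repo)]
print(selected)
```

The manual verification step cannot be automated, so in this sketch it is represented only as a flag that someone set after inspecting the repository.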

Source: Understanding the software development practices of blockchain projects: A survey, International Symposium on Empirical Software Engineering and Measurement

Repository Identification and Mining Techniques

Research Questions Swipe File