Software Engineering for Blockchain

The need for software engineering techniques for blockchain

Almost all of these attacks can be attributed to poor software development practices. 1

Many software engineers feel that the increased interest in blockchain technologies, and in particular the numerous software projects rapidly born and quickly developed around the various blockchain implementations, has produced unregulated and hurried software development. The scenario resembles a competition on a first-come, first-served basis, which assures neither software quality nor that the basic concepts of software engineering are taken into account. 1

Poor choices in the architectural design and immature development tools imply that even security-conscious developers are susceptible to creating security loopholes with severe consequences.

The unalterable nature of blockchain technology makes recovery prohibitively difficult or effectively impossible if a vulnerability is detected after deployment.

Therefore, the development team should adopt a forward-looking approach to software engineering best practices, secure development, and extensive testing to eliminate bugs before they enter the system [23]. 6

Requirements of blockchain-oriented software engineering

New professional roles

Due to the business-critical nature of the Blockchain, actors in finance and law have shown increasing interest in blockchain-oriented software (BOS).

...

The Blockchain sector will need professional figures with a well-defined skills portfolio comprising finance, law, and technology expertise. An example of a new role could be that of an intermediary between business-focused contractors with low technology expertise and IT professionals. 1

Security and reliability

Ensuring security and reliability in BOS development might require specific methodologies such as Cleanroom Software Engineering [6] or thorough software reviews. Furthermore, mathematically sound analysis techniques could help enforce reliability and security-related properties in blockchain-oriented applications. 1

Testing techniques can also enhance system security and reliability. IBM recently expressed the need for continuous testing techniques to ensure blockchain software quality. 1

Software Architecture

Specific design notations, macroarchitecture patterns, or meta-models may be defined for BOS development. To this end, software engineers should define criteria for:

  • Selecting the most appropriate blockchain implementation
  • Evaluating the adoption of sidechain technology
  • Implementing an ad-hoc blockchain 1

Modeling Languages

Blockchain-oriented systems may require specialized graphical models for their representation. Existing models might be adapted to BOS: UML diagrams might be modified, or new ones created, to account for the specificities of BOS. For example, diagrams such as the Use Case Diagram, Activity Diagram, and State Diagram may not represent the BOS environment effectively as they stand. 1

Metrics

Due to the distributed nature of the Blockchain, specific metrics are required to measure complexity, communication capability, resource consumption (e.g. the so-called gas in the Ethereum system), and overall performance of BOS systems.
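As a small illustration of a resource-consumption metric, the sketch below computes transaction fees from gas figures using the simplified rule fee = gas used × gas price. The receipt type, the sample transactions, and the gas price are assumptions made up for the example (only the 21,000 gas cost of a plain Ether transfer is a real figure); newer Ethereum fee rules split the price into a base fee and a tip, which this sketch ignores.

```python
from dataclasses import dataclass

WEI_PER_GWEI = 10**9
WEI_PER_ETHER = 10**18


@dataclass(frozen=True)
class TxReceipt:
    """Hypothetical, simplified receipt: only the fields the metric needs."""
    name: str
    gas_used: int        # units of gas consumed by the transaction
    gas_price_gwei: int  # price paid per unit of gas, in gwei


def fee_wei(receipt: TxReceipt) -> int:
    """Fee = gas used x gas price, kept in integer wei to avoid rounding."""
    return receipt.gas_used * receipt.gas_price_gwei * WEI_PER_GWEI


def total_fee_ether(receipts: list[TxReceipt]) -> float:
    """Aggregate fee of a batch of transactions, converted to Ether."""
    return sum(fee_wei(r) for r in receipts) / WEI_PER_ETHER


# Illustrative data: the second gas figure and the price are invented.
receipts = [
    TxReceipt("plain transfer", gas_used=21_000, gas_price_gwei=20),
    TxReceipt("contract call", gas_used=90_000, gas_price_gwei=20),
]
```

Keeping amounts in integer wei until the final conversion avoids accumulating floating-point error when many receipts are summed.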

Research directions for blockchain-oriented software engineering

Testing

A recent study of over 50,000 GitHub projects [9] found that a larger team size leads to a higher number of test cases, whereas the number of test cases per developer decreases as the team grows. It would be interesting to investigate whether the same holds for BOS, considering that the most popular repositories have an unusually high number of contributors, even for open-source projects. 1

Collaboration

A large base of voluntary contributing members has been shown to be a pivotal success factor in OSS evolution [10]. To achieve sustainable development and improve software quality, specific practices to enhance the synergy between the system and the community would be highly beneficial to BOSE [11]. 1

Enhancement of testing and debugging for specific programming languages

This raises the need for enhanced testing and debugging suites tailored to the most popular BOS languages. 1

In addition, as BOS projects work with the Blockchain, which is distributed by definition, testing in isolation would require suitable mock objects capable of effectively simulating the Blockchain.
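A minimal sketch of such a mock, assuming the application only needs to send transfers and observe mined blocks. `FakeChain`, its methods, and `pay_invoice` are hypothetical names invented for this example and do not correspond to any real client library.

```python
import hashlib
import json


class FakeChain:
    """Hypothetical in-memory stand-in for a blockchain node."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.pending = []   # transactions waiting for the next block
        self.blocks = []    # mined blocks, oldest first

    def send_transaction(self, sender, recipient, amount):
        """Queue a transfer; nothing changes until a block is mined."""
        self.pending.append({"from": sender, "to": recipient, "amount": amount})

    def mine_block(self):
        """Apply pending transfers in order, skipping those that cannot be paid."""
        applied = []
        for tx in self.pending:
            if 0 < tx["amount"] <= self.balances.get(tx["from"], 0):
                self.balances[tx["from"]] -= tx["amount"]
                self.balances[tx["to"]] = self.balances.get(tx["to"], 0) + tx["amount"]
                applied.append(tx)
        self.pending = []
        # Link the new block to its parent through a hash, as a real chain would.
        parent = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        payload = json.dumps({"parent": parent, "txs": applied}, sort_keys=True)
        block = {
            "parent": parent,
            "txs": applied,
            "hash": hashlib.sha256(payload.encode()).hexdigest(),
        }
        self.blocks.append(block)
        return block


def pay_invoice(chain, payer, payee, amount):
    """Code under test: application logic that only talks to the chain interface."""
    chain.send_transaction(payer, payee, amount)
    return chain.mine_block()["txs"] != []
```

A unit test can then exercise `pay_invoice` against `FakeChain({"alice": 10})` without a network or a running node; what such a mock cannot reproduce is the distributed and adversarial behaviour of a real network.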

Some questions

Source: 2

Two main topics:

  1. How to engineer software for running the blockchain itself?
  2. How to engineer applications running within (i.e., smart contracts) or just outside the blockchain?

Questions:

  • How does one significantly increase the performance of the current generation of blockchain virtual machines within these constraints (secure, at scale, fully decentralized)?
  • How does one define the right mix of incentives to ensure the long term sustainability of the blockchain?
  • Given the rapid expansion of competing blockchain networks, with different consensus protocols or new smart contract programming languages, how likely is it that only one will prevail?
  • How to deal with the ecosystem fragmentation? Will smart contracts be able to dynamically migrate between different blockchains, or will it be enough to run some form of distributed transactions across multiple blockchains?
  • Which software analysis, measurement, verification, testing, and other quality assurance techniques can ensure that smart contracts are correct and secure?
  • Should every application use the blockchain? Or more precisely, which part of an application should run on-chain and which part off-chain?

Creation of software tools for smart contract languages

The implementation of Smart Contract Development Environments (SCDEs), the blockchain-oriented counterpart of IDEs, might be pivotal for building and spreading BOS expertise. Such environments could streamline smart contract creation through specialized languages (e.g., Solidity, a language designed for writing contracts in Ethereum).

Understanding how smart contracts are used and implemented

Understanding how smart contracts are used and how they are implemented could help designers of smart contract platforms create new domain-specific languages (not necessarily Turing complete [27,29,33,42]) that avoid, by design, vulnerabilities such as the ones discussed above. 8

Further, this knowledge could help improve analysis techniques for smart contracts (e.g., the ones in [25,37]) by targeting contracts with specific programming patterns. 8

Software vs Smart Contract Engineering

Software engineers need guidance for matching application domain requirements with specific characteristics of blockchain solutions. This enables them to take advantage of smart contracts for solving new classes of real-world problems, as opposed to introducing blockchains everywhere, where they may be unnecessary, or provide an inefficient and environmentally unsound solution.

Key differences between traditional software engineering practices and the assumptions that blockchain developers make when writing smart contracts:

  • Buggy smart contracts can have a serious financial impact (e.g., locking up funds, leaking funds). Moreover, deployed smart contracts are immutable and thus cannot be patched in place. Early detection of bugs through code verification and static analysis is therefore valuable.
  • Blockchain virtual machines running smart contracts can charge for every microinstruction they execute. Moreover, code execution may fail because it has run out of funds. Thus, software engineers have to learn how to deal with limited resources (e.g., recycling allocated memory).
  • Smart contracts are easier targets because attackers have full visibility into their source code and their internal private state. They can also control the order in which transactions are processed.
  • Whereas traditional containers protect the underlying environment from the untrusted code they execute, on the blockchain, smart contract code itself needs to be protected from its sandbox. The sandbox should not be trusted as it may attempt to steal the funds carried by the smart contract.
  • External modules that smart contracts call cannot always be trusted. External callers may exploit and misuse smart contracts by arbitrarily invoking their public interface.
  • Blockchain enables open execution; everyone can verify that your smart contract has not been tampered with by reexecuting it themselves and comparing the outcome.
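The second point in the list above (per-instruction charging and failure when the budget runs out) can be sketched with a toy metered interpreter. The instruction set and the prices in `GAS_COST` are invented for the example and do not reflect the cost table of any real virtual machine.

```python
class OutOfGas(Exception):
    """Raised when a program needs more gas than its limit allows."""


# Hypothetical per-instruction prices, chosen only for illustration.
GAS_COST = {"PUSH": 1, "ADD": 2, "STORE": 20}


def execute(program, gas_limit, storage):
    """Run `program` and return (new_storage, gas_used).

    Work happens on a copy of `storage`, so an out-of-gas failure leaves the
    caller's state untouched, mimicking the rollback of a failed transaction.
    """
    stack = []
    scratch = dict(storage)
    gas_used = 0
    for op, *args in program:
        gas_used += GAS_COST[op]          # charge before executing
        if gas_used > gas_limit:
            raise OutOfGas(f"{op} needs more than {gas_limit} gas")
        if op == "PUSH":
            stack.append(args[0])
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "STORE":
            scratch[args[0]] = stack.pop()
    return scratch, gas_used


# 2 + 3 stored under "total": costs 1 + 1 + 2 + 20 = 24 gas.
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("STORE", "total")]
```

With a limit of 30 the program succeeds and reports 24 gas used; with a limit of 10 the `STORE` instruction raises `OutOfGas` and the original storage is left as it was. In this toy price table the storage write dominates the bill, which is the kind of cost asymmetry that pushes developers to economize on resources.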

Software Process for Blockchain Application Engineering

Development process for lean start up

[Figure: bc-software-process-lean]

Source: 3

The process is mainly based on the concept of product-oriented development and on Lean Startup, the methodology proposed by Eric Ries for startup companies.

It is also worth noting that the process analyzes the need to pivot and the completeness of the project's Minimum Viable Product (MVP).

Process assets:

  • Business model
  • Software development policy oriented to Blockchain
  • Blockchain adequacy checklist
  • Project canvas
  • List of requirements
  • Blockchain reference architecture
  • Architecture document
  • Kanban of the project
  • User stories
  • Design of tests
  • Measurement specification
  • Product evaluation questionnaire
  • Retrospective minutes

An Agile Software Engineering Method to Design Blockchain Apps

[Figure: bc-software-process-agile]

Source: 5

Step 1:

State the goal of the system, write one or two sentences summarizing the goal, and post it in a place that is visible to all developers.

Step 2:

Identify the actors which interact with the system (human roles and external systems/devices). Here you can possibly apply the idea of determining the trust/untrust between actors, to assess whether a Blockchain system is really needed, and for what parts [6].

Step 3:

Write the system requirements in terms of user stories or features. In this phase, the system to be developed should be considered as a whole. Whether it will be developed using a Blockchain or a set of servers in a cloud is not important.

Step 4:

Divide the system into two subsystems:

  • The blockchain system composed of the smart contracts
  • The external system that interacts with the first, sending transactions to the blockchain and receiving the results

Step 5:

Design of the smart contract (SC) subsystem:

  • Redefine the actors and user stories, considering only those that directly interact with the SC subsystem
  • Define the decomposition into SCs; define the libraries and external SCs used; design the inheritance structure and the usage of interfaces
  • Define the connections and the flow of messages and Ether transfers; define the state diagram if necessary
  • Define the data structure, the external interface (ABI) and the events
  • Define the internal functions and the modifiers
  • Define the tests and security assessment practices
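To make the design items of step 5 more concrete, the sketch below models a single contract in Python rather than in a contract language: state variables as the data structure, public methods as the external interface, an event log, one internal function, and a modifier-like guard written as a decorator. The `Escrow` contract and all its names are hypothetical and serve only to show where each design item ends up.

```python
from functools import wraps


def only_owner(func):
    """Modifier-like guard: reject callers other than the contract owner."""
    @wraps(func)
    def guarded(self, caller, *args, **kwargs):
        if caller != self.owner:
            raise PermissionError("caller is not the owner")
        return func(self, caller, *args, **kwargs)
    return guarded


class Escrow:
    """Hypothetical contract model: state, external interface, events."""

    def __init__(self, owner, beneficiary):
        # Data structure (state variables)
        self.owner = owner
        self.beneficiary = beneficiary
        self.deposited = 0
        self.released = False
        self.events = []  # stand-in for the event log read by external systems

    def _emit(self, name, **fields):
        """Internal function: record an event."""
        self.events.append((name, fields))

    def deposit(self, caller, amount):
        """External function, open to any caller."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.released:
            raise RuntimeError("escrow already released")
        self.deposited += amount
        self._emit("Deposited", sender=caller, amount=amount)

    @only_owner
    def release(self, caller):
        """External function, restricted to the owner by the guard."""
        if self.released:
            raise RuntimeError("escrow already released")
        self.released = True
        amount, self.deposited = self.deposited, 0
        self._emit("Released", to=self.beneficiary, amount=amount)
        return amount
```

The emitted events are what the external subsystem of step 6 would react to, and the guard and the state checks are natural targets for the tests and security assessment named in the last item.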

Step 6:

Design of the external subsystem:

  • Redefine the actors and the user stories, like those described in steps 2 and 3, but adding the new (passive) actors represented by the SC system; define the acceptance tests of the subsystem;
  • Decide the broad architecture of the system, taking into account the server and client application, the Blockchain node(s) to use;
  • Define the User Interface of the relevant modules, including the apps;
  • Perform an analysis of the system, defining the decomposition into modules, the flow of messages, the structure and storage of permanent data (including data anchored to the Blockchain by storing hash digests), and the data or class structure of the application(s); the connections and data flows between participants, including the SCs, must comply with the analysis of step 5.3;
  • Define the state diagrams (if needed), the detailed interfaces of the various modules, the response to the events raised by SCs;
  • Perform a security assessment of the external system
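The idea of anchoring off-chain data to the Blockchain through hash digests, mentioned in the analysis item above, can be sketched as follows. The full document stays in the external system and only its SHA-256 digest is recorded; `AnchorRegistry` is a hypothetical in-memory stand-in for the smart contract that would hold the digests.

```python
import hashlib


def digest(document: bytes) -> str:
    """SHA-256 digest of the document, as a 64-character hex string."""
    return hashlib.sha256(document).hexdigest()


class AnchorRegistry:
    """Hypothetical stand-in for a contract that stores digests only."""

    def __init__(self):
        self._anchors = {}  # document id -> hex digest

    def anchor(self, doc_id: str, hex_digest: str) -> None:
        """Record a digest once; anchors are write-once in this sketch."""
        if doc_id in self._anchors:
            raise ValueError("document already anchored")
        self._anchors[doc_id] = hex_digest

    def matches(self, doc_id: str, document: bytes) -> bool:
        """Check an off-chain copy against the anchored digest."""
        return self._anchors.get(doc_id) == digest(document)
```

A typical use is `registry.anchor("doc-1", digest(data))` at creation time and `registry.matches("doc-1", data)` whenever a copy has to be checked; any change to the off-chain data produces a different digest and the check fails.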

Step 7:

Code and test the subsystems in parallel

Step 8:

Integrate, test, and deploy the overall system

Findings from Empirical Studies

Who is contributing to BCS projects?

It is worth formally studying the demographics of blockchain software (BCS) developers (age, gender, education, and general software development experience) to see whether they differ from those of non-BCS developers. Also, future participants in BCS development are likely to want an idea of the characteristics of the existing community, whose members are involved in the industry with large stakes and a wide variety of motivations and ethical considerations.

In general, the BCS developer population is more qualified than the general software developer population. Although more than 81% of our respondents have less than 2 years of BCS development experience, more than 70% of the developers from the same group have more than five years of development experience. These numbers indicate that a large number of software developers who are experienced in non-BCS development have recently joined BCS projects, potentially due to the recent hype generated by blockchain technology. 4

What are primary motivations of BCS developers?

Many of the early BCS developers/investors have garnered significant financial rewards from the recent boom of the cryptocurrency market. Therefore, it would not be surprising if external rewards were the primary motive for many BCS developers. Our next research question asks whether BCS developers’ motivations are similar to those of non-BCS OSS developers or whether they are attracted by potential financial gains. Understanding the motivations of BCS developers is important since it will help to identify prospective joiners, who may form synergies with a BCS community.

Due to the significant financial gains of the early cryptocurrency investors as well as a large influx of cash through ICOs, we hypothesized that the majority of BCS developers might be motivated by external rewards. However, the ratio of our respondents reporting external motives (36%) was similar to the ratios from prior OSS studies (Hars and Ou 2001; Lakhani and Wolf 2003). Moreover, the ratio of volunteers (43%) is also similar to what was reported in prior studies (Lakhani and Wolf 2003; Bosu et al. 2014a). On the other hand, ideological motives were more frequent among BCS developers (37%) than among OSS developers (Hars and Ou 2001; Lakhani and Wolf 2003). Therefore, aligning with the ideology of a BCS community is important for becoming a member of that community. 4

What are the differences between BCS and non-BCS development?

So, is BCS development really different? The answer to this question will depend on whom we ask. It’s true that BCS development places a very high emphasis on security and reliability, but many existing software development domains (e.g., financial transactions, air traffic control, and nuclear power plant management) have a similar emphasis on security and reliability.

If a developer’s non-BCS experience is in high-assurance software (Section 6.3.5), then they might find few differences.

However, over 93% of our respondents reported that their non-BCS experiences differed significantly from their experiences in BCS (Fig. 6). Our survey received responses from ≈ 10% of the most active developers from the top BCS projects, and 70% of our respondents have more than 5 years of development experience (Section 6.1). Yet, they found BCS development different from non-BCS development. Some of the differences, such as the immaturity of the ecosystem, will resolve with time, but others, such as the immutability of data and the difficulty of upgrading software after deployment, which are rare among non-BCS domains, will linger as differentiating factors. 4

What are the primary challenges of BCS development?

Since most non-BCS domains do not have reliability and security requirements as high as those of the BCS domain, developers coming from other domains (except high-assurance software) will encounter challenges due to those differences.

Moreover, BCS developers must be careful in writing code due to the high cost of defects as well as the difficulty of upgrading the software. Yet, they are under constant pressure to release new versions due to a rapidly changing ecosystem as well as high expectations from stakeholders.

Furthermore, as blockchains become more popular, the scalability of BCS software has become an area of concern (Porru et al. 2017). As a result, BCS development is challenging even for developers with considerable non-BCS experience (Section 6.4.1). 4

What are the tools that BCS developers currently need?

The difficulty of testing BCS ranks among the top challenges encountered by BCS developers (Porru et al. 2017; Brooke 2018). Current testing tools cannot provide testbeds that simulate the distributed and hostile execution environment of a BCS. Moreover, smart-contract development, which is gaining popularity, lacks supporting tools (Clack et al. 2016). The requirements for tools, once clearly understood, will lead to the development of supporting tools to design, develop, test, and deploy BCS applications.

Easy way of forking mainnets for testing purposes, a way to deploy a test net in one click would be nice.[#150]

Easier ways to simulate complex network topologies on one single machine to simulate the network. [#195]

Because existing security testing tools do not work well on BCS codebases, developers wish for automated security testing tools designed specifically for the BCS domain.

Fuzz testing, something like linting for security best practices. [#196]
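As a rough idea of what fuzz testing could look like for contract-like code, the sketch below throws random transfers at a toy ledger and checks two invariants after every call: the total supply never changes and no balance goes negative. `Token` and `fuzz_transfers` are hypothetical names for this example; a real fuzzer for smart contracts would work against compiled contract code rather than a Python model.

```python
import random


class Token:
    """Hypothetical toy ledger used as the system under test."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def transfer(self, sender, recipient, amount):
        if amount <= 0 or self.balances.get(sender, 0) < amount:
            return False  # rejected: state must stay untouched
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        return True


def fuzz_transfers(token, accounts, rounds, seed):
    """Apply random transfers and check the invariants after every call."""
    rng = random.Random(seed)  # seeded, so a failing run can be replayed
    supply = sum(token.balances.values())
    accepted = 0
    for _ in range(rounds):
        sender, recipient = rng.choice(accounts), rng.choice(accounts)
        amount = rng.randint(-5, 50)  # includes invalid (non-positive) amounts
        accepted += token.transfer(sender, recipient, amount)
        assert sum(token.balances.values()) == supply, "total supply changed"
        assert min(token.balances.values()) >= 0, "negative balance"
    return accepted
```

Calling `fuzz_transfers(Token({"a": 100, "b": 0, "c": 0}), ["a", "b", "c"], rounds=500, seed=1)` either returns the number of accepted transfers or stops at the first input sequence that breaks an invariant.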

Before interacting with a smart contract, a developer might want to verify its security properties by decompiling its bytecode. While a few solutions exist (Suiche 2017), smart-contract developers wish for a reliable and user-friendly decompiler.

... high-level Solidity decompiler that works (the current EVM-to-Solidity decompilers are horrible). [#113]

The other tools wished for by the respondents include UML/design notations for the BCS domain, containers for deployment, and automated performance analysis tools.

Key takeaway 5: Based on their personal experiences, our respondents found that some widely used tools tuned for non-BCS development lack the required support for BCS development. While some of the needs expressed by our respondents (e.g., easy-to-write formal specifications) may be wishful thinking and difficult to achieve, most of those tools are feasible. Potentially implementable tools for BCS development include testing environments, automatic security testing, static analysis for smart contracts, and easy-to-deploy testnets. Since an array of research predicts tremendous impacts of blockchain technology and smart contracts in the future (Iansiti and Lakhani 2017; Peters and Panayi 2016; Fanning and Centers 2016), the number of BCS projects and of developers contributing to those projects will grow significantly over the next decade. Therefore, research and development efforts should focus on implementing those tools to build a mature BCS development ecosystem. 4

Software Development Practices of BCS Developers

Software Development Practices

According to the BCS developers, code review is the most common practice and also the most effective one.

As a young and immature ecosystem, the BCS domain lacks adequate testing tools; without them, humans are the best bug finders in BCS projects.

Although formal verification ranks sixth in terms of effectiveness, it ranks last in frequency of regular practice, indicating that most developers have difficulty using it. 7

Verification and Validation Practices

Code review and unit testing are the most used techniques by BCS developers to verify their code correctness. 7

[Figure: bc-empirical-verification]

Testing security and scalability

The figure suggests that code review, stress testing, and unit testing are the techniques most used by BCS developers to test the security and scalability of their projects. Some BCS developers use the Testnet, an alternative blockchain used for testing without the risk of breaking the main blockchain. Some BCS developers also use techniques like static program analysis, bug bounties, simulation, and external audits. 7

[Figure: bc-empirical-security-scalability]

Requirement selection process

However, in BCS projects the majority of requirements are identified through community discussion, which is dissimilar to general OSS projects. This result is counter-intuitive, as most BCS projects are also open source and are believed to possess similar characteristics in these aspects. Moreover, project leaders or founders directly influence the selection of requirements in BCS projects, which is not the case for OSS projects [13]. 7

Task Assignment

The figure suggests that development tasks are primarily assigned on a voluntary basis. Ideally, the key element of any OSS project is voluntary participation and voluntary selection of tasks; i.e., each person is free to choose what he or she wishes to work on [24]. We see that task assignment in BCS projects follows a strategy similar to that of OSS projects. 7

Many BCS developers indicate that, in some projects, tasks are assigned based on the expertise of the developers. 7

Communication Channels

The figure suggests that GitHub and Slack are the communication channels most used by BCS developers. GitHub is very popular because developers communicate extensively through pull requests, code reviews, etc., directly alongside the code.

Slack is very popular among BCS developers because some of the large and most popular cryptocurrency communities are on Slack, and the basic account is free.

Medium is also popular among BCS developers because it is the largest repository of Blockchain related technical articles.

Due to the intent of remaining anonymous, the use of mailing lists is less frequent among BCS developers. In general, the communication channels of BCS developers are different from those of traditional OSS developers. 7

Sources:


  1. Blockchain-oriented software engineering: challenges and new directions, 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C)
  2. Understanding the software development practices of blockchain projects: A survey, International Symposium on Empirical Software Engineering and Measurement
  3. Blockchain and Smart Contract Engineering, IEEE Software
  4. An empirical analysis of smart contracts: platforms, applications, and design patterns; International conference on financial cryptography and data security
  5. An Approach to Develop Software that Uses Blockchain, CSOC 2018, AISC 763
  6. An Agile Software Engineering Method to Design Blockchain Applications, CEE-SECR
  7. Understanding the motivations, challenges and needs of Blockchain software developers: a survey, Empirical Software Engineering
  8. Understanding the software development practices of blockchain projects: A survey; International Symposium on Empirical Software Engineering and Measurement