The Secure Smart Contract Development Roadmap
The essential blueprint for crafting secure protocols
In 2023 alone, decentralized protocols were hacked for a combined value of $1.8 billion. The persistence of security breaches, even in systems that have gone through multiple security audits, raises critical questions.
Why do these vulnerabilities persist, and how can developers minimize the risk of these issues?
Smart contract protocols, as ultra-critical pieces of immutable software, require a more thorough and carefully designed development process than traditional applications. A minor oversight can lead to repercussions of monumental scale, with potential losses reaching into the billions. This high-stakes environment demands a paradigm shift in development practices, one that mirrors the rigor applied to mission-critical systems in industries like aviation and healthcare. Embracing this approach from day one enhances the effectiveness of each subsequent stage: it exponentially reduces the likelihood of errors throughout the development process and culminates in a clean, secure codebase.
The answer to creating secure protocols lies not in the frequency of security audits but in their effectiveness, as well as in the ability to conduct effective failure-based stress testing. A common misconception is that a security audit is the silver bullet for all potential security flaws. However, the reality is more nuanced. Audits depend heavily on humans reading code, and the efficacy of that process can vary greatly depending on the condition of the codebase. By following a series of steps and thought processes, you can significantly enhance the auditability of your protocol, thereby reducing the likelihood of missed issues.
The purpose of this guide is to provide a structured approach to ensure that your protocol is properly tested and optimized for a thorough and effective examination. The guide will also cover what to do after an audit in order to interpret its results correctly and safely deploy and monitor your contracts. The secure development lifecycle can be segmented into six main phases: Plan, Code, Test, Audit, Deploy, and Monitor. These are the thought processes that distinguish secure protocols from insecure ones.
Plan
Technical specifications
Creating a high-quality specification for code involves clear definition of the requirements, functionalities, features, and constraints of the protocol before the actual coding begins. It serves as source documentation, a guide for developers and stakeholders to ensure that the final product meets the initial vision and requirements.
Architecture Overview: Provide a high-level overview of the protocol architecture, including major components and their interactions. Define every functionality along with its input and output requirements. To do this, you can use a framework of pre-conditions, post-conditions, and actions in a logical format as outlined in this guide (see the sketch after this list).
Constraints: Outline any limitations such as time, budget, or technology, that could impact the development. Include major milestones and plans for future audits.
Dependencies: List any external systems, libraries, or services that your project will rely on.
Battle-Tested Code: Use battle-tested libraries like OpenZeppelin contracts whenever possible in order to minimize the error surface.
Contracts Flow Diagrams: Visualize how the smart contracts will interact with each other.
Development Toolchain: Specify the development, testing, deployment, and monitoring tools necessary for the project.
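As an illustration, here is a minimal sketch of such a specification entry, written as NatSpec-style pre-conditions, actions, and post-conditions. The IVault interface, its functions, and its conditions are hypothetical and only meant to show the format:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @title Hypothetical specification entry for a vault's withdraw function.
interface IVault {
    /// @notice Withdraws `amount` of the underlying token to the caller.
    /// @dev Pre-conditions:
    ///      - the vault is not paused
    ///      - balanceOf(msg.sender) >= amount
    ///      Action: transfer `amount` of the underlying token to msg.sender.
    ///      Post-conditions:
    ///      - balanceOf(msg.sender) has decreased by exactly `amount`
    ///      - totalAssets() has decreased by exactly `amount`
    /// @param amount The amount of underlying tokens to withdraw.
    function withdraw(uint256 amount) external;

    function balanceOf(address account) external view returns (uint256);

    function totalAssets() external view returns (uint256);
}
```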
Threat Modeling
System-wide Invariants: Define invariants across the system using a logical format. These assertions, which are expected to always hold during execution, are critical for effective testing, auditing, and monitoring. Clarifying these immutable properties is essential for planning security requirements.
Assess Integration Risks: Evaluate the security risks associated with integrating other protocols. Consider the security measures of these protocols and establish contingency plans for potential failures or loss of availability, such as utilizing external oracles or yield protocols.
Document Potential Threats: Identify and document potential threats, drawing inspiration from the Immunefi severity classification system. This process involves a thorough analysis of vulnerabilities and their possible impact on your system.
Code
Code clarity
“Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility, and maintainability” Clean Code by Robert C. Martin
For an effective audit, your codebase needs to follow a set of best practices and heuristics that keep it clean and easy to follow, and that minimize the chance of mistakes.
Structure and Style
You want the general structure and style of your contracts to be extremely clear. You can achieve that by following conventional practices outlined in the Solidity style guide.
Self-contained code
Your code should aim to be self-contained, which means that functions should have a clear and singular purpose, avoiding the use of flag-type parameters. If necessary, consider breaking down a function into multiple smaller functions, each performing a distinct action. This same principle should be applied to contracts themselves. Strive to keep contracts as concise as possible, and if complexity arises, consider dividing them into multiple contracts.
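For example, rather than one function whose behavior depends on a boolean flag, split it into two functions that share an internal helper. This sketch uses hypothetical names:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract Payments {
    event Paid(address indexed to, uint256 amount);
    event Refunded(address indexed to, uint256 amount);

    // Instead of: transferFunds(address to, uint256 amount, bool isRefund)
    function pay(address to, uint256 amount) external {
        _transferFunds(to, amount);
        emit Paid(to, amount);
    }

    function refund(address to, uint256 amount) external {
        _transferFunds(to, amount);
        emit Refunded(to, amount);
    }

    // Shared bookkeeping and transfer logic lives in a single place.
    function _transferFunds(address to, uint256 amount) internal {
        // ...
    }
}
```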
Naming
Use declarative, clear, and consistent naming. If some naming is unclear or uses terminology of your protocol, explain it in the comments. Replace magic numbers with named constants.
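A small illustration of replacing a magic number with named constants (the fee values are hypothetical):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract FeeExample {
    // Named constants make the intent explicit instead of scattering 10_000 around the code.
    uint256 public constant BASIS_POINTS_DENOMINATOR = 10_000;
    uint256 public constant PROTOCOL_FEE_BPS = 50; // 0.5%

    function protocolFee(uint256 amount) public pure returns (uint256) {
        return (amount * PROTOCOL_FEE_BPS) / BASIS_POINTS_DENOMINATOR;
    }
}
```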
Dependencies
Whenever possible, follow the principle of least knowledge. Ensure you only inherit from battle-tested libraries. Include any unaudited dependencies in the audit scope and check for common vulnerabilities using Defender’s code module dependency checker.
Input Validation
Go over each input and define clear logical pre-conditions when needed.
- Always assume input can take any value.
- Always assume functions can be called in any order.
- Always assume frontrunning is possible: an attacker who knows a user's pending action can make any call right before it.
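A minimal sketch of what explicit pre-condition checks can look like; the vault and its requirements are hypothetical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract Vault {
    mapping(address => uint256) public balanceOf;

    function withdraw(uint256 amount, address receiver) external {
        // Never trust the caller: validate every input explicitly.
        require(amount > 0, "Vault: zero amount");
        require(receiver != address(0), "Vault: zero receiver");
        require(balanceOf[msg.sender] >= amount, "Vault: insufficient balance");

        balanceOf[msg.sender] -= amount;
        // ... transfer of the underlying asset to `receiver` would happen here ...
    }
}
```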
Code analysis
Use a static analysis tool like OpenZeppelin Defender Code Inspector to automatically verify for common vulnerabilities and style violations. Defender integrates with GitHub to be a part of your CI/CD process, giving you feedback on every PR.
Documentation
Your code needs to be properly documented to ensure auditors understand its purpose and intended functionality. Poor documentation means auditors spend more time understanding your code instead of trying to break it, hurting the effectiveness of the audit.
- Use the NatSpec format to document every contract, library, function, event, and storage variable (see the sketch after this list).
- Document all your mathematical and financial models with clear explanations and/or proofs.
- Document all your assembly blocks in detail, explaining what they are supposed to do.
- Make use of diagrams as much as possible, to give an overview of the system and visualize the flow of contract interactions.
- Document your threat modeling and assumptions.
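For reference, NatSpec documentation on a contract and one of its functions might look like this; the staking pool and its comments are illustrative:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// @title Staking pool for a single reward token.
/// @notice Lets users stake tokens and accrue rewards over time.
/// @dev Reward accounting would follow a "rewards per token" model; the derivation
///      should live in the accompanying math documentation.
contract StakingPool {
    /// @notice Emitted when `user` stakes `amount` tokens.
    event Staked(address indexed user, uint256 amount);

    /// @notice Total amount of tokens currently staked in the pool.
    uint256 public totalStaked;

    /// @notice Stakes `amount` tokens on behalf of the caller.
    /// @param amount The number of tokens to stake, in the token's smallest unit.
    function stake(uint256 amount) external {
        totalStaked += amount;
        emit Staked(msg.sender, amount);
        // ... token transfer and reward checkpointing would happen here ...
    }
}
```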
Defensive Programming
When developing your protocol, adopt the perspective of an attacker to audit your code. This process encompasses several crucial aspects:
- Review audit reports and issues identified in similar protocols. Often, systems with similarities are vulnerable to the same types of human error, stemming from analogous reasoning about their mechanics.
- Evaluate every user-controlled parameter. Is there a potential for a function to be called with an unexpected parameter?
- Consider the implications of every external call. Are you delegating execution to an untrusted contract? Does this delegation occur in a state that is not consistent?
- Evaluate the sequence and circumstances of function executions. Despite the system's intended use, there's potential for manipulation in ways not originally anticipated. Functions could be executed in various orders or combined into a single transaction, leading to unintended outcomes. Additionally, consider scenarios where external factors, such as the availability of flash loans, introduce new dimensions of risk. Reflect on the potential consequences of such manipulations, taking into account both the internal mechanics and external influences like market volatility and frontrunning.
- If your protocol involves financial transactions, carefully examine the flow of value. How could value be diverted? What factors does it rely on? Are the incentives correctly aligned?
In summary, a key strategy in preparing for an audit is to proactively attempt to audit your own protocol, with a mindset geared towards identifying and exploiting system vulnerabilities.
To learn more about this type of thinking, we recommend reading samczsun’s blog.
Use the check-effects-interactions (CEI) pattern
Despite being the best-known and most-documented vulnerability type in blockchain, reentrancy remained one of the most common causes of hacks in 2023, with read-only reentrancy taking the lead.
Always use CEI when possible and in the uncommon case when it’s impossible, document it clearly and ensure all functions, including view ones, are protected against reentrancy.
If you read or use other protocols, ensure they apply CEI and are not vulnerable to read-only reentrancy. If you are not sure, document it and explain the concern to your auditors.
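A minimal sketch of the pattern for a simple ether vault: perform the checks first, apply the effects to storage second, and only then interact with external addresses:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract EtherVault {
    mapping(address => uint256) public balanceOf;

    function deposit() external payable {
        balanceOf[msg.sender] += msg.value;
    }

    function withdraw(uint256 amount) external {
        // Checks
        require(balanceOf[msg.sender] >= amount, "insufficient balance");

        // Effects: update state before the external call, so a reentrant
        // call sees the already-reduced balance.
        balanceOf[msg.sender] -= amount;

        // Interactions
        (bool success, ) = msg.sender.call{value: amount}("");
        require(success, "transfer failed");
    }
}
```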
Test
There are three categories of tests, all of which should have a fully-fledged test suite with a unique set of quality criteria:
Functional testing
Ensures that every happy path through the software system works as intended. This encompasses both state changes and return values.
- Unit tests: Test one function invocation or multiple functions on a single contract (while optionally mocking other parts of the system). See the sketch after this list for an example.
- Integration tests: Evaluate the interactions between multiple contracts in the codebase and with already deployed external contracts. Establishing a forked blockchain state is essential for accurately simulating realistic integration states, allowing for the exploration of different scenarios, such as the failure of an oracle. Furthermore, mock versions of integrations can be employed for simplicity, according to one's preference.
- Deployment tests: Test the entire set of contracts to be deployed, including the deployment scripts, ensuring correct order and parameterization.
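As a sketch, a unit test for the EtherVault example above could look like this in a Foundry test suite (assuming forge-std is available; the import paths are illustrative):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";
import {EtherVault} from "../src/EtherVault.sol";

contract EtherVaultTest is Test {
    EtherVault internal vault;
    address internal alice;

    function setUp() public {
        vault = new EtherVault();
        alice = makeAddr("alice");
        vm.deal(alice, 10 ether);
    }

    function test_DepositUpdatesBalance() public {
        vm.prank(alice);
        vault.deposit{value: 1 ether}();
        assertEq(vault.balanceOf(alice), 1 ether);
    }
}
```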
Security testing
Assures the absence of undesired functionality that could be misused to steal funds or bring the system into an undefined state.
- Negative testing: Typically, a test case asserts that a certain transaction reverts.
- Authorization: The test ensures that arbitrary senders cannot call the respective function (see the sketch after this list).
- Authentication: The test ensures that the message sender is identified correctly.
- Re-entrancy: The test ensures that re-entrancy cannot occur or is considered safe.
- Path equivalence: If two different sets of state changes lead to the same semantic end state, the test case checks for technical equivalence (e.g., staking rewards).
- Known exploits: Tests inspired by known exploits are adapted to the codebase.
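For instance, a negative authorization test can assert that an arbitrary sender cannot call a privileged function. This sketch assumes a Foundry setup and a hypothetical PausableProtocol contract whose pause() function is restricted to its owner:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";
import {PausableProtocol} from "../src/PausableProtocol.sol";

contract PausableProtocolSecurityTest is Test {
    PausableProtocol internal protocol;
    address internal attacker;

    function setUp() public {
        protocol = new PausableProtocol();
        attacker = makeAddr("attacker");
    }

    function test_RevertWhen_NonOwnerPauses() public {
        // Authorization: an arbitrary sender must not be able to pause the protocol.
        vm.prank(attacker);
        vm.expectRevert();
        protocol.pause();
    }
}
```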
Testing Processes: Shift-Left Testing
Test Suite Trigger: Tests are automatically initiated by specific events in the CI/CD pipeline, such as commits, merges to the main branch, or other CI/CD-related occurrences. This ensures that testing is an integral, continuous part of the development process, facilitating early detection and resolution of issues.
Test-Driven Development (TDD): Aligning with the Shift Left principle, tests are crafted prior to the development of code. This practice ensures that the system's intentions are thoroughly understood and addressed from the outset. Optionally, this process can be managed by a dedicated team, separate from the one responsible for development, to further ensure objectivity and comprehensive coverage of test cases.
Behaviour-Driven Development (BDD): Building on the principles of TDD, BDD takes a step further by integrating business and technical perspectives. BDD involves writing tests in a language that mirrors real-world scenarios, allowing non-technical stakeholders to participate actively in the development process. This approach not only clarifies the system's behavior from the user's viewpoint but also fosters collaboration across the team, ensuring the developed features precisely meet the project's requirements.
Invariant & Spec Design: Before the onset of implementation and testing phases, the design of formal invariants and specifications is critical. These serve as foundational elements that maintain their truth across the entirety of the implementation process.
By incorporating these strategies, the Shift Left principle emphasizes the importance of moving testing and quality assurance earlier in the development lifecycle. This not only helps in identifying and addressing defects sooner, when they are less complex and costly to fix, but also aligns development efforts more closely with the intended design and specifications, ensuring a higher quality and more secure final codebase.
Fuzzing
Fuzzing is an automated testing technique that involves executing tests with thousands or millions of computer-generated inputs. Fuzzing is recommended for parts of the code that are hard for the human mind to reason about, such as integer arithmetic prone to rounding errors, assembly blocks, and complex data compression.
There are three types of fuzzing:
- Stateless: This approach involves randomly generating inputs and immediately checking the results. After each test, the system is reset before proceeding with the next set of inputs.
- Stateful: In this method, the fuzzer is given the autonomy to decide which functions to call and with what inputs. The tester defines an invariant that must be maintained. For example, the total sum of balances should not exceed the total supply.
- Differential: Differential testing involves using fuzzing on two different implementations of the same functionality and comparing the results. If libraries A and B implement the same feature, their outputs should match for any given input. This is especially useful for math libraries, where you can compare your math against a Python implementation to check for rounding errors.
When choosing input ranges for fuzzing, think about limiting compute by excluding value ranges you already know with certainty will revert or succeed. At the same time, make sure the ranges you keep exercise all branches of the control flow.
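As an example, a stateless fuzz test in Foundry might check a rounding property of a fixed-point multiplication. The wadMul helper and the chosen bounds are hypothetical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

contract MathFuzzTest is Test {
    uint256 internal constant WAD = 1e18;

    // Hypothetical fixed-point multiplication that rounds down.
    function wadMul(uint256 a, uint256 b) internal pure returns (uint256) {
        return (a * b) / WAD;
    }

    // The fuzzer generates thousands of (a, b) pairs for this stateless test.
    function testFuzz_WadMulNeverRoundsUp(uint256 a, uint256 b) public {
        // Bound inputs to a range that cannot overflow, so runs are not
        // dominated by inputs we already know will revert.
        a = bound(a, 0, 1e27);
        b = bound(b, 0, 1e27);

        // Property: rounding down means the result is never larger than the exact value.
        assertLe(wadMul(a, b) * WAD, a * b);
    }
}
```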
Audits
Internal reviews
Internal audit processes play a crucial role in enhancing the security posture of a project before it undergoes external scrutiny. While many teams may not have dedicated security researchers in-house, fostering a culture of security-mindedness among developers during the internal review stages is invaluable. By implementing rigorous peer review practices, such as thorough examination of pull requests (PRs), teams can identify and mitigate potential vulnerabilities early in the development cycle. This internal layer of defense, although optional, complements external audits by ensuring that the codebase is as robust as possible from the onset. Encouraging developers to adopt an auditor's mindset helps bridge the gap between development and security, paving the way for a more resilient protocol.
Before the audit
Preparing thoroughly for an audit can significantly enhance its effectiveness and efficiency. A key step in this preparation is defining the scope of the audit at least two weeks in advance. This foresight allows auditors to familiarize themselves with the project's intricacies and tailor their strategies accordingly. It's crucial to adhere strictly to this predefined scope; any alterations can disrupt the audit's timeline and undermine the auditors' preparation, potentially affecting the quality of the audit. Additionally, providing a frozen commit of the codebase that remains untouched during the audit until the review of the fixes phase ensures consistency and clarity in the audit process. This approach minimizes confusion and allows for a focused and thorough examination of the code at its state during the time of freeze, leading to a more effective identification and subsequent remediation of vulnerabilities.
During the audit
A security audit is a collaborative endeavor involving both developers and auditors. To facilitate auditors in performing their duties, it's imperative to supply them with comprehensive details about your protocol's functionality, edge cases, integrations, and any areas of concern. Auditors aim to rigorously test every aspect, as errors frequently lurk in the most unforeseen places, but it is still helpful for them to know which areas the developers are most uncertain about.
Throughout the engagement, it is crucial to respond to the auditors' inquiries promptly and with precision to maximize the efficiency of the audit time and ensure a thorough understanding of the protocol.
After the audit
After an audit, your team has to assess whether the code is ready to be deployed to the blockchain. Sometimes, when systematic critical issues are found, the code is deemed not ready and needs to go back to the development stage. To determine if your code is ready, check whether the high/critical severity issues found are systematic, meaning they arise from a flawed architecture of your system and can't be easily fixed.
Depending on the findings, consider whether the test suite is complete. If the issues could have been found through testing, make sure to apply stronger QA before going live. If the audit yielded many critical issues, consider getting another audit.
After deployment, set up on-chain monitoring against invariants to ensure your team is quickly alerted of any abnormal behavior and set up an incident response protocol to act in case of a breach.
Deployments and upgrades
You’ve followed all the coding best practices and gone through one or multiple audits. Even if you've nailed the coding and your smart contract has passed all audits with flying colors, you still need to prepare for the launch day.
It's like preparing for a marathon: the race isn't won in training alone; you also need a race-day strategy. Covering all your bases during deployment is just as critical as the coding and auditing to make sure your smart contract runs smoothly and securely.
Deployment Scripts
We usually treat our smart contract code very carefully, write extensive tests for it, and get it reviewed (internally and externally). But the scripts that interact with these contracts, such as deployment scripts, are sometimes treated differently.
While it is true that they are not core to the functionality of the protocol, deployment scripts are what bring the protocol to life, and they often involve complex logic and interactions, especially for systems composed of multiple smart contracts that interact with each other.
Because of that, it is essential that you keep your scripts as simple as possible, use them inside your unit and integration tests (to deploy and initialize your contracts), and carefully review them.
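As an illustration, a minimal Foundry deployment script that stays simple enough to be reused from unit and integration tests could look like this; the contract names and constructor parameters are hypothetical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Script} from "forge-std/Script.sol";
import {MyToken} from "../src/MyToken.sol";
import {MyProtocol} from "../src/MyProtocol.sol";

contract DeployProtocol is Script {
    function run() external returns (MyToken token, MyProtocol protocol) {
        vm.startBroadcast();

        // Keep the ordering and parameterization explicit so the exact same
        // logic can be exercised from tests before it touches a live network.
        token = new MyToken("Example", "EXM");
        protocol = new MyProtocol(address(token));

        vm.stopBroadcast();
    }
}
```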
Avoid msg.sender
One common practice that we still see today is assigning roles for privileged actions (e.g. minting tokens, pausing the contract) to the deployer of the contract (msg.sender). We even did this in some of our contracts (e.g. Ownable) until Contracts 5.0. Now we ask for explicit assignments and recommend that all projects do the same, for two main reasons:
- Less error-prone. If you make explicit assignments in the constructor, you can't inadvertently grant extra permissions to the deployer.
- Works with CREATE2. Deterministic deployments are becoming more common in order to have the same address across supported EVM chains. If you use msg.sender, permissions will be assigned to the factory contract and locked there forever.
You can read more about this in the EEA Trust Specs.
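With OpenZeppelin Contracts 5.0, for example, the privileged account is passed in explicitly rather than defaulting to the deployer:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Ownable} from "@openzeppelin/contracts/access/Ownable.sol";

contract MyToken is Ownable {
    // The owner is an explicit constructor argument, not msg.sender, so a
    // deterministic deployment through a CREATE2 factory does not lock
    // ownership to the factory contract.
    constructor(address initialOwner) Ownable(initialOwner) {}

    function mint(address to, uint256 amount) external onlyOwner {
        // ... minting logic ...
    }
}
```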
Testing your deployments on a fork
It is a common practice for projects to deploy to testnets as the smart contracts mature, in order to manually test behavior, integrations, etc. While this is good practice and we encourage teams to keep doing so, teams should consider testnet usage a potential liability when the code is closed source: hackers can use testnet transactions to learn function signatures, allowing them to front-run users of the UI.
Besides using testnets, a sometimes overlooked step is testing your deployments on forked networks. Testnets might help you catch some issues, but they often differ a lot from mainnet (they don't have all the protocols deployed, liquidity profiles are very different, etc.); in short, the state of the chain is often very different. Running and testing your deployments on a forked network that has the same state as the chain(s) you plan to deploy to will save you some headaches on deployment day.
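For example, with Foundry you can point your tests (and deployment logic) at a fork of the target chain; the environment variable name here is illustrative:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

contract ForkedDeploymentTest is Test {
    function setUp() public {
        // Fork mainnet state so integrations see the protocols, balances, and
        // liquidity that actually exist on the chain you plan to deploy to.
        vm.createSelectFork(vm.envString("MAINNET_RPC_URL"));
    }

    function test_DeploymentAgainstForkedState() public {
        // ... run your deployment script here and assert on interactions
        // with already-deployed external protocols ...
    }
}
```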
Secure Upgrades
Upgrades have always been a debated topic in crypto, but they have proven to be extremely useful in the right context. They are controversial because of the immutability promise of smart contracts, which should still be the end goal for most protocols, but we can't deny that, especially during the early stages of a project, they are very convenient for fixing bugs and making tweaks while searching for product-market fit.
The most common way to add upgradability is to use proxies. But adding upgradability to your protocol doesn't come without risks: new vulnerabilities can be introduced in new versions of the contract, the upgrade itself can go wrong if not handled properly, and there's the additional risk that the upgrader keys get compromised if not handled carefully. Because of this, it is important that you consider a few things.
- Don’t use upgrades if you don’t need to. Perhaps this sounds obvious, but if there are contracts that are highly unlikely to require changes and are simple enough that you feel confident they don’t contain vulnerabilities, don’t introduce the complexity of upgrades unnecessarily (plus, your users will benefit from less gas overhead).
- Use a battle-proven standard for proxies. This includes UUPS and Transparent proxies. We have support for both of them in our Contracts library.
- Avoid storage collisions between versions of the implementation contract. This happens when the new version of the contract changes the storage layout in a way that overwrites existing data. We recommend you use one of our upgrade plugins that check for this before even proposing the upgrade. We have support for both Hardhat and Foundry.
- Make sure you correctly initialize your smart contracts. Upgradable contracts don’t usually have constructors (since the state is stored on the proxy and not the implementation). Make sure that you correctly initialize the state of the contract and don’t get frontrun by other users. You can leverage our initializer modifier for this (see the sketch after this list).
- Require multiple signatures to execute the upgrade. This can initially be a Safe multisig with a high enough threshold and eventually become a governor contract if you decide to decentralize your protocol.
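Putting several of these points together, a minimal UUPS implementation using OpenZeppelin's upgradeable contracts, with explicit initialization and owner-gated upgrades, might be sketched as follows (illustrative only):

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Initializable} from "@openzeppelin/contracts-upgradeable/proxy/utils/Initializable.sol";
import {UUPSUpgradeable} from "@openzeppelin/contracts-upgradeable/proxy/utils/UUPSUpgradeable.sol";
import {OwnableUpgradeable} from "@openzeppelin/contracts-upgradeable/access/OwnableUpgradeable.sol";

contract MyProtocolV1 is Initializable, UUPSUpgradeable, OwnableUpgradeable {
    uint256 public value;

    /// @custom:oz-upgrades-unsafe-allow constructor
    constructor() {
        // Lock the implementation contract so it cannot be initialized directly.
        _disableInitializers();
    }

    function initialize(address initialOwner) external initializer {
        __Ownable_init(initialOwner);
        __UUPSUpgradeable_init();
    }

    function setValue(uint256 newValue) external onlyOwner {
        value = newValue;
    }

    // Only the owner (ideally a multisig or governor) can authorize upgrades.
    function _authorizeUpgrade(address newImplementation) internal override onlyOwner {}
}
```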
Choosing the right deployment and upgrade stack
Dealing with all these concerns individually can be cumbersome, and usually not the best use of your time, especially if you are developing the first iterations of your protocol, where your focus is on putting things in the hands of users and achieving product market fit. Luckily there are tools that can help you follow best practices and achieve things more easily (e.g. Deterministic deployments, source code verification, etc.). That’s why we included Deploy as one of the core modules in Defender, to help projects ship faster and more securely.
Operational Security
Boom! You’ve made it through the deployment procedures and your smart contract infrastructure is up and running in production. The marathon has begun!
Now, what is the status of your security? Let’s check the dashboard to see the health indicators… Wait. You did plan for it, right?
Before touching web3
Web3 is just an extension of web2, so don’t forget traditional IT security: your team should have sufficient knowledge of private key infrastructure management and should know how to detect phishing attacks and scams.
Ensure traditional IT security
Traditional practices, like installing only secure, signed, or IT-approved software and using HTTPS and VPNs, should be in place even if you operate a decentralized protocol.
Your PKI must be secure and must minimize the chance that an adversary can gain access to any sensitive information.
For entities, pursuing a certified security standard such as SOC 2 is worth starting early.
Personnel web3 training
Using web3 can be daunting and complicated, even for technical experts from the web2 space. Don’t neglect the importance of training your team so they know the specifics: how to use wallets safely, how recovery phrases work, how to submit transactions, and which common phishing methods they may be targeted by.
Incident response plan
Based on your Threat Modeling and the risks identified After the audit, you can plan accordingly to ensure that your team is well prepared to handle any security threats that may come up. Using the identified potential threats, prepare a battle-ready Incident Response plan. Make sure you are able to handle security incidents effectively to mitigate loss of funds and maintain your reputation even if the worst case happens.
Define responsibilities
Make sure you have security policies that define a set of roles and responsibilities around the use of specific Incident Response mechanisms in your environment. Develop a high-level representation of the roles and responsibilities that various stakeholders will assume during an active security incident.
Vulnerability management protocol
Make sure you have detailed decision trees, communication plans, and specific guidance on opportunities for automation to minimize exposure for a wide range of scenarios.
Incident Response Drills
Execute live Incident Response simulations to assess your team’s ability to use the plan under conditions that are as close to real as possible.
Based on this assessment, improve your team’s performance, adjust the defined responsibilities, and iterate until you are confident that, when a real incident happens, you are a team where everyone knows what to do.
Plan monitoring in advance
Planning monitoring often means that, based on your Threat Modeling and the risks identified After the audit, you will need to iterate back to the Technical specifications and add monitoring requirements to them.
To save yourself that extra iteration, here are the most important techniques and best practices that you can start planning in the first specification pass.
Dashboard requirements
Make sure you have outlined all the visibility required over possible attack vectors. It is also worth thinking ahead about users, investors, and other stakeholders who might need access to data to drive their decisions.
Events requirements
Every executable method should emit an event in a format that is convenient for the monitoring setup (see the sketch below).
The meaning of these events and their mapping to the Threat Modeling should be clear and easy to use. Plan indexing accordingly so that the required data can be retrieved.
Wherever possible, avoid having to build separate indexes and databases off-chain in order to provide security monitoring. This can be challenging, for instance, when dealing with invariant monitoring in cross-chain bridges that use mint/burn logic.
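For example, emitting an event with indexed parameters from every state-changing function lets monitors filter by user or asset directly from logs; the bridge contract below is hypothetical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract Bridge {
    // Indexed parameters let off-chain monitors filter by user or asset
    // without maintaining a separate index or database.
    event Deposited(address indexed user, address indexed asset, uint256 amount, uint256 nonce);

    uint256 public nonce;

    function deposit(address asset, uint256 amount) external {
        uint256 currentNonce = ++nonce;
        // ... token transfer and bookkeeping would happen here ...
        emit Deposited(msg.sender, asset, amount, currentNonce);
    }
}
```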
Unified Interfaces
External interfaces, such as methods and events, must not only be clearly defined following the Code clarity guidelines but also checked for signature collisions across the whole infrastructure and, where applicable, its dependencies. Compilers will catch signature collisions within the same contract, but not across an infrastructure, which can make monitoring tasks more challenging later on.
Make sure that the hexadecimal signatures of fundamentally different interfaces do not collide, and that fundamentally different interfaces are named differently across the whole infrastructure.
Ensure that parameters and indexed arguments are consistent across the infrastructure for interfaces that share the same signature.
Deployments registry
Your deployed infrastructure, including all smart contracts and dependencies defined in the Technical specifications, should be accessible in the form of a registry that maximizes off-chain monitoring’s ability to fetch and label your addresses and to retrieve all required application binary interfaces (ABIs).
Monitoring infrastructure
Monitoring infrastructure should be no less reliable than the stakes at risk demand. It should fulfill your specific monitoring requirements, such as low latency for time-critical notifications and public, autonomous redundancy for community awareness.
For non-community monitoring, it should expose as little information as possible about the setup and monitor configuration.
Metrics and methodology
Monitoring smart contracts is a challenging task, and new attack vectors may arise that you have to adapt to. You need enough sensitivity to catch real alerts, yet low enough noise not to miss the signal that matters.
Defining performance metrics and tracking them will help you stay aware of how your monitoring performs over time and after the changes you make to adapt to this never-ending challenge.
Deploying Monitoring
Set up monitoring as early as possible, so that no potential threat goes undetected.
Ideally, prepare your monitoring to launch before the contract deployments, and make sure you’ve set up all the infrastructure required by the Incident response plan, ready to target the production environment.
For non-deterministic deployments, be aware of, and manually check, all the attack vectors that could be exploited before your monitoring system is activated.
For deterministic deployments, deploy the monitoring stack before deploying the contracts.
Monitor for new deployments
Prepare your monitoring to follow the Deployments registry in an automated manner, to the extent possible in your stack.
Notify immediately when a new piece is added to your infrastructure, and add that new infrastructure to the monitoring stack immediately.
Continuously improve
Set up a coverage metric for your smart contract infrastructure and expect to improve it continuously. Even for immutable contract infrastructure, monitoring is a continuous opportunity for improvement, as protocols undergo hard forks and state changes, and new attack vectors may arise.
Track and improve your Metrics and methodology to ensure the maximum level of security.