Cardano staking pool architecture

This is a short, high-level overview of the software and hardware infrastructure at ADA Point Pool.

The overall goal of our pool is to give people a quality, low-fee pool to delegate to, one that provides the maximum return on investment (ROI) every epoch.

The strategy for achieving that goal is to have the highest possible uptime at a competitive price per block. In other words, high reliability: no downtime, so that we produce a block every time we are assigned one.

Being a highly reliable staking pool means we have to take care of many services in our stack. Let's list some: electricity, hardware components, reliable internet, backups, operating system, custom scripts, Cardano node.

Managing all of them ourselves would be very expensive, so we follow a common industry practice to lower some of these costs: we use an IaaS provider. That way we delegate some of the responsibilities to companies whose core business is providing basic computing services to their clients.

Service stack

  • IaaS provider
    - Electricity
    - Hardware components
    - Reliable internet
    - Backups
  • ADA Point Pool
    - Operating system
    - Custom workflow scripts
    - Monitoring
    - Cardano node

IaaS services

Plus [+]

  • no electricity downtime, thanks to backup batteries and generators
  • hardware is virtualized, so if a component fails there is no downtime
  • data centers are usually located on, or well connected to, the internet backbone, so network latencies are much better
  • automated backups
  • popular operating systems can be quickly installed and booted
  • driver and other hardware optimizations happen in the background
  • server-grade hardware: AMD Epyc, ECC RAM, …
  • relatively low cost for what you get

Minus [-]

  • not directly in line with the decentralization effort
  • no customization at this level; for example, you cannot power your server with solar power
  • if the reliability of these services is bad, you cannot do anything about it directly

In our case the pluses outweighed the minuses.

To address the negative sides: decentralization-wise, we have a separate backup system. We have a spare IaaS provider and, if that fails, a physical machine on which we can run our nodes if the need arises. The need can arise if the IaaS provider's reliability becomes bad, or from other social pressures that come with running decentralized nodes.

ADA Point Pool services

For a good, basic, low-level, step-by-step setup you can follow Chris Graffagnino's awesome tutorial. Here we describe some of the things we do beyond that.

We are running a server version of Ubuntu 20.04.

There are other important things that pool operators have to take care of at the operating system level:

  • Keep the operating system updated with the latest security patches
  • Configure the firewall properly; we use ufw
  • DO NOT install or add any services that are not crucial to running the node
  • Set correct operating system permissions. This limits what you can break unintentionally, and also what an attacker can do in your system
  • Use SSH keys to log in to the system
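
As a minimal sketch of the firewall point above, the snippet below builds (but does not run) a list of ufw commands for a block-producing node that accepts node traffic only from its relays. The relay addresses, node port 3001 and SSH port 22 are illustrative assumptions, not our real settings.

```python
from typing import List


def ufw_rules_for_producer(relay_ips: List[str], node_port: int = 3001,
                           ssh_port: int = 22) -> List[str]:
    """Return ufw commands that deny everything except SSH and relay traffic."""
    rules = [
        "ufw default deny incoming",
        "ufw default allow outgoing",
        # Rate-limit SSH; key-only login is configured in sshd, not in ufw.
        f"ufw limit {ssh_port}/tcp",
    ]
    for ip in relay_ips:
        # Only the listed relays may reach the node port.
        rules.append(f"ufw allow proto tcp from {ip} to any port {node_port}")
    rules.append("ufw enable")
    return rules


if __name__ == "__main__":
    # Hypothetical private addresses of two relays.
    for cmd in ufw_rules_for_producer(["10.0.0.2", "10.0.0.3"]):
        print(cmd)
```

Printing the commands instead of executing them lets you review the rule set on a test machine before applying it by hand.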

Another useful tip is to use a mobile SSH client on the go. We use JuiceSSH. It is really handy, especially after stability upgrades: if you get an alarm, you can quickly check that everything is in order wherever you are. In an emergency you can also do a quick fix or revert so you don’t miss any blocks.

We currently use CNTools and are really satisfied with it. We use it for:

  • Backing up and encrypting the private wallet and pool keys, which we then move to offline storage
  • Wallet management
  • Pool management
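
To illustrate the archive-then-encrypt idea from the first bullet, here is a small sketch. It is not how CNTools does it internally; the file names are hypothetical and the gpg command is only built, not executed, so you can inspect it first.

```python
import tarfile
from pathlib import Path
from typing import List


def archive_keys(key_dir: Path, archive_path: Path) -> List[str]:
    """Pack every file in key_dir into a tar archive; return the archived names.

    archive_path should live outside key_dir so the archive does not
    try to include itself.
    """
    names = []
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(key_dir.iterdir()):
            if path.is_file():
                tar.add(path, arcname=path.name)
                names.append(path.name)
    return names


def gpg_encrypt_command(archive_path: Path) -> List[str]:
    """Build (not run) a gpg command that encrypts the archive with a passphrase."""
    encrypted = archive_path.with_name(archive_path.name + ".gpg")
    return ["gpg", "--symmetric", "--cipher-algo", "AES256",
            "--output", str(encrypted), str(archive_path)]
```

After encrypting, only the encrypted file should be moved to offline storage, and the unencrypted archive should be deleted from the machine.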

We use custom configs with optimized topology files. We test the latest node version and switch to it once it has run fine for an epoch or two (depending on the changelog).

Updating our node to the latest version!!

We are always checking the Cardano node GitHub repository and release logs.

We do not install compilers and build tooling on our server stack. Why? Compiling is very compute intensive, and it can introduce latency into the system if, for example, the node is processing network data at the same time. You can compile your binaries on a cloned virtual system with the same basic instruction set and transfer them to the live server.
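
One simple check worth doing after such a transfer is comparing checksums of the built and the deployed binary. The sketch below is only an illustration of that idea; in practice the two files sit on different machines, so you would compute the digest on each side and compare the hex strings.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(built: Path, deployed: Path) -> bool:
    """True if both files have identical contents according to SHA-256."""
    return sha256_of(built) == sha256_of(deployed)
```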

Note: This can/will change in the future based on demand and our projects.

  • Main node
    - Location: Nuremberg
    - 8 vCores AMD Epyc
    - 8 GB ECC RAM
    - 10 Gbit connection
    - SSD drive 80GB with backups
    - Description: This is our MAIN node and it produces blocks. It is connected only to our 2 Relay nodes and not directly to any other nodes.
  • 2 Relay nodes
    - Location: Nuremberg
    - 3 vCores AMD Epyc
    - 4 GB ECC RAM
    - 10 Gbit connection
    - SSD drive 40GB with backups
    - Description: Our Relay nodes. Each is a proxy between the MAIN node and the rest of the network. If one Relay node is compromised, the other one takes over or we spin up additional Relay nodes.
  • 1 Jormungandr testnet node
    - Location: Nuremberg
    - 3 vCores AMD Epyc
    - 4 GB ECC RAM
    - 10 Gbit connection
    - SSD drive 40GB with backups
    - Description: We still run a Rust Jormungandr testnet node so that IOHK can try out different network settings.
  • 1 Cardano db sync node
    - Location: Nuremberg
    - 8 vCores AMD Epyc
    - 8 GB ECC RAM
    - 10 Gbit connection
    - SSD drive 80GB with backups
    - Description: The cardano-db-sync (with cardano-graphql) node is for internal use in our other Cardano-related projects. It syncs all the network information into a PostgreSQL database and exposes a GraphQL interface to access the data.
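
The MAIN/Relay split described above could be expressed as topology files roughly like this. It is a sketch that assumes the legacy (non-P2P) cardano-node topology layout with a `Producers` list; all addresses, ports and file names are made up for illustration.

```python
import json
from typing import Dict, List, Tuple

Peer = Tuple[str, int]  # (address, port)


def topology(peers: List[Peer]) -> Dict[str, list]:
    """Render peers in the legacy `Producers` topology layout."""
    return {
        "Producers": [
            {"addr": addr, "port": port, "valency": 1} for addr, port in peers
        ]
    }


def build_topologies(main: Peer, relays: List[Peer],
                     public_peers: List[Peer]) -> Dict[str, Dict[str, list]]:
    """Return one topology per node, keyed by a hypothetical file name."""
    # The MAIN node talks to our own relays and to nothing else.
    files = {"main-topology.json": topology(relays)}
    for i, relay in enumerate(relays, start=1):
        # Each relay talks to MAIN, the other relay(s) and public peers.
        others = [r for r in relays if r != relay]
        files[f"relay{i}-topology.json"] = topology([main] + others + public_peers)
    return files


if __name__ == "__main__":
    files = build_topologies(
        main=("10.0.0.1", 3001),
        relays=[("10.0.0.2", 3001), ("10.0.0.3", 3001)],
        public_peers=[("relay.example.org", 3001)],
    )
    for name, content in files.items():
        print(name)
        print(json.dumps(content, indent=2))
```

Generating the files from one description keeps the "MAIN only talks to relays" rule in a single place, which makes it harder to expose the block producer by accident.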

There are always some scripts that DevOps people (sysadmins, IT) need in order to automate tasks so they do not have to be repeated by hand every time. These should all be tracked with a version control system like git, so that when we update scripts we can track the changes and revert if something goes wrong. It also makes it easier to transfer them to a new computer.

Monitoring

Monitoring means recording some parameters in the stack so that we can make informed decisions to improve it further and bring it even closer to our strategy of having awesome uptime.

For example, if we record the state of the node at a resolution of 10 s, we can see what the node was doing when its block time came, and maybe see what went wrong if the block was not produced.

Given the example above, this is one of the most important parts of the stack, even though it is not directly linked to uptime. It is more of a feedback loop that lets us improve overall system stability.
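
The 10-second example can be sketched as a lookup over recorded samples: given a block time, find the last state recorded before it. The state labels and timestamps below are hypothetical; this only shows the lookup, not how the samples are collected.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Sample = Tuple[float, str]  # (unix timestamp, node state label)


def state_at(samples: List[Sample], when: float,
             max_age: float = 10.0) -> Optional[str]:
    """Return the most recent state recorded at or before `when`.

    `samples` must be sorted by timestamp. None is returned when no sample
    exists within `max_age` seconds before `when`, which means the
    monitoring itself had a gap at that moment.
    """
    times = [t for t, _ in samples]
    i = bisect_right(times, when) - 1
    if i < 0 or when - samples[i][0] > max_age:
        return None
    return samples[i][1]


if __name__ == "__main__":
    recorded = [(0.0, "synced"), (10.0, "synced"),
                (20.0, "restarting"), (30.0, "synced")]
    # What was the node doing when a block was due at t = 25 s?
    print(state_at(recorded, 25.0))
```

Returning None for a gap is a deliberate choice: a missing sample around a missed block is itself a finding worth investigating.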

There are multiple levels of monitoring:
- Our IaaS provides basic hardware and network logs
- The operating system has its own logs
- The Cardano node has its own logs, defined in its config file

On top of that we use Grafana + Prometheus. You can visualize your logs and metrics there and add alerts if needed.
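
In practice an alert like "the chain tip stopped advancing" would be defined in Prometheus or Grafana itself. The sketch below only illustrates the logic behind such an alert by comparing two scrapes in the Prometheus text format. The metric name is an assumption and may differ between node versions.

```python
from typing import Dict

# Assumed metric name; check what your node version actually exposes.
BLOCK_METRIC = "cardano_node_metrics_blockNum_int"


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse un-labelled `name value` lines of the Prometheus text format."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        parts = line.split()
        if len(parts) < 2 or "{" in parts[0]:
            continue  # labelled or malformed samples are ignored in this sketch
        try:
            metrics[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return metrics


def chain_stalled(previous: str, current: str,
                  metric: str = BLOCK_METRIC) -> bool:
    """True if the block height did not grow between two scrapes."""
    before = parse_metrics(previous).get(metric)
    after = parse_metrics(current).get(metric)
    if before is None or after is None:
        return True  # a missing metric is treated as a problem too
    return after <= before
```

How far apart the two scrapes should be is up to you; the interval needs to be long enough that a healthy node would normally have seen at least one new block.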

Final thoughts

A couple more thoughts about software maintenance.

We believe it is always best to keep things as simple as possible. Do not create 4 tiers of caching with different settings. Also, if something brings no improvement, or only a minor one, just throw it away, even if you spent a lot of time on it.

Test on testing machines. Never change things on production that have not been tested properly. Why change something that works as it should? The only exception is an emergency, but even then you have to think things through and have a fallback plan in case things do not behave as you planned.

Our ITN future prediction (came true)

We predict that in the future the node software will be mature enough not to need much of the tooling described here. Some parts will still be nice to have, like fast bootstrapping, which works as a caching system. The barrier to entering the pool operator game will be much lower, and fees should be smaller than now.

/remindme 1 month from now

UPDATE: This came true. The ITN became very stable.

We are really happy with our current architecture and with the fact that, objectively, we have achieved our defined goal: a high-quality pool at low fees.

In the future we will write more about specific sections of this article, and more about financial incentives for pools and other game theory.

Fin

About the author: my nick is Gwearon. I am a computer programmer with about two decades of programming experience. My interests are functional programming, smart contracts and math.

Disclaimer

This article comes as is, with no guarantees. Use it at your own good judgement.

Who are we?

Providing secure staking rewards for institutions and individuals. Delegate & Forget.

Let us know if you find any mistakes.

Happy staking,

AdaPointPool
https://adapointpool.com
info@adapointpool.com

