The explosive growth of mobile devices and continued advances in technology are ushering in an era in which every device can have its own connection to the Internet. Research firms such as Gartner and ABI Research estimate that by 2020 around 30 billion devices will connect wirelessly to the Internet, forming what is commonly called the Internet of Things (IoT). The question now is: how does one design a system that can handle a diverse set of devices that are most likely geographically separated?
This multi-part series explores the different aspects of designing a distributed system optimized to support this multitude of devices: the characteristics of distributed systems, the fallacies commonly associated with them, and suggestions on how to handle known challenges.
A distributed system is essentially a set of software and hardware components that coordinate with each other via messages on a network to achieve a common goal. That goal could be complex, such as processing big data or running a simulation such as a massively multiplayer online (MMO) game, or something more nuts and bolts, such as a sensor network for tsunamis or earthquakes that triggers alerts when certain events occur.
The key aspect of a distributed system is the interconnection between its components via some network. This relationship brings about several properties that should be considered:
- Concurrency – Members of the distributed system do their work simultaneously, in parallel with one another.
- No global clock – There is no single clock that every member can read. Clocks on different machines can only be synchronized approximately, because even within one network, messages are delayed by varying amounts.
- Independent failures – Any member of the distributed system may fail at some point while the other parts of the system keep running.
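One common way to cope with the absence of a global clock is to order events with logical counters rather than wall-clock time. The sketch below is a minimal Lamport-style logical clock; the class and method names are illustrative assumptions, not something this series prescribes.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Logical clock: orders events without any shared wall-clock time."""

    time: int = 0

    def tick(self) -> int:
        # A local event advances the counter by one.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Sending is itself an event; the timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time: int) -> int:
        # Jump past the sender's timestamp so the receive is ordered after the send.
        self.time = max(self.time, msg_time) + 1
        return self.time


a, b = LamportClock(), LamportClock()
a.tick()             # member a does some local work
stamp = a.send()     # a sends a message carrying its timestamp
b.receive(stamp)     # b's clock moves ahead of the timestamp it received
```

The counters say nothing about real time. They only guarantee that a message's receipt is numbered later than its sending, which is often enough to reason about cause and effect between members.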
A system is considered distributed when it has the following goals and characteristics:
- Resource sharing – The ability of any member of the distributed system to use any resource within the system – be it hardware, software, data or service – as long as that member is authorized.
- Openness – The system can be extended because its components expose detailed, well-defined interfaces and communicate using agreed standards within the overall architecture.
- Concurrency – The proper management of concurrent processes, including how shared components are accessed and updated. It also covers the integrity of the system: each part should remain in a consistent state.
- Scalability – The ability of the system to handle more clients, accept more components, and increase its overall processing capacity or speed.
- Fault tolerance – The ability of the system to handle errors gracefully without the system as a whole failing. Errors can originate in hardware, software or the network. Redundancy is a common way of achieving fault tolerance.
- Transparency – The perceived view of the users that they are working with a single system rather than multiple cooperating systems.
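As a rough illustration of fault tolerance through redundancy, the sketch below tries a list of replicas in turn, so the caller sees a single call even when one replica fails. The replica functions and the `AllReplicasFailed` exception are hypothetical stand-ins for real network calls, and the failure is simulated.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when no replica could serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    # Try each redundant replica in order; a single failure does not fail the call.
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replicas failed")


def broken_replica(request: str) -> str:
    # Simulated hardware or network fault (an assumption for this example).
    raise ConnectionError("replica unreachable")


def healthy_replica(request: str) -> str:
    return f"ok: {request}"


result = call_with_failover([broken_replica, healthy_replica], "read sensor 7")
```

Hiding the retry inside one function also hints at transparency: the caller works with what looks like a single system rather than several cooperating ones.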
Distributed systems also offer several advantages over a single centralized system:
- Price/Performance ratio – A collection of cooperating processors can provide higher levels of performance than a single powerful processor, which means more processing power for less cost.
- Distributed computing – Some networks are distributed by their very nature.
- Increased reliability – With multiple machines, the system as a whole is less likely to fail when some of its components fail.
- Incremental growth – The ability to scale up as the load requirements increase. This has economic benefits as well because you only invest when there is a need.
- Sharing of resources/data – For certain kinds of applications, sharing data is essential for cooperative processing. For example, cell phone towers are a shared resource that allows multiple mobile phones to share the connection.
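The reliability claim can be made concrete with a little arithmetic. The sketch below assumes that each machine is down independently with the same probability; that assumption, the function name and the 1% figure are all illustrative, and correlated failures (a shared power supply, for instance) would make the result optimistic.

```python
def all_replicas_down(p_down: float, replicas: int) -> float:
    # Probability that every replica is down at once, assuming independent failures.
    return p_down ** replicas


# Chance of total outage with 1, 2 and 3 machines, each down 1% of the time.
outage_by_size = {n: all_replicas_down(0.01, n) for n in (1, 2, 3)}
```

Under these assumptions each extra machine multiplies the chance of a total outage by the per-machine failure probability, which is why redundancy and incremental growth work well together.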
Well-known examples of distributed systems include:
- World Wide Web – One of the largest examples of a distributed system, with thousands of servers and network infrastructures spread all over the world. It is a network of networks that are independent of each other but can communicate using standard protocols such as HTTP and HTTPS.
- HPC – High-performance computing (HPC) allows researchers to solve complex problems and run simulations using infrastructure optimized for high bandwidth, low latency and large computing capacity. An HPC cluster can be considered a distributed system because it consists of multiple nodes that communicate with each other via messages and work together on a common task.
- Telecommunications infrastructure – Another great example with all the characteristics of distributed systems: it’s open, transparent, fault tolerant, scalable, and supports resource sharing and concurrency.
- Peer-to-peer networks – A peer-to-peer (P2P) network is a distributed system that needs no central coordinator. Each node (peer) makes its resources available to the other nodes within the network. A more recent application of P2P computing is digital currency such as Bitcoin, which has no central repository and processes payments in a peer-to-peer fashion.
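To show what "no central coordinator" can look like in practice, here is a toy peer-to-peer lookup in which each peer answers from its own resources or forwards the question to its neighbors. The `Peer` class, the hop limit `ttl` and the sample data are assumptions made for this sketch; it is not how Bitcoin or any particular P2P network works.

```python
from __future__ import annotations


class Peer:
    """A node that both serves its own resources and forwards lookups."""

    def __init__(self, name: str, resources: dict[str, str]) -> None:
        self.name = name
        self.resources = resources
        self.neighbors: list[Peer] = []

    def connect(self, other: Peer) -> None:
        # Links are symmetric: each peer learns about the other.
        self.neighbors.append(other)
        other.neighbors.append(self)

    def lookup(self, key: str, ttl: int = 3, seen: set[str] | None = None) -> str | None:
        # Answer locally if possible, otherwise ask neighbors; no central index is consulted.
        seen = seen if seen is not None else set()
        if self.name in seen:
            return None  # this peer was already asked during this lookup
        seen.add(self.name)
        if key in self.resources:
            return self.resources[key]
        if ttl == 0:
            return None  # hop limit reached; stop forwarding
        for peer in self.neighbors:
            found = peer.lookup(key, ttl - 1, seen)
            if found is not None:
                return found
        return None


a = Peer("a", {})
b = Peer("b", {})
c = Peer("c", {"song.mp3": "bytes held by c"})
a.connect(b)
b.connect(c)
found = a.lookup("song.mp3")  # a asks b, and b asks c
```

Peer `a` never learns where the resource lives ahead of time; the answer simply travels back along the chain of neighbors. Production networks typically replace this naive forwarding with more efficient routing schemes.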
In the next article in this series, we will discuss the fallacies of distributed systems.
The Empower and Protect Blog brings you cybersecurity and information technology insights from top industry experts at Telos.