RFC: Hydranode Networking Scheduler (v1)
Abstract
This document describes the packet scheduler used in Hydranode core. The scheduler performs fair packet queueing and bandwidth management between multiple networking modules with varying priorities.
Table Of Contents
1. Priority Scores (PC)2. Packet and Connection Scheduling
2.1. Connections Scheduling
2.2. Downstream Scheduling
2.3. Upstream Scheduling
1. Priority Scores (PC)
Every request, be it a connection, or a packet, has a priority score attached to it. The base score starts out at 0. The lowest possible PC is -100, the highest is +100. Scores that exceed that range are truncated to within that range.
The following modifiers apply to PC:
- Module priority
This is set by the module itself upon loading, within range -100 to +100. P2P modules are required to start out at PC 0, custom modules may use different scores. For example, hnshell will start out at PC 100 (since we want the shell to be fast).
- Socket priority
By default, socket priority is set to 0, but can be adjusted on runtime if needed. This functionality is provided for maximum customizability and flexibility - it is possible that a module would want a high-priority socket for some communication, while lower priority sockets for other kind of communications. This priority also ranges from -100 to +100. Thus a -100 priority module could at best get a 0 priority socket (-100 + (+100) = 0).
- Packet priority
Individual packets can also be set a priority. This is not a free-form score - two types of packets are defined: data packet and overhead packet. Overhead packets have extra score modifier +10 to bring them slightly ahead of others. Data packets are used for trasmitting actual file data, while overhead packets are used for everything else. As a general rule, overhead packets are smaller and more useful in overall view (e.g. searching for more sources, responding to other clients requests etc), so they should be slightly prioritized. However, this shouldn't affect them too much, otherwise we'll end up sending only overhead packets and no real data. Thus +10 seems like a good modifier.
- Module's up/down ratio
The basic concept is to figure out how "useful" a module has been to us. Modules that have higher download / upload ratio should also get higher priority. Thus, first calculate how large percent of the total upload and total download bandwidth has the module used:totalup% = module_upload_bytes * 100 / total_upload_bytes totaldn% = module_dload_bytes * 100 / total_dload_bytes
Now we get two percents, which are independant of the actual data amounts having been transmitted. If we substract totalul from totaldn, we get a ratio - in range -100 to +100 - exactly where we want it. Thus:Total upload: 250mb total dload: 500mb module with 50mb upload usage and 100mb download usage gets: totalup% = 50*100/250 = 20% totaldn% = 100*100/500 = 20% ratio = 20 - 20 = 0.0 module with 100mb upload and 50mb download usage gets: totalup% = 100*100/250 = 40% totaldn% = 50*100/500 = 10% ratio = 10 - 40 = -30.0 module with 100mb upload and and 350mb upload usage gets: totalup% = 100*100/250 = 40% totaldn% = 350*100/500 = 70% ratio = 70-40 = 30.0 Check: 0.0 + -30 + 30 = 0.0 -> correct
This score is added to the base score. The result is that modules that have used the given bandwidth more effectivly to our advantage get higher priority. This also ensures that new modules start out at relativly fair state compared to old modules. If we didn't involve percents in our calculations, new modules would have no chance competing for bandwidth.
- Request wait time Nothing fancy here - +5 score for every 100ms waited. This ensures even lowest-priority request gets through in 4 seconds at most - which is the maximum acceptable wait time for a request.
2. Packet and Connection Scheduling
2.1. Connections Scheduling
Frontend requests a new connection from scheduler. The scheduler checks if there are any free connection slots open right now. If there are any, it grants the request. If there are no free connections available at this time, the request is marked pending. Every time a connection is lost (disconnected), the pending connections queue must be scanned for pending requests, and the highest ranking request granted.
2.2. Downstream Scheduling
Whenever incoming data is detected in one of the scheduled sockets, the scheduler must verify that there is indeed free bandwidth to receive the data sent to us. If there is no free bandwidth at the time, the socket must be inserted into readable sockets queue. Whenever additional bandwidth frees up, the data is read out from the socket and buffered internally within the scheduler. After that, notification is submitted to the owner of the socket that the data is ready to be retrieved. Basically, we will have a method that is called during each event loop, and which performs the following operations:
- Divide the available bandwidth at this moment in time between the number of readable sockets. The division is based on the socket priorities, e.g. highest priority sockets get more bandwidth and lower priority sockets get less bandwidth.
- Based on the previous calculation, start reading out data from the readable sockets, N bytes from each socket, where N is the number calculated in previous step. If not that much data was available from the socket, the remainder of the spare bandwidth must be re-divided between the remainding sockets.
Repeat until either all readable sockets have been read from. Note that ideally, we should not have to abort the above loop at any time because of running out of spare bandwidth, since we'r dividing the bandwidth always fairly between all sockets, thus each pending socket should at least get some data read from. However, it is possible that this thing is running at very high resolution, and the data amounts here are too small to be divided evenly (for example, 3 bytes between 5 sockets). Thus it is up to implementation to deal with the situation. However, it is recommended that the scheduler wouldn't run on so high resolutions, due to the performance hit inherent from this.
2.3. Upstream Scheduling
Whenever a request to send out data is submitted to scheduler, the packet is inserted into scheduler internal buffer, and the function should immediately return to the caller. During each event loop, the packet queue is scanned for pending packets, and the following done:
- Sort the pending packets queue based on the packets priorities.
- Divide all the available bandwidth between all the pending packets, based on the packets priorities. Thus, the highest scored packet gets most of the bandwidth, next one gets less, and so on, until the least important packet.
- If a packet doesn't want as much bandwidth that could be appointed to it, only as much as needed is appointed to that packet, and the remainder of the bandwidth re-divided between the remainder of the packets.
It is possible that not all of the appointed amount could be transmitted, either because the receiver couldn't receive so fast, or because of local networking problems. In that case, also the remainder of the bandwidth must be re-scheduled between the remainder of the packets.
