Nimble Streamer performance tuning

Nimble Streamer software media server is widely used by our customers for high-load projects, handling thousands of simultaneous connections and tens of gigabits per second on each server.
Even though Nimble Streamer can process a lot of connections with default settings, you may need to tune both the server and the OS settings in order to handle a significant number of simultaneous users.

This article will guide you through tuning basics that will help you prepare your server for heavy-duty streaming.

Nimble Streamer worker threads tuning

A large number of simultaneous connections requires an appropriate number of worker threads in the Nimble Streamer instance. Each thread can handle thousands of users, but it has its limits. So here are the config parameters that help you increase that number.

Note that server settings are stored in the /etc/nimble/nimble.conf file. To apply config changes, you need to restart the Nimble instance by running the "sudo service nimble restart" command. To learn more about the config file and its parameters, you may read this article.

The following parameters can be used for tuning.
  • worker_threads defines the number of threads for handling incoming HTTP connections; it's "1" by default.
  • rtmp_worker_threads defines the number of worker threads for handling RTMP connections. It's 4 by default.
  • rtsp_worker_threads works the same way as rtmp_worker_threads but applies to RTSP streaming.
  • websocket_live_worker_threads defines the number of worker threads for WebSockets, which are used in SLDP low-latency real-time streaming.
  • dvr_transmuxer_threads is responsible for transmuxing stored DVR content into outgoing streams. It's 1 worker thread by default; if you use DVR playback heavily, you should increase this number - check the further analysis description below.
  • transmuxer_threads handles VOD transmuxing from the file system. It's 1 by default; run a performance analysis to see when you need to increase it.
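Putting these together, a nimble.conf fragment might look like this. The values below are purely illustrative, not recommendations for your load; only the parameter names come from the list above:

```
# /etc/nimble/nimble.conf (fragment) - illustrative values only
worker_threads = 4
rtmp_worker_threads = 4
rtsp_worker_threads = 4
websocket_live_worker_threads = 2
dvr_transmuxer_threads = 2
transmuxer_threads = 2
```

Remember to restart Nimble after any change, as described above.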

Each worker thread can handle from 2500 up to 5000 connections, so your initial values for these parameters can be calculated from the viewer count. However, additional bandwidth consumption will require more worker threads to process it. So you need to make a few more steps to tune your instance.
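As a quick sanity check of that rule, here is the rough arithmetic; the viewer count is a hypothetical example:

```shell
# Rough starting value for a *_worker_threads parameter: expected viewers
# divided by per-thread capacity (2500 here, the conservative end of the
# 2500-5000 range), rounded up.
viewers=20000
per_thread=2500
threads=$(( (viewers + per_thread - 1) / per_thread ))
echo "$threads"   # prints 8
```

Treat the result as a floor, not a final answer - the steps below refine it.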

The next step is to take a look at the "htop" tool. If some thread consumes too much CPU, you should add a new thread regardless of the calculated capacity. E.g. if you have 2000 connections and a thread consumes 90% CPU for some reason, just add a couple more threads and see what changes.
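Besides htop, per-thread CPU usage can be listed with ps. The process name "nimble" below is an assumption about your setup - adjust it if your process is named differently:

```shell
# Per-thread CPU usage for the Nimble process, busiest threads first.
# "nimble" is an assumed process name; the command falls back to the
# current shell's PID so it still runs if Nimble isn't up.
pid=$(pgrep -x nimble | head -n 1)
ps -Lp "${pid:-$$}" -o tid,pcpu,comm --sort=-pcpu
```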

Another step is to check request and response times. E.g. request an HLS chunk of a live stream, then compare the response time to the chunk download time. If the time to first byte exceeds the chunk download time, this probably means your worker thread has a long queue and you need to increase the number of worker threads.
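One way to make that comparison is curl's timing output. CHUNK_URL is a placeholder - point it at a real HLS chunk of your stream; a local file:// URL is used here only so the command runs anywhere:

```shell
# Compare time-to-first-byte (server response) with total transfer time.
# If TTFB approaches or exceeds the transfer portion, the worker thread
# is likely queueing requests.
CHUNK_URL="file:///etc/hostname"   # placeholder - use your chunk URL
curl -s -o /dev/null \
  -w 'TTFB: %{time_starttransfer}s  total: %{time_total}s\n' \
  "$CHUNK_URL"
```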

Now, with the new parameter values in place, check whether any of your viewers have problems accessing your streams. If you have low bandwidth usage but your viewers complain about accessibility, you should increase the thread count for the corresponding worker type.

For DVR and VOD operations, you may also check disk usage. If htop shows "D" in the S (state) column, the thread may be too busy working with the file system, and you should check that as well.
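To list threads currently stuck in uninterruptible sleep (the "D" state) without scrolling through htop, something like this works:

```shell
# Threads in "D" (uninterruptible sleep) are usually blocked on disk I/O.
# Prints the header line, then any matching threads.
ps -eLo pid,tid,stat,comm | awk 'NR == 1 || $3 ~ /^D/'
```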

TCP/IP stack tuning for Linux

Transferring data over a large number of outgoing connections puts a heavy load on your server network. So you need to increase the memory available for network buffers.

Add the following lines to the /etc/sysctl.conf file:
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 4194304 16777216
Then run the
sudo sysctl -p
command to apply the changes.
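After applying the settings, you can read the values back from the kernel's /proc interface (equivalent to querying them with sysctl) to confirm they took effect:

```shell
# Read the current TCP write-buffer settings back from the running kernel
cat /proc/sys/net/core/wmem_max
cat /proc/sys/net/ipv4/tcp_wmem
```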

You should also update network interface settings like this:
ifconfig eth0 txqueuelen 10000
This increases the transmit queue length to suit a 10 Gbps link.
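On distributions where ifconfig is no longer installed, the iproute2 equivalent does the same thing (the interface name eth0 is assumed, as above):

```shell
# iproute2 equivalent of the ifconfig command above (needs root)
sudo ip link set dev eth0 txqueuelen 10000
# Read it back; the interface line should include "qlen 10000"
ip -o link show dev eth0
```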

UDP/multicast tuning for Linux

As in the TCP/IP case, UDP streaming also needs some updates.
Add the following lines to /etc/sysctl.conf:
net.core.rmem_max = 1048576
net.ipv4.udp_mem = 8388608 12582912 16777216
Then run
sudo sysctl -p
Also, update the network interface settings. Run this command for a 10 Gbps network:
ifconfig eth0 txqueuelen 10000

Nimble Streamer RAM cache tuning

Nimble Streamer uses a RAM cache for live stream transmuxing. If you have multiple outgoing streams, you'll need to increase the cache size. Read this article for more details about cache setup.

Speaking of memory: if you have more than 60 GB of RAM, we'd recommend reserving some of it for the Linux OS.
Run this command to reserve about 10 GB:
sysctl vm.min_free_kbytes=10240000
Or add the corresponding value to the /etc/sysctl.conf file.
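To make the reservation persistent across reboots, the corresponding /etc/sysctl.conf line is:

```
# /etc/sysctl.conf - reserve ~10 GB for the OS (value is in kB)
vm.min_free_kbytes = 10240000
```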

Nimble Streamer VOD cache setup

If you use Nimble Streamer for VOD streaming, you may need to tune VOD file cache. Read this article to learn more about this aspect.

You should also consider changing transmuxer_threads parameter as described in the first section above.


As you can see, fine-tuning Nimble doesn't take much time.

Let's analyse the case described in this article. Our customer utilized 10 Gbps of network capacity on a single Nimble Streamer instance with around 6000 simultaneous viewers of their streams.

Here are the changes you might need in order to get same results:

1. Set the worker_threads value to 6 in the nimble.conf file - this proved appropriate for that number of viewers and that much bandwidth.

2. Add the following lines to /etc/sysctl.conf:
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 4194304 16777216
and apply them with sudo sysctl -p.
3. Run
ifconfig eth0 txqueuelen 10000
That should help you handle such a significant number of users.

Contact us if you have any further questions or need some guidance on Nimble Streamer fine tuning.
