Nimble Streamer performance tuning

Nimble Streamer software media server is widely used by our customers in high-load projects to handle thousands of simultaneous connections and tens of gigabytes through each server.
Even though Nimble Streamer can process a lot of connections with default settings, you may need to tune both it and the OS settings in order to handle a significant number of simultaneous users.

This article walks you through the tuning basics that will help you prepare your server for heavy-duty streaming.

Nimble Streamer worker threads tuning


A large number of simultaneous connections requires an appropriate number of worker threads for the Nimble Streamer instance. Each thread can handle thousands of users, but it has its limits. The config parameters below let you increase the number of threads.

Note that server settings are stored in the /etc/nimble/nimble.conf file. To apply config changes, restart the Nimble instance by running the "sudo service nimble restart" command.

To learn more about the config file and its parameters, read the configuration parameters reference.

The following parameters can be used for tuning.
  • worker_threads defines the number of threads for handling incoming HTTP connections. It's 1 by default.
  • rtmp_worker_threads defines the number of worker threads for handling RTMP connections. It's 4 by default.
  • rtmp_camera_worker_threads sets the number of threads used for processing published RTMP streams. It's 1 by default.
  • rtmp_publisher_threads sets the number of threads that process RTMP republishing. It's 1 by default, which handles approximately 150 republished output streams.
  • rtsp_worker_threads works the same way as rtmp_worker_threads but applies to RTSP streaming.
  • websocket_live_worker_threads defines the number of worker threads for WebSockets, which are used in SLDP low-latency real-time streaming.
  • dvr_transmuxer_threads is responsible for transmuxing stored DVR content into outgoing streams. It's 1 worker thread by default. If you use DVR playback heavily, you should increase this number; see the performance analysis steps described below.
  • dvr_default_writer_threads sets the number of "default" threads used to record DVR archives. It's 1 by default.
  • transmuxer_threads sets the number of threads that handle VOD transmuxing from the file system. It's 1 by default; run a performance analysis to see when you need to increase it.
  • live_pull_threads sets the number of threads that pull HLS streams for processing. It's 1 by default.

Each worker thread can handle from 2500 to 5000 connections, so your first estimate for these parameters can be calculated from the viewer count. However, higher bandwidth consumption will require more worker threads to process it.
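
As a rough illustration of that first estimate, here is a minimal Python sketch. It only applies the 2500 to 5000 connections-per-thread range quoted above; the function and constant names are hypothetical and not part of Nimble Streamer, and the result ignores bandwidth entirely.

```python
import math

# Per-thread capacity range quoted in this article (connections per thread).
MIN_CONNECTIONS_PER_THREAD = 2500
MAX_CONNECTIONS_PER_THREAD = 5000


def estimate_worker_threads(viewers):
    """Return (optimistic, conservative) thread counts for a viewer count.

    Hypothetical helper: it is only a starting point and does not account
    for bandwidth, which may require more threads than this estimate.
    """
    if viewers < 0:
        raise ValueError("viewers must be non-negative")
    optimistic = max(1, math.ceil(viewers / MAX_CONNECTIONS_PER_THREAD))
    conservative = max(1, math.ceil(viewers / MIN_CONNECTIONS_PER_THREAD))
    return optimistic, conservative


if __name__ == "__main__":
    low, high = estimate_worker_threads(6000)
    print(f"Start with {low} to {high} threads, then measure and adjust.")
```

Treat the result as a lower bound: the customer case in the Example section ended up with more threads than the viewer count alone suggests because of the bandwidth involved.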

Nimble Streamer can process incoming MPEG2TS streams via HTTP/UDP/HLS for further transmuxing. If you use this extensively and see around 80% load on your CPU, which probably means you process around 400-500 Mbps of streams, you can use the following parameters:
  • mpeg2ts_camera_threads defines the number of threads for processing MPEG-TS HTTP/HLS/UDP input streams. Increase this number to reduce the load on a single thread.
  • mpeg2ts_lock_free_enabled = true disables thread locking for better CPU load distribution and optimizes stream processing.

The next step is to look at the "htop" tool. If a thread consumes too much CPU, add a new thread regardless of the calculated capacity. E.g. if you have 2000 connections and your thread consumes 90% CPU for some reason, add a couple more threads and see what changes.

Another step is to check request and response time. E.g. request an HLS chunk of a live stream, then compare the response time to the chunk download time. If the response takes longer than the chunk download itself, this probably means your worker thread has a long queue and you need to increase the number of worker threads.
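
One way to take this measurement is sketched below in Python, using only the standard library. It is an illustration under assumptions, not an official tool: the function names are hypothetical, and the "response time" measured here also includes DNS lookup and connection setup, so run it from a host close to the server.

```python
import time
import urllib.request


def time_chunk_request(url, timeout=10.0):
    """Return (response_time, download_time) in seconds for one HTTP GET.

    response_time: from sending the request until the response headers arrive
                   (this includes DNS lookup and connection setup).
    download_time: from the headers until the body has been fully read.
    """
    started = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        headers_received = time.perf_counter()
        response.read()
        finished = time.perf_counter()
    return headers_received - started, finished - headers_received


def queue_suspected(response_time, download_time):
    """Rule of thumb from the text: waiting took longer than downloading."""
    return response_time > download_time
```

To use it, pass the URL of a chunk taken from your own live playlist to time_chunk_request and feed both values to queue_suspected. Repeat the measurement several times, since a single slow response may be noise.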

With the new parameter values in place, check whether any of your viewers have problems accessing your streams. If bandwidth usage is low but viewers complain about accessibility, you should increase the number of threads for the corresponding worker type.

For DVR and VOD operations, you may also check disk usage. If htop shows "D" in the S column, your thread may be too busy working with the file system, and you should check that as well.
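
The same "D" state that htop displays can be read from the Linux /proc file system. The following Python sketch lists the threads of a process that are currently in that state. The helper names are hypothetical, and you need to supply the process ID of your Nimble instance yourself.

```python
import os


def parse_thread_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/task/<tid>/stat line.

    The command name is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' rather than on whitespace.
    """
    _, _, after_comm = stat_line.rpartition(")")
    return after_comm.split()[0]


def threads_in_disk_sleep(pid):
    """Return the IDs of threads of process `pid` that are in state 'D'."""
    task_dir = f"/proc/{pid}/task"
    blocked = []
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as stat_file:
                state = parse_thread_state(stat_file.read())
        except OSError:
            continue  # the thread exited between listdir() and open()
        if state == "D":
            blocked.append(int(tid))
    return sorted(blocked)
```

A single sample says little, because threads enter and leave this state quickly. Poll threads_in_disk_sleep a few times per second: threads that show up in most samples are the ones waiting on disk I/O.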

TCP/IP stack tuning for Linux


Transferring data over a large number of outgoing connections means that your server's network stack is heavily loaded, so you need to increase the memory size for network buffers.

Add the following lines into /etc/sysctl.conf file:
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 4194394 16777216
Then apply the changes by running:
sudo sysctl -p
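
To confirm that the values are active after running sysctl -p, you can compare them with what the kernel reports under /proc/sys. The Python sketch below does this for the two TCP settings above. The helper names are hypothetical, and mapping a key to a path by replacing dots with directory separators is an assumption that holds for these keys.

```python
from pathlib import Path

# Values copied from the sysctl.conf lines in this section.
DESIRED = {
    "net.core.wmem_max": "16777216",
    "net.ipv4.tcp_wmem": "4096 4194394 16777216",
}


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.core.wmem_max to its file under root."""
    return Path(root).joinpath(*key.split("."))


def normalize(value):
    """Collapse tabs and repeated spaces so values compare reliably."""
    return " ".join(value.split())


def find_mismatches(desired, root="/proc/sys"):
    """Return {key: (wanted, actual)} for settings that differ or are missing."""
    mismatches = {}
    for key, wanted in desired.items():
        try:
            actual = normalize(sysctl_path(key, root).read_text())
        except OSError:
            actual = None  # the file is missing or unreadable
        if actual != normalize(wanted):
            mismatches[key] = (normalize(wanted), actual)
    return mismatches


if __name__ == "__main__":
    for key, (wanted, actual) in find_mismatches(DESIRED).items():
        print(f"{key}: expected {wanted!r}, found {actual!r}")
```

If the script prints nothing, both settings match. The same dictionary can be extended with any other sysctl keys you set.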

You should also update network interface settings like this:
ifconfig eth0 txqueuelen 10000
This setting is intended for 10 Gbps transfer speeds.

UDP/multicast tuning for Linux


As with TCP/IP, UDP streaming also needs some tuning.
Add the following lines to /etc/sysctl.conf:
net.core.rmem_max = 1048576
net.core.rmem_default = 1048576
net.ipv4.udp_mem = 8388608 12582912 16777216
Then run
sudo sysctl -p
Also, update the network interface settings. Run this command for a 10 Gbps network:
ifconfig eth0 txqueuelen 10000

Nimble Streamer RAM cache tuning


Nimble Streamer uses RAM cache for live streaming transmuxing. If you have multiple outgoing streams, you'll need to increase the cache size. Read this article for more details about cache setup.

Speaking of memory, if you have more than 60GB of RAM, we recommend reserving some of it for the Linux OS.
Run this command to reserve 10GB:
sysctl vm.min_free_kbytes=10240000
Or add the corresponding value to the /etc/sysctl.conf file.

Nimble Streamer VOD setup


If you use Nimble Streamer for VOD streaming, you may need to tune VOD file cache and buffer size. Read this article to learn more about this aspect.

You should also consider changing the transmuxer_threads parameter as described in the first section above.

Example


As you see, Nimble fine tuning doesn't take much time.

Let's analyse the case described in this article. Our customer utilized 10 Gbps of network capacity on a single Nimble Streamer instance with around 6000 simultaneous viewers of their streams.

Here are the changes you might need in order to get the same results:

1. Set the worker_threads value to 6 in the nimble.conf file; this proved suitable for that number of viewers and that much bandwidth.

2. Add the following lines to /etc/sysctl.conf:
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 4194394 16777216
3. Run
ifconfig eth0 txqueuelen 10000
That should help you handle that significant number of users.


Live Transcoder 


The Nimble Streamer Live Transcoder user experience may also be improved. Please check the Troubleshooting Live Transcoder article for more details.



Contact us if you have any further questions or need some guidance on Nimble Streamer fine tuning.

