How Multi-Core Processors Accelerate your LAMP Applications


My QNAP TS-269L comes with an Intel Atom D2701, a dual-core processor with hyper-threading. Its file transfer performance is better than that of my other NAS, which uses a single-core Marvell processor. But is the multithreading architecture really the key to improving LAMP performance? Well, it’s a long story…

Processor Architecture and Threads

When a program is executed, it is divided into threads, which contain instructions. The operating system scheduler sends threads to the processor according to its core and thread-handling architecture.

For a multi-core processor, it sends one thread to each core. For a single-core processor with hyper-threading, it sends more than one thread to the core. For a multi-core processor with hyper-threading, it sends more than one thread to each core.

Confusing? It should be. Let’s try a real-life example to understand it.

When you ask for hot tea, you are served one cup and have to wait for it to cool before drinking. This is the single-core processor scenario.

If you are a couple, you are served two cups of tea. Each person enjoys his or her own tea, like a multi-core processor.

With hyper-threading, you may ask for two cups of tea at a time. At first, both are hot. When the left one has cooled, you drink from it, and the right one cools while you are drinking the left. Then you put down the left and begin to drink the right one. You are still limited by how fast you can drink, but you spend less time waiting for the tea to cool.

The Intel Atom D2701 is dual-core with hyper-threading, so it can be served 4 threads at a time. Hyper-threading reduces the time spent waiting, but the execution speed is still limited by each core.
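The arithmetic behind "4 threads" can be written down as a small sketch. The helper name `logical_threads` is made up for illustration; `os.cpu_count()` is the standard-library call that reports how many logical CPUs the operating system sees, and it may return `None` when that cannot be determined.

```python
import os


def logical_threads(cores, threads_per_core):
    """Number of hardware threads the scheduler can run at the same time."""
    return cores * threads_per_core


# Atom D2701: 2 cores, 2 hardware threads per core with hyper-threading.
atom_d2701 = logical_threads(cores=2, threads_per_core=2)
print("Atom D2701 logical threads:", atom_d2701)

# What the machine running this script reports (may be None).
print("This machine:", os.cpu_count())
```

Keep in mind that the two hardware threads of one core share that core's execution units, so 4 logical threads do not mean 4 times the speed of a single thread.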

How LAMP Handles Login Sessions

ZurmoCRM is a LAMP web application. I will use it as an example to explain.

When you log in to ZurmoCRM, Apache calls PHP to handle the login session. PHP saves session information in /var/lib/php5 if you are using ZurmoCRM on TurnKey Linux.

All session files are cleaned up after Apache restarts. When PHP fails to find the session file, it asks the user to log in again to create a new one.

But if you tick “Remember me next time”, session information is also saved in a cookie in the client browser. If the session file is removed on the server, PHP creates a new session file from the cookie on the client, and the user doesn’t need to log in again.
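The login flow described above can be mimicked with a toy model. This is only a sketch of the idea, not how PHP or ZurmoCRM implement it: a dictionary stands in for the session files on the server, another dictionary stands in for the browser's cookies, and the names `login`, `handle_request`, and the cookie keys are all hypothetical. A real application would store a signed token in the cookie rather than the user name.

```python
import secrets

# Toy stand-in for the session files kept on the server.
session_files = {}  # session id -> user name


def login(user, remember_me, cookies):
    """Create a session; optionally leave a long-lived cookie on the client."""
    sid = secrets.token_hex(8)
    session_files[sid] = user
    cookies["session_id"] = sid
    if remember_me:
        cookies["remember_user"] = user  # hypothetical cookie name
    return sid


def handle_request(cookies):
    """Return the logged-in user, or None if the user must log in again."""
    sid = cookies.get("session_id")
    if sid in session_files:
        return session_files[sid]
    # The session file is gone: fall back to the remember-me cookie.
    user = cookies.get("remember_user")
    if user is not None:
        login(user, remember_me=True, cookies=cookies)
        return user
    return None


browser_a, browser_b = {}, {}
login("alice", remember_me=False, cookies=browser_a)
login("bob", remember_me=True, cookies=browser_b)

session_files.clear()              # simulate the session files being removed
print(handle_request(browser_a))   # alice has to log in again
print(handle_request(browser_b))   # bob silently gets a fresh session
```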

As you can see, session files are handled by PHP, not Apache.

How Multi-Core Processor with Hyper-Threading Works with Sessions

Although sessions are handled by PHP, Apache decides how the requests belonging to those sessions are served by processes and threads through its Multi-Processing Module (MPM).

For most Apache installations, the default MPM is prefork. It is non-threaded and isolates each request. In other words, each Apache child process has 1 thread and serves only 1 request at a time.

When a request for a PHP program arrives, the Apache child process loads its PHP module to process it. This prevents problems when 2 requests ask PHP to access the same session file. Because no threads share a process, this setup is safe even with code that is not thread-safe.

On the other hand, worker is a hybrid multi-process, multi-threaded MPM. Each Apache child process runs several threads, and each thread serves one request.

Although each thread serves 1 request, several threads might be serving requests from the same session, which means they need to access the same session file. In this case, if any PHP code or Apache module is not thread-safe, you will have problems. PHP has to be thread-safe to play ball correctly with Apache using the worker MPM!
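To make "thread-safe" concrete, here is a minimal sketch in Python rather than PHP. It assumes a toy in-memory `SessionStore` class (a made-up name, not part of Apache or PHP) that several threads update at once. The lock turns each read-modify-write into a single step, which is the kind of protection shared data needs under a threaded MPM such as worker.

```python
import threading


class SessionStore:
    """Toy in-memory session store shared by several threads."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def increment(self, session_id, key):
        # Read-modify-write must happen as one step. Without the lock,
        # two threads could read the same old value and lose an update.
        with self._lock:
            session = self._data.setdefault(session_id, {})
            session[key] = session.get(key, 0) + 1

    def get(self, session_id, key):
        with self._lock:
            return self._data.get(session_id, {}).get(key, 0)


def serve_requests(store, session_id, count):
    """Pretend to serve `count` requests that all touch the same session."""
    for _ in range(count):
        store.increment(session_id, "page_views")


store = SessionStore()
workers = [
    threading.Thread(target=serve_requests, args=(store, "abc", 1000))
    for _ in range(4)
]
for t in workers:
    t.start()
for t in workers:
    t.join()

print(store.get("abc", "page_views"))  # 4 threads x 1000 updates
```

With prefork, each request lives in its own process, so in-process data like this is never shared between requests and no such lock is needed.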

Now, let’s take a closer look at prefork.

When we run prefork on an Atom D2701, every request from a login session is served by 1 thread.

When there are 5 login sessions and each of them opens 5 contact detail pages in different tabs, the processor serves up to 4 Apache threads concurrently.

When there is only a single login session opening 5 contact detail pages in different tabs, the processor serves 1 Apache thread at a time to prevent conflicts.

If you follow Linux Processor Viewer with Thread Support and run htop to monitor the progress, you might see graphics like the ones below.

Figure: 5 different sessions compete with each other, running 3 threads for Apache, while only 4 logical threads can be served by the Atom D2701 concurrently.

CPU utilization is more than 300%. 4 threads are being served by the processor concurrently, and 3 of the running threads belong to Apache.

Figure: A single session is served by only 1 running thread for Apache, while 4 logical threads can be served by the Atom D2701 concurrently.

CPU utilization is less than 100%. 2 threads are being served by the processor concurrently, but only 1 running thread belongs to Apache.

If you don’t have htop, you can use top and press 1 to switch to the per-logical-core CPU view.
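If you would rather compute per-core utilization yourself, the numbers that tools like top display come from /proc/stat on Linux. The sketch below is an illustration only: the function names are made up, and the sample snapshots are invented values, not measurements from the TS-269L. It takes two snapshots of the per-CPU tick counters and reports how busy each logical core was in between.

```python
def parse_proc_stat(text):
    """Map each 'cpuN' line of /proc/stat to (busy_ticks, total_ticks)."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-CPU lines such as 'cpu0'; skip the aggregate 'cpu' line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        ticks = [int(v) for v in fields[1:]]
        # Columns: user nice system idle iowait irq softirq steal ...
        idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
        total = sum(ticks[:8])  # guest time is already counted inside user
        result[fields[0]] = (total - idle, total)
    return result


def utilization(before, after):
    """Percent busy per logical CPU between two parsed snapshots."""
    percent = {}
    for name, (busy_after, total_after) in after.items():
        busy_before, total_before = before.get(name, (0, 0))
        elapsed = total_after - total_before
        busy = busy_after - busy_before
        percent[name] = 100.0 * busy / elapsed if elapsed else 0.0
    return percent


def read_proc_stat():
    """Read the live counters (Linux only); call twice, a second apart."""
    with open("/proc/stat") as handle:
        return handle.read()


# Invented sample snapshots for two logical CPUs.
sample_before = "cpu0 50 0 50 400 0 0 0 0 0 0\ncpu1 50 0 50 400 0 0 0 0 0 0\n"
sample_after = "cpu0 120 0 55 425 0 0 0 0 0 0\ncpu1 55 0 55 490 0 0 0 0 0 0\n"

usage = utilization(parse_proc_stat(sample_before), parse_proc_stat(sample_after))
for cpu, pct in sorted(usage.items()):
    print(f"{cpu}: {pct:.1f}%")
```

On a real system, call `read_proc_stat()`, sleep for a second, call it again, and pass both results through `parse_proc_stat` and `utilization`.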

To summarize, a multi-core processor with hyper-threading serves concurrent users well but does little to improve the response time of a single user with prefork.

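You won’t see all requests from 1 login session running concurrently. They are all jammed into one queue, and the next one won’t start until the current one finishes loading, because requests from the same session are served one at a time in this configuration. PHP’s default file-based session handler adds to this by locking the session file for the duration of a request.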
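The difference between the two scenarios can be estimated with a toy scheduling model. It is a deliberate simplification under stated assumptions: every request takes one time step, requests from the same session run strictly one after another, and each logical thread counts as a full worker (real hyper-threads share a core, so actual gains are smaller). The function name `time_to_finish` is made up for this sketch.

```python
def time_to_finish(requests_per_session, logical_cpus):
    """Time steps needed when every request takes exactly one step.

    Requests of the same session run one after another, while different
    sessions may run at the same time, up to `logical_cpus` of them.
    """
    remaining = [n for n in requests_per_session if n > 0]
    steps = 0
    while remaining:
        # Serve the sessions with the most work left first.
        remaining.sort(reverse=True)
        for i in range(min(logical_cpus, len(remaining))):
            remaining[i] -= 1
        remaining = [n for n in remaining if n > 0]
        steps += 1
    return steps


print(time_to_finish([5] * 5, logical_cpus=4))  # 5 sessions x 5 tabs
print(time_to_finish([5], logical_cpus=4))      # 1 session x 5 tabs
print(time_to_finish([1] * 4, logical_cpus=4))  # 4 sessions x 1 request
```

In this model, 25 requests spread over 5 sessions finish in 7 steps, while 5 requests from a single session still need 5 steps no matter how many logical threads are available.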

Hints for Optimization

When you see Apache consuming most of the CPU time, it is actually the PHP module inside Apache doing the work. Therefore, when you want to accelerate a LAMP application, fine-tune the PHP module first.

For cases where Apache is the real bottleneck, try to optimize Apache first. There are also alternatives like thttpd and lighttpd. They can co-exist with Apache as long as they listen on different ports. You may follow this post to test on QNAP TS-269L.

To optimize both Apache and PHP, I recommend Tuning Apache and PHP for Speed on Unix from PHP Everywhere.

You can use PHP benchmark Script to compare hardware performance and identify whether you have a hardware or a software issue.

If you think neither Apache nor PHP is the bottleneck and you are using InnoDB, InnoDB performance optimization basics (redux) from MySQL Performance Blog is worth reading. Clear and concise.

Which Scenarios are Best for a Multi-core Processor with Hyper-threading using the Prefork Configuration

On a fast processor like the Intel Core i7, where each thread runs much more quickly than on an Intel Atom, you probably won’t feel the difference when sending several requests within a session with ZurmoCRM.

But if you are running ZurmoCRM on an Atom D2701, 4 concurrent sessions with 1 request per session at a time is the best scenario. A single session with many requests at a time won’t see much improvement from a multi-core processor with hyper-threading.

If you are running LAMP without sessions, e.g. WordPress with the default configuration, several pages opened in different tabs of the same browser are served by different threads. How many run at once depends on how many threads the processor can serve concurrently.

Final Thoughts

Multi-core processors with hyper-threading are charming, but they won’t improve everything, so use them in the right scenarios. If you want to run a test, Apache: multi-threaded vs multi-process (pre-forked) from Zerigo is a good guide.

Thread safety is important for multithreaded applications running under the Apache worker MPM.

Reference

  1. QNAP TS-269L
  2. Intel Atom Processor D2700
  3. Wiki: Multi-core processor
  4. Wiki: Hyper-threading
  5. Wiki: Thread (computing)
  6. Wiki: Multithreading (computer architecture)
  7. Wiki: Session (computer science)
  8. QNAP TS-269L File Transfer Performance Report
  9. Linux Processor Viewer with Thread Support
  10. Marvell
  11. Wiki: LAMP (software bundle)
  12. ZurmoCRM
  13. Apache
  14. PHP
  15. Test your ZurmoCRM with VirtualBox
  16. Apache: Multi-Processing Modules (MPMs)
  17. Apache: Apache MPM prefork
  18. PHP.net: Installed as an Apache module
  19. Apache: Apache MPM worker
  20. modwsgi: Processes And Threading
  21. StackOverflow: Apache Prefork vs Worker MPM
  22. Server Fault: Which to install: Apache Worker or Prefork? What are the (dis-)advantages of each?
  23. Linux Processor Viewer with Thread Support
  24. htop
  25. StackOverflow: How to dynamically monitor CPU per core usage on Linux?
  26. StackOverflow: How to measure separate CPU core usage for a process?
  27. liquidweb: Apache Optimization
  28. Wiki: thttpd
  29. Wiki: lighttpd
  30. PHP Everywhere: Tuning Apache and PHP for Speed on Unix
  31. PHP benchmark Script
  32. MySQL Performance Blog: InnoDB performance optimization basics (redux)
  33. JoStudio: 在QNAP TS-209上安裝 lighttpd, fastcgi, perl … (Installing lighttpd, fastcgi, perl … on QNAP TS-209)
  34. Intel: Intel Core i7 processors
  35. Intel: Intel Atom Processor
  36. WordPress.org
  37. MyGuy Solutions: How To: Enable the Use of Sessions On Your WordPress Blog
  38. Zerigo: Apache: multi-threaded vs multi-process (pre-forked)
  39. Wiki: Thread safety
  40. StackOverflow: What is thread safe or non thread safe in PHP
  41. StackOverflow: Is PHP thread-safe
  42. StackOverflow: What does threadsafe mean?
