perl-obfus - obfuscate (make more difficult to understand) Perl source code programs
perl-obfus [ -v|--version ] [ --output-line-len N ] [ --jam 0|1 ] [ --end-handling keep|skip|mangle] [ --pod-handling keep|skip ] [ --old-spacing-mode ] [ --bannerhead filename] [--bannertail filename ] [ --SN name_of_SN_sub] [--SNS name_of_SNS_sub ] [ --excludeidentsfile|-x filename ].. [ --excludeidentsfile-anycase filename ].. [ -X filename ].. [ --suffixes-asis-list filename ].. [ -F user-defined-mapping-filename ].. [ -I include-dirs ].. [ -m module ].. [ -M module ].. [ -o destination-filename ] [ -P backend-perl-path ] [ -i idents-mangling-params ] [ -n number-mangling-params ] [ -s string-mangling-params ] [ -c charcode-mangling-params ] [ -O profile-name ] file-to-obfuscate
Please note that this manual is for the Lite version of Perl-Obfus. Documentation for other versions of Perl-Obfus is available on the Stunnix web site: http://www.stunnix.com/support/doc/
This program turns Perl source code files into functionally equivalent Perl source code that is much more difficult to study, analyze and modify, thus giving you control over intellectual property theft. It is not a compiler, so the code it outputs will run on all platforms the original was able to run on. It works by accessing the parsed form of the program, which makes it MUCH more reliable than alternatives that don't do that; it supports all Perl features, including advanced ones like nested regular expressions, expressions in the substitution part of the s// operator, and Perl formats. It works with multi-module programs and with programs that depend on a lot of third-party modules that are not subject to obfuscation.
The Standard version of Perl-Obfus by default also encodes the obfuscated version of the file and makes it self-decoding at runtime, so that no standalone decoder is required and the file is not understandable by anybody. The Lite version of Perl-Obfus lacks encoding support completely.
This program obfuscates only one Perl source file at a time. By default it writes the obfuscated file to stdout, but it is strongly recommended to use the -o option to write the obfuscated version to the specified file (since a lot of additional operations are required when simply redirecting stdout to a file of your choice). Note that the same file can never be used as both the input and the output.
All comments besides the one on the first line are omitted from the obfuscated file; there is no option to preserve them. POD documentation can be preserved in or omitted from the obfuscated file using the --pod-handling option. The text after the __DATA__ and __END__ markers can be stripped away, left as is or mangled, as chosen by the user with the --end-handling option (sometimes people put testsuites for modules after __END__). It's possible to add comments with author and copyright information to the top and to the end of the obfuscated version of the file using the options --bannerhead and --bannertail respectively. Of course these comments and the POD documentation will appear in clear text form in the obfuscated file, independent of whether encoding was applied to it.
The obfuscation typically means
The non-encoded obfuscated code is extremely difficult for a human to understand, since the names of variables, subroutines and other symbols are totally meaningless and hard to remember (e.g. @files becomes @zcadaa4fc81). It's possible to control most aspects of obfuscation using the command line switches of Perl-Obfus.
If the file being obfuscated is a script (i.e. not a module), no modification to the original source file is needed for obfuscation to succeed. If the file being obfuscated is a module that exports some symbols using the standard Exporter package, and these symbols are used by other files that you also wish to obfuscate, then you have to make a minor modification to the file. Otherwise, after obfuscation, the @EXPORT variable would still contain the non-obfuscated symbol names, while the symbols themselves would have obfuscated names. To overcome this, perl-obfus supports two special functions named SN and SNS (both names can be changed using the --SN and --SNS options). The first accepts a scalar as its argument, the second a list.
For the SN function, the special support is enabled if its argument is a constant string in single quotes. For the SNS function, the special support is enabled if its argument is a constant list produced by a single qw() operator (with parentheses as delimiters, exactly). The special support consists of treating the arguments as symbol names and mangling them the same way all other symbols are mangled. That is, SN('$a') becomes SN('$MANGLED_a') and SNS(qw($a %b)) becomes SNS(qw($MANGLED_a %MANGLED_b)). The names of the functions treated as SN and SNS are never obfuscated, so you don't need to include them in an exceptions list. Passing arguments to these two subroutines in any other way does not enable the special treatment: SN('$' . "a"), SN("\$a"), SN(q($a)), SNS('$a','%b') and even SNS(qw[$a %b]) are left exactly as they were before obfuscation (and thus some symbols won't be exported from the module being obfuscated). SN and SNS should also be used if your code generates strings that are then eval'ed: e.g. instead of eval('$abc = '. "$value;") you should write eval(SN('$abc') . " = $value;"). If you also need to run your code non-obfuscated, copy the following definitions of the subroutines SN and SNS into it:
sub SN { '';$_[0]; }
sub SNS { '';@_; }
Note that sometimes you will have to put these definitions inside a BEGIN{} block in order for the subroutines to be visible at the point where they are used.
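As an illustration of the rules above, here is a minimal sketch of a module prepared for obfuscation. The package name My::Counter, the variable $total and the subroutines add and reset_via_eval are hypothetical names chosen for this example only; the `no warnings 'void'` line is an addition that merely silences the warning perl emits for the `'';` statement when warnings are enabled.

```perl
package My::Counter;
use strict;
use warnings;

# Pass-through definitions, so the same source also runs non-obfuscated.
# They are placed in a BEGIN block, as suggested above.
BEGIN {
    no warnings 'void';          # the '' statement is a constant in void context
    sub SN  { ''; $_[0]; }
    sub SNS { ''; @_; }
}

use Exporter 'import';

our $total = 0;

# A single qw() with parentheses: the only form described as recognised for SNS.
our @EXPORT_OK = SNS(qw($total add));

sub add {
    my ($n) = @_;
    $total += $n;
    return $total;
}

# Code built as a string and then eval'ed: the variable name goes through
# SN('...') in single quotes so that it can be mangled consistently.
sub reset_via_eval {
    my ($value) = @_;
    eval(SN('$total') . " = $value;");
    die $@ if $@;
    return $total;
}

1;
```

Run non-obfuscated, SN and SNS simply return their arguments, so @EXPORT_OK holds '$total' and 'add' and the eval'ed string is '$total = 7;' for an argument of 7.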
The script starts a pipe to another (backend) perl process that does part of the processing. Note that a fairly recent version of perl is required for the backend (5.7.2 or above), so in some cases you'll have to install it in parallel with the version of perl you are using. You may therefore need to pass the location of the perl interpreter used as the backend, and possibly its invocation options, using the -P switch, e.g. -P '/usr/local/bin/perl'.
You don't need to install all modules used by the code you are obfuscating for the version of perl used for backend.
If the code being obfuscated expects modules in non-standard locations or needs them preloaded, and this is normally arranged via perl's usual switches -I, -m and -M, then you will have to pass the same set of switches to perl-obfus (they will be passed on to the backend perl so that it can analyze the source code properly).
As was said above, symbols from third-party and standard modules won't be mangled. However, the user needs to gather a list of such symbol names (called exceptions from this point on) using the dedicated utility gen-ident-exceptions.pl, and pass the names of the files with exceptions using the --excludeidentsfile or --excludeidentsfile-anycase options. For convenience, there is a -X switch that can be passed multiple times to specify the names of files that list exceptions to ignore.
It's possible to store the default commandline options in the globally-visible file $instroot/lib/perl-obfus/perl-obfus-settings.pl (where $instroot is a directory in which the Perl-Obfus package was installed to). See comments in that file for more information.
Note that there is an interactive web-based command line builder for Perl-Obfus available at http://www.stunnix.com/support/interactive/cmd-builder/.
Note that extra spaces in the lines (whitespaces and tabs) won't correspond to the ones in the original file, but to certain prettyprinted version of it.
Note: use --bannertail only for files that don't have __END__ or __DATA__ sections, since otherwise these sections will be corrupted (since banner will be appended after the __END__ or __DATA__ sections).
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it is not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
Most of the exceptions are generated using the gen-ident-exceptions.pl script. In very few cases users will have to manually extend the set of exceptions using hand-written files; see the description of the syntax of such files in the manual for gen-ident-exceptions.pl.
There is no need to add perl special variables like @ARGV and builtin subroutines like open: they are already hardcoded in perl-obfus.
It's possible to remove symbols from lists of exceptions by passing names of files with these symbol names using -X switch.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if they had been specified individually, one by one.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it is not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
This option is mostly useful when the set of exceptions created from the built-in list and the content of the files passed with the -x switch is too broad.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if they had been specified individually, one by one.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it is not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
This option is mostly useful for protecting code for environments that scan symbol names for some suffix in order to treat the symbol specially.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if they had been specified individually, one by one.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. Each line in such a file contains two symbols: the name of the original symbol, one or more space characters, and the required resultant symbol.
If some mangling engine decides to assign a symbol that is listed as a resultant symbol, special attempts will be made to guarantee that the symbol chosen by the obfuscation engine won't conflict with it (by adding prefixes until uniqueness is reached).
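The file format described above (one "original resultant" pair per line, with comments starting with # in the first column) can be sketched with a small reader. This is only an illustration and not the parser used by perl-obfus: parse_mapping is a hypothetical helper, the sample symbol names are invented, and skipping blank lines is an assumption.

```perl
use strict;
use warnings;

# Parse the text of a user-defined mapping file into a hash
# (original symbol => resultant symbol).
sub parse_mapping {
    my ($text) = @_;
    my %map;
    for my $line (split /\n/, $text) {
        next if $line =~ /^#/;        # comment: '#' is the first character
        next unless $line =~ /\S/;    # blank line (skipping these is an assumption)
        my ($from, $to) = $line =~ /^(\S+)\s+(\S+)\s*$/
            or die "Malformed mapping line: '$line'\n";
        $map{$from} = $to;
    }
    return %map;
}

my %map = parse_mapping(<<'END');
# original     resultant
do_login       a1
render_page    a2
END

print "$_ -> $map{$_}\n" for sort keys %map;
```

With the sample input this prints "do_login -> a1" followed by "render_page -> a2".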
obfuscator-title[,option=value]..
Tokens of each type can be mangled using different approaches; each approach corresponds to an obfuscator, identified by obfuscator-title. Each obfuscator can have options that alter its behaviour. To specify them, append comma-separated option=value pairs to the obfuscator-title, after a comma.
The mangling-specification specifies all details on how to mangle tokens of each type, so if multiple occurrences of the option are specified, the last one takes effect.
For each type of token a special obfuscator with title none is available - it doesn't alter the tokens in any way.
Here is a list of obfuscators for each type of the token, with the options they support.
Options:
In theory an md5sum collision is possible: this is the critical situation in which two different symbols are obfuscated to the same symbol name. When such a situation is detected, the obfuscation is aborted. Collisions between symbols in the current file are detected automatically. If collisions between symbols across the entire project must be detected, one can use the adhere-mapfile option to enforce uniqueness of protected symbols across all files; please read the description of the symbol name obfuscator combs. The only solution when an md5sum collision is detected is to change the value of the seed option or to increase the value of the len option. However, such situations are very rare.
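To see why collisions become more likely as len decreases, consider the following sketch. It is not the algorithm used by perl-obfus, whose exact hashing scheme is not described here: the functions mangle and mangle_all are hypothetical, and deriving the name as the prefix plus the first len hex digits of md5(seed . name) is an assumption made purely for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical mangler: prefix + first $len hex digits of md5(seed . name).
sub mangle {
    my ($name, %opt) = @_;
    return $opt{prefix} . substr(md5_hex($opt{seed} . $name), 0, $opt{len});
}

# Mangle every name, aborting on the first collision.
sub mangle_all {
    my ($names, %opt) = @_;
    my (%seen, %result);
    for my $name (@$names) {
        my $m = mangle($name, %opt);
        die "collision: '$name' and '$seen{$m}' both map to '$m'\n"
            if exists $seen{$m};
        $seen{$m}      = $name;
        $result{$name} = $m;
    }
    return \%result;
}

# 17 distinct names cannot fit into 16 one-digit hex values,
# so len=1 is guaranteed to collide here.
my @names = map { "sym$_" } 0 .. 16;
my $ok = eval {
    mangle_all(\@names, seed => 'SomeRandomString', prefix => 'z', len => 1);
    1;
};
print $ok ? "no collision\n" : "aborted: $@";
```

In this sketch, raising len makes a collision among the same names far less likely, and changing the seed changes which names collide; these are the two remedies the text above recommends.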
This is the default obfuscator for symbol names.
Options:
Options:
Options:
The argument is profile-params, which has the following syntax:
profile-name[,option=value]..
There are several profiles available. The profile is selected by profile-name. Each profile can have options that alter its behaviour. To specify them, append comma-separated option=value pairs to the profile-name, after a comma.
The following values for profile-name are available:
The profile with name default is the default profile.
All profiles have the following options (specified in the way options for manglers and extractors are specified):
In case of an error, the exit code will be non-zero, otherwise the exit code will be zero.
On successful processing of the file, the message 'input-filename syntax OK' is printed to stderr. Processing stops if there is a syntax error in the file being obfuscated or in a file it uses; in that case the location and details of the syntax error are printed to stderr.
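A build script can rely on the exit code described above. The following is a minimal sketch of such a check; run_ok is a hypothetical helper, and the file names in the commented-out call are placeholders.

```perl
use strict;
use warnings;

# Run a command given as a list (so no shell quoting is involved) and
# return true only if it exited with status zero.
sub run_ok {
    my @cmd = @_;
    my $status = system(@cmd);
    return 0 if $status == -1;     # the command could not be started
    return 0 if $status & 127;     # the command was killed by a signal
    return ($status >> 8) == 0;    # true only for exit code 0
}

# Typical use (placeholder file names):
# run_ok('perl-obfus', 'blah.pl', '-o', 'oblah.pl', '-x', './excepts')
#     or die "obfuscation of blah.pl failed\n";
```

Passing the command as a list to system() avoids a shell, so file names containing spaces need no extra quoting.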
Please note that some examples include switches not supported by Lite version of Perl-Obfus.
The following commandline obfuscates and encodes file blah.pl using default parameters and exceptions from file named ./excepts, writing obfuscated and encoded version to oblah.pl:
perl-obfus blah.pl -o oblah.pl -x ./excepts
The following commandline is the recommended way of obfuscating file blah.pl for shipping, using default parameters and exceptions from the file named ./excepts, and writing the obfuscated and encoded version to oblah.pl (the main difference from the previous example is that a value is passed for the seed parameter of the obfuscator routine for symbol names):
perl-obfus blah.pl -o oblah.pl -x ./excepts -i md5,seed=SomeRandomString
The following commandline is recommended for producing a mildly obfuscated, non-encoded version of blah.pl that is ideal for testing whether the obfuscated code has problems such as the use of undefined symbols (which may arise from an incomplete list of exceptions in the file ./excepts):
perl-obfus blah.pl -e 0 -o oblah.pl -x ./excepts -n none -s none -i prefix,str=ZZZ
The following commandlines show two ways of passing the same set of values for all options of the md5 obfuscator routine for symbol names. Each obfuscates and encodes file blah.pl, writing the obfuscated and encoded version of the file to oblah.pl:
perl-obfus blah.pl -o oblah.pl -i md5,seed=57823,prefix=p,len=5
perl-obfus blah.pl -o oblah.pl -i 'md5,prefix=p, seed=57823 , len=5'
The following example obfuscates and encodes file blah.pl, writing the obfuscated and encoded version of the file to oblah.pl and embedding license-checking code that allows the code to be executed until 28 April 2005; once the code has expired, the default message is printed:
perl-obfus blah.pl -o oblah.pl \
-T 'expire,whenexpires=28 April 2005,onviolated-warn=1' \
-H hosttails,matches=site.com+.site.com,onviolated-warn=1
It's possible to store the default command line options in the globally visible file $instroot/lib/perl-obfus/perl-obfus-settings.pl (where $instroot is the directory in which the Perl-Obfus package was installed), which is a Perl module. This file defines one sub, cmnargs, that should return a list of options to be prepended to the actual command line of perl-obfus, thus allowing "persistent settings" to be stored for perl-obfus. It is most useful for specifying the location of the perl used for the backend (which should be perl version 5.7.2 or greater).
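A settings file along these lines might look like the sketch below. Only the sub name cmnargs and its purpose come from the description above; the exact boilerplate (such as any package declaration) should be taken from the file shipped with Perl-Obfus, and returning each switch and its value as separate list elements is an assumption. The paths are placeholders.

```perl
use strict;
use warnings;

# Options returned here are prepended to the actual command line.
sub cmnargs {
    return (
        '-P', '/usr/local/bin/perl',    # backend perl, 5.7.2 or newer
        '-x', './excepts',              # placeholder exceptions file
    );
}

1;
```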
Here is a list of mostly innocent caveats.
sub f { { "blah"
, 2}; };
sub g { { "blah", 2}; };
Here f() returns the integer 2, while g() returns a reference to an anonymous hash, though the only difference is in the amount of whitespace (whether there is a newline after "blah"). Since perl-obfus removes extra whitespace (and wraps lines so that they are no longer than the limit you specified), the behaviour of such functions can change. You should not write code that is sensitive to whitespace and to perl parser quirks in general, so add an explicit return in f and g if you want them to return a reference to a hash.
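The recommended fix can be sketched as follows. With an explicit return, perl parses the braces as an anonymous hash constructor regardless of where the newline falls, so both functions behave the same before and after whitespace is changed.

```perl
use strict;
use warnings;

# Same bodies as above, but with an explicit return: the braces are now
# always an anonymous hash, with or without the newline after "blah".
sub f { return { "blah"
    , 2 }; }
sub g { return { "blah", 2 }; }

print ref(f()), " ", ref(g()), "\n";    # prints "HASH HASH"
```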
See section NOTES for troubleshooting instructions.
In most cases, once properly prepared for obfuscation, the obfuscated version of the code should work the same as the non-obfuscated one. It's recommended to check the obfuscated version of the code for the use of undeclared subroutines using the find-undeclared-subs.pl script; this helps to detect an incomplete set of symbol name exceptions. After fixing any issues with an incomplete set of exceptions, it's recommended to check whether the obfuscated code behaves exactly the same as the original, by using a pre-existing testsuite or by checking functionality manually.
If some obfuscated code is syntactically correct but works differently from the original version, obfuscate it without encoding and without string, integer and ident mangling (but with -jam=1), as follows:
Then try to run it again. If it still does not work correctly, find the guilty source file by replacing each of the obfuscated files with the original one, one by one. After you have found the file that contains the problem, append the definitions of all functions from the source file to the obfuscated file; by temporarily renaming the functions in the appended part to something else (e.g. by suffixing the names with '1' or 'blah') you will be able to find the function that is guilty. The same process can be applied to the blocks in the guilty function (just replace obfuscated parts with source parts) to find out which part of the obfuscated function is misbehaving.
Having found the block that misbehaves, modify it so that the obfuscated version has the same functionality as the original code.
gen-ident-exceptions.pl, find-undeclared-subs.pl.